Hi everyone,
In this topic, I’ll provide a detailed explanation of the current status of our project:
A few weeks ago, I opened a pull request which included the Gene Disease Package and a new Streamlit App design based on Sam’s feedback and suggestions. In the pull request, the following updates were made:
Main pipeline
- Renamed and reorganized modules within the gene_disease package, and added a simple script that runs the pipeline using main.py and config.py.
- Pipeline overview: The pipeline begins by fetching disease data for the provided EFO ID, using either BigQuery or the GraphQL API as the data source. It then constructs a graph that combines PPI and target data, creating both positive and negative edges for gene-disease associations. Node embeddings are generated with Node2Vec, ProNE, or GGVec, depending on the configuration, and are used for feature extraction and labeling. The data is then split into training and validation sets, and a classifier is trained. Model performance is evaluated on the unseen validation sets, and the metrics are displayed. Finally, the model classifies proteins as associated or non-associated with the disease, and the results are saved to CSV files.
- Updated the Dockerfile and requirements.txt to support different installation methods detailed in the README.md.
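To make the positive/negative edge step in the pipeline overview more concrete, here is a minimal sketch in plain Python. It is not the code from the gene_disease package: the function names, the toy PPI edges, the placeholder ID "EFO_0000000", and the choice to sample as many negatives as positives are all assumptions made for illustration. The embedding and classifier steps are left out.

```python
import random


def build_labeled_edges(ppi_edges, associated_genes, disease_id, seed=0):
    """Return ((gene, disease), label) pairs: 1 = known association, 0 = sampled negative.

    Hypothetical helper for illustration only, not part of the actual package.
    """
    rng = random.Random(seed)
    known = set(associated_genes)
    # Every protein that appears in the PPI network.
    proteins = {p for edge in ppi_edges for p in edge}
    # Positives: associated genes that are actually present in the graph.
    positives = sorted(proteins & known)
    # Negative candidates: proteins with no known association to the disease.
    candidates = sorted(proteins - known)
    # Assumption: sample as many negatives as positives (capped by availability).
    negatives = rng.sample(candidates, min(len(positives), len(candidates)))
    labeled = [((gene, disease_id), 1) for gene in positives]
    labeled += [((gene, disease_id), 0) for gene in negatives]
    rng.shuffle(labeled)
    return labeled


def train_val_split(items, val_fraction=0.25):
    """Split an already shuffled list into training and validation parts."""
    cut = int(len(items) * (1 - val_fraction))
    return items[:cut], items[cut:]


# Toy data: a small PPI chain and a set of disease-associated genes.
ppi_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
             ("E", "F"), ("F", "G"), ("G", "H")]
associated = {"A", "C", "E", "Z"}  # "Z" is not in the PPI graph, so it is ignored
labeled = build_labeled_edges(ppi_edges, associated, "EFO_0000000")  # placeholder ID
train, val = train_val_split(labeled)
print(len(train), "training edges,", len(val), "validation edges")
```

In the real pipeline, each labeled pair would be turned into a feature vector from the node embeddings before the split, and the classifier would be trained on those features.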
Streamlit App Design
The Streamlit app makes API calls to fetch the target and PPI datasets. It then constructs the graph and runs the same steps as the pipeline, but through an interactive user interface. Below are screenshots of the app:
Project Website
The project website serves as the project's documentation. It covers the project purpose, session workflow, file formats, and guidance on navigating GDAP. The site is currently live, but its pull request has not yet been merged into the main branch, and some pages still need updates.
I am calling on the team members and mentors to review, modify, and share any suggestions or ideas.
@stemaway @Samuel_bharti @anya @ayahashim16 @hahaharsini @Wajeehthebaji @Prasun_Sharma @Moh_Saiger @Sabdha
I am also requesting that the pull request for the project source code be merged, so that we can deploy the Streamlit app and make the repository and the website public.