GDAP: Gene-Disease Association Prediction

Hi everyone,

In this topic, I’ll provide a detailed explanation of the current status of our project:

A few weeks ago, I opened a pull request that included the gene_disease package and a new Streamlit app design based on Sam’s feedback and suggestions. The pull request contains the following updates:

Main pipeline

  • Renamed and reorganized modules within the gene_disease package, and added a simple way to run the pipeline through main.py and config.py.
    Pipeline overview
      • Fetch disease data for the provided EFO ID, using either BigQuery or the GraphQL API as the data source.
      • Construct a graph that combines PPI and target data, with both positive and negative edges for gene-disease associations.
      • Generate node embeddings with Node2Vec, ProNE, or GGVec, depending on the configuration, and use them for feature extraction and labeling.
      • Split the data and train a classifier.
      • Evaluate model performance on unseen validation sets and display the metrics.
      • Predict whether each protein is associated or not associated with the disease, and save the results to CSV files.
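
To make the edge-labeling and splitting steps more concrete, here is a minimal sketch in plain Python. It is not the code from the gene_disease package: the function names, the 1:1 negative sampling ratio, the toy gene list, and the placeholder disease ID are all assumptions for illustration. The embedding and classifier steps are left out.

```python
import random


def build_labeled_edges(associated_genes, all_genes, disease_id, seed=42):
    """Return (gene, disease, label) triples.

    Label 1 marks a known gene-disease association (positive edge).
    Label 0 marks a sampled gene with no known association (negative edge).
    Assumption: one negative is sampled per positive when enough genes exist.
    """
    rng = random.Random(seed)
    positives = [(gene, disease_id, 1) for gene in sorted(associated_genes)]
    candidates = sorted(set(all_genes) - set(associated_genes))
    n_negatives = min(len(positives), len(candidates))
    negatives = [(gene, disease_id, 0) for gene in rng.sample(candidates, n_negatives)]
    return positives + negatives


def split_edges(edges, test_fraction=0.25, seed=42):
    """Shuffle the labeled edges and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    # Toy data only: gene names and the disease ID are placeholders.
    genes = [f"GENE{i}" for i in range(1, 9)]
    known = {"GENE1", "GENE2", "GENE3"}
    edges = build_labeled_edges(known, genes, "EFO_PLACEHOLDER")
    train, test = split_edges(edges)
    print(len(train), "training edges,", len(test), "test edges")
```

In the real pipeline, each triple would be turned into a feature vector from the node embeddings before training the classifier.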

Streamlit App Design

The Streamlit app includes API calls to fetch the target and PPI datasets. It then constructs the graph and follows the same steps as the pipeline, but through a user interface. Below are screenshots of the app:

Project Website

The project website serves as the project documentation: it covers the project purpose, session workflow, file formats, and guidance on navigating GDAP. The site is currently live, but the pull request has not yet been merged into the main branch. Some pages also need updates.


I would like to ask the team members and mentors to review the work, make changes, and share any suggestions or ideas.

@stemaway @Samuel_bharti @anya @ayahashim16 @hahaharsini @Wajeehthebaji @Prasun_Sharma @Moh_Saiger @Sabdha

I am also requesting that the pull request for the project source code be merged so that we can deploy the Streamlit app and make the repository and the website public.


Hey! It looks great. I had no issues running the program, and the results looked okay. However, I am facing an error with the app on Windows: it seems to be throwing an error in the config file. I validated the TOML and it is correct, but I keep getting a key error due to an invalid character. I haven't tried it with Docker. I have tried different versions of Streamlit, but the error persists. I could use your help or guidance here. The webpage is also looking good. Let me know which pages you want filled in. I did write down some FAQs and was planning an app tutorial, but I can't get the app to run.

Hi @hahaharsini,

Can you share the error message for this invalid character? Also, how did you run the app? Did you read the guidance in the README? You need to run it as a module from the project root, or install the package first.
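
In the meantime, one possible check, assuming the invalid character comes from the file's encoding (for example a byte order mark or a curly quote added by a Windows editor), is to scan the raw bytes of the config file. This is only a guess at the cause, and find_suspect_bytes is a hypothetical helper written for this post, not part of the project or of Streamlit.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def find_suspect_bytes(data: bytes):
    """Return (line, column, description) for a UTF-8 BOM and any non-ASCII byte.

    Columns are 1-based byte positions within the line.
    """
    findings = []
    offset = 0
    if data.startswith(UTF8_BOM):
        findings.append((1, 1, "UTF-8 byte order mark"))
        data = data[len(UTF8_BOM):]
        offset = len(UTF8_BOM)
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        # Only the first line is shifted by a stripped BOM.
        first_col = 1 + (offset if line_no == 1 else 0)
        for col, byte in enumerate(line, start=first_col):
            if byte > 0x7F:
                findings.append((line_no, col, f"non-ASCII byte 0x{byte:02x}"))
    return findings


def check_file(path):
    """Print every suspect byte found in the file at the given path."""
    for line_no, col, description in find_suspect_bytes(Path(path).read_bytes()):
        print(f"line {line_no}, column {col}: {description}")
```

Calling check_file with the path of the TOML file that the error points to prints nothing if the file is plain ASCII. Any reported position is a good place to look for a character that slipped in.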