Module 7: Capstone Project
Technical Area
• Reading datasets and metadata in R
• Quality control
• Normalization and background correction
• Batch effect removal
• Annotation and gene filtering
• KEGG pathway and GO enrichment analysis
• GSEA
• Gene concept network analysis
• Transcription factor (TF) analysis
• Protein-protein interaction (PPI) network analysis in Cytoscape
• Survival analysis
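To illustrate the idea behind the KEGG pathway and GO analysis listed above, here is a minimal sketch of an over-representation test based on the hypergeometric distribution. It is written in plain Python rather than R and is not the clusterProfiler implementation; the function name `enrichment_p_value` and all numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def enrichment_p_value(universe, in_pathway, n_degs, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    universe   -- number of genes in the background set
    in_pathway -- background genes annotated to the pathway or GO term
    n_degs     -- number of differentially expressed genes (DEGs)
    overlap    -- DEGs that are annotated to the pathway or GO term
    """
    max_overlap = min(in_pathway, n_degs)
    total = comb(universe, n_degs)
    # Sum the probability of every outcome at least as extreme as the
    # observed overlap.
    tail = sum(
        comb(in_pathway, k) * comb(universe - in_pathway, n_degs - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers (not taken from the project): 20,000 background
# genes, a 100-gene pathway, 500 DEGs, 10 of which fall in the pathway.
p = enrichment_p_value(universe=20000, in_pathway=100, n_degs=500, overlap=10)
print(f"over-representation p-value: {p:.3g}")
```

In a real analysis this test is repeated for every pathway or term, so the resulting p-values would also need a multiple-testing correction.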
Tools
• R packages: affy, arrayQualityMetrics, sva, ggplot2, pheatmap, WGCNA, limma, EnhancedVolcano, hgu133plus2.db, enrichplot, org.Hs.eg.db, msigdbr, magrittr, clusterProfiler, tidyr, Rcpp
• Cytoscape (STRING and cytoHubba plugins)
• GEPIA
Soft Skills
• I prepared a presentation of my capstone project, which helped me improve my presentation skills
• Preparing PowerPoint slides for the presentation
Tasks completed
I merged two lung cancer datasets containing 70 samples, removed the batch effect between them, and then performed differential expression analysis. After obtaining the differentially expressed genes (DEGs), I identified their enriched KEGG pathways and GO terms. I then plotted a gene-concept network and a TF network for the DEGs and performed GSEA. To find hub genes, I built a PPI network and identified the 10 key genes in it. Finally, I ran a survival analysis on those 10 genes and found 5 whose expression levels were associated with survival in patients with lung cancer.
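The hub-gene step can be sketched as a ranking of network nodes by degree, that is, by the number of distinct interaction partners each gene has. Degree is one of the ranking methods cytoHubba offers; the summary above does not say which method was used, so this choice is an assumption. The example is plain Python rather than a Cytoscape workflow, and the function name `top_hub_genes` and the gene labels are hypothetical placeholders.

```python
from collections import defaultdict


def top_hub_genes(edges, n=10):
    """Rank genes in an undirected PPI network by degree.

    edges -- iterable of (gene_a, gene_b) interaction pairs
    n     -- number of top-ranked genes to return
    Returns a list of (gene, degree) tuples, highest degree first;
    ties are broken alphabetically so the result is deterministic.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    ranked = sorted(neighbours, key=lambda g: (-len(neighbours[g]), g))
    return [(g, len(neighbours[g])) for g in ranked[:n]]


# Hypothetical toy network with placeholder gene labels.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]
hubs = top_hub_genes(edges, n=3)
print(hubs)
```

Storing neighbours in sets means duplicate edges in the input do not inflate a gene's degree.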