SELF ASSESSMENT FOR WEEK 4
Things learned
- Technical area: Learned how to map probeset IDs to gene symbols, use the limma package to fit a linear model to the data set, and visualize the differentially expressed genes using a heatmap and an EnhancedVolcano plot.
- Tools: R, Slack, G Suite, GitHub.
- Soft skills: I volunteered to be the task lead for this week, created the online documents for the group members to contribute to, and compiled the work. I communicated and cross-checked the results with everybody.
Achievement highlights
- As task lead, I communicated with my team members to coordinate the work and made sure that everybody contributed towards completing the tasks.
- Performed differential gene expression analysis and generated multiple plots with different arguments and cutoff values to understand the results better.
- Completed the deliverables and presentation well before the deadline.
List of training and meetings attended
7/27 Team meeting, 7/28 Team Presentation, 7/29 GitHub Webinar, 7/30 Office Hours, 7/31 Discussion
Goals for the upcoming week
- Functional analysis, the task for the upcoming week, is completely new to me and time-consuming to perform. I will try to organize my work so that I complete the deliverables smoothly.
- Will try to get to know my group members better.
Tasks done
- For the technical part, I mapped gene symbols, filtered genes, and fitted a linear model to create a table of differentially expressed genes (sorted by adj.P.Val). I visualized the results using a clustered heatmap and a volcano plot. I interpreted the volcano plot by comparing the log fold change and p-values of different genes.
- For me, the coding part was easier this week than last week. I encountered fewer problems, and the major one was related to filtering the genes. I solved it by getting help during Office Hours.
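The filtering and labelling logic described in the list above can be sketched as follows. This is only an illustration written in Python; the actual analysis was done in R with limma. The gene names, values, and both cutoffs (`lfc_cutoff`, `p_cutoff`) are made up and are not the ones used in the project.

```python
# Hypothetical rows mimicking two columns of a limma topTable result
# (logFC and adj.P.Val); every value here is invented for illustration.
deg_table = [
    {"gene": "GENE_A", "logFC": 2.4, "adj_p": 0.0004},
    {"gene": "GENE_B", "logFC": -1.8, "adj_p": 0.0100},
    {"gene": "GENE_C", "logFC": 0.3, "adj_p": 0.0020},
    {"gene": "GENE_D", "logFC": 3.1, "adj_p": 0.2000},
    {"gene": "GENE_E", "logFC": -2.2, "adj_p": 0.0001},
]


def classify(row, lfc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene the way a volcano plot typically colours its points.

    The cutoffs are assumed defaults, not the values used in the project.
    """
    if row["adj_p"] >= p_cutoff or abs(row["logFC"]) < lfc_cutoff:
        return "not significant"
    return "up" if row["logFC"] > 0 else "down"


# Sort the table by adjusted p-value, smallest (most significant) first.
ranked = sorted(deg_table, key=lambda r: r["adj_p"])

# Attach a label to every gene.
labels = {r["gene"]: classify(r) for r in ranked}

for row in ranked:
    print(row["gene"], row["logFC"], row["adj_p"], labels[row["gene"]])
```

Changing either cutoff changes which genes count as differentially expressed, which is why the plots looked different for different argument values.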
SELF ASSESSMENT FOR WEEK 5
Things learned
- Technical area: This week, I attempted to describe the functions and interactions of the top DEGs in our colorectal cancer dataset through a computational approach, using packages such as clusterProfiler and enrichplot.
- Tools: R, Slack, G Suite, GitHub, GEPIA, StringDB.
- Soft skills: Made an extra effort to get to know my team members better and to communicate this week’s results to them in detail.
Achievement highlights
- “Kyoto Encyclopedia of Genes and Genomes” was just another database to me before I conducted KEGG analysis. I interpreted the results using dotplots and used a gene-concept network to understand complex associations among the genes. This way, I learned more about what the database contains and how to use it.
- I made several attempts to get the cnetplot with the desired arguments, and finally managed it on my own after respecifying the arguments and redefining the vectors multiple times.
- Read more about the different databases available for biological pathways and about how plots are interpreted, for example, the enrichment score and FDR from a GSEA plot.
List of training and meetings attended
8/4 Presentation and Diversity Discussion, 8/5 Deliverables Webinar, 8/6 Office Hours
Goals for the upcoming week
- Compile the results of functional analysis and prepare a write-up to draw conclusions from the whole process.
- Start working on the final task and presentation.
Tasks and Challenges
- Filtered the genes from the limma analysis to create a gene vector, with logFC values in decreasing order. Performed GO analysis using enrichGO and KEGG analysis using enrichKEGG, and visualized the results with barplots and dotplots. I also used the STRING database to locate the hub genes.
- Also performed GSE analysis and TF analysis using GSEA() and visualized the results with two plot types that were new to me: gseaplot and cnetplot.
- Had trouble assigning arguments to some of my plots, but solved the issue after a few attempts.
- Defining the gene vector was the most time-consuming step for me, as I wasn’t sure how many genes to include when making the plots. Separating the genes into up-regulated and down-regulated sets made it more confusing, but I got help from Office Hours and my teammates to figure it out.
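The gene vector described in the list above can be sketched as follows: all genes ordered by decreasing logFC, then separated into up-regulated and down-regulated sets. This is a Python illustration of the logic only; the real vector was built in R. The gene names, logFC values, and the `lfc_cutoff` threshold are assumptions made up for the example.

```python
# Hypothetical logFC values keyed by gene identifier (invented for illustration).
log_fc = {
    "GENE_A": 2.4,
    "GENE_B": -1.8,
    "GENE_C": 0.3,
    "GENE_D": 3.1,
    "GENE_E": -2.2,
}

# Ranked gene vector: every gene, ordered by decreasing logFC.
gene_vector = sorted(log_fc.items(), key=lambda kv: kv[1], reverse=True)

# Assumed threshold for calling a gene up- or down-regulated.
lfc_cutoff = 1.0

# Separate the ranked genes into the two sets; genes between the
# thresholds (here GENE_C) belong to neither set.
up_genes = [gene for gene, fc in gene_vector if fc >= lfc_cutoff]
down_genes = [gene for gene, fc in gene_vector if fc <= -lfc_cutoff]

print("ranked:", [gene for gene, _ in gene_vector])
print("up:", up_genes)
print("down:", down_genes)
```

The full ranked vector is the kind of input a ranking-based method such as GSEA expects, while the two filtered sets are the kind of input used for over-representation analysis, so how many genes to keep depends on which analysis the vector feeds.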
SELF ASSESSMENT FOR WEEKS 6 AND 7
Things learned
- Technical area: Performed principal component analysis; quality control using the arrayQualityMetrics package; DGE analysis using limma, EnhancedVolcano and pheatmap; functional analysis: GO, KEGG pathway, gene network, StringDB, TFA, GSEA.
- Tools: R, Slack, G Suite, GEPIA, StringDB, PowerPoint, GEO.
- Soft skills: Improved my presentation skills while working on the final project.
Achievement highlights
- Completed my final presentation on ccRCC and learned about the GEO dataset in detail. Also learned how to organize and make the metadata for different groups.
- Spent a lot of time interpreting all the analyses performed.
- Besides renal cell carcinoma, I also tried analyzing datasets for Parkinson’s disease, breast cancer and AML.
List of training and meetings attended
8/10 Team meeting, 8/11 Office Hours, 8/12 Functional Analysis Webinar, 8/13 Office Hours, 8/14 Group presentation, 8/17 Team Meeting and Office Hours, 8/18 Webinar on professional presentation, 8/21 Final Presentation, 8/24 Final Team Meeting
Goals for the future
Keep building on the skills acquired through this internship, both technical and soft.
Tasks and Challenges
- I tried to merge different datasets for my final presentation but did not get good results, and I was confused about why the errors were arising. I then sought help through Office Hours and managed to work it out.
- Interpreting each and every graph was a tedious task, but I made an effort to understand every plot in detail.
- Had trouble with some datasets because different packages were required to load them. As a result, I learned about the different array types and how the data is organized differently in each type of microarray.