Technical Area: Microarrays, Different types of RNA, R for data analysis
Tools: RStudio, GEO Database, Stem-Away forums, Google Drive
Soft Skills: I am learning how to collaborate in a virtual environment
Three Achievement Highlights
This is my first exposure to Bioinformatics. I am beginning to understand the different data processing and analysis steps and how they fit together.
I am beginning to get better at R with the help of the assigned exercises as well as by exploring related ones.
I got the right answer to Erin’s trivia question (lol)
List of Meetings/ Training Attended Including Social Team Events
Team introduction meetings (Alex and Erin)
Technical webinars
R training (two)
Asana training
Goals For The Upcoming Week
I am still a little bit behind because a lot of this is new to me. However, I will try to catch up by the end of next week and get to know my team. Thank you @egunduz and @yvesgaetan for all the guidance.
Here is an animation that I found helpful in understanding how microRNAs silence/regulate genes:
Technical Area: New R packages (for example, affy), reading the affy QC report, plotting RLE and NUSE histograms, and PCA plots.
Tools: More RStudio experience, Asana, csv files
Soft Skills: Scheduling meetings with teammates, participation in team building (happy hour), working in a team of new people
Three Achievement Highlights
I have become more comfortable with new R packages and how to apply them for different tasks.
I managed to understand the six-page affy QC report by looking through the Bioconductor documentation and through some additional internet searching.
My team managed to finish the deliverable on time despite losing a team member.
List of Meetings/ Training Attended Including Social Team Events
Team meetings
Technical webinars
Happy Hour (Family Feud)
Tasks Completed
I completed all steps of the Week 3 assignment using the Bioconductor package documentation to help solve issues. I initially struggled with plotting the histograms but found that the issue was pretty simple and was able to fix it. I also had to rewrite some code since RStudio crashed, but that wasn’t too much of an issue.
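For anyone new to the RLE plot mentioned above, the quantity behind it can be sketched in a few lines. This is a plain Python illustration, not the Bioconductor code used for the assignment, and the expression values and array names are made up. It assumes the usual definition of relative log expression: each probeset's log expression minus that probeset's median across all arrays.

```python
from statistics import median

# Hypothetical log2 expression values: one row per probeset, one column per array.
# Array "D" is deliberately shifted upward to mimic a poor-quality array.
array_names = ["A", "B", "C", "D"]
log_expr = {
    "probe_1": [7.0, 7.2, 6.9, 9.0],
    "probe_2": [5.1, 5.0, 5.3, 7.2],
    "probe_3": [8.4, 8.5, 8.3, 10.1],
}


def relative_log_expression(log_expr):
    """Subtract each probeset's median (across arrays) from its values."""
    rle = {}
    for probe, values in log_expr.items():
        probe_median = median(values)
        rle[probe] = [v - probe_median for v in values]
    return rle


def per_array_median(rle, n_arrays):
    """Median RLE of each array; good arrays should sit close to zero."""
    return [median(row[i] for row in rle.values()) for i in range(n_arrays)]


rle = relative_log_expression(log_expr)
array_medians = per_array_median(rle, len(array_names))

for name, med in zip(array_names, array_medians):
    print(f"{name}: median RLE = {med:+.2f}")
```

An array whose RLE values are centered far from zero, like "D" in this toy data, is the kind of sample that stands out in the RLE plot.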
Goals For The Upcoming Week
I want to go over the deliverable solutions provided by Yves (thank you for the fast grading!), so I can be ready for the next deliverables. I am also preparing for the presentation on Friday. Also, my happy hour group is in charge of this week’s happy hour, so we are going to prepare that for Friday.
Tasks Completed
I completed all steps of the Week 4 assignment using the Troubleshooting Forum and Bioconductor package documentation to help solve issues. I had an issue with the functions not being compatible with my objects, but I learned how to sort it out. I played around with volcano plots a lot to better understand them.
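What helped me most with volcano plots was thinking of each gene as a point with two coordinates. The sketch below shows that idea in plain Python rather than the R code from the assignment. The gene names, values, and the cutoffs (absolute log2 fold change of at least 1, adjusted p-value below 0.05) are assumptions chosen for illustration only.

```python
import math

# Hypothetical results: (gene, log2 fold change, adjusted p-value).
genes = [
    ("GENE_A", 2.3, 0.001),
    ("GENE_B", -1.8, 0.01),
    ("GENE_C", 0.2, 0.0005),
    ("GENE_D", 3.0, 0.40),
]


def volcano_label(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Classify a gene by the two cutoffs usually drawn on a volcano plot."""
    if adj_p < p_cutoff and log2_fc >= fc_cutoff:
        return "up"
    if adj_p < p_cutoff and log2_fc <= -fc_cutoff:
        return "down"
    return "not significant"


# x axis: log2 fold change; y axis: -log10(adjusted p-value).
points = [
    (name, fc, -math.log10(p), volcano_label(fc, p)) for name, fc, p in genes
]

for name, x, y, label in points:
    print(f"{name}: x = {x:+.1f}, y = {y:.2f}, {label}")
```

A gene needs both a large fold change and a small p-value to be highlighted: GENE_C is significant but barely changes, and GENE_D changes a lot but is not significant, so neither is labeled.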
Goals For The Upcoming Week
I am focusing mainly on learning R at this point so that I can fully understand each step and what it does in the overall project. I plan to catch up on GitHub and Python soon.
Technical Area: New R packages (topGO, clusterProfiler), Gene Ontology Enrichment (enrichGO and groupGO), KEGG and DAVID enrichment analysis
Tools: More RStudio and Asana experience, DAVID (new tool learned)
Soft Skills: Further improvement of presentation skills, Time management
Three Achievement Highlights
Technical deliverable completed on my own (teammates left)
Improved ability to debug issues with code
Improved ability to understand how each step connects to the overall pipeline
List of Meetings/ Training Attended Including Social Team Events
Team meetings
Technical webinars
Happy Hour
Tasks Completed
I completed all steps of the Week 5 deliverable, which covered the groupGO, enrichGO, KEGG, and DAVID functions and tools for enrichment analysis. I had some issues initially because I did the first step in a slightly different way than intended and therefore ended up with an object that the GSEA step would not accept, but after playing around with the vectors, I managed to sort it out. I also had an issue with the dotplots not showing up; debugging helped me realize that I didn’t have enough data because I had started with the wrong data set (the Top 100, not the full table).
Technical Area: R bioinformatics functions: DEG Analysis and Quality Control (affyQCReport, PCA, rma, limma, pheatmap), Gene Ontology Enrichment (enrichGO and groupGO), KEGG (enrichKEGG and pathview) and DAVID enrichment analysis
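To illustrate why a short gene list can leave an enrichment plot empty, here is a small sketch of an over-representation test based on the hypergeometric distribution, which is what this kind of enrichment analysis commonly relies on. It is plain Python, not the clusterProfiler code from the deliverable, and every number in it (universe size, term size, list sizes, hit counts) is hypothetical.

```python
from math import comb


def enrichment_p_value(universe, term_size, list_size, hits):
    """P(X >= hits) when drawing list_size genes from the universe,
    of which term_size are annotated to the term (hypergeometric tail)."""
    total = comb(universe, list_size)
    upper = min(term_size, list_size)
    return sum(
        comb(term_size, k) * comb(universe - term_size, list_size - k) / total
        for k in range(hits, upper + 1)
    )


# Assumed setup: 20000 genes in the universe, 200 annotated to one term.
# Both gene lists hit the term three times more often than chance would.
p_small = enrichment_p_value(20000, 200, 100, 3)     # short list
p_large = enrichment_p_value(20000, 200, 1000, 30)   # longer list

print(f"short list:  p = {p_small:.3g}")
print(f"longer list: p = {p_large:.3g}")
```

With the same threefold enrichment, the short list gives much weaker evidence than the longer one. Under these assumptions a term can fail a significance cutoff simply because too few genes were supplied, which would leave nothing to draw.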
Soft Skills: Presentation skills, Time management, Collaboration/ Communication
Three Achievement Highlights
Bioinformatics programming in the context of a scientific goal
Increased programming knowledge and ability to debug issues
Collaboration with peers and leads to effectively execute steps of the colorectal cancer paper
List of Meetings/ Training Attended Including Social Team Events
Final Presentation
Team Meetings
Happy Hour
Tasks Completed
For these last two weeks, all of the tasks I completed were for my final presentation. I applied each step of the programming pipeline to the new data set we were tasked with, and things went smoothly for the most part, since I had already learned how to deal with most coding issues the first time around. Initially, I forgot to remove some data files that we didn’t need, so R kept saying that the functions wouldn’t work on such a big data set, but luckily I figured out that error rather quickly. Also, I initially removed only 3 samples during quality control and realized afterwards that it would be better to remove a fourth sample as well, but that was also an easy fix. Details of steps and plots are in the attached presentation: Final presentation.pdf (2.1 MB)