Overview of things learned:
Week 2 (7/28/20 - 8/3/20)
- Technical - Using R Studio, as well as databases provided in the deliverables, limma matrices and volcano plots were created to compare cancer groups with normal groups
 
- Tools - R Studio, Slack, Google Meet, GitHub, Bioconductor
 
- Soft Skills - communication, problem-solving, cooperation, adaptability, critical thinking
 
Three achievement highlights:
- With group, figured out an efficient method to remove duplicates in the matrix for both gene symbols and probe IDs
 
- With group, determined a method to filter out genes below the 2nd percentile of expression, using the expression dataset created the previous week
 
- Met with group to discuss individual progress, plan out a concise presentation, and get to know each other
 
List of meetings attended including social team events:
Attended all meetings except for happy hour and office hours
- 29/07 - GitHub webinar
 
- 20/07 - Office Hours
 
- 01/08 - Group 3 meeting to discuss deliverables
 
- 03/08 - Team 1 meeting
 
- 04/08 - Team 1 deliverables presentation
 
Goals for the upcoming week:
- Complete deliverables at a quicker pace. Try to get more done in an individual day
 
- Network with my group and others in STEMAway
 
- Practice using R and familiarize myself with certain functions
 
- Practice using GitHub and merging code
 
- Communicate more effectively with team, and make sure all work is completed in a timely fashion
 
Detailed statement of tasks done:
Deliverables:
- Using hgu133plus2.db, created expression matrix with probeset IDs and gene symbols
 
- Filtered out certain genes based on expression and availability of data
 
- Created and analyzed limma matrix
 
- Transferred data from limma matrix into a volcano plot
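
For anyone curious how the last step looks in practice, here is a minimal base R sketch of turning a limma-style results table into a volcano plot. It is not the code we used for the deliverable. The data are simulated, the column names `logFC` and `adj.P.Val` follow what limma's `topTable()` returns, and the cutoffs (`lfc_cut`, `p_cut`) and the helper `draw_volcano()` are my own assumptions for illustration.

```r
set.seed(1)
n <- 200

# Simulated stand-in for a limma results table. Column names mirror
# topTable() output, but the values are random (assumption for illustration).
res <- data.frame(
  logFC     = rnorm(n, sd = 1.5),
  adj.P.Val = runif(n)
)

lfc_cut <- 1     # assumed log2 fold-change cutoff
p_cut   <- 0.05  # assumed adjusted p-value cutoff

# Label each gene as up, down, or not significant.
res$status <- "not significant"
res$status[res$adj.P.Val < p_cut & res$logFC >=  lfc_cut] <- "up"
res$status[res$adj.P.Val < p_cut & res$logFC <= -lfc_cut] <- "down"

# Hypothetical helper: x axis is effect size, y axis is significance.
draw_volcano <- function(res, lfc_cut, p_cut) {
  cols <- c(down = "blue", "not significant" = "grey60", up = "red")
  plot(res$logFC, -log10(res$adj.P.Val),
       col = cols[res$status], pch = 16,
       xlab = "log2 fold change", ylab = "-log10(adjusted p-value)",
       main = "Volcano plot (simulated data)")
  # Dashed lines mark the cutoffs used for the labels above.
  abline(v = c(-lfc_cut, lfc_cut), h = -log10(p_cut), lty = 2)
}
```

Calling `draw_volcano(res, lfc_cut, p_cut)` draws the plot, and `table(res$status)` gives a quick count of genes in each category.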
 
Other:
- Practiced communication with team on Slack
 
- Familiarized myself with GitHub and its functions
 
- Toyed around with Asana (may be able to use it in the future)
 
Challenges and how those challenges were overcome:
- Struggled with removing duplicated keys with collapseRows(). Worked with team to determine an effective method to remove all duplicates in gene symbols
 
- Struggled with filtering out certain genes via expression matrix. Realized that it required data retrieved from the previous week
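
A rough sketch of the two steps that gave us trouble, written in base R on a toy matrix. This is not the exact method the team settled on and it does not use collapseRows(). It assumes one reasonable rule (keep the probe with the highest mean expression per gene symbol), and the probe IDs, symbols, and values are all made up.

```r
# Toy expression matrix: rows are probes, columns are samples (made-up values).
expr <- matrix(
  c(5.1, 5.3, 5.0,
    7.2, 7.0, 7.4,
    2.1, 2.0, 2.2,
    9.0, 9.1, 8.9,
    6.5, 6.4, 6.6),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("p1", "p2", "p3", "p4", "p5"),  # hypothetical probe IDs
                  c("s1", "s2", "s3"))
)

# Hypothetical probe-to-symbol map. GENEA is measured by two probes.
symbols <- c(p1 = "GENEA", p2 = "GENEA", p3 = "GENEB",
             p4 = "GENEC", p5 = "GENED")

# Step 1: for each gene symbol, keep the probe with the highest mean expression.
probe_means <- rowMeans(expr)
ord   <- order(probe_means, decreasing = TRUE)
keep  <- ord[!duplicated(symbols[rownames(expr)][ord])]
dedup <- expr[sort(keep), , drop = FALSE]
rownames(dedup) <- unname(symbols[rownames(dedup)])

# Step 2: drop genes whose mean expression falls below the 2nd percentile.
cutoff   <- quantile(rowMeans(dedup), probs = 0.02)
filtered <- dedup[rowMeans(dedup) >= cutoff, , drop = FALSE]
```

On this toy input the lower-expressed GENEA probe is dropped in step 1, and the lowest-expressed gene is dropped in step 2. On a real matrix with thousands of genes, step 2 removes roughly the bottom 2%.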
 
Overview of things learned:
Week 1 (7/21/20 - 7/27/20)
- Technical - Using R Studio, as well as several quality control, normalization, and batch correction packages, processed data from GEO repositories into an easily analyzable form. Created PCA plots and heatmaps to analyze initial data.
 
- Tools - R Studio, Slack, Bioconductor
 
- Soft Skills - communication, problem-solving, cooperation, persistence, critical thinking
 
Three achievement highlights:
- Learned basics of R, as well as how to perform quality control, normalization, and batch correction with R
 
- Got to know my group (background, hobbies, interests, etc.) and discussed our initial progress and thoughts
 
- Created first PCA plots and heatmaps with R
 
List of meetings attended including social team events:
Attended all meetings except for happy hour and office hours
Goals for the upcoming week:
- Complete deliverables at a quicker pace.
 
- Network with my group and others in STEMAway, get to know my group better
 
- Practice using R and familiarize myself with certain functions
 
- Communicate more effectively with team, and make sure all work is completed in a timely fashion
 
Detailed statement of tasks done:
Deliverables:
- Performed quality control with packages such as arrayQualityMetrics, affyPLM, and simpleaffy to analyze the raw data and remove outliers
 
- Performed normalization with gcrma to standardize data and reduce technical variability that could distort results
 
- Performed batch correction using ComBat (from the sva package) with provided metadata
 
- Created first PCA plots and heatmaps, allowing us to visualize our processed data
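
To illustrate the PCA step, here is a small base R sketch using `prcomp()` on simulated data. It is only an approximation of what the deliverable involved: the matrix, the group labels, and the helper `plot_pca()` are invented for this example, and real input would be the normalized, batch-corrected expression matrix.

```r
set.seed(42)

# Simulated log-expression matrix: 100 genes (rows) x 6 samples (columns).
expr <- matrix(rnorm(600, mean = 8), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6)))
group <- factor(rep(c("normal", "cancer"), each = 3))  # hypothetical labels

# Shift the first 20 genes in the cancer samples to create a group difference.
expr[1:20, group == "cancer"] <- expr[1:20, group == "cancer"] + 5

# prcomp() expects samples in rows, so transpose the genes-by-samples matrix.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Proportion of total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Hypothetical helper: scatter of the first two components, colored by group.
plot_pca <- function(pca, group, var_explained) {
  plot(pca$x[, 1], pca$x[, 2], col = as.integer(group), pch = 16,
       xlab = sprintf("PC1 (%.1f%%)", 100 * var_explained[1]),
       ylab = sprintf("PC2 (%.1f%%)", 100 * var_explained[2]))
  legend("topright", legend = levels(group),
         col = seq_along(levels(group)), pch = 16)
}
```

Calling `plot_pca(pca, group, var_explained)` draws the plot. Samples that cluster away from their own group on a plot like this are the ones worth a second look as possible outliers or batch effects.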
 
Other:
- Practiced communication with team on Slack
 
- Learned the basics of R and its functions
 
Challenges and how those challenges were overcome:
- Sometimes wrote Python syntax rather than R. Fixed this by carefully looking over my code
 
- Struggled with downloading raw data. Fixed this by watching the tutorial more carefully.
 
Overview of things learned:
Week 3 (8/4/20 - 8/11/20)
- Technical - Created several plots to analyze correlations in data, such as GO and KEGG plots
 
- Tools - R Studio, Slack, Bioconductor, Google, GitHub
 
- Soft Skills - communication, problem-solving, cooperation, persistence, critical thinking, independence, patience
 
Three achievement highlights:
- Created first GO and KEGG plots using R. Was able to see correlations and the genes most responsible
 
- Gained a better understanding of the data, as well as the genes primarily driving the patterns in it
 
- Worked with team to resolve small discrepancies in plots
 
List of meetings attended including social team events:
Attended all meetings except for happy hour and office hours
Goals for the upcoming week:
- Begin working on final project
 
- Network with my group and others in STEMAway, get to know my group better
 
- Practice using R and familiarize myself with certain functions
 
- Communicate more effectively with team, and make sure all work is completed in a timely fashion
 
Detailed statement of tasks done:
Deliverables:
- Created GO plots from the data. These show the upregulation and downregulation of certain genes associated with the cancer
 
- Created KEGG plots and performed KEGG analysis. Saw which diseases seem most similar to the cancer in terms of gene involvement
 
- Created a gene concept network, linking genes to certain symptoms
 
- Performed survival analysis, looking at survival curves that relate a given gene to the cancer.
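
As far as I understand it, GO and KEGG enrichment usually comes down to an over-representation test: are more of my differentially expressed genes in a given term or pathway than chance would predict? Below is a minimal base R sketch of that idea using the hypergeometric distribution. It is not the code from the deliverable, and the function name `enrich_p()`, the gene names, and the pathway are all hypothetical.

```r
# Over-representation p-value for one gene set (hypothetical helper).
#   de_genes: differentially expressed genes
#   gene_set: members of one GO term or KEGG pathway
#   universe: all genes that were measured
enrich_p <- function(de_genes, gene_set, universe) {
  de_genes <- intersect(de_genes, universe)
  gene_set <- intersect(gene_set, universe)
  overlap  <- length(intersect(de_genes, gene_set))
  # P(X >= overlap) when drawing length(de_genes) genes without replacement
  # from a universe containing length(gene_set) "successes".
  phyper(overlap - 1,
         m = length(gene_set),
         n = length(universe) - length(gene_set),
         k = length(de_genes),
         lower.tail = FALSE)
}

# Made-up example: 1000 measured genes, a 50-gene pathway, 50 DE genes,
# 20 of which fall inside the pathway (about 2.5 expected by chance).
universe <- paste0("g", 1:1000)
pathway  <- paste0("g", 1:50)
de       <- paste0("g", c(1:20, 501:530))

p <- enrich_p(de, pathway, universe)
```

In a real analysis this test is repeated for every term or pathway, and the p-values are then adjusted for multiple testing (for example with `p.adjust(..., method = "BH")`) before plotting the top hits.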
 
Other:
- Learned more advanced R and its functions
 
Challenges and how those challenges were overcome:
- Arguably the most difficult deliverable. Required lots of time and patience
 
- Some plots were not perfect. Tried making them as correct as possible, but there is still room for improvement
 
Overview of things learned:
Week 4 (8/12/20 - 8/18/20)
- Technical - Started final project
 
- Tools - R Studio, Slack, Bioconductor, Google
 
- Soft Skills - problem-solving, persistence, critical thinking, independence, patience
 
Three achievement highlights:
- Began working on final project. Got to understand lung cancer and its symptoms
 
- Developed further understanding of the genes associated with lung cancer
 
- Learned certain connections between lung cancer and other diseases
 
List of meetings attended including social team events:
Attended all meetings except for happy hour and office hours
Goals for the upcoming week:
- Finish working on final project
 
- Network with my group and others in STEMAway, get to know my group better
 
- Practice using R and familiarize myself with certain functions
 
Detailed statement of tasks done:
Deliverables:
- Performed Quality Control, Normalization, and Batch Correction on raw data
 
- Created own metadata from GEO database
 
- Created heatmaps, GO plots, and KEGG plots
 
- Made conclusions on cancer
 
Other:
- Learned more advanced R and its functions
 
Challenges and how those challenges were overcome:
- Lack of communication with a team was difficult. Worked through this with lots of googling and self-study
 
- Some struggles with quality control. Worked the issue out with help from Anca
 
Overview of things learned:
Week 5 (8/12/20 - 8/17/20)
- Technical - Finalized and presented final project
 
- Tools - R Studio, Google Meet, Slack
 
- Soft Skills - communication, problem-solving, cooperation, persistence, critical thinking, independence, patience, presentation skills, public speaking
 
Three achievement highlights:
- Finished project and prepared presentation
 
- Practiced presentation with Sarah. Received feedback
 
- Gave the presentation in front of Yves and Sarah. Received important constructive criticism.
 
List of meetings attended including social team events:
Attended all meetings except for happy hour and office hours
Goals for the upcoming week:
- Learn more R
 
- Consider other bioinformatics internships
 
Detailed statement of tasks done:
Deliverables:
- Using a bioinformatics pipeline similar to the one learned in the internship, performed preprocessing, processing, and analysis on new raw data for lung cancer.
 
- Made conclusions and compared conclusions with hypothesis
 
- Set future goals for R
 
Other:
- Learned more advanced R and its functions
 
Challenges and how those challenges were overcome:
- Performed extremely poorly on presentation. Will work on public speaking skills and R skills
 
- Time management. Finding a time to present and practice was difficult. Will work on time management in the future.
 
Link: Final Project - Google Slides