Meenoti2001 - Bioinformatics Pathway

Week 2/3 Self Assessment
Overview of things learned:

  • Technical - RStudio and performing quality control on data
  • Tools - RStudio, Google Meet, Slack
  • Soft Skills - Scheduling meetings within the subgroup, working on a presentation virtually, and overall presentation skills

Three achievement highlights:

*Learned more about RStudio and quality control by successfully completing the week 1 deliverables
*Served as task lead for subgroup 2; met with my group and worked together to make a presentation
*Attended office hours to clarify a step in the deliverables

List of meetings attended including social team events:
Monday meeting, Thursday office hours, Friday happy hour, two subgroup meetings

Goals for the upcoming week:

*Update my LinkedIn profile for networking
*Learn more about GitHub
*Clearly understand each step in the deliverables and focus more on analysis

Detailed statement of tasks done:

*Quality control using affyQCReport
*Data normalization using RMA and creating boxplots
*Batch correction
*PCA plotting using ggplot (three separate plots)

*Organized subgroup meetings through a Slack channel and worked on the presentation with other group members
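
As an illustration of the normalization step listed above: RMA combines background correction, quantile normalization, and probe set summarization. The sketch below covers only the quantile normalization part, in plain Python rather than R, with made-up intensity values. The function name `quantile_normalize` is hypothetical and this is not the code used for the deliverables.

```python
from statistics import mean


def quantile_normalize(columns):
    """Quantile-normalize equal-length lists of intensities, one list per array."""
    n = len(columns[0])
    # Sort each array, then average across arrays at each rank.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(col[i] for col in sorted_cols) for i in range(n)]

    normalized = []
    for col in columns:
        # Indices of this array ordered from lowest to highest intensity.
        # Ties are broken by position here; real implementations handle ties more carefully.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Replace each value with the reference value at the same rank.
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Made-up intensities for three arrays and four probes (illustration only).
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
```

After this step every array has the same set of values, only in a different order, which is why the boxplots of normalized arrays line up.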

Challenges:

*I struggled a little with PCA plotting because initially my plots looked different from my group members' plots and from the ones in the document. They did not show any clusters in the sample data. However, after meeting with my group and comparing plots in our Slack channel, we resolved the issue.

*I was unsure about my batch-corrected data because I did not know exactly what to pass as arguments to the ComBat function. After going to office hours, I confirmed that my steps were correct.


Week 4 Self Assessment

Overview of Things Learned:

  • Technical - RStudio, gene analysis and visualization
  • Tools - RStudio, Google Meet, Slack
  • Soft Skills - Collaborating/meeting with subgroup, communication and problem-solving skills

Three achievement highlights:

*Learned about differential gene expression and visualization by completing the week 2 deliverables
*Worked with subgroup members to compare our results and create a presentation with focus on gene filtering
*Created a GitHub account and updated my LinkedIn profile a bit

List of meetings attended including social team events:
Monday meeting, sub group meetings, watched GitHub webinar recording

Goals for the upcoming week:

*Start the next set of deliverables earlier and take advantage of office hours
*Understand each step in the functional analysis deliverables
*Try to identify why my subgroup members and I are getting different heatmaps/volcano plots

Detailed statement of tasks done:

  • Annotation using the hgu133plus2.db package to map the probe set IDs to genes

  • Used collapseRows() to get rid of duplicated gene symbols

  • Gene filtering: got rid of missing values using the na.omit() function

  • Filtered out low-expression genes using the quantile() function on the mean of each row

  • Created a model matrix to compare the cancer and normal groups

  • Used topTable() to extract the DEGs based on adjusted p-values, plotted the top 50 DEGs in a heatmap, and made a volcano plot

  • Wrote the top 100 differentially expressed genes to a file

  • Communicated with the subgroup through a Slack channel and worked on the presentation with other group members
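
Two of the steps above, the quantile filter on row means and the p-value adjustment, can be sketched in plain Python. This is only an illustration with made-up numbers, not the R code from the deliverables. The 25% cutoff, the gene names, and the function names are assumptions, and Benjamini-Hochberg is assumed as the adjustment method because the notes above do not say which one was used.

```python
from statistics import mean


def quantile(values, p):
    """Quantile by linear interpolation between sorted values (0 <= p <= 1)."""
    s = sorted(values)
    pos = p * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def filter_low_expression(expr, fraction=0.25):
    """Keep genes whose mean expression is above the given quantile of all row means."""
    row_means = {gene: mean(values) for gene, values in expr.items()}
    cutoff = quantile(list(row_means.values()), fraction)
    return {gene: values for gene, values in expr.items() if row_means[gene] > cutoff}


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest, keeping a running minimum.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, idx in enumerate(order):
        rank = m - k  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up expression values: gene -> one value per sample (illustration only).
expr = {
    "GENE_A": [2.0, 2.2, 1.8],
    "GENE_B": [8.0, 8.5, 7.9],
    "GENE_C": [5.0, 5.5, 4.5],
    "GENE_D": [3.1, 3.0, 3.2],
    "GENE_E": [10.0, 11.0, 9.0],
}
kept = filter_low_expression(expr, fraction=0.25)

# Made-up raw p-values (illustration only).
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

Changing the cutoff fraction changes which genes reach the model, so group members using different cutoffs would not end up with identical DEG lists.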

Challenges:

  • Overall, I found this week's deliverables to be more challenging than last week's. I had trouble with the collapseRows() function, but was able to resolve the issue by working with members of my subgroup.
  • Our group found that we had different heatmaps/volcano plots, and we were unable to figure out why our results differed. We tried to troubleshoot this issue while meeting, and I will continue to check whether there are any problems earlier in my code.
  • I felt like I started the week 2 deliverables a bit later than last week, and was not able to identify my issues before office hours. This week, I hope to be more on top of the deliverables so I can get my questions addressed at office hours.

Week 5 Self Assessment
Overview of things learned:

  • Technical - Gene Ontology Analysis and Visualization, KEGG Analysis, Gene concept network visualization, gene set enrichment analysis
  • Tools - RStudio, Google Meet, Slack
  • Soft Skills - Team communication, problem solving and troubleshooting within the team

Three achievement highlights:

  • Communicated with team members effectively and worked together to troubleshoot problems.
  • Completed the deliverables and learned about functional enrichment analysis.
  • Tried to get a better understanding of the biological aspect of the data.

List of meetings attended including social team events:
Monday meeting, sub group meetings, office hours

Goals for the upcoming week:

  • Continue working on understanding the biology side of the data we are analyzing
  • Attend happy hour to meet new people
  • Effectively manage my time to complete the final presentation

Detailed statement of tasks done:

  • Defined a significant DEGs vector
  • Converted the gene symbols into gene IDs
  • Performed enrichGO analysis and used the setReadable function to convert the gene IDs back to gene symbols, then visualized the result with a bar plot
  • Used enrichKEGG to perform pathway analysis and visualized the enrichKEGG result using dotplot
  • Used enrichDGN and setReadable and plotted two cnetplots
  • Found hub genes using the STRING database
  • Performed GSEA to analyze hallmark genes- plotted gene set and enrichment score distribution

*Communicated with the subgroup through a Slack channel and worked on the deliverables with other group members
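
The enrichment analyses listed above rest on an over-representation test: given a list of significant DEGs, is a GO term or pathway hit more often than chance would predict? A minimal sketch of that hypergeometric calculation in plain Python follows. The function name and all counts are made up for illustration, and the R functions additionally handle annotation lookup and multiple-testing correction.

```python
from math import comb


def enrichment_pvalue(universe_size, term_size, selected_size, overlap):
    """Probability of seeing at least `overlap` annotated genes among the selected genes.

    universe_size: number of genes in the background set
    term_size:     number of background genes annotated to the term
    selected_size: number of significant DEGs
    overlap:       number of significant DEGs annotated to the term
    """
    upper = min(term_size, selected_size)
    total = comb(universe_size, selected_size)
    # Sum the hypergeometric probabilities for overlap, overlap + 1, ..., upper.
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, selected_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 20 background genes, 5 annotated to the term,
# 4 significant DEGs, 2 of which carry the annotation.
p = enrichment_pvalue(universe_size=20, term_size=5, selected_size=4, overlap=2)
```

Because the result depends on which genes are in the significant DEGs vector, an incorrectly defined vector changes every downstream enrichment result and plot.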

Challenges:

*I struggled with creating the correct significant DEGs vector for the first step. This led to problems while plotting the cnetplot. I was able to troubleshoot this issue by going to office hours and consulting my group members.
*There were errors during the visualization of the GSEA result. I was able to fix this problem by changing the geneVector and finding help on the forum.


Week 6/7 Self Assessment
Overview of things learned:

  • Technical - Going through the quality control, normalization, differential gene expression analysis, and functional analysis steps using a new data set. Creating a final presentation using the results.
  • Tools - RStudio, Google Meet/Slides, Slack
  • Soft Skills - Creating and presenting the final project

Three achievement highlights:

  • Created and presented a 10-15 minute long presentation on a new colorectal cancer data set
  • Completed QC, normalization, DEG, and functional analysis on a new data set
  • Gained a better understanding of the biological aspect of the data and included that in the presentation

List of meetings attended including social team events:
Monday meeting, sub group meetings, office hours

Goals for the upcoming week:

  • Continue to learn more about data analysis in R
  • Update my resume

Detailed statement of tasks done:

  • Conducted quality control using affyQCReport, simpleaffy, and arrayQualityMetrics to identify the outliers
  • Removed the outliers from the downloaded data set and compared boxplots and PCA plots before and after RMA normalization
  • Got the metadata using the instructions from Google Drive
  • Performed annotation and gene filtering, which removed the NA values and low-expression genes
  • Conducted analysis with limma, plotted volcano plots and heatmaps, and wrote the top 100 DEGs to a file
  • Defined a significant DEG vector and converted the gene symbols into gene IDs
  • Performed enrichGO, enrichKEGG, and enrichDGN analysis and visualized the results
  • Performed GSEA analysis to analyze hallmark genes
  • Created a presentation explaining the results
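
To illustrate how a volcano plot like the ones above separates genes, the sketch below labels each gene from its log fold change and adjusted p-value and computes the y-axis value. It is plain Python with made-up numbers. The cutoffs (absolute log2 fold change of 1 and adjusted p-value of 0.05) and the function names are assumptions, not values taken from the project.

```python
from math import log10


def classify_gene(log_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as up, down, or not significant under the assumed cutoffs."""
    if adj_p < p_cutoff and log_fc >= fc_cutoff:
        return "up"
    if adj_p < p_cutoff and log_fc <= -fc_cutoff:
        return "down"
    return "not significant"


def volcano_points(results):
    """Return (gene, x, y, label) tuples, where x = log fold change and y = -log10(adjusted p)."""
    return [
        (gene, log_fc, -log10(adj_p), classify_gene(log_fc, adj_p))
        for gene, (log_fc, adj_p) in results.items()
    ]


# Made-up results: gene -> (log2 fold change, adjusted p-value).
results = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.7, 0.01),
    "GENE_C": (0.4, 0.0005),
    "GENE_D": (3.0, 0.2),
}
points = volcano_points(results)
```

GENE_C has a very small adjusted p-value but a small fold change, and GENE_D has a large fold change but is not significant, so neither is counted as a DEG under these cutoffs.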

Challenges:

  • I had to change the data set I was using a few times because I was running into problems during the quality control step. However, this was resolved with some trial and error.
  • I spent some more time understanding the biological aspect of the data so that I could give an in-depth explanation during the presentation.