Daniel_Drucker - Bioinformatics Pathway

Self Assessment 7/28
Overview of Things Learned

  • Technical: R syntax and use of specific libraries for data cleaning
  • Tools: R Studio, Slack
  • Soft Skills: Navigating the STEM-Away website (which got easier with continued use), team communication

Achievement Highlights

  • Completing specific deliverables
  • Interpreting the details of an academic paper
  • Got along well with my teammates, whom I had only just started getting to know

Meetings Attended

  • Monday 7/20 info session
  • Tuesday 7/21 Webinar
  • Wednesday 7/22 technical presentation
  • Sunday 7/26 subteam meeting

Tasks Done

  • Using affyPLM, produced boxplots for the data, as well as RLE and NUSE graphs
  • Normalized data with gcrma and identified outliers
  • Corrected batch effects, yielding a data set suitable for analysis in the following steps
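
As an illustration of what an RLE (relative log expression) plot summarises, here is a minimal sketch in plain Python rather than R. It assumes already-summarised log2 expression values; the probe names, numbers, and the 0.5 cut-off are all hypothetical. In affyPLM the values come from a fitted probe-level model, so this shows only the centring idea, not that package's implementation.

```python
from statistics import median

# Hypothetical log2 expression values: one entry per probe set,
# one value per array (three arrays here).
log_expr = {
    "probe_1": [8.0, 8.2, 9.0],
    "probe_2": [5.0, 5.1, 6.0],
    "probe_3": [10.0, 9.9, 11.0],
}
n_arrays = 3


def rle_by_array(expr, n_arrays):
    """Return one list of RLE values per array."""
    per_array = [[] for _ in range(n_arrays)]
    for values in expr.values():
        centre = median(values)  # typical level of this probe set across arrays
        for i, value in enumerate(values):
            per_array[i].append(value - centre)
    return per_array


rle = rle_by_array(log_expr, n_arrays)
array_medians = [median(column) for column in rle]

# Hypothetical cut-off: an array whose median RLE sits far from zero
# is a candidate outlier worth inspecting in the boxplot.
THRESHOLD = 0.5
suspect_arrays = [i for i, m in enumerate(array_medians) if abs(m) > THRESHOLD]
print(suspect_arrays)
```

A well-behaved array has RLE values centred near zero with a narrow spread, which is what the boxplots are read for.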

Goals for the week

  • Communicate with teammates more quickly
  • Get started on the deliverables sooner
  • Practice R syntax

Challenges and how those challenges were overcome

  • Ran into a memory allocation error in the normalization step. A post on the STEM-Away forum helped me resolve it, since another team member had hit a similar issue. The fix was essentially a one-line command permitting R Studio to allocate more RAM
  • My unfamiliarity with the R language made the tasks difficult to complete. My group-mate, who is more familiar with R, helped me understand the semantic details I had missed in the visualization steps, and I ultimately produced an adequate table of cleaned, corrected data

Self Assessment 8/4
Overview of Things Learned

  • Technical: Becoming more familiar with R syntax and semantics, interpreting biological meaning while working on the data processing, presenting results concisely and meaningfully
  • Tools: Still more of R Studio, Google Slides (to clarify, I didn’t learn how to make slide decks this week, but it had been a rather long time since I’d needed to make one)
  • Soft Skills: Managing time for deliverables, public speaking (over webinar), initiating team discussions as team lead of the week

Achievement Highlights

  • Independently completing and understanding certain deliverable tasks
  • Successfully putting together a presentation on the week’s work
  • Organizing team meeting and acting to catalyze team discussion as the team lead for the week

Meetings Attended

  • Monday 7/27 team meeting addressing the week’s deliverables
  • Tuesday 7/28 deliverables presentations and networking webinar
  • Saturday 8/1 team meeting to discuss deliverables and presentation

Tasks Done

  • Annotation: used the gene database to associate the IDs in our data set with interpretable gene symbols
  • Filtered out redundant ID-to-symbol mappings, as well as N/A genes
  • Ran the limma analysis to yield statistical information on the differential expression values of our genes (I finished this independently for the sake of my own understanding, though I was behind my peers and it was ultimately my teammate’s results that were presented)
  • Produced the differential expression data to be used in functional analysis
  • Put together and delivered our team’s presentation for the week
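
The filtering step above can be sketched as follows, in plain Python rather than R. The probe IDs, symbols, and expression values are hypothetical, and the rule for resolving redundant mappings (keep the probe with the highest mean expression per symbol) is an assumption chosen for illustration, not necessarily what the team used.

```python
# Hypothetical annotation result:
# (probe ID, gene symbol or None when no symbol was found, mean log2 expression).
annotated = [
    ("probe_1", "GENE_A", 7.2),
    ("probe_2", "GENE_A", 9.1),  # second probe mapping to the same gene
    ("probe_3", None, 6.5),      # N/A: no symbol in the annotation database
    ("probe_4", "GENE_B", 5.0),
]


def filter_mappings(rows):
    """Drop N/A symbols and keep one probe per symbol (highest mean expression)."""
    best = {}
    for probe_id, symbol, mean_expr in rows:
        if symbol is None:
            continue  # unannotated probe, nothing interpretable to report
        current = best.get(symbol)
        if current is None or mean_expr > current[1]:
            best[symbol] = (probe_id, mean_expr)
    return {symbol: probe for symbol, (probe, _) in best.items()}


symbol_to_probe = filter_mappings(annotated)
print(symbol_to_probe)
```

In R the same idea is usually expressed with `is.na()` and `duplicated()` on the annotated data frame, after sorting it so the preferred probe for each symbol comes first.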

Goals for the week

  • Learn to use GitHub
  • Be more independent in my troubleshooting
  • Practice R syntax (again)

Challenges faced and how those challenges were overcome

  • Difficulty understanding and working with R syntax. In particular, keeping track of the parts of a data frame, such as row.names, throughout the data processing was not intuitive to me. Help from my group-mates clarified these things

Self Assessment 8/11
Overview of Things Learned

  • Technical: R libraries for functional analysis
  • Tools: GitHub, R Studio
  • Soft Skills: Social skills with team, independence in motivation

Achievement Highlights

  • Completed deliverables much more quickly than the previous weeks
  • Got over my nerves to reach out for help
  • Was more independently capable of making progress on the deliverables

Meetings Attended

  • Monday 8/3 team meeting
  • Tuesday 8/4 prior week’s deliverables presentations

Goals for the week

  • Work on STRING database and gene set enrichment analysis
  • Learn more about the specific cellular functions in the functional analysis
  • Attend office hours to clarify semantic issues I hadn’t resolved since early in the program

Tasks Done

  • Organized differentially expressed genes by statistical significance
  • Performed Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment
  • Output plots of cellular function figures
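
A minimal sketch of what ordering genes by statistical significance can involve, in plain Python rather than R. The gene names and p-values are hypothetical. The multiple-testing correction shown is Benjamini-Hochberg, which is the default adjustment in limma's `topTable`; whether that default was used for these deliverables is an assumption.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value can be
    # capped by the ones ranked after it (this keeps the result monotone).
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from a differential expression test.
results = {"GENE_A": 0.01, "GENE_B": 0.04, "GENE_C": 0.03, "GENE_D": 0.005}

genes = list(results)
adj = dict(zip(genes, bh_adjust([results[g] for g in genes])))

# Most significant genes first.
ranked = sorted(genes, key=lambda g: results[g])
for gene in ranked:
    print(gene, results[gene], round(adj[gene], 4))
```

The genes that pass a chosen adjusted p-value cut-off are the ones that would typically be passed on to the enrichment step.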

Challenges faced and how those challenges were overcome

  • Could not work out exactly what the cellular functions from the gene ontology enrichment step meant. My group mate, who has a stronger biology background than I do, was able to provide some technical knowledge
  • Ambiguity in the documentation of certain functions. Through trial and error I figured out the correct syntax

Self Assessment 8/18
Overview of Things Learned

  • Technical: STRING database, R troubleshooting
  • Tools: STRING
  • Soft Skills: public speaking, time management, team communication

Achievement Highlights

  • Worked on deliverables independently
  • Worked on my public speaking skills outside of this program
  • Managed my time with another project I’ve been working on

Meetings Attended

  • Monday 8/10 team meeting
  • Friday 8/14 deliverables presentation

Goals for the week

  • Work on my independent analysis of the Alzheimer’s dataset
  • Clean up previous weeks’ code for book-keeping’s sake
  • Write a CV, including my work here in this project

Tasks Done

  • Performed transcriptional analysis and output a gene concept map to illustrate relations between genes
  • Used STRING database to associate gene relations and functions

Challenges faced and how those challenges were overcome

  • Found the output of STRING to be completely unintuitive. I don’t necessarily know that I’ve completely resolved this problem, but I’ve certainly worked on trying to understand it a little better
  • Concept network plots were extremely messy when output. I spent extra time studying the arguments that dictate the aesthetics of the output

Presentation link: https://docs.google.com/presentation/d/1Vc7sZyclUhGo7_US-A0d_nyGzygp3L69u4a-XT2DJhc/edit?usp=sharing

Self Assessment 8/25
Overview of Things Learned

  • Technical: Biological background of my Alzheimer’s data set (i.e., Braak stage discretizing)
  • Tools: R Studio, data visualization methods especially
  • Soft Skills: Independent motivation, public speaking

Achievement Highlights

  • Completed the entire pipeline independently
  • Made sense of a data set structured differently than the dataset we worked on previously
  • Delivered my presentation successfully, and have a clear finished product to show for what I’ve learned here

Meetings Attended

  • Monday 8/17 office hours
  • Monday 8/17 team meeting
  • Wednesday 8/19 office hours
  • Monday 8/24 final team meeting

Goals for the week

  • Clean up my code
  • Add figures to my final output documented on this platform
  • Do more research about the cellular functions that come up from my enrichment analysis

Tasks Done

  • Data cleaning (normalization and batch correction)
  • Differential gene analysis, using limma to output statistical data on gene expression differentials
  • Gene ontology enrichment analysis to show the functions of the genes affected in the Alzheimer’s samples

Challenges faced and how those challenges were overcome

  • The normalization step was different for this set because the data could not be given to the functions in the affyPLM library. Batch correction was extremely confusing to me because it wasn’t clear how the data could even be distinguished by batches. Nonetheless, with the mentors’ help at office hours, I produced a reasonable output
  • The data set used many non-standardized gene symbols, which caused a major error in a few stages. For the gene ontology enrichment step, I looked up the aliases of the gene symbols used and tediously renamed the members of the data frame with the highest differential expression values
  • Presenting my work was amazingly nerve-wracking because I lacked confidence in my independent work. Coming into this program without a biology background, I had no way to talk about the cellular functions that came out of the gene ontology enrichment step. That is the part that interested me most, so it is a step I would have loved to discuss with greater intuitive understanding. Reassurance from peers and mentors helped. I feel I can sate my curiosity by simply continuing to do research even though this program has ended
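
For the non-standardized gene symbols described above, the manual renaming can be sketched as a lookup table, shown here in plain Python rather than R. Every symbol below is a hypothetical placeholder; real entries would come from looking up each alias individually, as was done by hand for this data set.

```python
# Hypothetical alias table: non-standard symbol -> official symbol.
ALIAS_TO_OFFICIAL = {
    "OLD_SYMBOL_1": "GENE_A",
    "OLD_SYMBOL_2": "GENE_B",
}

# Hypothetical set of symbols the enrichment step is assumed to recognise.
OFFICIAL = {"GENE_A", "GENE_B", "GENE_C"}


def standardise(symbols, alias_table, official):
    """Rename known aliases and report symbols that still need a manual look-up."""
    renamed, unresolved = [], []
    for symbol in symbols:
        new_symbol = alias_table.get(symbol, symbol)  # unchanged if not an alias
        renamed.append(new_symbol)
        if new_symbol not in official:
            unresolved.append(symbol)
    return renamed, unresolved


# Hypothetical top differentially expressed genes.
top_genes = ["OLD_SYMBOL_1", "GENE_C", "OLD_SYMBOL_2", "MYSTERY_1"]
renamed, unresolved = standardise(top_genes, ALIAS_TO_OFFICIAL, OFFICIAL)
print(renamed)
print(unresolved)
```

Keeping the table in one place means the renaming can be rerun if the list of top genes changes, and the unresolved list shows exactly which symbols still need attention.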