Presentation (Word Document):
**All relevant code is found in the document.**
- The main challenge I faced was the normalization section of the deliverables. I normalized the data after removing the outliers, when I should have normalized it before any outlier removal. I corrected the issue after our first deliverable presentations to avoid invalid data in future deliverables.
- With no previous coding experience other than the training videos provided by STEM-Away, any detail that strayed from the code shown in those videos slowed my progress more than I would have liked. I relied on the troubleshooting thread to overcome this challenge.
- Another issue that occurred was that I removed all the outliers listed in the arrayQualityMetrics index file instead of removing only the major outliers. I resolved this issue before moving on to future deliverables as well.
- Getting in touch with my team members was also difficult due to the situations they were facing outside the internship. With the help of my team lead and technical leads, I was able to complete the first set of deliverables.
Summary of Work:
- Made a schedule/guide for the team, highlighting when deliverables should be due
- Scheduled Google meetings with team members
- Data curation and pre-processing
- Quality control: simpleaffy and arrayQualityMetrics
- Normalization: MAS5 and log2, with boxplots before and after normalization
- Batch effect correction: ComBat
- Visualization: heatmaps before and after batch correction
- Created a deliverable overview document containing created code and visuals
- Presented all data and work during the deliverables meeting
- I wish I had had more time to explore other forms of data normalization, but other than that, I enjoyed the quality control section of the project. The technical leads were very helpful when it came to explaining what we were doing with our data.
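
As a rough illustration of the log2 step listed in the summary above, here is a minimal Python sketch. It is not the code from the deliverable document: the intensity values are made up, and the helper names `log2_transform` and `per_sample_quartiles` are hypothetical. It only shows how a log2 transform compresses the range of each array, which is what the before-and-after boxplots are meant to reveal.

```python
import numpy as np

# Hypothetical probe-by-sample intensity matrix (rows: probes, columns: arrays).
# These values are made up for illustration only.
intensities = np.array([
    [64.0, 128.0, 32.0],
    [256.0, 512.0, 1024.0],
    [16.0, 8.0, 4096.0],
])


def log2_transform(matrix):
    """Return the log2 of a strictly positive intensity matrix."""
    matrix = np.asarray(matrix, dtype=float)
    if np.any(matrix <= 0):
        raise ValueError("log2 requires strictly positive intensities")
    return np.log2(matrix)


def per_sample_quartiles(matrix):
    """Return the 25th, 50th, and 75th percentiles of each column (sample)."""
    return np.percentile(matrix, [25, 50, 75], axis=0)


log_intensities = log2_transform(intensities)

# The quartiles are the numbers a boxplot draws for each array.
quartiles_before = per_sample_quartiles(intensities)
quartiles_after = per_sample_quartiles(log_intensities)
```

Comparing `quartiles_before` with `quartiles_after` gives the same information as the two boxplots in numeric form.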