Data Curation and Processing Assessment
- I learned a great deal this week. Because no exact script was provided, I had to troubleshoot issues on my own, using the skills I learned while training for the program.
- I learned how to analyze the plots from both a technical and a scientific perspective.
- I was also able to collaborate with my team members to complete the project and present the results.
- Working collaboratively with my team members
- Learning how to work with different packages in R
- Troubleshooting issues that arose during the project
List of Meetings
- The team meeting that I attended was a great introduction to the program, since I could talk to and hear from other members.
- The biology webinar was extremely helpful. It put the project into perspective from a biological point of view, since the portion that I did was mainly technical.
- I was unable to attend the Happy Hour because of a poor internet connection.
Goal for the Week
- My goal for the upcoming week is to collaborate more with all of my team members and to see how the tasks I performed can be applied to other projects.
Tasks and Challenges
- Most of the code that I ran took a long time to process, which slowed down my progress to a certain extent.
- In addition, since this was my first time coding in R by myself, I often had to look up what each step did and the reasoning behind it.
- The data was loaded in and merged.
- Quality control using simpleaffy and arrayQualityMetrics was done on the raw data.
- The results were visualized using various plots to identify possible outliers and to determine the quality of the data.
- The data was then normalized using RMA.
- Once again, the resulting data was visualized to identify the outliers, which were then removed.
- The outlier-free data was normalized again, and batch correction was applied to the clean data so that samples would not cluster by batch.
- Once again, the results were visualized using plots. The plots used included heatmaps, PCA plots, boxplots, RLE plots, and NUSE plots.
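
The outlier check described in the steps above can be illustrated with a small sketch. The project itself was done in R with Bioconductor packages, so the Python below is not the code that was run; it only shows the idea behind a relative log expression (RLE) check, where each array is compared against the per-probe median across all arrays. The sample names, the toy values, the function names, and the 0.5 threshold are all assumptions made for illustration.

```python
from statistics import median


def rle_medians(expr):
    """Return the median relative log expression (RLE) of each array.

    expr maps an array name to its log2 expression values, with the
    probes in the same order for every array.
    """
    names = list(expr)
    n_probes = len(expr[names[0]])
    # The median of each probe across all arrays is the reference level.
    probe_medians = [
        median(expr[name][i] for name in names) for i in range(n_probes)
    ]
    # RLE = array value minus probe median; summarise each array by its median.
    return {
        name: median(v - m for v, m in zip(expr[name], probe_medians))
        for name in names
    }


def flag_outliers(expr, threshold=0.5):
    """List arrays whose median RLE is further from zero than the threshold.

    The threshold is an illustrative assumption, not a project value.
    """
    return sorted(
        name for name, med in rle_medians(expr).items() if abs(med) > threshold
    )


# Hypothetical toy data: sample_D is shifted upward relative to the others.
expr = {
    "sample_A": [7.0, 8.1, 9.2, 6.5],
    "sample_B": [7.1, 8.0, 9.0, 6.6],
    "sample_C": [7.0, 8.2, 9.1, 6.4],
    "sample_D": [8.5, 9.6, 10.4, 7.9],
}
outliers = flag_outliers(expr)
print(outliers)
```

In a well-behaved data set the median RLE of every array sits close to zero, so an array that is clearly shifted away from zero is a candidate for removal before the data is normalized again.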