Week 1
Technical Area
Refamiliarized with the pandas library, mainly DataFrame and associated utilities
Refamiliarized with the regular expression library
Learned Beautiful Soup, Pubmed_parser, and Requests libraries
Jupyter Notebook (via Google Colab)
Magic commands (like %time and %pip)
Beautiful Soup library
Pubmed_parser library
Regular expression library
Requests library
Soft Skills
Facilitated international team meetings and answered general questions
Achievement Highlights
Successfully finished the web crawler and scraped all data
Reviewed the research paper and attached code
Scraped all Medline data, but the process is not time efficient (~7 hours to parse and process)
Upcoming Goals
- Data cleaning
- Feature Engineering
- Stanford parser
- Dependency matrix
Used the web crawler to scrape the required data from the Medline database website and created a CSV file to hold the abstract data.
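
As an illustration of the CSV step only, here is a minimal sketch using Python's standard csv module. It is not the crawler itself: fetching and parsing are left out, and the column names (pmid, title, abstract), the function name write_abstracts_csv, and the sample records are all hypothetical placeholders rather than the actual fields scraped.

```python
import csv
import io

# Hypothetical column layout; the real crawler's fields may differ.
FIELDNAMES = ["pmid", "title", "abstract"]


def write_abstracts_csv(records, out_file):
    """Write an iterable of record dicts to out_file as CSV.

    Returns the number of rows written. Fields missing from a record
    are written as empty strings so every row has the same columns.
    """
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow({name: record.get(name, "") for name in FIELDNAMES})
        count += 1
    return count


# Made-up sample records standing in for scraped data.
sample_records = [
    {"pmid": "0001", "title": "First title", "abstract": 'Text with a comma, and a "quote".'},
    {"pmid": "0002", "title": "Second title"},  # abstract missing
]

# Write to an in-memory buffer; a real run would open a file instead,
# e.g. open("abstracts.csv", "w", newline="", encoding="utf-8").
buffer = io.StringIO()
rows_written = write_abstracts_csv(sample_records, buffer)
csv_text = buffer.getvalue()
```

The csv module handles quoting of commas and quotation marks inside abstracts, which is the main reason to prefer it over joining strings by hand. The resulting file can then be loaded with pandas for the data cleaning step.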