Aaditya - Machine Learning (Level 1) Pathway

Technical Area:

  1. Familiarized myself with the stemAWAY website and forums.
  2. Learned the two main approaches for building a recommender system: content-based filtering and collaborative filtering.
  3. Learned the pros, cons, and requirements for building a recommender system using the content-based approach (diversity, explainability, and relevancy).
  4. Explored the possible models for filtering data in this approach (classification and regression models).
  5. Learned about the typical workflow of handling data for ML (choosing a dataset, pre-processing, tabulation, creating word representations, and vectorizing the data).
  6. Built a web scraper for the quotes.toscrape.com site and learned how to use it for basic data mining tasks.
  7. Learned the basics of version control, Git, and GitHub.
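
To make items 2, 3, and 5 concrete, here is a minimal sketch of a content-based recommender: each item's description is turned into a word-count vector, and items are ranked by cosine similarity to something the user liked. The catalog, titles, and function names are made up for illustration, and plain word counts stand in for the richer word representations covered in the pathway.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Lowercase the text, split it into words, and count each word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(liked_text, catalog, top_n=2):
    """Return the titles whose descriptions are most similar to liked_text."""
    liked = vectorize(liked_text)
    scored = [
        (cosine_similarity(liked, vectorize(description)), title)
        for title, description in catalog.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_n]]


# Hypothetical catalog: title -> short description.
catalog = {
    "Intro to Web Scraping": "scrape web pages with python and parse html",
    "Git Basics": "version control with git branches and commits",
    "NLP Workflow": "clean text and build word vectors with python",
}

print(recommend("python web scraping and html parsing", catalog, top_n=1))
```

Because the ranking comes straight from shared words, it is easy to explain why an item was recommended, which is the explainability advantage noted in item 3.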

Tools: VS Code, GitHub

Soft Skills: time management, problem solving (the additional resources section helped a lot in understanding how to use Beautiful Soup and the complete NLP workflow, especially Alice Zhao’s YouTube lectures).

Tasks Completed:

  1. I prepared my workspaces. VS Code was already set up, but I had very little experience with Jupyter notebooks and Colab. I installed the Beautiful Soup, Selenium, spaCy, sentence-transformers, transformers, and PyTorch libraries.

  2. I familiarized myself with the Beautiful Soup and requests libraries. I am new to Python, but my experience with JS helped me get a good enough grasp to follow along with the Web Scraping intro. I also made my own web spider.

  3. I attempted to follow the workflow of preparing a dataset for an ML program, but I could only follow along through the pre-processing and word-representation steps (I could understand the video, but couldn’t replicate anything beyond those steps).

  4. I read about how BERT and a logistic regression model can be used together to classify sentiment as positive or negative.
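
As a rough illustration of the second half of the approach in task 4, the sketch below trains a logistic regression classifier on sentence vectors in plain Python. The 2-D vectors and labels are made-up stand-ins: in the real pipeline each vector would be a BERT sentence embedding with hundreds of dimensions, and a library classifier would normally replace this hand-written training loop.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (positive) or 0 (negative) for one feature vector."""
    prob = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if prob >= 0.5 else 0


# Made-up 2-D vectors standing in for BERT sentence embeddings.
features = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9], [0.1, 0.7]]
labels = [1, 1, 0, 0]  # 1 = positive sentiment, 0 = negative sentiment

weights, bias = train(features, labels)
print(predict(weights, bias, [0.85, 0.2]))
print(predict(weights, bias, [0.15, 0.8]))
```

The division of labor is the point here: BERT only turns each sentence into a vector, and the much simpler logistic regression model learns which regions of that vector space correspond to positive or negative sentiment.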