Aritra_Baksi - Machine Learning Pathway

Things learned:

  • Learned how to scrape the web using BeautifulSoup and Selenium
  • Technical Area: NLP Models
  • Tools: BeautifulSoup, Selenium, BERT, Word2Vec
  • Soft Skills: Teamwork, Remote Collaboration, Managing Deadlines

Tools: GitHub, PyTorch, Google Colab

Achievements:

  • Collected data from 4000+ posts across 50+ webpages and built a CSV dataset from it.

  • Implemented TF-IDF on the data.

  • Visualized the findings using matplotlib.
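
As a rough illustration of the TF-IDF step above (a sketch only, not the code used in the project): the sample posts and the `tf_idf` helper are made up for the example, and the weighting is the plain tf * log(N / df) form, which may differ from what a library such as scikit-learn computes.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency times inverse document frequency.
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up stand-ins for scraped posts.
posts = [
    "bert model for text",
    "scraping text with selenium",
    "bert embeddings",
]
scores = tf_idf(posts)
```

Terms that appear in every post get a weight of zero, while terms unique to one post score highest, which is what makes the weights useful for spotting the distinctive words in each post.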

Meetings attended:

  1. Week 1 Team meetings, 2 hours (x2)
  2. Office Hour, 1

Goals

  1. Implement BERT on scraped data.
  2. Learn more about the different types of BERT models.
  3. Communicate more with the team.

Tasks completed

  • Performed web scraping of thousands of posts.

  • Researched BERT and applied it to the .csv dataset produced by the web scraping.

Things learned:
Technical: learned to scrape data from multiple categories, change the orientation of datasets, and implement BERT.

Tools: GitHub, Jupyter Notebook, Google Colab.

Achievements:

  • Collected data from 4200+ posts across 50+ webpages and 9 categories, and built a CSV dataset with 7 features.

Meetings attended:
Week 4, 2 meetings

Goals

  1. Implement word embeddings on the dataset and plot them so that words with similar meanings appear close together.
  2. Implement BERT on a bigger dataset.
  3. Get started with variations of BERT.

Tasks completed
This week I’ve been working on the dataset and adding multiple features to it, as well as understanding and implementing BERT.