Medhini7 - Machine Learning Pathway

Self Assessment 8/4

Technical Area:

  • Web Scraping using BeautifulSoup and Selenium
  • Implemented TF-IDF on scraped data
  • Understood word embeddings and BERT

Tools:

  • Jupyter Notebook
  • GitHub
  • G Suite
  • Slack

Soft skills:

  • Asked the team leads questions during meetings
  • Coordinated with the support group to organize presentation meetings and to ask questions

Three achievements:

  • Implemented web scraping on a student forum (collected data on 4000+ posts)
  • Implemented TF-IDF on the scraped data
  • Understood word embeddings and BERT

List of meetings/trainings attended:
Team meetings on 7/21, 7/28, 7/31, and 8/4; also attended support group meetings

Tasks Completed:

  • Web scraping on a student forum
  • Applied TF-IDF to the scraped data
  • Visualized TF-IDF results using matplotlib

Goals for upcoming week:

  1. Try to implement word embeddings (I did not get a chance to do this by myself)
  2. Implementation of BERT
  3. Ask support group more questions when I am confused

Detailed Statement of Tasks Done:
In the first week, we worked on web scraping, specifically scraping posts from a student forum. We scraped 4000+ posts by modifying the code explained during the first team meeting. Also in the first week, we presented a high-level overview of BERT to understand its underlying mechanisms.

In the second week, I implemented TF-IDF on the scraped data. I adapted the code from office hours, along with other online resources, to handle the large amount of data we were analyzing. From the results, I created a histogram and a bar plot using matplotlib, and I shared them during a presentation in today’s team meeting.
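The TF-IDF step described above can be sketched roughly as follows. This is not the office-hours code, which is not shown here; it is a from-scratch illustration using only the Python standard library. The `tokenize` and `tf_idf` helpers and the three sample posts are hypothetical stand-ins for the real scraped data, and the weighting uses the plain textbook form (term frequency times log of inverse document frequency), which may differ from the variant a library such as scikit-learn applies.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document.

    tf  = count of the term in the post / number of tokens in the post
    idf = log(number of posts / number of posts containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many posts does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty posts
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Made-up stand-ins for scraped forum posts (assumption, not real data)
posts = [
    "How do I scrape a forum with BeautifulSoup",
    "Selenium helps scrape pages that load content dynamically",
    "BERT builds on word embeddings",
]

scores = tf_idf(posts)
for post_scores in scores:
    # Show the three highest-scoring terms in each post
    top = sorted(post_scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print([(term, round(score, 3)) for term, score in top])
```

Terms that appear in every post get a score of zero under this weighting, which is why common words drop out of the top results. On a few thousand posts, a per-post dictionary like this is still manageable, though a sparse-matrix implementation would likely scale better.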
