Jennifer Helguera - Machine Learning Pathway - Self Assessment

Week: 7/27 – 8/01

Overview of Things Learned:

Technical Area:
Learned a lot about web scraping with BeautifulSoup. I am still trying to understand how to use Scrapy. I also learned how to save scraped data to a CSV file.

Tools: Requests, BeautifulSoup, re, pandas, Scrapy

Achievement Highlights

  • Scraped a website
  • Learning Python
  • Learned how to use HTML and JSON
  • Learned how to use pandas

Meetings attended

7/20 – ML team kick-off meeting

7/27 – Team-4 meeting

7/29 – web scraping check-in

7/31 – web scraping check-in

Goals for the Upcoming Week

  • Learning TF-IDF
  • Exploring the BERT library

Tasks Done

Scraped titles, usernames, and latest updates from the Hopscotch forum, then preprocessed the data and stored it in CSV files. I still need to push it to GitHub.
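As a rough illustration of the scrape-then-save step, here is a minimal sketch. It is not the code used for the project: the HTML fragment, the class names (`topic`, `title`, `user`, `updated`), and the helper names are all hypothetical, and it uses the standard library's `html.parser` in place of BeautifulSoup so it runs without extra installs. The idea is the same either way: walk the markup, pick out the fields, and write one CSV row per topic.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: the real forum's HTML structure is not shown in this log.
SAMPLE_HTML = """
<table>
  <tr class="topic"><td class="title">Welcome thread</td><td class="user">alice</td><td class="updated">Jul 30</td></tr>
  <tr class="topic"><td class="title">Bug reports</td><td class="user">bob</td><td class="updated">Jul 31</td></tr>
</table>
"""

FIELDS = ("title", "user", "updated")


class TopicParser(HTMLParser):
    """Collect one dict per <tr class="topic"> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # name of the <td> currently open, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "tr" and cls == "topic":
            self.rows.append({})  # start a new record
        elif tag == "td" and cls in FIELDS and self.rows:
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside one of the cells we care about.
        if self._field:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None


def scrape_topics(html):
    """Return a list of {'title', 'user', 'updated'} dicts."""
    parser = TopicParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = scrape_topics(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

To write to disk instead of a string, the same `csv.DictWriter` can be pointed at a file opened with `open("topics.csv", "w", newline="")`, where `topics.csv` is a placeholder name.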

Week: 8/10
Overview of Things Learned:

Technical Area: TF-IDF, BERT
Tools: transformers, torch, pandas

Achievement Highlights

  • Learned how to use DistilBERT

  • Learned how to plot information in a graph

Meetings Attended

8/10 - Present Pre-Processing and TF-IDF
8/12 - Check-in on Implementing the BERT Model
8/14 - Present BERT Model implementations

Goals for the Upcoming Week

  • Learning about AWS

Tasks Done

  • Created a TF-IDF model and finished implementing BERT
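For reference, the TF-IDF idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's implementation: it uses whitespace tokenization, term frequency as count divided by document length, and the unsmoothed `log(N / df)` form of inverse document frequency. Libraries often use smoothed or normalized variants, so their numbers will differ. The function name `tf_idf` and the sample sentences are made up for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


docs = ["the cat sat", "the dog barked"]
scores = tf_idf(docs)
for doc, doc_scores in zip(docs, scores):
    print(doc, doc_scores)
```

A term that appears in every document (here "the") gets a score of zero, while terms unique to one document score higher, which is the property that makes TF-IDF useful for picking out distinctive words.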

Week: 8/3
Overview of Things Learned:

Technical Area: Pre-processing
Tools: BeautifulSoup, pandas, copy, re, io, markdown, string, requests, csv

Achievement Highlights:

  • Learned how to clean raw data by getting rid of foreign characters, emoji, and symbols

  • Created a new CSV file with updated text
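The cleaning step described above can be sketched roughly as follows. This is a minimal example, assuming that "foreign characters" means anything outside ASCII and that "symbols" means ASCII punctuation; the function name `clean_text` and the sample string are hypothetical, and the actual project code may have used different rules.

```python
import re
import string


def clean_text(text):
    """Strip non-ASCII characters and punctuation, then tidy whitespace."""
    # Drop every non-ASCII character (this removes emoji and accented letters alike).
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # Replace punctuation and symbols with spaces so neighboring words stay separate.
    text = re.sub("[" + re.escape(string.punctuation) + "]", " ", text)
    # Collapse runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


raw = "Great game!!! 😀 check it out → now"
cleaned = clean_text(raw)
print(cleaned)
```

Note that dropping all non-ASCII characters also removes accented letters from words, so a gentler rule may be preferable when the text is not only in English.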

Meetings Attended

8/5 - Pre-processing Check-in

8/7 - Presenting Pre-processing and TF-IDF

Goals for the Upcoming Week

  • Learn about BERT

Tasks Done

Completed TF-IDF and preprocessing. I was able to clean the raw data and save it to a CSV file, so it is ready for use with BERT.