Piercepatrick - Machine Learning Pathway

Name: Pierce Patrick

Team: Machine Learning Team 8 (Bertinator)

Things learned:
-Technical Area: I have already learned a lot in the first few weeks of working on this project: how web scrapers/crawlers work and how to build one, HTML and CSS, cleaning and manipulating data, a better understanding of the machine learning field, natural language processing, and specific models such as Bidirectional Encoder Representations from Transformers (BERT).
-Tools: Git, GitHub, VS Code, Scrapy, Colab, Python, Pandas
-Soft skills: Working as a team on a collaborative project and getting past language and time zone barriers. Using WhatsApp, Slack, and Kanbanchi.

Achievements/highlights:

  1. Successfully scraping the airline forum for the necessary features
  2. Meeting everyone in the first Zoom meeting
  3. Working in Google Colab for the first time

Meetings attended:
Initial Zoom meeting and one Google Meet meeting. Watched all recordings.

Goals for upcoming week:
Train the forum classification model and see what accuracy I can get.

Tasks done:
Data collection by scraping the airline forum. Faced several obstacles, such as handling the site's infinite scrolling.
Data cleaning, such as converting the HTML in post_text to plain text. Struggled at first to learn how to use the HTML2Text library.
My partners helped tremendously.
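
As a rough illustration of the post_text cleaning step, here is a minimal sketch of turning an HTML fragment into plain text. It uses Python's standard-library html.parser instead of the HTML2Text library that was used in the project, so it is not the team's actual code. The names TextExtractor and html_to_text, and the sample HTML, are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes and drops tags (hypothetical helper for this sketch)."""

    # Tags treated as line breaks; this set is an assumption, not forum-specific.
    BLOCK_TAGS = {"p", "br", "div", "li"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Start a new line whenever a block-level tag opens.
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Keep the visible text between tags.
        self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML fragment, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


sample = "<p>Flight was <b>late</b> &amp; crowded.</p><p>Never again</p>"
print(html_to_text(sample))
```

With the scraped data in a Pandas DataFrame, a function like this could be applied to the post_text column with df["post_text"].apply(html_to_text).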