Name: Pierce Patrick
Team: Machine Learning Team 8 (Bertinator)
-Technical Area: I have already learned so much in the first few weeks of working on this project! I learned how web scrapers/crawlers work and how to build one, HTML and CSS, how to clean and manipulate data, and I gained a better understanding of the machine learning field, natural language processing, and specific models such as Bidirectional Encoder Representations from Transformers (BERT).
-Tools: Git, GitHub, VS Code, Scrapy, Colab, Python, Pandas
-Soft skills: Working as a team on a collaborative project and getting past language and time zone barriers. Using WhatsApp, Slack, and Kanbanchi.
- Successfully scraping the airline forum for necessary features
- Meeting everyone in the first Zoom meeting
- Working in Google Colab for the first time
Attended the initial Zoom meeting and one Google Meet meeting. Watched all recordings.
Goals for upcoming week:
Train the forum classification model and see what accuracy I can get.
Data collection: scraped the airline forum. Faced many obstacles, such as handling the site's infinite scrolling.
Data cleaning: converted the HTML in post_text to plain text. Struggled at first with learning how to use the HTML2Text library.
My partners helped tremendously.
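
For reference, the HTML-to-text cleaning step can be sketched roughly as below. This is only an illustration and not the code we actually ran: it uses Python's built-in html.parser instead of the HTML2Text library, and the names _TextExtractor and html_to_text, as well as the choice of which tags start a new line, are my own assumptions.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text of an HTML fragment and drops the tags (hypothetical helper)."""

    # Assumption: these tags should start a new line in the plain-text output.
    BLOCK_TAGS = {"p", "br", "div", "li", "blockquote"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one non-empty line per block."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(html_to_text("<p>Flight was <b>delayed</b></p><p>Lost my bag &amp; my patience</p>"))
```

With Pandas, a function like this could be applied to the whole column, for example df["post_text"].apply(html_to_text), assuming the scraped posts are loaded into a DataFrame named df.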