Week of 7/22/2020
Reviewing the CodeChef discussion forums and writing a report on them.
Everyone collaborated to create the report on CodeChef and discuss its viability as a suitable data source for our project.
Things learned:
Slack
Asana
All meetings attended.
Week of 7/27/2020
Scraping data from Codecademy discussion forums.
Used BeautifulSoup and Selenium to scrape data from Codecademy forums.
Collaborated with the other members to improve the scraping algorithm and to decide which data to scrape, holding meetings of our own.
Things learned:
BeautifulSoup and Selenium for webpage scraping
Online collaboration
2 group meetings attended
2 CodeChef group meetings attended
Week of 8/3/2020
Data Cleaning
Preprocessed and cleaned the data so that training models could be applied to it.
Things learned:
Data cleaning techniques:
- Removing punctuation and newlines
- Tokenization
- Removing stop words
- Lemmatization
- Stemming
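The cleaning steps listed above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not the code used in the project: the stop-word list, the lemma lookup table, the suffix rules, and all function names are hypothetical stand-ins for what a full NLP library would provide. In practice a pipeline usually applies either lemmatization or stemming; both are shown here because both were studied.

```python
import string

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "and", "i", "my"}

# Tiny illustrative lemma lookup (hypothetical; real lemmatizers use a dictionary).
LEMMAS = {"loops": "loop", "errors": "error", "ran": "run"}


def remove_punctuation_and_newlines(text):
    """Replace newlines with spaces, then drop ASCII punctuation."""
    text = text.replace("\n", " ")
    return text.translate(str.maketrans("", "", string.punctuation))


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def lemmatize(tokens):
    """Map each token to its lemma when the lookup table knows it."""
    return [LEMMAS.get(t, t) for t in tokens]


def stem(tokens):
    """Naive suffix stripping; a crude stand-in for a real stemmer."""
    stemmed = []
    for t in tokens:
        for suffix in ("ing", "ed", "s"):
            # Only strip when at least three characters would remain.
            if t.endswith(suffix) and len(t) > len(suffix) + 2:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed


def clean(text):
    """Run all cleaning steps in the order listed in the log."""
    tokens = tokenize(remove_punctuation_and_newlines(text))
    tokens = remove_stop_words(tokens)
    tokens = lemmatize(tokens)
    return stem(tokens)


print(clean("My loops are failing,\nand the errors keep printing!"))
# → ['loop', 'fail', 'error', 'keep', 'print']
```

The order matters: punctuation is removed before tokenization so that tokens such as "failing," match cleanly, and stop words are removed before lemmatization and stemming so that the lookup runs on fewer tokens.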
All meetings attended
Week of 8/10/2020
Studying the model to be trained on the dataset. I studied simpletransformers. Our group created a PowerPoint presentation on simpletransformers.
Things learned:
How to use simpletransformers
Different uses of simpletransformers
Could not attend the meeting on 8/14/2020 because I was moving apartments.
Performed the final implementation of the topic recommendation system using BERT and DistilBERT. Calculated performance metrics using Logistic Regression, OneVsRest classification, and Random Forest classification.
Had the final presentation of the project on 4 September 2020.