Urvi - Machine Learning Pathway

Urvi · August 14, 2020, 5:33pm

As part of task 1, we were required to explore multiple forums similar to STEM-Away that contained topics, categories, posts, and tags pertinent to STEM. The forum I worked on was Discourse Meta. We then presented our observations in a report. Next, we scraped the StackOverflow forum, as it was found to be more relevant. After generating the CSV, we went through the data preprocessing stage. We’re now exploring the BERT model.

Overview of things learned -

Technical -

Web scraping using BeautifulSoup
Data mining
Data cleaning

Tools used -

Asana (for project management)
Jupyter (for python coding)

Soft skills -
Interacted with people from diverse backgrounds and levels of expertise. Collaborated with teams and learnt new things from their work.
Achievement highlights -

Got familiar with Asana
Honed web scraping and data preprocessing skills
Connected with the leads and team-mates and made new friends

Meetings attended -

Attended a group meeting
Attended all team meetings

Tasks done -

Prepared a report along with my group, stating the pros and cons of using Discourse Meta for web scraping.
Scraped the StackOverflow forum for one tag, namely data science, which had around 5.5k posts.
Presented a data analysis report on the CSV obtained from scraping 13 categories of StackOverflow. The report listed the anomalies that needed to be addressed during the data preprocessing stage.
Cleaned the dataset.
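
For anyone curious what the cleaning step can look like, here is a minimal sketch. It is not the team's actual pipeline: the `text` field name, the helper names, and the specific rules (strip HTML tags, decode entities, collapse whitespace, drop empty and duplicate rows) are assumptions chosen for illustration, and it uses only the Python standard library.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")   # matches anything that looks like an HTML tag
WS_RE = re.compile(r"\s+")        # matches runs of whitespace


def clean_text(raw):
    """Strip HTML tags, decode entities, and collapse whitespace."""
    no_tags = TAG_RE.sub(" ", raw)
    decoded = html.unescape(no_tags)
    return WS_RE.sub(" ", decoded).strip()


def clean_rows(rows):
    """Clean the hypothetical 'text' field of each row.

    Rows whose text is missing or empty after cleaning are dropped,
    and exact duplicate texts are dropped, keeping the first occurrence.
    """
    seen = set()
    cleaned = []
    for row in rows:
        text = clean_text(row.get("text") or "")
        if not text or text in seen:
            continue
        seen.add(text)
        cleaned.append({**row, "text": text})
    return cleaned


if __name__ == "__main__":
    sample = [
        {"text": "<p>Hello &amp;  world</p>"},
        {"text": "Hello & world"},   # duplicate once cleaned
        {"text": None},              # missing text
    ]
    print(clean_rows(sample))
```

Rows read from the scraped CSV with `csv.DictReader` could be passed straight to `clean_rows`, provided the column holding the post body is named `text`.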

Goals for the upcoming week -
Implementing the BERT model

Urvi · September 12, 2020, 4:55pm

Overview of things learned during final phase:

Technical -
Implemented the DistilBERT model.
Tools used -

Google Colab
Jupyter

Soft Skills -

Collaborated with other team members and prepared the final presentation.
Delivered my part of it.
Learned about giving a professional edge to the presentation.

Achievement highlights -

Implemented a recommender system.
Delivered the final presentation.

Meetings attended -
Attended all meetings
Tasks done -

Created a machine learning recommendation model using DistilBERT that achieved an accuracy of 93%.
Delivered the final presentation. Presented an overview of the DistilBERT model, how it works, and its results.
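
The post above does not describe how the recommendations were produced, so the following is only a sketch of one common approach: represent each post as an embedding vector and rank posts by cosine similarity to a query vector. The toy vectors, the post ids, and the `recommend` helper are all made up for illustration; in a real setup the vectors might come from a model such as DistilBERT.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def recommend(query_vec, post_vecs, top_k=2):
    """Return the ids of the top_k posts most similar to query_vec."""
    scored = [
        (post_id, cosine_similarity(query_vec, vec))
        for post_id, vec in post_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [post_id for post_id, _ in scored[:top_k]]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real embeddings have hundreds of dimensions.
    posts = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(recommend([1.0, 0.1], posts, top_k=2))
```

Cosine similarity is used here because it compares direction rather than magnitude, which is a common choice for text embeddings; other distance measures would slot into `recommend` the same way.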