Cwong4000 - Machine Learning Pathway

Concise overview of things learned. Break it up into Technical Area, Tools, Soft Skills
Technical Area: I learned how to use Selenium for web scraping, along with related tools such as XPath and pandas.
Tools: I learned what GitHub is for and how it is used, and I used Slack to communicate with the team.
Soft Skills: Communication with team members.

Three achievement highlights
I gained an understanding of how web scraping is done with Selenium
A better understanding of XPath and DataFrames, and of using them to extract and store scraped data
Using Slack for team communication

List of meetings/ training attended including social team events
I have attended all team meetings so far, and I viewed the Git webinar along with previously recorded webinars.

Goals for the upcoming week
Learn how BERT and other models work and how to apply them to our project. With that knowledge, contribute more to the team code.

Detailed statement of tasks done. State each task, hurdles faced if any and how you solved the hurdle. You need to clearly mark whether the hurdles were solved with the help of training webinars, some help from project leads or significant help from project leads.
I have set up accounts for GSuite, Asana, Slack, and GitHub. I initially had problems using the GSuite account, but that was resolved during a meeting.
I learned to scrape the Amazon Custom forum and store the data in several columns of a pandas DataFrame. I ran into some trouble along the way, but I fixed it with the help of my teammates’ code, some online research, and resources given by the team leads.
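As an illustration of the scraping step described above, here is a minimal sketch of pulling fields out of a page with XPath expressions and collecting them as one record per post. It is not the team's actual code: the markup, class names, and field names are invented for the example, and it uses the standard library's limited XPath support on a static snippet so that it runs without a browser. With Selenium, comparable expressions would be passed to `find_elements(By.XPATH, ...)` on a live page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a scraped forum page.
# The real forum's markup and class names are not known here.
PAGE = """
<html><body>
  <div class="post">
    <h2 class="title">Order status question</h2>
    <span class="author">user_a</span>
    <p class="body">Where can I see my order status?</p>
  </div>
  <div class="post">
    <h2 class="title">Listing help</h2>
    <span class="author">user_b</span>
    <p class="body">How do I edit a listing?</p>
  </div>
</body></html>
"""


def extract_posts(markup):
    """Return one dict per post, with one key per intended column."""
    root = ET.fromstring(markup.strip())
    rows = []
    # Select every post container, then read its fields relative to it.
    for post in root.findall(".//div[@class='post']"):
        rows.append({
            "title": post.findtext("h2[@class='title']", default="").strip(),
            "author": post.findtext("span[@class='author']", default="").strip(),
            "body": post.findtext("p[@class='body']", default="").strip(),
        })
    return rows


rows = extract_posts(PAGE)
for row in rows:
    print(row)
```

A list of dicts shaped like this can be passed directly to `pandas.DataFrame(rows)` to get one column per field.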

Concise overview of things learned. Break it up into Technical Area, Tools, Soft Skills
Technical Area: I learned more about BERT, attention mechanisms, and NLP in general.
Tools: I am learning how to use DistilBERT and how to train a classification model.
Soft Skills: Communication and working with the team.

Three achievement highlights
Gaining an understanding of NLP
Learning how to use DistilBERT and working to train a model
Learning from the webinars

List of meetings/ training attended including social team events
The team meeting, the NLP webinars, and the intermediate Python webinar

Goals for the upcoming week
Finish getting DistilBERT working and train the classification model

Detailed statement of tasks done. State each task, hurdles faced if any and how you solved the hurdle. You need to clearly mark whether the hurdles were solved with the help of training webinars, some help from project leads or significant help from project leads
I combined the title and post text into one column, and I am trying to get DistilBERT to embed the sentences. I am having some trouble with it, but I am working to fix it using the webinars, online resources, and resources given by project leads.
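The column-combining step could look roughly like the sketch below. The column names (`title`, `post_text`, `text`) and the sample rows are assumptions made for the example, not the names used in the team's data. Missing values are replaced with empty strings first so that a post without a title does not turn the combined text into a missing value.

```python
import pandas as pd


def combine_title_and_text(df, title_col="title", text_col="post_text", out_col="text"):
    """Return a copy of df with title and post text joined into one column."""
    combined = df.copy()
    # Treat missing titles or bodies as empty strings before joining.
    title = combined[title_col].fillna("").str.strip()
    body = combined[text_col].fillna("").str.strip()
    # Join with a single space, then trim the edge left by an empty side.
    combined[out_col] = (title + " " + body).str.strip()
    return combined


# Hypothetical sample data standing in for the scraped posts.
posts = pd.DataFrame({
    "title": ["Order status", None],
    "post_text": ["Where can I see it?", "How do I edit a listing?"],
})
combined = combine_title_and_text(posts)
print(combined["text"].tolist())
```

The resulting single text column is the kind of input that would then be handed to a tokenizer and model for embedding.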

Concise overview of things learned. Break it up into Technical Area, Tools, Soft Skills
Technical Area: Learned how to use DistilBERT and how to train a classification model.
Tools: DistilBERT, Transformers, scikit-learn
Soft Skills: Worked on teamwork and presentation skills

Three achievement highlights
Implemented DistilBERT and trained the classification model
Created a CSV file with the post text split into chunks of at most 512 words each
Gained a better understanding of machine learning

List of meetings/ training attended including social team events
Attended the team meetings.

Goals for the upcoming week
Work on increasing the accuracy of the classification model and on turning it into a recommendation system.

Detailed statement of tasks done. State each task, hurdles faced if any and how you solved the hurdle. You need to clearly mark whether the hurdles were solved with the help of training webinars, some help from project leads or significant help from project leads.
I was able to implement DistilBERT and train the classification model. I initially had problems running DistilBERT, but I fixed them by decreasing the maximum number of tokens. I made a CSV file that combines the title and post text, limiting each row to at most 512 words. I had some trouble splitting the post text at first, but got it to work by fixing the code and testing more. I worked on the final presentation with teammates and practiced my presentation skills. I also learned more about machine learning concepts from the team leads’ resources.
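The splitting step described above can be sketched as follows, using only the standard library. This is one possible approach under stated assumptions, not the code that was actually written: the function names, the CSV column names, and the sample post are all hypothetical. Note that words and tokens are counted differently, so a 512-word chunk can still exceed a model's 512-token limit, which is consistent with having to lower the maximum token setting.

```python
import csv
import io

MAX_WORDS = 512  # per-row word limit mentioned in the report


def split_into_chunks(text, max_words=MAX_WORDS):
    """Split text on whitespace into chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def write_chunks_csv(posts, out_file, max_words=MAX_WORDS):
    """Write one CSV row per chunk and return the number of chunk rows written.

    posts is an iterable of (title, body) pairs; the column names are
    illustrative only.
    """
    writer = csv.writer(out_file)
    writer.writerow(["post_id", "chunk_index", "text"])
    rows_written = 0
    for post_id, (title, body) in enumerate(posts):
        combined = f"{title} {body}".strip()
        for chunk_index, chunk in enumerate(split_into_chunks(combined, max_words)):
            writer.writerow([post_id, chunk_index, chunk])
            rows_written += 1
    return rows_written


# Hypothetical post: a one-word title plus a 1000-word body.
buffer = io.StringIO()
count = write_chunks_csv([("Title", "word " * 1000)], buffer)
print(count, "chunk rows written")
```

Writing to an in-memory buffer keeps the example self-contained; passing a file opened with `open(path, "w", newline="")` would write the same rows to disk.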