Godbless_Chille - Machine Learning Pathway

Assessment 1

Things learned:

  • Technical: Web scraping, ML theory, and an overview of NLP tools
  • Tools: GitHub, Slack, Jupyter Notebook, Asana
  • Soft skills: Teamwork, Communication, and gathering online resources to solve problems.

My three achievements

  • Built my first web crawler to scrape data from the given website with high efficiency
  • Explored different NLP tools from monkeylearn.com
  • Managed to produce a clean CSV file from the data collected by web crawling

List of meeting/training attended

  • STEM-Casts: Overview of ML and project
  • GitHub (1st session)
  • GitHub (2nd session)
  • Weekly team meeting on 6/8
  • Weekly team meeting on 6/15

Goals for upcoming week

  • Finish data cleaning and EDA
  • Learn how to pre-process text
  • Learn how to do data visualization

2nd Assessment

1.0 Concise overview

What I learned last week:

  1. Technical:
  • Learnt how to use the TF-IDF model to preprocess a corpus
  • Learnt about the basics of BERT
  • Explored other text-processing approaches, e.g. bag of words and sentiment analysis
  2. Tools: scikit-learn library, Google Colab

2.0 Three achievement highlights

  1. Learned about the TF-IDF model and its implementation.
  2. Used the scikit-learn library to implement the TF-IDF model.
  3. Optimized the code to minimize the number of commands.
  4. Understood the essence of BERT and made a presentation with my teammate, @afink

3.0 List of meetings/ training attended including social team events

  1. Weekly team meeting on Monday
  2. STEMCast: NLP Webinar

4.0 Goals for the upcoming week

  1. Learn how to implement the BERT model
  2. Use BERT efficiently for topic modelling

5.0 The how’s (Tasks)

  • I imported the CSV files, which contained more than 681 rows, and created two corpora, one for starter content and one for reply content, each with 3000 documents. I used the scikit-learn library to build the TF-IDF model. I did this with the help of my team lead and teammates, and I also referred to online resources.
  • I also worked with my teammate, Alan, to craft a presentation on BERT. We both volunteered to help our team-leads with the BERT implementation.
  • I reached out to @Akim_Borbuev to get help with data cleaning and pre-processing as well. We scheduled a Zoom meeting that will take place very soon.
  • I managed to implement all the requirements with the help of my team, specifically @brandonc, @Hpk0304, @Harshavardhan and Rahul.

Personal note: Projects and internships have helped me become better at working with others to accomplish shared goals. I believe there is no ‘me’ in tasks here. I like my team so far, and I am hoping to work with them over the next 3 weeks and get to know them even better.

3rd Self Assessment

1.0 Deep overview of tasks

What I learned last week:

  1. Technical: BERT Model, Implementation of BERT
  2. Tools: fastai, Jupyter Notebook, Colab
  3. Soft Skills: Presentation, public speaking, collaboration on group slides

2.0 Three achievement highlights

  1. Used a deep learning framework to implement the BERT model
  2. Used the BERT model for text classification to predict the category of posts

3.0 List of meetings/ training attended including social team events

  1. Weekly team meeting on Monday
  2. BERT demo by team leads

4.0 Next week’s goals

  1. Assess and improve accuracy of text classification results
  2. Learn new things on BERT implementation

5.0 How-it-was-done

I finished all the following tasks by myself and with the help of online resources.

After reading thoroughly about the BERT model and making sure I understood it, I used the deep learning framework fast.ai to implement a text classification program with the BERT model. The program classifies the 3000 posts from the latest page of SmartThings based on each post's content, and its predictions are compared with the category metadata on those posts. I still have to assess the accuracy and improve the implementation.
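
The accuracy check that is still pending could be sketched as follows. This is only an illustration of comparing predicted categories with the category metadata, not the BERT classifier itself; the category names, the sample lists, and the helper names accuracy and per_category_accuracy are all hypothetical.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Fraction of posts whose predicted category matches the metadata."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one post to compute accuracy")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def per_category_accuracy(predicted, actual):
    """Accuracy broken down by the true (metadata) category."""
    totals = Counter(actual)
    hits = Counter(a for p, a in zip(predicted, actual) if p == a)
    return {category: hits[category] / totals[category] for category in totals}


# Hypothetical labels: metadata categories vs. the classifier's predictions.
metadata_categories = ["Devices", "Apps", "Devices", "Automation", "Apps"]
predicted_categories = ["Devices", "Devices", "Devices", "Automation", "Apps"]

overall = accuracy(predicted_categories, metadata_categories)
by_category = per_category_accuracy(predicted_categories, metadata_categories)

print(f"overall accuracy: {overall:.2f}")
for category, score in sorted(by_category.items()):
    print(f"{category}: {score:.2f}")
```

The per-category breakdown is useful because a single overall number can hide categories that the model rarely predicts correctly.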

Achievement highlight (non-technical): I was recommended by my team lead @Hpk0304 to become a lead! I gladly and humbly accepted the promotion, and I am now looking forward to the coming weeks. I am really interested in seeing the end of this beautiful masterpiece.

Personal note: I will really miss my teammates (those who are not extending) and my team lead as well. It was a very enriching experience, and I am looking forward to learning more in the coming weeks!