Hpk0304 - Machine Learning Pathway

My first self-assessment on 16.06.2020

Overview: I went from zero to something. It is a real challenge to be one of the leads, because NLP is such a new world for me. However, I am proud of my team for overcoming all the difficulties and finding a roadmap for the project. I am lucky to work with such nice friends.

Learning process:
1. Technical area

  • The link between ML and NLP, typically in text classification
  • Overall, the project follows an NLP roadmap with BERT
  • What I achieved: web scraping and pre-processing; I am currently researching the BERT model further.
  • What difficulties: It took me a long time to understand how text classification works. I read lots of articles, joined StemAway webinars, learned from free courses, and asked many friends (both team members and other team leads). The key method that helped me overcome the difficulty was asking as much as possible and trying to understand the big picture before going into details.
  • What I need to improve: coding skills. Regarding the plan, I will try other pre-processing methods (especially TF-IDF) to make the dataset cleaner and prepare well for the BERT part.
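
To illustrate the TF-IDF idea mentioned in the plan, here is a minimal sketch in plain Python. It is not the team's pre-processing code: the sample posts, the whitespace tokenizer, and the function names are assumptions made up for the example, and a library implementation may use a slightly different (smoothed) formula.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a real pipeline would clean more)."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * log(N / document frequency)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical forum posts, invented for illustration only.
posts = [
    "how to install python packages",
    "python list comprehension question",
    "best hiking trails near the city",
]
scores = tf_idf(posts)
```

In this sketch, a word that appears in only one post (such as "install") gets a higher weight than a word shared by several posts (such as "python"), which is the property that makes TF-IDF useful for telling categories apart.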

2. Tools

  • Supporting the technical side: Python, Jupyter Notebook, Anaconda, Spyder, Git
  • Supporting the management side: G Suite, Zoom, and platforms such as StemAway, Asana, Slack, GitHub
  • What I achieved: I can confidently use most of the mentioned tools and have explored lots of other cool features in them, for example the MindMup app and Gsuit plus (from G Suite).
  • What difficulties: I am not good at Asana (setting priorities and other features). I will take time to figure it out when I finish this week's teamwork.
  • My plan: learn PyTorch and TensorFlow to prepare for BERT

3. Soft-skills

  • Leadership is at the center. It requires many related skills: organization, communication, team-building, teamwork, problem-solving, and time management. The most challenging things are understanding the motivation and talents of team members and keeping track of and improving the teamwork.
  • What I achieved: friendships (in my team, in other teams, and in other pathways). These friends shared a lot of tips and tricks, as well as some relevant training sessions and team-building activities.
  • What difficulties: team-building and ice-breakers. The interaction level in the first two meetings was really low. Thanks to weekly post-meeting feedback and advice from Stephanie and other leads, we improved the teamwork in the third meeting. I will work more with our leads on team-building activities to enhance communication and workflow.
  • My plan: continue listening, watching, noting, asking and sharing

Three achievement highlights:

  • Understand the roadmap
  • Good collaboration with other team leads and members; learned how to share and break down tasks
  • Learned and carried out the project step by step with my team

Meetings/training sessions:

  • Weekly Monday team meetings, team lead meetings
  • Weekly lead meetings (ML pathway)
  • Weekly other pathway meetings (UX, FS leads)
  • Webinars: ML, Asana, leadership, industry mentors for text-classification & Git, Python
  • Some office hours (OH) with Stephanie

2nd assessment on Week 4

Hashtag: team-building, BERT

Learning process:
1. Technical area

  • BERT model research: understand concepts and connect to the current work
  • What I achieved: ran a demo of the full BERT code, from tokenizer to predictions
  • What difficulties: resolving the confusion between the other pre-processing techniques and comparing them with the BERT model. Moreover, I have to fix some bugs in the evaluation metrics and predictions.
  • My plan: continue fixing the errors and get the prediction step working. Moreover, I’m trying to set up multi-label classification so that BERT can run on the full dataset.
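
As a rough picture of what the multi-label plan involves, the sketch below shows one common setup in plain Python: each post gets a multi-hot target vector, and each predicted label is decided independently by passing its logit through a sigmoid and comparing with a threshold. The category names, the logits, and the 0.5 threshold are assumptions for illustration; they are not taken from our dataset or our BERT code.

```python
import math


def multi_hot(labels, all_labels):
    """Encode a set of labels as a 0/1 vector in the order of all_labels."""
    return [1 if name in labels else 0 for name in all_labels]


def sigmoid(x):
    """Map a raw model score (logit) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, all_labels, threshold=0.5):
    """Keep every label whose sigmoid probability reaches the threshold."""
    return [
        name
        for name, logit in zip(all_labels, logits)
        if sigmoid(logit) >= threshold
    ]


# Hypothetical categories and model outputs, invented for illustration.
categories = ["hardware", "software", "networking"]
target = multi_hot({"software", "networking"}, categories)
predicted = predict_labels([-2.0, 1.5, 0.3], categories)
```

Unlike single-label classification, where a softmax picks exactly one category, this setup lets a post belong to several categories or to none.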

2. Tools

  • Technical side: PyTorch, TensorFlow
  • Management side: Zoom breakout rooms and new G Suite apps to make meetings more interactive and to track the workflow
  • What I achieved: got familiar with new technical tools and explored more functions to support the group meeting
  • What difficulties: Asana trouble; 3 friends could not access Asana. I contacted Jonah and Katie to set up a meeting to solve that issue next week.
  • My plan: AWS application

3. Soft-skills

  • Leadership again. I collaborated with the leads on team-building activities such as presentations, continuing the "guess and answer the movie" series, and a small competition. All were successful except the competition (but that is because we are currently working on the BERT model).
  • What I achieved: being available for team questions and explaining in different ways, such as chat, office hours (screen sharing), or a quick visualization of the process, to make sure friends get the information. If there is a problem I cannot solve, I refer them to a person who can help (not just leads). Joining several meetings with other leads in ML and other pathways gave me lots of ideas for team-building work to increase interaction.
  • What difficulties: balancing time between the project and daily life.
  • My plan: keep a daily checklist to balance my life and take care of my health.

Three achievement highlights:

  • Understood BERT and completed 60% of the BERT classification work
  • Engaged the team and improved the meeting rate
  • Got an overview of how the ML practices relate to the other pathways in developing the web site

Meetings/training sessions:

  • NLP industry mentor session with leads
  • Monday team meeting
  • Team guide session on Wednesday
  • Asana call with Jonah to solve issues
  • Interpathway meeting with UX, FS and BI
  • Site evolution meeting on Friday

Last assessment

Overview: Our team finished the project with text classification and gained some knowledge about recommendation systems. The accuracy is not high, but we found some other ways to improve it. This work will continue over the next 3 weeks. Moreover, I’m working with my team on the presentation, and we’ll also share the meeting rates and evaluation metrics from throughout the 5-week internship.

Learning process:
1. Technical area

  • BERT model: solved the errors in the training loop. Our team discussed and changed the dataset a bit to reduce the bias in the text.
  • What I achieved: combined different pre-processing methods with BERT to increase the accuracy
  • What difficulties: lots of bias. There are many categories (32) with widely varying numbers of posts.
  • I stopped my work here, but I also had the last Monday meeting with the team to clarify the next steps for raising the accuracy to 80%. I will keep following my team’s work.
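
One common way to handle categories with very different post counts is to weight each class inversely to its frequency, so that rare categories count more in the training loss. The sketch below is only an illustration of that idea, not something the team has implemented: the label list is invented, and the formula `n_samples / (n_classes * count)` is the usual "balanced" heuristic rather than a result from our experiments.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: weight}, where rarer labels get larger weights.

    weight = n_samples / (n_classes * count_of_label)
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical label distribution, invented for illustration:
# category "a" has 6 posts, "b" has 3, and "c" has only 1.
labels = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
weights = balanced_class_weights(labels)
```

Weights like these could then be passed to a weighted loss function during training; whether that would actually help on our 32 categories would need to be tested.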

2. Tools

  • Technical side and management side: no additional tools; I used the current tools but did more research on combining the various approaches
  • What I achieved: solved the Asana problem and shared the information with the team and the new leads.
  • What difficulties: I tried Colab to run the training epochs, but it also took a lot of time. Maybe AWS can be the solution for the running time.

3. Soft-skills

  • Problem-solving and working under pressure were the main focus because of the stressful work last week. To reduce confusion, I have to control my stress in order to lead the team. Moreover, all leads have to communicate continuously throughout the week to support the team.

Three achievement highlights:

  • Solved the BERT problem and ran the full BERT pipeline (even though the accuracy is not high)
  • Motivated the team to finish the project
  • Explained to the team the connections between the June session, the extra 3 weeks, and the July session

Meetings/training sessions:

  • Monday team meeting
  • Interpathway meeting with UX, FS and BI
  • Call with Antonie about site evolution