Technical Area:
- Learned about the different types of recommender systems and collaborative vs. content-based filtering
- Learned three types of similarity measures: cosine similarity, dot product, and the Euclidean distance.
- Followed the tutorial to build a movie recommender system
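To make the three similarity measures concrete, here is a minimal sketch in plain Python. The function names and the sample vectors are my own, chosen only for illustration; they are not taken from the tutorial.

```python
import math

def dot(u, v):
    # Dot product: sum of element-wise products.
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    # Dot product divided by the product of the vector lengths,
    # so only the angle between the vectors matters.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def euclidean_distance(u, v):
    # Straight-line distance between the two points.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical rating vectors: b points in the same direction as a
# but is twice as long.
a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]

print(dot(a, b))                # 18.0
print(cosine_similarity(a, b))  # 1.0
print(euclidean_distance(a, b)) # 3.0
```

The example shows why the choice of measure matters: cosine similarity treats the two vectors as identical because they point the same way, while the Euclidean distance and the dot product are both sensitive to their lengths.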
From the NLP Basics Parts 1-3 webinar and some additional research on my own, I learned about:
- The challenges of NLP, and the many ways subtle inflections or subtext can change a sentence’s meaning.
- Cox proportional hazards models: how to express them as hazard functions and how to interpret hazard ratios
- Some old language models and their limitations
- One-hot encoding limitations: the columns of the resulting matrix are mutually orthogonal (the dot product of any two distinct columns is always 0), so the encoding captures no similarity between words
- word2vec and GloVe limitations: a word gets the same vector regardless of context.
- Vanilla Neural Network, Recurrent Neural Networks and LSTM basics
- Attention:
- I learned the basic architecture of a transformer
- I studied the linear algebra steps of the core attention model: compute compatibility between the query and the keys via a dot product, and then normalize it through a softmax function to get the attention weights.
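The attention steps described above can be sketched in plain Python for a single query. This is only an illustration under my own assumptions: the function names and toy vectors are made up, the division by the square root of the key dimension follows the scaled dot-product form in the “Attention is all you need” paper, and the final weighted sum over the values is the step that comes after the weights are computed.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    d_k = len(query)
    # Compatibility of the query with each key: scaled dot product.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Output: weighted sum of the value vectors.
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output

# Toy example: the query matches the first key more closely.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]

weights, output = attention(query, keys, values)
print(weights)
print(output)
```

Because the query lines up with the first key, the first weight is larger, and the output leans toward the first value vector.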
Tools:
- I studied the BeautifulSoup documentation and learned the fundamentals of parsing HTML text, what the important objects are and how to use them.
- I learned how to use Scrapy to build a simple web crawler
- I studied the Selenium and WebDriver documentation
- Set up Colab
Soft Skills:
- I improved my ability to read technical papers, and to pick out the relevant information in them, by reading the “Attention is all you need” paper
- I learned that it is sometimes better to focus on the big picture at first, before getting stuck in the nitty-gritty details. For example, it is best to first understand what the softmax function does before delving into the formula itself.
- I’m still learning how to organize my time between studying theoretical concepts and applications.
Achievements:
- I now have a basic understanding of the fundamentals of NLP - what it is, the challenges, the solutions, and some of the linear algebra behind attention models.
- I followed two tutorials: the movie recommendation system and the web crawler
- I am getting better at using BeautifulSoup and Scrapy
Tasks Completed:
- Downloaded all the necessary libraries
- Watched the first four STEMCasts and all of the NLP Basics series
- Followed GitHub tutorial
- Worked on training a sentiment analysis ML model (I still have a few bugs here and there, but nothing too serious).