Overview of things learned:
Getting familiar with the workflow of an ML project:
1. Understanding the business problem
2. Collecting data
3. Preprocessing data
4. Data exploration
5. ML models
6. Model evaluation
Discovering recommendation systems and their types (collaborative filtering, content-based filtering) and measures of similarity (cosine similarity, dot product and Euclidean distance).
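The three similarity measures can be sketched in a few lines of plain Python. This is only an illustration: the rating vectors `user_a` and `user_b` are made-up values, not data from the project, and a real recommender would normally use a numerical library instead of hand-written loops.

```python
import math

def dot_product(a, b):
    # Sum of element-wise products; grows with both direction and magnitude.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product divided by both vector lengths, so only direction matters.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Straight-line distance; smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical ratings given by two users to the same four items.
user_a = [5, 3, 0, 1]
user_b = [4, 0, 0, 1]

print(dot_product(user_a, user_b))
print(cosine_similarity(user_a, user_b))
print(euclidean_distance(user_a, user_b))
```

Cosine similarity and dot product are similarity scores (higher means closer), while Euclidean distance is a distance (lower means closer), so they rank neighbours in opposite directions.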
Discovering NLP and the importance of converting text data to numeric values so that a machine can learn from it. Several representations can be used for this: one-hot encoding, bag of words, TF-IDF and word embeddings.
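To make the idea of turning text into numbers concrete, here is a minimal sketch of bag of words and TF-IDF in plain Python. The three example sentences are invented, tokenization is a simple whitespace split, and the TF-IDF formula used (term frequency times log(N / document frequency)) is only one common variant; libraries often apply smoothing and normalization on top of it.

```python
import math
from collections import Counter

# Hypothetical toy corpus, tokenized by whitespace.
docs = ["the cat sat", "the dog sat", "the dog barked"]
tokenized = [d.split() for d in docs]

# The vocabulary fixes one vector position per distinct word.
vocab = sorted({w for doc in tokenized for w in doc})

def bag_of_words(tokens):
    # Count how often each vocabulary word occurs in the document.
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

def tf_idf(tokens):
    # Weight each count by how rare the word is across the corpus.
    counts = Counter(tokens)
    n_docs = len(tokenized)
    vector = []
    for w in vocab:
        tf = counts[w] / len(tokens)
        df = sum(1 for doc in tokenized if w in doc)
        idf = math.log(n_docs / df)
        vector.append(tf * idf)
    return vector

print(vocab)
print(bag_of_words(tokenized[0]))
print(tf_idf(tokenized[0]))
```

In this toy corpus the word "the" appears in every document, so its TF-IDF weight is zero, while a rarer word such as "cat" keeps a positive weight. That is the main difference from plain bag-of-words counts.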
Discovering how to gather data using web scraping (getting data from a single page) and web crawling (getting data from multiple pages), while making sure that the website allows scraping/crawling.
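One way to check whether a site allows crawling is to read its robots.txt file. The sketch below uses `urllib.robotparser` from the Python standard library on an inline, made-up robots.txt so that it runs without network access; the crawler name `my-crawler` and the `example.com` URLs are placeholders. Against a real site, `set_url()` followed by `read()` would fetch the file instead. Note that robots.txt is only one signal, and a site's terms of use may impose further restrictions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: everything is allowed except /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether a given user agent may fetch a given URL.
allowed = parser.can_fetch("my-crawler", "https://example.com/articles/1")
blocked = parser.can_fetch("my-crawler", "https://example.com/private/data")

print(allowed)
print(blocked)
```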
- BeautifulSoup, Scrapy and Selenium for web scraping/crawling.
- PyTorch for deep learning algorithms.
- Getting familiar with Google Colaboratory.
- spaCy for NLP.
- Trello, a tool for project management.
- Learning about the Agile method for project management and its Scrum framework.
Three achievement highlights
- Playing with BeautifulSoup and Scrapy to scrape data.
- Learning from a tutorial how to use Logistic Regression to build a text classifier.
- Learning the difference between Scrapy, BeautifulSoup and Selenium.
- Workspace prepared (creating a virtual environment and installing the required libraries).
- Getting familiar with the concepts of NLP (word embedding, document similarity…).
- Learning web scraping and web crawling and their tools.
- Learning more about Git and how we can use it.