Things that I learned
Technical Area:
- Used web scraping tools to gather text data from websites, and the Tweepy library (a Python client for the Twitter API) to gather text data from Twitter
- Pre-processed and cleaned the text data by removing tags, comments and mentions
- Understood the process of vectorizing tweets and how words are represented as vectors
- Used basic ML models such as logistic regression and Naive Bayes to find similar words
- Employed an improved neural-network-based approach for recommender systems
- Compared performance of both methods using the results obtained
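
The cleaning and vectorization steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code used in the project: it assumes "tags" means HTML tags, uses a simple bag-of-words count as the vector representation, and the function names (clean_tweet, build_vocabulary, vectorize) and sample tweets are hypothetical.

```python
import re
from collections import Counter


def clean_tweet(text):
    """Strip HTML tags, URLs, and @mentions, then lowercase and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)           # HTML tags (assumed meaning of "tags")
    text = re.sub(r"https?://\S+", " ", text)      # URLs
    text = re.sub(r"@\w+", " ", text)              # @mentions
    text = re.sub(r"[^a-z\s]", " ", text.lower())  # keep letters and whitespace only
    return " ".join(text.split())


def build_vocabulary(docs):
    """Map each distinct word to a column index, in alphabetical order."""
    words = sorted({w for d in docs for w in d.split()})
    return {w: i for i, w in enumerate(words)}


def vectorize(doc, vocab):
    """Return a bag-of-words count vector for one cleaned document."""
    counts = Counter(doc.split())
    return [counts.get(w, 0) for w in vocab]


# Hypothetical sample tweets, for illustration only
raw = [
    "<p>Loving the new phone!</p> @shop https://example.com",
    "@bob the phone battery is bad",
]
cleaned = [clean_tweet(t) for t in raw]
vocab = build_vocabulary(cleaned)
vectors = [vectorize(d, vocab) for d in cleaned]
print(cleaned)
print(vectors)
```

Each position in a vector corresponds to one vocabulary word, so two tweets that share words have overlapping non-zero entries. Learned word embeddings would replace these counts with dense vectors, but the cleaning step stays the same.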
Soft Skills:
- Improved usage of various Python libraries and web scraping tools
- Utilized Jupyter notebooks
- Developed familiarity with Flask for web app deployment
Achievements:
- Successfully scraped data from one of the forums provided
- Successfully scraped data from a simpler website
- Cleaned and pre-processed data
- Successfully employed classification algorithms
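
As an illustration of the kind of classification algorithm mentioned above, here is a small multinomial Naive Bayes classifier written in plain Python. It is a sketch under stated assumptions rather than the project's implementation: the labels, the toy training sentences, and the function names (train_naive_bayes, predict) are all hypothetical, and add-one smoothing is chosen for simplicity.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(doc, model):
    """Return the label with the highest log posterior for one document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in doc.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # add-one (Laplace) smoothing avoids zero probabilities
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only
docs = ["great phone love it", "love the battery", "bad screen hate it", "hate the bad battery"]
labels = ["pos", "pos", "neg", "neg"]
model = train_naive_bayes(docs, labels)
print(predict("love this great screen", model))  # pos
print(predict("bad battery", model))             # neg
```

In practice a library implementation such as scikit-learn's would usually be preferred; writing the counts out by hand mainly helps show what the model learns from the training text.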
Tasks Completed:
- Performed statistical analysis of data obtained
- Applied classification algorithms and studied how they work through video tutorials
Link: https://github.com/harshita19244/classifiers/blob/main/preprocess.py