Technical Area
Learned the overall goal and workflow for the forum recommender project and grew familiar with the core ideas of ML, basic models, NLP, and data mining.
Tools
Grew more comfortable with Python and learned the purpose of Beautiful Soup and Selenium. Prepared workspace in VS Code.
Soft Skills
Broadened project management skills by reading into common best practices (planning, documentation, communication) and familiarized myself with Scrum.
Grew more comfortable with G Suite groups, communication, and calendars.
Highlights/Achievements
Followed along with webinars and meetings to familiarize myself with machine learning fundamentals. Navigated through Discourse forums to grow more comfortable with the platform. Got to know team members and determined means of communication/Trello and weekly meeting times.
Technical Area
This week I prioritized learning how to gather data from a source. Specifically, I learned how to build a Python web scraper and automate one with Selenium. I also learned how to extract text from HTML and reformat it. I read up on how to store data in a CSV file and perform basic exploration of the data.
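A minimal sketch of the extract-and-store step described above. It uses only the Python standard library (`html.parser` and `csv`) in place of Beautiful Soup so that it runs without extra packages. The sample HTML, the `post` class name, the `PostTextExtractor` class, and the `post_id`/`text` column names are all assumptions made for illustration, not the structure of the actual forum or the tutorials' code.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment; a real forum page will use different markup.
SAMPLE_HTML = """
<html><body>
  <div class="post"><p>How do I   reset my <b>password</b>?</p></div>
  <div class="post"><p>Shipping is delayed
  again.</p></div>
  <div class="sidebar">Ignore me</div>
</body></html>
"""


class PostTextExtractor(HTMLParser):
    """Collect the visible text of every <div class="post"> element."""

    def __init__(self):
        super().__init__()
        self.posts = []    # cleaned text of each finished post
        self._depth = 0    # <div> nesting depth inside the current post
        self._chunks = []  # raw text fragments of the current post

    def handle_starttag(self, tag, attrs):
        if self._depth:
            if tag == "div":
                self._depth += 1  # nested div inside a post
        elif tag == "div" and ("class", "post") in attrs:
            self._depth = 1       # entering a new post

    def handle_endtag(self, tag):
        if self._depth and tag == "div":
            self._depth -= 1
            if self._depth == 0:
                # Reformat: join fragments and collapse runs of whitespace.
                text = " ".join("".join(self._chunks).split())
                self.posts.append(text)
                self._chunks = []

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html):
    """Return the cleaned text of each post found in the HTML string."""
    parser = PostTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts


def write_posts_csv(posts, fileobj):
    """Write posts to an open text file object as CSV with a header row."""
    writer = csv.writer(fileobj)
    writer.writerow(["post_id", "text"])
    for post_id, text in enumerate(posts, start=1):
        writer.writerow([post_id, text])


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for open("posts.csv", "w", newline="")
    write_posts_csv(extract_posts(SAMPLE_HTML), buffer)
    print(buffer.getvalue())
```

With Beautiful Soup installed, the parser class could likely be replaced by a few lines that select the post elements and read their text; the CSV step would stay the same.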
Tools
VS Code, Selenium, Pandas, NumPy, GitHub, Trello
Soft Skills
Read up on Trello and joined the team's Trello group to prepare for collaborative work/sprints. Brushed up on Git to prepare for collaborative coding.
Highlights/Achievements
Set up the environment to support all needed web scraping packages.
Read and followed tutorials 0, 1, and 2 to scrape data and perform exploratory data analysis. Cloned Sara’s GitHub branch to follow along with how to create and run a web scraper and store the data.
To further work on
I plan on rereading the given material and practicing until I am comfortable scraping and exploring data without help. My next step is to scrape and explore data from an unfamiliar source using the same strategies.
Technical Area
This week I continued following the tutorials and focused on storing extracted data and performing data analysis. I also began my analysis of the research paper.
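As a rough illustration of the kind of basic analysis that can be run once scraped data is stored, the sketch below loads a small CSV and computes a few summary figures. It uses the standard library (`csv`, `collections.Counter`, `statistics.mean`) rather than Pandas so it runs anywhere. The sample rows, the `author` and `text` columns, and the `summarize_posts` helper are hypothetical and are not taken from the tutorials.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Hypothetical stored data; real scraped columns may differ.
SAMPLE_CSV = """post_id,author,text
1,alice,How do I reset my password?
2,bob,Shipping is delayed again.
3,alice,Thanks that fixed it.
"""


def summarize_posts(fileobj):
    """Return simple summary statistics for a CSV of forum posts."""
    rows = list(csv.DictReader(fileobj))
    # Whitespace-separated tokens are used as a crude word count.
    word_counts = [len(row["text"].split()) for row in rows]
    return {
        "n_posts": len(rows),
        "posts_per_author": Counter(row["author"] for row in rows),
        "mean_words_per_post": mean(word_counts) if word_counts else 0.0,
    }


if __name__ == "__main__":
    # io.StringIO stands in for open("posts.csv", newline="").
    summary = summarize_posts(io.StringIO(SAMPLE_CSV))
    for name, value in summary.items():
        print(name, value)
```

In a Jupyter Notebook with Pandas available, loading the file into a DataFrame and grouping by author would presumably give the same figures with less code.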
Tools
Anaconda, Jupyter Notebooks, Python, Selenium, Beautiful Soup
Soft Skills
Communicated with the team to determine how to analyze the research paper.
Highlights/Achievements
Set up the environment to support web scraping and data analysis in a Jupyter Notebook. Read and followed tutorials 0, 1, and 2 to scrape data and perform exploratory data analysis. Followed GitHub tutorial to analyze the Ecommerce Sellers Forum.
To further work on
I need to read more about ML theory in order to analyze the methods described in the paper. I plan on following more tutorials to get better at storing and then exploring data.
Technical Area
This week I continued following the tutorials that focused on storing extracted data and performing basic data analysis. I also began my analysis of the research paper.
Tools
Anaconda, Jupyter Notebooks, Python, Selenium, Beautiful Soup, Google Slides
Soft Skills
Communicated with the team to determine how to analyze the research paper. Read up on the methodology and summarized my assigned section.
Highlights/Achievements
Set up the environment to support web scraping and data analysis in a Jupyter Notebook. Read tutorials 0, 1, and 2 to scrape data and perform exploratory data analysis. Followed GitHub tutorial to analyze the Ecommerce Sellers Forum. Finished slide for ML paper presentation.
To further work on
I plan on reading more about BERT and recommender systems to get a better understanding of the paper and prepare for the following module. Over the next few days, I will begin scraping an Amazon forum and attempt basic data analysis on the results.
Technical Area
Analyzed the preprocessing procedures used in the Twitter Sentiment paper and summarized them in the slide presentation.
Tools
Google Slides, Zoom
Soft Skills
Improved presentation skills by presenting the given portion of the ML paper and receiving feedback from Colin. Learned how to better present technical areas of an ML paper.
Highlights/Achievements
Presented summary and analysis of Twitter Sentiment paper as a team. Discussed how we can move forward in implementing a classifier system.