Week: July 27th - August 1st
Overview of Things Learned:
Technical Area: Web scraping of the posts in the Car Talk community. Working through the nested HTML tags one after another with BeautifulSoup proved a bit tiresome, so I have been looking for an ‘easier’ alternative to BeautifulSoup, e.g. Scrapy.
Tools: Jupyter, Excel, Matlab, BeautifulSoup
- Set up the web scraping code in both Jupyter and Matlab. It took a lot of time, but the two gave relatively similar results.
- Tried out a few debugging techniques
- Gained a better understanding of how to work with Jupyter and how certain extensions make it easier to use.
Meetings attended: I was unable to attend meetings this first week because my Google Calendar didn’t show any updates, but I have been watching the recordings and keeping up to date with the tasks due. However, I feel some of the technical issues I keep facing would have been cleared up had I attended, which is why I am particularly conscious this week of not missing a meeting. I did attend the Python webinar, which was a huge help in further understanding the language.
Goals for the Upcoming Week
I want to attend a meeting in person, particularly this Friday’s meeting, and make sure all my pre-processing work is done. I also want to attend at least one office hour this week so that I can ask about and get clarification on the issues I have been having with this week’s deliverables.
Learning TF-IDF, which is new to me and still very confusing.
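To make TF-IDF less abstract, here is a minimal sketch of the basic textbook calculation in plain Python: term frequency (how often a word appears in one post) multiplied by inverse document frequency (the log of how rare the word is across all posts). The function name `tf_idf` and the sample posts are made up for illustration and are not taken from the project data.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document (unsmoothed textbook TF-IDF)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            # TF = count / document length, IDF = log(N / document frequency)
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical sample posts, for illustration only.
posts = [
    "brake noise when stopping",
    "engine noise when idling",
    "brake pads worn out",
]
scores = tf_idf(posts)

# Terms of the first post, highest score first.
ranked = sorted(scores[0].items(), key=lambda item: item[1], reverse=True)
```

In the first post, "stopping" appears in no other post and so scores higher than "noise", which is shared with the second post. A word that appears in every post gets a score of zero in this variant. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers will differ from this sketch.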
Completed the web scraping using BeautifulSoup in Python and also in Matlab. However, because the two languages are very different, running the code in both Matlab and Jupyter takes a lot of time. I stored the collected data on my laptop as a csv file downloaded directly from Jupyter and as a mat file from Matlab.
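The scrape-then-save step described above can be sketched roughly as follows. This is only an illustration, not the actual code used: the HTML snippet, the class names (`topic-post`, `username`, `post-body`) and the column names are hypothetical stand-ins, and the real forum markup will differ. Using CSS selectors via `select` is one way to avoid stepping through each nested tag by hand.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded forum page.
SAMPLE_HTML = """
<div class="topic-post"><span class="username">alice</span>
  <div class="post-body"><p>My brakes squeal when I stop.</p></div></div>
<div class="topic-post"><span class="username">bob</span>
  <div class="post-body"><p>Check the pads first.</p><p>Then the rotors.</p></div></div>
"""


def extract_posts(html):
    """Collect the author and text of every post found in the page."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for post in soup.select("div.topic-post"):
        author = post.select_one("span.username")
        body = post.select_one("div.post-body")
        rows.append({
            "author": author.get_text(strip=True) if author else "",
            "text": body.get_text(" ", strip=True) if body else "",
        })
    return rows


def write_posts(rows, csv_file):
    """Write the extracted rows to an open text file as CSV."""
    writer = csv.DictWriter(csv_file, fieldnames=["author", "text"])
    writer.writeheader()
    writer.writerows(rows)


rows = extract_posts(SAMPLE_HTML)

# An in-memory buffer keeps the sketch self-contained.
buffer = io.StringIO()
write_posts(rows, buffer)
```

To save to disk instead of a buffer, the same function can be given a real file opened with `open("posts.csv", "w", newline="", encoding="utf-8")`, where `posts.csv` is a placeholder name.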
I have begun the pre-processing. However, it is proving a bit hard to do this in Matlab and then port the same code to Python in Jupyter, so I am trying to work solely on the csv file.
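Working only from the csv file, a first pre-processing pass might look like the sketch below, which uses just the Python standard library. The column name `text`, the tiny stop-word list and the sample rows are all assumptions for illustration. The cleaning steps shown (lowercasing, removing URLs and punctuation, dropping stop words) are common choices rather than the required ones for this project.

```python
import csv
import io
import re

# Small illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "it", "my", "when", "i", "and", "to"}


def clean_text(text):
    """Lowercase the text, strip URLs and non-letters, and drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove links
    text = re.sub(r"[^a-z\s]", " ", text)       # keep only letters and spaces
    return [word for word in text.split() if word not in STOP_WORDS]


def preprocess_rows(csv_file, text_column="text"):
    """Return one list of cleaned tokens per row of the CSV file."""
    reader = csv.DictReader(csv_file)
    return [clean_text(row[text_column]) for row in reader]


# Hypothetical sample data standing in for the scraped csv file.
sample = io.StringIO(
    "author,text\n"
    'alice,"My brakes squeal when I stop! See https://example.com"\n'
    "bob,Check the pads.\n"
)
tokens = preprocess_rows(sample)
```

With a real file, `sample` would be replaced by the csv file opened with `open(..., newline="", encoding="utf-8")`. Note that this simple cleaning splits contractions such as "don't" into two tokens, which may or may not be acceptable.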