Overview of things learned:
- Various NLP models, including BERT and bag-of-words
- Learned how to scrape the web using BeautifulSoup and Selenium
- Gained experience in inspecting HTML elements in different discourse forums
- BeautifulSoup, Selenium, BERT
- Remote Collaboration
- Setting independent goals
- Learned how to use BeautifulSoup for web scraping. It let us create soup objects from which we could extract only the information we wanted from our Discourse forum.
- Fixed an issue where I was only getting a limited number of elements from BeautifulSoup. The problem was that not all of the HTML elements had loaded; using Selenium to render the page first allowed me to scrape a much larger number of data points.
- A minor achievement was getting Python up and running on my computer. I was getting errors even for simple commands; updating the relevant packages through the command line allowed me to continue with my tasks.
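A minimal sketch of the BeautifulSoup step described above, assuming the rendered page HTML has already been captured (e.g. via Selenium's `driver.page_source` after scrolling); the sample HTML and the CSS selectors are placeholders, not the actual Atom forum markup:

```python
from bs4 import BeautifulSoup

# Placeholder HTML standing in for a rendered Discourse topic list;
# with Selenium this string would come from driver.page_source once
# the page (and any lazily loaded rows) has finished loading.
html = """
<table class="topic-list">
  <tr class="topic-list-item"><td><a class="title">How do I install packages?</a></td></tr>
  <tr class="topic-list-item"><td><a class="title">Editor crashes on startup</a></td></tr>
</table>
"""

# Build a soup object, then extract only the pieces we care about.
soup = BeautifulSoup(html, "html.parser")
titles = [a.get_text(strip=True) for a in soup.select("tr.topic-list-item a.title")]
print(titles)
```

The same pattern extends to reply counts, authors, and timestamps by adding further selectors against the soup object.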
I have been able to attend every team meeting. I have also attended the useful webinars provided by some of the other team leaders.
Goals for Upcoming Week
- Finish my BERT implementation.
- Attempt to implement a different model in order to compare accuracy between the two.
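As a sketch of what the comparison model could look like, here is a minimal bag-of-words baseline built only from the standard library (the tokenizer, the toy posts, and the similarity-based recommendation are illustrative assumptions, not our actual pipeline); BERT's results could later be scored against the same data:

```python
from collections import Counter
import math

def bag_of_words(text):
    """Lowercase, split on whitespace, and count term frequencies."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy forum posts: recommend the stored post most similar to a query.
posts = [
    "editor crashes when opening large files",
    "how to install a new syntax theme",
]
query = bag_of_words("my editor crashes on startup")
scores = [cosine_similarity(query, bag_of_words(p)) for p in posts]
best = posts[scores.index(max(scores))]
print(best)
```

Because bag-of-words ignores word order and context, it makes a natural lower-bound baseline for judging whether the extra cost of BERT is paying off.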
- Watched each of the training webinars to better prepare myself for the project and the technical skills required to succeed.
- Our task for the first week was to select a forum and scrape key information from it. Our team selected the Atom Discourse Forum as our subject of data collection. I was able to use BeautifulSoup and Selenium to collect a large quantity of data from the Atom Discourse Forum and save it to a CSV file. The resources that Anubhav and Saad posted were really helpful in fully understanding how BeautifulSoup is used.
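A minimal sketch of the CSV export step, using only the standard library; the column names and sample rows are illustrative assumptions about what we scraped, and `io.StringIO` stands in for writing to an actual file:

```python
import csv
import io

# Sample scraped records; in our pipeline these would come out of BeautifulSoup.
rows = [
    {"title": "How do I install packages?", "replies": 3},
    {"title": "Editor crashes on startup", "replies": 7},
]

# io.StringIO stands in for open("atom_topics.csv", "w", newline="").
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["title", "replies"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

`csv.DictWriter` handles quoting automatically, so forum titles containing commas or quotes stay intact in the output.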
- Our task for the second week was to go through a handful of NLP implementations that could help us increase the accuracy of our recommender system. I’m still working on getting BERT fully operational. The resources that Anubhav has posted have really helped.