- Learned how to use the Python requests library to retrieve information from a website.
- Learned how to use JSON data to scrape information from a webpage.
- Learned how to write to a CSV file using the writerow function from the csv library.
- Learned how to extract webpage data from retrieved .json responses.
- Reviewed concepts like lists and dictionaries in Python.
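The writerow pattern mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual code; the `write_rows` helper and its column names are made up for the example:

```python
import csv

# Minimal sketch of the csv.writerow pattern: each call writes one row,
# given a list or tuple of column values. The helper name and columns
# here are hypothetical.
def write_rows(path, header, rows):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)   # header row
        for row in rows:
            writer.writerow(row)  # one data row per call
```

For example, `write_rows("posts.csv", ["title", "views"], [("Hello", 12)])` produces a two-line CSV file. Opening the file with `newline=""` is the csv module's recommended way to avoid blank lines between rows on Windows.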
- PyCharm + Atom (IDE and Text editor)
- Python Requests
- Debugging - I ran into several errors while retrieving data from the webpage and had to try different approaches to resolve them.
- Troubleshooting - I used various online resources, including Stack Overflow, to resolve issues in my program.
- Growth Mindset - Despite the issues I encountered, I stayed motivated and focused on the task until I completed it.
- I first browsed the Discourse communities and decided which forum I would scrape; I chose the SitePoint forum.
- I then explored the Python requests library to figure out exactly what I needed to do after establishing a connection using the website's URL.
- I initially focused on using BeautifulSoup to parse the page's HTML, but then decided it would be easier to use the site's JSON data, since the fields were already stored in a dictionary and did not require stripping HTML tags.
- I took the data from the dictionaries returned in the JSON responses and converted them into lists. I then wrote the lists as columns in my CSV file, leaving me with a CSV file of 1200 entries containing the title, number of views, and number of replies for each post.
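The steps above can be sketched roughly as below. This is a hedged reconstruction, not the project's actual code: Discourse forums generally expose JSON by appending `.json` to a listing URL, and the field names used here (`title`, `views`, `reply_count`) follow the public Discourse API, but the exact URL and fields should be verified against the live response:

```python
import csv
import requests

# Assumed forum URL and Discourse JSON layout; verify against the site.
BASE_URL = "https://www.sitepoint.com/community"

def topics_to_rows(payload):
    """Pull (title, views, replies) tuples out of one parsed JSON page."""
    topics = payload["topic_list"]["topics"]
    return [(t["title"], t["views"], t["reply_count"]) for t in topics]

def scrape_pages(n_pages):
    """Fetch n_pages of the topic listing and flatten them into rows."""
    rows = []
    for page in range(n_pages):  # Discourse returns ~30 topics per page
        resp = requests.get(f"{BASE_URL}/latest.json", params={"page": page})
        resp.raise_for_status()
        rows.extend(topics_to_rows(resp.json()))
    return rows

def save_csv(rows, path):
    """Write the scraped rows to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "views", "replies"])
        for row in rows:
            writer.writerow(row)
```

With roughly 30 topics per page, something like `save_csv(scrape_pages(40), "sitepoint_posts.csv")` would yield on the order of the 1200 entries described above. Keeping the JSON-to-tuple conversion in its own function (`topics_to_rows`) also makes that step testable without hitting the network.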
GitHub link to code: eeshanw/MachineLearningRecommender