About the Machine Learning Pipeline (NLP) category

An introduction to ML and NLP foundations through designing recommendation and classification systems.

Project Content

  • Python foundations: coding, packages, debugging
  • Machine learning foundations: defining the problem, collecting data, training and evaluating the ML models, deploying the solution
  • Introduction to Natural Language Processing (NLP) fundamentals and recommender systems

Problem statement

Build a forum post classifier and a basic recommender that help users find similar posts and determine which category a post belongs to.

How are we going to do this?

  • Collect the data: We will use the Discourse Hub Community forums as our main data source. You can scrape one forum or several.
  • Explore your data (EDA): After gathering the data, we will analyze it to become familiar with it before feeding it to ML models.
  • Calculate similarity between posts and recommend: We will vectorize the data, compute the similarity matrix, and then use it to recommend posts similar to a given post.
  • Train ML classifiers and classify a post into its appropriate category.
  • Compare and choose the best approach: evaluate the results of steps 3 and 4 (similarity-based recommendation and ML classification) and choose the better one.
  • Build a simple web app using Flask or Streamlit.
  • Deploy your web app using AWS, Heroku, or another service.
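
To make the vectorize, similarity matrix, recommend, and classify steps above more concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the toy `POSTS` data, the whitespace tokenizer, the smoothed TF-IDF weighting, and the helper names (`tfidf_vectors`, `similarity_matrix`, `recommend`, `classify`) are hypothetical and not part of the project material. The classifier shown is a simple 1-nearest-neighbour rule built on the same similarity scores; in the actual project you would likely use a library such as scikit-learn and compare several models.

```python
import math
from collections import Counter

# Hypothetical toy posts as (text, category); real data would come from scraping.
POSTS = [
    ("how to install python packages with pip", "python"),
    ("pip install fails with permission error", "python"),
    ("training a text classifier with tf-idf features", "ml"),
    ("evaluating a classifier with precision and recall", "ml"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document (smoothed TF-IDF)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarities as a list of lists."""
    return [[cosine(a, b) for b in vectors] for a in vectors]


def recommend(index, sim, top_k=2):
    """Indices of the top_k posts most similar to post `index`, excluding itself."""
    others = [j for j in range(len(sim)) if j != index]
    return sorted(others, key=lambda j: sim[index][j], reverse=True)[:top_k]


def classify(text, docs, labels):
    """1-nearest-neighbour: return the label of the most similar known post."""
    vectors = tfidf_vectors(docs + [text])
    query = vectors[-1]
    best = max(range(len(docs)), key=lambda j: cosine(query, vectors[j]))
    return labels[best]


if __name__ == "__main__":
    docs = [text for text, _ in POSTS]
    labels = [category for _, category in POSTS]
    sim = similarity_matrix(tfidf_vectors(docs))
    print("Posts similar to post 0:", recommend(0, sim, top_k=1))
    print("Predicted category:", classify("pip cannot install package", docs, labels))
```

Using the same similarity function for both recommending and classifying keeps the comparison in the later step simple: the nearest-neighbour rule serves as a baseline against which trained classifiers can be evaluated.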