Topic suggestions and tag clustering with Machine Learning algorithms, using data from STEM-Away® forums (Discourse)

These projects apply Machine Learning algorithms to Discourse forums as a case study.

We expect to run three distinct projects:

  1. Collaborative-filtering-based topic suggestions
  2. Content-based topic suggestions
  3. Content-based tag clustering

This is a pilot project which will help evolve the Mentor Chains® platform. In addition to the display of technical and soft skills through the 1-Click® Resume, participants can elect to be showcased on our social media pages.

Join the Machine-Learning-1 Group to receive all project announcements, including the call for student leads and other student participants.

Project Overview:

All projects start with an analysis of Discourse forums to decide which input data to use and how to collect it. The ML teams can work with the team developing Discourse plugins for help with data collection. The data listed below is a starting point.
  1. Collaborative-filtering-based topic suggestions
    • Idea
      • Suggest posts based on the user’s history
    • Data
      • Direct Measures: Likes
      • Indirect Measures: Views, Number of Replies
    • Algorithm
      • Collaborative Filtering process
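
As a rough illustration of the collaborative filtering idea above, the sketch below scores unseen topics by how often they were liked by the same users as topics the current user already liked (item-based filtering with cosine similarity over likes). This is only one possible variant, not the project's chosen method. The `likes` data, the user names, and the function names are hypothetical; views and reply counts could be folded in later as weaker signals.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical like data: user -> set of topic ids the user liked.
likes = {
    "alice": {"t1", "t2", "t3"},
    "bob": {"t1", "t2"},
    "carol": {"t2", "t3", "t4"},
    "dave": {"t4"},
}


def item_similarities(likes):
    """Cosine similarity between topics, based on the users who liked both."""
    liked_by = defaultdict(set)  # topic -> users who liked it
    for user, topics in likes.items():
        for topic in topics:
            liked_by[topic].add(user)

    sims = defaultdict(dict)
    topics = sorted(liked_by)
    for i, a in enumerate(topics):
        for b in topics[i + 1:]:
            overlap = len(liked_by[a] & liked_by[b])
            if overlap:
                score = overlap / sqrt(len(liked_by[a]) * len(liked_by[b]))
                sims[a][b] = score
                sims[b][a] = score
    return sims


def suggest_topics(user, likes, top_n=3):
    """Rank topics the user has not liked by similarity to topics they liked."""
    sims = item_similarities(likes)
    seen = likes.get(user, set())
    scores = defaultdict(float)
    for topic in seen:
        for other, score in sims[topic].items():
            if other not in seen:
                scores[other] += score
    # Highest score first; ties broken by topic id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [topic for topic, _ in ranked[:top_n]]


print(suggest_topics("bob", likes))  # ['t3', 't4']
```

Here "bob" liked t1 and t2, so t3 (liked by users who also liked both) ranks above t4 (which shares only one user with t2).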

  2. Content-based topic suggestions
    • Idea
      • Suggest posts based on textual semantic similarity
    • Data
      • Title, main post, replies (text)
      • Tags
      • Author
      • Date published
    • Algorithm
      • Embedding the text itself
        • TF-IDF, CountVectorizer, spaCy
        • Deep Learning models - BERT, XLNet
      • Similarity between posts
        • Cosine similarity
        • KNN model
      • Bring other attributes into model
        • Same author
        • Set intersection of tags
        • Posts closer in time
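
The content-based pipeline above can be sketched end to end with the standard library alone: a small TF-IDF embedding, cosine similarity between posts, and a weighted blend with the other attributes (tag overlap, same author, closeness in time). Everything specific here is an assumption made for illustration: the post fields, the weights, and the 30-day half-life are hypothetical, and a real project would more likely use a library vectorizer or a BERT-style model for the embedding step.

```python
import math
import re
from collections import Counter
from datetime import date

# Hypothetical posts; the field names are assumptions, not the Discourse schema.
posts = [
    {"id": 1, "text": "Intro to BERT embeddings for text similarity",
     "tags": {"nlp", "bert"}, "author": "ann", "date": date(2020, 6, 1)},
    {"id": 2, "text": "Using BERT embeddings to compare forum posts",
     "tags": {"nlp", "bert", "forums"}, "author": "ann", "date": date(2020, 6, 3)},
    {"id": 3, "text": "Tips for writing a resume",
     "tags": {"career"}, "author": "raj", "date": date(2020, 1, 15)},
]


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict of term -> weight) per text."""
    docs = [Counter(tokenize(t)) for t in texts]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)  # document frequency
    # Smoothed IDF so that terms appearing in every document keep some weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    return [{term: tf * idf[term] for term, tf in doc.items()} for doc in docs]


def cosine(u, v):
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(a, b):
    """Set intersection of tags, normalized by the size of the union."""
    return len(a & b) / len(a | b) if a | b else 0.0


def combined_score(p, q, vec_p, vec_q, weights=(0.6, 0.2, 0.1, 0.1), half_life_days=30):
    """Blend text similarity with tag, author, and time signals (weights are guesses)."""
    gap_days = abs((p["date"] - q["date"]).days)
    recency = 0.5 ** (gap_days / half_life_days)  # 1.0 for same day, decays with gap
    same_author = 1.0 if p["author"] == q["author"] else 0.0
    w_text, w_tag, w_author, w_time = weights
    return (w_text * cosine(vec_p, vec_q) + w_tag * jaccard(p["tags"], q["tags"])
            + w_author * same_author + w_time * recency)


def most_similar(post_id, posts, k=2):
    """Brute-force k-nearest-neighbor lookup over the combined score."""
    vectors = tfidf_vectors([p["text"] for p in posts])
    index = next(i for i, p in enumerate(posts) if p["id"] == post_id)
    scored = [(combined_score(posts[index], q, vectors[index], vectors[i]), q["id"])
              for i, q in enumerate(posts) if i != index]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [pid for _, pid in scored[:k]]


print(most_similar(1, posts))  # [2, 3]
```

Keeping the text score and the attribute scores separate makes it easy to swap the TF-IDF step for a different embedding without touching the rest, and to tune the weights once real forum data is available.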

  3. Content-based tag clustering
    • Details to be added shortly

Skills:

  • Fundamental skills necessary for Machine Learning
  • Deep Learning models - BERT, XLNet
  • Similarity models - Cosine similarity, KNN model
  • Collaborative filtering
  • Communication
  • Teamwork
  • Leadership & Mentoring (project & task leads)

Dates:

Summer 2020

Prerequisites:
