Besart
July 14, 2021, 6:23am
1
Technical Area:
Acquired an overview of machine learning fundamentals through STEM-Away content material.
Learned about the different measures of similarity, including cosine, dot product, and Euclidean distance.
Learned about the linguistic difficulties inherent to NLP systems, such as metaphorical language, multiple subjects, and slang.
Learned about Root Mean Squared Error as a means of evaluating network test results.
Understood the pitfalls of vanilla neural networks applied to NLP tasks.
Began the process of scraping data from Discourse community forums using BeautifulSoup and Selenium.
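The similarity measures and the RMSE metric listed above can be written out in a few lines. This is a minimal sketch in plain Python, not the STEM-Away material itself; the function names and the sample vectors are illustrative assumptions.

```python
import math

def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (ignores vector length)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line distance between a and b (sensitive to length)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rmse(predicted, actual):
    """Root Mean Squared Error between predictions and targets."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative vectors (assumption): v is u scaled by 2, so they point the same way.
u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.0]
```

Here the cosine similarity of u and v is 1 even though their Euclidean distance is nonzero, which is why cosine is a common choice when comparing texts of different lengths.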
Tools:
Git
GitHub
Python
PyCharm
BeautifulSoup
Selenium
Soft Skills:
Initiated meeting icebreaker as a means of facilitating team communication.
Established communication channels between co-leads.
Set up daily scrum meeting availability with co-leads.
Addressed questions and concerns in a timely manner through Discord.
Built a rapport with co-leads.
Achievement Highlights:
Managed to scrape multiple forums ahead of the next module as a means of practicing necessary tools.
Gained a stronger understanding of machine learning fundamentals via supplementary third party tutorials.
Fostered an environment of communication within meetings.
Tasks Completed:
Established a team GitHub repository.
Watched the STEM-Casts pertaining to machine learning and NLP processes.
Attended project management meetings, lead meetings, and team meetings.
Goals:
Better incentivize meeting attendance.
Gain familiarity with data analysis libraries and methods.
Delve more into NLP content to derive a thorough understanding.
Besart
July 21, 2021, 8:15am
2
Technical Area:
Learned how to utilize Selenium and BeautifulSoup in tandem to scrape a site.
Read up on methods of data visualization such as word clouds and bigrams.
Explored several methods of data analysis, including frequency and bag-of-words.
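Frequency counts and bag-of-words amount to counting tokens per post. The sketch below uses only the standard library; the tokenizer rule and the two sample posts are assumptions for illustration, not the scraped data.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def bag_of_words(text):
    """Map each token to the number of times it occurs in the text."""
    return Counter(tokenize(text))

# Hypothetical forum posts, standing in for the scraped data.
posts = [
    "How do I reset my password?",
    "Password reset email never arrives",
]

bags = [bag_of_words(p) for p in posts]   # one bag per post
corpus_frequency = sum(bags, Counter())   # term frequency across all posts
```

A bag-of-words discards word order, so "reset password" and "password reset" produce the same counts.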
Tools:
PyCharm
Git
GitHub
Ubuntu
Selenium
BeautifulSoup
Pandas
NumPy
Soft Skills:
Engaged with teammates during icebreakers through follow-up discussions.
Attentively engaged with co-lead about sub-team structure.
Periodically asked if anyone needed clarification during the meeting.
Achievement Highlights:
Provided a tutorial on BeautifulSoup, Selenium, and GitHub during the weekly meeting.
Met with co-leads to formulate team suggestions.
Resolved permissions issues with GitHub.
Tasks Completed:
Completed scraping the team-selected forum.
Allocated sub-teams with co-leads.
Invited team to repository.
Goals:
Finish visualizations of the scraped data.
Get ahead on learning BERT fundamentals.
Track sub-team progress.
Besart
July 28, 2021, 6:09am
3
Technical Area:
Learned to navigate various modes of data exploration via a bevy of Python libraries.
Grew accustomed to Pandas DataFrame objects and tinkered with extra features.
Produced bigrams, trigrams, and word clouds as a means of visualizing the data.
Explored word-embeddings options in an effort to evaluate their pros and cons.
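Bigrams and trigrams are runs of two or three consecutive tokens, counted the same way single words are. This is a dependency-free sketch rather than the nltk-based code from the project; the sample sentence is an assumption.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))

def top_ngrams(tokens, n, k):
    """Return the k most frequent n-grams with their counts."""
    return Counter(ngrams(tokens, n)).most_common(k)

# Hypothetical token list; in practice this would come from cleaned post text.
tokens = "the app crashed and the app froze".split()
```

Calling `top_ngrams(tokens, 2, 1)` gives the single most common bigram, which is the kind of value a bar chart of bigrams would plot.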
Tools:
PyCharm
Git
GitHub
Ubuntu
Selenium
BeautifulSoup
Pandas
NumPy
Seaborn
TextBlob
WordCloud
nltk
matplotlib
Soft Skills:
Facilitated the icebreaker during the meeting.
Proactively met and discussed with leads via Discord and Google Meet.
Engaged with direct messages and questions asked on Discord from member interns.
Achievement Highlights:
Pushed the scraper, EDA, CSV file, and visualizations to GitHub.
Met with co-leads to re-formulate team dynamic and approach to work allocation.
Explored extra machine learning resources as a means of discerning effective strategies for recommender system training.
Tasks Completed:
Successfully completed module deliverables.
Added functionality to the scraper class that gives the user the option to scrape the entire forum or stop at a maximum limit.
Updated GitHub README file to function as a repository guide and reminder of common Git commands.
Attended and helped facilitate scrum meetings.
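The option to scrape the whole forum or stop at a maximum could be structured as below. This is only a sketch of the idea: the class name, the `fetch_page` callable, and the fake forum are hypothetical, and the real scraper used Selenium and BeautifulSoup to fetch pages.

```python
from itertools import islice
from typing import Callable, Iterator, List, Optional

class ForumScraper:
    """Collects posts from a paginated source, optionally stopping early."""

    def __init__(self, fetch_page: Callable[[int], List[str]]):
        # fetch_page is a hypothetical callable: page number -> list of posts,
        # returning an empty list once the forum has no more pages.
        self.fetch_page = fetch_page

    def _iter_posts(self) -> Iterator[str]:
        page = 0
        while True:
            posts = self.fetch_page(page)
            if not posts:
                return
            yield from posts
            page += 1

    def scrape(self, max_posts: Optional[int] = None) -> List[str]:
        """Return every post when max_posts is None, else at most max_posts."""
        return list(islice(self._iter_posts(), max_posts))


# Stand-in for real network calls (assumption): three pages of two posts each.
fake_forum = [["a", "b"], ["c", "d"], ["e", "f"]]

def fake_fetch(page: int) -> List[str]:
    return fake_forum[page] if page < len(fake_forum) else []

scraper = ForumScraper(fake_fetch)
```

Because posts are produced lazily, a small `max_posts` stops requesting pages as soon as the limit is reached instead of downloading the whole forum first.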
Goals:
Determine which form of word embeddings best suits the scraped data.
Practice modeling and training fundamentals on scraped data.
Prepare progress deck alongside the rest of the team.
Besart
August 4, 2021, 5:15am
4
Technical Area:
Digested content pertaining to recommender system concepts.
Split data into training and test sets and attempted to train a model on that data using a simple transformer.
Navigated Colab and Jupyter Notebooks in an effort to discern pros and cons of both environments for the project at hand.
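Splitting data into training and test sets comes down to a shuffle and a slice. The sketch below is plain Python; the function name mirrors scikit-learn's `train_test_split`, but this is not that function, and the rows, the 20% test fraction, and the seed are assumptions.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (text, category) pairs standing in for the scraped posts.
rows = [(f"post {i}", "bug" if i % 2 else "question") for i in range(10)]
train, test = train_test_split(rows)
```

Holding out the test rows before any training means the reported score reflects posts the model has never seen.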
Tools:
PyCharm
Colab
Git
GitHub
Ubuntu
Pandas
NumPy
Transformers
scikit-learn
PyTorch
BERT
Soft Skills:
Facilitated the icebreaker during the meeting.
Proactively met and discussed with leads via Discord and Google Meet.
Maintained communication on Discord.
Achievement Highlights:
Delved into materials in an effort to clarify machine learning concepts as issues arose.
Established additional meetings with leads to readjust project problem statement and pipeline goals.
Attended session 1 presentation to gain insights on potential future goals and obstacles.
Tasks Completed:
Calculated cosine similarity between posts.
Prepared bag-of-words and TF-IDF word embeddings ahead of training.
Attended and helped facilitate scrum meetings.
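TF-IDF weighting followed by cosine similarity between posts can be sketched with the standard library alone. This uses the plain tf * log(N / df) formula; scikit-learn's `TfidfVectorizer` uses a smoothed variant by default, so its numbers would differ. The sample posts and function names are assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn tokenized documents into {term: weight} dicts using tf * idf."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    shared = set(a) & set(b)
    num = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return num / (norm_a * norm_b)

def most_similar(index, vectors):
    """Index of the document closest to vectors[index], excluding itself."""
    scores = [(cosine(vectors[index], v), i)
              for i, v in enumerate(vectors) if i != index]
    return max(scores)[1]

# Hypothetical tokenized posts.
docs = [
    "reset password email".split(),
    "password reset link expired".split(),
    "app crashes on startup".split(),
]
vectors = tfidf_vectors(docs)
```

Ranking posts by this score and returning the closest one is the basis of a simple content-based recommender.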
Goals:
Determine the issue causing Colab and Jupyter to terminate upon attempting to train the model.
Utilize cosine similarity as basis to train a simple recommender.
Train additional classifiers.
Besart
August 10, 2021, 6:21pm
5
Technical Area:
Trained a plethora of models in an effort to gauge the best performance, including Naive Bayes, random forest, decision tree, logistic regression, and LinearSVC.
Switched between various word embedding methods to improve accuracy scoring.
Learned about various forms of hyperparameter tuning and applied some minor changes to the models.
Artificially rebalanced select categories.
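The log does not say how the categories were rebalanced. One common approach is random oversampling, sketched here under that assumption; the function name, labels, and rows are hypothetical.

```python
import random
from collections import Counter, defaultdict

def oversample(rows, seed=0):
    """Duplicate random minority-class rows until every class matches the largest."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies with replacement; k is 0 for the largest class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

# Hypothetical labeled posts: "bug" is heavily outnumbered.
rows = [("q%d" % i, "question") for i in range(6)] + [("b0", "bug"), ("b1", "bug")]
balanced = oversample(rows)
class_counts = Counter(label for _, label in balanced)
```

Oversampling is best applied to the training split only, since duplicated rows that leak into the test set inflate the measured accuracy.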
Tools:
PyCharm
Colab
Git
GitHub
Ubuntu
Pandas
NumPy
Transformers
PyTorch
Jupyter Notebooks
Seaborn
matplotlib
scikit-learn
Soft Skills:
Assisted members directly via direct messages on an as-needed basis.
Held multiple meetings with leads to determine course of action pertaining to team communication efforts.
Maintained communication on Discord and through scrum meetings.
Achievement Highlights:
Worked with co-leads to produce a problem statement befitting the fluid nature of the selected forum.
Established sub-teams to divide focus on improvements regarding the recommender system and the classifier training efforts, alongside co-leads.
Made improvements to the data cleaning process, resulting in a team-wide CSV file that served to unify and streamline training efforts.
Grew accustomed to the BERT model structure and weighed its accuracy against scikit-learn algorithms.
Tasks Completed:
Developed a rudimentary recommender system.
Trained a wide range of models and tinkered with parameters to induce accuracy improvements.
Created and tweaked confusion matrix heatmaps.
Pushed team-wide CSV to GitHub.
Attended and helped facilitate scrum meetings.
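A confusion matrix can expose problems that a single accuracy score hides. The sketch below builds one by hand and derives accuracy and per-class recall from it; the labels and predictions are made up to illustrate the point and are not project results.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual labels, columns are predicted labels."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

def accuracy(matrix):
    """Fraction of predictions that fall on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def per_class_recall(matrix):
    """Diagonal count divided by each row total (0.0 for empty rows)."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

# Hypothetical case: a model that always predicts the majority class.
labels = ["question", "bug"]
actual = ["question"] * 8 + ["bug"] * 2
predicted = ["question"] * 10
matrix = confusion_matrix(actual, predicted, labels)
```

In this made-up case accuracy is 0.8 while recall for "bug" is 0.0, which is the kind of disagreement between the score and the matrix worth checking for on imbalanced categories.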
Goals:
Gauge team progress.
Continue to modify parameters in an effort to improve accuracy scores.
Identify whether accuracy scores cohere with confusion matrix output and adjust accordingly.
Potentially add more artificial data to assist with certain categories if accuracy stagnates.
Besart
August 18, 2021, 5:46am
6
Technical Area:
Performed hyperparameter tuning on trained models.
Completed recommender.
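Hyperparameter tuning in its simplest form is an exhaustive search over a grid of candidate values. The sketch below shows only the search loop; `toy_accuracy` is a hypothetical stand-in for training a model and scoring it on validation data, and the parameter names `C` and `max_iter` are illustrative. scikit-learn's `GridSearchCV` performs the same kind of search with cross-validation.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Try every combination in grid; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical stand-in for "train a model and return validation accuracy".
def toy_accuracy(C, max_iter):
    return 1.0 - abs(C - 1.0) * 0.1 - abs(max_iter - 200) / 10000

grid = {"C": [0.1, 1.0, 10.0], "max_iter": [100, 200]}
best_params, best_score = grid_search(toy_accuracy, grid)
```

The number of combinations grows multiplicatively with each added parameter, so small grids are a practical starting point when each evaluation means training a model.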
Tools:
PyCharm
Colab
Git
GitHub
Ubuntu
Pandas
NumPy
Transformers
PyTorch
Jupyter Notebooks
Seaborn
matplotlib
scikit-learn
Soft Skills:
Assisted members directly via direct messages on an as-needed basis.
Held multiple meetings with leads to determine course of action pertaining to team communication efforts.
Maintained communication on Discord and through scrum meetings.
Achievement Highlights:
Trained and tuned BERT and RoBERTa classification models.
Achieved marked improvements in accuracy relative to former attempted models.
Tasks Completed:
Adjusted recommender system using a suitable CSV.
Trained several advanced classification models.
Produced confusion matrices for each model.
Attended and helped facilitate scrum meetings.
Goals:
Continue to modify parameters in an effort to improve accuracy scores.
Tinker with Flask app.
Complete final presentation.