BAISHALI_SOW_MONDAL - Machine Learning (Level 3) Pathway

BAISHALI_SOW_MONDAL · June 18, 2021, 3:56pm

Overview of things that I learned

Technical Area:

Learned the core concepts of machine learning and several machine learning algorithms by watching the lecture videos
Learned fundamental ideas in NLP, including topics such as distributional semantics, linguistic problems, and EBC
Started learning about web scraping and data mining

Tools:

Soft Skills:

Became better prepared for machine learning and NLP as a whole
Became more familiar with how machine learning can be applied in bioinformatics
Medline: learned about the Medline database and how to extract data from it
EBC: learned what Ensemble Biclustering for Classification (EBC) and hierarchical clustering algorithms are

Achievements and tasks:

Learned the concepts of machine learning and web scraping
Built a virtual environment
Became more familiar with machine learning, NLP, bioinformatics, and the biomedical field
Read the research paper and completed journal tasks based on it

BAISHALI_SOW_MONDAL · June 26, 2021, 3:35am

Module 1 - Overview:

Technical skills:
- Parsed the raw data from Medline using the PubMed parser
- Learned how to use the Stanford Parser
- Read and understood the assigned scientific papers
- Learned more of the foundations of dependency parsing
- Ran the dependency parser from Java
Tools/Libraries:
- Java: downloaded it and used it to parse the .txt file
- VS Code: installed it and started learning how to use it
- Stanford Dependency Parser
- Jython 2.7.2: installed successfully
Soft Skills:
- Natural Language Processing (NLP)
- Working to understand dependency parsing
- Getting more familiar with VS Code and how to debug a Python file in it
- Gained a basic understanding of the parsed database
- Attended all the teamwork sessions and discussed the work
- Virtual collaboration: actively participated in training/Q&A sessions held by Colin
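
To make the Medline parsing step listed under technical skills more concrete, here is a minimal sketch of pulling the ID, title, and abstract out of records in the PubMed XML layout. It uses only Python's standard library instead of the PubMed parser used in the project, and the sample record, the function name parse_records, and the dictionary keys are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed records in the PubMed XML layout (not real data).
SAMPLE_XML = """
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <ArticleTitle>Example title with an abstract</ArticleTitle>
        <Abstract><AbstractText>Example abstract text.</AbstractText></Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>2</PMID>
      <Article>
        <ArticleTitle>Example title without an abstract</ArticleTitle>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""


def parse_records(xml_text):
    """Return one dict per PubmedArticle with its PMID, title, and abstract."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID", default="")
        title = article.findtext("MedlineCitation/Article/ArticleTitle", default="")
        # An abstract may be split over several AbstractText elements.
        parts = [node.text or "" for node in article.iter("AbstractText")]
        records.append(
            {"pmid": pmid, "title": title, "abstract": " ".join(parts).strip()}
        )
    return records


records = parse_records(SAMPLE_XML)
# Keep only the records that actually carry an abstract.
with_abstract = [r for r in records if r["abstract"]]
print([r["pmid"] for r in with_abstract])
```

Real Medline files are much larger and carry many more fields, so a dedicated parser is the more practical choice there; the sketch only shows the shape of the data.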

Achievement Highlights:

Learned how dependency parsing works and the foundations of the Neural Transition Parser.
Finished reading the Stanford Parser Manual to gain a deeper understanding of the grammatical relationships between words and the different output formats/styles.
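
As an illustration of one of those output styles, the sketch below reads lines in the typed-dependency text format, where each line looks like relation(head-index, dependent-index), and turns them into tuples. The three sample lines are hand-written for the sentence "Aspirin inhibits COX-2" rather than actual parser output, and the names Dependency and parse_typed_dependencies are my own.

```python
import re
from typing import NamedTuple


class Dependency(NamedTuple):
    relation: str
    head: str
    head_index: int
    dependent: str
    dependent_index: int


# relation(head-index, dependent-index); tokens may themselves contain hyphens.
_DEP_LINE = re.compile(
    r"(?P<rel>[^()\s]+)\((?P<head>\S+)-(?P<hi>\d+), (?P<dep>\S+)-(?P<di>\d+)\)"
)

# Hand-written lines for "Aspirin inhibits COX-2" (illustrative only).
SAMPLE = """\
nsubj(inhibits-2, Aspirin-1)
root(ROOT-0, inhibits-2)
dobj(inhibits-2, COX-2-3)
"""


def parse_typed_dependencies(text):
    """Parse typed-dependency lines into Dependency tuples, skipping other lines."""
    deps = []
    for line in text.splitlines():
        match = _DEP_LINE.fullmatch(line.strip())
        if match:
            deps.append(
                Dependency(
                    match["rel"],
                    match["head"],
                    int(match["hi"]),
                    match["dep"],
                    int(match["di"]),
                )
            )
    return deps


deps = parse_typed_dependencies(SAMPLE)
for dep in deps:
    print(dep)
```

Keeping the token index next to each word matters because the same word can appear more than once in a sentence.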

Tasks Completed:

Goals for The Upcoming Week:

Combine the output of the PubMed parser with the Stanford parser and integrate it with EBC.

BAISHALI_SOW_MONDAL · July 26, 2021, 7:42pm

Module-2 Overview:

Technical skills:
- Built the sparse dependency matrix using the Stanford Parser
- Used spaCy, a common NLP library, for dependency parsing
- Learned more about sparse matrices in machine learning algorithms and how they can be used in dependency parsing
- Learned how Stanford NLP works and how it can be used in a pipeline that converts a string of human-language text into lists of sentences and words, generates the base forms of those words, their parts of speech, and morphological features, and produces a syntactic dependency parse designed to be parallel across more than 70 languages
Soft skills:
- Stanford NLP
- Dependency Parsing
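
The sparse dependency matrix from the technical skills above can be sketched roughly as follows. I am assuming a layout where rows are drug-gene pairs, columns are dependency paths, and each cell counts how often the pair was seen with that path; the observations, the path strings, and the function name build_sparse_matrix are invented for illustration. Only the nonzero cells are stored, which is the point of a sparse matrix.

```python
from collections import Counter

# Hypothetical (pair, path) observations, made up for illustration.
observations = [
    (("drugA", "geneX"), "nsubj|inhibits|dobj"),
    (("drugA", "geneX"), "nsubj|inhibits|dobj"),
    (("drugB", "geneY"), "nsubj|activates|dobj"),
    (("drugB", "geneX"), "nsubj|inhibits|dobj"),
]


def build_sparse_matrix(observations):
    """Map pairs to rows and paths to columns; store only nonzero counts."""
    row_index, col_index = {}, {}
    counts = Counter()
    for pair, path in observations:
        # Assign the next free index the first time a pair or path is seen.
        r = row_index.setdefault(pair, len(row_index))
        c = col_index.setdefault(path, len(col_index))
        counts[(r, c)] += 1
    return row_index, col_index, dict(counts)


row_index, col_index, cells = build_sparse_matrix(observations)
shape = (len(row_index), len(col_index))
print("shape:", shape)
print("nonzero cells:", cells)
```

If SciPy is available, the same (row, column, count) triplets can be handed to scipy.sparse.coo_matrix to get a matrix object that clustering code can work with.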

Three Achievement Highlights:

Goals for The Upcoming Week:

Extract the research papers that contain abstracts, and extract the drug-gene pairs with their dependency paths
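
For the dependency path part of this goal, one possible approach is a breadth-first search over the parse treated as an undirected graph. This is only a sketch under assumptions: the edges are hand-written for the sentence "Aspirin strongly inhibits COX-2", words are used directly as node names (a real sentence would need token indices to tell repeated words apart), and dependency_path is a hypothetical helper, not part of the Stanford parser.

```python
from collections import deque

# Hand-written (head, relation, dependent) edges, illustrative only.
edges = [
    ("inhibits", "nsubj", "Aspirin"),
    ("inhibits", "advmod", "strongly"),
    ("inhibits", "dobj", "COX-2"),
]


def dependency_path(edges, start, goal):
    """Return the shortest word/relation path from start to goal, or None."""
    graph = {}
    for head, rel, dep in edges:
        # Store each edge in both directions so the search can move up or down.
        graph.setdefault(head, []).append((dep, rel))
        graph.setdefault(dep, []).append((head, rel))

    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for neighbor, rel in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [rel, neighbor]))
    return None


path = dependency_path(edges, "Aspirin", "COX-2")
print(path)
```

The words off the path, such as the adverb here, are dropped, so sentences with different wording can still end up with the same path.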

Tasks Done:

BAISHALI_SOW_MONDAL · July 28, 2021, 4:07pm

Module-3 Overview:

Technical skills:
- Filtered the Medline publications to those that contain abstracts, using the PubMed parser
- Learned to use string matching to extract the sentences that contain drug-gene pairs, as input to the Stanford parser
- Successfully extracted drug-gene pairs using DrugBank (for drugs) and PharmGKB (for genes)
- Learned how to use the Stanford parser to extract the dependency paths of the drug-gene pairs
- Biclustered the dependency matrix using the Ensemble Biclustering algorithm
- Successfully constructed a graph from an arbitrary number of data files in Dask
Soft skills:
- Biclustering
- EBC
- Dask
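
The string-matching step from the technical skills above could look something like the sketch below: keep a sentence only if it mentions at least one known drug and one known gene. The two small name sets stand in for the DrugBank and PharmGKB lists, and the example sentences and the function name find_pairs are written just for illustration.

```python
import re

# Hypothetical mini-lexicons standing in for the DrugBank and PharmGKB name lists.
DRUGS = {"aspirin", "warfarin"}
GENES = {"CYP2C9", "VKORC1"}


def find_pairs(sentence, drugs=DRUGS, genes=GENES):
    """Return every (drug, gene) pair whose names both occur in the sentence."""
    tokens = re.findall(r"[A-Za-z0-9\-]+", sentence)
    # Drug names are compared in lower case, gene symbols in upper case.
    found_drugs = sorted({t.lower() for t in tokens if t.lower() in drugs})
    found_genes = sorted({t.upper() for t in tokens if t.upper() in genes})
    return [(d, g) for d in found_drugs for g in found_genes]


# Example sentences written for illustration.
sentences = [
    "Warfarin dose is influenced by CYP2C9 and VKORC1 variants.",
    "Aspirin is widely used.",
    "VKORC1 is a gene.",
]

# Keep only sentences with at least one drug-gene pair.
selected = {s: find_pairs(s) for s in sentences if find_pairs(s)}
for sentence, pairs in selected.items():
    print(pairs, "<-", sentence)
```

Exact token matching like this misses multi-word names and synonyms, so it is only a starting point for choosing which sentences go to the parser.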

Three Achievement Highlights:

Goals for The Upcoming Week:

Tasks Done: