Machine Learning for Biomedical Data
The field of machine learning (ML) provides methodologies that are well suited to the task of extracting knowledge from data. Both statistical modelling and ML seek to build a mathematical description, a model, of the data and the underlying mechanism it represents; inevitably, there is substantial overlap between the two. Historically, however, they differ in their rationale. Statistical models start with an assumption about the underlying data distribution (e.g. Gaussian, Poisson). The focus is on inference: estimating the parameters of the statistical model that most likely gave rise to the observed data, and providing uncertainty bounds for these estimates. For ML, the focus is typically on prediction: without necessarily assuming a functional form for the data distribution, a model that achieves optimal predictive performance is identified. It is this hypothesis-free approach that makes ML an attractive choice for dealing with complex data sets. While in traditional statistical modelling a hypothesis (model) is put forward and then accepted or rejected depending on how consistent it is with the measured observations, ML methods learn this hypothesis directly from the training data set. In this course, we will review some of the useful ways machine learning is used in clinical research (for clinical end use) and in industry research and development, ranging from early discovery to challenges emerging in clinical trials.
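The contrast between inference and prediction described above can be made concrete with a small sketch. It is illustrative only and is not taken from the course material: the data are synthetic, the "expression level" framing is an assumption, and the helper names `mean_with_ci`, `knn_predict` and `make_data` are hypothetical. The first part assumes a Gaussian distribution and reports a parameter estimate with an approximate 95% interval; the second part makes no distributional assumption and judges a simple nearest-neighbour classifier purely by its accuracy on held-out data.

```python
import math
import random
import statistics


def mean_with_ci(values, z=1.96):
    """Inference: estimate a Gaussian mean with an approximate 95% interval."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return m, (m - z * se, m + z * se)


def knn_predict(train, x, k=5):
    """Prediction: majority label among the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def make_data(n, rng):
    """Synthetic measurement for controls (label 0) and cases (label 1)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        value = rng.gauss(5.0 + 2.0 * label, 1.0)  # assumed group means 5 and 7
        data.append((value, label))
    return data


rng = random.Random(0)
train, test = make_data(200, rng), make_data(100, rng)

# Statistical view: a parameter estimate plus its uncertainty for the case group.
case_values = [v for v, y in train if y == 1]
mean, ci = mean_with_ci(case_values)

# ML view: the model is judged only by accuracy on data it has not seen.
accuracy = sum(knn_predict(train, v) == y for v, y in test) / len(test)

print(f"case mean = {mean:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"held-out accuracy = {accuracy:.2f}")
```

The nearest-neighbour rule was chosen here only because it needs no distributional assumption and fits in a few lines; any classifier evaluated on a held-out set would illustrate the same point.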
Many problems of interest to clinicians and pharmaceutical companies can be addressed using machine learning approaches. These problems include:
• molecular mechanisms of disease
• co-morbidity and other factors
• design of signatures for the identification of potential responders to therapies
• analysis of molecular mechanisms of disease progression
• classification of patients
• detection of toxicity at the in vitro stage
• analysis of molecular mechanisms of a drug action, including identification of primary targets and “reaction waves” caused by the initial impact (“target discovery”)
• identification of additional diseases for which a drug may be effective (“drug repurposing” or “repositioning”)
Sophisticated commercial software solutions have been developed to address these challenges; however, knowledge of basic methods is important both for optimizing these solutions and for using them efficiently, with a correct and deep interpretation of results.
- Lectures 27
- Quizzes 5
- Duration 50 hours
- Skill level Intermediate
- Language English
- Students 347
- Certificate Yes
- Assessments Yes
What is Machine Learning?
This section will provide a conceptual overview of how Machine Learning became a "must have" asset for biomedical research.
High Throughput Data
Preparing High Throughput Biomedical Data For Machine Learning
Preparing High-throughput Data for Machine Learning
Samples, replicates and numbers
The title of this course is MISLEADING!
This course begins with a nice introduction to two different types of Machine Learning (ML), i.e., supervised and unsupervised. However, it clearly fails to apply these ML techniques to the biomedical (RNA-seq) data set used in the course! After the "Intro to ML" section, the course dives deep into explaining the high-throughput techniques used in bioinformatics and into analyzing/preparing raw RNA-seq data for "statistical" (as opposed to ML) analysis by building RNA-seq pipelines. It then gives an introduction to doing statistical analysis on RNA-seq data (box plots, descriptive statistics, normalization, etc.), which I should say is nicely covered (hence the 2 stars!). But the title of the course is clearly misleading, as they don't apply any specific ML technique to RNA-seq data in this course. So I think they should call this course "Introduction to Statistical Analysis on RNA-Seq data" and completely eliminate the ML component, since it is only briefly covered (in the first session).
BioML manages to combine all the significant portions that Transcriptomics & Genomics covered, and provides you with a recapitulation. It allows you to test yourself with the 5 different quizzes that it has at the end of each module. While BioML does not really guide you through any specific tutorial on running pipelines, it definitely provides a brief introduction to statistics to take meaning out of the outputs you have generated in the previous omics courses. All in all, an apt summarizing course for the omics courses and a perfect start for upcoming ML analysis courses!
BioML covers all the basics of machine learning, from gathering data through to its classification and clustering.