Machine Learning for Biomedical Data (LBRN)
NOTE: This workshop is free for LBRN members; online access for non-LBRN members is available upon registration.
This workshop provides an overview of useful unsupervised and supervised techniques in data mining and machine learning. We will use an oncology example to discuss the methods, how to tune them, and how to apply them in combination with standard regression-based and hypothesis-testing techniques to get the most out of your data.
The field of machine learning (ML) provides methodologies that are well suited to extracting knowledge from data, particularly large and complex datasets. Both statistical modeling and ML seek to build a mathematical description, or model, of the data and thus of the underlying mechanism it represents. Inevitably there is substantial overlap between the two approaches; however, they differ in emphasis: statistical modeling infers, while ML predicts. Statistical models start with an assumption about the underlying data distribution (e.g. Gaussian, Poisson), estimate the parameters of the model that most likely gave rise to the observed data, and then either accept or reject the model depending on how consistent it is with the measured observations. In ML, by contrast, a model that achieves optimal predictive performance is identified without necessarily assuming a particular distribution for the data. It is this hypothesis-free approach that makes ML an attractive choice for dealing with complex datasets. In this course, we will review some of the useful ways ML is used in clinical research (for clinical end-use) and in industry research and development, ranging from early discovery to challenges that emerge in clinical trials.
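As a rough illustration of the contrast drawn above between inferring and predicting, the sketch below uses only the Python standard library. Everything in it is an assumption made for illustration and is not workshop material: the data are synthetic, the statistical model is a single Gaussian fitted by maximum likelihood, and the ML model is a simple k-nearest-neighbour predictor whose k is chosen purely by error on held-out data.

```python
import math
import random
import statistics

random.seed(0)

# Synthetic, hypothetical data: one predictor x and a noisy nonlinear response y.
xs = [random.uniform(0.0, 10.0) for _ in range(200)]
ys = [math.sin(x) + random.gauss(0.0, 0.3) for x in xs]

# Statistical-modeling view: assume y is Gaussian and estimate its parameters
# by maximum likelihood (sample mean and population standard deviation).
mu_hat = statistics.fmean(ys)
sigma_hat = statistics.pstdev(ys)

def gaussian_log_likelihood(values, mu, sigma):
    """Log-likelihood of the values under a Gaussian with the given parameters."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * sigma ** 2)
        - (v - mu) ** 2 / (2.0 * sigma ** 2)
        for v in values
    )

log_lik = gaussian_log_likelihood(ys, mu_hat, sigma_hat)

# ML view: no distribution is assumed; the model is chosen by how well it
# predicts observations that were held out from fitting.
train = list(zip(xs[:150], ys[:150]))
held_out = list(zip(xs[150:], ys[150:]))

def knn_predict(x_new, data, k):
    """Predict y as the mean response of the k training points nearest to x_new."""
    nearest = sorted(data, key=lambda point: abs(point[0] - x_new))[:k]
    return statistics.fmean(y for _, y in nearest)

def held_out_mse(k):
    """Mean squared prediction error on the held-out set for a given k."""
    return statistics.fmean(
        (knn_predict(x, train, k) - y) ** 2 for x, y in held_out
    )

candidate_ks = [1, 3, 5, 10, 25, 50]
best_k = min(candidate_ks, key=held_out_mse)

# Baseline for comparison: always predict the mean of the training responses.
train_mean = statistics.fmean(y for _, y in train)
baseline_mse = statistics.fmean((train_mean - y) ** 2 for _, y in held_out)

print(f"Gaussian fit: mu={mu_hat:.3f}, sigma={sigma_hat:.3f}, "
      f"log-likelihood={log_lik:.1f}")
print(f"k-NN: best k={best_k}, held-out MSE={held_out_mse(best_k):.3f}, "
      f"constant-mean baseline MSE={baseline_mse:.3f}")
```

The first half produces parameter estimates and a likelihood that can be used to judge whether the assumed distribution is consistent with the data. The second half never states a distribution and is judged only by its held-out prediction error. In practice the two views are often combined, as the workshop description suggests.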