Interest in data science is growing across industries as data becomes a key part of business, finance, and now biomedical research and medicine in general. Biologists, clinicians and lab technicians can set themselves apart, advance research and become more efficient in their day-to-day activities by leveraging data. However, the biomedical data of the future is high-throughput, molecular-level data that is non-trivial to process, analyze and interpret. High-throughput data such as genomics, transcriptomics, metabolomics and even structural chemical data are poised to revolutionize our understanding of health and disease.
To address this challenge, most courses focus on developing technical skills like coding and statistics, but those can be intimidating for someone who is just getting started. That’s why we created a course that leverages user-friendly tools and modern techniques to help you take advantage of the data revolution happening in the world of biomedicine. In just 1 month, you will learn about high-throughput biomedical datasets and projects in oncology and neurodegenerative diseases, perform your own in-depth analysis to discover the biological processes driving disease, and create your own gene signatures associated with disease onset and progression. At the same time, you will come to understand the workflow of a data scientist who has to leverage infrastructural solutions like the T-BioInfo platform and user-friendly coding environments like RStudio. Key concepts in advanced analysis, from exploration to machine learning, will become clear in the context of their use in biomedical research and discovery.
Curriculum: The 4-week training program consists of the following modules:
Session 1: Introduction to High-throughput Biomedical Data
- What is high-throughput biomedical data
- Examples: transcriptomics, genomics, epigenetics, metagenomics, chemical
- Processing and standardizing high-throughput datasets
- Repositories and databases
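As a rough illustration of what "processing and standardizing" a high-throughput dataset can involve, the sketch below log-transforms a tiny made-up expression table and then z-scores each gene across samples. It is written in Python using only the standard library, although the course itself works in R and T-BioInfo. The gene names, counts and the function name `standardize_gene` are hypothetical, and the actual course pipeline may use different steps.

```python
import math
import statistics

def standardize_gene(counts):
    """Log2-transform raw counts (pseudocount of 1), then z-score them."""
    logged = [math.log2(c + 1) for c in counts]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)
    if sd == 0:
        # A gene with no variation across samples carries no signal here.
        return [0.0 for _ in logged]
    return [(v - mean) / sd for v in logged]

# Hypothetical raw counts: one entry per gene, one value per sample.
raw_counts = {
    "GENE_A": [0, 3, 15, 63],
    "GENE_B": [7, 7, 7, 7],
}

standardized = {gene: standardize_gene(c) for gene, c in raw_counts.items()}
for gene, values in standardized.items():
    print(gene, [round(v, 2) for v in values])
```

After this step every gene is on a comparable scale, which is one common reason to standardize before exploratory analysis.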
Session 2: Introduction to R: Data structures and Visualization
- Data Visualization (DV) – For Decision Making
- Exploratory Data Analysis using DV
- Data Quality Diagnostics
- Density charts and modal diagnostics
Session 3: Efficiently processing large-scale high-throughput data with T-BioInfo and Machine Learning methods for Exploratory Data Analysis
- HPC environments for data processing: T-BioInfo
- Machine learning and statistical analysis
- Visualization and annotation
- PCA output visualizations (creating biplots)
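To give a feel for what sits behind a PCA biplot, here is a minimal sketch that finds the first principal component of a small made-up dataset by power iteration on the covariance matrix. It is plain Python for illustration only; the course uses R and T-BioInfo, the sample values are invented, and `first_principal_component` is a hypothetical helper, not part of any library. A biplot draws the sample scores and the feature loadings (the direction vector) on the same axes.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows: list of samples, each a list of feature values.
    Uses power iteration on the covariance matrix; a teaching sketch only.
    """
    n, p = len(rows), len(rows[0])
    # Center each feature (column) at zero.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: multiply by the covariance matrix and renormalize.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered sample onto the component to get its score.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores

# Hypothetical data: four samples, two features that rise together.
samples = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
direction, scores = first_principal_component(samples)
print("loadings:", [round(x, 3) for x in direction])
print("scores:", [round(s, 3) for s in scores])
```

In practice a library routine would compute all components at once; the loop here only makes the idea of "direction of greatest variance" concrete.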
Session 4: Modifying R scripts from your pipelines to make visualizations more informative, interpreting your data, and project planning
- Annotating genes
- Pathways and networks
- Interaction and integration
- Exploratory Data Analysis Concepts