PCA: biomedical data visualization in R
This module demonstrates how multivariate analysis methods help visualize complex biomedical data in R, using RNA-seq as an example. The tutorial covers the following topics:
- The need for dimensionality reduction techniques for high-throughput biomedical data such as RNA-seq
- The utility of principal component analysis for biomedical data visualization in R
- Using the R coding environment to improve the visualization and interpretation of analysis results
The course is divided into three major components:
- Analysis of RNA-seq data: Preparing the Gene Expression Matrix
- Principal Component Analysis and how to execute it on T-BioInfo
- Expanding PCA Visualization using R
Objective: We will begin with the context for principal component analysis (PCA), discussing how it supports the visualization of complex data by reducing dimensionality. We will use the T-BioInfo platform to process raw RNA-seq data and then run PCA in the data mining section of the platform. We will then examine the output plot, learn how to run the available R script on our own computers, and work through the code. Finally, we will expand the basic functions of this script to gain further insight into the samples and gene expression patterns in this dataset.
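To make the idea of reducing dimensionality concrete, here is a minimal sketch of PCA on a gene expression matrix in base R. It is not the script provided by the T-BioInfo platform. The matrix is simulated, and the names `expr`, `scores`, and `var_explained`, the matrix size, and the group shift are all assumptions made for illustration.

```r
# Simulated stand-in for a gene expression matrix (genes in rows, samples in columns).
# All values and names here are hypothetical, not taken from the course dataset.
set.seed(42)
n_genes <- 500
n_samples <- 6
expr <- matrix(rnorm(n_genes * n_samples, mean = 8, sd = 1),
               nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("sample", seq_len(n_samples))))

# Shift the first 100 genes in the last three samples to mimic a group difference.
expr[1:100, 4:6] <- expr[1:100, 4:6] + 4

# prcomp() treats rows as observations, so transpose to samples x genes.
pca <- prcomp(t(expr), center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Coordinates of each sample on the first two principal components.
scores <- pca$x[, 1:2]

print(round(var_explained, 3))
print(scores)
```

Each sample, originally described by 500 gene values, is now summarized by two coordinates that can be drawn on a scatter plot. With a real dataset, `expr` would be replaced by the gene expression matrix prepared in the first part of the course.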
In this course, you will learn about the T-BioInfo platform and the RStudio IDE (Integrated Development Environment). After testing the initial script on a sample dataset with standard configurations, we will explore several analysis and visualization packages to make visual improvements. This exercise will be useful for those seeking to improve their skills in data analysis and visualization. We will walk through all the steps necessary to make and improve PCA scatter plots and to understand the analysis results.
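As one illustration of the kind of visual improvement meant here, the sketch below colors samples by group, labels each point, and reports the percentage of variance on the axes. It uses only base R graphics, which is a choice made for this example; the course itself may use other packages. The data are simulated, and the group labels `control` and `treated` and the color choices are hypothetical.

```r
# Simulated expression matrix (genes x samples); values are hypothetical.
set.seed(1)
expr <- matrix(rnorm(300 * 6, mean = 8), nrow = 300,
               dimnames = list(NULL, paste0("sample", 1:6)))
expr[1:60, 4:6] <- expr[1:60, 4:6] + 4

# Hypothetical sample annotation: three samples per group.
group <- factor(rep(c("control", "treated"), each = 3))

# PCA on samples (rows after transposing).
pca <- prcomp(t(expr), center = TRUE, scale. = TRUE)

# Percentage of variance per component, used in the axis labels.
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# One color per group.
group_cols <- c(control = "steelblue", treated = "firebrick")

plot(pca$x[, 1], pca$x[, 2],
     col = group_cols[as.character(group)], pch = 19, cex = 1.5,
     xlab = paste0("PC1 (", pct[1], "% variance)"),
     ylab = paste0("PC2 (", pct[2], "% variance)"),
     main = "PCA of simulated expression data")

# Label each point with its sample name and add a legend for the groups.
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3, cex = 0.8)
legend("topright", legend = levels(group),
       col = group_cols[levels(group)], pch = 19)
```

Putting the variance percentages on the axes helps a reader judge how much of the overall structure the two plotted components actually capture.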
Prerequisites: For those not familiar with RNA-seq data, gene expression and what type of information it can offer, we recommend completing the Transcriptomics 1 course. This online course will provide a detailed explanation of RNA-seq data and basic analysis steps. In our example, we will use public-domain data. You are encouraged to explore the dataset and read the associated publication – both are described in detail in the project Modeling Precision Medicine.
- Lectures: 10
- Quizzes: 0
- Duration: 50 hours
- Skill level: All levels
- Language: English
- Students: 202
- Certificate: No
- Assessments: Yes
Preparing Data for Analysis
Principal Component Analysis and Visualization
Expanding PCA Visualization using R
A scientifically structured course for those who want to visualize biomedical data
In the era of biomedical big data analysis, it is now crucial to learn various statistical methods and computational algorithms for data visualization. This course is therefore exactly what I was looking for. Its part-by-part division into RNA-seq data preparation, Principal Component Analysis (why it is needed and how to perform it), and R scripting (from how to download RStudio to how to code and run a program that expands the ways of visualizing data) is very helpful for combining biology, computational operations, and statistical methods to find solutions to real-life biological problems. I therefore strongly recommend this wonderful course to my fellow batchmates and peers.
Wonderful course! I highly recommend it.
PCA: biomedical data visualization in R is a very detailed course that discusses how to perform PCA and how to improve the visualization, both for aesthetics and for a better explanation of the biomedical data. The course not only explains what PCA is but also walks the user through using R to perform exploratory data analysis from scratch, step by step.