Transcriptomics 3 continues from Transcriptomics 1 and 2. In Transcriptomics 1, we learned how to convert raw reads from a next-generation sequencer into an expression table and how to visualize that table using PCA, which we then used to develop hypotheses. In Transcriptomics 2, we looked at statistical methods for identifying differentially expressed elements between known groups of samples. We explored Student’s t-test, empirical Bayes methods such as DESeq and edgeR, and Factor Regression Analysis to dissect the influence of multiple factors. Here, we will explore methods for identifying groups of samples without prior knowledge (clustering), and then examine methods for building classifiers from known samples in order to classify unknown samples.
We will explore the clustering and classification methods using a biological example from this publication (Modeling precision treatment of breast cancer). We will not repeat the analysis presented in the paper; rather, we will re-analyze its data and then compare the authors’ results with our own.