Apaar Agrawal studied bioinformatics at Amity University, Jaipur. After completing several online programs and serving as an ambassador for the OmicsLogic training program, he joined Pine Biotech as a bioinformatics research intern. He recently completed the Transcriptomics course collection. In this post, he shares his summary of the course content and how he benefited from taking these courses. You can visit his profile by following this link: https://edu.t-bio.info/members/apaaragrawal/courses/
Transcriptomics is one of the fastest-growing and most important areas of study in the life sciences. With the exponential growth in the availability of biomedical data, biologists in general, and not just bioinformaticians like us, need ample knowledge of this field.
When I was an undergraduate student of bioinformatics, Pine Biotech’s Transcriptomics course helped me build a poster and a report on differential analysis of breast cancer cell lines. The poster was presented, together with other contributors, at the Louisiana Biomedical Research Network conference in Louisiana, USA.
My thoughts about the Transcriptomics Course series on T-BioInfo:
Pine Biotech’s online Transcriptomics course series provides much-needed knowledge of this field, making NGS analyses easy and approachable for biologists and clinicians who do not have a background in coding.
The series is divided into three parts: Transcriptomics 1, 2 and 3. Each part helps people better understand what biomedical data is, what the data types are, how the data is processed and analysed, which tools and algorithms are used to analyse such large datasets, and how the processed data and its analysis are visualised. Together, they enable learners to conduct an independent project or to apply what they learned to their own research.
Transcriptomics 1 (T1) helped me understand the biology of the transcriptome and how the concept came about. It then explains vividly how RNA-seq data is generated, giving in-depth knowledge of the wet-lab work behind data generation. Although that work happens in the wet lab, Pine Biotech’s virtual course modules, with their elaborate videos, glossary, and animations, helped me visualise it from my dry, computational side. The course explains the steps involved in processing and analysing this data in a guided, step-by-step manner, with supplementary educational material and a worked example.
For me, as a student of bioinformatics, this course acted as a foundation for learning about the transcriptome from scratch by applying knowledge of 1) data science, 2) computational analysis, 3) statistics, and 4) biology for data interpretation and post-processing.
The course introduces the integration of all these domains, along with the much-needed basics of the algorithms involved, and covers the following concepts in detail:
- Pre-processing of the data with tools such as PCR clean and Trimmomatic
- Mapping of the raw reads, explained through pipeline tools such as Bowtie, TopHat, and Cufflinks
- Differential expression tools such as edgeR and DESeq
- The statistics needed to turn this data into information, and machine learning concepts such as clustering and component analysis
- The tools needed for annotation, and finally how to interpret the resulting information
Transcriptomics 1 provides a background in all three domains (biology, computer science, and statistics) so that a beginner can work on complex data and make sense of it in the successive courses, Transcriptomics 2 and 3.
Transcriptomics 2 (T2) continues with the concepts taught in Transcriptomics 1, elaborating on them and enabling one to use those tools and techniques to analyse biomedical data on the T-BioInfo analytical platform. T2 focuses on differential analysis using statistical methods such as the t-test, applied after the data has been processed and normalized.
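To make the idea of a per-gene differential analysis concrete, here is a minimal sketch in plain Python. It is not the T-BioInfo implementation: the gene names and expression values are made up, and the code only computes Welch's t-statistic and a log2 fold change for each gene. A real analysis would also need a p-value and multiple-testing correction, for example via `scipy.stats.ttest_ind` with `equal_var=False`.

```python
import math
from statistics import mean, variance


def welch_t(treated, control):
    """Welch's t-statistic for two independent groups (unequal variances).

    Assumes each group has at least two values and that the groups are
    not both constant (otherwise the standard error would be zero).
    """
    standard_error = math.sqrt(
        variance(treated) / len(treated) + variance(control) / len(control)
    )
    return (mean(treated) - mean(control)) / standard_error


def log2_fold_change(treated, control):
    """log2 of the ratio of group means (values must be positive)."""
    return math.log2(mean(treated) / mean(control))


# Hypothetical, already-normalized expression values:
# gene -> (control replicates, treated replicates)
expression = {
    "GENE_A": ([10.0, 12.0, 11.0], [30.0, 33.0, 27.0]),
    "GENE_B": ([20.0, 21.0, 19.0], [20.0, 22.0, 18.0]),
}

results = {}
for gene, (control, treated) in expression.items():
    results[gene] = {
        "t": welch_t(treated, control),
        "log2fc": log2_fold_change(treated, control),
    }

for gene, stats in results.items():
    print(f"{gene}: t = {stats['t']:.2f}, log2FC = {stats['log2fc']:.2f}")
```

In this toy data, GENE_A shows a large t-statistic and a positive fold change, while GENE_B has identical group means and therefore a t-statistic of zero.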
The Transcriptomics 2 course helped me understand the theoretical concepts from T1 more clearly by enabling me to convert the theory into a practical pipeline. As a bioinformatician, I found T2 very helpful because it gave me insight into what an RNA-seq pipeline looks like, how it takes inputs, and how it analyses them to produce informative results in a very visual manner. I had failed to achieve this on my own because of computational limitations: RNA-seq data is huge and requires processing power and time that one does not usually have on a personal system. Since Pine’s analytical platform runs the pipeline in the cloud, we get far more processing power and storage space.
The module sheds light on the statistical methods used to select the factors responsible for the results we observe by covering factor regression analysis. These were terms I had studied in college statistics lectures on mathematical data; Pine’s T2 module bridges that gap and links statistics to biology.
Not only does this module cover the details of executing the pipeline, it also covers the complications one would face during the process. This enables students to conduct the same type of research with different ideas and data completely on their own, making them independent.
Transcriptomics 3 (T3) continues from T1 and T2, where we learnt how to process NGS data and how to use statistics to extract meaningful information from it with tools such as the t-test, edgeR, DESeq, and factor regression analysis. That material builds a base for clustering, grouping, and differentiating the components of the data with or without prior knowledge, thus covering the machine learning aspects of biology needed to develop classifiers.
This course helped me turn my knowledge of statistics and machine learning into practical experiments, with example pipelines and quizzes to test my learning. T3 explains in detail supervised and unsupervised analyses for classification and clustering, followed by the annotation of biomedical data, and it can enable even a non-biologist to understand “biology as data science”.
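As a small illustration of unsupervised analysis, the sketch below groups samples by their expression profiles using k-means clustering written in plain Python. K-means is chosen here only as a familiar example; the course's own pipelines may use different methods. The sample names and values are hypothetical, and the initialisation (taking the first k samples as starting centroids) is a simplification to keep the result deterministic.

```python
import math


def kmeans(points, k, iterations=10):
    """Very small k-means: returns (cluster index per point, centroids).

    Simplification: the first k points are used as the initial centroids.
    """
    centroids = [list(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical samples, each described by the expression of two genes.
samples = {
    "S1": [1.0, 1.2],
    "S2": [8.0, 8.5],
    "S3": [0.8, 1.0],
    "S4": [8.2, 7.9],
}

names = list(samples)
labels, centroids = kmeans([samples[n] for n in names], k=2)
cluster_of = dict(zip(names, labels))

for name in names:
    print(f"{name} -> cluster {cluster_of[name]}")
```

No group labels are given to the algorithm, yet samples with similar profiles (S1 with S3, S2 with S4) end up in the same cluster. A supervised classifier would instead be trained on samples whose groups are already known.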