Background: Subhrajit Barua is a final-year undergraduate student of B.Tech Biotechnology at Amity University, Kolkata, who is now doing his bioinformatics internship in India with Pine Biotech. You can learn more about him and see his T-BioInfo profile here: https://edu.t-bio.info/members/subhrajit-barua/
How did I find out about the internship?
I found out about the internship from Dr. Pratim Chakraborty and Dr. Mohit Mazumder when they came to conduct a workshop at Amity University, Kolkata. They spoke about the role bioinformatics plays across multiple industries, and I was interested to learn more. While I already knew something about bioinformatics from my courses at Amity University, this internship opportunity motivated me to go further.
Previous knowledge of bioinformatics?
In my second year at Amity University, we had classes dedicated to bioinformatics. My curriculum covered BLAST, the Needleman-Wunsch algorithm, the Smith-Waterman algorithm, phylogenetic tree construction, Next Generation Sequencing (Illumina), Pfam, GOR4, etc. I felt like I knew a lot about bioinformatics already, but I was not sure how to proceed or how bioinformatics is actually used in industry.
Why was I interested in this internship?
This bioinformatics internship was interesting to me because it involves multi-omics and machine learning, two very popular domains in translational research. Participating in this internship gave me insight into the various prospects of bioinformatics and of biology as a data science in the world today. I gained practical knowledge of data analysis problems and experience in solving them. It was also great to have experts next to me, ready to guide me and answer my questions.
What was I expecting?
When I joined this internship, I was expecting to do a project and gain knowledge and experience simultaneously. Learning something new was my main motivation. I feel like this bioinformatics internship is a great way to achieve these goals.
In the first 20 days of this internship, I completed many of the basic bioinformatics courses, including:
- Introduction to Bioinformatics
- Transcriptomics 1
- Transcriptomics 2
- Transcriptomics 3
- Transcriptomics 4
- Introduction to Genomics
- Genomics 1
- Introduction to Metagenomics
- Epigenetics 1
The courses lay a strong foundation in the algorithms and tools used in analysis pipelines, such as:
- For pre-processing of data – PCR clean, Trimmomatic, etc.
- For read mapping and transcript assembly – TopHat, Bowtie, Cufflinks, etc.
- For differential expression – DESeq, edgeR, etc.
- For variant calling – Strelka, etc.
Machine learning is used to analyze multi-omics data because the datasets are so large (Big Data). The courses also cover numerous machine learning algorithms used to analyze biological data, such as:
- Principal Component Analysis (PCA)
- Clustering: hierarchical clustering and k-means clustering
- Linear Discriminant Analysis, Support Vector Machine, etc.
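To give a flavor of how one of these algorithms works, here is a minimal sketch of k-means clustering in plain Python. It is only an illustration under stated assumptions: the sample values, the two "genes", and the choice of starting centroids are made up, and none of it comes from the courses or from a real dataset.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: each sample is (expression of gene A, expression of gene B).
samples = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]

# Start from one sample in each visually obvious group (an arbitrary choice).
labels, centroids = kmeans(samples, centroids=[samples[0], samples[3]])
print(labels)
```

In this toy case the first three samples end up in one cluster and the last three in the other. On real expression data there are thousands of genes per sample, so a step such as PCA is often used first to reduce the number of dimensions.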
I also learned about data interpretation and annotation using NCBI, DAVID, etc., which is the most important part of any biomedical research.
What was difficult?
Understanding the underlying mathematics of machine learning was a little difficult for me at first, but after repeated reading, using the algorithms in pipelines, and getting help from Dr. Mohit Mazumder and Dr. Pratim Chakraborti, the concepts became clear.
I am very impressed with Pine Biotech’s educational platform, T-bio.info, and its courses. I learned about so many new concepts, algorithms, and tools in a short span of time: just 20 days, to be precise. The courses are designed for both beginners and experts in the field of biomedical research. After going through the various courses on the platform, I now have a firm understanding of the basics of multi-omics and its applications. As an undergraduate student of Biotechnology, I found these courses very helpful and rich in knowledge. Pine Biotech’s courses on Transcriptomics, Genomics, Epigenetics, etc. have introduced me to the concept of “Biology as a data science”.
The entire team at Pine Biotech is very helpful. Users get one-on-one interaction, guidance, and help. I gained a lot of knowledge and experience from Pine Biotech. I definitely recommend it to everyone!
What am I doing now?
I am working on a project that takes data from TCGA (The Cancer Genome Atlas) and applies multi-omics integration methods to study the interaction between gene expression and microRNA signaling in liver cancer.