Anyone who teaches biology will eventually talk about genomics, transcriptomics, metagenomics and other “omics”. However, omics biology can no longer be taught without understanding how the data revolution is impacting the field. Even to speak about the Human Genome Project, we have to acknowledge the role that computational technologies and data analysis methods played in its completion. Sometimes, to simplify instruction, it can be easier to teach biology by splitting the subject into two separate parts: the data and the wet-lab components.
The data side of biology, also known as bioinformatics, is gaining traction for several reasons: it is demonstrating its impact, technology for generating and analyzing data has become accessible, and more people are interested in topics like data science and machine learning. At the same time, advances in biotechnology, pharma and clinical diagnostics are showing the transformational potential of wet-lab techniques like gene editing, immunotherapy and molecular diagnostics.
So how can bioinformatics be better integrated with a traditional biology or clinical curriculum? Who can teach bioinformatics, and how should bioinformatics resources be evaluated for teaching purposes?
One of the challenges in integrating bioinformatics into biology education is to design courses that are not too technical, yet still allow students to explore these powerful data-driven techniques as they learn about biology.
Here are some challenges, anecdotal examples and goals we have identified:
The first hurdle is the list of prerequisites for someone to try bioinformatics. For a biologist, or anyone without a technical background, to become interested in bioinformatics and try it as part of their biology curriculum, we have to eliminate or significantly reduce prerequisites like prior knowledge of coding and advanced statistics or mathematics skills. In this way, the audience eligible to participate in the class is broadly expanded, and those intimidated by statistics and complex datasets can have a positive experience and possibly stay motivated.
Proficiency in coding:
The majority of bioinformatics resources require some level of coding, often quite a lot of it (bash scripting or R, for example). When the coding aspect of the field is emphasized over the biological and contextual understanding of how or why a disease manifests itself, progresses or is cured, students lose some of their potential to add value to the advancement of science and medicine. It is not that coding requirements are too difficult to teach or are not useful in themselves, but instruction should be broader and delivered with the understanding that algorithms are tools for solving biological problems by expertly leveraging huge datasets and mathematical models.
Mathematics and statistics:
A deep mathematical understanding of bioinformatics techniques is not required to use these methods, in the same way that GPS can be used for navigation without understanding the satellite-based technology behind it. The methods can still be used effectively to solve real-world challenges. Likewise, one does not need to know how every dimensionality reduction technique works to use one of them effectively in a project.
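To make this concrete, here is a minimal sketch of one dimensionality reduction technique, principal component analysis, applied with a few lines of NumPy. The function name `pca_project` and the toy expression matrix are made up for illustration and do not come from any particular course or platform; the point is only that a student can run the method and read the result without deriving the linear algebra first.

```python
import numpy as np

def pca_project(matrix, n_components=2):
    """Project samples (rows) onto their top principal components."""
    # Center each feature (column) so the components describe variation
    # around the mean rather than the mean itself.
    centered = matrix - matrix.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal axes,
    # ordered from most to least variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Hypothetical toy data: 6 samples x 4 genes, drawn from two groups
# with clearly different average expression levels.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(3, 4))
group_b = rng.normal(loc=5.0, scale=0.5, size=(3, 4))
expression = np.vstack([group_a, group_b])

coords = pca_project(expression)
print(coords.shape)  # (6, 2): each sample is now described by two numbers
```

In this toy example the first coordinate separates the two groups of samples, which is the kind of "big picture" observation a student can interpret biologically before learning how the decomposition works.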
Time and momentum:
Time is of the essence, especially at the beginning – if the students are intimidated by coding or get stuck learning, they will need extensive troubleshooting and repeated review of the basics. As a result, students may become turned off by bioinformatics. Plus, students will be unable to learn how to think about the “big picture” through the application of computational science to solving a biological question.
Another limitation is the need for effective computational infrastructure. This is more than just CPU cores or large amounts of RAM to run the analysis: infrastructure also includes the installation of libraries and software packages to make the algorithms function. The required datasets are huge to download, and extracting them, transforming the data into tables and rendering it all for visualization is an even more demanding and daunting task. That’s why even experienced bioinformaticians post their requests to “Bioinformatics Santa”.
And with so many methods regularly posted and updated in the scientific literature, it might seem like most analysis problems have been solved and all you need to do is run the code, but in reality it might not be so straightforward. Small technical issues can slow you down significantly, endangering the whole process. Often, whole classes are spent debugging simple mistakes.
Finally, teaching requires covering some basics as a foundation for the instruction that will follow. And it often happens that the longer it takes to go through the basics, the less likely you are to finish a project from beginning to end and understand something relevant to what you might be motivated to study. That’s why so many graduates can only put a single publication or poster on their resume. And even when they do, their name might be one of ten authors. For an undergraduate student, the chance to work through a complete project is rarer still.
Preparing projects that are aligned with other topics students are learning can significantly increase their interest and motivation. It can also help students transition from traditional learning to participation in research. CUREs are NSF-funded course-based undergraduate research experiences, essentially projects that help students learn and apply skills in a research project (https://serc.carleton.edu/curenet/index.html). The same model can be expanded and incorporated into bioinformatics classes for all levels of students.
So what is the solution? Our team has been working on addressing these challenges and developing a comprehensive solution. Pine Biotech is committed to improving bioinformatics education and providing high-quality bioinformatics teaching resources that address these challenges.
We have prepared an online platform that allows a seamless transition between practical hands-on exercises, online content delivery and a coding-free analysis environment. The T-BioInfo platform can be leveraged in an existing course or as an independent curriculum. Moreover, powerful data capture and analytics allow the platform to scale to thousands of students, an unprecedented achievement for hands-on STEM education.
The curriculum is project-based, offering projects in oncology, neuroscience, agriculture and biotech. Each project is offered as a tutorial that provides a public-domain dataset along with the background and significance of the challenge it demonstrates. Methods are explained, and some guidance is offered for designing an independent research plan. Then, anyone can develop an independent take on the project and personalize their approach by trying different methods or combining the dataset with other datasets available on NCBI.
Learn more at edu.t-bio.info