One of the most common things we hear from biologists is: “I want to analyze my data, but I don’t know where to begin.”
Bioinformatics covers so many different fields of analysis that it can be daunting to figure out where to begin, especially given how complex the analysis becomes once you focus on the mathematics, statistics, and biological knowledge that go into each tool people talk about. With that in mind, we’ve compiled a list of some of the best resources we’ve come across to help you get started on your journey to learn bioinformatics!
Remember that everyone learns at their own pace, and some of these resources might be more helpful than others depending on your learning style, goals, and level of expertise, so be patient and try multiple approaches!
A not-for-profit organization with a mission to share bioinformatics knowledge from experts, the Canadian Bioinformatics Workshop offers courses in R, Metabolomics, RNA-Seq Analysis, Network Analysis, Epigenomics, Bioinformatics for Cancer Genomics, and many more. You can find lecture slides, recordings, and even lab practicals spanning multiple years, and gain both theoretical and practical knowledge in each of these areas. The languages involved depend on the course, but most use Bash scripting and R.
Single Cell Sequencing has become one of the largest fields in bioinformatics (and computational biology). While the field is still new and still settling on its ‘gold standard’ analysis, the Satija Lab has been leading the way with its analysis toolkit, Seurat. Aside from a great library of tools, the Satija Lab provides detailed, step-by-step guides (called Vignettes) on how to analyze Single Cell data in R.
Led by Malachi and Obi Griffith, the Griffith Lab seeks to train students and post-docs with a biology background in bioinformatics, focusing on RNA-Sequencing Analysis. They contributed tremendously to developing the RNA-Sequencing courses for the Canadian Bioinformatics Workshop, and they also maintain their own GitHub repository with more detailed lectures and lab practicals. They even help users learn how to run their analysis in the cloud. Their teaching focuses on Bash scripting and R.
During the course of your work, there inevitably comes a time when you encounter a problem that you just have to look up online or ask someone about. Since bioinformatics experts are scattered all over the world, Biostars offers a platform where bioinformaticians can post and answer questions. Whether you need help debugging a tool, want a tutorial on plotting data with a new tool, or want to learn an entire analysis workflow, Biostars should be your first stop for learning from the community at large.
Rosalind is a platform for learning bioinformatics through problem solving. It caters to students who want to learn by doing, applying their theoretical knowledge to practical problems, with a focus on Python and bioinformatics.
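To give a feel for this style of learning, here is a minimal sketch in the spirit of Rosalind's introductory exercises: counting the bases in a DNA string and computing its reverse complement. The function names and the sample sequence are our own illustrative choices, not taken from the platform, so treat this as one possible approach rather than a reference solution.

```python
from collections import Counter


def count_nucleotides(dna: str) -> dict[str, int]:
    """Count how many times each of A, C, G and T occurs in a DNA string."""
    counts = Counter(dna.upper())
    # Report the four bases in a fixed order, using 0 for any that are absent.
    return {base: counts[base] for base in "ACGT"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    complement = str.maketrans("ACGT", "TGCA")
    return dna.upper().translate(complement)[::-1]


if __name__ == "__main__":
    sample = "AGCTTTTCATTCTGAC"  # hypothetical example sequence
    print(count_nucleotides(sample))
    print(reverse_complement(sample))
```

Small problems like these are useful because each one pairs a single biological idea with a single programming idea, so you can check your understanding of both at once.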
As machine learning plays a growing role in bioinformatics, Kaggle provides a great playground for individuals who want to learn by solving machine learning puzzles. Kaggle focuses on Python, and while it doesn’t directly teach bioinformatics, it is a fantastic resource for learning the machine learning side of informatics.
Continuing with the “learn by doing” approach, UseGalaxy.org is a great quick playground where you can take existing public data and run it through pre-set tools and pipelines. The main limitation is the restricted capacity available for running your own workflows. Under the “Shared Data” tab you’ll find workflows and visualizations that people have shared with the community, which can help you start your own analysis pipelines. While it does not teach a specific language, Galaxy helps you get acquainted with how tools are strung together into workflows and pipelines.
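If you are curious what “stringing tools together” looks like outside a graphical interface, the idea can be sketched in a few lines of Python. This is a toy illustration only: it does not use Galaxy or its API, and the step names (`to_uppercase`, `drop_short_reads`, `drop_ambiguous_reads`) and the example reads are hypothetical stand-ins for real tools and data.

```python
from typing import Callable

# A "step" takes a list of reads and returns a new list of reads.
Step = Callable[[list[str]], list[str]]


def to_uppercase(reads: list[str]) -> list[str]:
    """Normalize every read to uppercase."""
    return [read.upper() for read in reads]


def drop_short_reads(min_length: int) -> Step:
    """Build a step that keeps only reads of at least min_length bases."""
    def step(reads: list[str]) -> list[str]:
        return [read for read in reads if len(read) >= min_length]
    return step


def drop_ambiguous_reads(reads: list[str]) -> list[str]:
    """Discard reads containing the ambiguous base N."""
    return [read for read in reads if "N" not in read]


def run_workflow(reads: list[str], steps: list[Step]) -> list[str]:
    """Feed the output of each step into the next, in order."""
    for step in steps:
        reads = step(reads)
    return reads


if __name__ == "__main__":
    raw_reads = ["ACGT", "ACGTNACGT", "acgtacgt", "AC"]  # made-up reads
    workflow = [to_uppercase, drop_short_reads(4), drop_ambiguous_reads]
    print(run_workflow(raw_reads, workflow))
```

Each step only needs to agree on its input and output format, which is the same principle that lets a workflow system chain otherwise unrelated tools.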
While practical application of knowledge is enough to run your analysis, truly understanding why your data appears a certain way requires learning the theory behind the application. Coursera offers plenty of courses at varying levels of difficulty, ranging from general bioinformatics courses that cover basic computational biology to courses on plotting techniques or on sequencing itself. While users can earn a certificate for a course they complete, auditing the course is completely free.
For a more mathematical perspective on bioinformatics theory, MIT offers full courses, completely free, on a variety of computational biology subjects. You’ll be able to find video lectures, notes, assignments, and even projects that you can undertake to test your knowledge. This resource is great for people who want a more traditional, university-lecture style of learning.
The HBC Core is a bioinformatics hub for the Harvard community. It provides help with Next-Gen Sequencing Analysis and Functional Analysis, contributes to Bioinformatics Infrastructure, and offers multiple resources to help you learn bioinformatics step by step. Their training modules even list prerequisites to help users understand the level of knowledge they’re expected to bring to each course.
While there are many more resources that can help you learn bioinformatics, always remember that “learning bioinformatics” is like “learning math”. It’s a never-ending journey that takes time to truly understand and digest, and the key is to be patient with yourself as you work toward your learning goals. People have entire careers based on their bioinformatics expertise, and while they may be scattered around the world, many of them are available to teach others and share their knowledge through these resources. As cliché as it might be, the saying “a journey of a thousand miles begins with a single step” is never more true than it is here. So make sure to try every one of these resources and become the subject expert that you absolutely can be!