How I Would Learn Bioinformatics From Scratch 12 Years Later: A Roadmap

Published 5 months ago • 4 min read

Hello Bioinformatics lovers,

First of all, thanks to all the subscribers and I hope you get value from this newsletter.

I want to shout out to Krittiyabhorn (Namthip), a PhD student in Thailand. Her journey should be an inspiration for all of you.

read her post here. And make sure you check out her blog posts too. They are all well-written and attending to details.

The best way to learn is to "just play around".

Unfortunately, many of the students just do not get started.

How I Would Learn Bioinformatics From Scratch 12 Years Later: A Roadmap

I wrote a blog post: My opinionated selection of books/urls for bioinformatics/data science curriculum six years ago, and many links are broken. so I decided to write a new one.

You Can Change Your Appetites

Linear algebra, statistics, machine learning—these used to feel abstract to me.

I had zero experience with bioinformatics when I was studying for my PhD in a wet lab.

I memorized formulas without truly understanding them.

But over time, I found the right resources that made these concepts click, especially in the context of bioinformatics.

If I were starting my bioinformatics/computational journey again 12 years ago,

here are the FREE resources I would recommend.

1. Master the Linux Command Line

Knowing how to work in a Unix environment is a must for any bioinformatician. Start with these:

Software Carpentry UNIX Shell Playlist – My first introduction to Unix. It’s old but still solid.
How Linux Works (Book) – A great deep dive into how Linux operates.
Data Science at the Command Line (Book) – Fun read with lots of useful tricks.
Practice in a Bioinformatics Sandbox – Great for hands-on command-line practice.

2. Learn R for Genomics

R is an essential tool for bioinformatics, especially for data wrangling and visualization.

R for Data Science – Learn tidyverse, master data manipulation, and reshape data efficiently.
Data Analysis for the Life Sciences with R – I took this course 3 times and learned a ton!
Computational Genomics with R – Practical applications of genomics with R.
Bioinformatics Data Skills – A must-have for anyone serious about bioinformatics. It covers unix, R and python.

3. Build a Strong Statistical Foundation

Understanding statistics is critical. These books and videos will help:

Discovering Statistics with R – A fantastic and engaging stats book (2nd edition coming soon).
Modern Statistics for Modern Biology – A great resource for statistical applications in biology.
StatQuest by Josh Starmer – I’ve watched the PCA video at least 20 times.

4. Linear Algebra: Make It Click

I never understood eigenvectors and eigenvalues—until I found these:

3Blue1Brown Linear Algebra Videos – The best visual explanations of matrix transformations.
MIT 18.06: Linear Algebra – A formal, in-depth introduction to linear algebra.

Why it is important to learn linear algebra? Most of the genomics data are just matrices:

an RNAseq expression matrix is a gene-by-sample matrix, with entries to be read counts for each gene
a single-cell expression matrix is a gene-by-cell matrix, with entries to be read counts for each gene
a ChIP-seq count matrix is a peak-by-sample matrix, with entries to be the number of reads in each peak
a drug response matrix is a drug-by-sample matrix, with entries to be IC50 for example

and many more… in other words,

Matrix is EVERYWHERE for bioinformatics (and many other data science topics)!

Many of the bioinformatics problems can be rephrased as matrix manipulation.

Understand what does matrix multiplication mean deeply;

Why matrix factorization is useful for genomics (see my post).

Matrix calculation is also the foundation of deep learning!

5. Get Comfortable with Machine Learning

Statistics and machine learning go hand in hand:

StatQuest ML Videos – Clear explanations of PCA, regression, decision trees, SVM and more.
An Introduction to Statistical Learning – One of the best ML books, now with R and Python editions.

6. Python for Bioinformatics

I’m primarily an R user, but I use Python for workflow automation. If I had to start again:

Python for Biologists – A great beginner-friendly book.
Biology Meets Programming: Bioinformatics for Beginners – I took this Coursera course and found it useful.

Just Start!

Pick any resource that fits your learning stage and dive in.

Waiting won’t change anything. Taking action will.

Those who start and experiment win in bioinformatics.

of course, subscribe to my youtube channel chatomics to learn bioinformatics too! https://www.youtube.com/@chatomics

If you found this newsletter helpful, share it with others who might benefit.

Happy learning!

Tommy aka crazyhottommy

Chatomics! — The Bioinformatics Newsletter

How I Would Learn Bioinformatics From Scratch 12 Years Later: A Roadmap

How I Would Learn Bioinformatics From Scratch 12 Years Later: A Roadmap

You Can Change Your Appetites

1. Master the Linux Command Line

2. Learn R for Genomics

3. Build a Strong Statistical Foundation

4. Linear Algebra: Make It Click

5. Get Comfortable with Machine Learning

6. Python for Bioinformatics

Just Start!

Other posts that you may find helpful from last week.