I am a bioinformatician/computational biologist with six years of wet lab experience and over 12 years of computation experience. I will help you to learn computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter!
Dear Bioinformatics lovers,

Looking at bioinformaticians’ profiles today, it seems like everyone is immersed in cutting-edge single-cell analyses and AI-driven bioinformatics. It’s inspiring to see how far the field has come, but amidst all the buzz, something crucial is often overlooked: the fundamentals of bioinformatics.

Why Fundamentals Still Matter

Basic bioinformatics skills, like exploratory data analysis (EDA) and data sanity checks, seem to be losing the spotlight. Everyone’s talking about deep learning pipelines and generative AI models, but these advanced techniques often depend on a solid foundation in the basics.

Bioinformatics isn’t just about fancy models. It’s about understanding messy, real-world data: quality control, normalization, and ensuring the data makes sense before diving into machine learning.

The pioneers of this field didn’t start with deep learning on single-cell data or transformer models for genomics. They began by asking simple, meaningful questions, doing EDA, and spotting biological patterns. This approach has led to some of the most impactful breakthroughs.

Unfortunately, it’s becoming harder to find bioinformaticians who take the time to question their data and ask, “Does this make sense biologically?” Too often, there’s a rush to fit the latest model without fully understanding the data.

Sure, learning machine learning is valuable (I’m exploring the fast.ai course myself), but the real power in bioinformatics lies in knowing when to trust your data and when to question it.

My advice? Before chasing the latest trend, master the basics. Learn how to clean, explore, and deeply understand data. These skills will take you further than any buzzword ever will.

Connecting Biology to Bioinformatics

Building on this foundation, understanding the biology behind your data is equally essential. Without it, even the most rigorous analyses can lead to flawed conclusions. One critical example is the discrepancy between RNA and protein levels.
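As a rough illustration of the kind of sanity check this implies, here is a minimal sketch that computes a per-gene Spearman correlation between RNA and protein measurements across the same samples and flags genes where the two disagree. Everything in it is an assumption made for illustration: the gene names `GENE_A` and `GENE_B`, the toy numbers, the helper names, and the 0.3 cutoff are hypothetical and do not come from any real dataset.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def rna_protein_correlation(rna, protein):
    """Spearman rho per gene, for genes present in both tables.

    Both inputs map gene -> list of values, with samples in the same order.
    """
    shared = sorted(rna.keys() & protein.keys())
    return {
        gene: pearson(average_ranks(rna[gene]), average_ranks(protein[gene]))
        for gene in shared
    }


# Toy, made-up values for five matched samples (hypothetical genes).
rna = {"GENE_A": [1, 2, 3, 4, 5], "GENE_B": [1, 2, 3, 4, 5]}
protein = {"GENE_A": [10, 20, 30, 40, 50], "GENE_B": [5, 4, 3, 2, 1]}

rho = rna_protein_correlation(rna, protein)

# Arbitrary illustrative cutoff; genes with NaN rho are not flagged here.
threshold = 0.3
discordant = {gene: r for gene, r in rho.items() if r < threshold}
print(discordant)
```

On real data you would more likely reach for `scipy.stats.spearmanr` or pandas, and a low correlation is only a prompt to look closer at the biology, not a conclusion in itself.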
Relying on RNA-seq data alone might not tell the whole story. Proteins are regulated at multiple levels, and this can lead to significant differences in what RNA and protein levels reveal about a system. Take CTLA-4 in T cells:
Or consider IFNAR1 after interferon-alpha stimulation:
Another classic example is HIF-1α under hypoxia (my first PhD paper is on how CTCF blocks HIF1 enhancers :):
Multi-Omics Is the Key

To address these gaps, integrating RNA and protein data (multi-omics) offers a more complete picture:
In single-cell studies, RNA and protein often tell complementary stories. For example:
If you rely solely on RNA data, you’ll miss crucial phenotypic details. Even when a protein is abundant, its activity may still be regulated by post-translational modification (e.g., phosphorylation).

Takeaways
By focusing on these principles, you’ll not only become a better bioinformatician but also a more impactful scientist.

Let me leave you with a question: What are your favorite examples of RNA-protein mismatches? I’d love to hear your thoughts!

Happy Learning!

Tommy, aka Crazyhottommy

PS: my other LinkedIn posts from the past week that you may find helpful:
PPS: If you want to learn Bioinformatics, there are other ways that I can help:
Stay awesome!