Why Subscribe?
✅ Curated by Tommy Tang, a Director of Bioinformatics with 100K+ followers across LinkedIn, X, and YouTube
✅ No fluff, just deep insights and working code examples
✅ Trusted by grad students, postdocs, and biotech professionals
✅ 100% free
Hello Bioinformatics lovers,

Tommy here. I was interviewed by Nature yesterday about best practices for using spreadsheets and naming files.

P.S.: This is a must-read for any scientist (both wet and dry): https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989

Naming turns out to be a big problem for bioinformaticians. How many hours do bioinformaticians lose matching sample IDs across assays? Let’s talk about why this keeps happening and how we can stop it.

The Chaos Begins with a Name

The same cell line, labeled differently:
And that’s within the same team.

Then Come the Timepoints

Add time, treatment, or dose… and now you’ve got:
And this is just one dataset.

The Multi-Omics Nightmare

Team A runs RNA-seq. Then someone says:

Sound Familiar?

You spend 3 hours cleaning it. No one noticed.

This Isn’t Just a Small Lab Problem

Even large-scale efforts like TCGA, pharma pipelines, and multi-million dollar projects suffer from mislabeling and duplication. Why?

The Fix Is Boring, but Powerful
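To make the naming problem concrete, here is a minimal sketch of collapsing label variants into one canonical ID. The variant spellings of MCF-7 below are made up for illustration, and `normalize_label` is a hypothetical helper, not something from a specific library. It only handles case, spacing, and punctuation differences; real sample sheets usually need a curated lookup table as well.

```python
import re


def normalize_label(label: str) -> str:
    """Drop everything except letters and digits, then uppercase the result."""
    return re.sub(r"[^A-Za-z0-9]", "", label).upper()


# Hypothetical spellings of the same cell line, invented for this example.
variants = ["MCF-7", "MCF7", "mcf 7", "Mcf_7"]

# A set keeps only the distinct canonical forms.
canonical = {normalize_label(v) for v in variants}
print(canonical)
```

If the set ends up with a single entry, the variants all refer to the same canonical name. More than one entry is a hint that something needs a human look before you join tables on that column.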
This isn’t rocket science.

In a Small Company?

You can fix this in a day.

In a Large Company?

It’s harder. But fixing it will save millions per year.

Final Thought:

Bioinformatics isn’t janitorial work. That’s not where our talent belongs.

Takeaways:
If you're spending time cleaning up naming errors, it's not your fault.

Start with one rule:

Other posts that you may find useful
PPS: If you want to learn Bioinformatics, there are four ways that I can help:
Stay awesome!