Chatomics! — The Bioinformatics Newsletter

My 10-year-old analysis is still understandable


Hello Bioinformatics lovers,

Tommy here.

Ten years ago I wrote down how I processed scRRBS data (https://gitlab.com/tangming2005/scRRBS). I still use those notes today.

Eight years ago I documented an enhancer-promoter interaction analysis. Still readable. Still makes sense.

Meanwhile, I have folders from last quarter named results_final_final2_revised and I have no idea what’s in them.

That’s the gap. It’s the difference between a pipeline you can rerun or a dataset you can reuse, and a pile of files you have to redo from scratch.

You will not remember

Not next week. Not next month. Definitely not in six months when a reviewer asks you to rerun the analysis with one more sample.

Bioinformatics is messy. You run fifteen commands. One works. You move on. Six months later, your shell history is gone, your tmux session is dead, and the working command is somewhere in a Slack DM you can’t find.

Documentation is not a tax on your time. It is your time, refunded later.

The habit, in three lines

  • One README per project. One README per data folder.
  • Write the working command right after it works. Not later. Later is never.
  • Note where the data came from, when, and how you got it.

That’s it. You don’t need essays. You need enough that a confused future-you can follow along — because they will be confused.
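To make the second and third habits cheap, a tiny helper can append a dated entry to the folder's README the moment a command works. This is only a minimal sketch, not a tool from this newsletter: the function name `log_entry`, the `README.md` file name, and the entry layout are all assumptions made for illustration.

```python
from datetime import date
from pathlib import Path


def log_entry(folder, command, note="", source="", when=None):
    """Append a dated entry for a working command to <folder>/README.md.

    Hypothetical helper: the name, arguments, and entry layout are
    illustrative assumptions, not an established convention.
    """
    when = when or date.today().isoformat()
    readme = Path(folder) / "README.md"

    lines = [f"## {when}", ""]
    if note:
        lines += [note, ""]
    if source:
        # Where the data came from and how it was obtained.
        lines += [f"Data source: {source}", ""]
    # Indent the command by four spaces so Markdown renders it as code.
    lines += [f"    {command}", ""]

    # Create the folder if needed, then append so older notes are kept.
    readme.parent.mkdir(parents=True, exist_ok=True)
    with readme.open("a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return readme
```

A call might look like `log_entry("data/raw", "fastqc sample_R1.fastq.gz", note="QC on the first batch", source="downloaded from the sequencing core")`, where the folder, file name, and notes are placeholders. Appending rather than overwriting keeps the README reading like a notebook, oldest entry first.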

Where to put the notes

Comments in code. Markdown for the project-level stuff. Quarto or Jupyter when the analysis and the narrative belong together. Anywhere except your shell history.

The README is your lab notebook. Treat it that way.
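If the README is the lab notebook, it helps to notice when a folder has none. The sketch below lists the project root and its immediate subfolders that lack a README. The function name `folders_missing_readme`, the `README.md` file name, and the choice to check only one level deep are assumptions; adjust them to your own project layout.

```python
from pathlib import Path


def folders_missing_readme(project_root, readme_name="README.md"):
    """Return the project root and its immediate subfolders that lack a README.

    Hypothetical helper: checks one level deep and skips hidden folders
    such as .git. Both choices are illustrative assumptions.
    """
    root = Path(project_root)
    candidates = [root] + sorted(
        p for p in root.iterdir() if p.is_dir() and not p.name.startswith(".")
    )
    return [p for p in candidates if not (p / readme_name).is_file()]
```

Running it from time to time, for example `folders_missing_readme(".")` from the project root, gives a short to-do list of folders that future-you will not be able to interpret.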

Why this compounds

Reproducibility, debugging, collaboration — those are the obvious wins. The less obvious one: when AI tools like Claude Code can read your README, they pick up the project context in seconds. Bad documentation hobbles the AI the same way it hobbles you.

The notes you write today are the foundation of every pipeline you build next year. Skip them and you rebuild from zero every time.


What’s the oldest piece of your own documentation you still use? Hit reply and tell me — I’m curious how far back yours goes.

By the way, I revamped my slides on reproducible computing and gave a keynote at the Nextflow Summit 2026 in Boston. I added some more recent experience with AI-assisted coding. Check it out!

Happy Learning!

Tommy aka crazyhottommy

PS:

If you want to learn Bioinformatics, there are four ways that I can help:

  1. My free Chatomics YouTube channel. Make sure you subscribe to it.
  2. I have many resources collected on my GitHub here.
  3. I have been writing blog posts for over 10 years: https://divingintogeneticsandgenomics.com/
  4. Lastly, I have daily posts on Bioinformatics and AI on LinkedIn: https://www.linkedin.com/in/%F0%9F%8E%AF-ming-tommy-tang-40650014/recent-activity/all/

Stay awesome!


Why Subscribe?

  ✅ Curated by Tommy Tang, a Director of Bioinformatics with 100K+ followers across LinkedIn, X, and YouTube
  ✅ No fluff, just deep insights and working code examples
  ✅ Trusted by grad students, postdocs, and biotech professionals
  ✅ 100% free
