Why most single-cell annotation benchmarks are missing the point

Published 2 months ago • 1 min read

Hello Bioinformatics lovers,

Tommy here. I made a 40-minute video to show you how to do RNAseq analysis end-to-end.

Watch it here!

I was recently interviewed by Pure Storage: Data Cleaning ‘Janitorial Work’ is Key to Unlocking Life Sciences Breakthroughs

Today, we will talk about single-cell cell type annotation.

[Image: Granular cell type annotation]

Your model might be accurate — but is it biologically meaningful?

Everyone’s benchmarking single-cell annotation models these days.

You train on a million cells. You annotate a new dataset.

Everything looks great.

But here’s the uncomfortable truth:

Your prediction is only as good as your reference.

What’s the ground truth, really?

We don’t even have a universal definition of a “cell type.”

It’s a human-imposed label, a convenient shorthand.

In reality, many cells exist in a continuum — not discrete boxes.

Take CD8 T cells, for example. They behave differently in healthy tissues than in tumors.

You’ll find states like:

Progenitor exhausted
Central memory
Effector memory
Naive
Terminally exhausted

Each state has unique transcriptional signatures — and biological implications.

Here’s the issue:

If your model is trained on millions of cells without state-level annotation,

it can’t predict these nuanced states in a new dataset.
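To make this concrete, here is a minimal sketch, not any published tool's method. It uses a toy nearest-centroid classifier and made-up two-number "marker scores" per cell (both are assumptions for illustration only). The same reference cells are labeled twice: once with a single broad label, once with state labels. The classifier can only ever return labels that exist in its reference.

```python
from collections import defaultdict
from math import dist


def fit_centroids(cells, labels):
    """Average the vectors of each label to get one centroid per label."""
    grouped = defaultdict(list)
    for vec, lab in zip(cells, labels):
        grouped[lab].append(vec)
    return {
        lab: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for lab, vecs in grouped.items()
    }


def predict(centroids, cell):
    """Return the label whose centroid is closest to the query cell."""
    return min(centroids, key=lambda lab: dist(centroids[lab], cell))


# Hypothetical toy data: two invented scores per cell, not real expression values.
reference_cells = [
    (0.9, 0.1), (0.8, 0.2),   # labeled "Naive" below
    (0.7, 0.8), (0.6, 0.9),   # labeled "Progenitor exhausted" below
    (0.1, 0.9), (0.2, 0.8),   # labeled "Terminally exhausted" below
]
state_labels = [
    "Naive", "Naive",
    "Progenitor exhausted", "Progenitor exhausted",
    "Terminally exhausted", "Terminally exhausted",
]
coarse_labels = ["CD8"] * len(reference_cells)

coarse_model = fit_centroids(reference_cells, coarse_labels)
state_model = fit_centroids(reference_cells, state_labels)

query_cell = (0.65, 0.85)
coarse_call = predict(coarse_model, query_cell)  # can only ever be "CD8"
state_call = predict(state_model, query_cell)    # one of the three states

print(coarse_call, "|", state_call)
```

Same cells, same algorithm. The only difference is the granularity of the reference labels, and that alone decides what the model can tell you.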

And those states are exactly what matter in biology.

They drive immune responses, therapy outcomes, and disease progression.

📖 Example:

Antitumor progenitor exhausted CD8+ T cells are sustained by TCR engagement

So what’s the point of a model that just says “CD4” or “CD8”?

If your predictions stop at the broad categories, you’re missing the biology that truly matters.
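Here is a small sketch of why a benchmark score at the broad level can hide this. The labels and predictions are invented for illustration, and the `STATE_TO_TYPE` mapping is a hypothetical helper, not taken from any real benchmark. The same predictions are scored twice, once after collapsing states to the broad type and once at the state level.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of cells whose predicted label matches the true label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# Hypothetical mapping from CD8 T cell state to broad cell type.
STATE_TO_TYPE = {
    "Naive": "CD8",
    "Central memory": "CD8",
    "Effector memory": "CD8",
    "Progenitor exhausted": "CD8",
    "Terminally exhausted": "CD8",
}

# Invented example labels for five cells.
true_states = [
    "Naive", "Central memory", "Effector memory",
    "Progenitor exhausted", "Terminally exhausted",
]
predicted_states = [
    "Naive", "Naive", "Effector memory",
    "Terminally exhausted", "Terminally exhausted",
]

type_accuracy = accuracy(
    [STATE_TO_TYPE[s] for s in true_states],
    [STATE_TO_TYPE[s] for s in predicted_states],
)
state_accuracy = accuracy(true_states, predicted_states)

print(f"broad type accuracy: {type_accuracy:.2f}")
print(f"state accuracy:      {state_accuracy:.2f}")
```

In this toy case the broad-level score is perfect while two of the five state calls are wrong, including a progenitor exhausted cell called terminally exhausted.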

Instead, we should be building models that understand states, not just types.

I highly recommend exploring ProjecTILs — a tool that does this elegantly.

The takeaway

We build models to understand biology — not to show off scale.

If your model doesn’t bring new biological insight, it doesn’t matter how many cells it was trained on.

Deep domain knowledge isn’t optional.

It’s what separates real understanding from mere computation.

Happy Learning!

Tommy aka crazyhottommy

PS:

If you want to learn Bioinformatics, there are other ways that I can help:

My free YouTube channel, Chatomics. Make sure you subscribe to it.
I have many resources collected on my GitHub here.
I have been writing blog posts for over 10 years at https://divingintogeneticsandgenomics.com/

Stay awesome!
