Chatomics! — The Bioinformatics Newsletter

AI in pharma: the wrong half is winning


Hello Bioinformatics lovers,

Three years ago I wrote that AI hadn't revolutionized drug development.

The 2026 update: still hasn't. But the part that's working isn't the part anyone writes articles about.

Here's what I'd revise.

Two buckets to categorize AI:

  1. AI touching Process — trial ops, statistical programming for FDA submissions, pharmacovigilance. Text and structured data.
  2. AI touching Biology — target ID, lead generation, toxicity prediction. Noisy, sparse, context-dependent.

Process AI is delivering value today. Biology AI is mostly still crawling.

How my 2023 calls aged

AlphaFold: Right call. I still cannot point to a clinical success that happened because of AlphaFold.

BenevolentAI: Worse than predicted. Another 30% layoff, US site closed, planning to delist.

Verge Genomics: VRG50635 Phase 1b in ALS completed enrollment in August 2024. Reports suggest the efficacy signal didn't land.

Recursion: Merged with Exscientia. They've also abandoned cell painting for plain brightfield — deep learning works just as well on unstained images.

Where biology AI updated my prior

De novo antibody design. The Baker lab's RFdiffusion (Nature 2025) showed atomically accurate antibody design, confirmed by cryo-EM. Absci dosed the first humans with ABS-101 in May 2025 — first AI-de-novo-designed antibody in the clinic.

One real Phase 2 data point. Insilico's rentosertib for IPF showed a 98 mL FVC gain at high dose vs a 62 mL decline on placebo (Nature Medicine, June 2025). The first clinical proof of concept for an end-to-end AI-discovered drug.

One real AI target ID story. GV20's GV20-0251 — AI-designed antibody against an AI-discovered target (IGSF8) — showed partial responses in two anti-PD1-resistant melanoma patients at ASCO 2025.

Where biology AI keeps stalling

A Nature Methods 2025 benchmark tested five foundation models on perturbation prediction. Linear baselines matched them. Anshul Kundaje at Stanford has been blunt: current "virtual cell" models are doing fancy nearest-neighbor lookup against prior knowledge networks built by biologists.
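
To make "linear baseline" concrete: one simple baseline of this kind is an additive model, which predicts the effect of a double perturbation as the sum of the two single-perturbation effects on a log scale. Here is a minimal sketch. The function name and the toy numbers are invented for illustration, and this is not the benchmark's actual code or data.

```python
def additive_baseline(control, single_a, single_b):
    """Predict log-expression under perturbation A+B.

    Adds each single perturbation's log fold change (relative to control)
    on top of the control profile. No training, no parameters.
    """
    return [
        c + (a - c) + (b - c)
        for c, a, b in zip(control, single_a, single_b)
    ]


# Toy log-expression values for four genes (made up for illustration).
control = [5.0, 3.0, 8.0, 1.0]
single_a = [6.0, 3.0, 7.5, 1.0]   # perturbation A alone
single_b = [5.0, 2.0, 8.5, 1.0]   # perturbation B alone

predicted_ab = additive_baseline(control, single_a, single_b)
print(predicted_ab)
```

If a deep model cannot beat something this simple on held-out double perturbations, it has not learned much about how the perturbations interact.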

The data problem hasn't gone away. My friend Jinfeng once analyzed all of GEO and found up to 25% sample mislabeling.

The quiet wins

  • Statistical programming for FDA submissions. Hill Research claims 2,000x faster TLF generation, used by two of the top three pharma companies.
  • Pharmacovigilance. Insilicom's IPV cuts literature review workload at least 10x.
  • Trial coordination — flagging data entry errors, typos, inconsistent date formats. Unsexy. Compounds.
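
To give a feel for the trial-coordination checks, here is a purely rule-based sketch (no AI involved) that flags dates not written in one expected format. The field names, the ISO 8601 convention, and the sample records are all hypothetical.

```python
from datetime import datetime

EXPECTED_FORMAT = "%Y-%m-%d"  # assumed site convention (ISO 8601)


def flag_bad_dates(records, field="visit_date"):
    """Return (subject_id, raw_value) pairs whose date does not parse
    in the expected format, including impossible calendar dates."""
    flagged = []
    for rec in records:
        raw = rec[field]
        try:
            datetime.strptime(raw, EXPECTED_FORMAT)
        except ValueError:
            flagged.append((rec["subject_id"], raw))
    return flagged


# Hypothetical case report form entries.
records = [
    {"subject_id": "S001", "visit_date": "2025-03-14"},
    {"subject_id": "S002", "visit_date": "14/03/2025"},  # wrong format
    {"subject_id": "S003", "visit_date": "2025-02-30"},  # impossible date
]

flagged = flag_bad_dates(records)
print(flagged)
```

Checks like this are the boring floor. The value of AI here is catching the messier cases that fixed rules miss.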

None of this gets a press release. All of it is saving real money today.

My Claude Code workflow caveat

Three failure modes I've hit recently:

  • Hallucinating from outdated training data. A colleague asked about RIPTACs — confident nonsense.
  • Composite errors that look right. A TCGA + GTEx target list showed sky-high tumor expression and zero in normal, because the gene ID versions didn't match and the model filled in zeros.
  • Z-score axis mistakes. The model row-scaled a ChIP-seq heatmap when the right answer was column scaling. Biology decides which axis.
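
The gene ID failure is easy to reproduce. Ensembl gene IDs can carry a version suffix (the part after the dot), and a join on the raw strings silently misses genes whose versions differ between datasets. Here is a minimal sketch that strips the version before joining and keeps missing genes as None instead of zero. The function names, version numbers, and expression values are invented for illustration.

```python
def strip_version(gene_id):
    """Drop the version suffix: 'ENSG00000141510.16' -> 'ENSG00000141510'."""
    return gene_id.split(".")[0]


def join_expression(tumor, normal):
    """Join two {gene_id: value} dicts on version-less gene IDs.

    Genes absent from `normal` get None, never 0, so a missing
    measurement cannot pass for 'not expressed in normal tissue'.
    """
    normal_by_gene = {strip_version(g): v for g, v in normal.items()}
    return {
        strip_version(g): (v, normal_by_gene.get(strip_version(g)))
        for g, v in tumor.items()
    }


# Toy values; the version suffixes are illustrative only.
tumor = {"ENSG00000141510.16": 42.0, "ENSG00000146648.17": 88.5}
normal = {"ENSG00000141510.18": 40.1}

# A naive join on the raw IDs finds no overlap at all.
naive_hits = [g for g in tumor if g in normal]

merged = join_expression(tumor, normal)
missing = [g for g, (_, n) in merged.items() if n is None]

print(naive_hits)  # no raw-string matches
print(merged)
print(missing)     # genes to investigate, not to zero-fill
```

The sanity check that would have caught the original error: count how many genes actually matched before trusting any "zero in normal" result.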

AI accelerates the work. It does not replace judgment.
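
On the Z-score point: the arithmetic is the same on either axis, which is why the mistake is easy to miss. This small standard-library sketch uses a made-up 2x3 matrix to show how the two choices give different pictures. The layout (rows as regions, columns as samples) and the numbers are assumptions for illustration.

```python
from statistics import mean, pstdev


def zscore(values):
    """Z-score a list using the population SD (use stdev for sample SD).
    Assumes the values are not all identical (SD > 0)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def scale_rows(matrix):
    """Z-score each row: compares a region with itself across samples."""
    return [zscore(row) for row in matrix]


def scale_columns(matrix):
    """Z-score each column: compares regions within a single sample."""
    cols = [zscore(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*cols)]


# Toy signal matrix: the second region is 10x stronger than the first.
matrix = [
    [1.0, 2.0, 3.0],
    [10.0, 20.0, 30.0],
]

by_row = scale_rows(matrix)     # both rows come out looking identical
by_col = scale_columns(matrix)  # weak region is -1, strong region is +1

print(by_row)
print(by_col)
```

Row scaling hides the 10x difference between regions, and column scaling hides the trend across samples. Neither is wrong in general. The question you are asking decides which one is wrong for your figure.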

Three years from now

Process AI will compound faster than people expect. By 2029, every Phase 3 trial of any size will have AI in the operations stack. The biology unlock is cheap, real-time data. Until then, we're training on snapshots of dead cells.

Where are you seeing AI move the needle — or overpromise? Hit reply.

Happy Learning!

Tommy aka crazyhottommy

PS:

If you want to learn Bioinformatics, there are four ways that I can help:

  1. My free YouTube channel, Chatomics. Make sure you subscribe.
  2. I have collected many resources on my GitHub.
  3. I have been writing blog posts for over 10 years: https://divingintogeneticsandgenomics.com/
  4. Lastly, I post daily on LinkedIn about AI and bioinformatics.

Stay awesome!

PPS:

Why Subscribe?

  ✅ Curated by Tommy Tang, a Director of Bioinformatics with 100K+ followers across LinkedIn, X, and YouTube
  ✅ No fluff, just deep insights and working code examples
  ✅ Trusted by grad students, postdocs, and biotech professionals
  ✅ 100% free
