Chatomics! — The Bioinformatics Newsletter

11 tools to fetch GEO and other databases' metadata and data


Hello Bioinformatics lovers,

Tommy here. I am still in China and will be back in the US on July 21. Today's topic is public data. Why should you care about it?

Because if you are doing bioinformatics, public data is a gold mine. Instead of spending money on your own experiments, you can reuse others' data! But there is a problem.

Public data such as GEO datasets can be hard to obtain programmatically if you do not know the tools.

I have you covered!

11 tools to fetch GEO and other databases’ metadata and data

  1. GEOfetch https://geofetch.databio.org/en/latest/
  2. GEOquery, a Bioconductor package https://bioconductor.org/packages/release/bioc/html/GEOquery.html
  3. ffq: fetch metadata information from databases https://github.com/pachterlab/ffq
  4. pysradb: a Python package to query next-generation sequencing metadata and data from the NCBI Sequence Read Archive
  5. GEOparse https://github.com/guma44/GEOparse
  6. MetaSRA: normalized sample-specific metadata for the Sequence Read Archive
  7. SRA-explorer: a tool that aims to make datasets within the Sequence Read Archive more accessible
  8. fetchngs https://github.com/nf-core/fetchngs
  9. fastq-dl https://github.com/rpetit3/fastq-dl
  10. iSeq: an integrated tool to fetch public sequencing data https://www.biorxiv.org/content/10.1101/2024.05.16.594538v2
  11. Kingfisher https://wwood.github.io/kingfisher-download/
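
To get a feel for what these tools automate, here is a minimal Python sketch that uses only the standard library. It is not how any tool in the list is implemented. It assumes GEO's acc.cgi endpoint serves a SOFT-format text record when called with targ=self, form=text and view=quick, and that attribute lines look like "!Key = value". The function names and the example record (GSE0000) are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: GEO's accession display endpoint returns SOFT text for these parameters.
GEO_ACC_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def geo_soft_url(accession):
    """Build the URL for the brief SOFT text record of one GEO accession."""
    params = {"acc": accession, "targ": "self", "form": "text", "view": "quick"}
    return f"{GEO_ACC_URL}?{urlencode(params)}"


def parse_soft_attributes(text):
    """Collect '!Key = value' lines into a dict of lists (keys can repeat)."""
    attrs = defaultdict(list)
    for line in text.splitlines():
        # Skip entity lines (^...), column descriptions (#...), and data rows.
        if not line.startswith("!") or " = " not in line:
            continue
        key, value = line[1:].split(" = ", 1)
        attrs[key.strip()].append(value.strip())
    return dict(attrs)


def fetch_geo_metadata(accession):
    """Download and parse one record. Needs network access; not called here."""
    with urlopen(geo_soft_url(accession)) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    return parse_soft_attributes(text)


# Offline demo on a made-up record, so the parser can be tried without network.
EXAMPLE_SOFT = """^SERIES = GSE0000
!Series_title = Hypothetical example series
!Series_sample_id = GSM0001
!Series_sample_id = GSM0002
"""
example = parse_soft_attributes(EXAMPLE_SOFT)
print(example["Series_sample_id"])
```

With a real accession you would call fetch_geo_metadata on it and read fields such as the series title or sample IDs from the returned dict. The tools above go much further: they handle rate limits, link GEO samples to SRA runs, and download the raw files, which is why I recommend them over rolling your own.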

Sharing is Caring! Please share this newsletter with your friends if you find it useful. https://divingintogeneticsandgenomics.ck.page/profile

Happy Learning!

Tommy aka crazyhottommy

Let's connect on Twitter and LinkedIn!

Why Subscribe?
✅ Curated by Tommy Tang, a Director of Bioinformatics with 100K+ followers across LinkedIn, X, and YouTube
✅ No fluff, just deep insights and working code examples
✅ Trusted by grad students, postdocs, and biotech professionals
✅ 100% free
