
Hi! I'm Tommy Tang

11 tools to fetch GEO and other databases' metadata and data


Hello Bioinformatics lovers,

Tommy here. I am still in China and will be back in the US on July 21. Today's topic is public data. Why should you care about it?

Because if you are doing bioinformatics, public data is a gold mine. Instead of spending money on your own experiments, you can reuse data that others have already generated! But there is a problem.

Public data such as GEO datasets can be hard to obtain programmatically if you do not know the tools.

I have you covered!

11 tools to fetch GEO and other databases’ metadata and data

  1. GEOfetch https://geofetch.databio.org/en/latest/
  2. GEOquery (Bioconductor package) https://bioconductor.org/packages/release/bioc/html/GEOquery.html
  3. ffq: fetch metadata information from databases. https://github.com/pachterlab/ffq
  4. pysradb: a Python package to query next-generation sequencing metadata and data from the NCBI Sequence Read Archive.
  5. GEOparse https://github.com/guma44/GEOparse
  6. MetaSRA: normalized sample-specific metadata for the Sequence Read Archive.
  7. SRA-explorer: this tool aims to make datasets within the Sequence Read Archive more accessible.
  8. fetchngs https://github.com/nf-core/fetchngs
  9. fastq-dl https://github.com/rpetit3/fastq-dl
  10. iSeq: an integrated tool to fetch public sequencing data. https://www.biorxiv.org/content/10.1101/2024.05.16.594538v2
  11. Kingfisher https://wwood.github.io/kingfisher-download/
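To give you a feel for what these tools automate, here is a minimal Python sketch that builds the download link for a GEO series "family" SOFT file, the plain-text file that bundles a series' metadata. It is not taken from any of the tools above. The function name geo_series_soft_url is made up for illustration, and the folder layout is my understanding of how the NCBI GEO download site groups series (the last three digits of the accession replaced by "nnn"), so treat it as an assumption and check it against the GEO site before relying on it.

```python
import re

# Assumed root of the GEO series download area (HTTPS mirror of the FTP site).
GEO_SERIES_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def geo_series_soft_url(accession: str) -> str:
    """Return the assumed URL of the family SOFT file for a GEO series.

    `accession` is a GEO series ID such as "GSE1563".
    This is a hypothetical helper, not part of any GEO library.
    """
    acc = accession.strip().upper()
    if not re.fullmatch(r"GSE\d+", acc):
        raise ValueError(f"not a GEO series accession: {accession!r}")

    digits = acc[3:]
    # Series are grouped into folders named after the accession with its
    # last three digits replaced by "nnn": GSE1563 -> GSE1nnn, GSE123 -> GSEnnn.
    folder = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_SERIES_ROOT}/{folder}/{acc}/soft/{acc}_family.soft.gz"


if __name__ == "__main__":
    # Only prints the link; nothing is downloaded here.
    print(geo_series_soft_url("GSE1563"))
```

In practice you rarely need to build these links yourself. For example, GEOparse wraps the download and parsing in a single call, GEOparse.get_GEO(geo="GSE1563", destdir="./"), and the other tools in the list offer similar shortcuts for GEO or SRA.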

Sharing is Caring! Please share this newsletter with your friends if you find it useful. https://divingintogeneticsandgenomics.ck.page/profile

Happy Learning!

Tommy aka crazyhottommy

Let's connect on Twitter and LinkedIn!


I am a bioinformatician/computational biologist with six years of wet lab experience and over 12 years of computational experience. I will help you learn the computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter!
