Hi! I'm Tommy Tang

The blueprint of reproducing a genomics paper figure


Hello Bioinformatics lovers,

Happy Holidays! We had a big snow in Boston.

Early in the morning, my in-law told me our neighbor was helping us remove the snow with a gas snow blower.

I went out and said thanks. He told me that he liked helping others.

I could tell the pure JOY in his words.

Helping others is helping yourself.

Giving roses to others leaves fragrance on your hands.

Enough of the life lessons :)

How are you going to spend the holidays? For me,

I am writing the end-to-end tutorial to replicate the figure below, which I mentioned in my last newsletter.

Read it here if you missed it: https://divingintogeneticsandgenomics.kit.com/posts/good-to-great-bioinformatician-how

To be able to do it, you first need to master the basic skill: plotting!

If you master these six types of plots, you can reproduce 90% of the figures in any genomics paper.

  1. barplot
  2. scatterplot
  3. histogram
  4. line plot
  5. boxplot
  6. heatmap
Watch the video here:

I did not include others, such as Venn diagrams or pie charts. But you get the idea.

Looking at this figure,

  • figure d is a scatter plot
  • figures e, f and k are (stacked) bar plots
  • figure g is an IGV genome browser view of ChIP-seq signal, but it is really just a histogram
  • figure h is a pie chart
  • figure i is a heatmap
  • figure j is a line plot
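
To see why a genome browser track is "really just a histogram", here is a minimal sketch in plain Python (not the ggplot2 workflow this newsletter is about). It counts read start positions in fixed-width bins along a chromosome. The function name `binned_signal`, the read positions and the 100 bp bin size are all made up for illustration. Real coverage tracks are usually built by piling up whole reads or fragments rather than only their start positions, so treat this as a simplification.

```python
from collections import Counter


def binned_signal(read_starts, bin_size):
    """Count reads per fixed-width genomic bin (a histogram of positions).

    Returns a list of (bin_start, read_count) tuples, including empty bins
    between the first and last occupied bin so the track is continuous.
    """
    counts = Counter(pos // bin_size for pos in read_starts)
    if not counts:
        return []
    first, last = min(counts), max(counts)
    return [(b * bin_size, counts.get(b, 0)) for b in range(first, last + 1)]


# Hypothetical read start positions on one chromosome (illustration only).
reads = [105, 110, 130, 190, 205, 210, 215, 220, 480]

# Print a crude text "track": one row per bin, one '#' per read.
for start, n in binned_signal(reads, bin_size=100):
    print(f"{start:>4} {'#' * n}")
```

Each row is a bin and the bar length is the read count, which is exactly what a histogram of read positions shows.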

I still use ggplot2 for most of my figures and use ComplexHeatmap for making heatmaps.

If you do want to stay within the ggplot2 ecosystem, take a look at ggalign.

It can make complicated heatmaps using ggplot2:

Compare it with ComplexHeatmap:

Okay, that's the first step. I highly recommend you go through the following free books:

Then, you need to:

  1. learn how to pre-process the underlying sequencing data (in this case, ChIP-seq);
  2. analyze the figures and see what is needed to plot them;
  3. get the data into a dataframe that is ready for you to plot.
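
As a rough illustration of what "ready to plot" can mean, the sketch below reshapes a wide table (one column per category) into long form (one row per sample and category). That is the layout ggplot2 generally expects for something like a stacked bar plot. It is plain Python, not code from the tutorial. The helper `wide_to_long`, the column names and the peak counts are hypothetical. In R, tidyr's `pivot_longer()` does the same job.

```python
def wide_to_long(rows, id_col, var_name, value_name):
    """Reshape wide rows into long rows.

    Every column other than id_col becomes its own row, with the former
    column name stored under var_name and its value under value_name.
    """
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col:
                continue
            long_rows.append(
                {id_col: row[id_col], var_name: col, value_name: value}
            )
    return long_rows


# Hypothetical peak counts per genomic region for two samples.
wide = [
    {"sample": "A", "promoter": 120, "intron": 300},
    {"sample": "B", "promoter": 90, "intron": 410},
]

long_table = wide_to_long(wide, "sample", "region", "n_peaks")
for row in long_table:
    print(row)
```

With the data in this shape, each plot aesthetic (x axis, fill color, bar height) maps to exactly one column.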

I will show you how to do it all!

Wish me good luck with the writing.

Happy Learning!

Tommy, aka Crazyhottommy

PS:

If you want to learn Bioinformatics, there are ways that I can help:

  1. My free YouTube channel, Chatomics. Make sure you subscribe to it.
  2. I have many resources collected on my GitHub here.
  3. I have been writing blog posts for over 10 years at https://divingintogeneticsandgenomics.com/

Stay awesome!

I am a computational biologist with six years of wet lab experience and over 12 years of computational experience. I will help you learn computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter!