I am a bioinformatician/computational biologist with six years of wet lab experience and over 12 years of computational experience. I will help you learn the computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter!
Hello Bioinformatics lovers, Tommy here. I want to teach you the fundamentals so you understand how to solve a problem from the ground up! The other day, I was on Reddit and saw someone ask how to generate the figure below.
Let me give you some general principles for approaching this type of problem.
How to Recreate Bioinformatics Figures
Have you ever seen a great figure in a paper and wondered how to make something similar? The key is to break it down step by step before you start coding.
Step 1: Analyze the Figure
Before jumping into ggplot2 or ComplexHeatmap, take a moment to examine the figure:
Step 2: Recognize Common Plot Types
Most bioinformatics figures are built from a handful of basic plots:
Step 3: Master the Right Tools
Step 4: Get Your Data Structure Right
The most important step isn’t plotting; it’s structuring your data correctly.
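As a rough illustration of what "structuring your data" can look like, here is a minimal base R sketch. It assumes you start from replicate-level log2 fold changes (the metabolite names and values are made up) and want one row per bar, with a mean and a standard deviation. The column names `metabolites`, `logFC`, and `sd` are chosen for illustration.

```r
# Hypothetical replicate-level measurements: one row per metabolite per replicate.
# All names and values are invented for illustration.
raw <- data.frame(
  metabolites = rep(c("lactate", "citrate", "glutamine"), each = 3),
  logFC = c(1.2, 1.5, 0.9, -0.8, -1.1, -0.5, 0.3, 0.1, 0.5)
)

# Collapse to one row per bar: the mean log2 fold change per metabolite...
means <- aggregate(logFC ~ metabolites, data = raw, FUN = mean)

# ...and the standard deviation across replicates, used later for error bars.
sds <- aggregate(logFC ~ metabolites, data = raw, FUN = sd)
names(sds)[2] <- "sd"

# Join the two summaries into the plotting data frame.
df <- merge(means, sds, by = "metabolites")
print(df)
```

The point of the sketch is the shape of `df`: each row corresponds to exactly one element you will draw, and each column corresponds to one visual property (position, bar length, error bar size).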
Step 5: Ask the Right Questions
To structure your data properly, ask:
Let's analyze this plot. (We will not discuss today whether a bar plot is a good visualization or not :))
Once you know this, you should get a data frame like this:
column1: metabolites
column2: log2 fold change (logFC)
column3: standard deviation (sd)
Then, you are ready to plot:
library(ggplot2)
ggplot(df, aes(x = metabolites, y = logFC)) + # map the x-axis to the metabolites column and the y-axis to logFC
  geom_bar(stat = "identity", aes(fill = logFC)) + # map the bar fill to logFC (color would only change the outline)
  geom_errorbar(aes(ymin = logFC - sd, ymax = logFC + sd), width = 0.2) + # add the error bars by mapping ymin and ymax
  coord_flip() # flip the x- and y-axes so the bars run horizontally
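To try the recipe end to end, here is a self-contained sketch with made-up values (the metabolite names, fold changes, and standard deviations are all hypothetical). It uses `geom_col()`, which is shorthand for `geom_bar(stat = "identity")`, and maps `fill` so that the bar interiors are colored; `color` would only affect the outline. The bar ordering and the blue-white-red scale are my own choices, not something taken from the original figure.

```r
library(ggplot2)

# Hypothetical data frame: one row per metabolite (values invented for illustration).
df <- data.frame(
  metabolites = c("lactate", "citrate", "glutamine", "succinate"),
  logFC = c(1.2, -0.8, 0.3, -1.6),
  sd = c(0.30, 0.30, 0.20, 0.25)
)

# ggplot2 draws discrete axes in factor-level order, so sort the levels by logFC.
df$metabolites <- factor(df$metabolites, levels = df$metabolites[order(df$logFC)])

p <- ggplot(df, aes(x = metabolites, y = logFC)) +
  geom_col(aes(fill = logFC)) +                      # bar length and fill both follow logFC
  geom_errorbar(aes(ymin = logFC - sd, ymax = logFC + sd), width = 0.2) +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0) +
  coord_flip() +                                     # horizontal bars
  labs(x = NULL, y = "log2 fold change")

print(p)
```

Reordering the factor levels is usually the quickest way to control bar order; without it, ggplot2 sorts the metabolites alphabetically.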
Read here for a tutorial on error bars with ggplot2.
Another Real Example: Single-Cell RNA-seq Dot Plot
A dot plot in single-cell RNA-seq is essentially a heatmap:
If using ggplot2, make sure your data is in tidy format. Use
If using ComplexHeatmap, prepare two separate matrices:
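Both layouts can be sketched from the same toy data. The example below assumes the usual dot plot encoding, where dot color shows average expression and dot size shows the percentage of cells expressing the gene. The column names `avg_expr` and `pct_expr` and all values are hypothetical. It first draws the plot from a tidy data frame with ggplot2, then reshapes the same data into two gene-by-cluster matrices of the kind a matrix-based tool such as ComplexHeatmap would take as input (no ComplexHeatmap call is shown here).

```r
library(ggplot2)

# Hypothetical tidy data: one row per gene per cluster (values invented).
dot_df <- expand.grid(
  gene = c("CD3E", "MS4A1", "LYZ"),
  cluster = c("T cells", "B cells", "Monocytes"),
  stringsAsFactors = FALSE
)
dot_df$avg_expr <- c(2.1, 0.1, 0.2, 0.1, 2.5, 0.3, 0.2, 0.1, 3.0)  # assumed: average expression
dot_df$pct_expr <- c(85, 5, 10, 4, 90, 12, 8, 3, 95)               # assumed: % of cells expressing

# Tidy route: one geom_point layer, size and color mapped to two columns.
p <- ggplot(dot_df, aes(x = cluster, y = gene)) +
  geom_point(aes(size = pct_expr, color = avg_expr)) +
  scale_color_gradient(low = "grey90", high = "darkred") +
  theme_minimal()
print(p)

# Matrix route: reshape the same data into two gene-by-cluster matrices.
# Each gene/cluster pair has a single value, so mean() simply returns it.
expr_mat <- tapply(dot_df$avg_expr, list(dot_df$gene, dot_df$cluster), mean)
pct_mat <- tapply(dot_df$pct_expr, list(dot_df$gene, dot_df$cluster), mean)
print(expr_mat)
print(pct_mat)
```

Either way, the information is identical; only the container changes. The tidy data frame has one row per dot, while the two matrices have one cell per dot and share the same row and column names.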
Step 7: Learn More
For a detailed walkthrough with code, check out this guide:
You can also see how far you can go once you have the basic skills under your belt: How to make a multi-group dotplot for single-cell RNAseq data
Yes, there are R packages to make such plots, but knowing that you can make any figure you want is liberating!
Key Takeaway
Before plotting, structure your data properly. That’s 90% of the work! Then learn ggplot2 and ComplexHeatmap!
What’s your biggest struggle when making bioinformatics figures? Reply and let me know!
Other posts that you may find helpful
Happy Learning!
Tommy aka crazyhottommy
PS: If you want to learn Bioinformatics, there are four ways that I can help:
Stay awesome!