I am a computational biologist with six years of wet lab experience and over ten years of computation experience. I will help you to learn computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter! https://github.com/crazyhottommy/getting-started-with-genomics-tools-and-resources
Hello Bioinformatics lovers,

It is a beautiful day here. I had a dilemma: should I sit and write this newsletter or go out and play with the kids? I chose the latter. That's why this reaches you a little late. I love teaching and I also love my family.

Today we will talk about how to develop a data sense. What do I mean? It means verifying whatever you get from a command.
I have talked about exploratory data analysis (EDA) before. Let me give you another example. I have an RNA-seq count matrix with genes in rows and samples in columns. I need to normalize it to counts per million (CPM). The calculation is simple: divide each count by the total count of its sample (the column sum), then multiply by one million.
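Before reaching for a one-liner, it helps to work the numbers out on a matrix small enough to check by eye. The sketch below is only an illustration: the 3-gene, 2-sample matrix and the names `mat` and `cpm_by_hand` are made up for this example, and the loop is deliberately the slow, obvious way to do it.

```r
# Hypothetical dummy count matrix: 3 genes (rows) x 2 samples (columns)
mat <- matrix(c(10, 30, 60,
                100, 100, 200),
              nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"),
                              c("sample1", "sample2")))

# Total counts per sample: sample1 = 100, sample2 = 400
totals <- colSums(mat)

# CPM one column at a time, so every step is visible
cpm_by_hand <- mat
for (j in seq_len(ncol(mat))) {
  cpm_by_hand[, j] <- mat[, j] / totals[j] * 1e6
}

# sample1 should be 100000, 300000, 600000
# sample2 should be 250000, 250000, 500000
print(cpm_by_hand)

# Sanity check: every sample must add up to one million after CPM
stopifnot(isTRUE(all.equal(unname(colSums(cpm_by_hand)), c(1e6, 1e6))))
```

Any shorter expression for CPM should reproduce these numbers exactly. If it does not, the shortcut is wrong, not the arithmetic.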
In R, you can do it by:

    t(t(mat)/colSums(mat)) * 1e6

The key is transposing the matrix first, dividing by the column sums, and then transposing it back; multiplying by 1e6 gives the "per million" part. By default, when you divide a matrix by a vector, the division is row-wise: the first element of the vector divides the first row, the second element divides the second row, and so on. Transposing puts the samples in the rows, so each sample is divided by its own column sum.

If you want to further normalize it to RPKM (reads per kilobase per million), you need to divide each row (gene) by the length of that gene in kilobases. Suppose you have another vector, gene_length, with the gene lengths in the same order as the matrix rows (assuming the lengths are in base pairs, hence the division by 1000):

    t(t(mat)/colSums(mat)) * 1e6 / (gene_length / 1000)

In this case, you can just divide the matrix by the gene_length vector, because the division is row-wise.

However, you may not know the division is by rows and assume it is by columns. Then you may make a mistake! Instead, you can use a dummy matrix with only 3 rows, so you can see the calculation with your own eyes! That's exactly what I did. Watch this YouTube video.

The key takeaway is to always verify your results.

Happy Learning!

Tommy

PS: If you want to learn Bioinformatics, there are three ways that I can help:
Stay awesome!