How to get a list of all ribosomal genes in Rat?
6.4 years ago
c_u ▴ 520

I have high-throughput sequencing data (cDNA from ribosome profiling). When I ran FastQC on it and looked at the overrepresented sequences, many of them came from ribosomal genes. This is unexpected, because the experimenters did use a protocol to remove ribosomal RNA.

I have now mapped the reads using TopHat and want to remove the reads that mapped to ribosomal genes. For this I need a list of all ribosomal genes in rat; I can then use samtools to remove those reads. Is there somewhere I can find such a list?

Thank you!

RNA-Seq rat gene-list • 3.5k views

This may be your best bet. rDNA repeats are not well characterized in many cases.


I think both approaches, via the GTF and via BioMart, are good ideas. However, I was not sure about the link you mentioned in this comment. As far as I can see, it corresponds to just one rRNA gene, and I was hoping to find a list of all such genes.


This is a repeating unit. There would be multiple copies of the core sequence. If you want to get all known copies then use the rRNA/GTF answer given below.


Thanks a lot for the help!


What have you tried so far to find this information?


I searched online and found that BioMart is one way to download them. I tried it for rat genes, but I am not sure the version is correct: TopHat was run with the Rn6 version, and I was unsure whether the rat data in BioMart belongs to that version. Also, the list I downloaded had ~330 genes, and I was unsure whether that is even close to comprehensive.


As long as you find the sequence of the repeat unit you should be reasonably OK. There are multiple copies of these genes across multiple chromosomes, and they are not fully characterized even in human and mouse, afaik.

6.4 years ago
igor 13k

Check this previous discussion about rRNA: RNA-seq rRNA contamination

6.4 years ago
lshepard ▴ 470

You can extract all genes from a gtf file with the gene_biotype "rRNA".
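
A rough sketch of how that extraction could look in Python, in case it helps. It assumes an Ensembl-style GTF where each `gene` line carries a `gene_biotype "..."` attribute; the function name `rrna_genes_to_bed` and the file names in the usage note are made up for illustration. As far as I know, Ensembl labels mitochondrial rRNAs separately as `Mt_rRNA`, so you may want to pass both biotypes.

```python
import re

# Matches GTF attribute pairs such as: gene_id "ENSRNOG..."; gene_biotype "rRNA";
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def rrna_genes_to_bed(gtf_lines, biotypes=("rRNA",)):
    """Yield BED-style tuples for 'gene' features whose gene_biotype is in biotypes.

    Each tuple is (chrom, start0, end, gene_id, score, strand).
    GTF coordinates are 1-based inclusive; BED starts are 0-based,
    so the start is shifted by one and the end is left unchanged.
    """
    wanted = set(biotypes)
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Only look at gene-level records (not transcripts or exons).
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attrs = dict(_ATTR_RE.findall(fields[8]))
        if attrs.get("gene_biotype") not in wanted:
            continue
        chrom, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        yield (chrom, start - 1, end, attrs.get("gene_id", "."), 0, strand)
```

To use it, open the annotation and write the tuples out tab-separated, e.g. loop over `rrna_genes_to_bed(open("annotation.gtf"), biotypes=("rRNA", "Mt_rRNA"))` and write each tuple to `rRNA.bed` (both file names are placeholders). The BED file can then be used to drop overlapping reads, for instance with `samtools view -L rRNA.bed -U kept.bam ...` or `bedtools intersect -v`; check the tool manuals for the exact invocation. Keep in mind the caveat from the comments above: annotated rRNA genes may not cover every rDNA repeat copy.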


To complete this answer here is the Ensembl rat genome 6.0 version.
