Removing rRNA and tRNA sequences using GTF files
7.7 years ago
pixie@bioinfo ★ 1.5k

Hello,

The reference build I am currently working with does not provide FASTA files for rRNA and tRNA, but it does provide GTF files. How can I use the GTF files to remove these sequences from my RNA-seq data, and which software should I use?

RNA-Seq • 9.3k views

Strictly speaking, you don't need to remove the sequences. Just ignore those features when you do your gene counts, or drop them before you do the DE analysis.

Take a look at this for software suggestions: (Modern, mid-2016) RNA-seq software pipeline
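To make "ignore those features" concrete, here is a minimal sketch of one way to do it: collect the gene IDs annotated as rRNA or tRNA in the GTF, then drop them from a count table before DE analysis. It assumes the GTF carries an Ensembl-style `gene_biotype` attribute (or GENCODE-style `gene_type`) with the values `rRNA` and `tRNA`; check what your annotation actually uses. The function names and the dict-based count table are hypothetical, for illustration only.

```python
import re

# Biotype labels to exclude. Assumed Ensembl-style values; adjust to your GTF.
EXCLUDED_BIOTYPES = {"rRNA", "tRNA"}

# Matches key "value" pairs in the ninth GTF column, e.g. gene_id "G1";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def excluded_gene_ids(gtf_lines, biotypes=EXCLUDED_BIOTYPES):
    """Return the set of gene_id values whose biotype is in `biotypes`."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment and empty lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GTF record
        attrs = dict(ATTR_RE.findall(cols[8]))
        # Assumption: the biotype lives under one of these two keys.
        biotype = attrs.get("gene_biotype") or attrs.get("gene_type")
        if biotype in biotypes and "gene_id" in attrs:
            ids.add(attrs["gene_id"])
    return ids


def drop_genes(counts, gene_ids):
    """Return a copy of a {gene_id: count} table without the listed genes."""
    return {gene: n for gene, n in counts.items() if gene not in gene_ids}
```

In practice you would pass an open GTF file handle as `gtf_lines` and apply the same ID set to whatever count matrix your counting tool produces.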


Thank you, yes, this looks simple.


What are your goals and what species are you using? I've tried prefiltering by mapping against the reference rRNA cassette for mouse/human and a good chunk of rRNA reads remain for things like RiboSeq. I usually deal with that by creating blacklist regions, but if you just need to do normal DE analysis then you can just ignore these regions.
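As a rough illustration of the blacklist idea, the sketch below turns the rRNA gene records of a GTF into BED intervals that could be handed to any tool that accepts a region blacklist. This is only one possible way to build such a list and not necessarily how the regions mentioned above were derived. It assumes `gene` feature lines with an Ensembl-style `gene_biotype` (or GENCODE-style `gene_type`) attribute; the function name is hypothetical.

```python
import re

# Matches key "value" pairs in the ninth GTF column.
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def rrna_blacklist_bed(gtf_lines, biotypes=("rRNA",), feature="gene"):
    """Return sorted BED lines (chrom, start, end, name) for matching genes."""
    regions = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        # Assumption: the biotype lives under one of these two keys.
        biotype = attrs.get("gene_biotype") or attrs.get("gene_type")
        if biotype in biotypes:
            # GTF is 1-based inclusive; BED is 0-based half-open.
            start = int(cols[3]) - 1
            end = int(cols[4])
            regions.append((cols[0], start, end, attrs.get("gene_id", ".")))
    regions.sort(key=lambda r: (r[0], r[1]))
    return ["\t".join(str(field) for field in r) for r in regions]
```

Writing the returned lines to a file gives a BED that can be used to exclude reads or features overlapping those intervals.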


Hi Devon, can you elaborate on how you came by your reference rRNA cassette? I too have been grappling with rRNA contamination in my samples; in some of them, depletion was not as effective as in the rest.

What I have done in my work is to take the GTF of rRNA annotations from RepeatMasker, along with the FASTA sequences for those annotations. I too find that rRNA reads remain after my pipeline (taking the GTF file and using split_bam.py from RSeQC).


You can find the human rDNA repeat sequence in this post: entire human rDNA

Mouse rDNA repeat can be found here.


Thank you, genomax. I don't know how I missed this annotation (well, to be honest, I wasn't aware of it). I will add it to my rRNA reference and try again.

Cheers.


Thanks for the suggestions. I am new to this. I am working on rice and am interested in 1) DE analysis and 2) co-expression networks. I plan to carry out the following pre-processing steps before I go for DE:

1) Remove low-quality bases from the 5' and 3' ends
2) Remove rRNA and tRNA sequences
3) Remove reads that are shorter than 20 bp

Does this look fine?
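For illustration, steps (1) and (3) of the plan above can be sketched in a few lines of Python. This is a toy version of what a dedicated read trimmer does, not a replacement for one. It assumes Phred+33 quality encoding and a quality cutoff of 20, which is my choice and not stated in the thread; the 20 bp length cutoff comes from the post. The function names and the `(name, seq, qual)` record layout are hypothetical.

```python
def trim_ends(seq, qual, min_q=20, offset=33):
    """Trim bases below `min_q` from both ends of a read (Phred+33 assumed)."""
    scores = [ord(ch) - offset for ch in qual]
    start, end = 0, len(scores)
    # Step 1a: walk in from the 5' end past low-quality bases.
    while start < end and scores[start] < min_q:
        start += 1
    # Step 1b: walk in from the 3' end past low-quality bases.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]


def filter_reads(records, min_q=20, min_len=20):
    """Yield trimmed (name, seq, qual) records at least `min_len` bases long."""
    for name, seq, qual in records:
        trimmed_seq, trimmed_qual = trim_ends(seq, qual, min_q)
        # Step 3: discard reads that end up shorter than the length cutoff.
        if len(trimmed_seq) >= min_len:
            yield name, trimmed_seq, trimmed_qual
```

Only low-quality bases at the ends are removed here; low-quality bases in the middle of a read are left alone.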


For that you can skip (2).
