Paired-end sequencing with different read lengths: better to trim everything to a short length or use the long reads as single-end?
6.6 years ago
Lalla ▴ 40

Dear all,

I am working on alternative splicing. I have paired-end sequencing data with reads of different lengths (100 bp for the forward reads and 66 bp for the reverse reads, which were trimmed because of low quality). My problem is that some tools for alternative splicing, such as MISO or rMATS, require reads of the same length. I could trim the forward reads to the same size and use both forward and reverse reads, but then all the reads would be quite short (66 bp). Alternatively, I could use only the forward reads, which are considerably longer (100 bp). I searched a bit but could not find a clear answer on which strategy is better. To my knowledge, it is better not to use reads shorter than 70 bp for alternative splicing detection, but I find papers that publish alternative splicing data from 50 bp single-end reads.

Any advice would be highly appreciated.

Thanks!

rna-seq alternative splicing read length • 3.9k views

What Q score threshold did you trim the reads at? You may want to go back to the original reads and try them to see if they still work. When you have a reference genome you can afford to use reads with less than optimal quality.
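To make the threshold question concrete, here is a minimal sketch of 3' quality trimming. It assumes Phred+33 encoded qualities and a deliberately simple rule (drop trailing bases until one meets the threshold); the function names are hypothetical, and real trimmers typically use sliding-window or running-sum schemes, so this is not what the in-house pipeline necessarily did.

```python
def phred33_scores(quality: str) -> list[int]:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality]


def trim_3prime(seq: str, qual: str, min_q: int) -> tuple[str, str]:
    """Remove trailing bases whose quality is below min_q.

    Simplified rule for illustration only: stop at the first base
    (scanning from the 3' end) that meets the threshold.
    """
    scores = phred33_scores(qual)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]
```

A lower threshold keeps more of each read, which is why the threshold used matters for how short R2 ended up.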


Dear genomax,

Thank you for your reply. Unfortunately, I do not know. I am not a bioinformatician, and the quality trimming, mapping and alignment were done by our in-house bioinformaticians. We work on mouse, so I believe that if they decided to trim those reads it was for a good reason, and I wouldn't trust low-quality reads when it comes to alternative splicing detection.

Thank you anyway


If MISO or rMATS require reads of the same length, then you have no option but to trim R1 to the same length as R2.
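As an illustration of what trimming R1 to a fixed length involves, here is a minimal sketch that hard-trims every record of an uncompressed four-line FASTQ stream. The function names are hypothetical, and it assumes well-formed records with no line wrapping; an established trimming tool would be the safer choice for real data.

```python
from typing import Iterator, TextIO


def read_fastq(handle: TextIO) -> Iterator[tuple[str, str, str, str]]:
    """Yield (header, sequence, separator, quality) from a 4-line FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def trim_fastq(src: TextIO, dst: TextIO, length: int) -> tuple[int, int]:
    """Cut every read to at most `length` bases from the 5' end.

    Reads already shorter than `length` are written unchanged, so the
    record order (and pairing with the mate file) is preserved.
    Returns (total reads, reads shorter than `length`).
    """
    total = 0
    shorter = 0
    for header, seq, plus, qual in read_fastq(src):
        total += 1
        if len(seq) < length:
            shorter += 1
        dst.write(f"{header}\n{seq[:length]}\n{plus}\n{qual[:length]}\n")
    return total, shorter
```

For the data in this thread, `length` would be 66. The second return value shows how many reads are still below the target; those would need separate handling if the downstream tool insists on one exact length.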


I agree with genomax. In addition, nothing prevents you from doing both: trim R1 to 66 bp for a paired-end analysis, run the untrimmed R1 separately as 100 bp single-end reads, and integrate the results, so that you get the most information possible. Finally, I am pretty sure that tools such as TopHat-Cufflinks (and probably the newer HISAT as well) can align paired reads of different lengths and detect different isoforms, so you might also try working in that direction.


Ok, I will try both ways then. Thanks for the suggestions!


Is it mandatory for the reads to be the same length even when working with BAM files (in the case of rMATS)?
