Trimming single end reads for STAR?
2
9 months ago
caggtaagtat • 620

Hi,

I just started to work with single-end reads, which are already trimmed for adapter sequences and quality. Do I now have to trim the reads to the same length, e.g. 100 nt, before mapping them with STAR? Is there a negative effect if I don't?

4
9 months ago

If the qualities are OK and there are no adapters, you can proceed with mapping. There is a recent paper about trimming of RNA-seq data and its possible consequences for downstream analysis: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4766705/

0

Thank you! I will proceed with the mapping then.

2
11 months ago
h.mon 25k
Brazil

If they are already trimmed for adapters and quality, don't trim more. Trimming makes sequences shorter, and shorter sequences are more likely to map to multiple locations.

What is the length range of your reads? I generally keep reads only within a certain range, and discard the shorter reads. For example, for a 100bp dataset, I keep reads from 70-100bp after trimming, and discard the rest.
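
A minimal sketch of that kind of length filter, assuming an uncompressed FASTQ with exactly four lines per record. The function names `read_fastq` and `filter_by_length` are hypothetical, and the 70-100 defaults simply mirror the example range above:

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line-per-record FASTQ."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:  # end of file (or a truncated final record)
            return
        yield tuple(record)


def filter_by_length(records, min_len=70, max_len=100):
    """Keep only records whose sequence length lies within [min_len, max_len]."""
    for record in records:
        if min_len <= len(record[1]) <= max_len:
            yield record


# Tiny in-memory example: one 80 nt read (kept) and one 40 nt read (dropped).
demo_lines = ["@read1", "A" * 80, "+", "I" * 80,
              "@read2", "C" * 40, "+", "I" * 40]
demo = io.StringIO("\n".join(demo_lines) + "\n")

kept = list(filter_by_length(read_fastq(demo)))
print([record[0] for record in kept])
```

In practice a trimming tool would normally apply this filter for you; the sketch is only meant to show the logic.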

0

That makes sense! My reads are 40-155nt long.

Here is a plot of the percentage I would discard vs the possible minimal read length. Would a minimal length of 80nt be appropriate?

https://ibb.co/gLZ7q7
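
For anyone wanting to reproduce that kind of curve, here is a rough sketch of how the percentage of discarded reads per candidate minimum length could be computed. The helper name `percent_discarded` is hypothetical, and the toy read lengths are invented for illustration; they are not the data behind the plot.

```python
from bisect import bisect_left


def percent_discarded(read_lengths, thresholds):
    """Map each minimum-length threshold to the percentage of reads shorter than it."""
    lengths = sorted(read_lengths)
    if not lengths:
        raise ValueError("no read lengths given")
    total = len(lengths)
    # bisect_left gives the number of reads strictly shorter than the threshold.
    return {t: 100.0 * bisect_left(lengths, t) / total for t in thresholds}


# Toy lengths in the 40-155 nt range (made up for this example).
toy_lengths = [40, 60, 75, 80, 100, 120, 155, 155]
curve = percent_discarded(toy_lengths, thresholds=[50, 75, 80])
for threshold, percent in curve.items():
    print(f"min length {threshold} nt: {percent:.1f}% discarded")
```

A read exactly as long as the threshold is counted as kept here, which is an assumption about how the cutoff is applied.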

1

80 seems reasonable. What is the organism? Also, if you used Trimmomatic for trimming, it has an option to remove trimmed reads shorter than a given value.
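
As a sketch of how that option could be used in single-end mode: Trimmomatic's `MINLEN:<n>` step drops reads shorter than n. The jar path and file names below are placeholders (assumptions), and the helper `trimmomatic_se_command` is hypothetical; it only builds the command rather than running it.

```python
import shlex


def trimmomatic_se_command(fastq_in, fastq_out, min_len, jar="trimmomatic.jar"):
    """Build a single-end Trimmomatic call that only applies a MINLEN filter.

    The jar location is an assumption; adjust it to the local installation.
    """
    if min_len < 1:
        raise ValueError("min_len must be a positive integer")
    return ["java", "-jar", jar, "SE", fastq_in, fastq_out, f"MINLEN:{min_len}"]


# Placeholder file names, with the 80 nt cutoff discussed above.
cmd = trimmomatic_se_command("reads.fastq.gz", "reads.min80.fastq.gz", 80)
print(shlex.join(cmd))
```

Once the paths are correct, the list can be passed to `subprocess.run(cmd, check=True)`.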

0

OK, thank you. The reads were obtained from human cardiovascular endothelial cells. Thank you, I was going to use Trimmomatic :)

0

50 bp should be fine for counting applications with the human genome. You may be throwing good data away by being too strict.

0

OK, but since I am analysing alternative splicing, I will stick with a minimal length of 75 nt for now. I read somewhere in this forum that reads should not be shorter than 70 nt for isoform analysis.

0

That sounds reasonable. Curious why you did not choose to do paired-end sequencing to get spatial information in that case.

0

I was told that single-end sequencing would be better for splicing analysis, although I can't remember why. Besides, I was not involved in that decision, and I would guess financial reasons also played a part ;)

0

"Sufficient" makes more sense than "better". The financial angle is always critical :-)

