Splitting long reads into shorter chunks
5.8 years ago
pennakiza ▴ 60

Hi all,

I was wondering whether splitting a long read into multiple shorter ones would have any consequences for the quality of the read, and I would like to hear your opinions on that.

Just a very simple example: say I have a 10-base sequence:

@read1
ATGTGGATCA

and I split it into two 5-base ones:

@read1_1
ATGTG
@read1_2
GATCA

Thanks
Peny

long-read • 1.3k views

Why would you want to do that? You don't state your goal, so it is difficult to evaluate what the consequences of splitting the reads would be. The "quality" of the reads remains the same as before, provided you also split the FASTQ quality strings at the same positions.

However:

If you intend to use these reads for mapping or assembly, the results of those analyses will be poorer: shorter reads map less specifically and produce more multi-mapping reads, and they also lead to more fragmented assemblies. Downstream analyses based on those mappings or assemblies will be negatively affected as well.


I have very long reads that I would like to pass through a fusion gene detection tool, which will not take the reads as they are, mainly because of the aligner that it uses. However, I am thinking of splitting them into fairly large chunks (1000 bases, as opposed to the 10000-base sequences that I have now).


Then your example of 5 and 10 nucleotide reads is not representative of the real question.


It was just an example. My real-life reads are huge, around 10000 nt each, and I was planning to chunk them into pieces of 1000 nt each.
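
For what it's worth, the chunking itself could look roughly like the sketch below. It is only an illustration, not what any particular tool does: the function name `split_record`, the `_1`, `_2` suffix scheme (borrowed from the toy example in the question), and the quality string used in the usage note are all assumptions. The one thing that matters is that the sequence and the quality string are cut at the same positions, so each chunk keeps its own per-base qualities.

```python
from typing import Iterator, Tuple


def split_record(
    name: str, seq: str, qual: str, chunk_size: int = 1000
) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) chunks of at most chunk_size bases.

    Hypothetical helper: chunk names get a 1-based suffix (read1_1, read1_2, ...).
    The final chunk is shorter when the read length is not a multiple of chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")

    # Cut sequence and quality at identical offsets so they stay in sync.
    for index, start in enumerate(range(0, len(seq), chunk_size), start=1):
        end = start + chunk_size
        yield f"{name}_{index}", seq[start:end], qual[start:end]


def format_fastq(name: str, seq: str, qual: str) -> str:
    """Render one chunk as a four-line FASTQ record."""
    return f"@{name}\n{seq}\n+\n{qual}\n"
```

With the 10-base read from the question and a made-up quality string, `list(split_record("read1", "ATGTGGATCA", "IIIIIHHHHH", chunk_size=5))` gives `read1_1` (`ATGTG`, `IIIII`) and `read1_2` (`GATCA`, `HHHHH`). For the real data you would call it with `chunk_size=1000` on each record read from the FASTQ file.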
