What Could Be The Situation That Contigs With Zero Reads And Singleton Contigs Are Included In The Assembly?
12.3 years ago
Prakki Rama ★ 2.7k

Hi all,

As asked above, is it possible, or is there any good reason, for singletons to be included in an assembly?

Taken literally, zero-read contigs do not make sense.

I assembled 25 million 90 bp FASTQ reads with the Trinity assembler and obtained 0.13 million contigs/transcripts.

I then mapped the FASTQ reads back to the transcripts that Trinity assembled from them.

When I counted the reads mapped to each contig/transcript using bowtie2 and samtools, I was confused to see that a few contigs had only 1 read and some had 0 reads.

I also observed that these 1-read and 0-read contigs were 100-500 bp long. If the reads did not overlap into contigs, why would the length be >90 bp? But my main question is: why did bowtie2 still report 0 reads for a few of the contigs?
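For anyone wanting to reproduce the per-contig counts described above, here is a minimal sketch. It assumes the counts come from the tab-separated output of `samtools idxstats` (reference name, length, mapped, unmapped) run on a sorted, indexed BAM of the bowtie2 alignments. The contig names and numbers in `example` are made up for illustration, and `low_coverage_contigs` is a hypothetical helper name.

```python
def low_coverage_contigs(idxstats_text, max_reads=1):
    """Return (contig, length, mapped) for contigs with <= max_reads mapped reads.

    idxstats_text is assumed to be `samtools idxstats` output:
    name <TAB> length <TAB> mapped <TAB> unmapped, one reference per line.
    """
    rows = []
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        name, length, mapped, _unmapped = line.split("\t")
        if name == "*":
            # Final summary row for reads with no reference assigned.
            continue
        if int(mapped) <= max_reads:
            rows.append((name, int(length), int(mapped)))
    return rows


# Hypothetical idxstats output, for illustration only.
example = (
    "contig_a\t312\t0\t0\n"
    "contig_b\t450\t1\t0\n"
    "contig_c\t1800\t950\t3\n"
    "*\t0\t0\t120\n"
)

for name, length, mapped in low_coverage_contigs(example):
    print(name, length, mapped)
```

A count of 0 here only says that no read was reported as aligned to that contig. It does not show that no k-mer from any read went into building it.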

All ideas are welcome and help is appreciated.

Thanks in advance.

contigs assembly • 4.2k views
12.3 years ago

Realigning is not really the litmus test you think it is. You should check which read-tracking export formats Trinity offers.

Graph-based assemblers assemble kmers, not reads, and will willingly break reads up at the points of ambiguity (repeat junctions). So it is possible some of your reads have their feet planted in two or more contigs.
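To make the point about reads being split at repeat junctions concrete, here is a toy sketch. It is not Trinity's or Velvet's actual algorithm: it builds a plain k-mer graph from two made-up reads that share the repeat `CGTA`, ignores reverse complements and sequencing errors, and reports maximal non-branching paths as "contigs". All function names are hypothetical.

```python
def kmers(seq, k):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_graph(reads, k):
    """Nodes are k-mers; an edge a -> b exists when a[1:] == b[:-1]."""
    nodes = {km for read in reads for km in kmers(read, k)}
    succ = {n: [] for n in nodes}
    pred = {n: [] for n in nodes}
    for n in sorted(nodes):
        for base in "ACGT":
            m = n[1:] + base
            if m in nodes:
                succ[n].append(m)
                pred[m].append(n)
    return succ, pred


def unitigs(succ, pred):
    """Maximal non-branching paths, each returned as a list of k-mers.

    Isolated cycles have no start node and are skipped in this sketch.
    """
    def is_start(n):
        # A node continues a path only if it has exactly one predecessor
        # and that predecessor has exactly one successor.
        return not (len(pred[n]) == 1 and len(succ[pred[n][0]]) == 1)

    paths = []
    for n in sorted(succ):
        if not is_start(n):
            continue
        path = [n]
        while len(succ[path[-1]]) == 1:
            nxt = succ[path[-1]][0]
            if len(pred[nxt]) != 1:
                break  # nxt is a junction and starts its own path
            path.append(nxt)
        paths.append(path)
    return paths


def spell(path):
    """Sequence spelled by a path of overlapping k-mers."""
    return path[0] + "".join(km[-1] for km in path[1:])


K = 4
reads = ["AAACGTACC", "TTTCGTAGG"]  # made-up reads sharing the repeat CGTA
succ, pred = build_graph(reads, K)
paths = unitigs(succ, pred)
contigs = [spell(p) for p in paths]
owner = {km: spell(p) for p in paths for km in p}

# Which contigs do the k-mers of the first read end up in?
touched = []
for km in kmers(reads[0], K):
    if owner[km] not in touched:
        touched.append(owner[km])

print(contigs)  # ['AAACGT', 'CGTA', 'GTACC', 'GTAGG', 'TTTCGT']
print(touched)  # ['AAACGT', 'CGTA', 'GTACC']
```

In this toy case the 9 bp read contributes k-mers to three contigs, and no contig contains the whole read, so an end-to-end alignment of that read back to the contigs would fail even though its k-mers were all used.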


+1. Yes. This is a better explanation. Although what I just said is not wrong, it should have a smaller effect. I am retracting my answer. Yours is the correct one.


Thank you, Jeremy and lh3. Yes, I agree that k-mers are used in the assembly. But could I please know why I am getting transcripts without any k-mers? I used a k-mer length of 43 in Velvet and used the built-in AMOS (.afg) file to get the following results. The numbers below show Contig_ID, Reads (these must be k-mers from different reads), and the length of the contig.

Contig_ID  Reads  Length
93764      0      87
95957      0      88
105528     0      89
509        1      85
2015       1      85
6601       1      85
...
897        2      85
2728       2      87
4386       2      85
...

How is it possible for a contig to reach a length of 87 without any overlap with other k-mers? Please clarify, and correct me if I am misunderstanding.
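The k-mer arithmetic behind this question can be checked with a short sketch. It uses only the numbers given in this thread (90 bp reads, k = 43, reported lengths of 85 to 89) and two standard identities: a read of length L yields L - k + 1 k-mers, and a path of n overlapping k-mers spells n + k - 1 bases. Whether the AFG-derived lengths above are in bases or in k-mers is not established here, so both readings are shown; the function names are hypothetical.

```python
def kmers_in_sequence(length_bp, k):
    """Number of overlapping k-mers in a sequence of length_bp bases."""
    return max(length_bp - k + 1, 0)


def bases_from_kmers(n_kmers, k):
    """Bases spelled by a path of n_kmers overlapping k-mers."""
    return n_kmers + k - 1 if n_kmers > 0 else 0


K = 43
READ_LEN = 90

print("k-mers per read:", kmers_in_sequence(READ_LEN, K))

for reported in (85, 87, 89):
    as_bp = kmers_in_sequence(reported, K)     # if 'reported' is in bases
    as_kmers = bases_from_kmers(reported, K)   # if 'reported' is in k-mers
    print(reported, "-> k-mers if bp:", as_bp, "| bp if k-mers:", as_kmers)
```

If the lengths are in bases, each of these contigs holds fewer k-mers than a single 90 bp read provides, which would be consistent with reads being split across contigs rather than placed whole.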


Did you actually turn on read tracking?


What was your velvetg command?


Hi Jeremy. I used the following commands to assemble:

./velveth dir 39,65,4 -fastq -short s_read_1.fq s_read_2.fq
./velvetg dir_43 -read_trkg yes -amos_file yes -unused_reads yes


We would have to see the AFG file. AFG is a pretty weird format in that scaffolds and contigs are treated similarly. Make sure you review the spec carefully. http://biostar.stackexchange.com/questions/10745/questions-about-the-asm-file-produced-by-the-velvet-program

12.3 years ago
Neilfws 49k

My understanding is that unused reads are those reads which are not input to the assembly algorithm, normally because they have failed some quality control test. Singletons, on the other hand, are "good" reads which are put into the assembly but do not overlap any other reads.

As for "contigs with zero reads", I think you must have misunderstood something that you have read. Contigs are, by definition, made up of overlapping reads. There's no such thing as a contig with zero reads.


I don't know what you mean by "assembly transcripts". How were they obtained? Are we talking about transcripts assembled from ESTs?


Hello neilfws, thank you for your answer. The reason I raised the issue of zero-read contigs is that when I mapped the FASTQ reads to assembly transcripts using bowtie2 and counted the reads falling in each contig, I found a significant number of contigs with no reads mapped. Shouldn't I consider them zero-read contigs? I would be thankful for your reply.


Sorry for my typo. I meant the transcripts obtained after assembling the FASTQ reads.
