Any Possible Method/Tool That Could Report The Number Of Overlapping Sequences Used To Build Each Contig?
10.4 years ago
bambus0725 ▴ 50

Hi,

I work with metatranscriptomics data (sequenced using Illumina technology). I did a de novo transcriptome assembly with the SOAPdenovo-Trans assembler, and I am now looking for a tool that can report the total number of overlapping reads used to build each contig, so that I can judge how good or bad the coverage is.

Any suggestions would be helpful.

Thank you in advance.

biology assembly • 2.8k views
10.4 years ago

If you have your contigs and reads in sorted BED format (e.g., myContigs.bed and myReads.bed) and you know your overlap criteria, then you could use BEDOPS bedmap to answer that question:

$ bedmap --echo --count <overlap-options> myContigs.bed myReads.bed > myContigsWithCountOfOverlappingReads.bed

If you leave out <overlap-options>, the default criterion is an overlap of at least one base between read and contig. Otherwise, you can specify the number of overlapping bases with --bp-ovr, or require a fraction of contig or read length with --fraction-ref and --fraction-map, respectively. Other overlap options are also available; they are discussed more fully in the BEDOPS documentation.
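To make the overlap criteria concrete, here is a minimal Python sketch of the counting logic. It is only an illustration and not how BEDOPS is implemented. The function name `count_overlaps` and its parameters `min_bp` and `min_frac_read` are hypothetical; they loosely mirror the ideas behind --bp-ovr and --fraction-map. Coordinates are assumed to be BED-style (0-based, half-open).

```python
def count_overlaps(contigs, reads, min_bp=1, min_frac_read=None):
    """Count reads overlapping each contig.

    contigs, reads: lists of (name, start, end) tuples, BED-style
    (0-based start, end exclusive). A read counts when it shares at
    least min_bp bases with the contig and, if min_frac_read is set,
    when the shared bases cover at least that fraction of the read.
    """
    # Group read intervals by sequence name for quick lookup.
    reads_by_name = {}
    for name, start, end in reads:
        reads_by_name.setdefault(name, []).append((start, end))

    results = []
    for name, c_start, c_end in contigs:
        n = 0
        for r_start, r_end in reads_by_name.get(name, ()):
            # Number of bases shared by the two intervals (<= 0 if disjoint).
            shared = min(c_end, r_end) - max(c_start, r_start)
            if shared < min_bp:
                continue
            if min_frac_read is not None and shared < min_frac_read * (r_end - r_start):
                continue
            n += 1
        results.append((name, c_start, c_end, n))
    return results


# Toy data, made up for illustration only.
contigs = [("contig1", 0, 100), ("contig2", 0, 50)]
reads = [
    ("contig1", 0, 30),
    ("contig1", 90, 120),
    ("contig1", 99, 130),
    ("contig2", 10, 40),
]

for row in count_overlaps(contigs, reads):
    print(*row, sep="\t")
```

Raising `min_bp` or setting `min_frac_read` drops the reads that only touch the edge of a contig, which is the same trade-off you make when choosing overlap options for bedmap.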

Thank you for the comment, Alex.

The problem is that my data is in FASTA format. Is there a way to convert a FASTA file into a format that BEDOPS can accept, such as BAM/SAM? Would that work?

It will depend on your FASTA file and whether it already contains coordinate and chromosome information (in the header, for instance). If not, you'll need to align your sequences to a reference genome to turn them into BAM, SAM or PSL, and convert from there into BED with a conversion script (such as those in BEDOPS).
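For the first case, where the coordinates are already in the header, a rough Python sketch of the conversion might look like the following. The header layout `>seqID chrom:start-end` with 1-based inclusive coordinates is purely an assumption for illustration, and `fasta_headers_to_bed` is a hypothetical helper, so adjust the pattern to whatever your headers actually contain.

```python
import re

# Assumed (hypothetical) header layout: ">seqID chrom:start-end"
HEADER_RE = re.compile(r"^>(\S+)\s+(\S+):(\d+)-(\d+)")


def fasta_headers_to_bed(lines):
    """Turn FASTA header lines into (chrom, start, end, id) BED records.

    Headers without coordinates are skipped. Coordinates are assumed to be
    1-based inclusive and are converted to BED's 0-based, half-open form.
    """
    bed = []
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        match = HEADER_RE.match(line)
        if match is None:
            continue  # header carries only an ID, nothing to convert
        seq_id, chrom, start, end = match.groups()
        bed.append((chrom, int(start) - 1, int(end), seq_id))
    return bed


# Toy input, made up for illustration only.
example = [
    ">read1 chr1:101-150",
    "ACGTACGT",
    ">read2",
    "GGCCGGCC",
    ">read3 chr2:1-36",
    "TTAATTAA",
]

for record in fasta_headers_to_bed(example):
    print(*record, sep="\t")
```

The resulting file would still need to be sorted (for example with BEDOPS sort-bed) before being passed to bedmap.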

No, actually the header line has only the sequence ID, and for most of the organisms no reference genome exists yet, so I guess I can't do that. That is in fact the reason I did a de novo transcriptome assembly.
