Help regarding ABySS assembly
5.3 years ago
praasu ▴ 40

Hi,

I have paired-end DNA-seq data from a protist. After trimming with Trimmomatic, I tried to assemble the reads using ABySS 2.1.1. Trimmomatic split the reads into paired reads (forward and reverse) and unpaired reads from each file; I treated the unpaired reads as single-end reads for the ABySS assembly.

The command I used for ABySS:

for k in {150..250..10}; do
    mkdir -p k$k
    cd k$k
    abyss-pe k=$k name=1604 pe='../protista_forward_paired.fastq ../protista_reverse_paired.fastq' se='../protista_forward_unpaired.fastq ../protista_reverse_unpaired.fastq'
    cd ..
done

I got three contig files: protista-1.fa, protista-2.fa, and protista-3.fa.

Complete list of output files:

coverage.hist
protista-bubbles.fa
protista-1.fa
protista-1.dot
protista-1.path
protista-2.dot1
protista-2.fa
protista-2.dot
protista-3.dot
protista-2.path
protista-3.fa
protista-indel.fa
protista-unitigs.fa

I am not sure which contig file I should use for further analysis, including scaffolding. Could someone please help me with this?

Thank you very much in advance.

next-gen Assembly genome SNP sequencing • 2.0k views
5.3 years ago

You can run some stats on each of the .fa FASTA files there to check which one looks the best.

I recommend

stats.sh

from the BBMap package, available in Bioconda. It reports many statistics on the assembly, including the total number of bp, the number of contigs, and N50.
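If you want a quick cross-check of the N50 values without installing anything, here is a minimal sketch in plain shell and awk. The function name `n50` is a hypothetical helper, not part of ABySS or BBMap, and the glob `k*/*.fa` assumes the per-k directory layout from the loop in the question. It counts every contig regardless of length, so the numbers can differ from tools that apply a minimum contig size.

```shell
# Hypothetical helper: print the N50 contig length of a FASTA file.
# N50 is the length L such that contigs of length >= L hold at least
# half of all assembled bases.
n50() {
    # Step 1: emit one sequence length per FASTA record.
    awk '/^>/ { if (len) print len; len = 0; next }
         { len += length($0) }
         END { if (len) print len }' "$1" |
    # Step 2: sort lengths from longest to shortest.
    sort -rn |
    # Step 3: walk down the sorted lengths until half the total is covered.
    awk '{ lens[NR] = $1; total += $1 }
         END {
             for (i = 1; i <= NR; i++) {
                 run += lens[i]
                 if (run * 2 >= total) { print lens[i]; exit }
             }
         }'
}

# Assumed layout: one directory per k value (k150, k160, ...).
for f in k*/*.fa; do
    [ -e "$f" ] || continue   # skip if the glob matched nothing
    printf '%s\t%s\n' "$f" "$(n50 "$f")"
done
```

This prints one tab-separated line per FASTA file, so you can compare the files within a k directory and across k values side by side.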


Hi,

Thank you very much for your reply. I'll check the assembly statistics for each k-mer output and select the best k-mer size based on them. My question is this: I used both paired-end and unpaired data for the ABySS assembly and got three contig files (protista-1.fa, protista-2.fa, protista-3.fa), whereas I was expecting a single file. So I want to know which of those files, or the unitigs file (protista-unitigs.fa), I should use for scaffolding.


Meanwhile, I just noticed that the contig file (protista-3.fa) and the unitigs file (protista-unitigs.fa) are the same. They have significantly higher N50, N80, and other statistics. Thank you very much.
