Generate Consensus sequence from BAM file
3
0
Entering edit mode
7.2 years ago
SOHAIL ▴ 400

Hi everyone, I want to generate diploid consensus sequence for one human individual from BAM file for PSMC analysis purposes. my command line is:

samtools mpileup -C50 -uf REF.fa BQSR_AAGC0220.bam | bcftools view -c - | perl vcfutils.pl vcf2fq -d 8 -D 100 | gzip > AAGC0220.fq.gz

The command run for two days on my server computers and after that i got an error message:

Error: Could not parse --min-ac -

Use of uninitialized value in length at /leofs/zengchq_group/sohail/softwares/bcftools-1.2//vcfutils.pl line 565.

Use of uninitialized value in length at /leofs/zengchq_group/sohail/softwares/bcftools-1.2//vcfutils.pl line 565.

[mpileup] 1 samples in 1 input files

<mpileup> Set max per-file depth to 8000

I am using BAM file that came out after BQSR in GATK pipeline.. does it has any impact or am i making any mistake in command-line?? i am using samtools-1.3.x, and bcftools-1.3.x

Can anyone please guide me how to solve this??

Thanks!

samtools mpileup psmc • 12k views
ADD COMMENT
2
Entering edit mode
7.2 years ago
Dan ▴ 530

This question was already asked/answered here: Generate consensus from BAM file

If you still have errors, you should reword your question ;-)

ADD COMMENT
0
Entering edit mode

Hi @Dan, thanks for the answer.. can you suggest any way to run samtools faster?

ADD REPLY
1
Entering edit mode
7.2 years ago
shengweima ▴ 60
samtools mpileup -uf chr5B_leaf_rust.fasta M27454_5B_reads_sorted.bam | bcftools call -c | vcfutils.pl vcf2fq

My command. You can make test using a small data

ADD COMMENT
0
Entering edit mode

Hi @Shengweima thanks for the help.. it works on small dataset.. can you please give an idea how much time it takes normally for 30X genome.? i am running my job from last 20 hours..

ADD REPLY
0
Entering edit mode
7.2 years ago
SOHAIL ▴ 400

Hi, I had a problem in running samtools/bcftools while generating consensus sequence. my command line was:

samtools mpileup -C50 -uf REF.fa BQSR_AAGC0220.bam | bcftools view -c - | perl vcfutils.pl vcf2fq -d 8 -D 100 | gzip > AAGC0220.fq.gz

later as i learnt bcftools modified the options for generating consensus sequences. my command-line is:

samtools mpileup -C50 -uf REF.fa snippet.bam | bcftools call -c  | perl vcfutils.pl vcf2fq -d 8 -D 100 | gzip > snippet.fq.gz

It runs smoothly with samtools-1.3.x and bcftools1.3.x with input files coming out from BQSR/picards.. Please note that older versions of samtools (like samtools-0.7.x) not support the picard or BQSR inputs and prompts an error message regarding the header of the file.

However i couldn't figure it out yet that how much it will take time for deeply sequenced genomes..

Hope it helps for the rest of the new people like me..

Thanks!

ADD COMMENT
0
Entering edit mode

Hi sohail, I have 6 wheat samples sequenced using RNA-seq. I received forward and reverse fastq files and I generated bam files by using hisat2 tool which are aligned with the reference wheat genome. I have been asked to build multiple sequence alignment for 3 genes from this sequenced rna seq data. I believe I need to select a gene sequence from all the samples and do a multiple sequence alignment. But I am struck in fetching the gene sequence from bam files. How do I select one gene sequence for all the 6 samples? Any suggestions? kindly send me commands for fetching the sequence of genes from this RNA seq data in fastq file or aligned.bam file.

ADD REPLY
0
Entering edit mode

Have you solved this? I am also trying to do the same with no luck :/

ADD REPLY

Login before adding your answer.

Traffic: 2771 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6