Hi,
I'm using samtools/1.2 and bcftools/1.2.
I'm having an issue similar to https://github.com/samtools/bcftools/issues/50 (none of the replies there solves my problem):
samtools mpileup -uf ref.fa my.bam | bcftools call -c - | vcfutils.pl vcf2fq > my.fq
I'm getting all nnnnnnnnn
and !!!!!!!!!!!!!!!!!!
in the final fq file.
Is something wrong with vcfutils.pl itself? I googled around, and it seems other people have the same question, but I found no solution.
How can I get a correct fastq file now?
P.S. Besides vcfutils.pl, I also tried bcftools consensus, and it ran fine. My problem is that my bam file is supposed to contain some missing data. Since the consensus is built against the human reference genome, I guess all the missing/low-quality sites are filled in with the reference allele? (Is this a dead end even if it works? Also, I already have the vcf file I want, so I don't need to generate it from the bam file myself.)
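One way to keep uncovered sites from silently taking the reference base is to mask them explicitly. The sketch below is only an illustration under stated assumptions: it reads per-position depths in the three-column layout that samtools depth prints (chrom, 1-based position, depth), treats positions absent from that output as depth 0, and returns BED intervals below a hypothetical min_depth cutoff. The function name low_coverage_bed and the chromosome lengths are made up for the example. Recent bcftools releases document a --mask option for bcftools consensus that replaces listed regions with N; whether version 1.2 already has it should be checked in the usage output first.

```python
def low_coverage_bed(depth_lines, chrom_lengths, min_depth=1):
    """Return BED intervals (chrom, start0, end) where depth < min_depth.

    depth_lines: iterable of tab-separated "chrom pos depth" strings,
                 with 1-based positions.
    chrom_lengths: dict mapping chrom -> length. Positions missing from
                   depth_lines are treated as depth 0 (assumption of this sketch).
    """
    # Collect the positions that DO have enough coverage.
    covered = {chrom: set() for chrom in chrom_lengths}
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        chrom, pos, depth = line.split("\t")[:3]
        if chrom in covered and int(depth) >= min_depth:
            covered[chrom].add(int(pos))

    # Walk each chromosome and merge consecutive low-coverage positions
    # into 0-based, half-open BED intervals.
    intervals = []
    for chrom, length in chrom_lengths.items():
        start = None
        for pos in range(1, length + 1):
            if pos not in covered[chrom]:
                if start is None:
                    start = pos - 1
            elif start is not None:
                intervals.append((chrom, start, pos - 1))
                start = None
        if start is not None:
            intervals.append((chrom, start, length))
    return intervals
```

Each tuple can be written out as a tab-separated chrom/start/end line to form the mask file. Holding a set of positions per chromosome is fine for a small region but too memory-hungry for a whole human genome, where a streaming version would be needed.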
Thanks a lot!
Did you check the output from bcftools call -c? Something like:
samtools mpileup -uf ref.fa my.bam | bcftools call -c - -o output.vcf -O v
Hi, I checked the output vcf from bcftools, and it looks fine. But indeed, it doesn't distinguish missing data from everything else (or is that simply how it works?). All sites without an alternative allele are shown as reference alleles. So I was wondering whether I should add -g INT, but then it only outputs variable sites, so it still doesn't solve the problem.
Well, I guess you need to look at your file again. You should be seeing sequences interspersed among the Ns. The ?! characters at the end are quality scores.
No, there are no sequences between the N and ?! runs. I checked what a normal fastq file should look like, and mine is not like that. The generated fastq file is like:
Only Ns in the entire file? What I got were Ns with contiguous sequences and quality scores in between, and !! ?? at the end. Because this fastq is built from a VCF, I expected it to contain Ns and low scores in addition to the bases in the VCF. The following is the command I ran, and it seems to work for me:
Let me update on this again: fastq validation is failing. I guess the perl script is writing the entire sequence and its scores on two lines instead of four.
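A quick way to see what the output actually contains, without relying on a strict validator, is to parse it leniently. The sketch below assumes the file is multi-line FASTQ, meaning the sequence and quality strings of one record may be wrapped over several lines (which could be why validators expecting exactly four lines per record reject it). read_multiline_fastq and n_fraction are hypothetical helper names, not part of samtools or bcftools.

```python
def read_multiline_fastq(lines):
    """Parse FASTQ records whose sequence/quality may span several lines.

    Returns a list of (name, sequence, quality) tuples.
    """
    records = []
    it = iter(lines)
    line = next(it, None)
    while line is not None:
        line = line.rstrip("\n")
        if not line:                      # skip blank lines between records
            line = next(it, None)
            continue
        if not line.startswith("@"):
            raise ValueError("expected '@' header, got: %r" % line)
        name = line[1:]

        # Sequence lines run until the '+' separator line.
        seq_parts = []
        line = next(it, None)
        while line is not None and not line.startswith("+"):
            seq_parts.append(line.strip())
            line = next(it, None)
        if line is None:
            raise ValueError("record %r has no '+' separator" % name)
        seq = "".join(seq_parts)

        # Quality lines are read by length, because a quality line may
        # itself begin with '@' or '+'.
        qual_parts, qual_len = [], 0
        while qual_len < len(seq):
            line = next(it, None)
            if line is None:
                raise ValueError("record %r has truncated quality" % name)
            chunk = line.rstrip("\n")
            qual_parts.append(chunk)
            qual_len += len(chunk)
        qual = "".join(qual_parts)
        if len(qual) != len(seq):
            raise ValueError("record %r: sequence/quality length mismatch" % name)

        records.append((name, seq, qual))
        line = next(it, None)
    return records


def n_fraction(seq):
    """Fraction of bases in seq that are n or N (0.0 for an empty sequence)."""
    return sum(c in "nN" for c in seq) / len(seq) if seq else 0.0
```

Running n_fraction over each record of my.fq shows whether a sequence really is all N or whether called bases are scattered between long N runs.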
Yes, I'm getting all N, all "!" and all "~". Something must be wrong with either vcfutils.pl itself, my input bam file, or the bcf file generated by mpileup.
Your command is the same as the one I ran. Would you mind telling me which version of samtools you are using? Thanks!