Question: Unexpected characters in consensus FASTA generated by mpileup

Hi all,

I'm using the samtools mpileup command on Linux (compiler: Ubuntu 4.8.4, samtools version 1.6-5-gfe1a2e9) to generate a consensus FASTA with the following command:

samtools mpileup -uf human_nanogv2.fa --bam-list bam_list | bcftools call -c | vcfutils.pl vcf2fq > consensus.fa

This command creates a consensus FASTA, but it contains characters other than A, T, C and G, such as M, R, Y, W and S. A sample of the generated consensus sequence is below:

AAGAMACAGTCTCGGGCCGGGCGTGGTGG
CTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATYRCCTGAGGTCAGG
AGTTCGAGACCAGCCTGGSCAACAYGGTGAAACCCCCATCTCTACTAAAATACAAAAAAT
TAGCTGGGCGTGGTGGCATGCGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAGGAGA
ATTGCTTGAACCCGGGAGGYGGAGGTCAGTGAGCTGAGATTGCACCACTGCACTCCAGCC
TGGGCGACAGAGCGAGACTTCTGTCTCAAAAAGAAAAAAAAAGAAGATGCTTATCATGGG
CCGGGCGCAGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGGCAGGCGGATC
ACCTGAGGTCAGGAGTTCAAGACCAGCCTGGCCAACATAGTGAAACCCTGTCTCTACTAA
AAATACAAGAAAATTAGCTGGGCATGGTGGCRCGTGCCTGTAGTCCCAGCTACTTGGGAG
GCTGAGGCAGGAGAATCACTTGAACCCAGGAGGTGGAGGTTGCAGTGAGCCGAGATTGCG
CCACTGCACTCCAGCNTGGGCAACAGAGTGAGACTCTGTCTCAGAAAAAAAAAAAAAAAA
AAAAAAAAAGATGCCTATGGCCGGGCGAAGTGTCTCACACCTGTAATCCCAGCATTTTGG
GAGGCCAAGGCGGCTAGATCACTTGAGGTCAGGAGTTCAAGACCAGCCTGGCCAACATGG
TGAAACACTGTCTCTACTAAAAATACAAAGAATTAGCTAGGCATGGTAGCGGGTGCCTGT
AATCACACCTACTCAGGAAGCTGAGGNNNNNNNNTCTTTTTTTCTTTTTTTTTTGAGACA
GAGTTTTGCTCTTGTTGCCCAGGCTRGAGTGCARTGGCRYGATCTTGGCTCACYGCAACC
TCCRCCTCCCRGGTTCAAGTGATTCTCCTGCCTCAGCCTCCCRAGTAGCTGGGATTACAG
GCATGYGCCACCACGCCCRGCTAATTTTGTATTTTTAGTAGAGACGGGGTTTCWCCATGT
TGGYCAGGCTGGTCTYGAACTCCTGACCTCAGGTGATCCACCYRCCTCRGCCTC

I was wondering whether this is an mpileup issue and my FASTA is corrupted, or whether these letters indicate indels or something similar. However, I could not find any documentation about such letters online. I would be very grateful for any help.

Best regards, Gökberk

These are standard IUPAC codes for ambiguous bases (for example, R means A or G, and Y means C or T), and they are the result of your vcf2fq call.
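
To make the codes concrete, here is a minimal Python sketch that lists which bases each IUPAC letter can stand for and reports where such letters occur in a sequence. The function name `ambiguous_positions` is made up for illustration and is not part of samtools, bcftools or vcfutils.pl; the sequence used is the first line of the sample in the question.

```python
# IUPAC nucleotide codes: each letter stands for a set of possible bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def ambiguous_positions(seq):
    """Return (0-based index, code, possible bases) for every non-ACGT character.

    Hypothetical helper for illustration only; characters that are not
    IUPAC nucleotide codes are reported with "?" as their base set.
    """
    hits = []
    for i, ch in enumerate(seq.upper()):
        if ch not in "ACGT":
            hits.append((i, ch, IUPAC.get(ch, "?")))
    return hits


if __name__ == "__main__":
    # First line of the sample sequence from the question.
    for pos, code, bases in ambiguous_positions("AAGAMACAGTCTCGGGCCGGGCGTGGTGG"):
        print(pos, code, bases)  # prints: 4 M AC
```

So the M in the first line of the sample means the position is either A or C; the letters are not signs of file corruption.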

A potential solution to avoid these can be found in the post "Generate consensus sequence from BAM without ambiguity codes".

benformatics, 9 months ago
