Annotating a Mycobacterium tuberculosis VCF file using snpEff and ANNOVAR
5.2 years ago
S AR ▴ 80

Hi,

I generated my VCF files with the GATK pipeline using ploidy 1, since this is a Mycobacterium tuberculosis genome. Now I want to annotate my variants using snpEff and ANNOVAR. I searched the snpEff databases for an MTB annotation using:

java -jar snpEff.jar download -v Mycobacterium_tuberculosis

It gave me numerous results, showing that snpEff contains MTB databases. But I'm not sure which one corresponds to the reference that I used to generate the VCF file. My MTB reference genome file looks like this:

>M.tuberculosis_H37Rv NC_000962.3
ttgaccgatgaccccggttcaggcttcaccacagtgtggaacgcggtcgtctccgaacttaacggcgaccctaaggttgacgacggacccagcagtgatgctaatctcagcgctccgctgacccctcagcaaagggcttggctcaatctcgtccagccattgaccatcgtcgaggggtttgctctgttatccgtgccgagcagctttg.............................
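Since the problem hinges on which sequence name each file uses, a minimal sketch for listing them side by side may help. It assumes an uncompressed FASTA and VCF; the function names `fasta_names` and `vcf_names` and the file names in the usage comment are placeholders, not part of snpEff or GATK.

```shell
#!/bin/sh
# List sequence names in a FASTA file: the first word after '>' on each header line.
fasta_names() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}'
}

# List the distinct CHROM values used in a VCF file (column 1 of non-header lines).
vcf_names() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# Usage (file names are placeholders):
#   fasta_names reference.fasta
#   vcf_names variants.vcf
```

For the header shown above, `fasta_names` would print M.tuberculosis_H37Rv, while NC_000962.3 is only the description part of that header line.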

I tried the buildDbNcbi.sh script from snpEff to build my own database, but it produced the following error:

Downloading genome NC_000962
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17.7M    0 17.7M    0     0   157k      0 --:--:--  0:01:55 --:--:--  483k
00:00:00        SnpEff version SnpEff 4.3t (build 2017-11-24 10:18), by Pablo Cingolani
00:00:00        Command: 'build'
00:00:00        Building database for 'NC_000962'
00:00:00        Reading configuration file 'snpEff.config'. Genome: 'NC_000962'
00:00:00        Reading config file: /home/sark/snpEff/snpEff.config
00:00:01        done
No sequence found in feature file.
        Trying fasta file '/home/sark/snpEff/./data/genomes/NC_000962.fa'
        Trying fasta file '/home/sark/snpEff/./data/NC_000962/sequences.fa'
java.lang.RuntimeException: Cannot find sequence for 'NC_000962'
        at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryFeatures.sequence(SnpEffPredictorFactoryFeatures.java:467)
        at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryFeatures.addFeatures(SnpEffPredictorFactoryFeatures.java:111)
        at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryFeatures.create(SnpEffPredictorFactoryFeatures.java:330)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:369)
        at org.snpeff.SnpEff.run(SnpEff.java:1183)
        at org.snpeff.SnpEff.main(SnpEff.java:162)
java.lang.RuntimeException: Error reading file '/home/sark/snpEff/./data/NC_000962/genes.gbk'
java.lang.RuntimeException: Cannot find sequence for 'NC_000962'
        at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryFeatures.create(SnpEffPredictorFactoryFeatures.java:344)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:369)
        at org.snpeff.SnpEff.run(SnpEff.java:1183)
        at org.snpeff.SnpEff.main(SnpEff.java:162)
00:00:01        Logging
00:00:02        Checking for updates...
00:00:04        Done.
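The log above says that no sequence was found in the feature file and then lists two FASTA paths that were tried. A rough sketch for checking those inputs before rerunning the build is below. It assumes that a GenBank file carrying sequence has an ORIGIN section, and that the fallback FASTA locations are the two printed in the log; `check_build_inputs` is a made-up helper name, not a snpEff command.

```shell
#!/bin/sh
# Report which inputs a GenBank-based snpEff build would find for one genome.
# $1 = snpEff data directory, $2 = genome name as used in snpEff.config
check_build_inputs() {
    data_dir=$1
    genome=$2
    gbk="$data_dir/$genome/genes.gbk"

    if [ ! -s "$gbk" ]; then
        echo "genes.gbk: missing or empty"
    elif grep -q '^ORIGIN' "$gbk"; then
        echo "genes.gbk: has sequence (ORIGIN section)"
    else
        echo "genes.gbk: no sequence (no ORIGIN section)"
    fi

    # Fallback FASTA locations, as printed in the log above.
    for fa in "$data_dir/genomes/$genome.fa" "$data_dir/$genome/sequences.fa"; do
        if [ -s "$fa" ]; then
            echo "fasta: $fa starts with $(head -n 1 "$fa")"
        else
            echo "fasta: $fa not found"
        fi
    done
}

# Usage (path taken from the log above):
#   check_build_inputs /home/sark/snpEff/data NC_000962
```

If a FASTA file is found, its header line is printed so it can be compared with the genome name; whether snpEff needs the two to match exactly is something I have not confirmed.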

Then I put my FASTA file in the folder mentioned in the error above, but now it gives the following error:

Downloading genome NC_000962.3
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17.7M    0 17.7M    0     0   332k      0 --:--:--  0:00:54 --:--:--  447k
curl: (16) Error in the HTTP2 framing layer

Then I thought of using a built-in database for MTB, so I renamed the chromosome in my file. In my file it is M.tuberculosis_H37Rv, and I replaced it with the name from the built-in database, ERS007734SCcontig000001. Still no success.
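For reference, the renaming step could be sketched as below. It rewrites both the CHROM column and a matching ##contig header line, reading the VCF on stdin. `rename_chrom` and `TARGET_NAME` are placeholders; the correct target is whatever sequence name the chosen database uses, which is not established here. Renaming can only work if that database was built from the same assembly as the reference used for variant calling, because otherwise the coordinates will not line up.

```shell
#!/bin/sh
# Rename one chromosome in a VCF: the CHROM column and any matching
# ##contig=<ID=...> header line. Reads the VCF on stdin, writes to stdout.
# $1 = old name, $2 = new name
rename_chrom() {
    awk -v old="$1" -v new="$2" 'BEGIN { FS = OFS = "\t" }
        /^##contig=<ID=/ {
            prefix = "##contig=<ID=" old
            rest = substr($0, length(prefix) + 1)
            # Only rewrite when the ID is exactly the old name.
            if (index($0, prefix) == 1 && rest ~ /^[,>]/) $0 = "##contig=<ID=" new rest
            print; next
        }
        /^#/ { print; next }
        $1 == old { $1 = new }
        { print }'
}

# Usage (TARGET_NAME and file names are placeholders):
#   rename_chrom M.tuberculosis_H37Rv TARGET_NAME < variants.vcf > renamed.vcf
```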

snpEff generates the following error for every variant in the VCF file:

9;ANN=A||MODIFIER|||||||||||||ERROR_OUT_OF_CHROMOSOME_RANGE
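As I understand it, ERROR_OUT_OF_CHROMOSOME_RANGE means the variant position lies beyond the end of the sequence the database holds under that name. A small sketch for comparing the two sides is below. It assumes there is a FASTA for the sequence in question and an uncompressed VCF; `fasta_lengths` and `vcf_max_pos` are made-up helper names.

```shell
#!/bin/sh
# Print the length of each sequence in a FASTA file as "name<TAB>length".
fasta_lengths() {
    awk '/^>/ { if (name != "") print name "\t" len
                name = substr($1, 2); len = 0; next }
              { len += length($0) }
         END  { if (name != "") print name "\t" len }' "$1"
}

# Print the largest POS seen for each CHROM in a VCF as "name<TAB>max_pos".
vcf_max_pos() {
    awk -F '\t' '!/^#/ { if ($2 + 0 > max[$1]) max[$1] = $2 + 0 }
         END { for (c in max) print c "\t" max[c] }' "$1" | sort
}

# Usage (file names are placeholders):
#   fasta_lengths reference.fasta
#   vcf_max_pos variants.vcf
```

If the largest position for a name exceeds that sequence's length, the variants and the sequence most likely come from different assemblies.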

Can anyone help me with this, and can anyone tell me how to use ANNOVAR for the same VCF file?

Thank you. :)

SNP annotation snpEff Annovar MTB • 1.9k views