Biostar Beta. Not for public use.
Updating allele frequency (AF) and minor allele frequency (MAF) INFO fields in .vcf
0
Entering edit mode
2.5 years ago
JSEM • 20

Hi everyone,

I'm processing .vcf files for the first time, and am hoping for some advice with something. I recently filtered a set of .VCFs for a dataset to exclude around half of the subjects, but now want to update the INFO columns in me output files to reflect the stats of the new filtered dataset.

I've tried using fill-an-ac from vcftools, which has successfully updated the AN/AC fields, but this doesn't updated the allele frequency/minor allele frequency fields. I know that I could use the AN/AC values to derive AF/MAF, but I was just wondering if there is an option in one of the existing tools to update these fields automatically, similar to the function of fill-an-ac?

I'd really appreciate any suggestions you might have.

Thanks in advance!

ADD COMMENTlink
1
Entering edit mode
14 months ago
France/Nantes/Institut du Thorax - INSE…

using vcffilterjdk: http://lindenb.github.io/jvarkit/VcfFilterJdk.html

use awk to insert the VCF headers AF and MAF if they're missing.

in vcffilterjdk: collect AN and AC for each alt allele. Map each AC to AF by dividing AC by AN. set the Attribute "AF", get the min (AF) and the set attribute "MAF".

$ gunzip -c  src/test/resources/rotavirus_rf.vcf.gz |\
awk '/^#CHROM/ {printf("##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Min Allele Frequency\">\n##INFO=<ID=AF,Number=A,Type=Float,Description=\"Allele Frequency\">\n");} {print}' |\
java -jar dist/vcffilterjdk.jar -e 'VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<Float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make();'
ADD COMMENTlink
0
Entering edit mode

Thanks for the quick reply. This approach seems like it should be quite straightforward, difficulty is that I'm running this on an HPC cluster, where running java programs is not very straightforward given permissions...etc and modules available on HPC.

Are there any other approaches that are non-java based, or workarounds for HPCs? Thanks again!

ADD REPLYlink
0
Entering edit mode

, where running java programs is not very straightforward given permissions.

why java and not python or whatever ? This program use streaming = doesn't require much memory.

ADD REPLYlink
0
Entering edit mode
16 months ago
GK1610 • 60
United States

when I run this , I get the following error

VCF=unzipped_vcf_file awk '/^#CHROM/ {printf("##INFO=<id=maf,number=1,type=float,description=\"min allele="" frequency\"="">\n##INFO=<id=af,number=a,type=float,description=\"allele frequency\"="">\n");} {print}' $VCF |\ java -jar /sc/orga/projects/psychencode/kiran/jvarkit/dist/vcffilterjdk.jar -e 'VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make();' > temp.vcf

[DEBUG][VcfFilterJdk] Compiling : 1 import java.util.; 2 import java.util.stream.; 3 import java.util.function.; 4 import htsjdk.samtools.util.; 5 import htsjdk.variant.variantcontext.; 6 import htsjdk.variant.vcf.; 7 @javax.annotation.Generated(value="VcfFilterJdk",date="2019-02-21T17:30:17-0500") 8 public class VcfFilterJdkCustom1164653960 extends com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.AbstractFilter { 9 public VcfFilterJdkCustom1164653960(final VCFHeader header) { 10 super(header); 11 } 12 @Override 13 public Object apply(final VariantContext variant) { 14 /* user's code starts here */ 15 VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make(); 16 /* user's code ends here */ 17 } 18 } [WARN][SnpEffPredictionParser]no INFO[EFF] or no description. This VCF was probably NOT annotated with SnpEff (old version). But it's not a problem if this tool doesn't need to access SnpEff Annotations. [WARN][VepPredictionParser]NO INFO[CSQ] found in header. This VCF was probably NOT annotated with VEP. But it's not a problem if this tool doesn't need to access VEP Annotations. [WARN][AnnPredictionParser]no INFO[ANN] or no description This VCF was probably NOT annotated with SnpEff(ANN version) . But it's not a problem if this tool doesn't need to access SnpEff Annotations. [SEVERE][Launcher]Contig 22 does not have a length field. htsjdk.tribble.TribbleException: Contig 22 does not have a length field. at htsjdk.variant.vcf.VCFContigHeaderLine.getSAMSequenceRecord(VCFContigHeaderLine.java:80) at htsjdk.variant.vcf.VCFHeader.getSequenceDictionary(VCFHeader.java:206) at com.github.lindenb.jvarkit.util.picard.SAMSequenceDictionaryProgress.<init>(SAMSequenceDictionaryProgress.java:259) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.doVcfToVcf(VcfFilterJdk.java:800) at com.github.lindenb.jvarkit.util.jcommander.Launcher.doVcfToVcf(Launcher.java:563) at com.github.lindenb.jvarkit.util.jcommander.Launcher.doVcfToVcf(Launcher.java:583) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.doWork(VcfFilterJdk.java:826) at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:736) at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:894) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.main(VcfFilterJdk.java:841) [INFO][Launcher]vcffilterjdk Exited with failure (-1)

ADD COMMENTlink
0
Entering edit mode

Contig 22 does not have a length field.

there is an error in your header ##contig=<ID=22...> ...

ADD REPLYlink
0
Entering edit mode
14 months ago
seta ♦ 1.2k
Sweden

hi all I have a problem similar JSEM, which AF doesn't updated , is there any other command.

thanks in advance

ADD COMMENTlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.1