Biostar Beta. Not for public use.
Question: Updating allele frequency (AF) and minor allele frequency (MAF) INFO fields in .vcf
0
Entering edit mode

Hi everyone,

I'm processing .vcf files for the first time, and am hoping for some advice with something. I recently filtered a set of .VCFs for a dataset to exclude around half of the subjects, but now want to update the INFO columns in me output files to reflect the stats of the new filtered dataset.

I've tried using fill-an-ac from vcftools, which has successfully updated the AN/AC fields, but this doesn't updated the allele frequency/minor allele frequency fields. I know that I could use the AN/AC values to derive AF/MAF, but I was just wondering if there is an option in one of the existing tools to update these fields automatically, similar to the function of fill-an-ac?

I'd really appreciate any suggestions you might have.

Thanks in advance!

ADD COMMENTlink 23 months ago JSEM • 20 • updated 9 months ago seta ♦ 1.2k
1
Entering edit mode

using vcffilterjdk: http://lindenb.github.io/jvarkit/VcfFilterJdk.html

use awk to insert the VCF headers AF and MAF if they're missing.

in vcffilterjdk: collect AN and AC for each alt allele. Map each AC to AF by dividing AC by AN. set the Attribute "AF", get the min (AF) and the set attribute "MAF".

$ gunzip -c  src/test/resources/rotavirus_rf.vcf.gz |\
awk '/^#CHROM/ {printf("##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Min Allele Frequency\">\n##INFO=<ID=AF,Number=A,Type=Float,Description=\"Allele Frequency\">\n");} {print}' |\
java -jar dist/vcffilterjdk.jar -e 'VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<Float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make();'
ADD COMMENTlink 23 months ago Pierre Lindenbaum 120k
Entering edit mode
0

Thanks for the quick reply. This approach seems like it should be quite straightforward, difficulty is that I'm running this on an HPC cluster, where running java programs is not very straightforward given permissions...etc and modules available on HPC.

Are there any other approaches that are non-java based, or workarounds for HPCs? Thanks again!

ADD REPLYlink 23 months ago
JSEM
• 20
Entering edit mode
0

, where running java programs is not very straightforward given permissions.

why java and not python or whatever ? This program use streaming = doesn't require much memory.

ADD REPLYlink 23 months ago
Pierre Lindenbaum
120k
0
Entering edit mode

when I run this , I get the following error

VCF=unzipped_vcf_file awk '/^#CHROM/ {printf("##INFO=<id=maf,number=1,type=float,description=\"min allele="" frequency\"="">\n##INFO=<id=af,number=a,type=float,description=\"allele frequency\"="">\n");} {print}' $VCF |\ java -jar /sc/orga/projects/psychencode/kiran/jvarkit/dist/vcffilterjdk.jar -e 'VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make();' > temp.vcf

[DEBUG][VcfFilterJdk] Compiling : 1 import java.util.; 2 import java.util.stream.; 3 import java.util.function.; 4 import htsjdk.samtools.util.; 5 import htsjdk.variant.variantcontext.; 6 import htsjdk.variant.vcf.; 7 @javax.annotation.Generated(value="VcfFilterJdk",date="2019-02-21T17:30:17-0500") 8 public class VcfFilterJdkCustom1164653960 extends com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.AbstractFilter { 9 public VcfFilterJdkCustom1164653960(final VCFHeader header) { 10 super(header); 11 } 12 @Override 13 public Object apply(final VariantContext variant) { 14 /* user's code starts here */ 15 VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make(); 16 /* user's code ends here */ 17 } 18 } [WARN][SnpEffPredictionParser]no INFO[EFF] or no description. This VCF was probably NOT annotated with SnpEff (old version). But it's not a problem if this tool doesn't need to access SnpEff Annotations. [WARN][VepPredictionParser]NO INFO[CSQ] found in header. This VCF was probably NOT annotated with VEP. But it's not a problem if this tool doesn't need to access VEP Annotations. [WARN][AnnPredictionParser]no INFO[ANN] or no description This VCF was probably NOT annotated with SnpEff(ANN version) . But it's not a problem if this tool doesn't need to access SnpEff Annotations. [SEVERE][Launcher]Contig 22 does not have a length field. htsjdk.tribble.TribbleException: Contig 22 does not have a length field. at htsjdk.variant.vcf.VCFContigHeaderLine.getSAMSequenceRecord(VCFContigHeaderLine.java:80) at htsjdk.variant.vcf.VCFHeader.getSequenceDictionary(VCFHeader.java:206) at com.github.lindenb.jvarkit.util.picard.SAMSequenceDictionaryProgress.<init>(SAMSequenceDictionaryProgress.java:259) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.doVcfToVcf(VcfFilterJdk.java:800) at com.github.lindenb.jvarkit.util.jcommander.Launcher.doVcfToVcf(Launcher.java:563) at com.github.lindenb.jvarkit.util.jcommander.Launcher.doVcfToVcf(Launcher.java:583) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.doWork(VcfFilterJdk.java:826) at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:736) at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:894) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.main(VcfFilterJdk.java:841) [INFO][Launcher]vcffilterjdk Exited with failure (-1)

ADD COMMENTlink 12 months ago GK1610 • 60
Entering edit mode
0

Contig 22 does not have a length field.

there is an error in your header ##contig=<ID=22...> ...

ADD REPLYlink 12 months ago
Pierre Lindenbaum
120k
0
Entering edit mode

hi all I have a problem similar JSEM, which AF doesn't updated , is there any other command.

thanks in advance

ADD COMMENTlink 9 months ago seta ♦ 1.2k

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.0