VEP (SV annotation): how to choose the best overlapping reference SVs?
3.4 years ago
c.c.a. • 0

Hello biostars community,

I am using VEP (version 100) to annotate a VCF containing SVs with the AF values from a reference VCF. I am running VEP with the following parameters:

vep \
    --cache \
    --offline \
    --dir_cache ${cache} \
    --cache_version 98 \
    -i $INPUT \
    -o $OUTPUT \
    --$OFMT \
    --no_stats \
    --force_overwrite \
    --fork 4 \
    --compress_output bgzip \
    --regulatory \
    --symbol \
    --custom ${gnomadSVpath},gnomadSV01,vcf,exact,0,EUR_AF \
    --plugin StructuralVariantOverlap,file=$gnomadSVpath,cols=EUR_AF,match_type=surrounding,label=gnomadSV02

For the reference SVs, I am using gnomAD SV v2 (lifted over to hg38):

21      45802812        nssv15966780    A       <DUP>   .       .       DBVARID;SVTYPE=DUP;END=46109638;SVLEN=306827;EXPERIMENT=1;SAMPLESET=1;REGIONID=nsv4284975;AC=1;AFR_AC=1;AMR_AC=0;EAS_AC=0;EUR_AC=0;OTH_AC=0;AF=4.6e-05;AFR_AF=0.000105;AMR_AF=0;EAS_AF=0;EUR_AF=0;OTH_AF=0;AN=21694;AFR_AN=9534;AMR_AN=1930;EAS_AN=2416;EUR_AN=7624;OTH_AN=190
21      46008428        nssv15966790    C       <DUP>   .       .       DBVARID;SVTYPE=DUP;END=46181422;SVLEN=172995;EXPERIMENT=1;SAMPLESET=1;REGIONID=nsv4273892;AC=2;AFR_AC=1;AMR_AC=0;EAS_AC=0;EUR_AC=1;OTH_AC=0;AF=9.2e-05;AFR_AF=0.000105;AMR_AF=0;EAS_AF=0;EUR_AF=0.000131;OTH_AF=0;AN=21694;AFR_AN=9534;AMR_AN=1930;EAS_AN=2416;EUR_AN=7624;OTH_AN=190

After annotation I get the following results:

chr21   46059888        chr21_46059889_deletion N       <DEL>   .       .       END=46059946;CSQ=deletion|downstream_gene_variant|MODIFIER|AP001476.2|ENSG00000226115|Transcript|ENST00000435738|lncRNA|||||||||||2322|1||Clone_based_ensembl_gene||||||||0&0.000131|100&100|nssv15966780&nssv15966790||,deletion|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000665311|||||||||||||||||||||||0&0.000131|100&100|nssv15966780&nssv15966790||

where 0&0.000131 are the EUR_AF values from nssv15966780 and nssv15966790, respectively. Filtering (for example with filter_vep) would be much easier if only a single EUR_AF value were assigned.
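To make the "single value" idea concrete, here is a minimal Python sketch of collapsing an `&`-joined annotation into one number. The function name `collapse_af` and the choice between maximum and minimum are my own assumptions for illustration; nothing here is part of VEP itself.

```python
def collapse_af(value, how="max"):
    """Collapse an '&'-joined annotation such as '0&0.000131' into one float.

    Returns None when the field is empty or holds no numeric value.
    `how` is a hypothetical switch: "max" keeps the largest value,
    anything else keeps the smallest.
    """
    nums = []
    for part in value.split("&"):
        try:
            nums.append(float(part))
        except ValueError:
            continue  # skip empty strings or placeholders like '.'
    if not nums:
        return None
    return max(nums) if how == "max" else min(nums)
```

Taking the maximum is the conservative choice when the goal is to keep only rare SVs, since a variant is then dropped as soon as any overlapping reference SV is common.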

For instance, suppose I get two EUR_AF values, say 1 and 0.001, but I only want to keep SVs with EUR_AF < 0.1 in the filtering step. What is the general approach? Should I first filter the reference VCF containing the AF values, then annotate the query VCF and filter with filter_vep? Or should I write a custom script that keeps the SV if at least one of the AF values meets the filtering criterion (EUR_AF < 0.1)?
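The custom-script option could look roughly like the Python sketch below. It reads the CSQ sub-field order from the VCF header, collects every numeric value of one sub-field across all CSQ entries of a record, and applies either an "all values below threshold" or an "any value below threshold" policy. The sub-field name passed in (for example `gnomadSV01_EUR_AF`), the `policy` switch, and the decision to keep records with no overlapping reference SV are assumptions of this sketch, so check the field name against your own header.

```python
def csq_fields(header_lines):
    """Return CSQ sub-field names from the '##INFO=<ID=CSQ,...Format: a|b|c">' header line."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ"):
            fmt = line.split("Format: ", 1)[1]
            return fmt.rstrip().rstrip('">').split("|")
    raise ValueError("no CSQ header line found")


def record_afs(record, fields, field_name):
    """Collect every numeric value of field_name across all CSQ entries of one VCF record."""
    info = record.rstrip("\n").split("\t")[7]
    idx = fields.index(field_name)
    afs = []
    for item in info.split(";"):
        if not item.startswith("CSQ="):
            continue
        for entry in item[4:].split(","):
            cols = entry.split("|")
            if idx >= len(cols):
                continue
            for part in cols[idx].split("&"):
                try:
                    afs.append(float(part))
                except ValueError:
                    pass  # empty or non-numeric value
    return afs


def keep_record(record, fields, field_name, threshold=0.1, policy="all"):
    """Decide whether a record passes the AF filter.

    policy="all": every overlapping reference SV must be below threshold.
    policy="any": one overlapping reference SV below threshold is enough.
    Records with no AF value are kept (an assumption: no overlap is treated as rare).
    """
    afs = record_afs(record, fields, field_name)
    if not afs:
        return True
    test = all if policy == "all" else any
    return test(af < threshold for af in afs)
```

With the values from the example (1 and 0.001) and a threshold of 0.1, `policy="all"` drops the record while `policy="any"` keeps it, so the choice of policy is really the question being asked here.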

sv structural variants vep • 1.1k views