VEP tools output
6.1 years ago
NB ▴ 960

Hello, I'm using VEP to annotate human WES data (GRCh37). As many of us know, it reports one row per transcript for each variant.

  1. Can the tool (or a script) report one variant per row, i.e. with all transcripts in a single cell rather than spread over many rows?
  2. Can we restrict the HGVS annotations to known proteins (NP_ IDs) and known mRNAs (NM_ IDs) only?

I tried using ANNOVAR, but its HGVS annotations do not follow the nomenclature for many variants, especially indels.

Thank you

vep ensembl output • 3.8k views
6.1 years ago
Ben_Ensembl ★ 2.4k

Hi Nandini,

  1. This is not possible using either the web interface or the standalone script. However, you can use a number of different filtering options, including --pick: http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#filt

or the filter_vep script to filter the VEP results according to your custom criteria: http://www.ensembl.org/info/docs/tools/vep/script/vep_filter.html#filter_run

  2. Using the VEP script (http://www.ensembl.org/info/docs/tools/vep/script/index.html), you can specify the RefSeq transcript set, and exclude the predicted transcripts using the --refseq and --exclude_predicted options: http://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#refseq

Including the --hgvs option will mean that the HGVS notation is returned in the context of the RefSeq transcripts (excluding the predicted transcripts).
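For the first point, a small post-processing script is one way to get a single row per variant from the tab output. The following is only a minimal sketch, not part of VEP: it assumes the default tab layout (`##` metadata lines, then a header line starting with `#Uploaded_variation`), and the function name `collapse_by_variant`, the `;` separator, and the demo rows are all made up for illustration.

```python
import csv


def collapse_by_variant(lines, key="#Uploaded_variation", sep=";"):
    """Merge per-transcript rows of VEP tab output into one row per variant."""
    # '##' lines are metadata; the single-'#' line that follows is the column header.
    data = [ln for ln in lines if ln.strip() and not ln.startswith("##")]
    reader = csv.DictReader(data, delimiter="\t", quoting=csv.QUOTE_NONE, restval="")
    grouped = {}  # variant id -> {column name -> list of values, one per transcript row}
    for row in reader:
        cols = grouped.setdefault(row[key], {name: [] for name in reader.fieldnames})
        for name in reader.fieldnames:
            cols[name].append(row[name])
    collapsed = []
    for cols in grouped.values():
        record = {}
        for name, values in cols.items():
            # Keep one value if every transcript row agrees; otherwise join the
            # values in row order so joined cells line up transcript by transcript.
            record[name] = values[0] if len(set(values)) == 1 else sep.join(values)
        collapsed.append(record)
    return collapsed


# Made-up demo input (not real VEP results).
demo = """## demo header line
#Uploaded_variation\tLocation\tFeature\tConsequence
var1\t1:100\tNM_000001.1\tmissense_variant
var1\t1:100\tNM_000002.1\tintron_variant
var2\t2:200\tNM_000003.1\tsynonymous_variant
"""

rows = collapse_by_variant(demo.splitlines())
for record in rows:
    print("\t".join(record.values()))
```

Columns that are identical across transcripts (such as the location) stay as a single value, while transcript-specific columns are joined in the original row order.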

I hope this helps.

Best wishes

Ben Ensembl Helpdesk


Thanks Ben. The command I am currently using is as follows:

./vep --cache --dir_cache /Software/ensembl-vep/.vep  --stats_text S39_Run3.html --refseq --hgvs --fork 4 -tab --custom /Software/ensembl-vep/.vep/score.bed.gz,score,bed,exact,0 --custom /Software/ensembl-vep/.vep/EX.bed.gz,EX,bed,exact,0 --pick_allele_gene --exclude_predicted --port 3337 -i S39_Run3.recode.vcf -o S39_Run3.txt

The output I get still includes non-coding RNA (NR_) annotations. Can these be excluded as well? Also, is there a way to annotate the zygosity of each variant in the output?

Thank you.


No problem, very happy to help.

You could do this using the --transcript_filter option, which uses notation and formatting similar to filter_vep.

For adding the zygosity, you can use the --individual option, but this only works with VCF files containing individual genotype data: https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#output
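To illustrate what "individual genotype data" means here, this is a minimal sketch of deriving zygosity directly from the GT field of a single-sample VCF line. It is not how VEP does it internally; the function names and the labels (`HET`, `HOM`, `HOM_REF`, `MISSING`) are my own choices, and the demo line is made up.

```python
def zygosity(gt):
    """Classify a VCF GT string such as '0/1' or '1|1'."""
    alleles = gt.replace("|", "/").split("/")  # treat phased and unphased alike
    if "." in alleles:
        return "MISSING"
    if all(allele == "0" for allele in alleles):
        return "HOM_REF"
    if len(set(alleles)) == 1:
        return "HOM"  # every allele is the same non-reference allele
    return "HET"


def sample_zygosity(vcf_line, sample_index=0):
    """Return the zygosity of one sample on a tab-separated VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")            # column 9 is FORMAT
    sample_values = fields[9 + sample_index].split(":")
    return zygosity(sample_values[format_keys.index("GT")])


# Made-up VCF data line with one sample column.
demo_line = "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30"
print(sample_zygosity(demo_line))
```

A VCF without the FORMAT and sample columns has no GT to read, which is why genotype data is required.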

Best wishes

Ben Ensembl Helpdesk


Thanks Ben. I will give this a go.


Hi Ben, I've got VEP producing the desired output. Just one last question: I would like to keep only those variants with a frequency below 1% in gnomAD NFE. Is there an option in VEP for this, or do I need to use filter_vep?

Thank you.

6.1 years ago
Ben_Ensembl ★ 2.4k

Hi Nandini,

I'm glad to hear that. You will need to use filter_vep, following these guidelines: http://www.ensembl.org/info/docs/tools/vep/script/vep_filter.html#filter_write
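As an illustration of the kind of filter involved, here is a minimal sketch that applies a "below 1% or not reported" rule to VEP tab output in plain Python. It is not filter_vep itself. It assumes the frequency column is named `gnomAD_NFE_AF` (which should be present if VEP was run with `--af_gnomad` or `--everything`, but please check your own header) and that missing values appear as `-`; the function name `rare_rows` and the demo rows are made up.

```python
def rare_rows(lines, column="gnomAD_NFE_AF", threshold=0.01, keep_missing=True):
    """Yield data rows whose frequency in `column` is below `threshold`."""
    header = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and '##' metadata
        if header is None:
            header = line.split("\t")  # first non-metadata line is the column header
            index = header.index(column)
            continue
        value = line.split("\t")[index]
        if value in ("", "-"):
            # No frequency reported: keep or drop depending on keep_missing.
            if keep_missing:
                yield line
        elif float(value) < threshold:
            yield line


# Made-up demo input (not real VEP results).
demo = [
    "## demo header line",
    "#Uploaded_variation\tgnomAD_NFE_AF",
    "var1\t0.0005",
    "var2\t0.2",
    "var3\t-",
]

kept = list(rare_rows(demo))
for row in kept:
    print(row)
```

Whether variants without a reported frequency should be kept is a design choice; novel variants are absent from gnomAD, so dropping them (`keep_missing=False`) would discard exactly the rarest calls.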

Best wishes

Ben Ensembl Helpdesk


Thanks Ben. I'm running into the following error:

-------------------- EXCEPTION --------------------
MSG: 
ERROR: Forked process(es) died: read-through of cross-process communication detected

STACK Bio::EnsEMBL::VEP::Runner::_forked_buffer_to_output /Software/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:554
STACK Bio::EnsEMBL::VEP::Runner::next_output_line /Software/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:361
STACK Bio::EnsEMBL::VEP::Runner::run /Software/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:202
STACK toplevel ./vep:222
Date (localtime)    = Wed Mar  7 10:27:19 2018
Ensembl API version = 91

The command being used is:

./vep --cache --dir_cache /Software/ensembl-vep/.vep  --stats_text S3.html --refseq --everything --individual all --transcript_filter "stable_id match N[M]_" --fork 4 -tab --custom /Software/ensembl-vep/.vep/score.bed.gz,score,bed,exact,0 --custom Software/ensembl-vep/.vep/HEX.bed.gz,HEX,bed,exact,0  --port 3337 -i S3.recode.vcf -o S3.txt

Any idea what might be going wrong? There is only one sample in this VCF, with approximately 5000 variants. Thanks.


Hi Nandini,

My colleagues have said that they are currently helping you on GitHub: https://github.com/Ensembl/ensembl-vep/issues/150#issuecomment-371137459

For this error, it's best that they help you.

Best wishes

Ben Ensembl Helpdesk


Yes, that's correct. Thanks Ben.
