Biostar Test Site

This site is used for testing only. Visit: https://www.biostars.org to ask a question.

Issue on vg call order of paths
7 months ago
xwwang ▴ 20

The genome graph was created from a multiple genome alignment with 'vg construct', e.g. from 4 genomes named A, B, C and D.

Now we want to call variants with 'vg call' from a packed graph alignment file (.pack). Command:

vg call -t 10 graph.vg -k graph.pack -a >graph.gam.vcf


[This uses the default of all paths, since no custom '-p' is given.]

We then checked the number of variants called relative to all paths. I pasted the summary below. The first column is the path name or genome name. The second column is the number of variants.

A 3000

B 500

C 200

D 100
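
For reference, a summary like the one above can be produced by counting VCF records per reference path. This is only a minimal sketch: it assumes an uncompressed VCF in which the path name is the first (CHROM) column, and the function name count_variants_per_path is made up for illustration.

```shell
# Count VCF records per reference path (column 1, CHROM).
# Header lines (starting with '#') are skipped.
count_variants_per_path() {
    # $1: path to an uncompressed VCF file
    awk -F'\t' '!/^#/ { n[$1]++ } END { for (p in n) print p, n[p] }' "$1" | sort
}

# Example: count_variants_per_path graph.gam.vcf
```

Each output line is a path name followed by its record count, sorted by path name.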

1. Question one

It seems that a variant will only be called relative to D if it sits in a highly nested site (i.e. a variant within a variant within a variant, etc.). Is this interpretation correct?

2. Question two

It seems that, when variant nesting occurs, the default order of variant calls is the alphabetical order of the paths. How can we get the variants for paths in a supplied order, e.g. C, B, A, D instead of A, B, C, D? I know I can specify a subset of individual paths with -p, but these are still ordered and reported alphabetically.

vg • 227 views
6 months ago
glenn.hickey ▴ 240

Your assumption is correct: vg call finds all paths through a bubble (subsetting to those specified with -p), and chooses the first one in alphabetical order. There is presently no interface to change this. I'd be curious to know what exactly your use case is and why you want this functionality.

You can, of course, just run call repeatedly, specifying one path at a time, to get all calls vs all paths.
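
A minimal sketch of that per-path approach, reusing the flags from the command in the question. The function name call_each_path, the VG override variable and the calls.<path>.vcf output names are assumptions made for illustration, not part of vg.

```shell
# Run vg call once per reference path, in the order the paths are given.
# VG is a hypothetical override for the vg executable (defaults to "vg").
call_each_path() {
    graph=$1
    pack=$2
    shift 2
    for path in "$@"; do
        # One VCF per path; output names are illustrative.
        "${VG:-vg}" call -t 10 "$graph" -k "$pack" -a -p "$path" \
            > "calls.${path}.vcf" || return 1
    done
}

# Example: call_each_path graph.vg graph.pack C B A D
```

This writes one VCF per path, each containing the calls made relative to that path alone. Combining them in a preferred order (e.g. C, B, A, D) would still have to be done afterwards.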