Biostar Test Site

vg deconstruct with path sizes
13 months ago
egoltsman • 0

Hi, I am wondering whether there is a way to output snarls together with path size information. Currently, if I run 'vg snarls' and then 'vg deconstruct', the VCF file contains only the variant sequences, so I have to parse those out and compute the string length of each one, which is not very efficient when you throw a whole pangenome at it. If this information is already available internally during snarl calling, is there a way to extract or output it?

Thanks!

vg • 332 views
13 months ago
glenn.hickey ▴ 240

If I understand correctly, you want the length of each allele stored in some kind of VCF Format field? I suppose this is possible, but as far as I know, most VCF parsers parse the alleles into strings in memory anyway, which would let you get the size just as efficiently.
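
To make the point about in-memory alleles concrete, here is a minimal sketch in plain Python that reads VCF text and reports the length of the REF allele and of each ALT allele per record. It is not part of vg. The function name `allele_lengths`, the example records, and the IDs `s1` and `s2` are made up for illustration, and the sketch assumes every allele is written out as an explicit sequence (no symbolic alleles such as `<DEL>`).

```python
from typing import Iterable, Iterator, List, Tuple


def allele_lengths(vcf_lines: Iterable[str]) -> Iterator[Tuple[str, int, str, List[int]]]:
    """Yield (chrom, pos, variant_id, lengths) for each VCF record.

    lengths lists the REF length first, then one length per ALT allele.
    Assumes alleles are explicit sequences, not symbolic ones.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header lines (## metadata and the #CHROM line).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, variant_id, ref, alt = fields[:5]
        # "." in the ALT column means there are no alternate alleles.
        alts = [] if alt == "." else alt.split(",")
        yield chrom, int(pos), variant_id, [len(a) for a in [ref] + alts]


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\ts1\tA\tACGT,AG\t60\tPASS\t.\n",
    "chr1\t250\ts2\tGATTACA\tG\t60\tPASS\t.\n",
]

for record in allele_lengths(example):
    print(record)
```

For a real file, the same function can be fed an open file handle, for example `allele_lengths(open("variants.vcf"))`, where `variants.vcf` is a placeholder name. Because the generator handles one line at a time, memory use stays flat even for a large VCF.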

As mentioned on GitHub, there should soon be an interface for getting snarl traversals in GAF format, using a variety of algorithms (including the one used in deconstruct -e). Hopefully that will be more efficient for you to parse.

13 months ago
egoltsman • 0

That's great to know. Thanks!
