Tool: Wham - a structural variant caller and association testing framework
9.2 years ago

Whole genome alignment metrics (Wham) is a structural variant (SV) caller. It was designed to identify the breakpoints of SVs, jointly genotype many individuals, and conduct association testing directly from binary alignment / mapping (BAM) files. The association test identifies shared breakpoints whose allele frequencies diverge between target and background individuals. Additionally, Wham classifies SVs using a random forest machine learning approach, which can be used to identify the nature of the SV allele (e.g., insertion or deletion). Wham can be used on pooled or individual-level sequencing data.
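To make "divergent allele frequencies between target and background" concrete, here is a minimal sketch of the quantity being compared at a bi-allelic site. It is not Wham's actual test statistic; the function names and the coding of genotypes as alternate-allele counts (0, 1, 2, or None for a missing call) are assumptions made for illustration only.

```python
def alt_allele_frequency(genotypes):
    """Alternate-allele frequency for diploid genotypes coded as 0, 1 or 2.

    Missing calls are passed as None and skipped.
    Returns None if no individual has a called genotype.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    # Each diploid individual contributes two alleles.
    return sum(called) / (2 * len(called))


def frequency_divergence(target, background):
    """Absolute difference in alternate-allele frequency between two groups."""
    f_target = alt_allele_frequency(target)
    f_background = alt_allele_frequency(background)
    if f_target is None or f_background is None:
        return None
    return abs(f_target - f_background)


if __name__ == "__main__":
    # Hypothetical genotypes at one breakpoint: three target and three
    # background individuals.
    print(frequency_divergence([2, 2, 1], [0, 0, 1]))
```

A real association test would also account for sample size and genotype uncertainty; this only shows the frequency contrast that such a test looks for.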

Wham identifies breakpoints by integrating mapping annotations provided by BWA-MEM, such as split reads, alternative alignments, soft-clipping, and consensus sequences. Genotyping is done under a simple bi-allelic model, and the association test uses genotype counts across individuals. For pooled (microbial / cancer / bulk segregant) sequencing, a set of provided utility scripts gives allele frequency information for the identified SV calls between two pools. Wham is written in C++ and built on top of Bamtools and SeqAn. It uses OpenMP to run on multiple processors, so a 50x human genome can be called in about one hour on 40 CPUs with a minimal memory footprint.
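For readers unfamiliar with the annotations mentioned above, the sketch below shows where they live in a single SAM record written by BWA-MEM: soft-clipping is in the CIGAR string, split-read (supplementary) alignments are in the SA tag, and alternative hits are in the XA tag. It is only an illustration of the record layout, not Wham's code; the function name `breakpoint_evidence` and the returned dictionary keys are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def breakpoint_evidence(sam_line):
    """Extract soft-clip lengths and SA/XA entries from one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    cigar = fields[5]

    # Optional tags start at column 12 and have the form TAG:TYPE:VALUE.
    tags = {}
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        tags[tag] = value

    # Ignore hard clips so a soft clip next to one is still seen at the end.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    left_clip = ops[0][0] if ops and ops[0][1] == "S" else 0
    right_clip = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0

    # SA and XA values are semicolon-terminated lists of alignments.
    split_alignments = [e for e in tags.get("SA", "").split(";") if e]
    alt_alignments = [e for e in tags.get("XA", "").split(";") if e]

    return {
        "left_soft_clip": left_clip,
        "right_soft_clip": right_clip,
        "split_alignments": split_alignments,
        "alt_alignments": alt_alignments,
    }


if __name__ == "__main__":
    # Made-up record: 30 soft-clipped bases followed by 70 aligned bases.
    record = "\t".join([
        "read1", "0", "chr1", "100", "60", "30S70M", "*", "0", "0",
        "A" * 100, "I" * 100,
        "SA:Z:chr2,500,+,30M70S,60,0;", "XA:Z:chr3,-900,100M,2;",
    ])
    print(breakpoint_evidence(record))
```

Reads whose soft-clip boundaries and SA entries agree on a position are the kind of evidence a breakpoint caller can pool across reads.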

For more information on Wham please visit http://jewmanchue.github.io/wham/

The code is hosted on https://github.com/jewmanchue/wham

Wham's code was written by Zev Kronenberg, E.J. Osborn and Mark Yandell (University of Utah)

genotype gwas bwa-mem bam structural-variant • 4.3k views

Sounds great. FYI, there is a wham aligner as well: http://research.cs.wisc.edu/wham/


We considered WHAM-BAM. Also the band should be mentioned:

Thank you for SA and XA.


Do SA/XA help? I have never done a careful evaluation myself. Thx.


They help a bunch. What happened to XN? I really wish that was in mem.


What does XN mean? Number of suboptimal alignments?

XN    "Number of ambiguous bases in the reference"

Ah, I forgot. Will consider it.


Can users simply use the provided training set? (and/or when to create a new one?)

Can target and background both be multi-sample VCFs BAMs?


Canned training sets are provided, or you can train your own.


Thanks for Wham! I'm using it to call SVs from multiple samples (6 in total: 3 controls and 3 diseased). Apart from IGV, are there any visualization tools or scripts for viewing the result files (e.g., NA12878.wham.raw.vcf or NA12878.wham.raw.class.vcf)? I'm also interested in annotating the SVs I obtain against databases of known SVs. Any suggestions?
