Tools For Structural Variants
10.0 years ago
Kasthuri ▴ 300

I am currently working on human whole genomes and would like to know which tools the bioinformatics community uses for identifying structural variants (SVs). I know of a couple, such as BreakDancer and GASV, but would like some consensus on good tools to begin with. I understand that SV detection is painful, with lots of false positives turning up, hence this question. Any help with the parameters to use with these tools would also be greatly appreciated. By the way, we run indel realignment and base quality recalibration by default, hoping they might help.

Thanks.

structural breakdancer • 5.3k views
10.0 years ago
hardingnj ▴ 210

The current DREAM competition gives a good idea of which tools are in use and how they perform: https://www.synapse.org/#!Synapse:syn312572/wiki/62348 It might be a good starting point.

As far as I am aware, there are no authoritative studies in the literature, so waiting for the DREAM paper to come out is probably the best bet. Until then, all advice will be based on personal experience.

Not what you asked for, but SEQanswers has a thorough listing (though no guidance on performance): http://seqanswers.com/wiki/Special:BrowseData/Bioinformatics%20application?Biological_domain=Structural_variation


Thank you hardingnj. This is a good resource. I will explore.

10.0 years ago
mdm-two ▴ 230

You don't say if you are looking for somatic or germline SVs, nor if you are calling in populations or individuals.

SquareDancer and CREST use clipped reads, whereas BreakDancer uses mapped read pairs. Integrating both approaches should improve your results when calling in individuals, especially if your coverage is low. All of these were originally developed for detecting somatic SVs, so you may need to decrease the sensitivity if you are looking for germline events.

SquareDancer: https://github.com/genome/gms-core/blob/master/lib/perl/Genome/Model/Tools/Sv/SquareDancer.pl

CREST: http://www.stjuderesearch.org/site/lab/zhang
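To make the idea of integrating the two approaches a bit more concrete, here is a rough sketch of one way to do it: keep calls from a read-pair caller and a clipped-read caller as simple records, and treat two calls as the same event when they agree on chromosome and SV type and both breakpoints fall within some tolerance. Everything here is an assumption for illustration: the `SVCall` record, its field names, the `integrate` helper and the 100 bp `slop` are hypothetical and are not part of BreakDancer, SquareDancer or CREST. Parsing each tool's actual output into these records is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    """Hypothetical minimal SV record; real tool output must be parsed into this."""
    chrom: str
    start: int
    end: int
    svtype: str   # e.g. "DEL", "INV"
    caller: str   # free-text label for where the call came from


def same_event(a: SVCall, b: SVCall, slop: int = 100) -> bool:
    """Two calls match if chromosome and type agree and both breakpoints are within slop bp."""
    return (
        a.chrom == b.chrom
        and a.svtype == b.svtype
        and abs(a.start - b.start) <= slop
        and abs(a.end - b.end) <= slop
    )


def integrate(pair_calls, clip_calls, slop: int = 100):
    """Split calls into (supported by both approaches, supported by one only).

    For events seen by both, the clipped-read coordinates are reported,
    on the assumption that they are the more precise of the two.
    """
    both, single = [], []
    matched_clip = set()
    for p in pair_calls:
        hits = [i for i, c in enumerate(clip_calls) if same_event(p, c, slop)]
        if hits:
            matched_clip.update(hits)
            both.append(clip_calls[hits[0]])
        else:
            single.append(p)
    # Clipped-read calls that no read-pair call matched stay single-evidence.
    single.extend(c for i, c in enumerate(clip_calls) if i not in matched_clip)
    return both, single


if __name__ == "__main__":
    pair_calls = [SVCall("chr1", 1000, 5000, "DEL", "read-pair")]
    clip_calls = [SVCall("chr1", 1040, 5010, "DEL", "clipped-read")]
    both, single = integrate(pair_calls, clip_calls)
    print("both:", both)
    print("single:", single)
```

Calls in the first list are reasonable candidates to trust more. What to do with the single-evidence list depends on coverage and on whether you are after somatic or germline events, and the tolerance would need tuning against your insert size distribution.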

You might also be interested in TIGRA (A Targeted Iterative Graph Routing Assembler for Breakpoint Assembly): http://bioinformatics.mdanderson.org/main/TIGRA It accepts BreakDancer-formatted output as well as 1000 Genomes.

Other options:

Lumpy: https://github.com/arq5x/lumpy-sv

Genome STRiP (used extensively for 1000 Genomes for population-based calling): http://www.broadinstitute.org/software/genomestrip/download-genome-strip


This is a wealth of information. I primarily need somatic calls but would also like to get a sense of germline events. I am trying CREST, and from the DREAM project I infer that DELLY is good as well. I will also try the programs you suggest. Getting some of them to work looks like a pain, though (Meerkat, for instance), with all the dependencies, especially on an HPC system where I am not the superuser! :-( Anyway, thanks a lot for your suggestions. I am going to jump right on them.
