Hi all,
I am running breakdancer-max on chr20 of sample NA12878. After reading the paper, I was confused by one sentence: "The algorithm searches for genomic regions that anchor substantially more anomalous read pairs (ARPs) than expected on average." Before the search, each read pair is assigned one of the types defined in the paper, so every aligned read pair is labeled.
My question is: how are the genomic regions used in this search defined? Is it something like a sliding-window scan along the chromosome (presumably not a literal brute-force scan; I mention it only to make the question clear), or something more efficient? I also see that breakdancer-max has an option, -s, that specifies the minimum length of a region.
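To make the question concrete, here is a toy sketch of the sliding-window idea I have in mind. It is only my own illustration, not BreakDancer's actual code or algorithm. The function name, the parameters (window, step, fold), and the "more than fold times the average" cutoff are all assumptions I made up for the example.

```python
from bisect import bisect_left


def sliding_window_arp_regions(arp_positions, chrom_length,
                               window=1000, step=500, fold=3.0):
    """Toy illustration (NOT BreakDancer's method): flag fixed-size windows
    that hold more anomalous read pairs (ARPs) than `fold` times the number
    expected on average for a window of that size.

    arp_positions: start coordinates of reads already labeled as ARPs.
    Returns a list of (start, end, count) tuples for half-open windows.
    """
    if chrom_length <= 0 or window <= 0 or step <= 0:
        raise ValueError("chrom_length, window and step must be positive")

    positions = sorted(arp_positions)
    # Average number of ARPs a window would hold if ARPs were spread evenly.
    expected = len(positions) * window / chrom_length
    threshold = fold * expected  # hypothetical cutoff, chosen for the example

    hits = []
    last_start = max(chrom_length - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        # Count ARPs with start <= position < end using binary search.
        count = bisect_left(positions, end) - bisect_left(positions, start)
        if count > threshold:
            hits.append((start, end, count))
    return hits


if __name__ == "__main__":
    arps = [100, 5000, 5010, 5020, 5030, 5040, 9000]
    print(sliding_window_arp_regions(arps, chrom_length=10000))
```

In this toy version the window size is fixed in advance, which is exactly the part I am unsure about: I would like to know whether the tool works with fixed windows like this or derives the regions from the reads themselves.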
Thanks, all. I look forward to the discussion.