Why does CNV calling with VarScan need two steps of fragment merging?
6.4 years ago
CY ▴ 750

I have been using VarScan to call CNVs for a while but have not had a chance to look into it carefully.

The workflow is basically this:
1) Run copynumber to compare read depth between normal and tumor and get small fragments based on the NT/NT ratio.
2) Run circular binary segmentation (CBS), which somehow merges the small fragments into larger fragments, again based on the NT/NT ratio.
3) Run mergeSegments.pl to merge fragments further; this gives the final CNV result.

I think I understand the purpose of 1). It generates intervals and assigns each one its mean depth. These intervals then serve as individual data points for further analysis.
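To make the idea of step 1 concrete, here is a minimal Python sketch of turning per-position depths into small regions with a mean depth and a log2 ratio. It is not VarScan's actual code: the function names, the `max_size` and `min_depth` parameters, and the rule of cutting a region at gaps or at a fixed maximum size are all assumptions made for illustration.

```python
import math


def _summarize(points):
    """Collapse a run of (position, normal_depth, tumor_depth) tuples into one region."""
    n_mean = sum(p[1] for p in points) / len(points)
    t_mean = sum(p[2] for p in points) / len(points)
    return {
        "start": points[0][0],
        "end": points[-1][0],
        "normal_mean": n_mean,
        "tumor_mean": t_mean,
        "log2_ratio": math.log2(t_mean / n_mean),
    }


def raw_regions(positions, normal, tumor, max_size=100, min_depth=10):
    """Group consecutive, sufficiently covered positions into small regions.

    A region is closed when coverage drops below min_depth, when there is a
    gap in the positions, or when it reaches max_size positions (all three
    rules are illustrative assumptions). min_depth must be positive.
    """
    regions, current = [], []
    for pos, n, t in zip(positions, normal, tumor):
        covered = n >= min_depth and t >= min_depth
        contiguous = bool(current) and pos == current[-1][0] + 1
        if current and (not covered or not contiguous or len(current) >= max_size):
            regions.append(_summarize(current))
            current = []
        if covered:
            current.append((pos, n, t))
    if current:
        regions.append(_summarize(current))
    return regions


if __name__ == "__main__":
    for region in raw_regions([1, 2, 3, 4], [20] * 4, [40] * 4, max_size=2):
        print(region)
```

Each resulting region is one noisy data point; nothing at this stage decides where the copy number really changes.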

What I don't understand are 2) and 3). Why do we need two merging steps? What is the difference between them? Why can't we merge just once? Why can't we set more appropriate criteria / parameters at the very first step (step 1) and skip the merging altogether?

cnv VarScan copy number variant • 1.5k views
6.4 years ago
arta ▴ 670

Circular binary segmentation (CBS) is an external tool that segments the fragments at significant change-points by fitting a Gaussian distribution. It is written in R, and VarScan uses CBS as an intermediate tool rather than reimplementing it. The aim of step 3, mergeSegments.pl, is to merge segments with similar copy number and classify the resulting events as large-scale or focal.
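To illustrate what change-point segmentation does to the raw regions, here is a rough Python sketch. Note that it is plain recursive binary segmentation with a fixed score cutoff, not real CBS: the actual algorithm tests circular (two-breakpoint) splits and assesses significance by permutation. The function names and the `threshold` value are assumptions for illustration only.

```python
import statistics


def best_split(values):
    """Return (index, score) of the single split that maximises the
    standardised difference between the left and right means."""
    n = len(values)
    sd = statistics.pstdev(values) or 1e-9  # avoid dividing by zero on flat data
    best_idx, best_score = None, 0.0
    for i in range(2, n - 1):  # keep at least two points on each side
        left, right = values[:i], values[i:]
        diff = abs(statistics.fmean(left) - statistics.fmean(right))
        score = diff / (sd * (1 / len(left) + 1 / len(right)) ** 0.5)
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx, best_score


def segment(values, offset=0, threshold=3.0):
    """Recursively split a non-empty list of log ratios at change-points.

    Returns (start_index, end_index_exclusive, mean_log_ratio) tuples.
    The fixed threshold stands in for the permutation test used by CBS.
    """
    idx, score = best_split(values)
    if idx is None or score < threshold:
        return [(offset, offset + len(values), statistics.fmean(values))]
    return (segment(values[:idx], offset, threshold)
            + segment(values[idx:], offset + idx, threshold))


if __name__ == "__main__":
    log_ratios = [0.0] * 10 + [1.0] * 10
    print(segment(log_ratios))
```

The point is that this step only finds where the mean log ratio shifts; it attaches no biological label to the segments it returns.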

Taken from the paper:

Adjacent segments of similar copy number from the CBS algorithm were merged by an internally developed Perl script (MergeSegments), and classified by size. Events encompassing >25% of a chromosome arm were classified as large-scale; all others were considered focal events.
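The size rule in that quote can be sketched in a few lines of Python. The function name, the inclusive coordinate convention, and the idea of passing the arm boundaries in as arguments are assumptions; only the 25% cutoff comes from the quoted text.

```python
def classify_size(seg_start, seg_end, arm_start, arm_end, large_fraction=0.25):
    """Label an event 'large-scale' if it covers more than large_fraction
    of its chromosome arm, otherwise 'focal'. Coordinates are inclusive."""
    seg_len = seg_end - seg_start + 1
    arm_len = arm_end - arm_start + 1
    return "large-scale" if seg_len / arm_len > large_fraction else "focal"


if __name__ == "__main__":
    # Hypothetical 100 Mb arm with one 30 Mb event and one 5 Mb event.
    print(classify_size(1, 30_000_000, 1, 100_000_000))
    print(classify_size(1, 5_000_000, 1, 100_000_000))
```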

Thanks for explaining. But why can't we just use the result of the first step? The first step has already identified a number of breakpoints.

CBS does not classify the segments as amplification, deletion or neutral. The MergeSegments algorithm classifies each segment as amplification (log ratio > 0.25), deletion (log ratio < -0.25) or neutral (between -0.25 and 0.25) and merges adjacent segments of the same class. Moreover, amplifications and deletions are categorized as large-scale or focal. This helps interpretation, for example recognizing a whole-chromosome loss or a chromosome-arm loss or gain.
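A small Python sketch of that classify-then-merge logic, using the ±0.25 thresholds above. This is not the Perl script itself: the function names, the tuple layout of the input, and the length-weighted mean used for the merged log ratio are assumptions for illustration. It also merges consecutive same-class segments on a chromosome without checking the gap between them.

```python
def call_class(log_ratio, amp=0.25, dele=-0.25):
    """Label a segment by its log ratio using the thresholds from the answer."""
    if log_ratio > amp:
        return "amplification"
    if log_ratio < dele:
        return "deletion"
    return "neutral"


def merge_segments(segments):
    """Merge consecutive segments of the same class on the same chromosome.

    segments: list of (chrom, start, end, log_ratio), sorted by chrom and start.
    The merged log ratio is a length-weighted mean (an assumption).
    """
    merged = []
    for chrom, start, end, ratio in segments:
        label = call_class(ratio)
        size = end - start + 1
        last = merged[-1] if merged else None
        if last and last["chrom"] == chrom and last["class"] == label:
            total = last["size"] + size
            last["log_ratio"] = (last["log_ratio"] * last["size"] + ratio * size) / total
            last["end"] = end
            last["size"] = total
        else:
            merged.append({"chrom": chrom, "start": start, "end": end,
                           "size": size, "log_ratio": ratio, "class": label})
    return merged


if __name__ == "__main__":
    cbs_output = [("chr1", 1, 100, 0.5), ("chr1", 101, 200, 0.7),
                  ("chr1", 201, 300, 0.0), ("chr2", 1, 100, -0.6)]
    for event in merge_segments(cbs_output):
        print(event)
```

Here the two chr1 segments at 0.5 and 0.7 are distinct change-points for CBS but end up as a single amplification event, which is why the second merge is not redundant with the first.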

Hope it is clear now. :)

Yes, it is really helpful. Thanks :)
