best way to separate copy number variations from VCF files
5.5 years ago

Hi

I am interested in finding copy number variations (CNVs) in my samples, and I have raw VCF files. I have looked at previous questions but have not found a clear answer. Is there a GATK walker that finds CNVs (duplications or deletions) in raw VCF files?

Hope to hear from you soon.

Regards Homa

SNP CNVs • 2.4k views

SnpSift?

5.5 years ago

What about using grep? I'd use something like:

cat <(grep '^#' myvariants.vcf) <(grep '<DEL>\|<DUP>' myvariants.vcf) > cnvs.vcf

But I'm not sure what your VCF looks like. The first grep keeps the header lines; the second grep searches for variant lines containing either <DEL> or <DUP>.
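
If matching anywhere on the line turns out to be too loose (for example, if <DEL> also shows up inside an INFO value), a variant of the same idea is to test only the ALT column. This is a rough sketch, assuming an uncompressed, tab-delimited VCF in which CNVs are written as symbolic alleles such as <DEL> or <DUP>; extract_cnvs is just a name made up for this example:

```shell
# Hypothetical helper (not part of any toolkit): print the header lines plus
# every record whose ALT column (field 5) contains a symbolic <DEL...> or
# <DUP...> allele. Assumes an uncompressed, tab-delimited VCF.
extract_cnvs() {
    awk -F'\t' '/^#/ || $5 ~ /<(DEL|DUP)/' "$1"
}

# Usage: extract_cnvs myvariants.vcf > cnvs.vcf
```

Because the pattern is not anchored on the closing bracket, subtypes such as <DUP:TANDEM> are kept as well. If your VCF encodes CNVs differently (for instance only through an SVTYPE entry in INFO), the pattern would need to be adapted.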


Thanks a lot for the reply

Since my VCF files are derived from the GATK software, I would prefer to continue the path with the GATK. Do you have any suggestions for separating the CNVs from the VCF file using GATK?


Please do not make the mistake of overcomplicating things. This is a simple pattern-extraction task. Even if you use a GATK filtering tool (if that exists, I don't know) it will do the exact same thing, just wrapped in a GATK_filter_whatever.jar. The suggested solution is perfectly fine.
