Editing Manta structural VCF file
3.1 years ago

Hi,

Is there a way to filter out structural variants called on chrUn, random contigs, and other non-main chromosomes? I am using the hg38 assembly.

Will 'vcftools' --chr filtering work here?

I followed 'https://www.biostars.org/p/201603/#273150' and tried the command below, but it did not filter the non-main chromosomes out of my VCF.

grep -w '^#\|chr[1-9]\|chr[1-2][0-9]\|chr[X]\|chr[Y]' my.vcf > my_filtered.vcf

Example of Manta vcf:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Normal1 Tumor1

chr1 30405827 MantaBND:0:70561:70576:1:0:0:0 T ]chr1_KI270760v1_alt:58532]T . PASS SVTYPE=BND;MATEID=MantaBND:0:70561:70576:1:0:0:1;IMPRECISE;CIPOS=-568,568;SOMATIC;SOMATICSCORE=41;BND_DEPTH=54;MATE_BND_DEPTH=24 PR 16,0 26,7

Structural variant calls include translocation events, so the mate of a breakend can sit on a random contig, as in the ALT column above. I therefore need to remove records that name a random contig in the ALT column as well. The goal is to keep only chr1-22, X, and Y in the Manta structural VCF file so that I can make a Circos plot.

Thanks for the help!

Try awk:

$ awk '/^#/ || !/chr[0-9A-Za-z]+_/' test.vcf

This prints the VCF header rows and drops any line that contains chr followed by one or more digits or letters (upper or lower case) and then an underscore (_). It is a little risky without example data, but it removes rows containing text such as chrUn_ or chr1_KI270760v1_alt:58532, whichever column the text appears in.
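If the generic regex feels too loose, a stricter alternative is to whitelist chr1-22, chrX and chrY instead of blacklisting names with an underscore. The function below is only a sketch, not tested on real Manta output. It assumes a tab-delimited VCF, with CHROM in column 1 and ALT in column 5, and breakend mates written as contig:position inside square brackets. The name filter_main_chroms is made up for this example.

```shell
# Sketch: keep header lines plus records where CHROM and any breakend mate
# named in ALT are both on chr1-22, chrX or chrY.
# Assumption: tab-delimited VCF, CHROM in column 1, ALT in column 5.
filter_main_chroms() {
  awk -F'\t' -v main='chr([1-9]|1[0-9]|2[0-2]|X|Y)' '
    # Header lines (including ##contig lines) pass through unchanged
    /^#/ { print; next }
    # CHROM must be exactly one of the main chromosomes
    $1 !~ ("^" main "$") { next }
    # If ALT uses breakend brackets, the mate must also be a main chromosome
    $5 ~ /\[|\]/ && $5 !~ ("(\\[|\\])" main ":[0-9]+(\\[|\\])") { next }
    { print }
  ' "$1"
}

# Usage: filter_main_chroms my.vcf > my_filtered.vcf
```

Unlike the underscore regex, this also drops contigs such as chrM and chrEBV, because anything not on the whitelist is removed. The ##contig header lines for the removed contigs are left in place.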

Thanks for the reply. This way the ALT column was not filtered. Sorry, I forgot to mention that. I have updated my question.

I have updated the code. It now tests the entire row, so a match in any column removes the record. Please check the output carefully, as the regex is generic.

Great! This one worked. Thank you!
