Taking the difference of two VCFs (or removing singletons)
2.7 years ago
hermathena • 40
United Kingdom

Dear All,

Is there a way to take the difference of two VCF files? GATK can be used to take a union or an intersection, but I need the difference. There are two applications:

1. Remove singletons. I have a VCF of all the SNPs and a VCF of the private ones. I need a VCF with the non-private SNPs.

2. Get the non-CDS SNPs. I can make a VCF of the exome SNPs by filtering against a GFF file. I would also like to get the SNPs from non-CDS regions, which could be obtained as the difference between all SNPs and the exome SNPs.

Any ideas, please?

Many thanks,

Krzysztof Kozak

Zoology

University of Cambridge


Hello,

Thank you all for the suggestions, this looks promising!

Best,

Chris

12 months ago
Freiburg, Germany

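You can use either vcftools or bcftools. Both provide an isec command with a complement option (vcf-isec -c in vcftools, bcftools isec -C in bcftools), which outputs the positions present in the first file but missing from the others. Note that this is position based rather than exact variant based.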
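
To illustrate what "position based" versus "exact variant based" means for a difference, here is a minimal Python sketch. It is only an illustration on toy records, not how vcftools or bcftools work internally; the function names (`read_variants`, `difference`) and the example data are made up for this post.

```python
def read_variants(lines):
    """Yield (chrom, pos, ref, alt) for each VCF data line, skipping '#' header lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        yield chrom, int(pos), ref, alt


def difference(all_lines, private_lines, exact=False):
    """Return the variants of all_lines that are absent from private_lines.

    exact=False compares on (chrom, pos) only (position based);
    exact=True compares on (chrom, pos, ref, alt) (exact variant based).
    """
    key = (lambda v: v) if exact else (lambda v: v[:2])
    exclude = {key(v) for v in read_variants(private_lines)}
    return [v for v in read_variants(all_lines) if key(v) not in exclude]


# Toy records (hypothetical): chr1:200 has a different ALT allele in the two files.
ALL = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr1\t200\t.\tC\tT",
    "chr2\t50\t.\tG\tA",
]
PRIVATE = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t200\t.\tC\tG",
    "chr2\t50\t.\tG\tA",
]

print(difference(ALL, PRIVATE))              # position based
print(difference(ALL, PRIVATE, exact=True))  # exact variant based
```

With position-based matching, chr1:200 is dropped even though the ALT alleles differ; with exact matching it is kept. Which behaviour you want depends on whether a "singleton" is defined by site or by allele.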

2.7 years ago
Kizuna • 780
France, Paris

Regarding point 1.

I think you can do it with R.

Try reading your two VCF files into data frames (DF1 and DF2), then drop from the data frame holding all variants (DF1) every row whose chromosomal position also appears in the data frame of private variants (DF2).

Here is an example:

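# Read the VCF bodies; comment.char="#" skips the ## meta lines and the #CHROM header line
DF1 <- read.table("..../allSNPs.vcf", sep="\t", comment.char="#", quote="", stringsAsFactors=FALSE)
DF2 <- read.table("..../private.SNPs.vcf", sep="\t", comment.char="#", quote="", stringsAsFactors=FALSE)
# V1 = CHROM, V2 = POS: keep the rows of DF1 whose chromosome and position are absent from DF2
nonprivate.DF1 <- DF1[!(paste(DF1$V1, DF1$V2) %in% paste(DF2$V1, DF2$V2)), ]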

The VariantAnnotation package contains a VRanges class that extends GRanges and would be convenient in this instance.

14 months ago
France/Nantes/Institut du Thorax - INSE…

I wrote a tool to include/exclude the variants in a VCF file: https://github.com/lindenb/jvarkit/wiki/VcfIn
