Finding sample swaps using VCF files
6.7 years ago
kulvait ▴ 270

Hi, I have a bunch of amplicon sequencing samples from cancer patients. We are trying to identify somatic mutations related to their cancers.

I would like to check these samples for sample swaps, since we typically have more than one sample per patient. What comes to mind is to produce VCF files, filter them so that they contain only dbSNP records with genotype calls 0/1 or 1/1, and cluster the results.
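As a rough illustration of the filter-and-compare idea, here is a minimal Python sketch using only the standard library. It assumes one single-sample VCF per sample, treats any record whose ID starts with "rs" as a dbSNP record, and scores each pair of samples by Jaccard similarity over (variant, genotype) pairs. The function names and the toy records are made up for the example; a real run would read the VCF files from disk.

```python
import itertools

def load_dbsnp_genotypes(vcf_text):
    """Return {(chrom, pos, ref, alt): genotype} for a single-sample VCF.

    Keeps only records with an rs ID and a 0/1 or 1/1 genotype call.
    """
    calls = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        chrom, pos, rsid, ref, alt = fields[:5]
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        gt = sample_values[format_keys.index("GT")].replace("|", "/")
        if gt == "1/0":
            gt = "0/1"  # treat both heterozygous spellings alike
        if rsid.startswith("rs") and gt in ("0/1", "1/1"):
            calls[(chrom, int(pos), ref, alt)] = gt
    return calls

def jaccard(calls_a, calls_b):
    """Share of (variant, genotype) pairs that two samples have in common."""
    items_a, items_b = set(calls_a.items()), set(calls_b.items())
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0

def pairwise_similarity(samples):
    """Map each sorted sample pair to its Jaccard similarity."""
    return {
        (a, b): jaccard(samples[a], samples[b])
        for a, b in itertools.combinations(sorted(samples), 2)
    }

# Toy data (hypothetical records, not real calls).
def record(chrom, pos, rsid, ref, alt, gt):
    return "\t".join([chrom, str(pos), rsid, ref, alt, ".", "PASS", ".", "GT:DP", gt + ":50"])

HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE"
shared = [
    record("chr1", 100, "rs1", "A", "G", "0/1"),
    record("chr1", 200, "rs2", "C", "T", "1/1"),
    record("chr2", 300, "rs3", "G", "A", "0/1"),
]
vcfs = {
    # The record without an rs ID stands in for a somatic call and is dropped.
    "P1_tumor": "\n".join([HEADER] + shared + [record("chr3", 50, ".", "T", "C", "0/1")]),
    "P1_normal": "\n".join([HEADER] + shared),
    "P2_tumor": "\n".join([
        HEADER,
        record("chr1", 100, "rs1", "A", "G", "1/1"),
        record("chr1", 200, "rs2", "C", "T", "0/0"),
        record("chr2", 300, "rs3", "G", "A", "0/1"),
        record("chr2", 400, "rs4", "T", "C", "0/1"),
    ]),
}
samples = {name: load_dbsnp_genotypes(text) for name, text in vcfs.items()}
similarity = pairwise_similarity(samples)
for pair, score in sorted(similarity.items()):
    print(pair, round(score, 2))
```

One caveat with per-sample VCFs: a site that is absent from a file could be homozygous reference or simply not covered, so low similarity is only suggestive.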

Is there a tool that does this, or do I have to do everything manually?

Thanks, Vojtech.

dna-seq amplicon-seq vcf • 2.1k views

Hi, one quick way, though not one for a serious clinical setting, is to feed the VCF to vcfkit and build a tree from it. I don't know exactly what the implications are for your work, though.

We gave the students on our course some paternity-testing exercises based on SNPs, and they came up with the idea of making a tree, which worked quite robustly for small regions and unfiltered SNP calls. The command is:

vk phylo tree INPUT.vcf

What do you mean by "sample swap"? Do you want to check whether all samples come from the same patient?

Yes, exactly. I want to check that the samples labeled as coming from a single person have very similar genetic profiles.
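One way to turn that check into something mechanical is sketched below in Python: given some pairwise similarity score between samples, find each sample's best match and flag it when the best match carries a different patient label. The sample names, the scores, and the "PATIENT_index" naming scheme are all invented for illustration; the scores would come from whatever genotype comparison you actually use.

```python
import itertools

def nearest_neighbours(similarity):
    """Map each sample to (best_match, score) from a {(a, b): score} dict."""
    best = {}
    for (a, b), score in similarity.items():
        for sample, other in ((a, b), (b, a)):
            if sample not in best or score > best[sample][1]:
                best[sample] = (other, score)
    return best

def suspected_swaps(similarity, patient_of):
    """List (sample, best_match, score) where the best match has another label."""
    flagged = []
    for sample, (match, score) in sorted(nearest_neighbours(similarity).items()):
        if patient_of[sample] != patient_of[match]:
            flagged.append((sample, match, score))
    return flagged

# Toy scores (hypothetical): the tubes labeled A_2 and B_2 were exchanged,
# so A_1 resembles B_2 and A_2 resembles B_1. Patient C is labeled correctly.
names = ["A_1", "A_2", "B_1", "B_2", "C_1", "C_2"]
high = {("A_1", "B_2"): 0.93, ("A_2", "B_1"): 0.91, ("C_1", "C_2"): 0.96}
similarity = {pair: high.get(pair, 0.25) for pair in itertools.combinations(names, 2)}
patient_of = {name: name.split("_")[0] for name in names}

flagged = suspected_swaps(similarity, patient_of)
for sample, match, score in flagged:
    print(f"{sample} is closest to {match} (score {score})")
```

Note that all four samples of the two affected patients get flagged; the output tells you which labels disagree with the genotypes, not which tube was the one mislabeled.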

To do that, I would run a PCA on the SNP calls; samples belonging to the same patient should cluster together.

This requires joint genotype calling across all samples to produce a multi-sample VCF (for example, with the GATK pipeline).

You can then run the PCA on the multi-sample VCF, for example with the R package SNPRelate. (There may be easier and better solutions...)
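To show what the PCA step does with the genotypes, here is a small pure-Python sketch. It is not SNPRelate's implementation: it just codes each GT as an ALT-allele count, centres each site, and finds the first principal component by power iteration on the sample-by-sample Gram matrix. The genotype table is a made-up stand-in for the GT columns of a multi-sample VCF, and missing calls are set to the site mean as a simplifying assumption.

```python
import math

def dosage(gt):
    """Count ALT alleles in a diploid GT string; None if an allele is missing."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return sum(1 for allele in alleles if allele != "0")

def centred_matrix(genotypes):
    """genotypes: {sample: [GT, ...]}, one GT per site in the same site order.

    Returns (sample_names, rows); every column has mean zero and missing
    calls are placed at the column mean (zero after centring).
    """
    names = sorted(genotypes)
    raw = [[dosage(gt) for gt in genotypes[name]] for name in names]
    n_sites = len(raw[0])
    rows = [[0.0] * n_sites for _ in names]
    for j in range(n_sites):
        called = [row[j] for row in raw if row[j] is not None]
        mean = sum(called) / len(called) if called else 0.0
        for i, row in enumerate(raw):
            rows[i][j] = 0.0 if row[j] is None else row[j] - mean
    return names, rows

def first_pc(rows, iterations=200):
    """PC1 score per sample via power iteration on the Gram matrix X X^T."""
    n = len(rows)
    gram = [[sum(a * b for a, b in zip(rows[i], rows[k])) for k in range(n)]
            for i in range(n)]
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-zero start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        nxt = [sum(g * v for g, v in zip(gram_row, vec)) for gram_row in gram]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # no variation between samples
        vec = [x / norm for x in nxt]
        eigenvalue = norm  # converges to the largest eigenvalue
    scale = math.sqrt(eigenvalue)
    return [v * scale for v in vec]

# Toy genotype table (hypothetical): two patients, two samples each, six sites.
genotypes = {
    "P1_normal": ["0/1", "1/1", "0/0", "0/1", "0/0", "1/1"],
    "P1_tumor":  ["0/1", "1/1", "0/0", "0/1", "0/0", "./."],
    "P2_normal": ["0/0", "0/1", "1/1", "0/0", "0/1", "0/0"],
    "P2_tumor":  ["0/0", "0/1", "1/1", "0/0", "0/1", "0/0"],
}
names, rows = centred_matrix(genotypes)
scores = dict(zip(names, first_pc(rows)))
for name in names:
    print(name, round(scores[name], 2))
```

Samples from the same patient should land close together on PC1, and a sample sitting next to another patient's samples is a swap candidate. With many patients a single component will not separate everyone, so a real analysis would look at several components.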
