Many 0 in ped file converted from vcf genomic data
5.2 years ago
shawn ▴ 20

Hi everyone,

I am learning to do GWAS analysis. I converted the genomic data "1001genomes_snp-short-indel_only_ACGTN.vcf.gz" (downloaded from here) to plink ped format with:

plink --vcf 1001genomes_snp-short-indel_only_ACGTN.vcf.gz --make-bed --out 1001genomes_snp-short-indel_only_ACGTN.vcf.gz

I find there are many 0 in the ped file like this:

88 88 0 0 0 -9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 C C 0 0 T T 0 0 G G C C T T 0 0 0 0 T T G G 0 0 T T T T A A A A T T 0 0 T T 0 0 G G C C A A T T A A C C C C C C A A T T C C T T G G G G T T 0 0 C C G G G G T T T T T T A A T T C C G G G G G G C C C C G G G G G G G G C C C C G G C C T T T T G G A A C C T T A A G G 0 0 G G A A T T A A 0 0 0 0 C C C C T T G G G G G G A A T T 0 0 0 0 A A G G T T T T G G 0 0 C C T T C C 0 0 A A C C C C G G G G A A G G C C C C G G C C G G C C C C C C G G G G G G G G A A 0 0 C C C C A A C C C C C C G G C C C C C C C C C C C C C C G G T T C C C C C C C C A A 0 0 A A T T 0 0 T T T T T T A A G G T T G G G G T T C C G G G G C C G G C C 0 0 C C C C T T T T T T A A T T T T G G A A G G C C C C G G 0 0 G G C C G G T T T T C C 0 0 G G A A 0 0 C C G G C C T T 0 0 T T C C A A G G 0 0 C C A A A A G G C C C C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

And when I do the quality control

 plink --bfile 1001genomes_snp-short-indel_only_ACGTN --maf 0.05 --geno 0.02 --mind 0.02 --hwe 1e-6 --make-bed --out snp

it showed "Error: All people removed due to missing genotype data (--mind)". Does anyone know the reason? Did I choose the wrong dataset, or did I make a mistake somewhere? Thanks a lot.

SNP plink vcf gwas • 1.4k views

Please use the formatting bar (10101) to highlight code and data examples.


I agree with ATpoint: this would make your example more readable. Also, it would be helpful if you posted the corresponding line of the vcf so that we can see whether there was a problem in the conversion.


Hi Fabio, I have adjusted the format. Thanks for your suggestion. Do you know the reason for my problem? Thank you very much.

Shawn


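The reason is that you have too many missing genotypes (presumably all the zeros, which is how plink writes a missing allele in a ped file). With --mind 0.02, any individual with more than 2% missing genotypes is removed, and here every individual exceeds that. How much missing data is there in the vcf itself? It could be a problem in the conversion, or the vcf may simply contain a lot of missing data.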
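
To check how far each individual is from the --mind cutoff, you could count the zeros per line of the ped file. Below is a minimal sketch with awk, assuming a standard whitespace-delimited ped layout (six leading columns, then one allele pair per variant, with 0 as the missing code). The function name ped_missing_rate and the file name in the usage comment are made up for illustration.

```shell
#!/usr/bin/env bash
# Per-sample missing-genotype rate from a PLINK .ped file.
# Columns 1-6 are FID, IID, father, mother, sex, phenotype;
# every following pair of columns is one genotype, with 0 = missing allele.
ped_missing_rate() {
    awk '{
        missing = 0; total = 0
        # Walk the allele pairs, starting at column 7.
        for (i = 7; i < NF; i += 2) {
            total++
            if ($i == "0" || $(i + 1) == "0") missing++
        }
        # Output: IID, missing genotypes, total genotypes, missing fraction.
        if (total > 0)
            printf "%s\t%d\t%d\t%.4f\n", $2, missing, total, missing / total
    }' "$1"
}

# Usage (hypothetical file name):
# ped_missing_rate mydata.ped
```

Any individual whose last column is above 0.02 would be dropped by --mind 0.02. plink can also report this directly: running it with --missing on the same input writes per-individual rates to a .imiss file and per-variant rates to a .lmiss file.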
