PLINK error locus has more than 2 alleles
23 months ago
biogirl • 170
European Union

Hi all,

I've come across a problem in PLINK when trying to run Fisher's exact test. The command I'm using is as follows:

plink --file test --fisher --allow-no-sex --1

And the error I get is:

ERROR: Locus 1:54208 has >2 alleles

           Individual Ind3 Ind3 has genotype [ G G ] but we've already seen [ A ] and [ T ]

I've checked my file rigorously and the genotype is indeed 'GG', with no A's or T's nearby! I also have no missing data. The length of each line (i.e. for each individual) is consistent throughout. I've tried both tab- and space-delimited files, but it makes no difference. I haven't found any special characters etc. either (using vi :set list).

Interestingly, I've taken Ind3 out of the file and re-run the test, but the same error is thrown up (but now obviously on Ind4, which is now on line 3).

Any ideas?

plink gwas snps • 4.1k views
16 months ago
Brice Sarver ♦ 2.6k
United States

PLINK requires that every site be biallelic across the whole file, not just within each individual. If ANY other individual carries an allele that brings the site to more than two distinct alleles, PLINK fails.

Failing that, your file may be formatted incorrectly. From the PLINK manual:

Genotypes (column 7 onwards) should also be white-space delimited; they can be any character (e.g. 1,2,3,4 or A,C,G,T or anything else) except 0 which is, by default, the missing genotype character. All markers should be biallelic. All SNPs (whether haploid or not) must have two alleles specified. Either Both alleles should be missing (i.e. 0) or neither. No header row should be given. For example, here are two individuals typed for 3 SNPs (one row = one person):

     FAM001  1  0 0  1  2  A A  G G  A C
     FAM001  2  0 0  1  2  A A  A G  0 0

The default missing genotype character can be changed with the --missing-genotype option, for example:

plink --file mydata --missing-genotype N
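
To find which sites are affected before running PLINK, something like the following could be used. It is only a minimal sketch, assuming a standard white-space delimited PED file with six leading columns and "0" as the missing allele; the function name `find_multiallelic` and the example rows are made up for illustration and are not part of PLINK.

```python
from collections import defaultdict


def find_multiallelic(ped_lines, missing="0"):
    """Return {locus_index: sorted alleles} for loci with more than two alleles.

    locus_index is 0-based and counts genotype pairs after the six
    leading columns (family, individual, father, mother, sex, phenotype).
    """
    alleles = defaultdict(set)
    for line in ped_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        genotypes = fields[6:]
        # Walk the genotype columns two at a time: one pair per locus.
        for i in range(0, len(genotypes) - 1, 2):
            for allele in genotypes[i:i + 2]:
                if allele != missing:
                    alleles[i // 2].add(allele)
    return {locus: sorted(seen) for locus, seen in alleles.items() if len(seen) > 2}


# Hypothetical rows: each individual is fine on its own, but the first
# locus collects A, T and G once all three rows have been read.
example = [
    "Ind1 Ind1 0 0 0 1 A A G G",
    "Ind2 Ind2 0 0 0 2 T T C C",
    "Ind3 Ind3 0 0 0 1 G G C C",
]
print(find_multiallelic(example))  # {0: ['A', 'G', 'T']}
```

On a real file you would pass the open file handle instead of the list, e.g. `find_multiallelic(open("test.ped"))`, and look up the reported indices in the matching rows of the MAP file.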

Hi, sorry, perhaps I wasn't clear in my original message. My data is biallelic, for example:

Ind1 Ind1 0 0 0 1 A A G G A A T T

Ind2 Ind2 0 0 0 2 T T C C C C T T

I have followed the PLINK manual to the letter with regard to the delimiters in the file. The file encoding is correct, given that I can cut each line down to a bare minimum and PLINK runs fine. Therefore, I think the file format is OK. Or do you mean the syntax inside the file is incorrect?

I've just re-read your message and it's all come together. So what you're saying is that Ind1 might have AA at that particular locus, whilst Ind2 might have TT. So if Ind3 has CC, then it's going to fail. Thanks, I think I can work around this now.

Yep, you've got it. Glad to help.

How did you work around this? I think plink should be able to figure this out. Thanks.
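
The thread does not say what the original poster actually did. One possible workaround, offered only as a sketch, is to drop the offending loci from both the PED and the MAP file before running PLINK. The helper `drop_loci`, the marker names and the positions below are hypothetical, and the code assumes the MAP file has exactly one line per locus in the same order as the PED genotype pairs.

```python
def drop_loci(ped_lines, map_lines, drop):
    """Remove the loci whose 0-based indices are in `drop` from PED and MAP lines."""
    drop = set(drop)
    # MAP file: one row per locus, so dropping a locus means dropping its row.
    new_map = [line for i, line in enumerate(map_lines) if i not in drop]
    new_ped = []
    for line in ped_lines:
        fields = line.split()
        kept = fields[:6]  # the six leading columns are always retained
        genotypes = fields[6:]
        for locus in range(len(genotypes) // 2):
            if locus not in drop:
                kept.extend(genotypes[2 * locus:2 * locus + 2])
        new_ped.append(" ".join(kept))
    return new_ped, new_map


# Hypothetical data: locus 0 has three alleles (A, T, G) across individuals.
ped = [
    "Ind1 Ind1 0 0 0 1 A A G G",
    "Ind2 Ind2 0 0 0 2 T T C C",
    "Ind3 Ind3 0 0 0 1 G G C C",
]
map_rows = ["1 snp_a 0 54208", "1 snp_b 0 60000"]  # made-up marker names

new_ped, new_map = drop_loci(ped, map_rows, {0})
print(new_ped[0])  # Ind1 Ind1 0 0 0 1 G G
print(new_map)     # ['1 snp_b 0 60000']
```

Whether dropping the site is appropriate depends on the analysis; if the third allele turns out to be a genotyping or strand error, correcting the data is the better fix.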
