Triallelic variant warnings when loading a file in Plink, but there shouldn't be any
8.4 years ago
jwhite ▴ 10

Hi - I'm new to Plink and am trying to read in a transposed fileset so that I can convert it to bed/bim/fam, but I keep getting a message that says "Note: Variant ####### is triallelic. Setting rarest alleles to missing." There are many such lines, with ####### ranging from 0 to 497197. Plink still creates files with the extensions .temporary.bed.tmp, .temporary.bim, and .temporary.fam.

So my problem is that I can't figure out why it thinks they're all triallelic when I can look at my .tped file and see that they aren't (or at least don't appear so to me). Does anyone have any suggestions?

My tped input looks like this (just with 497198 SNPs):

12    rs1000000    0    126890980    A A    B B    B B    B B
4    rs10000023    0    95733906    B B    B A    B A    B A
4    rs10000030    0    103374154    B B    B B    B B    A B
4    rs10000041    0    165621955    A A    B B    B B    B A
4    rs10000042    0    5237152    B B    B B    B B    B B

My plink command looks like this:

./plink --tfile myfile --recode --out myfile

And the return I get from plink looks like this:

Note: Variant XXXXXX is triallelic.  Setting rarest alleles to missing.

In addition, I'm getting the following error (on the very last line), but I thought the centimorgan position was allowed to be 0:

Error: Invalid centimorgan position on line 2 of .tped file

Any help would be much appreciated - thanks in advance!


Hi,

Can you send me your .tfam file and enough lines from your .tped to recreate the problems, and the .log file you get? I'll investigate this tomorrow.


Hi,

This is my first post on the site, and I'm not sure how to upload files or send you a pm - would you mind letting me know how to best send these files to you?

Really sorry for the trouble!


You can send the files as attachments to an email to chrchang at alumni.caltech.edu, or make a post on the Google group.


Try dos2unix'ing your files:

% dos2unix myfile.tped
% dos2unix myfile.tfam

This will convert the line-break characters to Unix-style ones. That might solve your problems.
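If you want to see whether the files have Windows-style line endings before converting them, a minimal Python sketch along these lines could help. The function name `count_crlf_lines` is made up for illustration; it simply counts lines that end in a carriage return plus newline.

```python
def count_crlf_lines(binary_lines):
    """Count lines ending in CR+LF in an iterable of bytes lines.

    Pass a file opened in binary mode ("rb") so Python does not
    translate the line endings before we can inspect them.
    """
    return sum(1 for line in binary_lines if line.endswith(b"\r\n"))

# Example usage (file name taken from the question, adjust as needed):
# with open("myfile.tped", "rb") as fh:
#     print(count_crlf_lines(fh), "lines with Windows-style endings")
```

A count of zero suggests that dos2unix would not change anything in that file.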


Just tried your suggestion and got the same result :( - thanks though!

8.4 years ago

The .tped loader did not properly detect when a line didn't have enough genotypes for the number of samples in the .tfam file. This error is properly reported in the Nov 23 development build.

You now need to check why your .tfam has more lines than there are genotype pairs in the .tped.
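One way to run that check yourself is sketched below in Python. This is not part of Plink; the function names are hypothetical, and it assumes the usual layout of one sample per .tfam line and four leading columns (chromosome, variant ID, centimorgan position, base-pair position) on each .tped line, followed by two alleles per sample.

```python
def count_samples(tfam_lines):
    """Number of non-blank lines in a .tfam file (one line per sample)."""
    return sum(1 for line in tfam_lines if line.strip())


def mismatched_lines(tped_lines, n_samples):
    """Return (line_number, allele_count) for each .tped line whose
    number of allele fields differs from 2 * n_samples.

    Line numbers are 1-based. Blank lines are skipped.
    """
    expected = 2 * n_samples
    bad = []
    for line_no, line in enumerate(tped_lines, start=1):
        fields = line.split()
        if not fields:
            continue
        n_alleles = len(fields) - 4  # drop chrom, ID, cM, bp columns
        if n_alleles != expected:
            bad.append((line_no, n_alleles))
    return bad


def check_fileset(prefix):
    """Compare <prefix>.tfam against <prefix>.tped and print mismatches."""
    with open(prefix + ".tfam") as fh:
        n_samples = count_samples(fh)
    with open(prefix + ".tped") as fh:
        bad = mismatched_lines(fh, n_samples)
    print(n_samples, "samples;", len(bad), "mismatched .tped lines")
    for line_no, n_alleles in bad[:10]:
        print("  line", line_no, "has", n_alleles, "alleles")
```

Calling `check_fileset("myfile")` would report the sample count and the first few .tped lines whose allele count does not match it.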


WOW! What a silly mistake on my part!

I made the number of lines in the .tfam file match the number of genotypes per line in the .tped file and bam - running smoothly.

Thanks for your help!

2.1 years ago
Zhitian Wu ▴ 60

I also want to add that this warning is tricky: it treats the first variant of your .tped as "Variant 0", not "Variant 1".

So sometimes there might really be a triallelic variant in your data, but you are looking at the wrong line (the line number should be Variant xxx + 1).

It took me an hour to find this...
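To avoid the off-by-one confusion, a small Python sketch like the following can list candidate triallelic variants with both numbers side by side. The function name is made up, it is not how Plink itself counts alleles, and it assumes "0" is the missing-allele code and that the .tped has no blank lines (otherwise the index may not line up with Plink's).

```python
def triallelic_variants(tped_lines, missing="0"):
    """Find .tped lines with more than two distinct non-missing alleles.

    Returns tuples of (zero_based_index, one_based_line_number,
    variant_id, sorted_alleles), so the index in the warning and the
    line number in a text editor can be compared directly.
    """
    hits = []
    for index, line in enumerate(tped_lines):
        fields = line.split()
        if len(fields) <= 4:
            continue  # no genotype columns on this line
        alleles = set(fields[4:]) - {missing}
        if len(alleles) > 2:
            hits.append((index, index + 1, fields[1], sorted(alleles)))
    return hits

# Example usage (file name is an assumption):
# with open("myfile.tped") as fh:
#     for hit in triallelic_variants(fh):
#         print(hit)
```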
