How To Deal With Missing (NA) Data In PLINK
10.4 years ago

I have a dataset like this:

1 1 0 0 1  1  A A  G T
2 1 0 0 1  1  A C  T G
3 1 0 0 1  1  0 0  G G
4 1 0 0 1  2  A C  T T
5 1 0 0 1  2  C C  G T
6 1 0 0 1  2  C C  T T
.ped

1 snp1 0 1
1 snp2 0 2
.map

I used the --recodeA option to convert them to:

FID IID PAT MAT SEX PHENOTYPE snp1_A snp2_G
1 1 0 0 1 1 2 1
2 1 0 0 1 1 1 1
3 1 0 0 1 1 NA 2
4 1 0 0 1 2 1 0
5 1 0 0 1 2 0 1
6 1 0 0 1 2 0 0
.raw

There are NA values in my data, but they are not allowed in my analysis. How can I deal with them in PLINK?

Thank you.

plink snp • 6.9k views

Please clarify: why do you need to convert it to raw (recodeA) format? Are you going to use PLINK for the analysis? If yes, why convert at all?


Because I am fitting a linear regression with a model that does not allow NA (missing genotypes), I have to convert them to some other value. Someone told me that PLINK can handle NA (missing genotypes); I looked for a way but could not get it to work. Because my data come from an experiment, I can't simply recode NA to an arbitrary value.


It is still not clear why you need to convert to raw format. You could just run plink --file mydata --linear on the original PED/MAP files. See: Plink - Linear and logistic models


Sorry, it is a different model, group lasso, and it does not allow NA. Before I can use it, I have to convert my data (50kb SNP data, coded with A/T/C/G) to 0, 1, 2. Because there are 0 0 genotypes in my original data, NA appears in the new data after conversion.
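To make the conversion described here concrete, the following is a minimal Python sketch of additive (0/1/2) recoding applied to the six .ped rows from the question. It is not PLINK's own code: the helper name `count_allele` is hypothetical, and the counted alleles (A for snp1, G for snp2) are simply read off the .raw header in the question.

```python
def count_allele(a1, a2, counted):
    """Return how many copies of `counted` a genotype carries, or None if missing."""
    # "0" is the missing-allele code used in the .ped rows of the question.
    if a1 == "0" or a2 == "0":
        return None
    return (a1 == counted) + (a2 == counted)

# Genotype columns (snp1 allele pair, snp2 allele pair) copied from the .ped rows.
ped_genotypes = [
    ("A", "A", "G", "T"),
    ("A", "C", "T", "G"),
    ("0", "0", "G", "G"),
    ("A", "C", "T", "T"),
    ("C", "C", "G", "T"),
    ("C", "C", "T", "T"),
]

# Count A for snp1 and G for snp2, as the header snp1_A / snp2_G suggests.
recoded = [
    (count_allele(g[0], g[1], "A"), count_allele(g[2], g[3], "G"))
    for g in ped_genotypes
]

for row in recoded:
    print(["NA" if v is None else v for v in row])
```

The third sample has a 0 0 genotype at snp1, so its count is undefined and is reported as NA, which matches the .raw output shown in the question.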


0 0 means no call; when converted to raw, it becomes NA (not available). These samples need to be excluded from the analysis. In R, to exclude samples: snp1_A <- my.raw[!is.na(my.raw$snp1_A), "snp1_A"]
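The R one-liner keeps only the non-missing values of a single SNP column. As a rough sketch of the same idea applied to whole samples, the Python below drops every row of the .raw table that has NA in any SNP column. The function names and the assumption that the first six columns are FID, IID, PAT, MAT, SEX and PHENOTYPE are taken from the example in the question, not from any PLINK API.

```python
import io

# The .raw content from the question, embedded so the sketch is self-contained.
RAW_TEXT = """\
FID IID PAT MAT SEX PHENOTYPE snp1_A snp2_G
1 1 0 0 1 1 2 1
2 1 0 0 1 1 1 1
3 1 0 0 1 1 NA 2
4 1 0 0 1 2 1 0
5 1 0 0 1 2 0 1
6 1 0 0 1 2 0 0
"""

def read_raw(handle):
    """Read a whitespace-delimited .raw table into a header list and row dicts."""
    header = handle.readline().split()
    rows = [dict(zip(header, line.split())) for line in handle if line.strip()]
    return header, rows

def drop_incomplete(header, rows, n_fixed=6):
    """Keep only samples with no NA in any SNP column (columns after the first six)."""
    snp_cols = header[n_fixed:]
    return [r for r in rows if all(r[c] != "NA" for c in snp_cols)]

header, rows = read_raw(io.StringIO(RAW_TEXT))
complete = drop_incomplete(header, rows)

print(len(rows), "samples read,", len(complete), "kept")
print("FIDs kept:", [r["FID"] for r in complete])
```

With many SNPs this kind of complete-case filter can discard most samples, because one missing call anywhere removes the whole row.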


I have tried it, but my model is a function that has already been designed. What I need to do is pass my data as x (a matrix that includes the recoded missing values); with your method, the data would no longer be intact.


Is there a method in PLINK that can fill in the NA values based on the other SNPs, so that the error will be lower?
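For illustration only, here is a minimal Python sketch of the simplest way to fill NA so that the matrix stays intact: replace each missing value with the mean of the observed values for the same SNP. This is a swapped-in technique, not a PLINK feature, and it does not use the other SNPs as the question asks; imputation based on neighbouring SNPs needs a dedicated method. The function name `impute_column_mean` is hypothetical, and the matrix is the genotype part of the .raw file from the question.

```python
def impute_column_mean(matrix):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(matrix[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in matrix if row[j] is not None]
        if not observed:
            # A SNP with no observed calls cannot be mean-imputed.
            raise ValueError(f"column {j} has no observed genotypes")
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_cols)]
        for row in matrix
    ]

# Columns snp1_A and snp2_G from the .raw file; None stands for NA.
genotypes = [
    [2, 1],
    [1, 1],
    [None, 2],
    [1, 0],
    [0, 1],
    [0, 0],
]

filled = impute_column_mean(genotypes)
print(filled[2])
```

The filled value is a fractional dosage rather than 0, 1 or 2, so this only fits models that accept non-integer genotype values; whether that is acceptable for a given group-lasso implementation is an assumption to check.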


Open the file with Notepad and replace NA with whatever you like.


Because my data are real and my genotypes are coded 0, 1, 2, I can't just code NA (missing genotypes) as whatever I like.
