How to remove highly correlated SNPs in R
8.2 years ago
zwang10 ▴ 30

Hello all!

I have a data set (a matrix) for a gene: each row represents an individual, and each column holds a genotype score (0, 1, 2) for one SNP. How can I remove the highly correlated (r ≥ 0.8) SNPs?

I tried using SNPRelate, but it needs a GDS file, and my matrix has no column names.

LD SNP • 3.0k views
8.2 years ago

Not an R solution, but you could try pruning your data based on LD in PLINK (see PLINK LD Prune). You would need to convert your matrix into PLINK files first.
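
For intuition about what correlation-based pruning does, here is a minimal sketch in plain Python (the same logic ports directly to R with `cor()`). It is not PLINK's algorithm, which works in sliding windows on r²; it is a simple greedy pass that keeps a SNP only if its |r| with every SNP already kept is below the threshold. The function names `pearson_r` and `prune_correlated` and the toy matrix are hypothetical, and the sketch assumes there are no missing genotypes.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Monomorphic SNP: r is undefined; treated as uncorrelated (assumption).
        return 0.0
    return cov / sqrt(var_x * var_y)


def prune_correlated(genotypes, threshold=0.8):
    """Return the column indices kept after greedy pruning.

    genotypes: list of rows (individuals), each a list of 0/1/2 scores.
    A column is kept only if |r| with every already-kept column is
    strictly below the threshold.
    """
    columns = list(zip(*genotypes))  # transpose: one tuple per SNP
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson_r(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: 6 individuals x 4 SNPs; the second SNP duplicates the first.
genotypes = [
    [0, 0, 2, 1],
    [1, 1, 0, 0],
    [2, 2, 1, 2],
    [0, 0, 1, 0],
    [1, 1, 2, 2],
    [2, 2, 0, 1],
]
kept = prune_correlated(genotypes, threshold=0.8)
pruned = [[row[j] for j in kept] for row in genotypes]
```

Because the pass is greedy, the result depends on column order: of two highly correlated SNPs, the one that appears first is kept. Note also that PLINK thresholds are on r², so r = 0.8 corresponds to r² = 0.64.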


Can you tell me which tool can convert a matrix into PLINK files?


You have a few options. With a little manipulation in R you could easily generate a map and a ped file (file specs are here). Another option would be to use the R package snpStats to (i) convert your matrix to a snpStats object and (ii) write out PLINK files from R.
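
To make the first option concrete, here is a rough sketch of the manipulation, written in Python for illustration (in R the same thing can be done with `write.table`). Since the matrix has no column names, everything not in the matrix is a placeholder assumption: the SNP IDs (`snp1`, `snp2`, ...), the chromosome `1`, the positions 1, 2, ..., the family and individual IDs, and the allele letters `A`/`B`. It also assumes that a score counts copies of allele `B` and that there are no missing genotypes. The function name `write_ped_map` is hypothetical.

```python
import os
import tempfile

# Assumed coding: the 0/1/2 score counts copies of placeholder allele "B".
ALLELES = {0: ("A", "A"), 1: ("A", "B"), 2: ("B", "B")}


def write_ped_map(genotypes, prefix, chrom="1"):
    """Write <prefix>.ped and <prefix>.map from a 0/1/2 genotype matrix."""
    n_snps = len(genotypes[0])
    with open(prefix + ".map", "w") as map_file:
        for j in range(n_snps):
            # Columns: chromosome, variant ID, genetic distance (0 = unknown),
            # base-pair position (placeholder: column order).
            map_file.write(f"{chrom}\tsnp{j + 1}\t0\t{j + 1}\n")
    with open(prefix + ".ped", "w") as ped_file:
        for i, row in enumerate(genotypes):
            # Columns: family ID, individual ID, father, mother,
            # sex (0 = unknown), phenotype (-9 = missing).
            fields = [f"fam{i + 1}", f"ind{i + 1}", "0", "0", "0", "-9"]
            for score in row:
                fields.extend(ALLELES[score])  # two alleles per SNP
            ped_file.write(" ".join(fields) + "\n")


# Toy matrix: 2 individuals x 3 SNPs, written to a temporary directory.
genotypes = [[0, 1, 2], [2, 0, 1]]
prefix = os.path.join(tempfile.mkdtemp(), "gene")
write_ped_map(genotypes, prefix)

with open(prefix + ".ped") as f:
    ped_lines = f.read().splitlines()
with open(prefix + ".map") as f:
    map_lines = f.read().splitlines()
```

With those files, something like `plink --file gene --indep-pairwise 50 5 0.64` should do the pruning (window of 50 variants, step of 5, r² threshold of 0.64, i.e. r = 0.8). Because the positions are placeholders, stick to windows defined as a variant count, not in kb.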
