Transforming Plink Files
9.2 years ago
FFK534 • 0

Does anyone have suggestions for combining the .ped and .map files from PLINK and transforming them into a different format? For example, the .ped file looks like this:

#FID      IID        PAT     MAT     SEX     STATUS     G1     G2     G3     G4     G5     G6     G7     G8     G9     G10
12322     12322A     0       0       1       1          1      1      1      1      1      1      1      1      2      1
12322     12322B     0       0       2       0          1      1      1      1      2      2      2      1      1      2
12322     12322C     0       0       2       1          2      1      1      1      1      1      1      1      1      1

and the maps look like this:

#CHR     G                   GD     BP
1        1_135195_A/G        0      135195
1        1_135203_G/A        0      135203
1        1_136596_GGGG/-     0      136596
1        1_136604_G/C        0      136604
1        1_136619_G/A        0      136619
1        1_136620_C/T        0      136620
1        1_136635_T/G        0      136635
1        1_136645_G/-        0      136645
1        1_136652_A/G        0      136652
1        1_136779_G/A        0      136779

And what I'd like is this:

1_135195_A/G     1_135203_G/A     1_136596_GGGG/-     1_136604_G/C     1_136619_G/A     1_136620_C/T     1_136635_T/G     1_136645_G/-     1_136652_A/G     1_136779_G/A     STATUS
1                1                1                   1                1                1                1                1                2                1                1
1                1                1                   1                2                2                2                1                1                2                0
2                1                1                   1                1                1                1                1                1                1                1

That is, the 2nd column of the .map file becomes the header of the output file, and the 6th column of the .ped file becomes the final column of the output file.

R plink transform formatting • 2.2k views
9.2 years ago
Ram 43k

If you're willing to use R, read these files into data frames, extract the columns you need as vectors, and restructure them as you see fit. Also, please use only relevant tags; I don't see how SQL is relevant here.


I should note there are thousands of columns in the first file and thousands of rows in the second file. How can I extract multiple vectors without listing the columns/rows individually?


You can transpose and slice. Alternatively, try Python: it might be a bit more flexible, but you'll have to spend more time on the logic.
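As a rough illustration of slicing by position rather than naming columns one by one, here is a minimal R sketch. It assumes the layout shown in the question (six leading columns in the .ped, variant IDs in column 2 of the .map). The inline `text =` inputs are just the first three variants from the question's example, and `geno` is a name made up for this sketch.

```r
# Toy inputs copied from the question (first three variants only); real data
# would come from read.table() on the actual .ped and .map files.
ped <- read.table(text = "12322 12322A 0 0 1 1 1 1 1
12322 12322B 0 0 2 0 1 1 1
12322 12322C 0 0 2 1 2 1 1")
map <- read.table(text = "1 1_135195_A/G 0 135195
1 1_135203_G/A 0 135203
1 1_136596_GGGG/- 0 136596", stringsAsFactors = FALSE)

# Slice by position: columns 7 onward are genotypes, however many there are.
geno <- ped[, 7:ncol(ped), drop = FALSE]

# Column 2 of the map is already a vector of variant IDs, one per genotype column.
colnames(geno) <- map[[2]]

# Append the phenotype (column 6 of the ped) as the last column.
geno$STATUS <- ped[[6]]
print(geno)
```

Because the slice is written as `7:ncol(ped)`, the same lines should work unchanged with thousands of variants, as long as the .map has exactly one row per genotype column.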

9.2 years ago
FFK534 • 0

This works:

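# Read the PLINK files; read.table() skips lines starting with "#" by default
ped <- read.table("/pathtodata.ped")
map <- read.table("/pathtodata.map")

# Save the phenotype (column 6), then drop the six leading non-genotype columns
status <- ped$V6
ped <- ped[, -(1:6), drop = FALSE]

# Use the variant IDs (column 2 of the .map file) as the column names
colnames(ped) <- as.character(map$V2)
ped$STATUS <- status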
9.2 years ago
zx8754 11k

Look into the --recodeA option in PLINK. It writes a single *.raw file with one column per variant. http://pngu.mgh.harvard.edu/~purcell/plink/dataman.shtml#recode
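If you go this route, the .raw file can be read into R and trimmed in a couple of lines. The sketch below is only an assumption-laden illustration: the inline table is a made-up stand-in for a .raw file (six leading columns ending in PHENOTYPE, then one allele-count column per variant), and the variant names `rs1_A` and `rs2_G` are hypothetical. Note that .raw genotypes are allele counts (0/1/2), not the 1/2 codes shown in the question.

```r
# Made-up stand-in for a .raw file; a real one would be read from disk with
# read.table(<your .raw path>, header = TRUE, check.names = FALSE).
raw <- read.table(text = "FID IID PAT MAT SEX PHENOTYPE rs1_A rs2_G
F1 I1 0 0 1 1 0 1
F1 I2 0 0 2 2 2 NA", header = TRUE, check.names = FALSE)

# Keep the genotype columns (7 onward) and move PHENOTYPE (column 6) to the end.
out <- raw[, c(7:ncol(raw), 6)]
print(out)
```

`check.names = FALSE` keeps variant IDs as written, which matters for IDs such as `1_135195_A/G` that are not valid R names.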
