Does anyone have any suggestions for combining .ped and .map files from PLINK and transforming them into a different format? For example, the peds look like this:
#FID IID PAT MAT SEX STATUS G1 G2 G3 G4 G5 G6 G7 G8 G9 G10
12322 12322A 0 0 1 1 1 1 1 1 1 1 1 1 2 1
12322 12322B 0 0 2 0 1 1 1 1 2 2 2 1 1 2
12322 12322C 0 0 2 1 2 1 1 1 1 1 1 1 1 1
and the maps look like this:
#CHR G GD BP
1 1_135195_A/G 0 135195
1 1_135203_G/A 0 135203
1 1_136596_GGGG/- 0 136596
1 1_136604_G/C 0 136604
1 1_136619_G/A 0 136619
1 1_136620_C/T 0 136620
1 1_136635_T/G 0 136635
1 1_136645_G/- 0 136645
1 1_136652_A/G 0 136652
1 1_136779_G/A 0 136779
And what I'd like is this:
1_135195_A/G 1_135203_G/A 1_136596_GGGG/- 1_136604_G/C 1_136619_G/A 1_136620_C/T 1_136635_T/G 1_136645_G/- 1_136652_A/G 1_136779_G/A STATUS
1 1 1 1 1 1 1 1 2 1 1
1 1 1 1 2 2 2 1 1 2 0
2 1 1 1 1 1 1 1 1 1 1
Here the 2nd column of the 2nd file becomes the header of the 3rd file, and the 6th column of the 1st file becomes the final column of the 3rd file.
I should note there are thousands of columns in the first file and thousands of rows in the second file. How can I extract multiple vectors without listing the columns/rows individually?
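You can transpose and slice: transpose the 2nd column of the .map file into a header row, then slice the genotype columns out of the .ped file by range instead of naming them one by one. You could also try Python. It is more flexible, but you'll have to spend more time on the logic.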
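If you do go the Python route, here is a minimal sketch of the transpose-and-slice idea. It assumes both files are whitespace-delimited, that header lines start with `#`, and that the .ped has exactly six fixed columns followed by one genotype column per .map row, as in your example (a standard PLINK .ped usually carries two allele columns per variant, so that case would need different handling). The names `read_rows`, `combine`, `PED_SAMPLE` and `MAP_SAMPLE` are made up for illustration, and the sample data is a trimmed-down copy of the rows in the question.

```python
def read_rows(lines):
    """Split whitespace-delimited lines, skipping blank lines and '#' headers."""
    return [ln.split() for ln in lines
            if ln.strip() and not ln.lstrip().startswith("#")]


def combine(ped_lines, map_lines, n_fixed=6, status_col=5):
    """Build the output table: marker names as header, STATUS as last column.

    n_fixed    -- number of leading non-genotype columns in the .ped (assumed 6)
    status_col -- 0-based index of the STATUS column (6th column -> 5)
    """
    # "Transpose": the 2nd column of the .map rows becomes one header row.
    header = [row[1] for row in read_rows(map_lines)] + ["STATUS"]
    table = [header]
    for row in read_rows(ped_lines):
        # "Slice": take every genotype column at once, however many there are.
        genotypes = row[n_fixed:]
        if len(genotypes) != len(header) - 1:
            raise ValueError(
                f"{row[1]}: {len(genotypes)} genotype columns, "
                f"but {len(header) - 1} markers in the map"
            )
        table.append(genotypes + [row[status_col]])
    return table


# Trimmed-down sample: two individuals, three markers.
PED_SAMPLE = [
    "#FID IID PAT MAT SEX STATUS G1 G2 G3",
    "12322 12322A 0 0 1 1 1 1 1",
    "12322 12322B 0 0 2 0 1 1 1",
]
MAP_SAMPLE = [
    "#CHR G GD BP",
    "1 1_135195_A/G 0 135195",
    "1 1_135203_G/A 0 135203",
    "1 1_136596_GGGG/- 0 136596",
]

table = combine(PED_SAMPLE, MAP_SAMPLE)
for out_row in table:
    print(" ".join(out_row))
```

Because `combine` only iterates over lines, you should be able to pass open file handles in place of the sample lists, and the slice `row[n_fixed:]` picks up all the genotype columns without you listing them.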