How to extract SNPs from chromosomes 1-22 in one command from *.bed, *.bim and *.fam files
12.6 years ago
User 0422 ▴ 150

Hi,

I need to extract chromosomes 1-22 from SNP array data that also contains mtDNA and Y-chromosome SNPs. As far as I can tell, PLINK does this one chromosome at a time, so I would have to extract each chromosome separately and then merge the results. I need a way to do it in one command.

Alternatively, a way to remove the mt and Y SNPs from the main file would work equally well.

Please help if somebody can write the commands.

Thanks

plink snp chromosome • 12k views

Providing an example of the contents of your file will likely lead to a quick answer.

12.6 years ago

awk '{ if ($1 == 1 || $1 == 22) print $2 }' file.bim > snp.txt

plink --bfile file --extract snp.txt --make-bed --out newfile


Sorry, I misunderstood your question.

awk '{ if ($1 >= 1 && $1 <= 22) print $2 }' file.bim > snp.txt
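Here is a minimal sketch of the whole workflow on a toy .bim file, assuming the default human PLINK coding in which X, Y, XY and MT appear as 23, 24, 25 and 26. The file name toy.bim, the rs IDs and the output prefix are made up for illustration. The extra regex test is a guard: awk falls back to string comparison for non-numeric labels, so an odd label such as 1_random could otherwise slip through the range check.

```shell
# Toy .bim (hypothetical data): chromosome, SNP ID, cM, bp, allele 1, allele 2.
printf '%s\n' \
  '1 rs1 0 1000 A G' \
  '2 rs2 0 2000 C T' \
  '22 rs3 0 3000 A C' \
  '23 rs4 0 4000 G T' \
  '24 rs5 0 5000 A G' \
  '26 rs6 0 6000 C T' > toy.bim

# Keep the SNP IDs whose chromosome code is an integer between 1 and 22.
awk '$1 ~ /^[0-9]+$/ && $1 >= 1 && $1 <= 22 { print $2 }' toy.bim > snp.txt
cat snp.txt

# With matching toy.bed and toy.fam files present, the extraction step would be:
# plink --bfile toy --extract snp.txt --make-bed --out toy_autosomes
```

Newer PLINK builds (1.9 and later) also accept a chromosome range directly, for example --chr 1-22, or the --autosome flag, which may make the awk step unnecessary. Check the documentation for the version you have installed.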


This gives me only chromosomes 1 and 22, while I need 1 to 22! Thanks


And you can get all chroms like: awk '$1 ~ /^[0-9]+$/ { print $2 }' file.bim > snp.txt


^all numbered chroms^
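Note that "numbered" can include more than 1-22: PLINK by default writes X, Y, XY and MT as 23, 24, 25 and 26 in the .bim. So the other route from the question, removing the Y and mt SNPs, might look like the sketch below. The file names and rs IDs are hypothetical, and the toy file mixes numeric and letter codes only to show that both are caught.

```shell
# Toy .bim (hypothetical data) with both numeric and letter chromosome codes.
printf '%s\n' \
  '1 rs1 0 1000 A G' \
  '23 rs4 0 4000 G T' \
  '24 rs5 0 5000 A G' \
  '26 rs6 0 6000 C T' \
  'Y rs7 0 7000 A C' \
  'MT rs8 0 8000 G T' > toy_sex.bim

# Collect the IDs of Y (24 or Y) and mitochondrial (26 or MT) SNPs.
awk '$1 == 24 || $1 == 26 || $1 == "Y" || $1 == "MT" { print $2 }' toy_sex.bim > drop.txt
cat drop.txt

# With matching .bed and .fam files present, the removal step would be:
# plink --bfile toy_sex --exclude drop.txt --make-bed --out toy_noYMT
```

This keeps X (and XY) SNPs in the output, so it is not the same result as keeping only chromosomes 1-22.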


great! many thanks


thx Baboune. It helped me too.

12.6 years ago

Assuming it's a text BED file (not PLINK's binary .bed), or some other text file that has the chromosome name in the first column, it's as simple as doing

perl -ne 'print if /^chr([1-9]|1[0-9]|2[0-2])\s/' yourfile > outfile

As Aaron said, it's difficult to tell you about the others without a sample of the file format.

You really should read up on regular expressions, and the use of grep, awk, sed, or perl to do pattern matching.
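As a small illustration of that kind of pattern matching, here is a grep version on a made-up tab-separated file (toy.txt and its contents are hypothetical). The alternation spells out 1-9, 10-19 and 20-22, and the trailing whitespace class stops names like chr1_random or chr23 from matching.

```shell
# Toy text file (hypothetical data): chromosome name in column 1, tab-separated.
printf 'chr1\t100\t200\nchr10\t100\t200\nchr22\t100\t200\nchr23\t100\t200\nchrX\t100\t200\nchrM\t100\t200\nchr1_random\t100\t200\n' > toy.txt

# Keep only lines whose first column is exactly chr1 through chr22.
grep -E '^chr([1-9]|1[0-9]|2[0-2])[[:space:]]' toy.txt > autosomes.txt
cat autosomes.txt
```

Only the chr1, chr10 and chr22 lines are kept here.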
