Extract Out The Row Containing Accession Number
10.9 years ago

I have accession numbers like this

AT2G41490
AT3G57220
AT2G39630
AT4G16710
AT4G18230
AT1G20575
AT1G74340
AT1G48140
AT1G16570
AT1G78800
AT2G40190
AT5G07630
AT2G47760
AT1G16900
AT1G02145
AT5G38460
AT2G44660

and my table looks like this:

APK_ORTHOMCL3128    10    7    arabidopsis brachy grapevine maize poplar rice sorghum    AT2G41490 AT3G57220 Bradi1g18800 GRMZM2G137707 GRMZM5G827205 GSVIVG00033552001 LOC_Os07g46640 POPTR_0006s04420 POPTR_0016s04160 Sb02g041960
APK_ORTHOMCL3129    10    7    arabidopsis brachy grapevine maize poplar rice sorghum    AT2G42130 AT3G58010 Bradi5g25350 GRMZM2G124466 GRMZM2G371944 GSVIVG00033499001 LOC_Os04g57020 POPTR_0006s20660 POPTR_0016s04520 Sb06g032010
APK_ORTHOMCL3130    10    6    arabidopsis brachy maize poplar rice sorghum    AT2G42310 AT3G57785 Bradi4g33650 GRMZM2G132748 GRMZM2G137139 LOC_Os09g31260 POPTR_0006s05570 POPTR_0016s05190 Sb02g028260 Sb07g027870

I want to extract all rows containing these accession numbers. The following Perl script works fine for a single accession number, but I want to modify it for batch queries. Please help me.

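    #!/usr/bin/perl
    use strict;
    use warnings;

    my $filename = "apk.txt";

    # Three-argument open with a lexical filehandle.
    open(my $fh, '<', $filename) or die "Cannot read the file $filename: $!\n";

    # Print every row that contains the accession number.
    while (my $line = <$fh>) {
        print $line if $line =~ /AT2G41490/;
    }
    close($fh);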

In addition to Pierre's answer, check this question: Extract according to row.
Also, your question is more about programming than bioinformatics, so for similar coding problems I suggest asking for help at http://stackoverflow.com/ (for example: http://stackoverflow.com/questions/11490036/fast-alternative-to-grep-f).


Thanks to everyone for the answers and suggestions. Since I am a Windows lover, I found another easy solution: I used the FINDSTR command to do this job.

10.9 years ago

The answer is not in Perl but in sort and join.

Duplicate of: How to join text files by a certain column (accession id)

10.9 years ago
Vivek ★ 2.7k

Can't a simple grep do the job instead of a perl script?

    for acc_no in $(cat accessions.txt); do grep -w "$acc_no" table.txt; done

Unless table.txt is very small, you'd be re-reading that file once for every accession number. Setting up a hash/lookup table with Perl/Python/etc. is usually within reach of modern workstations and will find the matching rows much, much faster.
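As a rough sketch of the lookup-table idea, here is a single pass with awk, used in place of Perl or Python; its associative array plays the role of the hash. The file names accessions.txt and table.txt follow the grep example above, the sample rows are shortened from the question, and the workdir and matches variables are only there for illustration. It assumes the table is whitespace-separated.

```shell
# Sample inputs, shortened from the question (file names are assumptions).
workdir=$(mktemp -d)
printf '%s\n' AT2G41490 AT3G57785 > "$workdir/accessions.txt"
printf '%s\n' \
  'APK_ORTHOMCL3128 10 7 arabidopsis brachy AT2G41490 AT3G57220 Bradi1g18800' \
  'APK_ORTHOMCL3129 10 7 arabidopsis brachy AT2G42130 AT3G58010 Bradi5g25350' \
  'APK_ORTHOMCL3130 10 6 arabidopsis brachy AT2G42310 AT3G57785 Bradi4g33650' \
  > "$workdir/table.txt"

# First file: store each accession as a key of the array "wanted".
# Second file: print a row as soon as one of its fields is a stored key.
matches=$(awk '
  NR == FNR { wanted[$1]; next }
  { for (i = 1; i <= NF; i++) if ($i in wanted) { print; next } }
' "$workdir/accessions.txt" "$workdir/table.txt")

printf '%s\n' "$matches"
```

Each file is read only once, so the cost grows with the size of the table plus the size of the accession list rather than with their product.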


A solution would be grep -w -F -f accessions.txt table.txt
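For anyone unsure what the three flags do together: -f reads one pattern per line from a file, -F treats each pattern as a fixed string instead of a regular expression, and -w only accepts whole-word matches. A small demonstration, assuming made-up sample files (the second ID is deliberately truncated to show the effect of -w; the demo, whole and loose names are just for illustration):

```shell
# Made-up sample files based on the question's data.
demo=$(mktemp -d)
printf '%s\n' AT2G41490 AT3G5778 > "$demo/accessions.txt"   # AT3G5778 is a truncated ID
printf '%s\n' \
  'APK_ORTHOMCL3128 10 7 AT2G41490 AT3G57220 Bradi1g18800' \
  'APK_ORTHOMCL3129 10 7 AT2G42130 AT3G58010 Bradi5g25350' \
  'APK_ORTHOMCL3130 10 6 AT2G42310 AT3G57785 Bradi4g33650' \
  > "$demo/table.txt"

# With -w the truncated ID does not match inside AT3G57785.
whole=$(grep -w -F -f "$demo/accessions.txt" "$demo/table.txt")

# Without -w it matches as a substring, so an extra row is reported.
loose=$(grep -F -f "$demo/accessions.txt" "$demo/table.txt")

printf 'with -w:\n%s\n\nwithout -w:\n%s\n' "$whole" "$loose"
```

Keeping -w matters for IDs like these, where one accession can be a prefix of another.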


Well, I'm most likely lazy enough to use a one-liner.
