Extract Common Line Out Of Multiple Files
11.3 years ago
macmath ▴ 170

I need to extract the list of organisms common to several files, based on the second tab-separated column.

Example input:

File A:

  • YP_001173296 Pseudomonas stutzeri A1501
  • ZP_10847128 Pseudomonas fragi A22
  • YP_006325640 Pseudomonas fluorescens A506
  • ZP_08518919 Aeromonas caviae Ae398
  • ZP_10474222 Rickettsiella grylli
  • EKB21198 Aeromonas veronii AMC34

File B:

  • P_02062827 Rickettsiella grylli
  • YP_004473601 Pseudomonas fulva 12-X
  • ZP_10438680 Pseudomonas extremaustralis 14-3 substr. 14-3b
  • YP_528004 Aeromonas caviae Ae398
  • ZP_11138829 Gallaecimonas xiamenensis 3-C-1
  • YP_52800475 Pseudomonas stutzeri A1501

File C:

  • P_02062827 Pseudomonas extremaustralis
  • YP_004473601 Pseudomonas fulva 12-X
  • ZP_10438680 Pseudomonas extremaustralis 14-3 substr. 14-3b
  • YP_528004 Aeromonas caviae Ae398
  • ZP_11138829 Rickettsiella grylli
  • YP_52800475 Pseudomonas stutzeri A1501

Expected result file (organisms common to all files):

  • Aeromonas caviae Ae398
  • Pseudomonas stutzeri A1501
  • Rickettsiella grylli
awk comparison • 3.6k views
11.3 years ago

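Cut the 2nd column from every file, sort, count the occurrences of each organism, and keep the lines with a count of 3 (one per file). Tab is the default delimiter for cut, so no -d option is needed. The final cut -c 9- strips the 7-character count and the space that GNU uniq -c puts in front of each line.

   cut -f 2 file*.tsv | sort | uniq -c | grep -E '^      3 ' | cut -c 9-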
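
As a quick check, the pipeline can be run against the three example files from the question. This is only a minimal sketch: the file names fileA.tsv, fileB.tsv and fileC.tsv and the scratch directory are assumptions made for illustration, and awk is swapped in for the grep / cut -c pair so that the result does not depend on how uniq -c pads its counts.

```shell
# Work in a scratch directory (hypothetical layout, for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the example files as: accession <TAB> organism.
# printf reuses the format string for every pair of arguments.
printf '%s\t%s\n' \
  YP_001173296 'Pseudomonas stutzeri A1501' \
  ZP_10847128  'Pseudomonas fragi A22' \
  YP_006325640 'Pseudomonas fluorescens A506' \
  ZP_08518919  'Aeromonas caviae Ae398' \
  ZP_10474222  'Rickettsiella grylli' \
  EKB21198     'Aeromonas veronii AMC34' > fileA.tsv

printf '%s\t%s\n' \
  P_02062827   'Rickettsiella grylli' \
  YP_004473601 'Pseudomonas fulva 12-X' \
  ZP_10438680  'Pseudomonas extremaustralis 14-3 substr. 14-3b' \
  YP_528004    'Aeromonas caviae Ae398' \
  ZP_11138829  'Gallaecimonas xiamenensis 3-C-1' \
  YP_52800475  'Pseudomonas stutzeri A1501' > fileB.tsv

printf '%s\t%s\n' \
  P_02062827   'Pseudomonas extremaustralis' \
  YP_004473601 'Pseudomonas fulva 12-X' \
  ZP_10438680  'Pseudomonas extremaustralis 14-3 substr. 14-3b' \
  YP_528004    'Aeromonas caviae Ae398' \
  ZP_11138829  'Rickettsiella grylli' \
  YP_52800475  'Pseudomonas stutzeri A1501' > fileC.tsv

# Count each organism across all files and keep those seen exactly 3 times.
# The awk step drops the leading count that uniq -c adds.
common=$(cut -f 2 file*.tsv | sort | uniq -c |
  awk '$1 == 3 { sub(/^ *[0-9]+ /, ""); print }')

printf '%s\n' "$common"
# Aeromonas caviae Ae398
# Pseudomonas stutzeri A1501
# Rickettsiella grylli
```

Note that this form only gives the right answer when no organism is repeated within a single file, since a repeat would inflate the count.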

Actually, I have 1000 files and I want the set of organisms common to all of them, using the tab-separated 2nd column. I checked this command. Please correct me if I am wrong: does the 3 stand for the three files? Also, within each file the same organism can appear more than once.


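Yes, the "3" is for the 3 files. With duplicates inside each file, deduplicate every file first and then keep the organisms whose count equals the number of files. Try:

   (for F in *.tsv; do cut -f 2 "$F" | sort | uniq: done) | sort | uniq -c | grep -E '^ *1000 ' | cut -c 9-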


I received a syntax error near unexpected token `)'


I'll let you solve that syntax error as an exercise. :-)


hehe :) thanks dude
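
For readers who hit the same error: the shell reports the unexpected `)' because the loop body ends with "uniq:" (a colon) instead of "uniq;" (a semicolon), so "done" is read as an argument to a command named "uniq:" and the for loop is never closed. Below is a minimal sketch of the corrected loop. The three tiny input files and their contents are hypothetical, sort -u stands in for sort | uniq, and the file count is computed rather than hard-coded as 1000.

```shell
# Work in a scratch directory (hypothetical layout, for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs: one.tsv lists the same organism twice.
printf '%s\t%s\n' id1 'Rickettsiella grylli' \
                  id2 'Rickettsiella grylli' \
                  id3 'Aeromonas caviae Ae398' > one.tsv
printf '%s\t%s\n' id4 'Rickettsiella grylli' \
                  id5 'Aeromonas caviae Ae398' > two.tsv
printf '%s\t%s\n' id6 'Aeromonas caviae Ae398' \
                  id7 'Pseudomonas fulva 12-X' > three.tsv

# Number of input files: an organism is common when its count equals this.
set -- *.tsv
nfiles=$#

# Deduplicate the 2nd column per file, then count across files.
# Note the loop is closed by "done", with no stray colon before it.
shared=$(
  for f in "$@"; do
    cut -f 2 "$f" | sort -u
  done | sort | uniq -c |
    awk -v n="$nfiles" '$1 == n { sub(/^ *[0-9]+ /, ""); print }'
)

printf '%s\n' "$shared"
# Aeromonas caviae Ae398
```

Without the per-file deduplication, Rickettsiella grylli would be counted three times (twice in one.tsv, once in two.tsv) and wrongly reported as common, even though it is absent from three.tsv.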
