Biostar Beta. Not for public use.
awk command to count specific field
0
Entering edit mode
2.7 years ago
@saadleeshehreen45504

Hi, I have a file with the following content. Now I want to count how many of them have 1 in the field -n2.

Bacteroides fragilis,0
Bacteroides fragilis,0
Salmonella enterica,1
Salmonella enterica,1
Salmonella enterica,1
Bacteroides fragilis,0
.................................
..................................

I used the following command :

cat f1.txt | awk {'$2 == 1'} | wc -l

But it doesn't give me the answer. Please help!

command • 547 views
ADD COMMENTlink
0
Entering edit mode

I am not able to Understand input file format. You can use following command to count number of 1 in field 2

grep -o ",1" input.txt  | wc -l
ADD REPLYlink
1
Entering edit mode
2.7 years ago
@Vijay Lakhujani26377

The weird format of your file (if indeed it is in this way) is out of anyone's understanding. But I ll explain how awk could work here provided a nicely formatted tab separated table

Consider your file (say file.txt) this way, the <tab> and <space> symbols are for representation, your actual file will have whitepspace (tabs and spaces) and corresponding positions shown in the file

Bacteroides<space>fragilis<tab>0
Bacteroides<space>fragilis<tab>0
Salmonella<space>enterica<tab>1
Salmonella<space>enterica<tab>1
Salmonella<space>enterica<tab>1
Bacteroides<space>fragilis<tab>0

Now if you say

awk '$2==1{print}' file.txt | wc -l

It may not work, because, by default the field separator which awk consider here is the first white space it encounters which in this case would be the space Bacteroides <space> fragilis

Hence, you must add a field separator -F

awk -F "\t" '$2==1{print}' file.txt | wc -l
ADD COMMENTlink
0
Entering edit mode
awk -F "," '$2==1' file.txt | wc -l
ADD REPLYlink
1
Entering edit mode
2.7 years ago
@Pierre Lindenbaum30

set the field separator 'F' to 'comma' and increase a value 'N' each time column 2 is '1'. At the end print the value of N.

awk -F, '($2==1){N++;}END{print N;}' file.txt

but I think most people would use

cut -d, -f 2 file.txt | grep -c -w 1
ADD COMMENTlink
0
Entering edit mode

Hi, Thanks. I have other related problem. My file like this:

10 Lachnoclostridium sp.   0       0       0       0       1
11 Haemophilus ducreyi     0       0       0       0       1
12 Clostridiales bacterium 0       0       0       0       1
13 Escherichia albertii    0       1       0       0       1

It has 8 fields. I want to just count the lines which value =1 in field 7 and field 8. How can I do that? I used the following, but it's not the exact output.

awk '$4 == 0; $5 == 0; $6 == 0; $7 == 1; $8 ==1' file.txt

ADD REPLYlink
0
Entering edit mode

Hi, You can use following command

awk  '$7==1 && $8==1 {print}' input.txt
ADD REPLYlink
0
Entering edit mode

or just awk '$7==1 && $8==1' input.txt

ADD REPLYlink
0
Entering edit mode

Thanks. It works for me. :)

ADD REPLYlink
0
Entering edit mode

If an answer was helpful you should upvote it, if the answer resolved your question you should mark it as accepted. Upvote|Bookmark|Accept

ADD REPLYlink
0
Entering edit mode

try this as well:

awk '$7 && $8==1'  input.txt
ADD REPLYlink
0
Entering edit mode
2.7 years ago
@cpad011220673

Though OP wants solution in awk, here is datamash solution:

Species (organism) wide 0's and 1's count:

$ datamash -s -t "," -g 1,2 count 2 < test.txt | sed 's/,/\t/g'
Bacteroides fragilis    0   3
Salmonella enterica 1   3

Only 0's and 1's count:

$ datamash -s -t "," -g 2 count 2 < test.txt | sed 's/,/\t/g'
0   3
1   3

input (from OP):

 $ cat test.txt 
Bacteroides fragilis,0
Bacteroides fragilis,0
Salmonella enterica,1
Salmonella enterica,1
Salmonella enterica,1
Bacteroides fragilis,0
ADD COMMENTlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.3