Using awk to sum the values of a column, based on the values of another column

I am trying to sum certain numbers in a column using awk. I would like to sum just column 3 of the “smiths” to get a total of 212. I can sum the whole column using awk but not just the “smiths”. I have:

awk 'BEGIN {FS = "|"} ; {sum+=$3} END {print sum}' filename.txt

Also I am using putty. Thank you for any help.

smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

awk -F '|' '$1 ~ /smiths/ {sum += $3} END {print sum}' inputfilename
  • The -F flag sets the field separator; I put it in single quotes because it is a special shell character.
  • Then $1 ~ /smiths/ applies the following {code block} only to lines where the first field matches the regex /smiths/.
  • The rest is the same as your code.

Note that since you’re not really using a regex here, just a specific value, you could just as easily use:

awk -F '|' '$1 == "smiths" {sum += $3} END {print sum}' inputfilename

Which checks string equality. This is equivalent to using the regex /^smiths$/, as mentioned in another answer, which includes the ^ anchor to only match the start of the string (the start of field 1) and the $ anchor to only match the end of the string. Not sure how familiar you are with regexes. They are very powerful, but for this case you could use a string equality check just as easily.

Method 2

Another approach is to use awk associative arrays, more info here. This line produces the desired output:

awk -F '|' '{a[$1] += $3} END{print a["smiths"]}' filename.txt

As a side effect, the array stores all other values:

awk -F '|' '{a[$1] += $3} END{for (i in a) print i, a[i]}' filename.txt

Output:

smiths 212
denniss 100
olivert 10

Method 3

Very good so far. All you need to do is add a selector before the block to add the sum. Here we check that the first argument contains only “smiths”:

awk 'BEGIN {FS = "|"} ; $1 ~ /^smiths$/ {sum+=$3} END {print sum}'

You could shorten this a bit by specifying the field separator as an option. In awk it’s generally a good idea to initialize variables on the command line:

awk -F'|' '$1 ~ /^smiths$/ {sum+=$3} END {print sum}'

Method 4

I personally would prefer to keep the awk section as simple as possible and do as much as you can without it. Comingled logic doesn’t take advantage of the power of Unix pipelines and is thus harder to understand, debug or modify for closely related use cases.

cat filename.txt | perl -pe 's{.*|}{}g' | awk '{sum+=$1} END {print sum}'

Method 5

cat filename.txt | grep smiths | awk -F '|' '{sum+=$NF} END {print sum}'
  • -F option to specify separator.
  • $NF is for “last column”.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x