I am trying to sum certain numbers in a column using awk. I would like to sum just column 3 of the “smiths” to get a total of 212. I can sum the whole column using awk but not just the “smiths”. I have:
awk 'BEGIN {FS = "|"} ; {sum+=$3} END {print sum}' filename.txt
Also I am using putty. Thank you for any help.
smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10
Answers:
Method 1
awk -F '|' '$1 ~ /smiths/ {sum += $3} END {print sum}' inputfilename
- The -F flag sets the field separator; I put it in single quotes because it is a special shell character.
- Then $1 ~ /smiths/ applies the following {code block} only to lines where the first field matches the regex /smiths/.
- The rest is the same as your code.
Note that since you’re not really using a regex here, just a specific value, you could just as easily use:
awk -F '|' '$1 == "smiths" {sum += $3} END {print sum}' inputfilename
Which checks string equality. This is equivalent to using the regex /^smiths$/, as mentioned in another answer, which includes the ^ anchor to only match the start of the string (the start of field 1) and the $ anchor to only match the end of the string. Not sure how familiar you are with regexes. They are very powerful, but for this case you could use a string equality check just as easily.
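The difference between the unanchored regex and the equality check only shows up when another name contains "smiths". A small sketch, assuming a hypothetical extra user smithson that is not in the original data:

```shell
# Sample data from the question plus one hypothetical line (smithson).
data='smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10
smithson|Login|5'

# Unanchored regex: matches any first field containing "smiths".
regex_total=$(printf '%s\n' "$data" | awk -F '|' '$1 ~ /smiths/ {sum += $3} END {print sum}')

# String equality: matches only a first field that is exactly "smiths".
exact_total=$(printf '%s\n' "$data" | awk -F '|' '$1 == "smiths" {sum += $3} END {print sum}')

echo "regex: $regex_total, exact: $exact_total"
```

With this made-up input the regex version also counts the smithson line, while the equality check gives 212.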
Method 2
Another approach is to use awk associative arrays. This line produces the desired output:
awk -F '|' '{a[$1] += $3} END{print a["smiths"]}' filename.txt
As a side effect, the array stores all other values:
awk -F '|' '{a[$1] += $3} END{for (i in a) print i, a[i]}' filename.txt
Output:
smiths 212
denniss 100
olivert 10
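The order in which for (i in a) visits the keys is not specified by awk, so the lines above may come out in a different order on another implementation. If a stable order matters, one option is to pipe the result through sort. A minimal sketch, using the sample data from the question inline instead of filename.txt:

```shell
# Sample data from the question, kept inline so the example is self-contained.
data='smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10'

# Sum column 3 per user, then sort by name for a predictable order.
# LC_ALL=C keeps the sort order independent of the current locale.
totals=$(printf '%s\n' "$data" \
  | awk -F '|' '{a[$1] += $3} END {for (i in a) print i, a[i]}' \
  | LC_ALL=C sort)

echo "$totals"
```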
Method 3
Very good so far. All you need to do is add a selector before the block that adds to the sum. Here we check that the first field is exactly “smiths”:
awk 'BEGIN {FS = "|"} ; $1 ~ /^smiths$/ {sum+=$3} END {print sum}' filename.txt
You could shorten this a bit by specifying the field separator as an option. In awk it’s generally a good idea to initialize variables on the command line:
awk -F'|' '$1 ~ /^smiths$/ {sum+=$3} END {print sum}' filename.txt
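One way to read "initialize variables on the command line" is awk's -v option, which also lets you pass the name to look for instead of hard-coding it. A sketch under that assumption; the variable names user and name are only illustrative:

```shell
# Sample data from the question, kept inline so the example is self-contained.
data='smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10'

# Shell variable holding the user to total (illustrative name).
user=smiths

# -v name=... sets the awk variable "name" before any input is read.
total=$(printf '%s\n' "$data" \
  | awk -F'|' -v name="$user" '$1 == name {sum += $3} END {print sum}')

echo "$total"
```

Changing user to olivert or denniss gives that user's total without editing the awk program.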
Method 4
I personally would prefer to keep the awk section as simple as possible and do as much as you can without it. Commingled logic doesn’t take advantage of the power of Unix pipelines and is thus harder to understand, debug or modify for closely related use cases.
grep '^smiths|' filename.txt | perl -pe 's{.*\|}{}' | awk '{sum+=$1} END {print sum}'
Method 5
cat filename.txt | grep smiths | awk -F '|' '{sum+=$NF} END {print sum}'
-F option to specify separator. $NF is for “last column”.
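Note that grep smiths keeps any line containing "smiths" anywhere, not only in the first field. A sketch of the difference, assuming a hypothetical extra line (not in the original data) where the name appears in the second field:

```shell
# Sample data from the question plus one hypothetical line mentioning smiths
# in the second field.
data='smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10
olivert|Notes for smiths|7'

# Unanchored grep: the olivert line is kept too.
loose_total=$(printf '%s\n' "$data" | grep smiths \
  | awk -F '|' '{sum+=$NF} END {print sum}')

# Anchored grep: only lines whose first field is exactly smiths.
# In grep's default (basic) regex syntax a bare | is a literal character.
strict_total=$(printf '%s\n' "$data" | grep '^smiths|' \
  | awk -F '|' '{sum+=$NF} END {print sum}')

echo "loose: $loose_total, strict: $strict_total"
```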
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.