Sum values in 5th column that correspond to same field in 2nd column

Consider the file below:

0,2,,,10
0,2,,,15
0,1,,,984
0,2,,,9
1,14,,,5

Using awk, I need to calculate the total of $5 for each distinct value of $2.

The desired output would look like this:

2,34
1,984
14,5

Answers:

Method 1

Try:

awk -F, '{a[$2]+=$5};END{for(i in a)print i","a[i]}' <file

Note that the order of array traversal with for (i in a) is unspecified in POSIX awk, so the output lines may appear in any order.
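If the output has to follow the order in which each key first appears (as in the desired output of the question), one possible approach is to record the keys in a second, numerically indexed array. This is a minimal sketch, not part of the original answer; the array names order and sum and the variable result are illustrative choices, and the sample data is fed in with printf rather than read from a file.

```shell
# Sample data from the question, piped in instead of read from a file.
result=$(printf '%s\n' '0,2,,,10' '0,2,,,15' '0,1,,,984' '0,2,,,9' '1,14,,,5' |
awk -F, '
    # Remember each key the first time it is seen (assumed helper array "order").
    !($2 in sum) { order[++n] = $2 }
    # Accumulate column 5 per value of column 2.
    { sum[$2] += $5 }
    # Walk the keys by their insertion index, not with for (i in sum).
    END { for (i = 1; i <= n; i++) print order[i] "," sum[order[i]] }
')

printf '%s\n' "$result"
```

Because the END loop iterates over a counter rather than over the array keys, the result no longer depends on the traversal order of the awk implementation.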

Method 2

With GNU datamash:

datamash -t ',' -s -g 2 sum 5 <infile

The output will be sorted by the 2nd column (as text rather than numerically, which is why 14 comes before 2):

1,984
14,5
2,34
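The listing above puts 14 before 2 because the keys are compared as text. If numeric order is preferred, one option is to pass the summed lines through sort with a numeric key. The sketch below is an assumption-laden illustration: it uses the awk command from Method 1 to produce the sums so that it runs without datamash installed, and the variable name result is hypothetical.

```shell
# Sum column 5 per value of column 2, then sort the "key,sum" lines
# numerically on the first comma-separated field.
result=$(printf '%s\n' '0,2,,,10' '0,2,,,15' '0,1,,,984' '0,2,,,9' '1,14,,,5' |
awk -F, '{ a[$2] += $5 } END { for (i in a) print i "," a[i] }' |
sort -t, -k1,1n)

printf '%s\n' "$result"
```

The same sort stage can be appended to any command that emits key,sum lines.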

Method 3

I’d be tempted to use perl:

#!/usr/bin/env perl
use strict;
use warnings;

my %things;

while (<>) {
    my ( undef, $key, @rest ) = split(/,/);
    $things{$key} += pop(@rest);
}

foreach my $key ( sort { $a <=> $b } keys %things ) {
    print "$key = $things{$key}\n";
}

You could condense that down to a one-liner if need be.

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
