I have a list of (label, count) tuples like this:
[('grape', 100), ('grape', 3), ('apple', 15), ('apple', 10), ('apple', 4), ('banana', 3)]
From that I want to sum all values with the same label (same labels always adjacent) and return a list in the same label order:
[('grape', 103), ('apple', 29), ('banana', 3)]
I know I could solve it with something like:
def group(l):
    result = []
    if l:
        this_label = l[0][0]
        this_count = 0
        for label, count in l:
            if label != this_label:
                result.append((this_label, this_count))
                this_label = label
                this_count = 0
            this_count += count
        result.append((this_label, this_count))
    return result
But is there a more Pythonic / elegant / efficient way to do this?
Answers:
Method 1
itertools.groupby can do what you want:
import itertools
import operator
L = [('grape', 100), ('grape', 3), ('apple', 15), ('apple', 10),
     ('apple', 4), ('banana', 3)]

def accumulate(l):
    it = itertools.groupby(l, operator.itemgetter(0))
    for key, subiter in it:
        yield key, sum(item[1] for item in subiter)

print(list(accumulate(L)))
# [('grape', 103), ('apple', 29), ('banana', 3)]
Method 2
Using itertools and a list comprehension:
import itertools
[(key, sum(num for _, num in value))
for key, value in itertools.groupby(l, lambda x: x[0])]
Edit: as gnibbler pointed out, if l isn't already sorted, replace it with sorted(l). Note that sorting changes the order of the labels in the result.
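To see why the sorted(l) caveat matters, here is a minimal sketch (the helper name sum_adjacent and the scattered sample list are made up for illustration). groupby only merges runs of adjacent equal keys, so scattered labels produce several partial sums unless the input is sorted first, and sorting by label gives up the original label order.

```python
from itertools import groupby
from operator import itemgetter

def sum_adjacent(pairs):
    """Sum counts over runs of adjacent equal labels."""
    return [(label, sum(count for _, count in run))
            for label, run in groupby(pairs, key=itemgetter(0))]

adjacent = [('grape', 100), ('grape', 3), ('apple', 15), ('banana', 3)]
scattered = [('grape', 100), ('apple', 15), ('grape', 3), ('banana', 3)]

print(sum_adjacent(adjacent))
# [('grape', 103), ('apple', 15), ('banana', 3)]

print(sum_adjacent(scattered))
# [('grape', 100), ('apple', 15), ('grape', 3), ('banana', 3)]

# Sorting by label merges the groups but reorders the labels alphabetically.
print(sum_adjacent(sorted(scattered, key=itemgetter(0))))
# [('apple', 15), ('banana', 3), ('grape', 103)]
```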
Method 3
import collections

d = collections.defaultdict(int)
a = []  # labels in first-seen order
alist = [('grape', 100), ('banana', 3), ('apple', 10), ('apple', 4), ('grape', 3), ('apple', 15)]
for fruit, number in alist:
    if fruit not in a:
        a.append(fruit)
    d[fruit] += number
for f in a:
    print((f, d[f]))
Output:
$ ./python.py
('grape', 103)
('banana', 3)
('apple', 29)
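On Python 3.7 and later, where dicts keep insertion order, the separate list a is probably unnecessary. A minimal sketch under that assumption (sum_by_label is a hypothetical name, not part of the answer above):

```python
from collections import defaultdict

def sum_by_label(pairs):
    """Sum counts per label, keeping labels in first-seen order.

    Assumes Python 3.7+, where dicts (including defaultdict)
    preserve insertion order.
    """
    totals = defaultdict(int)
    for label, count in pairs:
        totals[label] += count
    return list(totals.items())

alist = [('grape', 100), ('banana', 3), ('apple', 10),
         ('apple', 4), ('grape', 3), ('apple', 15)]
print(sum_by_label(alist))
# [('grape', 103), ('banana', 3), ('apple', 29)]
```

Unlike the groupby answers, this does not require equal labels to be adjacent.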
Method 4
>>> from itertools import groupby
>>> from operator import itemgetter
>>> L=[('grape', 100), ('grape', 3), ('apple', 15), ('apple', 10), ('apple', 4), ('banana', 3)]
>>> [(x,sum(map(itemgetter(1),y))) for x,y in groupby(L, itemgetter(0))]
[('grape', 103), ('apple', 29), ('banana', 3)]
Method 5
My version, without itertools:
[(k, sum(y for x, y in l if x == k)) for k in dict(l)]
Method 6
Method
def group_by(my_list):
    result = {}
    for k, v in my_list:
        result[k] = v if k not in result else result[k] + v
    return result
Usage
my_list = [
('grape', 100), ('grape', 3), ('apple', 15),
('apple', 10), ('apple', 4), ('banana', 3)
]
group_by(my_list)
# Output: {'grape': 103, 'apple': 29, 'banana': 3}
You can convert the result to a list of tuples with list(group_by(my_list).items()).
Method 7
Or a simpler, more readable answer (without itertools):
pairs = [('foo', 1), ('bar', 2), ('foo', 2), ('bar', 3)]

def sum_pairs(pairs):
    sums = {}
    for pair in pairs:
        sums.setdefault(pair[0], 0)
        sums[pair[0]] += pair[1]
    return list(sums.items())

print(sum_pairs(pairs))
Method 8
Simpler answer without any third-party libraries:
dct = {}
for key, value in alist:
    if key not in dct:
        dct[key] = value
    else:
        dct[key] += value
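The loop above assumes alist is already defined and leaves the totals in dct. A sketch that runs end to end on the question's data and converts the dict back to a list of tuples; it assumes Python 3.7+ so that dict order matches first appearance, sum_counts is a hypothetical name, and dict.get stands in for the if/else:

```python
def sum_counts(pairs):
    """Return (label, total) tuples in first-seen label order."""
    totals = {}
    for key, value in pairs:
        # get() supplies 0 the first time a label is seen
        totals[key] = totals.get(key, 0) + value
    return list(totals.items())

alist = [('grape', 100), ('grape', 3), ('apple', 15),
         ('apple', 10), ('apple', 4), ('banana', 3)]
print(sum_counts(alist))
# [('grape', 103), ('apple', 29), ('banana', 3)]
```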
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.