Combining Dictionaries Of Lists In Python

I have a very large collection of (p, q) tuples that I would like to convert into a dictionary of lists where the first item in each tuple is a key that indexes a list that contains q.

Example:

Original List: (1, 2), (1, 3), (2, 3)  
Resultant Dictionary: {1:[2, 3], 2:[3]}

Furthermore, I would like to efficiently combine these dictionaries.

Example:

Original Dictionaries: {1:[2, 3], 2:[3]}, {1:[4], 3:[1]}  
Resultant Dictionary: {1:[2, 3, 4], 2:[3], 3:[1]}

These operations reside within an inner loop, so I would prefer that they be as fast as possible.

Thanks in advance

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

If the list of tuples is sorted, itertools.groupby, as suggested by @gnibbler, is not a bad alternative to defaultdict, but it needs to be used differently than he suggested:

import itertools
import operator

def lot_to_dict(lot):
  key = operator.itemgetter(0)
  # if lot's not sorted, you also need...:
  # lot = sorted(lot, key=key)
  # NOT in-place lot.sort to avoid changing it!
  grob = itertools.groupby(lot, key)
  return dict((k, [v[1] for v in itr]) for k, itr in grob)

For “merging” dicts of lists into a new d.o.l…:

def merge_dols(dol1, dol2):
  keys = set(dol1).union(dol2)
  no = []
  return dict((k, dol1.get(k, no) + dol2.get(k, no)) for k in keys)

I’m giving [] a nickname no to avoid uselessly constructing a lot of empty lists, given that performance is important. If the sets of the dols’ keys overlap only modestly, faster would be:

def merge_dols(dol1, dol2):
  result = dict(dol1, **dol2)
  result.update((k, dol1[k] + dol2[k])
                for k in set(dol1).intersection(dol2))
  return result

since this uses list-catenation only for overlapping keys — so, if those are few, it will be faster.

Method 2

collections.defaultdict works like this:

from collections import defaultdict
dic = defaultdict(list)
for i, j in tuples:
    dic[i].append(j)

similar for the dicts:

a, b = {1:[2, 3], 2:[3]}, {1:[4], 3:[1]}
de = defaultdict(list, a)
for i, j in b.items():
    de[i].extend(j)

Method 3

defaltdict to the rescue (as usual)

from collections import defaultdict
my_dict = defaultdict(list)

for key,value in original_list:
    my_dict[key].append(value)

Combining the two dicts can be done like this (note that duplicates will be preserved):

for key,value in orig_dict:
    new_dict[key].extend(value)

Method 4

Here is the iterator style of doing it

>>> mylist=[(1, 2), (1, 3), (2, 3)]
>>> from itertools import groupby
>>> from operator import itemgetter
>>> mylist=[(1, 2), (1, 3), (2, 3)]
>>> groupby(mylist,itemgetter(0))

>>> list(_)
[(1, <itertools._grouper object at 0xb7d402ec>), (2, <itertools._grouper object at 0xb7c716ec>)]

Method 5

I wanted these done in one line just for fun:

>>> from itertools import groupby
>>> t=(1, 2), (1, 3), (2, 3) 
>>> [(i,[x for _,x in list(f)]) for i,f in groupby(sorted(t),lambda t: t[0])] 
[(1, [2, 3]), (2, [3])]
>>> b={1:[2, 3], 2:[3]}, {1:[4], 3:[1]}
>>> dict([(key,sum([i[1::][0] for i in elements],[])) for key,elements in groupby(sorted(b[0].items()+b[1].items()),lambda t: t[0])])
{1: [2, 3, 4], 2: [3], 3: [1]}


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x