Dictionary creation with fromkeys and mutable objects. A surprise

I came across this behavior that surprised me in Python 2.6 and 3.2:

>>> xs = dict.fromkeys(range(2), [])
>>> xs
{0: [], 1: []}
>>> xs[0].append(1)
>>> xs
{0: [1], 1: [1]}

However, dict comprehensions in 3.2 show a more polite demeanor:

>>> xs = {i:[] for i in range(2)}
>>> xs
{0: [], 1: []}
>>> xs[0].append(1)
>>> xs
{0: [1], 1: []}
>>>

Why does fromkeys behave like that?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Your Python 2.6 example is equivalent to the following, which may help to clarify:

>>> a = []
>>> xs = dict.fromkeys(range(2), a)

Each entry in the resulting dictionary will have a reference to the same object. The effects of mutating that object will be visible through every dict entry, as you’ve seen, because it’s one object.

>>> xs[0] is a and xs[1] is a
True

Use a dict comprehension, or if you’re stuck on Python 2.6 or older and you don’t have dictionary comprehensions, you can get the dict comprehension behavior by using dict() with a generator expression:

xs = dict((i, []) for i in range(2))

Method 2

In the first version, you use the same empty list object as the value for both keys, so if you change one, you change the other, too.

Look at this:

>>> empty = []
>>> d = dict.fromkeys(range(2), empty)
>>> d
{0: [], 1: []}
>>> empty.append(1) # same as d[0].append(1) because d[0] references empty!
>>> d
{0: [1], 1: [1]}

In the second version, a new empty list object is created in every iteration of the dict comprehension, so both are independent from each other.

As to “why” fromkeys() works like that – well, it would be surprising if it didn’t work like that. fromkeys(iterable, value) constructs a new dict with keys from iterable that all have the value value. If that value is a mutable object, and you change that object, what else could you reasonably expect to happen?

Method 3

To answer the actual question being asked: fromkeys behaves like that because there is no other reasonable choice. It is not reasonable (or even possible) to have fromkeys decide whether or not your argument is mutable and make new copies every time. In some cases it doesn’t make sense, and in others it’s just impossible.

The second argument you pass in is therefore just a reference, and is copied as such. An assignment of [] in Python means “a single reference to a new list”, not “make a new list every time I access this variable”. The alternative would be to pass in a function that generates new instances, which is the functionality that dict comprehensions supply for you.

Here are some options for creating multiple actual copies of a mutable container:

  1. As you mention in the question, dict comprehensions allow you to execute an arbitrary statement for each element:
    d = {k: [] for k in range(2)}

    The important thing here is that this is equivalent to putting the assignment k = [] in a for loop. Each iteration creates a new list and assigns it to a value.

  2. Use the form of the dict constructor suggested by @Andrew Clark:
    d = dict((k, []) for k in range(2))

    This creates a generator which again makes the assignment of a new list to each key-value pair when it is executed.

  3. Use a collections.defaultdict instead of a regular dict:
    d = collections.defaultdict(list)

    This option is a little different from the others. Instead of creating the new list references up front, defaultdict will call list every time you access a key that’s not already there. You can there fore add the keys as lazily as you want, which can be very convenient sometimes:

    for k in range(2):
        d[k].append(42)

    Since you’ve set up the factory for new elements, this will actually behave exactly as you expected fromkeys to behave in the original question.

  4. Use dict.setdefault when you access potentially new keys. This does something similar to what defaultdict does, but it has the advantage of being more controlled, in the sense that only the access you want to create new keys actually creates them:
    d = {}
    for k in range(2):
        d.setdefault(k, []).append(42)

    The disadvantage is that a new empty list object gets created every time you call the function, even if it never gets assigned to a value. This is not a huge problem, but it could add up if you call it frequently and/or your container is not as simple as list.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x