Is there a built-in/quick way to use a list of keys to a dictionary to get a list of corresponding items?
For instance I have:
>>> mydict = {'one': 1, 'two': 2, 'three': 3}
>>> mykeys = ['three', 'one']
How can I use mykeys to get the corresponding values in the dictionary as a list?
>>> mydict.WHAT_GOES_HERE(mykeys)
[3, 1]
Answers:
Method 1
A list comprehension seems to be a good way to do this:
>>> [mydict[x] for x in mykeys]
[3, 1]
Method 2
A couple of other ways than list-comp:
- Build list and throw exception if key not found: map(mydict.__getitem__, mykeys)
- Build list with None if key not found: map(mydict.get, mykeys)
Alternatively, using operator.itemgetter can return a tuple:
from operator import itemgetter
myvalues = itemgetter(*mykeys)(mydict)
# use `list(...)` if list is required
Note: in Python 3, map returns an iterator rather than a list. Use list(map(...)) for a list.
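To make the differences between these variants concrete, here is a small sketch for Python 3. One detail worth knowing: itemgetter only returns a tuple when it is given two or more keys; with a single key it returns the bare value, and with no keys it raises TypeError. The helper name values_for is hypothetical, introduced only to show one way of smoothing over those edge cases.

```python
from operator import itemgetter

mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']

# Strict lookup: raises KeyError on a missing key.
strict_values = list(map(mydict.__getitem__, mykeys))

# Lenient lookup: missing keys become None.
lenient_values = list(map(mydict.get, mykeys + ['ten']))

# itemgetter returns a tuple only when given two or more keys.
tuple_values = itemgetter(*mykeys)(mydict)
single_value = itemgetter('one')(mydict)  # a bare value, not a 1-tuple


def values_for(mapping, keys):
    """Hypothetical helper: always return a list, whatever len(keys) is."""
    keys = list(keys)
    if not keys:
        return []  # itemgetter() with no arguments raises TypeError
    if len(keys) == 1:
        return [mapping[keys[0]]]
    return list(itemgetter(*keys)(mapping))
```

If the key list can be empty or have a single element, the plain list comprehension from Method 1 avoids these special cases entirely.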
Method 3
A little speed comparison:
Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Dec 7 2015, 14:10:42) [MSC v.1500 64 bit (AMD64)] on win32
In[1]: l = [0,1,2,3,2,3,1,2,0]
In[2]: m = {0:10, 1:11, 2:12, 3:13}
In[3]: %timeit [m[_] for _ in l] # list comprehension
1000000 loops, best of 3: 762 ns per loop
In[4]: %timeit map(lambda _: m[_], l) # using 'map'
1000000 loops, best of 3: 1.66 µs per loop
In[5]: %timeit list(m[_] for _ in l) # a generator expression passed to a list constructor.
1000000 loops, best of 3: 1.65 µs per loop
In[6]: %timeit map(m.__getitem__, l)
The slowest run took 4.01 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 853 ns per loop
In[7]: %timeit map(m.get, l)
1000000 loops, best of 3: 908 ns per loop
In[33]: from operator import itemgetter
In[34]: %timeit list(itemgetter(*l)(m))
The slowest run took 9.26 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 739 ns per loop
So list comprehension and itemgetter are the fastest ways to do this.
Update
For large random lists and maps I got slightly different results:
Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Dec 7 2015, 14:10:42) [MSC v.1500 64 bit (AMD64)] on win32
In[2]: import numpy.random as nprnd
l = nprnd.randint(1000, size=10000)
m = dict([(_, nprnd.rand()) for _ in range(1000)])
from operator import itemgetter
import operator
f = operator.itemgetter(*l)
%timeit f(m)
1000 loops, best of 3: 1.14 ms per loop
%timeit list(itemgetter(*l)(m))
1000 loops, best of 3: 1.68 ms per loop
%timeit [m[_] for _ in l] # list comprehension
100 loops, best of 3: 2 ms per loop
%timeit map(m.__getitem__, l)
100 loops, best of 3: 2.05 ms per loop
%timeit list(m[_] for _ in l) # a generator expression passed to a list constructor.
100 loops, best of 3: 2.19 ms per loop
%timeit map(m.get, l)
100 loops, best of 3: 2.53 ms per loop
%timeit map(lambda _: m[_], l)
100 loops, best of 3: 2.9 ms per loop
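So in this case the clear winner is f = operator.itemgetter(*l); f(m), and the clear loser is map(lambda _: m[_], l). Note that f is built outside the timed statement, so the cost of constructing it is not measured.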
Update for Python 3.6.4
import numpy.random as nprnd
l = nprnd.randint(1000, size=10000)
m = dict([(_, nprnd.rand()) for _ in range(1000)])
from operator import itemgetter
import operator
f = operator.itemgetter(*l)
%timeit f(m)
1.66 ms ± 74.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit list(itemgetter(*l)(m))
2.1 ms ± 93.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit [m[_] for _ in l] # list comprehension
2.58 ms ± 88.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit list(map(m.__getitem__, l))
2.36 ms ± 60.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit list(m[_] for _ in l) # a generator expression passed to a list constructor.
2.98 ms ± 142 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit list(map(m.get, l))
2.7 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit list(map(lambda _: m[_], l))
3.14 ms ± 62.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
So the results for Python 3.6.4 are almost the same.
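For anyone who wants to repeat the comparison outside IPython, the following is a rough sketch that swaps in the standard-library timeit and random modules for %timeit and numpy. The candidates dictionary and the benchmark function are names made up for this example, and absolute timings will vary by machine and Python version, so no expected numbers are given.

```python
import random
import timeit
from operator import itemgetter

random.seed(0)  # fixed seed so the generated data is repeatable
m = {i: random.random() for i in range(1000)}
l = [random.randrange(1000) for _ in range(10000)]

getter = itemgetter(*l)  # built once, outside the timed code

# Each candidate is a zero-argument callable that timeit can invoke.
candidates = {
    'prebuilt itemgetter': lambda: getter(m),
    'itemgetter + list': lambda: list(itemgetter(*l)(m)),
    'list comprehension': lambda: [m[k] for k in l],
    'map + __getitem__': lambda: list(map(m.__getitem__, l)),
    'map + get': lambda: list(map(m.get, l)),
    'map + lambda': lambda: list(map(lambda k: m[k], l)),
}


def benchmark(number=20, repeat=3):
    """Return {label: best observed seconds per call} for each candidate."""
    results = {}
    for label, func in candidates.items():
        best = min(timeit.repeat(func, number=number, repeat=repeat))
        results[label] = best / number
    return results


if __name__ == '__main__':
    # Print the candidates from fastest to slowest.
    for label, seconds in sorted(benchmark().items(), key=lambda kv: kv[1]):
        print('{:<22} {:.3f} ms'.format(label, seconds * 1e3))
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes; it mirrors the "best of 3" figure reported by the Python 2 runs above.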
Method 4
Here are three ways.
Raising KeyError when key is not found:
result = [mapping[k] for k in iterable]
Default values for missing keys.
result = [mapping.get(k, default_value) for k in iterable]
Skipping missing keys.
result = [mapping[k] for k in iterable if k in mapping]
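The three policies can also be folded into one function if that reads better at the call site. This is only a sketch: lookup_many, its parameters, and the _MISSING sentinel are hypothetical names, not part of any library.

```python
_MISSING = object()  # sentinel meaning "no default was supplied"


def lookup_many(mapping, keys, default=_MISSING, skip_missing=False):
    """Hypothetical helper combining the three policies above."""
    if skip_missing:
        # Silently drop keys that are not in the mapping.
        return [mapping[k] for k in keys if k in mapping]
    if default is _MISSING:
        # Strict mode: a missing key raises KeyError.
        return [mapping[k] for k in keys]
    # Substitute the default for every missing key.
    return [mapping.get(k, default) for k in keys]


mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one', 'ten']

with_default = lookup_many(mydict, mykeys, default=0)        # [3, 1, 0]
without_missing = lookup_many(mydict, mykeys, skip_missing=True)  # [3, 1]
```

A sentinel is used instead of None so that None itself can be passed as a legitimate default value.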
Method 5
Try this:
mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one','ten']
newList=[mydict[k] for k in mykeys if k in mydict]
print(newList)
[3, 1]
Method 6
Try this:
mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one'] # if there are many keys, use a set
[mydict[k] for k in mykeys]
=> [3, 1]
Method 7
new_dict = {x: v for x, v in mydict.items() if x in mykeys}
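Note that this comprehension produces a dictionary rather than a list, and because it iterates over mydict, the result follows the order of mydict, not of mykeys. Testing membership in a list is also linear in the length of the list. The sketch below (Python 3.7+, where dicts preserve insertion order) shows the set-based variant alongside two alternatives that follow the order of mykeys; the variable names are chosen for illustration.

```python
mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']

# Membership tests against a set are O(1) on average; against a list, O(n).
wanted = set(mykeys)

# Sub-dictionary: iterates mydict, so it keeps mydict's insertion order.
sub_dict = {k: v for k, v in mydict.items() if k in wanted}

# Sub-dictionary in mykeys order: iterate the keys instead.
sub_dict_ordered = {k: mydict[k] for k in mykeys if k in mydict}

# List of values in mykeys order, which is what the question asks for.
values = [mydict[k] for k in mykeys if k in mydict]
```

Iterating over mykeys is usually the better choice when mykeys is much smaller than mydict, since it touches only the requested entries.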
Method 8
Pandas does this very elegantly, though of course list comprehensions will always be more technically Pythonic. I don’t have time to put in a speed comparison right now (I’ll come back later and put it in):
import pandas as pd
mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']
# DataFrame.append was removed in pandas 2.0, so build a one-row frame directly.
temp_df = pd.DataFrame([mydict])
# You can export DataFrames to a number of formats, using a list here.
temp_df[mykeys].values[0]
# Returns: array([3, 1])
# If you want a dict then use this instead:
# temp_df[mykeys].to_dict(orient='records')[0]
# Returns: {'three': 3, 'one': 1}
Method 9
Following the closure of "Python: efficient way to create a list from dict values with a given order":
Retrieving the values without building the list:
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

try:
    from collections.abc import Sequence  # Python 3.3+
except ImportError:
    from collections import Sequence  # Python 2


class DictListProxy(Sequence):
    """Read-only sequence view of kdict's values, ordered by klist."""

    def __init__(self, klist, kdict, *args, **kwargs):
        super(DictListProxy, self).__init__(*args, **kwargs)
        self.klist = klist
        self.kdict = kdict

    def __len__(self):
        return len(self.klist)

    def __getitem__(self, key):
        # Look up the key at this position, then its value in the dict.
        return self.kdict[self.klist[key]]


myDict = {'age': 'value1', 'size': 'value2', 'weigth': 'value3'}
order_list = ['age', 'weigth', 'size']
dlp = DictListProxy(order_list, myDict)
print(','.join(dlp))
print()
print(dlp[1])
The output:
value1,value3,value2

value3
Which matches the order given by the list.
Method 10
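from functools import reduce  # reduce is no longer a built-in in Python 3
reduce(lambda x, y: (y in mydict and x.append(mydict[y])) or x, mykeys, [])
This works in case some keys are not in the dict. Testing y in mydict rather than mydict.get(y) keeps values that are falsy, such as 0 or None.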
Method 11
If you find yourself doing this a lot, you might want to subclass dict to take a list of keys and return a list of values.
>>> d = MyDict(mydict)
>>> d[mykeys]
[3, 1]
Here’s a demo implementation.
class MyDict(dict):
    def __getitem__(self, key):
        getitem = super().__getitem__
        if isinstance(key, list):
            return [getitem(x) for x in key]
        else:
            return getitem(key)
Subclassing dict well requires some more work, plus you’d probably want to implement .get(), .__setitem__(),
and .__delitem__(), among others.
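As a rough illustration of that extra work, here is one way the demo class might be extended so that .get() and .__delitem__() also accept a list of keys. The behavior chosen for lists (apply the operation to each key in turn) is an assumption made for this sketch, not something the demo above specifies, and .__setitem__() is left out because its list semantics are less obvious.

```python
class MyDict(dict):
    """Sketch: a dict whose lookups also accept a list of keys."""

    def __getitem__(self, key):
        getitem = super().__getitem__
        if isinstance(key, list):
            return [getitem(k) for k in key]
        return getitem(key)

    def get(self, key, default=None):
        get = super().get
        if isinstance(key, list):
            # Assumed semantics: the same default applies to every missing key.
            return [get(k, default) for k in key]
        return get(key, default)

    def __delitem__(self, key):
        delitem = super().__delitem__
        if isinstance(key, list):
            # Assumed semantics: delete each listed key; KeyError if one is absent.
            for k in key:
                delitem(k)
        else:
            delitem(key)


d = MyDict({'one': 1, 'two': 2, 'three': 3})
picked = d[['three', 'one']]        # [3, 1]
padded = d.get(['one', 'ten'], 0)   # [1, 0]
```

Dispatching on list is unambiguous here because lists are unhashable and so can never be ordinary dictionary keys; tuples can be, which is why they are not treated as key collections.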
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.