How do I retrieve the top 3 items from a dictionary?
>>> d
{'a': 2, 'and': 23, 'this': 14, 'only.': 21, 'is': 2, 'work': 2, 'will': 2, 'as': 2, 'test': 4}
Expected result:
and: 23
only: 21
this: 14
Answers:
Method 1
Use collections.Counter:
>>> from collections import Counter
>>> d = Counter({'a': 2, 'and': 23, 'this': 14, 'only.': 21, 'is': 2, 'work': 2, 'will': 2, 'as': 2, 'test': 4})
>>> d.most_common()
[('and', 23), ('only.', 21), ('this', 14), ('test', 4), ('a', 2), ('is', 2), ('work', 2), ('will', 2), ('as', 2)]
>>> for k, v in d.most_common(3):
...     print('%s: %i' % (k, v))
...
and: 23
only.: 21
this: 14
Counter objects offer various other advantages, such as making it almost trivial to collect the counts in the first place.
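As a minimal sketch of that last point, assume the counts come from a piece of text. The sample sentence and the names `text`, `counts` and `top3` below are made up for illustration; a Counter can then be built directly from the words, with no manual dictionary bookkeeping:

```python
from collections import Counter

# Hypothetical sample text; any iterable of hashable items works.
text = "this is a test and this is only a test and this will work"

# Counter tallies each word as it iterates over the list of words.
counts = Counter(text.split())

# most_common(3) returns the three highest counts as (word, count) pairs.
top3 = counts.most_common(3)
for word, count in top3:
    print('%s: %i' % (word, count))
```

Words with equal counts can appear in either order depending on the Python version, so only the first entry is guaranteed for this sample.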
Method 2
>>> d = {'a': 2, 'and': 23, 'this': 14, 'only.': 21, 'is': 2, 'work': 2, 'will': 2, 'as': 2, 'test': 4}
>>> t = sorted(d.items(), key=lambda x: -x[1])[:3]
>>> for x in t:
...     print("{0}: {1}".format(*x))
...
and: 23
only.: 21
this: 14
Method 3
The replies you already got are right; however, I would create my own key function to use when calling sorted().
d = {'a': 2, 'and': 23, 'this': 14, 'only.': 21, 'is': 2, 'work': 2, 'will': 2, 'as': 2, 'test': 4}
# create a function which returns the value stored under a given key
def keyfunction(k):
    return d[k]
# sort the dictionary keys by their values and print the top 3 key: value pairs
for key in sorted(d, key=keyfunction, reverse=True)[:3]:
    print("%s: %i" % (key, d[key]))
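A separate helper function is not strictly required for this. As a sketch of two equivalent alternatives, the bound method `d.get` can be passed as the key function directly, or `operator.itemgetter(1)` can be used to sort the (key, value) pairs. The names `top_keys` and `top_pairs` are illustrative only.

```python
from operator import itemgetter

d = {'a': 2, 'and': 23, 'this': 14, 'only.': 21, 'is': 2,
     'work': 2, 'will': 2, 'as': 2, 'test': 4}

# d.get(k) returns d[k] for existing keys, so it works as the key function.
top_keys = sorted(d, key=d.get, reverse=True)[:3]

# itemgetter(1) picks the value out of each (key, value) pair.
top_pairs = sorted(d.items(), key=itemgetter(1), reverse=True)[:3]

for key, value in top_pairs:
    print('%s: %i' % (key, value))
```

The first form yields only the keys, the second yields the pairs, which saves a second lookup when printing.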
Method 4
Given the solutions above:
import collections
import datetime

def most_popular(L):
    # using sorted() with a lambda key
    start = datetime.datetime.now()
    res = dict(sorted([(k, v) for k, v in L.items()], key=lambda x: x[1])[-2:])
    delta = datetime.datetime.now() - start
    print("Microtime (lambda:%d): %d" % (len(L), delta.microseconds))
    # using collections.Counter
    start = datetime.datetime.now()
    res = dict(collections.Counter(L).most_common()[:2])
    delta = datetime.datetime.now() - start
    print("Microtime (collections:%d): %d" % (len(L), delta.microseconds))
# list of 10
most_popular({el:0 for el in list(range(10))})
# list of 100
most_popular({el:0 for el in list(range(100))})
# list of 1000
most_popular({el:0 for el in list(range(1000))})
# list of 10000
most_popular({el:0 for el in list(range(10000))})
# list of 100000
most_popular({el:0 for el in list(range(100000))})
# list of 1000000
most_popular({el:0 for el in list(range(1000000))})
The benchmark runs on dicts ranging in size from 10^1 to 10^6 entries, each built like this:
print({el:0 for el in list(range(10))})
{0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}
we have the following benchmarks
Python 2.7.10 (default, Jul 14 2015, 19:46:27)
[GCC 4.8.2] on linux
Microtime (lambda:10): 24
Microtime (collections:10): 106
Microtime (lambda:100): 49
Microtime (collections:100): 50
Microtime (lambda:1000): 397
Microtime (collections:1000): 178
Microtime (lambda:10000): 4347
Microtime (collections:10000): 2782
Microtime (lambda:100000): 55738
Microtime (collections:100000): 26546
Microtime (lambda:1000000): 798612
Microtime (collections:1000000): 361970
=> None
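So for small dicts (around 10 entries) sorting with a lambda is faster, the two approaches break even at about 100 entries, and from 1000 entries upward collections.Counter is roughly 1.5 to 2 times faster in these measurements.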
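Timings taken with a single pair of `datetime.now()` calls are noisy, so the comparison is easier to reproduce with the standard `timeit` module. The following is only a sketch: the helper names (`top_sorted`, `top_counter`, `top_heap`), the shuffled test data and the repeat counts are assumptions, not part of the original benchmark. It also adds `heapq.nlargest` as a third candidate, since that function avoids sorting the whole dict.

```python
import collections
import heapq
import random
import timeit


def top_sorted(d, n=3):
    # Full sort by value (descending), then slice off the n largest.
    return sorted(d.items(), key=lambda kv: kv[1], reverse=True)[:n]


def top_counter(d, n=3):
    # Counter.most_common(n) returns the n highest counts.
    return collections.Counter(d).most_common(n)


def top_heap(d, n=3):
    # heapq.nlargest tracks only n candidates instead of sorting everything.
    return heapq.nlargest(n, d.items(), key=lambda kv: kv[1])


if __name__ == '__main__':
    for size in (10, 1000, 100000):
        # Distinct, shuffled values so that ties do not affect the result.
        values = list(range(size))
        random.shuffle(values)
        data = dict(enumerate(values))
        for func in (top_sorted, top_counter, top_heap):
            # Best of 3 runs of 5 calls each, reported in seconds.
            best = min(timeit.repeat(lambda: func(data), number=5, repeat=3))
            print('%-12s size=%-7d %.6f s' % (func.__name__, size, best))
```

Absolute numbers depend on the machine and the Python version, so they should be read as relative comparisons only.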
See the benchmark running here.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.