operator.itemgetter or lambda

I was curious whether there is any indication of which of operator.itemgetter(0) or lambda x: x[0] is better to use, specifically as the key keyword argument to sorted(), since that’s the use that springs to mind first. Are there any known performance differences? Is there any PEP-related preference or guidance on the matter?

Answers:


Method 1

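itemgetter performs slightly better, taking roughly 20% less time in this measurement:

>>> from operator import itemgetter
>>> from timeit import timeit
>>> # w: a list of indexable items (e.g. tuples), defined elsewhere
>>> f1 = lambda: sorted(w, key=lambda x: x[1])
>>> f2 = lambda: sorted(w, key=itemgetter(1))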
>>> timeit(f1)
21.33667682500527
>>> timeit(f2)
16.99106214600033
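The session above does not show how w was built. A self-contained version might look like the sketch below. The contents of w (10,000 random integer pairs) and the call count passed to timeit are assumptions made for illustration, so the absolute timings will not match the figures quoted.

```python
import random
from operator import itemgetter
from timeit import timeit

# Assumed test data: the original answer does not say what `w` contained.
random.seed(0)
w = [(random.randint(0, 1000), random.randint(0, 1000)) for _ in range(10_000)]


def sort_with_lambda():
    return sorted(w, key=lambda x: x[1])


def sort_with_itemgetter():
    return sorted(w, key=itemgetter(1))


if __name__ == "__main__":
    # number=100 keeps the run short; timeit's default is 1,000,000 calls.
    print("lambda:    ", timeit(sort_with_lambda, number=100))
    print("itemgetter:", timeit(sort_with_itemgetter, number=100))
```

Both functions return the same ordering, so any difference in the printed times comes from the cost of calling the key function.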

Method 2

Leaving aside the speed issue, which often depends on where you create the itemgetter or lambda function, I personally find that itemgetter is really nice for getting multiple items at once: for example, itemgetter(0, 4, 3, 9, 19, 20) will create a function that returns a tuple of the items at the specified indices of the list-like object passed to it. To do that with a lambda, you’d need lambda x: (x[0], x[4], x[3], x[9], x[19], x[20]), which is a lot clunkier. (And then some packages such as numpy have advanced indexing, which works a lot like itemgetter() except built in to normal bracket notation.)
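As a small illustration of the multi-index form, the sketch below compares the two spellings on made-up sample data (the row list is hypothetical). Note that the lambda body needs parentheses around the tuple; without them, Python parses the expression as a tuple whose first element is the lambda.

```python
from operator import itemgetter

# Hypothetical sample data: the 26 lowercase letters as a list.
row = list("abcdefghijklmnopqrstuvwxyz")

pick = itemgetter(0, 4, 3, 9, 19, 20)
pick_lambda = lambda x: (x[0], x[4], x[3], x[9], x[19], x[20])

print(pick(row))                      # ('a', 'e', 'd', 'j', 't', 'u')
print(pick(row) == pick_lambda(row))  # True

# With a single index, itemgetter returns the item itself, not a 1-tuple.
print(itemgetter(1)(row))             # b
```

Used as a sort key, the multi-index form sorts by the first selected item, then the second, and so on, because tuples compare element by element.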

Method 3

According to my benchmark on a list of 1000 tuples, sorting with itemgetter is roughly 1.7 times as fast as sorting with a plain lambda. The following is my code:

In [1]: a = list(range(1000))

In [2]: b = list(range(1000))

In [3]: import random

In [4]: random.shuffle(a)

In [5]: random.shuffle(b)

In [6]: c = list(zip(a, b))

In [7]: %timeit c.sort(key=lambda x: x[1])
81.4 µs ± 433 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [8]: random.shuffle(c)

In [9]: from operator import itemgetter

In [10]: %timeit c.sort(key=itemgetter(1))
47 µs ± 202 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

I have also tested the performance (run time in µs) of these two methods for various list sizes.

+-----------+--------+------------+
| List size | lambda | itemgetter |
+-----------+--------+------------+
| 100       | 8.19   | 5.09       |
+-----------+--------+------------+
| 1000      | 81.4   | 47         |
+-----------+--------+------------+
| 10000     | 855    | 498        |
+-----------+--------+------------+
| 100000    | 14600  | 10100      |
+-----------+--------+------------+
| 1000000   | 172000 | 131000     |
+-----------+--------+------------+

(Figure omitted. The original answer linked to the code that produced it.)
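A comparison across list sizes can be approximated with the standard timeit module, as in the sketch below. This is not the author's script: the helper names make_pairs and best_time_us, the seed, and the repetition counts are all assumptions. It times sorted() rather than c.sort(), because list.sort() works in place, so within a %timeit run every loop after the first operates on a list that is already ordered by the key. The trade-off is that sorted() also pays for copying the list on each call.

```python
import random
from operator import itemgetter
from timeit import repeat


def make_pairs(n, seed=0):
    """Build n shuffled (a, b) tuples, mirroring the zip(a, b) setup above."""
    rng = random.Random(seed)
    a = list(range(n))
    b = list(range(n))
    rng.shuffle(a)
    rng.shuffle(b)
    return list(zip(a, b))


def best_time_us(key, data, number=20, rounds=5):
    """Best per-call time in microseconds for sorted(data, key=key)."""
    # sorted() leaves `data` untouched, so every call sees shuffled input.
    times = repeat(lambda: sorted(data, key=key), number=number, repeat=rounds)
    return min(times) / number * 1e6


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        data = make_pairs(n)
        t_lambda = best_time_us(lambda x: x[1], data)
        t_getter = best_time_us(itemgetter(1), data)
        print(f"{n:>6}  lambda {t_lambda:9.1f} µs  itemgetter {t_getter:9.1f} µs")
```

Taking the minimum over several rounds is a common way to reduce noise from other processes; the numbers will still vary between machines and Python versions.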

Combined with its conciseness when selecting multiple elements from a list, itemgetter is clearly the better choice for the key argument of sort.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
