I need to choose some elements from the given list, knowing their index. Let say I would like to create a new list, which contains element with index 1, 2, 5, from given list [-2, 1, 5, 3, 8, 5, 6]. What I did is:
a = [-2,1,5,3,8,5,6] b = [1,2,5] c = [ a[i] for i in b]
Is there any better way to do it? something like c = a[b] ?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
You can use operator.itemgetter:
from operator import itemgetter a = [-2, 1, 5, 3, 8, 5, 6] b = [1, 2, 5] print(itemgetter(*b)(a)) # Result: (1, 5, 5)
Or you can use numpy:
import numpy as np a = np.array([-2, 1, 5, 3, 8, 5, 6]) b = [1, 2, 5] print(list(a[b])) # Result: [1, 5, 5]
But really, your current solution is fine. It’s probably the neatest out of all of them.
Method 2
Alternatives:
>>> map(a.__getitem__, b) [1, 5, 5]
>>> import operator >>> operator.itemgetter(*b)(a) (1, 5, 5)
Method 3
Another solution could be via pandas Series:
import pandas as pd a = pd.Series([-2, 1, 5, 3, 8, 5, 6]) b = [1, 2, 5] c = a[b]
You can then convert c back to a list if you want:
c = list(c)
Method 4
Basic and not very extensive testing comparing the execution time of the five supplied answers:
def numpyIndexValues(a, b):
na = np.array(a)
nb = np.array(b)
out = list(na[nb])
return out
def mapIndexValues(a, b):
out = map(a.__getitem__, b)
return list(out)
def getIndexValues(a, b):
out = operator.itemgetter(*b)(a)
return out
def pythonLoopOverlap(a, b):
c = [ a[i] for i in b]
return c
multipleListItemValues = lambda searchList, ind: [searchList[i] for i in ind]
using the following input:
a = range(0, 10000000) b = range(500, 500000)
simple python loop was the quickest with lambda operation a close second, mapIndexValues and getIndexValues were consistently pretty similar with numpy method significantly slower after converting lists to numpy arrays.If data is already in numpy arrays the numpyIndexValues method with the numpy.array conversion removed is quickest.
numpyIndexValues -> time:1.38940598 (when converted the lists to numpy arrays) numpyIndexValues -> time:0.0193445 (using numpy array instead of python list as input, and conversion code removed) mapIndexValues -> time:0.06477512099999999 getIndexValues -> time:0.06391049500000001 multipleListItemValues -> time:0.043773591 pythonLoopOverlap -> time:0.043021754999999995
Method 5
Here’s a simpler way:
a = [-2,1,5,3,8,5,6] b = [1,2,5] c = [e for i, e in enumerate(a) if i in b]
Method 6
I’m sure this has already been considered: If the amount of indices in b is small and constant, one could just write the result like:
c = [a[b[0]]] + [a[b[1]]] + [a[b[2]]]
Or even simpler if the indices itself are constants…
c = [a[1]] + [a[2]] + [a[5]]
Or if there is a consecutive range of indices…
c = a[1:3] + [a[5]]
Method 7
My answer does not use numpy or python collections.
One trivial way to find elements would be as follows:
a = [-2, 1, 5, 3, 8, 5, 6] b = [1, 2, 5] c = [i for i in a if i in b]
Drawback: This method may not work for larger lists. Using numpy is recommended for larger lists.
Method 8
List comprehension is clearly the most immediate and easiest to remember – in addition to being quite pythonic!
In any case, among the proposed solutions, it is not the fastest (I have run my test on Windows using Python 3.8.3):
import timeit
from itertools import compress
import random
from operator import itemgetter
import pandas as pd
__N_TESTS__ = 10_000
vector = [str(x) for x in range(100)]
filter_indeces = sorted(random.sample(range(100), 10))
filter_boolean = random.choices([True, False], k=100)
# Different ways for selecting elements given indeces
# list comprehension
def f1(v, f):
return [v[i] for i in filter_indeces]
# itemgetter
def f2(v, f):
return itemgetter(*f)(v)
# using pandas.Series
# this is immensely slow
def f3(v, f):
return list(pd.Series(v)[f])
# using map and __getitem__
def f4(v, f):
return list(map(v.__getitem__, f))
# using enumerate!
def f5(v, f):
return [x for i, x in enumerate(v) if i in f]
# using numpy array
def f6(v, f):
return list(np.array(v)[f])
print("{:30s}:{:f} secs".format("List comprehension", timeit.timeit(lambda:f1(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Operator.itemgetter", timeit.timeit(lambda:f2(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using Pandas series", timeit.timeit(lambda:f3(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using map and __getitem__", timeit.timeit(lambda: f4(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Enumeration (Why anyway?)", timeit.timeit(lambda: f5(vector, filter_indeces), number=__N_TESTS__)))
My results are:
List comprehension :0.007113 secs
Operator.itemgetter :0.003247 secs
Using Pandas series :2.977286 secs
Using map and getitem :0.005029 secs
Enumeration (Why anyway?) :0.135156 secs
Numpy :0.157018 secs
Method 9
Static indexes and small list?
Don’t forget that if the list is small and the indexes don’t change, as in your example, sometimes the best thing is to use sequence unpacking:
_,a1,a2,_,_,a3,_ = a
The performance is much better and you can also save one line of code:
%timeit _,a1,b1,_,_,c1,_ = a 10000000 loops, best of 3: 154 ns per loop %timeit itemgetter(*b)(a) 1000000 loops, best of 3: 753 ns per loop %timeit [ a[i] for i in b] 1000000 loops, best of 3: 777 ns per loop %timeit map(a.__getitem__, b) 1000000 loops, best of 3: 1.42 µs per loop
Method 10
Kind of pythonic way:
c = [x for x in a if a.index(x) in b]
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0