I know I can do it like the following:
import numpy as np N=10 a=np.arange(1,100,1) np.argsort()[-N:]
However, it is very slow since it did a full sort.
I wonder whether numpy provide some methods the do it fast.
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
numpy 1.8 implements partition and argpartition that perform partial sort ( in O(n) time as opposed to full sort that is O(n) * log(n)).
import numpy as np test = np.array([9,1,3,4,8,7,2,5,6,0]) temp = np.argpartition(-test, 4) result_args = temp[:4] temp = np.partition(-test, 4) result = -temp[:4]
Result:
>>> result_args array([0, 4, 8, 5]) # indices of highest vals >>> result array([9, 8, 6, 7]) # highest vals
Timing:
In [16]: a = np.arange(10000) In [17]: np.random.shuffle(a) In [18]: %timeit np.argsort(a) 1000 loops, best of 3: 1.02 ms per loop In [19]: %timeit np.argpartition(a, 100) 10000 loops, best of 3: 139 us per loop In [20]: %timeit np.argpartition(a, 1000) 10000 loops, best of 3: 141 us per loop
Method 2
The bottleneck module has a fast partial sort method that works directly with Numpy arrays: bottleneck.partition().
Note that bottleneck.partition() returns the actual values sorted, if you want the indexes of the sorted values (what numpy.argsort() returns) you should use bottleneck.argpartition().
I’ve benchmarked:
z = -bottleneck.partition(-a, 10)[:10]z = a.argsort()[-10:]z = heapq.nlargest(10, a)
where a is a random 1,000,000-element array.
The timings were as follows:
bottleneck.partition(): 25.6 ms per loopnp.argsort(): 198 ms per loopheapq.nlargest(): 358 ms per loop
Method 3
I had this problem and, since this question is 5 years old, I had to redo all benchmarks and change the syntax of bottleneck (there is no partsort anymore, it’s partition now).
I used the same arguments as kwgoodman, except the number of elements retrieved, which I increased to 50 (to better fit my particular situation).
I got these results:
bottleneck 1: 01.12 ms per loop bottleneck 2: 00.95 ms per loop pandas : 01.65 ms per loop heapq : 08.61 ms per loop numpy : 12.37 ms per loop numpy 2 : 00.95 ms per loop
So, bottleneck_2 and numpy_2 (adas’s solution) were tied.
But, using np.percentile (numpy_2) you have those topN elements already sorted, which is not the case for the other solutions. On the other hand, if you are also interested on the indexes of those elements, percentile is not useful.
I added pandas too, which uses bottleneck underneath, if available (http://pandas.pydata.org/pandas-docs/stable/install.html#recommended-dependencies). If you already have a pandas Series or DataFrame to start with, you are in good hands, just use nlargest and you’re done.
The code used for the benchmark is as follows (python 3, please):
import time
import numpy as np
import bottleneck as bn
import pandas as pd
import heapq
def bottleneck_1(a, n):
return -bn.partition(-a, n)[:n]
def bottleneck_2(a, n):
return bn.partition(a, a.size-n)[-n:]
def numpy(a, n):
return a[a.argsort()[-n:]]
def numpy_2(a, n):
M = a.shape[0]
perc = (np.arange(M-n,M)+1.0)/M*100
return np.percentile(a,perc)
def pandas(a, n):
return pd.Series(a).nlargest(n)
def hpq(a, n):
return heapq.nlargest(n, a)
def do_nothing(a, n):
return a[:n]
def benchmark(func, size=1000000, ntimes=100, topn=50):
t1 = time.time()
for n in range(ntimes):
a = np.random.rand(size)
func(a, topn)
t2 = time.time()
ms_per_loop = 1000000 * (t2 - t1) / size
return ms_per_loop
t1 = benchmark(bottleneck_1)
t2 = benchmark(bottleneck_2)
t3 = benchmark(pandas)
t4 = benchmark(hpq)
t5 = benchmark(numpy)
t6 = benchmark(numpy_2)
t0 = benchmark(do_nothing)
print("bottleneck 1: {:05.2f} ms per loop".format(t1 - t0))
print("bottleneck 2: {:05.2f} ms per loop".format(t2 - t0))
print("pandas : {:05.2f} ms per loop".format(t3 - t0))
print("heapq : {:05.2f} ms per loop".format(t4 - t0))
print("numpy : {:05.2f} ms per loop".format(t5 - t0))
print("numpy 2 : {:05.2f} ms per loop".format(t6 - t0))
Method 4
Each negative sign in the proposed bottleneck solution
-bottleneck.partsort(-a, 10)[:10]
makes a copy of the data. We can remove the copies by doing
bottleneck.partsort(a, a.size-10)[-10:]
Also the proposed numpy solution
a.argsort()[-10:]
returns indices not values. The fix is to use the indices to find the values:
a[a.argsort()[-10:]]
The relative speed of the two bottleneck solutions depends on the ordering of the elements in the initial array because the two approaches partition the data at different points.
In other words, timing with any one particular random array can make either method look faster.
Averaging the timing across 100 random arrays, each with 1,000,000 elements, gives
-bn.partsort(-a, 10)[:10]: 1.76 ms per loop bn.partsort(a, a.size-10)[-10:]: 0.92 ms per loop a[a.argsort()[-10:]]: 15.34 ms per loop
where the timing code is as follows:
import time
import numpy as np
import bottleneck as bn
def bottleneck_1(a):
return -bn.partsort(-a, 10)[:10]
def bottleneck_2(a):
return bn.partsort(a, a.size-10)[-10:]
def numpy(a):
return a[a.argsort()[-10:]]
def do_nothing(a):
return a
def benchmark(func, size=1000000, ntimes=100):
t1 = time.time()
for n in range(ntimes):
a = np.random.rand(size)
func(a)
t2 = time.time()
ms_per_loop = 1000000 * (t2 - t1) / size
return ms_per_loop
t1 = benchmark(bottleneck_1)
t2 = benchmark(bottleneck_2)
t3 = benchmark(numpy)
t4 = benchmark(do_nothing)
print "-bn.partsort(-a, 10)[:10]: %0.2f ms per loop" % (t1 - t4)
print "bn.partsort(a, a.size-10)[-10:]: %0.2f ms per loop" % (t2 - t4)
print "a[a.argsort()[-10:]]: %0.2f ms per loop" % (t3 - t4)
Method 5
Perhaps heapq.nlargest
import numpy as np import heapq x = np.array([1,-5,4,6,-3,3]) z = heapq.nlargest(3,x)
Result:
>>> z [6, 4, 3]
If you want to find the indices of the n largest elements using bottleneck you could use
bottleneck.argpartsort
>>> x = np.array([1,-5,4,6,-3,3]) >>> z = bottleneck.argpartsort(-x, 3)[:3] >>> z array([3, 2, 5]
Method 6
You can also use numpy’s percentile function. In my case it was slightly faster then bottleneck.partsort():
import timeit
import bottleneck as bn
N,M,K = 10,1000000,100
start = timeit.default_timer()
for k in range(K):
a=np.random.uniform(size=M)
tmp=-bn.partsort(-a, N)[:N]
stop = timeit.default_timer()
print (stop - start)/K
start = timeit.default_timer()
perc = (np.arange(M-N,M)+1.0)/M*100
for k in range(K):
a=np.random.uniform(size=M)
tmp=np.percentile(a,perc)
stop = timeit.default_timer()
print (stop - start)/K
Average time per loop:
- bottleneck.partsort(): 59 ms
- np.percentile(): 54 ms
Method 7
If storing the array as a list of numbers isn’t problematic, you can use
import heapq heapq.nlargest(N, a)
to get the N largest members.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0