This answer explains how to find the nearest (sorted) array element to a single point, in a manner efficient for large arrays (slightly modified):
def arg_nearest(array, value):
idx = np.searchsorted(array, value, side="left")
if idx > 0 and (idx == len(array) or math.fabs(value - array[idx-1]) < math.fabs(value - array[idx])):
return idx-1
else:
return idx
If, instead, we want to find the array elements nearest a set of points (i.e. a second array); are there efficient (by speed, for large arrays) ways of extending this besides using a for-loop?
Some test cases:
>>> xx = [0.2, 0.8, 1.3, 1.5, 2.0, 3.1, 3.8, 3.9, 4.5, 5.1, 5.5] >>> yy = [1, 2, 3, 4, 5] >>> of_x_nearest_y(xx, yy) [0.5, 2.0, 3.1, 3.9, 5.1] >>> xx = [0.2, 0.8, 1.3, 1.5, 2.0, 3.1, 3.8, 3.9, 4.5, 5.1, 5.5] >>> yy = [-2, -1, 4.6, 5.8] >>> of_x_nearest_y(xx, yy) [0.2, 0.2, 4.5, 5.5]
Edit: assuming both arrays are sorted, you can do a little better than a completely naive for-loop by excluding values below those already matched, i.e.
def args_nearest(options, targets):
locs = np.zeros(targets.size, dtype=int)
prev = 0
for ii, tt in enumerate(targets):
locs[ii] = prev + arg_nearest(options[prev:], tt)
prev = locs[ii]
return locs
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
You can make few changes to extend it for an array of elements in value, like so –
idx = np.searchsorted(xx, yy, side="left").clip(max=xx.size-1)
mask = (idx > 0) &
( (idx == len(xx)) | (np.fabs(yy - xx[idx-1]) < np.fabs(yy - xx[idx])) )
out = xx[idx-mask]
Explanation
Nomenclature : array is the array in which we are looking to place elements from value to maintain the sorted nature of array.
Changes needed to extend the solution for a single element to many elements for searching :
1] Clip the indices array idx obtained from np.searchsorted at a max. of array.size-1, because for elements in value that are larger than the maximum of array, we need to make idx indexable by array.
2] Introduce numpy to replace math to do those operations in a vectorized manner.
3] Replace the conditional statement by the trick of idx - mask. In this case, internally Python would up-convert mask to an int array to match up with the datatype of idx. Thus, all the True elements become 1 and thus for True elements we would effectively have idx-1, which is the True case of the IF conditional statement in the original code.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0