Find matching rows in 2 dimensional numpy array

I would like to get the index of a 2 dimensional Numpy array that matches a row. For example, my array is this:

vals = np.array([[0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3],
                 [0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3]])

I would like to get the index that matches the row [0, 1] which is index 3 and 15. When I do something like numpy.where(vals == [0 ,1]) I get…

(array([ 0,  3,  3,  4,  5,  6,  9, 12, 15, 15, 16, 17, 18, 21]), array([0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]))

I want index array([3, 15]).

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

You need the np.where function to get the indexes:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

Or, as the documentation states:

If only condition is given, return condition.nonzero()

You could directly call .nonzero() on the array returned by .all:

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)

To dissassemble that:

>>> vals == (0, 1)
array([[ True, False],
       [False, False],
       ...
       [ True, False],
       [False, False],
       [False, False]], dtype=bool)

and calling the .all method on that array (with axis=1) gives you True where both are True:

>>> (vals == (0, 1)).all(axis=1)
array([False, False, False,  True, False, False, False, False, False,
       False, False, False, False, False, False,  True, False, False,
       False, False, False, False, False, False], dtype=bool)

and to get which indexes are True:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

or

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)

I find my solution a bit more readable, but as unutbu points out, the following may be faster, and returns the same value as (vals == (0, 1)).all(axis=1):

>>> (vals[:, 0] == 0) & (vals[:, 1] == 1)

Method 2

In [5]: np.where((vals[:,0] == 0) & (vals[:,1]==1))[0]
Out[5]: array([ 3, 15])

I’m not sure why, but this is significantly faster than
np.where((vals == (0, 1)).all(axis=1)):

In [34]: vals2 = np.tile(vals, (1000,1))

In [35]: %timeit np.where((vals2 == (0, 1)).all(axis=1))[0]
1000 loops, best of 3: 808 µs per loop

In [36]: %timeit np.where((vals2[:,0] == 0) & (vals2[:,1]==1))[0]
10000 loops, best of 3: 152 µs per loop

Method 3

Using the numpy_indexed package, you can simply write:

import numpy_indexed as npi
print(np.flatnonzero(npi.contains([[0, 1]], vals)))


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x