Comparing two NumPy arrays for equality, element-wise

What is the simplest way to compare two NumPy arrays for equality (where equality is defined as: A = B iff for all indices i: A[i] == B[i])?

Simply using == gives me a boolean array:

 >>> numpy.array([1,1,1]) == numpy.array([1,1,1])

array([ True,  True,  True], dtype=bool)

Do I have to and the elements of this array to determine if the arrays are equal, or is there a simpler way to compare?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

(A==B).all()

test if all values of array (A==B) are True.

Note: maybe you also want to test A and B shape, such as A.shape == B.shape

Special cases and alternatives (from dbaupp’s answer and yoavram’s comment)

It should be noted that:

  • this solution can have a strange behavior in a particular case: if either A or B is empty and the other one contains a single element, then it return True. For some reason, the comparison A==B returns an empty array, for which the all operator returns True.
  • Another risk is if A and B don’t have the same shape and aren’t broadcastable, then this approach will raise an error.

In conclusion, if you have a doubt about A and B shape or simply want to be safe: use one of the specialized functions:

np.array_equal(A,B)  # test if same shape, same elements values
np.array_equiv(A,B)  # test if broadcastable shape, same elements values
np.allclose(A,B,...) # test if same shape, elements have close enough values

Method 2

The (A==B).all() solution is very neat, but there are some built-in functions for this task. Namely array_equal, allclose and array_equiv.

(Although, some quick testing with timeit seems to indicate that the (A==B).all() method is the fastest, which is a little peculiar, given it has to allocate a whole new array.)

Method 3

If you want to check if two arrays have the same shape AND elements you should use np.array_equal as it is the method recommended in the documentation.

Performance-wise don’t expect that any equality check will beat another, as there is not much room to optimize comparing two elements. Just for the sake, i still did some tests.

import numpy as np
import timeit

A = np.zeros((300, 300, 3))
B = np.zeros((300, 300, 3))
C = np.ones((300, 300, 3))

timeit.timeit(stmt='(A==B).all()', setup='from __main__ import A, B', number=10**5)
timeit.timeit(stmt='np.array_equal(A, B)', setup='from __main__ import A, B, np', number=10**5)
timeit.timeit(stmt='np.array_equiv(A, B)', setup='from __main__ import A, B, np', number=10**5)
> 51.5094
> 52.555
> 52.761

So pretty much equal, no need to talk about the speed.

The (A==B).all() behaves pretty much as the following code snippet:

x = [1,2,3]
y = [1,2,3]
print all([x[i]==y[i] for i in range(len(x))])
> True

Method 4

Let’s measure the performance by using the following piece of code.

import numpy as np
import time

exec_time0 = []
exec_time1 = []
exec_time2 = []

sizeOfArray = 5000
numOfIterations = 200

for i in xrange(numOfIterations):

    A = np.random.randint(0,255,(sizeOfArray,sizeOfArray))
    B = np.random.randint(0,255,(sizeOfArray,sizeOfArray))

    a = time.clock() 
    res = (A==B).all()
    b = time.clock()
    exec_time0.append( b - a )

    a = time.clock() 
    res = np.array_equal(A,B)
    b = time.clock()
    exec_time1.append( b - a )

    a = time.clock() 
    res = np.array_equiv(A,B)
    b = time.clock()
    exec_time2.append( b - a )

print 'Method: (A==B).all(),       ', np.mean(exec_time0)
print 'Method: np.array_equal(A,B),', np.mean(exec_time1)
print 'Method: np.array_equiv(A,B),', np.mean(exec_time2)

Output

Method: (A==B).all(),        0.03031857
Method: np.array_equal(A,B), 0.030025185
Method: np.array_equiv(A,B), 0.030141515

According to the results above, the numpy methods seem to be faster than the combination of the == operator and the all() method and by comparing the numpy methods the fastest one seems to be the numpy.array_equal method.

Method 5

Usually two arrays will have some small numeric errors,

You can use numpy.allclose(A,B), instead of (A==B).all(). This returns a bool True/False

Method 6

Now use np.array_equal. From documentation:

np.array_equal([1, 2], [1, 2])
True
np.array_equal(np.array([1, 2]), np.array([1, 2]))
True
np.array_equal([1, 2], [1, 2, 3])
False
np.array_equal([1, 2], [1, 4])
False

Method 7

On top of the other answers, you can now use an assertion:

numpy.testing.assert_array_equal(x, y)

You also have similar function such as numpy.testing.assert_almost_equal()

https://numpy.org/doc/stable/reference/generated/numpy.testing.assert_array_equal.html

Method 8

Just for the sake of completeness. I will add the
pandas approach for comparing two arrays:

import numpy as np
a = np.arange(0.0, 10.2, 0.12)
b = np.arange(0.0, 10.2, 0.12)
ap = pd.DataFrame(a)
bp = pd.DataFrame(b)

ap.equals(bp)
True

FYI: In case you are looking of How to
compare Vectors, Arrays or Dataframes in R.
You just you can use:

identical(iris1, iris2)
#[1] TRUE
all.equal(array1, array2)
#> [1] TRUE


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x