What is the simplest way to compare two NumPy arrays for equality (where equality is defined as: A = B iff for all indices i: A[i] == B[i])?
Simply using == gives me a boolean array:
>>> numpy.array([1,1,1]) == numpy.array([1,1,1]) array([ True, True, True], dtype=bool)
Do I have to and the elements of this array to determine if the arrays are equal, or is there a simpler way to compare?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
(A==B).all()
test if all values of array (A==B) are True.
Note: maybe you also want to test A and B shape, such as A.shape == B.shape
Special cases and alternatives (from dbaupp’s answer and yoavram’s comment)
It should be noted that:
- this solution can have a strange behavior in a particular case: if either
AorBis empty and the other one contains a single element, then it returnTrue. For some reason, the comparisonA==Breturns an empty array, for which thealloperator returnsTrue. - Another risk is if
AandBdon’t have the same shape and aren’t broadcastable, then this approach will raise an error.
In conclusion, if you have a doubt about A and B shape or simply want to be safe: use one of the specialized functions:
np.array_equal(A,B) # test if same shape, same elements values np.array_equiv(A,B) # test if broadcastable shape, same elements values np.allclose(A,B,...) # test if same shape, elements have close enough values
Method 2
The (A==B).all() solution is very neat, but there are some built-in functions for this task. Namely array_equal, allclose and array_equiv.
(Although, some quick testing with timeit seems to indicate that the (A==B).all() method is the fastest, which is a little peculiar, given it has to allocate a whole new array.)
Method 3
If you want to check if two arrays have the same shape AND elements you should use np.array_equal as it is the method recommended in the documentation.
Performance-wise don’t expect that any equality check will beat another, as there is not much room to optimize
comparing two elements. Just for the sake, i still did some tests.
import numpy as np import timeit A = np.zeros((300, 300, 3)) B = np.zeros((300, 300, 3)) C = np.ones((300, 300, 3)) timeit.timeit(stmt='(A==B).all()', setup='from __main__ import A, B', number=10**5) timeit.timeit(stmt='np.array_equal(A, B)', setup='from __main__ import A, B, np', number=10**5) timeit.timeit(stmt='np.array_equiv(A, B)', setup='from __main__ import A, B, np', number=10**5) > 51.5094 > 52.555 > 52.761
So pretty much equal, no need to talk about the speed.
The (A==B).all() behaves pretty much as the following code snippet:
x = [1,2,3] y = [1,2,3] print all([x[i]==y[i] for i in range(len(x))]) > True
Method 4
Let’s measure the performance by using the following piece of code.
import numpy as np
import time
exec_time0 = []
exec_time1 = []
exec_time2 = []
sizeOfArray = 5000
numOfIterations = 200
for i in xrange(numOfIterations):
A = np.random.randint(0,255,(sizeOfArray,sizeOfArray))
B = np.random.randint(0,255,(sizeOfArray,sizeOfArray))
a = time.clock()
res = (A==B).all()
b = time.clock()
exec_time0.append( b - a )
a = time.clock()
res = np.array_equal(A,B)
b = time.clock()
exec_time1.append( b - a )
a = time.clock()
res = np.array_equiv(A,B)
b = time.clock()
exec_time2.append( b - a )
print 'Method: (A==B).all(), ', np.mean(exec_time0)
print 'Method: np.array_equal(A,B),', np.mean(exec_time1)
print 'Method: np.array_equiv(A,B),', np.mean(exec_time2)
Output
Method: (A==B).all(), 0.03031857 Method: np.array_equal(A,B), 0.030025185 Method: np.array_equiv(A,B), 0.030141515
According to the results above, the numpy methods seem to be faster than the combination of the == operator and the all() method and by comparing the numpy methods the fastest one seems to be the numpy.array_equal method.
Method 5
Usually two arrays will have some small numeric errors,
You can use numpy.allclose(A,B), instead of (A==B).all(). This returns a bool True/False
Method 6
Now use np.array_equal. From documentation:
np.array_equal([1, 2], [1, 2]) True np.array_equal(np.array([1, 2]), np.array([1, 2])) True np.array_equal([1, 2], [1, 2, 3]) False np.array_equal([1, 2], [1, 4]) False
Method 7
On top of the other answers, you can now use an assertion:
numpy.testing.assert_array_equal(x, y)
You also have similar function such as numpy.testing.assert_almost_equal()
https://numpy.org/doc/stable/reference/generated/numpy.testing.assert_array_equal.html
Method 8
Just for the sake of completeness. I will add the
pandas approach for comparing two arrays:
import numpy as np a = np.arange(0.0, 10.2, 0.12) b = np.arange(0.0, 10.2, 0.12) ap = pd.DataFrame(a) bp = pd.DataFrame(b) ap.equals(bp) True
FYI: In case you are looking of How to
compare Vectors, Arrays or Dataframes in R.
You just you can use:
identical(iris1, iris2) #[1] TRUE all.equal(array1, array2) #> [1] TRUE
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0