Efficiently Using Multiple Numpy Slices for Random Image Cropping

I have a 4-D NumPy array: the first dimension is the number of images in the data set, the second and third are the (equal) width and height, and the fourth is the number of channels (3). For example, say I have 4 color images that are 28*28, so my image data looks like this:

import numpy as np
X = np.reshape(np.arange(4*28*28*3), (4,28,28,3))

I would like to select a random 16*16 (width x height) crop from each of the 4 images. Critically, I want the crop to be different per image, i.e., I want to generate 4 random (x_offset, y_offset) pairs. In the end I want an array of shape (4, 16, 16, 3).

If I were to write this in a for loop it would look something like this:

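# Valid offsets are 0..12 (28 - 16); randint's upper bound is exclusive
x = np.random.randint(0, 13, 4)
y = np.random.randint(0, 13, 4)
crops = []
for i in range(X.shape[0]):
    cropped_image = X[i, x[i]:x[i]+16, y[i]:y[i]+16, :]
    crops.append(cropped_image)
out = np.stack(crops)  # shape (4, 16, 16, 3)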

But I’d like to do it as efficiently as possible and I’m wondering if there’s a way to do it with strides and fancy indexing. I’ve seen the answers to this question, but can’t quite wrap my head around how I might combine something like stride_tricks with random starting points for the strides on the second and third (width and height) axes.

Answers:


Method 1

Leverage a stride-based method for efficient patch extraction

We can use scikit-image's view_as_windows, which is built on np.lib.stride_tricks.as_strided, to get sliding windows that are merely views into the input array. They therefore incur no extra memory overhead and are virtually free to create. We could use np.lib.stride_tricks.as_strided directly, but the setup work is hard to manage, especially on arrays with higher dimensions. If scikit-image is not available, we can copy the view_as_windows source code, which works standalone.
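To give a feel for the setup work that a direct as_strided call involves, here is a minimal sketch for the 4-D case in the question. The variable names, the seeded generator, and the channels-last window layout are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

N, H, W, C = 4, 28, 28, 3
crop = 16
X = np.arange(N * H * W * C).reshape(N, H, W, C)

# Random top-left offsets; the upper bound of integers() is exclusive
rng = np.random.default_rng(0)
x = rng.integers(0, H - crop + 1, size=N)
y = rng.integers(0, W - crop + 1, size=N)

# windows[n, i, j, a, b, c] == X[n, i + a, j + b, c]
# The two window axes reuse the strides of the two spatial axes.
s0, s1, s2, s3 = X.strides
windows = as_strided(
    X,
    shape=(N, H - crop + 1, W - crop + 1, crop, crop, C),
    strides=(s0, s1, s2, s1, s2, s3),
    writeable=False,  # guard against accidental writes through the view
)

# One window per image, selected with advanced indexing (this step copies)
out = windows[np.arange(N), x, y]
print(out.shape)
```

Getting the shape or strides wrong here can read memory outside the array, which is why a wrapper such as view_as_windows is usually the safer choice.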

Explanation of view_as_windows usage

The idea with view_as_windows is that we pass window_shape as a tuple whose length equals the number of dimensions of the input array. The axes along which we need to slide get their respective window lengths, and the rest get 1s. The result is an array of views with singleton axes (axes of length 1) corresponding to the 1s in window_shape. For those axes we index the zeroth element to get a squeezed version of the sliding windows.
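The shapes described above can be traced with a small sketch. It uses NumPy's own sliding_window_view (available in NumPy 1.20 and later) as a stand-in, on the assumption that with a full-length window_shape and a step of 1 it lays out its result the same way as view_as_windows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

X = np.arange(4 * 28 * 28 * 3).reshape(4, 28, 28, 3)

# One window length per input axis; 1 means "do not slide along this axis"
full = sliding_window_view(X, (1, 16, 16, 1))
print(full.shape)  # (4, 13, 13, 3, 1, 16, 16, 1)

# Drop the two singleton window axes by indexing their zeroth element
w = full[..., 0, :, :, 0]
print(w.shape)  # (4, 13, 13, 3, 16, 16)

# Still a view into X, so no pixel data has been copied yet
print(np.shares_memory(w, X))  # True
```

Each w[n, i, j, c] is the 16x16 window of channel c in image n whose top-left corner sits at (i, j).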

This gives us a solution like so:

# Get sliding windows; w has shape (4, 13, 13, 3, 16, 16)
from skimage.util.shape import view_as_windows
w = view_as_windows(X, (1,16,16,1))[...,0,:,:,0]

# Index and get our specific windows; out has shape (4, 3, 16, 16)
out = w[np.arange(X.shape[0]),x,y]

# If you need those in the same format as in the posted loopy code,
# move channels last to get shape (4, 16, 16, 3)
out = out.transpose(0,2,3,1)
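As a sanity check, the windowed result can be compared against the loop from the question. The sketch below wraps the steps in a hypothetical helper, random_crops, and swaps in NumPy's sliding_window_view (NumPy 1.20+) so that it runs without scikit-image; both choices are assumptions for illustration rather than part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def random_crops(images, crop, rng):
    """Return one random crop x crop patch per image, plus the offsets used.

    images is assumed to have shape (n, height, width, channels).
    """
    n, h, w, _ = images.shape
    x = rng.integers(0, h - crop + 1, size=n)  # upper bound is exclusive
    y = rng.integers(0, w - crop + 1, size=n)
    # Slide only over the two spatial axes: (n, h-crop+1, w-crop+1, c, crop, crop)
    windows = sliding_window_view(images, (crop, crop), axis=(1, 2))
    # Pick one window per image, then move channels last
    out = windows[np.arange(n), x, y].transpose(0, 2, 3, 1)
    return out, x, y

X = np.arange(4 * 28 * 28 * 3).reshape(4, 28, 28, 3)
out, x, y = random_crops(X, 16, np.random.default_rng(0))

# Reference result from the explicit loop
expected = np.stack([X[i, x[i]:x[i] + 16, y[i]:y[i] + 16, :] for i in range(len(X))])
print(out.shape, np.array_equal(out, expected))
```

The window array itself is a view, but the advanced-indexing step copies the selected patches, so the returned array owns its data and is safe to modify.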


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
