Sliding window input (image sequence) for convolutional neural network
I am currently trying to feed an image sequence as a single input entity to my CNN. I found the numpy utility numpy.lib.stride_tricks.sliding_window_view.
My image data wrapper array has shape (num_of_images, height, width, channels), and I would like to group 5 images together into a single input array of shape (5, height, width, channels). This would give a wrapper array of shape (num_of_images/5, 5, height, width, channels). However, I struggle to use the sliding window view. Can someone enlighten me?
Bonus question: Each of the images has an associated label. I am unsure how to treat these labels when dealing with an image sequence.
Thank you in advance!
Solution 1:[1]
It just depends on how much you collect from the sources:
import matplotlib.pyplot as plt
import numpy as np

# list_pictures is assumed to hold image file paths (not shown in the answer)
image_1 = plt.imread(list_pictures[0])
image_2 = plt.imread(list_pictures[1])
# Stack the two images along the height axis
image = np.concatenate((image_1, image_2), axis=0)
image = np.reshape(image, (1920, 720, 4))  # confirm the input image shape
print(image.shape)  # (1920, 720, 4)

shape = (64, 64, 4)
v = np.lib.stride_tricks.sliding_window_view(image, shape)
print(v.shape)  # (1857, 657, 1, 64, 64, 4)

# Show the first pixel of every 64x64 window
plt.imshow(np.reshape(v[:, :, :, 0, 0], (1857, 657, 4)))
plt.show()
plt.close()
input('...')
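For the shape asked about in the question, the window can also slide along the image axis instead of the pixel axes. The following is only a minimal sketch, assuming NumPy 1.20 or newer and an array of shape (num_of_images, height, width, channels). The function name make_sequences and its step parameter are made up for illustration and the image data is dummy data.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_sequences(images, window_size=5, step=1):
    """Group consecutive images into windows of `window_size` frames.

    Hypothetical helper for illustration; `images` is assumed to have
    shape (num_of_images, height, width, channels).
    """
    # Sliding along axis 0 appends the window axis at the end:
    # (n_windows, height, width, channels, window_size)
    view = sliding_window_view(images, window_size, axis=0)
    # Move the window axis next to the batch axis:
    # (n_windows, window_size, height, width, channels)
    view = np.moveaxis(view, -1, 1)
    # step=1 keeps every overlapping window, step=window_size keeps
    # only the non-overlapping ones.
    return view[::step]


images = np.zeros((20, 32, 32, 3), dtype=np.float32)  # dummy data
overlapping = make_sequences(images)       # shape (16, 5, 32, 32, 3)
blocks = make_sequences(images, step=5)    # shape (4, 5, 32, 32, 3)
```

With step=1 there are num_of_images - 4 overlapping windows. With step=5 the result has num_of_images/5 windows when the image count is divisible by 5, which matches the shape in the question. The result is a read-only view on the original data, so call .copy() on it if the framework needs a writable or contiguous array.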
Solution 2:[2]
I did it without the numpy utility like so:
im_pixels is an array containing n 1-D arrays, each with im_height*im_width entries. The trailing 1 in the shapes is the single channel (greyscale).
import numpy as np

# normalize_pixels is a helper from the answer author's code (not shown)
def prep_images(im_pixels, window_size, im_height, im_width, pixel_normalizer):
    # Only start positions that leave room for a full window are used
    n_windows = len(im_pixels) - window_size + 1
    images = np.empty((n_windows, window_size, im_height, im_width, 1))
    for i in range(n_windows):
        frame = im_pixels[i:i + window_size]
        for j, image in enumerate(frame):
            # Normalize into a local variable so im_pixels is not modified
            normalized = normalize_pixels(image, pixel_normalizer)
            images[i, j] = np.reshape(normalized, (im_height, im_width, 1))
    return images
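Regarding the bonus question about labels: neither answer covers it, so the following is only one possible approach, not an established rule. It derives a single label per window, either the label of the last image in the window or the most frequent label in the window. The function name window_labels and the strategy argument are hypothetical, and the labels are assumed to be non-negative integers.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def window_labels(labels, window_size, strategy="last"):
    """Return one label per sliding window over a 1-D label array.

    Hypothetical helper; assumes non-negative integer class labels.
    """
    labels = np.asarray(labels)
    # Shape (n_windows, window_size): row i holds labels[i:i + window_size]
    per_window = sliding_window_view(labels, window_size)
    if strategy == "last":
        # Label of the most recent image in each window
        return per_window[:, -1]
    if strategy == "majority":
        # Most frequent label in each window; ties go to the smallest label
        return np.array([np.bincount(row).argmax() for row in per_window])
    raise ValueError(f"unknown strategy: {strategy!r}")


labels = np.array([0, 0, 1, 1, 1, 2])
last = window_labels(labels, 3)                  # one label per window
majority = window_labels(labels, 3, "majority")
```

The returned array has len(labels) - window_size + 1 entries, so it lines up with overlapping image windows built over the same range. Which strategy fits depends on the task: the last-image label suits predicting the current state from recent history, while the majority label suits classifying the sequence as a whole.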
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Martijn Pieters |
Solution 2 | ABF |