Random cropping data augmentation convolutional neural networks

I am training a convolutional neural network but have a relatively small dataset, so I am implementing techniques to augment it. This is the first time I am working on a core computer vision problem, so I am relatively new to it. I have read about many augmentation techniques, and one that is mentioned a lot in the papers is random cropping. I am now trying to implement it. I have searched a lot about this technique but couldn't find a proper explanation, so I have a few questions:

How does random cropping actually help in data augmentation? Is there any library (e.g. OpenCV, PIL, scikit-image, SciPy) in Python that implements random cropping? If not, how should I implement it?



Solution 1:[1]

In my opinion, the reason random cropping helps data augmentation is that while the semantics of the image are preserved (unless you pick a really bad crop, but let's assume you set up your random cropping so that this has very low probability), the activation values you get in your conv net are different. In effect, the conv net learns to associate a broader range of spatial activation statistics with a given class label, so data augmentation via random cropping helps improve the robustness of the feature detectors in conv nets. In the same vein, a random crop produces different intermediate activation values and a different forward pass, so it acts like a "new training point."

The effect is also not trivial. See the recent work on adversarial examples in neural networks (from relatively shallow ones up to AlexNet-sized ones). Images that look more or less the same semantically can yield drastically different class probabilities when passed through a neural net with a softmax classifier on top. So changes that are subtle from a semantic point of view can end up producing very different forward passes through a conv net. For more details, see "Intriguing properties of neural networks".

To answer the last part of your question: I usually just write my own random cropping script. Say my images have shape (3, 256, 256) (3 RGB channels, 256x256 spatial size). You can write a loop that takes 224x224 random crops of an image by randomly selecting a valid upper-left corner point. I typically compute an array of all valid corner points, and if I want 10 random crops, I randomly select 10 different corner points from this set. If I choose (x0, y0) as the upper-left corner, the crop is X[:, x0:x0+224, y0:y0+224], or something like it (the leading colon keeps all three channels). I prefer choosing from a pre-computed set of valid corner points over drawing one corner at a time because it guarantees I never get a duplicate crop, though in practice duplicates are unlikely anyway.
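
The corner-point approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's actual script: the function name `random_crops`, its parameters, and the use of NumPy's `default_rng` are all hypothetical choices, and the image is assumed to be a NumPy array in (channels, height, width) layout. Corners are named (y, x) here, meaning (row, column).

```python
import numpy as np

def random_crops(image, crop_size=224, n_crops=10, rng=None):
    """Return n_crops distinct random crops from a (C, H, W) image.

    Hypothetical helper: names and defaults are illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, height, width = image.shape
    max_y = height - crop_size  # largest valid top-left row
    max_x = width - crop_size   # largest valid top-left column
    if max_y < 0 or max_x < 0:
        raise ValueError("crop_size is larger than the image")

    n_corners = (max_y + 1) * (max_x + 1)  # size of the valid corner set
    if n_crops > n_corners:
        raise ValueError("not enough distinct corner points")

    # Sample flat indices without replacement so no corner is repeated,
    # then convert each flat index back to a (row, column) corner.
    flat = rng.choice(n_corners, size=n_crops, replace=False)
    ys, xs = np.unravel_index(flat, (max_y + 1, max_x + 1))

    crops = [image[:, y:y + crop_size, x:x + crop_size] for y, x in zip(ys, xs)]
    return np.stack(crops), list(zip(ys.tolist(), xs.tolist()))

# Usage: ten distinct 224x224 crops from one 3x256x256 image.
image = np.random.rand(3, 256, 256)
crops, corners = random_crops(image, n_crops=10, rng=np.random.default_rng(0))
print(crops.shape)  # (10, 3, 224, 224)
```

Sampling flat indices avoids materialising the full list of corner points, which for this image size would hold 33 x 33 = 1089 entries; building the explicit list and sampling from it would work just as well.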

Solution 2:[2]

To answer the "how to implement cropping" question, you might want to explore https://github.com/aleju/imgaug. It has a Crop augmenter that lets you do random cropping, along with a lot of other fun augmenters.

Solution 3:[3]

Based on the above answer by @Indie AI, the following piece of code may help you to implement random cropping:

from random import randrange
import numpy as np

def my_random_crop(vol, w, h):
    """
    Given a volume with extra pixels, this function randomly
    crops it by removing a fixed number of pixels along the width and height.

    :param vol: input volume with shape (W, H, D)
    :param w: total number of pixels to be removed from the width dimension
    :param h: total number of pixels to be removed from the height dimension
    :return: a cropped volume with shape (W - w, H - h, D)
    """
    # randrange(w + 1) covers 0..w inclusive, so every offset is possible
    vw = randrange(w + 1)  # pixels removed from the start of the width axis
    vh = randrange(h + 1)  # pixels removed from the start of the height axis

    rw = w - vw  # remaining pixels to remove from the end of the width axis
    rh = h - vh  # remaining pixels to remove from the end of the height axis

    width, height, depth = vol.shape
    vol = vol[vw:width - rw, vh:height - rh, :]

    return vol

As a quick test, run the following:

tmp = np.random.randn(64, 128, 32)
print(tmp.shape)  # (64, 128, 32)
tmp = my_random_crop(tmp, w=10, h=15)
print(tmp.shape)  # (54, 113, 32)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Indie AI
Solution 2 happy_sisyphus
Solution 3