How does a custom input_shape for Inception V3 in Keras work?

I know that the input_shape for Inception V3 is (299, 299, 3). But in Keras it is possible to construct versions of Inception V3 that have a custom input_shape if include_top is False.

"input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) (with 'channels_last' data format) or (3, 299, 299) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value" - https://keras.io/applications/#inceptionv3

How is this possible, and why can it only have a custom input_shape if include_top is False?



Solution 1:[1]

This is possible because, without the top, the model is fully convolutional. Convolutions don't care about the image size; they are "sliding filters". Big images give big outputs, and small images give small outputs. (The filters themselves do have a fixed size, defined by kernel_size and the number of input and output channels.)
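To make the "big images, big outputs" point concrete, here is a rough sketch in plain Python (no Keras needed) that tracks only the spatial size through the network. The layer list is an assumption: a simplified view of the unpadded ('valid') convolution and pooling steps that shrink the feature map in the Keras Inception V3 implementation. The helper names are made up for illustration.

```python
def valid_out(size, kernel, stride=1):
    """Output size along one axis of an unpadded ('valid') conv or pooling layer."""
    return (size - kernel) // stride + 1


# Assumed simplification: only the layers that change the spatial size,
# as (kernel, stride) pairs in the order they are applied.
SIZE_CHANGING_LAYERS = [
    (3, 2),  # stem convolution, stride 2
    (3, 1),  # stem convolution
    (3, 2),  # stem max pooling
    (3, 1),  # stem convolution
    (3, 2),  # stem max pooling
    (3, 2),  # first grid reduction
    (3, 2),  # second grid reduction
]


def final_feature_map_size(image_size):
    """Side length of the last feature map for a square input image."""
    size = image_size
    for kernel, stride in SIZE_CHANGING_LAYERS:
        size = valid_out(size, kernel, stride)
    return size


for image_size in (299, 150, 75):
    side = final_feature_map_size(image_size)
    print(f"{image_size}x{image_size} input -> {side}x{side} feature map")
# 299 -> 8, 150 -> 3, 75 -> 1
```

Under these assumptions a 75-pixel input shrinks to a single 1x1 position, which is consistent with the documented minimum of 75 for width and height.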

You cannot do that when include_top is True because the top of the model ends in Dense layers, probably fed by a Flatten() layer. Dense layers require a fixed input size (with Flatten, that size is determined by the image size). Otherwise it would be impossible to create the trainable weights, since a variable number of weights doesn't make sense.
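As a small illustration of why a Flatten-based head ties the weight count to the image size, the arithmetic can be sketched in plain Python. The 2048 channels and 1000 output units are assumptions chosen to resemble an ImageNet classifier, and the function names are hypothetical.

```python
def flatten_features(side, channels=2048):
    """Number of values Flatten() produces from a side x side x channels feature map."""
    return side * side * channels


def dense_param_count(in_features, units):
    """Weights plus biases of a Dense layer."""
    return in_features * units + units


# The same 1000-unit Dense layer needs a different number of weights
# for each feature map size, so one set of weights cannot serve both.
for side in (8, 3):
    params = dense_param_count(flatten_features(side), 1000)
    print(f"{side}x{side}x2048 feature map -> {params:,} Dense parameters")
# 8x8 -> 131,073,000 and 3x3 -> 18,433,000
```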

Solution 2:[2]

In order to understand this, you should be clear on how convolutions work. As Daniel Möller said, if the image size changes, the convolution output size also changes. The takeaway is that you can't use a custom image size with the pretrained top layers, because their parameters are fixed.

For example, in your case Inception V3 uses global average pooling followed by a Dense layer after the last convolutional layer. Since it's the pretrained model, the Dense layer always expects the same input size from GlobalAveragePooling2D.

mixed10 (Concatenate)               (None, 8, 8, 2048)   0           ['activation_743[0][0]',
                                                                      'mixed9_1[0][0]',
                                                                      'concatenate_15[0][0]',
                                                                      'activation_751[0][0]']

avg_pool (GlobalAveragePooling2D)   (None, 2048)         0           ['mixed10[0][0]']

predictions (Dense)                 (None, 1000)         2049000     ['avg_pool[0][0]']

The Dense layer always expects that 2048-element vector from GlobalAveragePooling2D, and Keras only accepts the default (299, 299, 3) input shape with include_top=True and weights='imagenet'. With include_top=False you can pass a custom image size, and with weights=None all parameters are freshly initialized, so they can match whatever input you choose.
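As a sanity check on the summary above, here is a sketch in plain Python (no Keras) of what global average pooling does. It averages each channel over height and width, so the length of its output equals the channel count. The helper names and the all-ones feature maps are made up for illustration.

```python
def global_average_pool(feature_map):
    """Average a [height][width][channels] nested list over height and width."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [total / (height * width) for total in totals]


def make_feature_map(side, channels):
    """Hypothetical all-ones feature map of shape side x side x channels."""
    return [[[1.0] * channels for _ in range(side)] for _ in range(side)]


CHANNELS = 2048
for side in (8, 3):
    pooled = global_average_pool(make_feature_map(side, CHANNELS))
    print(f"{side}x{side}x{CHANNELS} -> vector of length {len(pooled)}")

# Dense(1000) on a 2048-element vector: weights plus biases
dense_params = CHANNELS * 1000 + 1000
print(dense_params)  # 2049000, the Param # of the predictions layer
```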

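You can implement a custom image size like this:

import tensorflow as tf
import tensorflow_addons as tfa  # provides SpectralNormalization
from tensorflow.keras import layers

# Backbone without the classification head; weights=None, so nothing is pretrained
a = tf.keras.applications.inception_v3.InceptionV3(
    include_top=False, weights=None, input_shape=(256, 256, 4))

x = tfa.layers.SpectralNormalization(
    layers.Conv2D(filters=64, kernel_size=(3, 3), strides=(2, 2), padding='same',
                  activation="LeakyReLU", name='ls_1'))(a.output)
x = layers.Dropout(0.3)(x)
model = tf.keras.Model(inputs=a.input, outputs=x)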

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Dcode