Can I use Layer Normalization with CNN?
I have read that Layer Normalization is a more modern normalization method than Batch Normalization, and it is very simple to implement in TensorFlow. But I think layer normalization was designed for RNNs and batch normalization for CNNs. Can I use layer normalization with a CNN that performs an image classification task? What are the criteria for choosing between batch normalization and layer normalization?
Solution 1:[1]
You can use layer normalisation in CNNs, but I don't think it is more 'modern' than batch norm. They simply normalise differently. Layer norm normalises all the activations of a single layer for each sample in a batch, collecting statistics from every unit within the layer, while batch norm normalises each individual activation across the whole batch, collecting statistics for every single unit across the batch.
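The difference comes down to which axes the mean and variance are taken over. The following is a minimal NumPy sketch, not TensorFlow's implementation. It assumes an NHWC feature map, follows the usual convolutional convention of one batch-norm statistic per channel, and omits the learned scale and shift. The function names are illustrative only.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # x has shape (N, H, W, C). One mean/variance per channel,
    # collected across the batch and spatial axes.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # One mean/variance per sample, collected across all of that
    # sample's activations (H, W, C). No other sample is involved.
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 4, 16))  # toy feature map

bn_out = batch_norm(x)   # each channel is ~zero mean, unit variance over the batch
ln_out = layer_norm(x)   # each sample is ~zero mean, unit variance over its own units
```

Only the `axis` argument differs between the two functions, which is the whole distinction described above.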
Batch norm is generally preferred over layer norm because it tries to normalise every activation to zero mean and unit variance (a unit Gaussian), while layer norm only tries to bring the 'average' of all activations to a unit Gaussian. But if the batch size is too small to collect reasonable statistics, layer norm is preferred.
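The small-batch problem can be seen with a toy case. This NumPy sketch assumes a batch of two flat feature vectors with made-up values and no learned parameters. It normalises the same sample alongside two different partners.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # x has shape (N, features). Statistics per feature, across the batch.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Statistics per sample, across that sample's own features.
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

sample = np.array([[1.0, 2.0, 3.0, 4.0]])
batch_a = np.concatenate([sample, np.zeros((1, 4))])      # partner below the sample
batch_b = np.concatenate([sample, np.full((1, 4), 5.0)])  # partner above the sample

ln_a, ln_b = layer_norm(batch_a)[0], layer_norm(batch_b)[0]
bn_a, bn_b = batch_norm(batch_a)[0], batch_norm(batch_b)[0]

ln_unchanged = np.allclose(ln_a, ln_b)  # True: layer norm ignores the rest of the batch
bn_unchanged = np.allclose(bn_a, bn_b)  # False: batch norm output flips sign with the partner
```

With only two samples, the batch-norm output for each feature is roughly +1 or -1 depending only on which sample is larger, so the statistics carry very little information. The layer-norm output for the sample is identical in both batches.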
Solution 2:[2]
I would also like to add that, as mentioned in the original Layer Norm paper (page 10, section 6.7), Layer Norm isn't advised for CNNs, and the authors say 'that more research has to be done'.
Also, a heads up: for RNNs, Layer Norm seems a better choice than Batch Norm, because training cases can have different lengths within the same minibatch.
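One way to read the variable-length point: if batch statistics are kept per time step, later steps are estimated from fewer and fewer sequences. The NumPy sketch below assumes a padded minibatch with hypothetical lengths and random stand-in hidden states. It is an illustration, not how any particular RNN library handles it.

```python
import numpy as np

lengths = np.array([4, 2, 1])            # hypothetical sequence lengths in one minibatch
max_len, hidden = int(lengths.max()), 8
rng = np.random.default_rng(0)
h = rng.normal(size=(len(lengths), max_len, hidden))   # stand-in hidden states, shape (N, T, H)

# True where a time step holds real data rather than padding.
mask = np.arange(max_len)[None, :] < lengths[:, None]

# How many sequences could contribute to batch statistics at each time step.
active_per_step = mask.sum(axis=0)
print(active_per_step)  # → [3 2 1 1]

def layer_norm_step(h_t, eps=1e-5):
    # Normalise over the hidden units only, so each (sequence, step)
    # pair is handled independently of the rest of the batch.
    mean = h_t.mean(axis=-1, keepdims=True)
    var = h_t.var(axis=-1, keepdims=True)
    return (h_t - mean) / np.sqrt(var + eps)

ln_out = layer_norm_step(h) * mask[..., None]   # zero out the padded steps
```

From the third step onward only one sequence is still active, so a per-step batch variance would be computed from a single example. Layer norm needs no such statistics.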
Solution 3:[3]
In more recent work [1], it was found that you can use LayerNorm in CNNs without degrading accuracy, though it depends on the model architecture. Liu et al. [1] found while developing ConvNeXt that "Directly substituting LN for BN in the original ResNet will result in suboptimal performance" but they observed that their ConvNeXt "model does not have any difficulties training with LN; in fact, the performance is slightly better".
It would be great if there were a better explanation as to why...
[1] Liu et al. A ConvNet for the 2020s. https://arxiv.org/pdf/2201.03545.pdf
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | vijay m |
Solution 2 | |
Solution 3 | hendryx |