What is the meaning of "trainable_weights" in Keras?

If I freeze my base_model with trainable=False, I get strange numbers from trainable_weights.

Before freezing, my model has 162 trainable_weights. After freezing, the model only has 2. I added 2 layers on top of the pre-trained network. Does trainable_weights show me the layers to train? The number seems odd, given that the summary reports 2,253,335 trainable params.
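A minimal way to reproduce the effect (using MobileNetV2 as a stand-in for the base model; the exact counts vary by architecture):

from tensorflow import keras

# Stand-in base model; any keras.applications model behaves the same way.
base_model = keras.applications.MobileNetV2(include_top=False, pooling="avg")
print(len(base_model.trainable_weights))  # many variables before freezing

base_model.trainable = False  # freeze every layer of the base model
model = keras.Sequential([
    base_model,
    keras.layers.Dense(5, activation="softmax"),  # the added head
])
print(len(model.trainable_weights))  # -> 2: the Dense kernel and bias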



Solution 1:[1]

Trainable weights are the weights that will be learnt during the training process. If you set trainable=False, those weights are kept as they are and are not changed, because they are not learnt. You might see some "strange numbers" because either you are using a pre-trained network that has its weights already learnt, or you are using random initialization when defining the model. When using transfer learning with pre-trained models, a common practice is to freeze the weights of the base (pre-trained) model and train only the extra layers that you add at the end.
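As a sketch of that practice (the layer sizes here are placeholders, not from the question):

from tensorflow import keras

# Pre-trained base whose weights stay exactly as they were learnt.
base_model = keras.applications.VGG16(weights="imagenet", include_top=False, pooling="avg")
base_model.trainable = False

# Only these extra layers at the end will be trained.
model = keras.Sequential([
    base_model,
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()  # the VGG16 weights are listed under "Non-trainable params"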

Solution 2:[2]

Trainable weights are the same as trainable parameters. A trainable layer often has multiple trainable weights.

Let's view this example:

Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, None, 501)         0         
_________________________________________________________________
lstm_1 (LSTM)                (None, None, 40)          86720     
_________________________________________________________________
SoftDense (TimeDistributed)  (None, None, 501)         20541     
=================================================================
Total params: 107,261
Trainable params: 107,261
Non-trainable params: 0
_________________________________________________________________

The first layer is just an input layer; it receives the data as-is, so it does not have any trainable weights.

The next layer has 542*4*40 = 86,720 trainable weights: 40 due to its output dimension; 4 because an LSTM actually has 4 trainable "layers" (gates) inside it; and 542 = 501 + 40 + 1, the input dimension plus the recurrent (hidden) dimension plus one bias term.

The last layer has 41*501 = 20,541 trainable weights: 40 from the hidden dimension of its input (the LSTM layer), plus 1 for the bias, times 501 for its output dimension.

Total trainable parameters are 107,261.
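The arithmetic above is easy to verify with a few lines of plain Python (dimensions taken from the summary):

# Verify the parameter counts from the summary above.
input_dim, hidden_dim, output_dim = 501, 40, 501

# LSTM: 4 gates, each with weights for the input, the recurrent state, and a bias.
lstm_params = 4 * (input_dim + hidden_dim + 1) * hidden_dim
print(lstm_params)                 # 86720

# TimeDistributed Dense: (hidden inputs + 1 bias) per output unit.
dense_params = (hidden_dim + 1) * output_dim
print(dense_params)                # 20541

print(lstm_params + dense_params)  # 107261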

If I were to freeze the last layer, I would have only 86,720 trainable weights.
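In Keras, that freeze would look something like this (assuming the model above is stored in a variable named model; note that a compiled model must be recompiled for the change to take effect):

# Hypothetical: freeze the final TimeDistributed layer by name.
model.get_layer("SoftDense").trainable = False
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()  # Trainable params: 86,720; Non-trainable params: 20,541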

Solution 3:[3]

Late to the party, but maybe this answer can be useful to others who might be googling this.

First, it is useful to distinguish the quantity "Trainable params" that one sees at the end of my_model.summary() from the output of len(my_model.trainable_weights).

Maybe an example helps: let's say I have a model with the VGG16 architecture.

from tensorflow import keras

# Load VGG16 pre-trained on ImageNet, without the fully connected top layers.
my_model = keras.applications.vgg16.VGG16(
    weights="imagenet",
    include_top=False
)

# take a look at the model summary
my_model.summary()

You will see that there are 13 convolutional layers that have trainable parameters; the pooling and input layers have none, i.e. no learning is needed for them. In each of those 13 layers, on the other hand, there are "weights" and "biases" that need to be learned; think of them as variables.
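You can inspect those variables per layer with a quick loop over the model defined above:

# Print each layer that owns trainable variables, along with their shapes.
for layer in my_model.layers:
    if layer.trainable_weights:
        print(layer.name, [w.shape.as_list() for w in layer.trainable_weights])
# Each of the 13 conv layers prints one kernel shape and one bias shape.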

What len(my_model.trainable_weights) will give you is the number of trainable layers (if you will) multiplied by 2 (weights + bias).

In this case, if you print len(my_model.trainable_weights), you will get 26 as the answer. We can think of 26 as the number of variables for the optimization, variables that can of course differ in shape.

Now to connect trainable_weights to the total number of trainable parameters, one can try:

# Sum the number of scalar entries across all trainable variables.
trainable_params = 0
for weight in my_model.trainable_weights:
    trainable_params += weight.numpy().size
print(f"{trainable_params = }")

You will get this number: 14714688, which is exactly the "Trainable params" number you see at the end of my_model.summary().
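The same count can also be written as a one-liner, if you prefer:

trainable_params = sum(w.numpy().size for w in my_model.trainable_weights)
print(trainable_params)  # 14714688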

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  techytushar
Solution 2
Solution 3  Alireza Amani