tensorflow ModelCheckpoint on validation precision and recall

I want to checkpoint the model whenever validation precision or recall improves, in addition to validation accuracy and validation loss. So I have added the following:

checkPointPath = os.path.join(checkPointDir, 'cp-{epoch:03d}-{val_binary_accuracy:.3f}-{val_loss:.4f}-{val_precision:.3f}-{val_recall:.3f}.h5')  


valAccuracyCheckPointCallBack = tf.keras.callbacks.ModelCheckpoint(checkPointPath,
                                                                   monitor='val_binary_accuracy',
                                                                   save_freq='epoch',
                                                                   save_weights_only=False,
                                                                   save_best_only=True,
                                                                   verbose=1)
                
valLossCheckPointCallBack = tf.keras.callbacks.ModelCheckpoint(checkPointPath,
                                                               monitor='val_loss',
                                                               save_freq='epoch',
                                                               save_weights_only=False,
                                                               save_best_only=True,
                                                               verbose=1)
            
valPrecisionCheckPointCallBack = tf.keras.callbacks.ModelCheckpoint(checkPointPath,
                                                                    monitor='val_precision',
                                                                    save_freq='epoch',
                                                                    save_weights_only=False,
                                                                    save_best_only=True,
                                                                    verbose=1)
            
valRecallCheckPointCallBack = tf.keras.callbacks.ModelCheckpoint(checkPointPath,
                                                                 monitor='val_recall',
                                                                 save_freq='epoch',
                                                                 save_weights_only=False,
                                                                 save_best_only=True,
                                                                 verbose=1)
callBacks = [accuracyTrainingStopCB, valAccuracyCheckPointCallBack, valLossCheckPointCallBack, valPrecisionCheckPointCallBack, valRecallCheckPointCallBack]

Elsewhere in the code I have the metrics defined as follows:

  model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
                optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                metrics=[tf.keras.metrics.BinaryAccuracy(name='binary_accuracy', threshold=0.9),
                         tf.keras.metrics.Precision(name='precision', thresholds=0.9),
                         tf.keras.metrics.Recall(name='recall', thresholds=0.9)
                        ],
                )

And finally I pass callBacks to the fit() method:

history = model.fit(
        train_generator,
        epochs=1000,
        verbose=1,
        validation_data=validation_generator,
        validation_steps=8,
        callbacks=callBacks
    )

However, during training, I see it's not honoring val_precision and val_recall: their best values are always reported as 0.

Epoch 56/1000
126/128 [============================>.] - ETA: 0s - loss: 0.1819 - binary_accuracy: 0.9102 - precision: 0.9662 - recall: 0.8502
Epoch 56: val_binary_accuracy did not improve from 0.87500
            
Epoch 56: val_loss did not improve from 0.22489
            
Epoch 56: val_precision did not improve from 0.00000
            
Epoch 56: val_recall did not improve from 0.00000
128/128 [==============================] - 2s 18ms/step - loss: 0.1796 - binary_accuracy: 0.9116 - precision: 0.9668 - recall: 0.8525 - val_loss: 0.4248 - val_binary_accuracy: 0.7656 - val_precision: 0.8400 - val_recall: 0.6562
Epoch 57/1000
127/128 [============================>.] - ETA: 0s - loss: 0.2490 - binary_accuracy: 0.8868 - precision: 0.9456 - recall: 0.8209
Epoch 57: val_binary_accuracy did not improve from 0.87500
    
Epoch 57: val_loss did not improve from 0.22489
    
Epoch 57: val_precision did not improve from 0.00000
    
Epoch 57: val_recall did not improve from 0.00000
128/128 [==============================] - 2s 18ms/step - loss: 0.2473 - binary_accuracy: 0.8877 - precision: 0.9461 - recall: 0.8223 - val_loss: 0.2993 - val_binary_accuracy: 0.8516 - val_precision: 0.9245 - val_recall: 0.7656

What am I missing?

EDIT-1: I've noticed that the best precision and recall tracked by the checkpoint callbacks start at inf and quickly reach 0, then never improve. Shouldn't they start at -inf and increase, the way binary accuracy does?

Do I need to set an initial value when instantiating the callbacks?



Solution 1:[1]

The issue gets fixed by adding the mode='max' parameter. The default is mode='auto', and for monitored metrics whose names don't contain 'acc' (and don't start with 'fmeasure'), Keras falls back to treating them as metrics to minimize: the best value starts at inf, drops to the first observed value, and then a higher precision or recall never counts as an improvement. Forcing mode='max' makes the callback track these metrics in the right direction:

valPrecisionCheckPointCallBack = tf.keras.callbacks.ModelCheckpoint(checkPointPath,
                                                                    mode='max', 
                                                                    monitor='val_precision',
                                                                    save_freq='epoch',
                                                                    save_weights_only=False,
                                                                    save_best_only=True,
                                                                    verbose=1)
            
valRecallCheckPointCallBack = tf.keras.callbacks.ModelCheckpoint(checkPointPath,
                                                                 mode='max',
                                                                 monitor='val_recall',
                                                                 save_freq='epoch',
                                                                 save_weights_only=False,
                                                                 save_best_only=True,
                                                                 verbose=1)
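To see why mode='auto' misbehaves here, the inference logic can be sketched roughly as below. This is a paraphrase of how tf.keras.callbacks.ModelCheckpoint picks its comparison operator in TF 2.x, not the actual Keras code, so treat it as an approximation:

```python
import numpy as np

def infer_monitor_op(monitor, mode='auto'):
    """Return (comparison_op, initial_best) the way ModelCheckpoint does.

    Sketch of the TF 2.x behavior: with mode='auto', only metric names
    containing 'acc' or starting with 'fmeasure' are maximized;
    everything else (including 'val_precision' and 'val_recall')
    is minimized, with best initialized to +inf.
    """
    if mode == 'min':
        return np.less, np.inf
    if mode == 'max':
        return np.greater, -np.inf
    # mode == 'auto' fallback
    if 'acc' in monitor or monitor.startswith('fmeasure'):
        return np.greater, -np.inf
    return np.less, np.inf

# 'val_binary_accuracy' is correctly maximized ...
op, best = infer_monitor_op('val_binary_accuracy')
print(op is np.greater, best)   # True -inf

# ... but 'val_precision' falls through to the minimize branch,
# which is why its "best" starts at inf and bottoms out at 0.
op, best = infer_monitor_op('val_precision')
print(op is np.less, best)      # True inf
```

This also explains the observation in EDIT-1: the inf starting point is the initial best for a minimized metric, so no initial value needs to be supplied; passing mode='max' is the whole fix.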

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 soumeng78