ValueError when using ModelCheckpoint in Keras

I'm creating an ensemble of VGG19, DenseNet121, and EfficientNetB1.

The code is as follows:

IMAGE_SIZE = (224, 224, 3)

import tensorflow as tf

vgg19 = tf.keras.applications.vgg19.VGG19(
    input_shape=IMAGE_SIZE, weights='imagenet', include_top=False)
# Suffix the layer names so the three backbones don't collide,
# and freeze the pretrained weights.
for layer in vgg19.layers:
    layer._name = layer._name + '_19'
    layer.trainable = False

effnetb1 = tf.keras.applications.efficientnet.EfficientNetB1(
    include_top=False, weights='imagenet', input_shape=IMAGE_SIZE)
for layer in effnetb1.layers:
    layer._name = layer._name + '_B1'
    layer.trainable = False

densenet = tf.keras.applications.densenet.DenseNet121(
    include_top=False, weights="imagenet", input_shape=IMAGE_SIZE)
for layer in densenet.layers:
    layer._name = layer._name + '_Dense'
    layer.trainable = False


from tensorflow.keras.layers import Input, Flatten, Concatenate, Dense, Dropout

inp = Input(IMAGE_SIZE)
    
vgg19_x = Flatten()(vgg19(inp))
vgg19_x = Dense(256, activation='relu')(vgg19_x)

effnet_x = Flatten()(effnetb1(inp))
effnet_x = Dense(256, activation='relu')(effnet_x)

densenet_x = Flatten()(densenet(inp))
densenet_x = Dense(256, activation='relu')(densenet_x)

from tensorflow.keras.models import Model

x = Concatenate()([vgg19_x, effnet_x, densenet_x])
x = Dense(128, activation='relu')(x)
x = Dropout(0.30)(x)
x = Dense(64, activation='relu')(x)
out = Dense(2, activation='softmax')(x)

model = Model(inputs=inp, outputs=out)
model.compile(
  loss='categorical_crossentropy',
  optimizer=tf.keras.optimizers.Adam(
    learning_rate=0.0005,
    name="Adam"),
  metrics=['accuracy']
)
model.summary()

from tensorflow.keras.callbacks import ModelCheckpoint

checkpointer = ModelCheckpoint(
    filepath="/content/drive/MyDrive/ensemble/ensemble-weights.hdf5",
    verbose=1,
    save_best_only=True)

r = model.fit(
  training_set,
  validation_data=test_set,
  epochs=30,
  steps_per_epoch=len(training_set),
  validation_steps=len(test_set),
  callbacks=[checkpointer]
)

The code runs fine and training completes successfully when I'm not using the callback. But when I use ModelCheckpoint, I get the following error after the 1st epoch:

ValueError: The target structure is of type `<class 'keras.engine.keras_tensor.KerasTensor'>`
  KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='input_5'), name=...
However, the input structure is a sequence (<class 'list'>) of length 0.
  []
nest cannot guarantee that it is safe to map one to the other.

Can anyone tell me what's wrong here? Also, is it because I'm concatenating three models?

Your help will be appreciated. Thank you!



Solution 1:[1]

I also ran into this issue while trying to implement a nested model (which is what you end up with here once you build the concatenated model).

The issue seems to be that Keras cannot handle the inputs and outputs of nested models in newer TensorFlow versions (TF 2.0 and above). Depending on the version you are on, you might need to explicitly refer to the input/output of the nested model you are using. In TF 2.6, what seems to work is to define separate models for each part, i.e., the common layers added after concatenation should also be wrapped in a model of their own, like below (taken from here):

import tensorflow as tf
from tensorflow import keras

# Make a Grad-CAM heatmap following the Keras tutorial.
# First, a model that maps the input image to the activations of the
# last conv layer (`model` here is a nested model like the one above).
last_conv_layer = model.layers[-4].layers[-1]
last_conv_layer_model = keras.Model(model.layers[-4].inputs, last_conv_layer.output)

# Second, we create a model that maps the activations of the last conv
# layer to the final class predictions.
classifier_input = keras.Input(shape=last_conv_layer.output.shape[1:])
x = classifier_input
for layer in model.layers[-3:]:
    x = layer(x)
classifier_model = keras.Model(classifier_input, x)

# Preparing the image with the preprocessing layers
# (`prepared_image` is assumed to be the already-loaded input tensor).
preprocess_layers = keras.Model(model.inputs, model.layers[-5].output)
img_array = preprocess_layers(prepared_image)

# Then, we compute the gradient of the top predicted class for our input image
# with respect to the activations of the last conv layer.
with tf.GradientTape() as tape:
    # Compute activations of the last conv layer and make the tape watch them.
    last_conv_layer_output = last_conv_layer_model(img_array)
    tape.watch(last_conv_layer_output)
    # Compute class predictions.
    preds = classifier_model(last_conv_layer_output)
    top_pred_index = tf.argmax(preds[0])
    top_class_channel = preds[:, top_pred_index]

# This is the gradient of the top predicted class with regard to
# the output feature map of the last conv layer.
grads = tape.gradient(top_class_channel, last_conv_layer_output)
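
Applied to the model in the question, the same idea would mean wrapping the classification head that follows the Concatenate layer in its own keras.Model, so that every part of the ensemble has explicitly defined inputs and outputs. Below is a minimal sketch of that restructuring (hypothetical, reusing inp, vgg19_x, effnet_x, and densenet_x from the question's code; whether it resolves the checkpoint error will depend on your TensorFlow version):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Dense, Dropout, Concatenate

# Hypothetical restructuring: the post-concatenation head becomes its own
# sub-model with an explicit Input, instead of loose layers hanging off
# the three nested backbones.
concat = Concatenate()([vgg19_x, effnet_x, densenet_x])

head_input = Input(shape=(768,))  # 3 branches x 256 units each
h = Dense(128, activation='relu')(head_input)
h = Dropout(0.30)(h)
h = Dense(64, activation='relu')(h)
head_output = Dense(2, activation='softmax')(h)
head_model = keras.Model(head_input, head_output, name='ensemble_head')

out = head_model(concat)  # call the head sub-model like a layer
model = keras.Model(inputs=inp, outputs=out)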

You can also check the following GitHub issues (they are not directly related, but they deal with a similar problem): issue1, issue2, issue3.
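
Separately, a workaround that is often suggested for this class of checkpoint-time serialization error (not part of the answer above, offered only as a hedged aside) is to save weights only, so the nested model graph is never serialized:

from tensorflow.keras.callbacks import ModelCheckpoint

# Save only the weights; the architecture itself is never written out,
# which sidesteps errors raised while serializing nested models.
checkpointer = ModelCheckpoint(
    filepath="/content/drive/MyDrive/ensemble/ensemble-weights.hdf5",
    verbose=1,
    save_best_only=True,
    save_weights_only=True)

# To restore, rebuild the same architecture and then call:
# model.load_weights("/content/drive/MyDrive/ensemble/ensemble-weights.hdf5")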

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1: Anshuman Sabath