Single updates using tf.GradientTape with multiple outputs
I defined the following model, which has two distinct outputs:
import tensorflow as tf
from tensorflow import keras

input_layer = keras.layers.Input(shape=(1, 20), name="input_features")

# Shared layers
hidden_1 = keras.layers.Dense(32,
                              activation="relu",
                              name="LSTM_shared_l")(input_layer)

# Additional layers
hidden_2 = keras.layers.Dense(32,
                              activation="selu",
                              name="Forecasting_extra_layer_1")(input_layer)
hidden_3 = keras.layers.Dense(32,
                              activation="selu",
                              name="Forecasting_extra_layer_2")(hidden_2)

# Output layers
f_output = keras.layers.Dense(1, name="F_output")(hidden_1)
rl_output = keras.layers.Dense(32, name="RL_output")(hidden_3)

model = keras.Model(inputs=[input_layer], outputs=[f_output, rl_output])
model.summary()
I would like to train it with tf.GradientTape, performing one update step at a time. With only one output, I would use the following code:
with tf.GradientTape() as tape:
    predictions = model(inputs)
    # collapse the per-timestep predictions into one value per sample
    pred_values = tf.reduce_sum(predictions, axis=1, keepdims=True)
    loss = tf.reduce_mean(loss_fn(target_pred, pred_values))

grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
How can I extend it to the multiple outputs scenario?
Solution 1:[1]
There are multiple strategies; the simplest is to compute a loss for each output inside the GradientTape context and sum the results:
with tf.GradientTape() as tape:
    predictions_1, predictions_2 = model(inputs)
    predictions_1 = ...
    predictions_2 = ...  # any desired post-processing
    loss = tf.reduce_mean(loss_fn(target_1, predictions_1)) \
           + tf.reduce_mean(loss_fn(target_2, predictions_2))
and then you can safely descend:
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
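For completeness, here is a sketch of how the two pieces fit together as a full training step for the two-output model defined in the question. The choice of loss functions, the loss_weights tuple, the train_step helper, and the dummy data are assumptions for illustration only; substitute whatever losses and post-processing your task actually needs.

import tensorflow as tf
from tensorflow import keras

# Assumed setup: 'model' is the two-output model built in the question.
optimizer = keras.optimizers.Adam()
f_loss_fn = keras.losses.MeanSquaredError()    # assumed loss for F_output
rl_loss_fn = keras.losses.MeanSquaredError()   # assumed loss for RL_output
loss_weights = (1.0, 1.0)                      # optional per-output weighting

def train_step(inputs, f_target, rl_target):
    with tf.GradientTape() as tape:
        f_pred, rl_pred = model(inputs, training=True)
        # weighted sum of the two per-output losses
        loss = (loss_weights[0] * f_loss_fn(f_target, f_pred)
                + loss_weights[1] * rl_loss_fn(rl_target, rl_pred))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Single update with dummy data matching the model's shapes
inputs = tf.random.normal((8, 1, 20))     # (batch, 1, 20)
f_target = tf.random.normal((8, 1, 1))    # matches F_output
rl_target = tf.random.normal((8, 1, 32))  # matches RL_output
loss = train_step(inputs, f_target, rl_target)

If the two objectives live on very different scales, loss_weights is the natural place to balance them, and once the step works eagerly it can be wrapped in tf.function for speed.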
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | Captain Trojan