Negative gradients when calculating GradCAM heatmap
I have a segmentation network trained for 2 classes, and its predictions look accurate. When I use Grad-CAM for the heatmap, the last convolution layer gives good results for both classes, but for the second-to-last convolution layer the heatmap fails for one of the classes (the other class's heatmap works fine).
**Last 5 layers**
convolution_layer(filters: 8, kernel: 3x3)
convolution_transpose_layer(filters: 2, kernel: 2x2)
convolution_layer(filters: 2, kernel: 3x3)
convolution_layer(filters: 10, kernel: 1x1)
activation_layer(softmax)
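For concreteness, those five layers would look roughly like this in Keras (a sketch only; padding, strides, and everything before these layers are assumptions not given in the question):

import tensorflow as tf

# Rough reconstruction of the five layers listed above (arguments are guesses).
tail = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, (3, 3), padding="same"),
    tf.keras.layers.Conv2DTranspose(2, (2, 2), strides=2),
    tf.keras.layers.Conv2D(2, (3, 3), padding="same"),
    tf.keras.layers.Conv2D(10, (1, 1)),
    tf.keras.layers.Activation("softmax"),
])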
The heatmap comes out empty because all the pooled gradients are negative (they are the mean over the gradients w.r.t. the conv layer's output, and every gradient is negative). As a result, pooled_grads * convolution_output is negative everywhere, and the final ReLU maps it all to zero.
What does it mean for GradCAM to be all negative?
Why is it that all channels in the convolution lead to a "negative" contribution to the true output class?
I am following this paper for heatmaps on segmentation models: https://arxiv.org/pdf/2002.11434.pdf
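For context, here is a minimal sketch of the Grad-CAM computation being described (the function name, layer name handling, and the pixel-summed segmentation score are illustrative assumptions, not the exact code from the question):

import tensorflow as tf

def grad_cam(model, image, layer_name, class_index):
    # Model that returns both the target conv feature map and the prediction.
    grad_model = tf.keras.Model(
        model.inputs, [model.get_layer(layer_name).output, model.output]
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[tf.newaxis, ...])
        # For segmentation, one common choice is to sum the class map
        # over all pixels to get a scalar score.
        score = tf.reduce_sum(preds[..., class_index])
    grads = tape.gradient(score, conv_out)                 # d(score)/d(feature map)
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))   # one weight per channel
    heatmap = tf.reduce_sum(conv_out[0] * pooled_grads, axis=-1)
    # If every pooled gradient is negative, the weighted sum is <= 0
    # everywhere, and this ReLU zeroes the entire map -- the failure above.
    return tf.nn.relu(heatmap)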
Solution 1:
It took me a while to make a GIF to explain this: the convolutions were running over too much of the same input, and where there is no contrast information the outputs merge to similar values. Convolution with padding extends those similarities outward, so that is one cause; where there is contrast information, the layers can pick it up. Before answering, I went through your links and looked at the example outputs, and I think this is the heatmap you mean, just rendered in the color scheme you specified. (I removed those Conv layers and used only one layer to examine and speed up the output behavior; see the toy example below. The heatmap works and is not always negative when using convs; by "not negative" I mean the image is not all negative when you use images at their true dimensions.)
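To illustrate the low-contrast point with a toy example (not part of the original answer): a constant patch produces a uniform convolution response, so there is nothing for a heatmap to highlight.

import tensorflow as tf

# A zero-contrast (constant) patch: every 3x3 window sees the same values,
# so the convolution output is uniform in the interior; only the zero-padded
# border differs, which is how padding spreads the (dis)similarity outward.
patch = tf.ones((1, 8, 8, 1))
conv = tf.keras.layers.Conv2D(1, (3, 3), padding="same",
                              kernel_initializer="ones", use_bias=False)
print(tf.squeeze(conv(patch)).numpy())  # interior all 9.0, edges smaller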
[ Sample ]:
import os
from os.path import exists
import tensorflow as tf
import h5py  # required by Keras for reading/writing .h5 checkpoint files
import matplotlib.pyplot as plt
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Variables
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
database_buffer = "F:\\models\\buffer\\" + os.path.basename(__file__).split('.')[0] + "\\TF_DataSets_01.h5"
database_buffer_dir = os.path.dirname(database_buffer)
checkpoint_path = "F:\\models\\checkpoint\\" + os.path.basename(__file__).split('.')[0] + "\\TF_DataSets_01.h5"
checkpoint_dir = os.path.dirname(checkpoint_path)
loggings = "F:\\models\\checkpoint\\" + os.path.basename(__file__).split('.')[0] + "\\loggings.log"
if not exists(checkpoint_dir):
    os.makedirs(checkpoint_dir)
    print("Create directory: " + checkpoint_dir)
if not exists(database_buffer_dir):
    os.makedirs(database_buffer_dir)
    print("Create directory: " + database_buffer_dir)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
DataSet
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
# Use a single CIFAR-10 sample as a toy input/target pair.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = tf.constant(x_train[0], shape=(1, 32, 32, 3))
y_train = tf.constant(y_train[0], shape=(1, 1, 1, 1))
plt.imshow(tf.reshape(x_train, (32, 32, 3)).numpy())
plt.show()
plt.close()
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Model Initialize
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(32, 32, 3), name="Input_01"),
    tf.keras.layers.Conv2D(1, (1, 1), activation='relu', name="Conv2D_00"),
    # The layers below were removed to isolate the behavior of a single conv:
    # tf.keras.layers.Conv2D(8, (3, 3), activation='relu', name="Conv2D_01"),
    # tf.keras.layers.Conv2DTranspose(2, (2, 2), activation='relu', name="Conv2DTranspose_01"),
    # tf.keras.layers.Conv2D(2, (3, 3), activation='relu', name="Conv2D_02"),
    # tf.keras.layers.Conv2D(10, (1, 1), activation='relu', name="Conv2D_03"),
])
model.add(tf.keras.layers.Dense(1))
model.summary()
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Callback
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
class CustomCallback(tf.keras.callbacks.Callback):
    def on_train_end(self, logs=None):
        print(self.model.inputs)
        # Build a feature extractor that returns every layer's output.
        feature_extractor = tf.keras.Model(
            inputs=self.model.inputs,
            outputs=[layer.output for layer in self.model.layers],
        )
        # Visualize the first layer's 32x32x1 feature map.
        img = tf.keras.preprocessing.image.array_to_img(
            tf.reshape(feature_extractor(x_train)[0], (32, 32, 1)).numpy(),
            data_format=None,
            scale=True,
        )
        plt.imshow(img)
        plt.show()
        plt.close()
        input('Press Any Key!')

custom_callback = CustomCallback()
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Optimizer
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
optimizer = tf.keras.optimizers.Nadam( learning_rate=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, name='Nadam' )
optimizer = tf.keras.optimizers.Adam( learning_rate=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False, name='Adam' )
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Loss Fn
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
lossfn = tf.keras.losses.SparseCategoricalCrossentropy( from_logits=False, reduction=tf.keras.losses.Reduction.AUTO, name='sparse_categorical_crossentropy' )
lossfn = tf.keras.losses.MeanSquaredLogarithmicError(reduction=tf.keras.losses.Reduction.AUTO, name='mean_squared_logarithmic_error')
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Model Summary
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
model.compile(optimizer=optimizer, loss=lossfn, metrics=['accuracy'])
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: FileWriter
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
if exists(checkpoint_path) :
model.load_weights(checkpoint_path)
print("model load: " + checkpoint_path)
input("Press Any Key!")
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Training
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
history = model.fit(x_train, y_train, epochs=25, validation_data=(x_train, y_train), callbacks=[custom_callback])
input('...')
[ Output ]:
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow

| Solution | Source |
|---|---|
| Solution 1 | Martijn Pieters |