About 2D convolutions and how they produce a 1 channel image

While trying to understand 2D convolutions, I ran into the following image, which has me confused (link to the source):

If I understood correctly:

  • the blue shape is the input
  • the orange shape is one of the convolution filters
  • the green shape is the output

My question is: what calculations are performed to get a single value from 2 tensors with shape 3x3xD (where D is the depth)?

As far as I understand, convolving each channel separately would produce a 1x1xD vector, but I don't see how we get a single value from this vector. Is it just a sum over the D entries? Is that sum normalized in any way?
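To make the computation in question concrete, here is a minimal sketch in plain Python. It assumes the standard multi-channel convolution as implemented in common deep learning libraries, where the D per-channel results are simply added together with no normalization. The names `patch`, `kernel`, and `per_channel` and all numeric values are made up for illustration.

```python
D = 2  # depth (number of input channels); kept small for illustration

# patch[i][j][c]: one 3x3xD window of the input (the blue shape)
patch = [[[1.0, 2.0] for _ in range(3)] for _ in range(3)]
# kernel[i][j][c]: one 3x3xD convolution filter (the orange shape)
kernel = [[[0.5, -1.0] for _ in range(3)] for _ in range(3)]

# Per-channel step: multiply element-wise and sum over the 3x3 spatial window.
# This is the "1x1xD vector" from the question.
per_channel = [
    sum(patch[i][j][c] * kernel[i][j][c] for i in range(3) for j in range(3))
    for c in range(D)
]

# Channel step: a plain sum over the D entries, with no normalization.
# This is the single output value (one cell of the green shape).
value = sum(per_channel)

print(per_channel)  # [4.5, -18.0]
print(value)        # -13.5
```

Under this assumption the result equals the sum of all 3*3*D element-wise products taken at once. Libraries typically also add one optional scalar bias per filter to this value.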

Thank you in advance!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source