Matrix multiplication in a TensorFlow model

I want to use matrix multiplication inside a TF model. My model is a NN with input shape (1, 9), and I want the product of each input vector with itself (i.e. the matrix product of the transposed input vector with itself, which has shape (9, 9)).

Code example:

import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(1, 9))
# Problem: with no `perm` argument, tf.transpose reverses every axis, including batch
outputs = tf.keras.layers.Dense(1, activation='linear')(tf.transpose(inputs) @ inputs)

model = tf.keras.Model(inputs, outputs)

adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)

model.compile(optimizer=adam, loss='mse', metrics=['mae'])

But I have a problem with the shape of the result. With the code above I get the following architecture:

[Model summary screenshot: the transpose/matmul output has shape (9, 1, 9) rather than (None, 9, 9).]

If I understand correctly, the first dimension (None) of the input layer corresponds to the batch size. When I use the transpose operation, it applies to all dimensions of the tensor, so after the transpose and multiplication I get a result of shape (9, 1, 9). But I think that is not correct: I want the product of the transposed input vector with itself for every vector in the batch, i.e. the correct shape of the result is (None, 9, 9).
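
In eager mode the default reversal is easy to check (a minimal sketch, using a dummy batch of zeros):

import tensorflow as tf

x = tf.zeros((2, 1, 9))          # (batch, 1, 9)
print(tf.transpose(x).shape)     # (9, 1, 2): every axis reversed, batch included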

Computing this product outside the model and feeding it in as an input is not suitable, because I want the model to have both the original input vector and the multiplication result available for further operations (the architecture above is not complete; it is only an example).

How can I get the correct result? What is the correct way to multiply matrices and vectors in TF when the operation should be applied to every vector (matrix) in the batch?
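
For clarity, this is the behaviour I want, sketched in eager mode outside the model (the values are random and just for illustration):

import tensorflow as tf

x = tf.random.normal((4, 1, 9))  # a batch of 4 row vectors

# Keep the batch axis fixed and swap only the last two dimensions,
# giving one (9, 1) x (1, 9) outer product per sample
desired = tf.transpose(x, perm=[0, 2, 1]) @ x
print(desired.shape)  # (4, 9, 9), i.e. (None, 9, 9) inside a model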



Solution 1:[1]

Try tf.linalg.matmul, since it will respect the batch dimension:

import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(1, 9))
# transpose_a transposes only the two innermost dimensions, so the batch
# axis is preserved and each sample is multiplied by itself
outputs = tf.keras.layers.Dense(1, activation='linear')(
    tf.linalg.matmul(inputs, inputs, transpose_a=True))

model = tf.keras.Model(inputs, outputs)

adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)

model.compile(optimizer=adam, loss='mse', metrics=['mae'])
print(model.summary())
Model: "model_3"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_5 (InputLayer)           [(None, 1, 9)]       0           []                               
                                                                                                  
 tf.linalg.matmul_3 (TFOpLambda  (None, 9, 9)        0           ['input_5[0][0]',                
 )                                                                'input_5[0][0]']                
                                                                                                  
 dense_4 (Dense)                (None, 9, 1)         10          ['tf.linalg.matmul_3[0][0]']     
                                                                                                  
==================================================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
__________________________________________________________________________________________________
None
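
As a quick sanity check (a sketch, not part of the original answer), you can confirm in eager mode that the matmul respects the batch dimension and computes one outer product per sample:

import tensorflow as tf

x = tf.random.normal((4, 1, 9))
batched = tf.linalg.matmul(x, x, transpose_a=True)   # (4, 9, 9)

# Compare sample 0 against an explicit outer product of that row vector
manual = tf.transpose(x[0]) @ x[0]                   # (9, 9)
print(batched.shape)                                 # (4, 9, 9)
print(tf.reduce_max(tf.abs(batched[0] - manual)).numpy())  # ~0.0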

Solution 2:[2]

Reading your question: you want to do matrix multiplication inside the NN, where plain number multiplication is easy. You can also treat it as a sequence-to-sequence task, of which there are many examples (word-sentence inputs with a target multiplication dictionary). You do not need to specify the output shape; a sequence output is still an answer.

(1): Using tf.where or tf.greater

input:

import numpy as np
import tensorflow as tf

array_1 = [0, 1, 1, 0]
array_2 = np.concatenate((array_1, array_1), axis=0)
temp = [0, 1, 1, 0]

# `temp == [0, 1, 1, 0]` is a plain Python list comparison (a single bool),
# so tf.where broadcasts the [True] condition over all of array_2
print(np.asarray(tf.where([temp == [0, 1, 1, 0]], array_2, 0)))

output:

[0 1 1 0 0 1 1 0]
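
Note that the condition above collapses to a single Python True, which tf.where broadcasts over all of array_2, so every element is selected. An elementwise comparison (a sketch, not in the original answer; the template pattern is made up) looks like this:

import numpy as np
import tensorflow as tf

array_2 = np.concatenate(([0, 1, 1, 0], [0, 1, 1, 0]), axis=0)
template = np.tile([0, 1, 0, 0], 2)  # hypothetical pattern to match against

# Keep elements of array_2 equal to the template, zero out the rest
print(np.asarray(tf.where(tf.equal(array_2, template), array_2, 0)))
# [0 1 0 0 0 1 0 0]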

(2): Using a tfa.seq2seq.BasicDecoder and summing

input:

import tensorflow as tf

# `input_word`, `decoder`, `start_tokens`, `end_token` and `initial_state`
# are assumed to be defined elsewhere; this is a fragment of a larger script
index = 1
next_char = tf.strings.substr(
    input_word, index, len(input_word[0].numpy()) - index, unit="UTF8_CHAR", name=None
)
output, state, lengths = decoder(
    next_char, start_tokens=start_tokens, end_token=end_token, initial_state=initial_state)

print('next_char[0].numpy(): ' + str(next_char[0].numpy()))

output:

input_word[0].numpy() length: tf.Tensor([b'Gl\xc3\xbccklicherweise '], shape=(1,), dtype=string)
input_word[0].numpy() length: 18
next_char[0].numpy(): b'Gl\xc3\xbccklicherweise '
next_char[0].numpy(): b'l\xc3\xbccklicherweise '
next_char[0].numpy(): b'\xc3\xbccklicherweise '
next_char[0].numpy(): b'cklicherweise '
next_char[0].numpy(): b'klicherweise '
next_char[0].numpy(): b'licherweise '

sum = G + L + L + ...
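
As a self-contained illustration of the character-stepping part (a sketch using only tf.strings.substr; the decoder context above is omitted):

import tensorflow as tf

input_word = tf.constant(['Glücklicherweise '])
length = tf.strings.length(input_word, unit="UTF8_CHAR")[0].numpy()  # 17 characters

# Drop one leading character per step, as in the loop above
for index in range(6):
    next_char = tf.strings.substr(input_word, index, length - index, unit="UTF8_CHAR")
    print(next_char[0].numpy())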

(3): Model multiplication: you use a dense input, and the output is a sequence of the desired targets, as in the picture (not reproduced here).

[The original answer linked further examples here: "Simple calculation", "Sequences decoding", and "TF Fn".]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: AloneTogether
Solution 2: Martijn Pieters