Matrix multiplication in TensorFlow model
I want to use matrix multiplication inside a TF model. My model is a NN with input shape (1, 9), and I want to multiply each input vector by itself, i.e. compute the product of the transposed input vector and the vector itself, so that the result has shape (9, 9).
Code example:
import tensorflow as tf
inputs = tf.keras.layers.Input(shape=(1,9))
outputs = tf.keras.layers.Dense(1, activation='linear')(tf.transpose(inputs) @ inputs)
model = tf.keras.Model(inputs, outputs)
adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=adam, loss='mse', metrics=['mae'])
But I have a problem with the shape of the result. With the code above I get the following architecture:
If I understand correctly, the first dimension (None) in the input layer corresponds to the batch size of the input data. When I use the transpose operation, it is applied to all dimensions of this shape, so after the transpose and multiplication I get a result with shape (9, 1, 9). I think this is not correct, because I want the product of the transposed input vector and itself for every vector in the batch, i.e. the result should have shape (None, 9, 9).
Passing this product to the model as an input (computing the multiplication outside the model) is not suitable, because I want the model to have both the original input vector and the result of the multiplication available for further operations (the architecture above is incomplete and serves only as an example).
How can I get the correct result? What is the correct way to multiply matrices and vectors in TF when the operation should be applied to every vector (matrix) in the batch?
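The transpose behaviour described here can be checked on a concrete tensor. The following is a minimal sketch, assuming a hypothetical batch of 4 vectors (the batch size and the names `x` and `x_t` are illustrative and not part of the original code); it only inspects shapes.

```python
import tensorflow as tf

# Hypothetical batch of 4 row vectors, matching the model's (None, 1, 9) input.
x = tf.zeros((4, 1, 9))

# With no `perm` argument, tf.transpose reverses every axis,
# so the batch axis is moved to the last position.
x_t = tf.transpose(x)

print(x.shape)    # (4, 1, 9)
print(x_t.shape)  # (9, 1, 4)
```

The batch axis ends up last instead of staying first, so a following matrix product no longer operates per example.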
Solution 1:[1]
Try tf.linalg.matmul, since it will respect the batch dimension:
import tensorflow as tf
inputs = tf.keras.layers.Input(shape=(1,9))
outputs = tf.keras.layers.Dense(1, activation='linear')(tf.linalg.matmul(inputs, inputs, transpose_a=True))
model = tf.keras.Model(inputs, outputs)
adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=adam, loss='mse', metrics=['mae'])
print(model.summary())
Model: "model_3"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to
==================================================================================================
 input_5 (InputLayer)           [(None, 1, 9)]       0           []
 tf.linalg.matmul_3 (TFOpLambda  (None, 9, 9)        0           ['input_5[0][0]',
 )                                                                'input_5[0][0]']
 dense_4 (Dense)                (None, 9, 1)         10          ['tf.linalg.matmul_3[0][0]']
==================================================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
__________________________________________________________________________________________________
None
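To see that the batch dimension is preserved, the product can also be checked on concrete values. The sketch below assumes a hypothetical batch of 4 random vectors (the batch size and the names `x`, `outer`, `outer_perm` and `outer_einsum` are illustrative, not from the question). It compares tf.linalg.matmul with transpose_a=True against two formulations that should give the same result: tf.transpose with an explicit perm, and tf.einsum.

```python
import numpy as np
import tensorflow as tf

# Hypothetical batch: 4 row vectors of length 9, i.e. shape (batch, 1, 9).
x = tf.random.normal((4, 1, 9))

# transpose_a=True swaps only the last two axes of the first argument,
# so each (1, 9) row becomes (9, 1) and the batch axis stays in place.
outer = tf.linalg.matmul(x, x, transpose_a=True)

# Two alternative formulations, shown for comparison.
outer_perm = tf.transpose(x, perm=[0, 2, 1]) @ x
outer_einsum = tf.einsum("bij,bik->bjk", x, x)

print(outer.shape)  # (4, 9, 9)

# Check the first example against a plain NumPy outer product.
v = x[0, 0].numpy()
np.testing.assert_allclose(outer[0].numpy(), np.outer(v, v), rtol=1e-5, atol=1e-6)
```

All three variants keep the leading batch axis in place, which matches the (None, 9, 9) output shape reported in the summary above.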
Solution 2:[2]
As I read your question, you want to do the matrix multiplication inside the NN. Multiplying numbers is easy. It can be treated as a sequence-to-sequence problem, for which there are many examples (word or sentence inputs with a target multiplication dictionary). You do not need to specify an output shape; a sequence output is still an answer.
( 1 ): Using tf.where or tf.greater
input:
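import numpy as np
import tensorflow as tf

array_1 = [ 0, 1, 1, 0 ]
array_2 = np.concatenate((array_1, array_1), axis = 0)
temp = [ 0, 1, 1, 0 ]
# temp == [0, 1, 1, 0] is a plain Python list comparison (True), so every element of array_2 is kept
print( np.asarray( tf.where([ temp == [0, 1, 1, 0] ], array_2, 0 ) ) )
input('...')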
output:
[0 1 1 0 0 1 1 0]
( 2 ): Using tfa.seq2seq.BasicDecoder sum
input:
index = 1
next_char = tf.strings.substr(
    input_word, index, len(input_word[0].numpy()) - index, unit="UTF8_CHAR", name=None
)
output, state, lengths = decoder(
    next_char, start_tokens=start_tokens, end_token=end_token, initial_state=initial_state)
print('next_char[0].numpy(): ' + str(next_char[0].numpy()))
output:
input_word[0].numpy() length: tf.Tensor([b'Gl\xc3\xbccklicherweise '], shape=(1,), dtype=string)
input_word[0].numpy() length: 18
next_char[0].numpy(): b'Gl\xc3\xbccklicherweise '
next_char[0].numpy(): b'l\xc3\xbccklicherweise '
next_char[0].numpy(): b'\xc3\xbccklicherweise '
next_char[0].numpy(): b'cklicherweise '
next_char[0].numpy(): b'klicherweise '
next_char[0].numpy(): b'licherweise '
sum = G + L + L + ...
( 3 ): Model multiplication: you use a dense input, and the output is a sequence of the desired target, as in the picture.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | AloneTogether |
Solution 2 | Martijn Pieters |