'How to train LSTM model with variable-length sequence input

I'm trying to train LSTM model in Keras using data of variable timestep, for example, the data looks like:

<tf.RaggedTensor [[[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
 [[1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
  [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
 [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]], ...,
 [[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
 [[1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0],
  [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0],
  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0],
  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0],
  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
 [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]]>

and its corresponding label:

 <tf.RaggedTensor [[6, 6], [7, 7], [8], ..., [6], [11, 11, 11, 11, 11], [24, 24, 24, 24, 24]]>

Each input data have 13 features, so for each time step, the model receives a 1 x 13 vector. I wonder if it is possible to do so? I don't mind doing this on pytorch either.

Solution 1:^[1]

I try to align them with no reshape layer.

However, my input for each time step in the LSTM layer is a vector of dimension 13. And each sample has variable-length of these vectors, which means the time step is not constant for each sample. Can you show me a code example of how to train such model? – TurquoiseJ

First of all, the concept of windows length and time steps is they take the same amount of the input with a higher number of length and time. We assume the input to extract features can be divide by multiple times of windows travels along with axis, please see the attached for idea.

[Codes]:

batched_features = tf.constant( [   [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], ], shape=( 2, 1, 13 ) )
batched_labels = tf.constant( [[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]], shape=( 2, 13 ) )

dataset = tf.data.Dataset.from_tensor_slices((batched_features, batched_labels))
dataset = dataset.batch(10)
batched_features = dataset

[Sample]:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 bidirectional (Bidirectiona  (None, 1, 64)            11776
 l)

 bidirectional_1 (Bidirectio  (None, 64)               24832
 nal)

 dense (Dense)               (None, 13)                845

=================================================================
Total params: 37,453
Trainable params: 37,453
Non-trainable params: 0
_________________________________________________________________
<BatchDataset element_spec=(TensorSpec(shape=(None, 1, 13), dtype=tf.int32, name=None), TensorSpec(shape=(None, 13), dtype=tf.int32, name=None))>
Epoch 1/100
2022-03-28 05:19:04.116345: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8100
1/1 [==============================] - 8s 8s/step - loss: 0.0000e+00 - accuracy: 1.0000 - val_loss: 0.0000e+00 - val_accuracy: 1.0000
Epoch 2/100
1/1 [==============================] - 0s 38ms/step - loss: 0.0000e+00 - accuracy: 1.0000 - val_loss: 0.0000e+00 - val_accuracy: 1.0000

Assume each windows consume about 13 level of the input :

batched_features = tf.constant( [   [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], ], shape=( 2, 1, 13 ) )
batched_labels = tf.constant( [[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]], shape=( 2, 13 ) )

Adding more windows is easy by

batched_features = tf.constant( [   [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], ], shape=( 3, 1, 13 ) )
batched_labels = tf.constant( [[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]], shape=( 3, 13 ) )

dataset = tf.data.Dataset.from_tensor_slices((batched_features, batched_labels))
dataset = dataset.batch(10)
batched_features = dataset

At least you tell me what is the purpose they can use reverse windows to have certain results. ( Apmplitues frequency ) The results will look like these for each windows :

[ Output ] : 2 and 3 Windows

# Sequence types with timestep #1:
# <BatchDataset element_spec=(TensorSpec(shape=(None, 1, 13), dtype=tf.int32, name=None), TensorSpec(shape=(None, 13), dtype=tf.int32, name=None))>

# Sequence types with timestep #2:
# <BatchDataset element_spec=(TensorSpec(shape=(None, 1, 13), dtype=tf.int32, name=None), TensorSpec(shape=(None, 13), dtype=tf.int32, name=None))>

[ Result ]: