How to iterate over multiple datasets in TensorFlow 2
I use TensorFlow 2.2.0. In my data pipeline, I use multiple datasets to train a neural net. Something like:
# these are all tf.data.Dataset objects:
paired_data = get_dataset(id=0, repeat=False, shuffle=True)
unpaired_images = get_dataset(id=1, repeat=True, shuffle=True)
unpaired_masks = get_dataset(id=2, repeat=True, shuffle=True)
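(get_dataset is not shown in the question; a minimal hypothetical version, consistent with the id/repeat/shuffle flags above and using random tensors as stand-ins for the real images and masks, might look like this:)

import tensorflow as tf

def get_dataset(id, repeat, shuffle, buffer_size=1024):
    # Hypothetical helper: random tensors stand in for real data.
    if id == 0:    # paired images + masks -> (image, label) tuples
        data = (tf.random.uniform((16, 64, 64, 3)),
                tf.random.uniform((16, 64, 64, 1)))
    elif id == 1:  # unpaired images
        data = tf.random.uniform((16, 64, 64, 3))
    else:          # unpaired masks
        data = tf.random.uniform((16, 64, 64, 1))
    ds = tf.data.Dataset.from_tensor_slices(data)
    if shuffle:
        ds = ds.shuffle(buffer_size)
    if repeat:
        ds = ds.repeat()  # repeat indefinitely
    return ds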
In the training loop, I want to iterate over paired_data to define one epoch. But I also want to iterate over unpaired_images and unpaired_masks to optimize other objectives (classic semi-supervised learning for semantic segmentation, with a mask discriminator).
In order to do this, my current code looks like:
def train_one_epoch(self, writer, step, paired_data, unpaired_images, unpaired_masks):
    unpaired_images = unpaired_images.as_numpy_iterator()
    unpaired_masks = unpaired_masks.as_numpy_iterator()

    for images, labels in paired_data:
        with tf.GradientTape() as sup_tape, \
             tf.GradientTape() as gen_tape, \
             tf.GradientTape() as disc_tape:
            # paired data (supervised cost):
            predictions = self.segmentor(images, training=True)
            sup_loss = weighted_cross_entropy(predictions, labels)

            # unpaired data (adversarial cost):
            pred_real = self.discriminator(next(unpaired_masks), training=True)
            pred_fake = self.discriminator(
                self.segmentor(next(unpaired_images), training=True), training=True)
            gen_loss = generator_loss(pred_fake)
            disc_loss = discriminator_loss(pred_real, pred_fake)

        gradients = sup_tape.gradient(sup_loss, self.segmentor.trainable_variables)
        generator_optimizer.apply_gradients(zip(gradients, self.segmentor.trainable_variables))
        gradients = gen_tape.gradient(gen_loss, self.segmentor.trainable_variables)
        generator_optimizer.apply_gradients(zip(gradients, self.segmentor.trainable_variables))
        gradients = disc_tape.gradient(disc_loss, self.discriminator.trainable_variables)
        discriminator_optimizer.apply_gradients(zip(gradients, self.discriminator.trainable_variables))
However, this results in the error:
main.py:275 train_one_epoch *
unpaired_images = unpaired_images.as_numpy_iterator()
/home/venvs/conda/miniconda3/envs/tf-gpu/lib/python3.8/site-packages/tensorflow/python/data/ops/dataset_ops.py:476 as_numpy_iterator **
raise RuntimeError("as_numpy_iterator() is not supported while tracing "
RuntimeError: as_numpy_iterator() is not supported while tracing functions
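(For context: as_numpy_iterator() requires eager execution, so this error appears whenever the call happens inside a tf.function trace, as the * markers in the traceback suggest. A minimal sketch that reproduces the same error, independent of the training code:)

import tensorflow as tf

ds = tf.data.Dataset.range(3)

@tf.function
def step():
    # Raises RuntimeError: as_numpy_iterator() is not supported
    # while tracing functions
    it = ds.as_numpy_iterator()
    return next(it)

step()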
Any idea what is wrong with this? Is this the correct way of optimizing over multiple losses/datasets in TensorFlow 2?
I have added my current solution to the problem below. Any suggestion for more optimized ways is more than welcome! :)
Solution 1:
My current solution: zip the three datasets into one, so iteration stays inside the tf.data pipeline instead of relying on as_numpy_iterator(), which only works in eager execution and therefore fails inside a traced function:
def train_one_epoch(self, writer, step, paired_data, unpaired_images, unpaired_masks):
    # create a new dataset zipping the three original dataset objects
    dataset = tf.data.Dataset.zip((paired_data, unpaired_images, unpaired_masks))

    for (images, labels), unpaired_images, unpaired_masks in dataset:
        # go ahead and train:
        with tf.GradientTape() as tape:
            # [...]
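For reference, here is a self-contained sketch of this zip-based pattern, with tiny random tensors standing in for the real datasets (shapes and contents are made up for illustration). Because Dataset.zip stops at the shortest input and the two unpaired datasets repeat indefinitely, one pass over the zipped dataset is exactly one epoch of paired_data:

import tensorflow as tf

# Toy stand-ins for the real datasets; shapes are illustrative only.
paired_data = tf.data.Dataset.from_tensor_slices(
    (tf.random.uniform((8, 4)), tf.random.uniform((8, 1)))).shuffle(8)  # finite: defines the epoch
unpaired_images = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform((3, 4))).shuffle(3).repeat()  # infinite: cycles as needed
unpaired_masks = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform((5, 1))).shuffle(5).repeat()  # infinite: cycles as needed

# zip stops when the shortest (paired) dataset is exhausted
dataset = tf.data.Dataset.zip((paired_data, unpaired_images, unpaired_masks))

for (images, labels), unp_image, unp_mask in dataset:
    # each step yields one paired example plus one example from each unpaired stream
    print(images.shape, labels.shape, unp_image.shape, unp_mask.shape)

Note that if the unpaired datasets were not created with repeat=True, zip would truncate the epoch to the shortest of the three; the repeat flag on the unpaired streams is what lets paired_data define the epoch length.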
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow