How to iterate over multiple datasets in TensorFlow 2

I use TensorFlow 2.2.0. In my data pipeline, I use multiple datasets to train a neural net. Something like:

# these are all tf.data.Dataset objects:
paired_data = get_dataset(id=0, repeat=False, shuffle=True)
unpaired_images = get_dataset(id=1, repeat=True, shuffle=True)
unpaired_masks = get_dataset(id=2, repeat=True, shuffle=True)
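
get_dataset is a project-specific helper; a simplified hypothetical sketch of what it does is below (load_tensors, the buffer size, and the batch size are all placeholders, not the real code):

import tensorflow as tf

def get_dataset(id, repeat, shuffle):
    # hypothetical stand-in for the real helper: load_tensors is an
    # assumed loader returning the tensors for data source `id`
    ds = tf.data.Dataset.from_tensor_slices(load_tensors(id))
    if shuffle:
        ds = ds.shuffle(buffer_size=1024)
    if repeat:
        ds = ds.repeat()  # infinite stream: iteration never runs out
    return ds.batch(8)

The repeat flag is the important detail: paired_data (repeat=False) is finite and defines one epoch, while the two unpaired datasets cycle forever.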

In the training loop, I want to iterate over paired_data to define one epoch. But I also want to iterate over unpaired_images and unpaired_masks to optimize other objectives (classic semi-supervised learning for semantic segmentation, with a mask discriminator).

In order to do this, my current code looks like:

def train_one_epoch(self, writer, step, paired_data, unpaired_images, unpaired_masks):

    # these two calls raise the RuntimeError below when this function
    # is traced as a tf.function:
    unpaired_images = unpaired_images.as_numpy_iterator()
    unpaired_masks = unpaired_masks.as_numpy_iterator()

    for images, labels in paired_data:

        with tf.GradientTape() as sup_tape, \
                tf.GradientTape() as gen_tape, \
                tf.GradientTape() as disc_tape:

            # paired data (supervised cost):
            predictions = self.segmentor(images, training=True)
            sup_loss = weighted_cross_entropy(predictions, labels)

            # unpaired data (adversarial cost):
            pred_real = self.discriminator(next(unpaired_masks), training=True)
            pred_fake = self.discriminator(
                self.segmentor(next(unpaired_images), training=True), training=True)
            gen_loss = generator_loss(pred_fake)
            disc_loss = discriminator_loss(pred_real, pred_fake)

        gradients = sup_tape.gradient(sup_loss, self.segmentor.trainable_variables)
        generator_optimizer.apply_gradients(zip(gradients, self.segmentor.trainable_variables))

        gradients = gen_tape.gradient(gen_loss, self.segmentor.trainable_variables)
        generator_optimizer.apply_gradients(zip(gradients, self.segmentor.trainable_variables))

        gradients = disc_tape.gradient(disc_loss, self.discriminator.trainable_variables)
        discriminator_optimizer.apply_gradients(zip(gradients, self.discriminator.trainable_variables))

However, this results in the error:

main.py:275 train_one_epoch  *
        unpaired_images = unpaired_images.as_numpy_iterator()
    /home/venvs/conda/miniconda3/envs/tf-gpu/lib/python3.8/site-packages/tensorflow/python/data/ops/dataset_ops.py:476 as_numpy_iterator  **
        raise RuntimeError("as_numpy_iterator() is not supported while tracing "

    RuntimeError: as_numpy_iterator() is not supported while tracing functions

Any idea what is wrong with this? Is this the correct way to optimize over multiple losses/datasets in TensorFlow 2?


I've added my current solution below. Any suggestion for a more optimized approach is more than welcome! :)



Solution 1:

My current solution:

def train_one_epoch(self, writer, step, paired_data, unpaired_images, unpaired_masks):

    # create a new dataset zipping the three original dataset objects
    dataset = tf.data.Dataset.zip((paired_data, unpaired_images, unpaired_masks))

    for (images, labels), unpaired_image_batch, unpaired_mask_batch in dataset:
        # go ahead and train:
        with tf.GradientTape() as tape:
            # [...]
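
This works because tf.data.Dataset.zip stops as soon as its shortest input is exhausted. unpaired_images and unpaired_masks were built with repeat=True, so they are infinite, and the finite paired_data alone determines the epoch length. A minimal sketch of the semantics:

import tensorflow as tf

paired = tf.data.Dataset.range(4)             # finite: defines the epoch
unpaired = tf.data.Dataset.range(2).repeat()  # infinite: cycles 0, 1, 0, 1, ...

for p, u in tf.data.Dataset.zip((paired, unpaired)):
    print(p.numpy(), u.numpy())
# prints exactly 4 pairs -- one step per element of the finite dataset

As for the original error: as_numpy_iterator() is an eager-only convenience, so it cannot run while train_one_epoch is being traced as a tf.function. If explicit iterators are preferred over zipping, a sketch of an alternative (untested against the asker's setup) is to create plain tf.data iterators eagerly, outside the traced function, and only call next() inside it:

# created eagerly, outside any tf.function:
unpaired_image_it = iter(unpaired_images)
unpaired_mask_it = iter(unpaired_masks)

@tf.function
def train_step(images, labels):
    real_masks = next(unpaired_mask_it)    # iterating a captured tf.data
    fake_inputs = next(unpaired_image_it)  # iterator is allowed while tracing
    # [...]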

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
