tf.data.Dataset.map(map_func) with Eager Mode

I am using TF 1.8 with eager mode enabled.

I cannot print the example inside mapfunc. When I run tf.executing_eagerly() from within mapfunc, I get "False".

import os
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)

tfe = tf.contrib.eager
tf.enable_eager_execution()
x = tf.random_uniform([16,10], -10, 0, tf.int64)
print(x)
DS = tf.data.Dataset.from_tensor_slices((x))


def mapfunc(ex, con):
    import pdb; pdb.set_trace()
    new_ex = ex + con
    print(new_ex) 
    return new_ex

DS = DS.map(lambda x: mapfunc(x, [7]))
DS = DS.make_one_shot_iterator()

print(DS.next())

print(new_ex) outputs:

Tensor("add:0", shape=(10,), dtype=int64)

Outside mapfunc, it works fine. But inside it, the passed example has neither a value nor a .numpy() attribute.



Solution 1:[1]

The tf.data transformations actually execute as a graph, so the body of the map function itself isn't executed eagerly. See #14732 for some more discussion on this.

If you really need eager execution for the map function, you could use tf.contrib.eager.py_func, so something like:

DS = DS.map(lambda x: tf.contrib.eager.py_func(
  mapfunc,
  [x, tf.constant(7, dtype=tf.int64)], tf.int64))
# In TF 1.9+, the next line can be print(next(iter(DS)))
print(DS.make_one_shot_iterator().next())

Hope that helps.

Note that by adding a py_func to the dataset, the single-threaded Python interpreter will be in the loop for every element produced.

Solution 2:[2]

Anything within map runs as a graph, no matter what mode is used outside. See https://www.tensorflow.org/api_docs/python/tf/data/Dataset#map
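A minimal sketch of that behaviour, assuming TensorFlow 2.x (where eager execution is on by default). The names record_mode and modes_seen are made up for illustration and are not part of the original question:

```python
import tensorflow as tf  # assumption: TensorFlow 2.x

modes_seen = []  # filled in while the map function is being traced

def record_mode(x):
    # This body runs during graph tracing, not once per element at run time.
    modes_seen.append(tf.executing_eagerly())
    return x

outside = tf.executing_eagerly()  # checked outside any map function

ds = tf.data.Dataset.from_tensor_slices(tf.range(4)).map(record_mode)
values = [int(v) for v in ds]  # iterating the dataset itself is eager

print("outside map:", outside)
print("inside map:", modes_seen)
print("values:", values)
```

The elements that come out of the dataset are ordinary eager tensors; only the body of the map function sees symbolic ones.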

As in the page, there are 3 options:

  1. Rely on AutoGraph to convert Python code into an equivalent graph computation. The downside of this approach is that AutoGraph can convert some but not all Python code.
  2. Use tf.py_function, which allows you to write arbitrary Python code but will generally result in worse performance than option 1.
  3. Use tf.numpy_function, which also allows you to write arbitrary Python code. Note that tf.py_function accepts tf.Tensor whereas tf.numpy_function accepts numpy arrays and returns only numpy arrays.
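If printing is all that is needed, option 1 may already be enough: the function stays in graph mode and tf.print takes the place of Python's print. A rough sketch, assuming TensorFlow 2.x; add_and_log is a hypothetical stand-in for mapfunc from the question, and the input values are made up:

```python
import tensorflow as tf  # assumption: TensorFlow 2.x

def add_and_log(ex, con):
    new_ex = ex + con
    # tf.print is a graph op, so it runs for every element and shows the
    # actual values (written to stderr by default).
    tf.print("new_ex:", new_ex)
    return new_ex

x = tf.constant([[1, 2], [3, 4]], dtype=tf.int64)
ds = tf.data.Dataset.from_tensor_slices(x)
ds = ds.map(lambda row: add_and_log(row, tf.constant(7, dtype=tf.int64)))

rows = [row.numpy().tolist() for row in ds]
```

This keeps the whole pipeline in graph mode, so the Python interpreter is not involved per element, but pdb and .numpy() are still unavailable inside the function.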

With tf.py_function() your line will become:

DS = DS.map(lambda y: tf.py_function(
                          (lambda x: mapfunc(x, [7])),
                          inp=[y], Tout=tf.int64
                      ))

The same applies to tf.map_fn() and tf.vectorized_map().
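For comparison, option 3 (tf.numpy_function) might look roughly like the sketch below, again assuming TensorFlow 2.x. np_mapfunc, wrapped and seen_types are illustrative names, and the input values are made up:

```python
import numpy as np
import tensorflow as tf  # assumption: TensorFlow 2.x

seen_types = []  # records what kind of object the Python function receives

def np_mapfunc(ex):
    # Inside tf.numpy_function the argument is a plain numpy array,
    # so ordinary print() and numpy operations work.
    seen_types.append(type(ex).__name__)
    new_ex = ex + np.int64(7)
    print(new_ex)
    return new_ex

def wrapped(row):
    out = tf.numpy_function(np_mapfunc, inp=[row], Tout=tf.int64)
    # The static shape is unknown after numpy_function; restore it by hand.
    out.set_shape(row.shape)
    return out

x = tf.constant([[1, 2], [3, 4]], dtype=tf.int64)
ds = tf.data.Dataset.from_tensor_slices(x).map(wrapped)

rows = [row.numpy().tolist() for row in ds]
```

The set_shape call is optional for this small case, but later transformations that need a known shape (batching into a model, for example) tend to complain without it.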

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ash
Solution 2 Crispin N