How to extract 'image' and 'label' out of a TensorFlow dataset?

I've loaded in my train and validation sets from CIFAR10 like so:

train = tfds.load('cifar10', split='train[:90%]', shuffle_files=True)
validation = tfds.load('cifar10', split='train[-10%:]', shuffle_files=True)

I've created the architecture for my CNN

model = ...

Now I'm trying to use model.fit() to train my model, but I don't know how to separate the 'image' and 'label' out of my dataset objects. train and validation look like this:

print(train) # same layout as the validation set
<_OptionsDataset shapes: {id: (), image: (32, 32, 3), label: ()}, types: {id: tf.string, image: tf.uint8, label: tf.int64}>

My naive approach would be this, but those _OptionsDataset objects are not subscriptable.

history = model.fit(train['image'], train['label'], epochs=100, batch_size=64, validation_data=(validation['image'], validation['label']), verbose=0)


Solution 1:[1]

We can do this as follows (shown here on MNIST; the same pattern applies to CIFAR10):

import tensorflow as tf
import tensorflow_datasets as tfds

def normalize(img, label):
  img = tf.cast(img, tf.float32) / 255.
  return (img, label)

ds = tfds.load('mnist', split='train', as_supervised=True)
ds = ds.shuffle(1024).batch(32).prefetch(tf.data.experimental.AUTOTUNE)
ds = ds.map(normalize)

for i in ds.take(1):
    print(i[0].shape, i[1].shape)
# (32, 28, 28, 1) (32,)
  • Use as_supervised=True to get each example back as an (image, label) tuple instead of a feature dictionary.
  • Use .map() to apply a preprocessing function, or even augmentation.
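For a dataset that was already loaded without as_supervised=True, as in the question, the same .map() mechanism can pull the two fields out of each feature dictionary. The following is a minimal sketch: a small random dataset stands in for CIFAR10 so that nothing has to be downloaded, and the to_tuple helper is a hypothetical name introduced only for illustration.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for the CIFAR10 feature dictionaries (assumption: 8 random examples).
features = {
    "id": np.array([f"train_{i}".encode() for i in range(8)]),
    "image": np.random.randint(0, 256, size=(8, 32, 32, 3), dtype=np.uint8),
    "label": np.random.randint(0, 10, size=(8,), dtype=np.int64),
}
dict_ds = tf.data.Dataset.from_tensor_slices(features)

def to_tuple(example):
    """Hypothetical helper: pick image and label out of one feature dict."""
    image = tf.cast(example["image"], tf.float32) / 255.0  # scale pixels to [0, 1]
    return image, example["label"]

# Each element is now an (image, label) tuple, which is the layout model.fit() expects.
tuple_ds = dict_ds.map(to_tuple).batch(4)

images, labels = next(iter(tuple_ds))
print(images.shape, labels.shape)
# (4, 32, 32, 3) (4,)
```

With a real tfds dataset, the same call would be train.map(to_tuple) on the object returned by tfds.load.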

Model

# declare input shape 
input = tf.keras.Input(shape=(28,28,1))

# Block 1
x = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")(input)

# Next, apply global max pooling.
gap = tf.keras.layers.GlobalMaxPooling2D()(x)

# Finally, we add a classification layer.
output = tf.keras.layers.Dense(10, activation='softmax')(gap)

# bind all
func_model = tf.keras.Model(input, output)

Compile and Run

print('\nFunctional API')
func_model.compile(
          metrics=['accuracy'],
          loss= 'sparse_categorical_crossentropy', # labels are integer (not one-hot)
          optimizer = tf.keras.optimizers.Adam()
          )

func_model.fit(ds)
# 1875/1875 [==============================] - 15s 7ms/step - loss: 2.1782 - accuracy: 0.2280
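To mirror the train/validation setup from the question, fit() also accepts a second batched dataset as validation_data. Below is a minimal sketch on random data; the make_ds helper, the dataset sizes, and the tiny model are assumptions chosen so the example runs quickly, not a recommended architecture.

```python
import numpy as np
import tensorflow as tf

def make_ds(n):
    """Hypothetical helper: n random CIFAR10-shaped (image, label) pairs, batched."""
    images = np.random.rand(n, 32, 32, 3).astype("float32")
    labels = np.random.randint(0, 10, size=(n,)).astype("int64")
    return tf.data.Dataset.from_tensor_slices((images, labels)).batch(16)

train_ds, val_ds = make_ds(64), make_ds(16)

# Tiny functional model, same structure as above but with a CIFAR10 input shape.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, strides=2, activation="relu")(inputs)
x = tf.keras.layers.GlobalMaxPooling2D()(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # integer labels, not one-hot
    metrics=["accuracy"],
)

# Batching is done on the datasets themselves, so batch_size is not passed to fit().
history = model.fit(train_ds, validation_data=val_ds, epochs=2, verbose=0)
print(sorted(history.history))
```

The returned history object then holds per-epoch training and validation metrics.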

Solution 2:[2]

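Keras can consume tf.data datasets directly, as long as each element is an (image, label) tuple (that is, the dataset was loaded with as_supervised=True) and the dataset is already batched. Batching is done on the dataset itself, so batch_size is not passed to fit(). So in your case you can just do

history = model.fit(train.batch(64), epochs=100, validation_data=validation.batch(64), verbose=0)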

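There is no need to split the labels from the images. But if you really want to, you can do the following:

labels = []
# assumes train was loaded with as_supervised=True, so each element is an (image, label) tuple
for image, label in tfds.as_numpy(train):
  labels.append(label)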
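If the dataset still yields feature dictionaries, as in the question, the fields can be read by key instead. This is a minimal sketch that collects both images and labels into NumPy arrays; it uses a small random dataset as a stand-in (an assumption, to avoid a download) and Dataset.as_numpy_iterator(), which plays the same role here as tfds.as_numpy.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for a dict-structured tfds dataset (assumption: 6 random examples).
ds = tf.data.Dataset.from_tensor_slices({
    "image": np.random.randint(0, 256, size=(6, 32, 32, 3), dtype=np.uint8),
    "label": np.arange(6, dtype=np.int64),
})

image_list, label_list = [], []
for example in ds.as_numpy_iterator():  # each example is a dict of NumPy values
    image_list.append(example["image"])
    label_list.append(example["label"])

images = np.stack(image_list)  # shape (6, 32, 32, 3)
labels = np.array(label_list)  # shape (6,)
print(images.shape, labels.shape)
# (6, 32, 32, 3) (6,)
```

Note that this loads every example into memory, which is fine for CIFAR10-sized data but may not be for larger datasets.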

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1
Solution 2    Brian L