Difference between the return values of tensorflow.keras.preprocessing.img_to_array and tf.image.decode_jpeg
I have a task to load images and extract feature embeddings from them. I have working solutions that use two different ways of loading the image:
- tensorflow.keras.preprocessing.image.load_img to load the image itself, and then tensorflow.keras.preprocessing.image.img_to_array to turn it into an array.
- tf.io.read_file to read the raw file, and then tf.image.decode_jpeg to decode it and turn it into an array.
During coding, I found that these two ways of turning an image into an array return different results, and consequently the model returns different feature embeddings after inference. Since I use the embeddings to cluster similar images, this difference leads to different clusters.
Questions:
- Which method of loading images should be used, and why?
- Why are the results different, and how can they be made the same?
CODE:
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import img_to_array
from efficientnet.tfkeras import EfficientNetB0
import tensorflow as tf
import numpy as np
import urllib.request

model = EfficientNetB0(weights='imagenet', include_top=False, pooling="avg")
IMG_SIZE = [224, 224]

# Way 1: read the raw bytes, then decode and resize with tf.image.
def read_image1(path):
    raw = tf.io.read_file(path)
    image = tf.image.decode_jpeg(raw, channels=3, dct_method='INTEGER_ACCURATE')
    image = tf.image.resize(image, IMG_SIZE)
    image = tf.image.convert_image_dtype(image, tf.float32)
    return np.array(image)

# Way 2: load with load_img (PIL) and convert with img_to_array.
def read_image2(path):
    img = image.load_img(path, target_size=IMG_SIZE)
    arr = img_to_array(img)
    return np.array(arr)

def embedings(arr, model):
    arr4d = np.expand_dims(arr, axis=0)  # add the batch dimension
    embeds = model.predict(arr4d)
    return embeds

url = 'https://upload.wikimedia.org/wikipedia/commons/d/d6/Thai-Ridgeback.jpg'
pic_name = 'test_picture.jpg'
urllib.request.urlretrieve(url, pic_name)

arr1 = read_image1(pic_name)
arr2 = read_image2(pic_name)
embs1 = embedings(arr1, model)
embs2 = embedings(arr2, model)

img1 = tf.keras.preprocessing.image.array_to_img(arr1)
img2 = tf.keras.preprocessing.image.array_to_img(arr2)

def compare_arrays(arr1, arr2):
    print(f'Shape is the same? {arr1.shape == arr2.shape}')
    print(f'Arrays are equal? {(arr1 == arr2).all()}')

compare_arrays(arr1, arr2)
compare_arrays(embs1, embs2)
Solution 1:[1]
The results are different for two reasons. First, tf.image.convert_image_dtype(image, tf.float32) normalizes integer image values by the MAX value of the input dtype (convert_image_dtype). Second, the default interpolation method of tf.image.resize is "bilinear" (resize), while the default for tf.keras.utils.load_img (load_img) is 'nearest'.
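As a quick illustration of both effects, here is a minimal sketch on a tiny synthetic uint8 tensor (the pixel values are made up for the demonstration):

import tensorflow as tf

# A tiny 2x2, single-channel uint8 "image" with made-up values.
img = tf.constant([[[0], [64]], [[128], [255]]], dtype=tf.uint8)

# convert_image_dtype rescales integer inputs into [0, 1]...
print(tf.image.convert_image_dtype(img, tf.float32).numpy().max())  # 1.0
# ...while a plain cast keeps the original 0-255 range.
print(tf.cast(img, tf.float32).numpy().max())  # 255.0

# The tf.image.resize default is 'bilinear', which interpolates between
# pixels; 'nearest' only copies existing pixels, so the outputs differ.
up_bilinear = tf.image.resize(img, [4, 4])
up_nearest = tf.image.resize(img, [4, 4], method='nearest')
print((up_bilinear.numpy() == tf.cast(up_nearest, tf.float32).numpy()).all())  # False

Note that load_img also accepts an interpolation argument, so the two pipelines could equally be matched from the other direction by passing interpolation='bilinear' to load_img.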
I edited your code like this:
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import img_to_array
import tensorflow as tf
import numpy as np
import urllib.request

IMG_SIZE = [224, 224]

def read_image1(path):
    raw = tf.io.read_file(path)
    image = tf.image.decode_jpeg(raw, channels=3, dct_method='INTEGER_ACCURATE')
    image = tf.image.resize(image, IMG_SIZE, method='nearest')  # match load_img's default interpolation
    image = tf.cast(image, 'float32')  # plain cast: keep the 0-255 range instead of normalizing
    return np.array(image)

def read_image2(path):
    img = image.load_img(path, target_size=IMG_SIZE)
    arr = img_to_array(img)
    return np.array(arr)

url = 'https://upload.wikimedia.org/wikipedia/commons/d/d6/Thai-Ridgeback.jpg'
pic_name = 'test_picture.jpg'
urllib.request.urlretrieve(url, pic_name)

arr1 = read_image1(pic_name)
arr2 = read_image2(pic_name)

def compare_arrays(arr1, arr2):
    print(f'Shape is the same? {arr1.shape == arr2.shape}')
    print(f'Arrays are equal? {(arr1 == arr2).all()}')

compare_arrays(arr1, arr2)
Shape is the same? True
Arrays are equal? True
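With the two loaders now producing identical arrays, the embeddings should match as well. A minimal check, reusing the model and the embedings helper from the question (np.allclose is used rather than exact equality, to allow for floating-point noise):

# Identical inputs should yield matching embeddings from the same model.
embs1 = embedings(arr1, model)
embs2 = embedings(arr2, model)
print(np.allclose(embs1, embs2))  # expected: True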
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Stanislav D. |