Convert TensorFlow Keras Python model to Android .tflite model

I am working on an image recognition project using TensorFlow and Keras, which I would like to integrate into my Android project. I am new to TensorFlow...

I would like to find the closest match for an image within a folder of 2000+ images. The images are similar in background and size, like so:


For now I have the following Python code, which works okay.

from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
import numpy as np
from PIL import Image

base_model = VGG16(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('fc1').output)

def extract(img):
    img = img.resize((224, 224)) # Resize the image
    img = img.convert('RGB') # Convert the image color space
    x = image.img_to_array(img) # Convert the image to a numpy array
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    feature = model.predict(x)[0] # Extract Features
    return feature / np.linalg.norm(feature)

# Iterate through images and extract Features
images = ["img1.png", "img2.png", "img3.png", "img4.png", "img5.png"]  # ...plus 2000 more
all_features = np.zeros(shape=(len(images),4096))

for i, img_path in enumerate(images):
    all_features[i] = extract(img=Image.open(img_path))

# Match image
query = extract(img=Image.open("image_to_match.png")) # Extract its features
dists = np.linalg.norm(all_features - query, axis=1) # Calculate the similarity (distance) between images
ids = np.argsort(dists)[:5] # Indices of the 5 images with the lowest distance

Now I am a bit lost as to where to go from here. To my understanding, I need to create a .h5 file with all the extracted image features and a .tflite file containing the model.

UPDATE after answer

I can convert the model with:

import tensorflow as tf

# Convert the model.
base_model = VGG16(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('fc1').output)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

But how can I get the extracted features into my Android project? Also, the model file is over 400 MB, so Android does not allow me to import it.

Hope you can help me, thanks.



Solution 1:[1]

The TensorFlow Lite converter takes a TensorFlow/Keras model and generates a TensorFlow Lite (.tflite) model; the diagram on the linked page illustrates this conversion process. Even though there is a command-line way of converting the model (https://www.tensorflow.org/lite/convert/index#cmdline), the Python API is recommended, because it lets you add metadata (if required) and apply optimizations to the model. The following steps convert your Keras model to tflite. Source: https://www.tensorflow.org/lite/convert/index

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)

Often, when you are dealing with bigger models, the model size will exceed the allowed size and the model might not perform as well as you want, so you have to apply optimizations. This is done using tf.lite.Optimize, which lets you optimize your model for speed, storage, etc. TensorFlow used to allow a lot of manual control, where you specified what to optimize for using tf.lite.Optimize.OPTIMIZE_FOR_LATENCY or tf.lite.Optimize.OPTIMIZE_FOR_SIZE; nowadays the default applies both of these optimizations. The conversion code then becomes:

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT] #optimization
tflite_quant_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_quant_model)

What this does is dynamic range quantization. Check the size of your model after this step.
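As a quick sanity check, you can read the file size from Python; a minimal sketch, assuming the model was saved as model.tflite as above:

import os

# Size of the converted model on disk, in megabytes.
size_mb = os.path.getsize("model.tflite") / (1024 * 1024)
print(f"model.tflite is {size_mb:.1f} MB")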

If you want to quantize the model further, for example converting all float32 weights to float16 (which reduces the model size to approximately half the original), you can do so by specifying a target spec. Your code will then look like this (understand that this can affect the model accuracy):

import pathlib

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT] # optimization
converter.target_spec.supported_types = [tf.float16] # target spec
tflite_fp16_model = converter.convert()

# Save the model.
tflite_models_dir = pathlib.Path("tflite_models")
tflite_models_dir.mkdir(exist_ok=True)
tflite_model_fp16_file = tflite_models_dir / "model_quant_f16.tflite"
tflite_model_fp16_file.write_bytes(tflite_fp16_model)

There are other types of post-training quantization as well, which you can find on this page:

https://www.tensorflow.org/lite/performance/post_training_quantization
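For example, full-integer quantization from that page needs a representative dataset so the converter can calibrate activation ranges. A minimal sketch, assuming model is the Keras model from above and sample_images is a placeholder for a few hundred preprocessed images you would supply:

import numpy as np
import tensorflow as tf

# Placeholder calibration data shaped like the model input (1, 224, 224, 3).
sample_images = np.random.rand(100, 224, 224, 3).astype(np.float32)

def representative_dataset():
    for img in sample_images:
        # The converter uses these samples to calibrate activation ranges.
        yield [np.expand_dims(img, axis=0)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict to int8 ops; conversion fails if an op cannot be quantized.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_int8_model = converter.convert()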

All of these are post-training quantizations; you can also quantize the model before or during training (quantization-aware training). Refer to https://www.tensorflow.org/api_docs/python/tf/quantization/ for that; you can also find a lot of tutorials via search.
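A rough sketch of the quantization-aware route, assuming the separate tensorflow-model-optimization package is installed (it is not part of core TensorFlow):

import tensorflow_model_optimization as tfmot

# Wrap the Keras model so training simulates quantization effects.
q_aware_model = tfmot.quantization.keras.quantize_model(model)
q_aware_model.compile(optimizer='adam', loss='mse')

# Fine-tune q_aware_model on your data, then convert it with
# tf.lite.TFLiteConverter.from_keras_model(q_aware_model) as above.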

After this, you will have to run these tflite models in Python to test them.

The steps are as below:

  1. Load the model onto an interpreter.
  2. Test the model with sample image(s).
  3. Evaluate the model.

You can find a detailed example on this page: https://www.tensorflow.org/lite/performance/post_training_float16_quant
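A minimal sketch of steps 1 and 2 with the Python tf.lite.Interpreter (the file name and the random test input are assumptions; in practice you would feed a preprocessed image):

import numpy as np
import tensorflow as tf

# 1. Load the model onto an interpreter.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# 2. Run a sample input (shape and dtype must match the model input).
x = np.random.rand(1, 224, 224, 3).astype(input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], x)
interpreter.invoke()
feature = interpreter.get_tensor(output_details[0]['index'])[0]
print(feature.shape)  # (4096,) for the VGG16 fc1 layer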

There are many other types of quantization as well, depending on the precision required and the type of edge device you are going to use. Please refer to the link below for details on how to apply them to your model.

https://www.tensorflow.org/lite/performance/model_optimization

After quantization, check your model size; this should reduce it to the required value. If not, repeat the operation with lower precisions.

Solution 2:[2]

From TensorFlow's own site:

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

Solution 3:[3]

You have 2 options:

  1. Post-training quantization

    import tensorflow as tf
    # saved_model_dir: path to a SavedModel export of your model
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_quant_model = converter.convert()
    
  2. Selective builds

TensorFlow Lite enables you to reduce model binary sizes by using selective builds (the documentation uses Mobilenet as an example). It says:

Selective builds skip unused operations in your model set and produce a compact library with just the runtime and the op kernels required for the model to run on your mobile device.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   DharmanBot
Solution 2   N. Krh
Solution 3   Ehsan Barkhordar