Exploration of TensorFlow SavedModel

Sep 23, 2020

From the diagram above, we know that the saved model plays an important role as the bridge between training and deployment. In this article, I’d like to explore SavedModel in more detail.

What is SavedModel

A SavedModel contains a complete TensorFlow program, including weights and computation. It does not require the original model building code to run, which makes it useful for sharing or deploying (with TFLite, TensorFlow.js, TensorFlow Serving, or TensorFlow Hub). From [1]

Save and load Keras models

You probably know that TensorFlow and Keras are related. They started as separate packages and have since been combined. In short, it helps to learn TensorFlow’s and Keras’s original model-saving methods first; that gives a better understanding of how TensorFlow saves models today.

What a full Keras model contains

A Keras model consists of multiple components[2]:

An architecture, or configuration, which specifies what layers the model contains, and how they’re connected.
A set of weights values (the “state of the model”).
An optimizer (defined by compiling the model).
A set of losses and metrics (defined by compiling the model or calling add_loss() or add_metric()).

Model saving methods in Keras

From [2]

“The Keras API makes it possible to save all of these pieces to disk at once, or to only selectively save some of them:

Saving everything into a single archive in the TensorFlow SavedModel format (or in the older Keras H5 format). This is the standard practice.
Saving the architecture / configuration only, typically as a JSON file.
Saving the weights values only. This is generally used when training the model.”

We can see from the first approach that the latest Keras (which now lives under TensorFlow) uses the TensorFlow SavedModel format, so that it shares a unified API with TensorFlow.
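The second and third options in the quote can be sketched in a few lines. This is a minimal illustration, assuming TensorFlow 2.x with tf.keras available. It stays in memory: to_json() and model_from_json() round-trip the architecture, while get_weights() and set_weights() carry the weights. Writing either piece to disk is left out because the accepted file names and extensions vary between versions. The tiny Dense model mirrors the one in the sample code and is only there for illustration.

```python
import numpy as np
from tensorflow import keras

# A small functional model, used here only as an example.
inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)

# Architecture only: a JSON string describing layers and connections.
json_config = model.to_json()
rebuilt = keras.models.model_from_json(json_config)

# Weights only: a list of NumPy arrays, copied into the rebuilt model.
rebuilt.set_weights(model.get_weights())

# With the same architecture and the same weights, predictions should match.
x = np.random.random((4, 32)).astype("float32")
same_outputs = np.allclose(
    model.predict(x, verbose=0), rebuilt.predict(x, verbose=0)
)
print("Predictions match:", same_outputs)
```

Note that the rebuilt model is not compiled: the optimizer, losses, and metrics are not part of the JSON configuration, which is one reason saving everything at once is the standard practice.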

Sample code for the TensorFlow SavedModel format by Keras

The code is adjusted from [2]

import numpy as np
import tensorflow as tf
from tensorflow import keras
def get_model():
    # Create a simple model.
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


model = get_model()
print(model.summary())
# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_model')` creates a SavedModel folder `my_model`.
model.save("my_model")

# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_model")

# Let's check:
np.testing.assert_allclose(
    model.predict(test_input), reconstructed_model.predict(test_input)
)

# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)

I saved the code above as “save_load_keras_model.py” and ran it with Docker. How to use Docker to run TensorFlow 2.3 code on a Mac is covered in my previous article.

Once the TensorFlow Docker image is running, the script is visible inside the container.

Running the code gives the following output.

Screenshot of the output. Ignore the warnings in the first part; they appear because my Mac does not have a GPU. The model summary is visible in the output.

What are the 3 parts of a saved model

If you check the content in my_model, you will get 3 parts.

root@e5689efc0e46:/tmp/keras_model# ls my_model/
assets  saved_model.pb  variables

saved_model.pb is the most important part: it holds the serialized graph that deployment tools load directly.

The assets folder is empty, and the variables folder contains two files:

root@e5689efc0e46:/tmp/keras_model/my_model# ls variables/
variables.data-00000-of-00001  variables.index

If you open the variables.data-00000-of-00001 file, you will get something not very readable as shown below.

¢<85>=äz·><89>^ñ½D|}>ý<89>ì=¦§Ã¼Ù<90><99>¾yý§>¤^E<95>>Ïî^Z=<8c><87>¾zË<87>¾§^N<90>>6<86>n¾ ^__½`Á8>g°P¾,^GB>ÑÆ^V>õzs>z<8a>Ã>^[,À><8e>>w½{<80>½½¶à<97>>^FLÍ>^@6Z¾<93>ã<89>=mÈK¾_°<9b>>ð<8d>z¾c'Ü¾ðÝz»^D^@^@^@^@^@^@^@fff?w¾^??^@^@^@^@o^R<83>:Y^EBB^@^@^@C<8a>¡á=<81><9d>^@>D¶ä=ÈØ^T><96><99>^@>,#ù=Gk²=^Z)^Q>e?è=Fbæ=<93>§Ì=·ù<9d>=«sù=b<8c>á=y^Q¹=^Möõ=<82><97>Ë=<86>_    >lö^@>áL^D>jDÓ=úâ^K>ú×Ò=¼½È=,ª^Q><80>H^B>¹<8c>Ô=Xë^B>^P<8e>©=þD^S>é¬ß=+<9a><9a>=`Gc>â)^^:pA:5e^L:r²U:<8d>è/:6,:°B½9<91>kO:^Ef^K:¬´0:^Y°Ú9{^G<91>9Kî":ÖK^K:Y<93>¶9cI^]:û»Î9ÝÎF:é<8d>3:z!L:%tÛ90ÒC:vÕó9=ÜÂ9¬²I:   _#:m^C^[:&ç.::¦<99>95<81>F:öâú98iu9<8e>^Q^C;ì^P~îy{¥^A^K^H^A^R^Glayer-0^X^H^B^R^Tlayer_with_weights-0^K^H^B^R^Glayer-1^M^H^C^R    optimizer^M^H^D^R    variables^Y^H^E^R^Uregularization_losses^W^H^F^R^Strainable_variables^M^H^G^R    keras_api^N^H^H^Rsignatures^@h^H  ^R^Fkernel^H^H^R^Dbias^M^H^K^R    variables^Y^H^L^R^Uregularization_losses^W^H^M^R^Strainable_variables^M^H^N^R    keras_apid^H^H^O^R^Diter^H^P^R^Fbeta_1^H^Q^R^Fbeta_2^H^R^R^Edecay^Q^H^S^R^Mlearning_rate^Z^G^H   ^R^Am^X#^Z^G^H^R^Am^X$^Z^G^H  ^R^Av^X%^Z^G^H^R^Av^X&^N

If you open the variables.index, you will get

^@^@^F^H^A^Z^B^H^A^@^\^O_CHECKPOINTABLE_OBJECT_GRAPH^H^G^R^@ ¬^C(ò^P5É_8<89>^@4^Nkeras_api/metrics/0/count/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^@  ^A(^D5q^_¡í^T ^Ntotal/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^@ <9c>^A(^D5+¯<8f>%^@4^Rlayer_with_weights-0/bias/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^D^R^B^H^A <80>^A(^D5^L(úª^[5^ROPTIMIZER_SLOT/optimizer/m/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^D^R^B^H^A ¤^B(^D5ûI¨à4^\^Rv/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^D^R^B^H^A ¨^C(^D5(¬<99>O^U!^Tkernel/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^H^R^B^H ^R^B^H^A(<80>^A5ò^Z^]5^WOPTIMIZER_SLOT/optimizer/m/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^H^R^B^H ^R^B^H^A ¤^A(<80>^A5^G¤{6^\^Wv/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^H^R^B^H ^R^B^H^A ¨^B(<80>^A5^N¬Oe^@+^Noptimizer/beta_1/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^@ <8c>^A(^D5ýëÇß^O^\^N2/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^@ <90>^A(^D5»íúÙ^Ndecay/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^@ <94>^A(^D5¦{^Q:^_^Niter/.ATTRIBUTES/VARIABLE_VALUE^H   ^R^@ <84>^A(^H5BER^D(^Nlearning_rate/.ATTRIBUTES/VARIABLE_VALUE^H^A^R^@ <98>^A(^D5iËU2^@^@^@^@^A^@^@^@^@<87>^MÐú^@^@^@^@^A^@^@^@^@Àò¡°^@^A^Cp^@¹^F^@^@^@^@^A^@^@^@^@ò®K§¾^F^HË^F^O^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Wû<80><8b>$uGÛ

Why we need 3 parts in the saved model

The model architecture and training configuration (including the optimizer, losses, and metrics) are stored in saved_model.pb. The weights are saved in the variables/ directory.[2]
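Because the weights live in the variables/ directory as a checkpoint, they can be listed by name instead of being read as raw bytes. The sketch below assumes TensorFlow 2.x and uses a hypothetical one-variable tf.Module called Scaler rather than the Keras model from this article, so the names it prints will differ from the ones in the dumps. tf.train.list_variables and tf.train.load_variable are the checkpoint-reading helpers used here.

```python
import os
import tempfile

import tensorflow as tf


class Scaler(tf.Module):
    """Hypothetical module with a single variable, for illustration only."""

    def __init__(self):
        super().__init__()
        self.factor = tf.Variable(2.0)


export_dir = tempfile.mkdtemp()
tf.saved_model.save(Scaler(), export_dir)

# The checkpoint prefix is the common stem of variables.index and
# variables.data-?????-of-????? inside the variables/ directory.
checkpoint_prefix = os.path.join(export_dir, "variables", "variables")

# Each entry is a (name, shape) pair.
entries = tf.train.list_variables(checkpoint_prefix)
for name, shape in entries:
    print(name, shape)

# Look up the key for the `factor` attribute and read its stored value.
factor_key = next(name for name, _ in entries if name.startswith("factor/"))
value = tf.train.load_variable(checkpoint_prefix, factor_key)
print(factor_key, "=", value)
```

The same two calls should work on the my_model/variables/variables prefix from the Keras example, where the listed names correspond to the readable fragments visible in variables.index.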

A SavedModel directory has the following structure[4]:

assets/
assets.extra/
variables/
    variables.data-?????-of-?????
    variables.index
saved_model.pb

SavedModel protocol buffer

saved_model.pb or saved_model.pbtxt
Includes the graph definitions as MetaGraphDef protocol buffers.

Assets

Subfolder called assets.
Contains auxiliary files such as vocabularies, etc.

Extra assets

Subfolder where higher-level libraries and users can add their own assets that co-exist with the model, but are not loaded by the graph.
This subfolder is not managed by the SavedModel libraries.

Variables

Subfolder called variables.
Includes output from the TensorFlow Saver.
variables.data-?????-of-?????
variables.index
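The layout described above can be checked with nothing but the Python standard library. The following is a rough sketch: summarize_saved_model and missing_parts are hypothetical helper names, and the demo directory is filled with placeholder bytes, not a real model, so that the example runs without TensorFlow. It treats saved_model.pb (or saved_model.pbtxt) and variables/ as the expected parts and assets/ as optional, which is an assumption of this sketch.

```python
import os
import tempfile


def summarize_saved_model(export_dir):
    """Return {entry_name: size_in_bytes} for the top-level entries."""
    sizes = {}
    for entry in sorted(os.listdir(export_dir)):
        path = os.path.join(export_dir, entry)
        if os.path.isdir(path):
            # Sum every file below the directory, similar to `du -s`.
            total = 0
            for root, _dirs, files in os.walk(path):
                for name in files:
                    total += os.path.getsize(os.path.join(root, name))
            sizes[entry] = total
        else:
            sizes[entry] = os.path.getsize(path)
    return sizes


def missing_parts(export_dir):
    """List the expected parts that are absent from export_dir."""
    missing = []
    graph_files = ("saved_model.pb", "saved_model.pbtxt")
    if not any(os.path.isfile(os.path.join(export_dir, g)) for g in graph_files):
        missing.append("saved_model.pb")
    if not os.path.isdir(os.path.join(export_dir, "variables")):
        missing.append("variables")
    return missing


# Build a fake directory with placeholder contents (not a loadable model).
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "assets"))
os.makedirs(os.path.join(demo_dir, "variables"))
placeholders = {
    "saved_model.pb": 10,
    os.path.join("variables", "variables.index"): 4,
    os.path.join("variables", "variables.data-00000-of-00001"): 6,
}
for relative_path, size in placeholders.items():
    with open(os.path.join(demo_dir, relative_path), "wb") as f:
        f.write(b"\x00" * size)

sizes = summarize_saved_model(demo_dir)
missing = missing_parts(demo_dir)
print(sizes)
print("Missing parts:", missing)
```

Pointing summarize_saved_model at a real export directory gives a per-part size breakdown comparable to running du on it.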

Back to TensorFlow SavedModel

Example code

# https://www.tensorflow.org/guide/saved_model
import os
import tempfile

from matplotlib import pyplot as plt
import numpy as np
import tensorflow as tf

tmpdir = tempfile.mkdtemp()
physical_devices = tf.config.experimental.list_physical_devices('GPU')
if physical_devices:
  tf.config.experimental.set_memory_growth(physical_devices[0], True)
file = tf.keras.utils.get_file(
    "grace_hopper.jpg",
    "https://storage.googleapis.com/download.tensorflow.org/example_images/grace_hopper.jpg")
img = tf.keras.preprocessing.image.load_img(file, target_size=[224, 224])
plt.imshow(img)
plt.axis('off')
x = tf.keras.preprocessing.image.img_to_array(img)
x = tf.keras.applications.mobilenet.preprocess_input(
    x[tf.newaxis,...])
labels_path = tf.keras.utils.get_file(
    'ImageNetLabels.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
imagenet_labels = np.array(open(labels_path).read().splitlines())

pretrained_model = tf.keras.applications.MobileNet()
print(pretrained_model.summary())
result_before_save = pretrained_model(x)

decoded = imagenet_labels[np.argsort(result_before_save)[0,::-1][:5]+1]

print("Result before saving:\n", decoded)

mobilenet_save_path = os.path.join(tmpdir, "mobilenet/1/")
tf.saved_model.save(pretrained_model, mobilenet_save_path)

loaded = tf.saved_model.load(mobilenet_save_path)
print(list(loaded.signatures.keys()))  # ["serving_default"]

infer = loaded.signatures["serving_default"]
print(infer.structured_outputs)

labeling = infer(tf.constant(x))[pretrained_model.output_names[0]]

decoded = imagenet_labels[np.argsort(labeling)[0,::-1][:5]+1]

print("Result after saving and loading:\n", decoded)

Output

We have the model summary output

Also we have three output files

root@9cbba134aee4:/tmp# ls tmp5dy2i6uj/mobilenet/1/
assets  saved_model.pb  variables

If you check the filesize, you will get

root@9cbba134aee4:/tmp/tmp5dy2i6uj/mobilenet/1# du -hsc *
0       assets
1.7M    saved_model.pb
17M     variables
18M     total

Read the saved mobilenet back

Let’s try to read the saved MobileNet back to see whether we can complete the whole process.

import tensorflow as tf
print(tf.__version__)
def load_model(model_filename):
    model = tf.keras.models.load_model(model_filename)
    # Check its architecture
    # model.summary()
    print(model)
    print(model.summary())

if __name__ == "__main__":
    model_dir = "/tmp/tmp5dy2i6uj/mobilenet/1/"
    #model_dir = "./../model/"
    load_model(model_dir)

It works and we can get the same network as shown below

Troubleshooting

For some models, loading with the code above may fail. The failure output is:

2020-09-23 16:19:39.399102: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-09-23 16:19:39.399237: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2.3.0
2020-09-23 16:19:40.654460: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-09-23 16:19:40.654515: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-23 16:19:40.654542: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (9cbba134aee4): /proc/driver/nvidia/version does not exist
2020-09-23 16:19:40.654719: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-23 16:19:40.660438: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2592000000 Hz
2020-09-23 16:19:40.661176: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x60c8ad0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-23 16:19:40.661215: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
<tensorflow.python.training.tracking.tracking.AutoTrackable object at 0x7f55f088d400>
Traceback (most recent call last):
  File "model_summary.py", line 13, in <module>
    load_model(model_dir)
  File "model_summary.py", line 8, in load_model
    print(model.summary())
AttributeError: 'AutoTrackable' object has no attribute 'summary'

For a regular model, which can print out the model summary, the output of print(model) is

<tensorflow.python.keras.engine.functional.Functional object at 0x7f9f300b6438>

What happened?

What is the difference between

<tensorflow.python.training.tracking.tracking.AutoTrackable object at 0x7f55f088d400>

and

<tensorflow.python.keras.engine.functional.Functional object at 0x7f9f300b6438>

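It may be related to the Functional API. The second object is a Keras Functional model, which has a summary() method; the first is a generic AutoTrackable object, which, as the error says, does not.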
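One way to keep a loading script from crashing on this difference is to check what the loaded object offers before calling summary(). The sketch below is only an illustration: describe_loaded is a hypothetical helper, and the two Fake classes are stand-ins that let the example run without a saved model on disk. It assumes that a non-Keras loaded object exposes a signatures mapping, as the object returned by tf.saved_model.load does.

```python
def describe_loaded(obj):
    """Describe a loaded model without assuming it is a Keras model."""
    if callable(getattr(obj, "summary", None)):
        obj.summary()  # Keras models print their layer table here.
        return "keras-like: " + type(obj).__name__
    signatures = getattr(obj, "signatures", None)
    if signatures is not None:
        return "signatures only: " + ", ".join(sorted(signatures.keys()))
    return "unknown object: " + type(obj).__name__


# Hypothetical stand-ins for the two kinds of loaded objects.
class FakeKerasModel:
    def summary(self):
        print("(layer table would be printed here)")


class FakeTrackable:
    signatures = {"serving_default": None}


keras_description = describe_loaded(FakeKerasModel())
trackable_description = describe_loaded(FakeTrackable())
print(keras_description)
print(trackable_description)
```

In the loading script above, replacing print(model.summary()) with print(describe_loaded(model)) would report the available signatures for an AutoTrackable object instead of raising AttributeError.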

The Functional API

For details, see [3].

TF saved_model source code and readme

See [4].

Signature [4]

Graphs that are used for inference tasks typically have a set of inputs and outputs. This is called a Signature.
SavedModel uses SignatureDefs to allow generic support for signatures that may need to be saved with the graphs.
For commonly used SignatureDefs in the context of TensorFlow Serving, please see the TensorFlow Serving documentation.
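To make the idea of a signature concrete, here is a small sketch that saves a module with an explicit serving signature and inspects it after loading. It assumes TensorFlow 2.x. The Scaler module, its scale method, the input name "x", and the output key "scaled" are all hypothetical names chosen for this example; only tf.saved_model.save, tf.saved_model.load, and the signatures attribute come from the TensorFlow API used earlier in the article.

```python
import tempfile

import tensorflow as tf


class Scaler(tf.Module):
    """Hypothetical module that multiplies its input by a stored factor."""

    def __init__(self):
        super().__init__()
        self.factor = tf.Variable(2.0)

    @tf.function(
        input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32, name="x")]
    )
    def scale(self, x):
        # Returning a dict gives the signature output a readable name.
        return {"scaled": x * self.factor}


module = Scaler()
export_dir = tempfile.mkdtemp()
tf.saved_model.save(
    module, export_dir, signatures={"serving_default": module.scale}
)

loaded = tf.saved_model.load(export_dir)
signature_keys = list(loaded.signatures.keys())
print(signature_keys)

# A signature records the named inputs and outputs of the saved graph.
infer = loaded.signatures["serving_default"]
print(infer.structured_input_signature)
print(infer.structured_outputs)

# Signature functions are called with keyword arguments and return a dict.
result = infer(x=tf.constant([1.0, 2.0, 3.0]))
scaled = result["scaled"].numpy().tolist()
print(scaled)
```

The input and output names printed here are what a serving client would use to address the model, which is why the signature is saved together with the graph.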

Reference

[1] https://www.tensorflow.org/guide/saved_model

[2] https://www.tensorflow.org/guide/keras/save_and_serialize

[3] https://www.tensorflow.org/guide/keras/functional

[4] https://github.com/tensorflow/tensorflow/tree/master/tensorflow/python/saved_model