Models exist in multiple formats:
Yggdrasil model: A directory containing a Yggdrasil model. The directory is recognizable by a
<optional prefix>data_spec.pbfile. This is the native format of the library. This format is used with the CLI, C++, and Go APIs.
A TensorFlow Decision Forests model: This is a TensorFlow SavedModel containing one or more Yggdrasil models and recognizable by a
saved_model.pbfile . A SavedModel itself is a directory and the Yggdrasil models are stored in one of its subdirectories. The directory containing the Yggdrasil models (currently named
assets). This format is used with the TensorFlow Decision Forests API.
Models can be “converted” between formats:
Yggdrasil model → Zipped Yggdrasil model: Zip the model directory.
Zipped Yggdrasil model → Yggdrasil model: Unzip the model directory.
TensorFlow Decision Forests model → Yggdrasil model: The Yggdrasil model is contained in the
assetssub-directory of the TensorFlow Decision Forests model directory. See details in the the next section.
(Zipped) Yggdrasil model → TensorFlow Decision Forests model: This conversion requires some code. See the next section.
Convert a a TensorFlow Decision Forests model to a Yggdrasil model#
A TensorFlow Decision Forests model directory contains a sub-dirctory containing
a Yggdrasil model. In the next example, we create a TF-DF model in
and show the Yggdrasil model in
import tensorflow_decision_forests as tfdf # Train a TF-DF model (without any pre-processing) model = tfdf.keras.GradientBoostedTreesModel() model.fit(...) # Export the model as a TF Saved Model # Note: /tmp/model/assets is an Yggdrasil Decision Forests model model.save("/tmp/model") # Show the structure of the TF SavedModel. !tree /tmp/model # /tmp/model # ├── assets # │ ├── <prefix>data_spec.pb # │ ├── <prefix>done # │ ├── <prefix>gradient_boosted_trees_header.pb # │ ├── <prefix>header.pb # │ └── <prefix>nodes-00000-of-00001 # ├── keras_metadata.pb # ├── saved_model.pb # └── variables # ├── variables.data-00000-of-00001 # └── variables.index
Yggdrasil tools can be used. The next example use the
command on a TF-DF model:
# Show the model structure using Yggdrasil tool box !yggdrasil_decision_forests/cli/show_model --model=/tmp/model/assets
An Yggdrasil model contained in a TF-DF model differ from the TF-DF model in the following points:
Any TensorFlow preprocessing (e.g., using of the
preprocessormodel constructor argument) is not taken into account.
Categorical integer features need to be offset by 1. Therefore, for simplicity of use, it is better to use categorical string features.
Convert a Yggdrasil model to a TensorFlow Decision Forests model#
The conversion of an Yggdrasil model to to a TensorFlow Decision Forests is done
tfdf.keras.yggdrasil_model_to_keras_model() function available in in
TensorFlow Decision Forests.
The following example converts a YDF model into a TensorFlow Decision Forests model:
# Prepare and load the model with TensorFlow import tensorflow_decision_forests as tfdf tfdf.keras.yggdrasil_model_to_keras_model(<path to ydf model>, <path to tfdf-model>)
yggdrasil_model_to_keras_model creates a model with raw tensor
inputs. The type of the input tensor is determined by the semantics of the
features. For example, numerical features are always encoded as float32. You can
change this signature using the
The following example converts a YDF model while making sure that
encoded into a int64 tensor.
def custom_model_input_signature(inspector) -> Any: # Default signature. input_spec = tfdf.keras.build_default_input_model_signature(inspector) # Override the type of feature_1. input_spec["feature_1"] = tf.TensorSpec(shape=[None], dtype=tf.int64) return input_spec tfdf.keras.yggdrasil_model_to_keras_model( ygg_model_path, tfdf_model_path, input_model_signature_fn=custom_model_input_signature)
In YDF, pre-integerized categorical features (i.e., categorical features
represented as integers) are 1-indexed. The value 0 (zero) is reserved for the
out-of-vocabulary item. However, many TensorFlow tools assume that categorical
integer features are 0-indexed. For this reason, TF-DF models are 0-indexed and
contain an implicit +1 offset to the YDF model underneath. This offset does not
impact users in most cases. However, when converting models, depending on how
this model will be used, removing this offset might be better. This is done with