FastAPI + Docker
Setup
pip install ydf -U
import ydf
import pandas as pd
Training a model
We first train a model using the "adult" dataset. For a comprehensive explanation of model training, evaluation, and interpretation with YDF, read the getting started tutorial.
We load the dataset:
dataset_path = "https://raw.githubusercontent.com/google/yggdrasil-decision-forests/main/yggdrasil_decision_forests/test_data/dataset"
dataset = pd.read_csv(f"{dataset_path}/adult_train.csv")
dataset.head(5)
| | age | workclass | fnlwgt | education | education_num | marital_status | occupation | relationship | race | sex | capital_gain | capital_loss | hours_per_week | native_country | income |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 44 | Private | 228057 | 7th-8th | 4 | Married-civ-spouse | Machine-op-inspct | Wife | White | Female | 0 | 0 | 40 | Dominican-Republic | <=50K |
| 1 | 20 | Private | 299047 | Some-college | 10 | Never-married | Other-service | Not-in-family | White | Female | 0 | 0 | 20 | United-States | <=50K |
| 2 | 40 | Private | 342164 | HS-grad | 9 | Separated | Adm-clerical | Unmarried | White | Female | 0 | 0 | 37 | United-States | <=50K |
| 3 | 30 | Private | 361742 | Some-college | 10 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 0 | 50 | United-States | <=50K |
| 4 | 67 | Self-emp-inc | 171564 | HS-grad | 9 | Married-civ-spouse | Prof-specialty | Wife | White | Female | 20051 | 0 | 30 | England | >50K |
We train a model with default parameters:
model = ydf.GradientBoostedTreesLearner(label="income").train(dataset)
Train model on 22792 examples
Model trained in 0:00:01.420861
We can generate predictions to make sure the model works. Note that model.predict takes a batch of examples as input: each feature maps to a list of values, one per example. To predict on a single example, wrap each feature value in a one-element list:
model.predict({
    'age': [44],
    'workclass': ['Private'],
    'fnlwgt': [228057],
    'education': ['7th-8th'],
    'education_num': [4],
    'marital_status': ['Married-civ-spouse'],
    'occupation': ['Machine-op-inspct'],
    'relationship': ['Wife'],
    'race': ['White'],
    'sex': ['Female'],
    'capital_gain': [0],
    'capital_loss': [0],
    'hours_per_week': [40],
    'native_country': ['Dominican-Republic'],
})
array([0.02801839], dtype=float32)
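When examples arrive one at a time (for instance, as a single record with scalar values), the one-element-list wrapping described above can be automated. The following is a minimal sketch; to_batch is a hypothetical helper name used for illustration and is not part of YDF.

```python
from typing import Any, Dict, List


def to_batch(example: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Wraps each feature value of a single example in a one-element list."""
    return {feature: [value] for feature, value in example.items()}


# The same example as above, written with scalar feature values.
single_example = {
    "age": 44,
    "workclass": "Private",
    "fnlwgt": 228057,
    "education": "7th-8th",
    "education_num": 4,
    "marital_status": "Married-civ-spouse",
    "occupation": "Machine-op-inspct",
    "relationship": "Wife",
    "race": "White",
    "sex": "Female",
    "capital_gain": 0,
    "capital_loss": 0,
    "hours_per_week": 40,
    "native_country": "Dominican-Republic",
}

batch = to_batch(single_example)
print(batch["age"], batch["workclass"])
```

The resulting dictionary has the same batch layout as the argument passed to model.predict above.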
For a binary classification model (i.e., a model that can predict one of two classes), the output is the probability of the positive class:
model.label_classes()[1]
'>50K'
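Because the model returns the probability of the positive class, a class label can be recovered by thresholding that probability. The sketch below rests on several assumptions: the label classes are ['<=50K', '>50K'] (the two values of the income column, with '>50K' as the positive class), the decision threshold is 0.5, and the second probability (0.9) is an invented value for illustration. probabilities_to_labels is a hypothetical helper, not a YDF function.

```python
from typing import List, Sequence


def probabilities_to_labels(
    probabilities: Sequence[float],
    label_classes: Sequence[str],
    threshold: float = 0.5,
) -> List[str]:
    """Maps positive-class probabilities to class labels.

    label_classes[1] is treated as the positive class and label_classes[0]
    as the negative class. The 0.5 default threshold is an assumption.
    """
    negative, positive = label_classes
    return [positive if p >= threshold else negative for p in probabilities]


# Assumed class order; in practice, use model.label_classes().
classes = ["<=50K", ">50K"]

# 0.02801839 is the prediction shown above; 0.9 is an illustrative value.
labels = probabilities_to_labels([0.02801839, 0.9], classes)
print(labels)
```

A threshold other than 0.5 may be preferable when the two kinds of error have different costs.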
Packaging the model into a Docker container
model.to_docker(path) exports the model to a directory containing the files needed to build a Docker image that serves the model.
model.to_docker("my_docker_model")
You can inspect the generated directory. In some advanced cases, you might want to edit some of the automatically generated files.
!ls -l my_docker_model
total 4
-rw-rw-r-- 1 gbm primarygroup  288 Jul 26 13:39 deploy_in_google_cloud.sh
-rw-rw-r-- 1 gbm primarygroup  211 Jul 26 13:39 Dockerfile
-rw-rw-r-- 1 gbm primarygroup 1313 Jul 26 13:39 main.py
drwxrwxr-x 1 gbm primarygroup    0 Jul 26 13:39 model
-rw-rw-r-- 1 gbm primarygroup  360 Jul 26 13:39 readme.txt
-rw-rw-r-- 1 gbm primarygroup   21 Jul 26 13:39 requirements.txt
-rw-rw-r-- 1 gbm primarygroup  485 Jul 26 13:39 test_locally.sh
The Docker image can be built and run locally with:
docker build -t ydf_predict_image ./my_docker_model
docker run --rm -p 8080:8080 -d ydf_predict_image
Note: For this command to run, you'll need to install Docker.
The test_locally.sh script in the generated directory shows how to send a request to the local server.
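A request can also be built from Python. The sketch below is illustrative only and relies on assumptions that should be checked against the generated test_locally.sh and main.py: that the server exposes a POST endpoint at /predict and that it accepts a JSON object mapping each feature to a list of values (the same batch layout as model.predict). Only the port, 8080, comes from the docker run command above. build_request is a hypothetical helper name.

```python
import json
import urllib.request

# Assumptions (verify in test_locally.sh and main.py): the endpoint path
# "/predict" and the JSON payload layout are not confirmed by this page.
ENDPOINT_URL = "http://localhost:8080/predict"


def build_request(examples: dict, url: str = ENDPOINT_URL) -> urllib.request.Request:
    """Builds a JSON POST request carrying a batch of examples."""
    body = json.dumps(examples).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A reduced batch for illustration; a real request would carry the
# features the model was trained on.
request = build_request({"age": [44], "workclass": ["Private"]})
print(request.get_method(), request.full_url)

# With the container running, the request could be sent with:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

The sending step is left commented out so the snippet runs without a live server.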
Finally, the model can be deployed on Google Cloud with:
gcloud run deploy ydf-predict --source ./my_docker_model
The deployed model can be monitored with the Google Cloud Console.
Note: For this command to run, you'll need to install the Google Cloud CLI and set up a project.