YDF (Yggdrasil Decision Forests) is a library to train, evaluate, interpret, and serve Random Forest, Gradient Boosted Decision Trees, and CART decision forest models.

**A concise and modern API**

YDF allows for rapid prototyping and development while minimizing the risk of modeling errors.

**Deep learning composition**

Integrated with TensorFlow, Keras, and Vertex AI.

**Cutting-edge algorithms**

Includes the latest decision forest research for maximum performance.

**Fast inference**

Computes predictions in a few microseconds. Used for tens of millions of predictions per second at Google.

## Key features

**Modeling**

- Train Random Forest, Gradient Boosted Trees, CART, and Isolation Forest models.
- Train classification, regression, ranking, uplifting, and anomaly detection models.
- Plot decision trees.
- Interpret model (variable importances, partial dependence plots, conditional dependence plots).
- Interpret predictions (counterfactual, feature variation).
- Evaluate models (accuracy, AUC, ROC plots, RMSE, confidence intervals, cross-validation).
- Tune model hyperparameters.
- Natively consume numerical, categorical, boolean, tag, and text features, as well as missing values.
- Natively consume Pandas DataFrames, NumPy arrays, TensorFlow Datasets, CSV files, and TensorFlow Records.

**Serving**

- Benchmark model inference.
- Run models in Python, C++, Go, JavaScript, and CLI.
- Serve models online through a REST API with TensorFlow Serving and Vertex AI.

**Advanced modeling**

- Compose YDF models with neural network models in TensorFlow, Keras, and JAX.
- Distributed training over billions of examples and hundreds of machines.
- Use cutting-edge learning algorithms such as oblique splits, honest trees, hessian scores, global tree optimizations, optimal categorical splits, categorical-set inputs, DART, and extremely randomized trees.
- Apply monotonic constraints.
- Consume multi-dimensional features.
- Enjoy backward compatibility for models and learners since 2018.
- Edit trees in Python.
- Define custom loss in Python.

## Supported platforms

YDF is available on Python 3.8, 3.9, 3.11 and 3.12, on Windows x86-64, Linux x86-64, and macOS ARM64.

## Installation

To install YDF from PyPI, run `pip install ydf -U`.

## Usage example

```python
import ydf
import pandas as pd
# Load dataset with Pandas
ds_path = "https://raw.githubusercontent.com/google/yggdrasil-decision-forests/main/yggdrasil_decision_forests/test_data/dataset/"
train_ds = pd.read_csv(ds_path + "adult_train.csv")
test_ds = pd.read_csv(ds_path + "adult_test.csv")
# Train a Gradient Boosted Trees model
model = ydf.GradientBoostedTreesLearner(label="income").train(train_ds)
# Look at a model (input features, training logs, structure, etc.)
model.describe()
# Evaluate a model (e.g. roc, accuracy, confusion matrix, confidence intervals)
model.evaluate(test_ds)
# Generate predictions
model.predict(test_ds)
# Analyse a model (e.g. partial dependence plot, variable importance)
model.analyze(test_ds)
# Benchmark the inference speed of a model
model.benchmark(test_ds)
# Save the model
model.save("/tmp/my_model")
```

## Next steps

Read the 🧭 Getting Started tutorial. You will learn how to train a model, interpret it, evaluate it, generate predictions, benchmark its speed, and export it for serving.

Ask us questions on GitHub. Googlers can join the internal YDF Chat.

Read the TF-DF to YDF Migration guide to convert a TensorFlow Decision Forests pipeline into a YDF pipeline.