# Customer Lifetime Value (CLTV) Forecasting

Forecast how much each customer will spend in the months ahead — so you can invest where it matters most.

## Overview and Use Cases

The CLTV workflow from Treasure AI's machine learning platform (AI Signal) predicts how much each of your existing repeat customers will spend over a future window you choose, such as the next three, six, or twelve months. It works directly from your transaction history and produces a per-customer dollar forecast along with a percentile rank that shows where each customer sits relative to the rest of your base. When you choose the probabilistic model, you also get a churn probability for the same window, giving you a forward-looking view of both value and risk in a single run. The solution offers two modeling approaches under one workflow:

**Probabilistic approach (BG/NBD combined with Gamma-Gamma)** learns the rhythm of how your customers shop: how often they tend to buy, how much they typically spend, and when their purchasing pattern suggests they may be drifting away. It is a good fit when your customers have steady, repeatable buying habits, and it gives you a churn signal alongside the value forecast so you can spot at-risk customers in the same view.

**AutoML approach (FLAML)** studies your full transaction history and finds the patterns that best predict future spend. You can tune it for one of two goals depending on what you need: accurate dollar predictions when you are planning budgets or forecasting revenue, or accurate customer rankings when you only need to identify your top spenders for a VIP campaign or premium offer. This approach tends to perform best when you have a long, rich history to learn from.

Common use cases:

* Allocate acquisition and retention budgets toward customers with the highest predicted future value
* Prioritize VIP service, loyalty perks, and personalized offers for top-percentile customers
* Surface churn-risk signals (when using the probabilistic model) for proactive win-back campaigns
* Feed CLTV scores as input features into Next Best Action, lookalike, and propensity models
* Build segment tiers (Very High to Very Low) for differentiated lifecycle messaging


Who benefits most: Marketing analysts, CRM managers, and growth teams who need a forward-looking view of customer value without building a forecasting pipeline from scratch.

## Model Configuration

### Training Inputs

CLTV AI Signals accepts a transaction-level table. Each row represents a single purchase made by a single customer. Unlike RFM, no pre-aggregation is required — the solution handles summarization internally.

| **Field**  | **Type**  | **Required**  | **Description**  |
|  --- | --- | --- | --- |
| `user_id` | STRING | Yes | Unique customer identifier. The column name can be customized via the `user_column` parameter. |
| `amount` | FLOAT | Yes | Monetary value of the transaction (quantity × unit price). Must be positive (greater than zero). |
| `timestamp` | STRING | Yes | Timestamp when the transaction occurred (e.g., `2024-02-03 00:06:16.342827`). |


**Data quality requirements:**

* Each customer must have at least three transactions in the calibration window for the model to score them. Customers below this threshold are filtered out automatically and will not appear in the output.
* Negative or zero amounts will produce errors. Remove or net out refunds, voids, and promotional credits before training.
* Timestamps must be parseable; mixed timezones in the same column can shift recency calculations and should be normalized upstream.
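
The requirements above can be checked before a training run. The following is a minimal sketch in plain Python, assuming the transactions are already loaded as dictionaries keyed by the default `user_id`, `amount`, and `timestamp` column names. The function name `prepare_transactions` and the choice to normalize everything to UTC are illustrative assumptions, not part of the solution. It also counts transactions over whatever rows it receives, whereas the pipeline counts them within the calibration window.

```python
from collections import Counter
from datetime import datetime, timezone


def prepare_transactions(rows, min_transactions=3):
    """Drop rows the trainer would reject and customers it would filter out."""
    cleaned = []
    for row in rows:
        amount = float(row["amount"])
        # Amounts must be strictly positive; refunds and voids are dropped here.
        if amount <= 0:
            continue
        try:
            ts = datetime.fromisoformat(row["timestamp"])
        except ValueError:
            continue  # skip timestamps that cannot be parsed
        # Assumption: naive timestamps are already UTC; aware ones are converted.
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)
        else:
            ts = ts.astimezone(timezone.utc)
        cleaned.append({"user_id": row["user_id"], "amount": amount, "timestamp": ts})

    # Keep only customers with enough remaining transactions.
    counts = Counter(r["user_id"] for r in cleaned)
    return [r for r in cleaned if counts[r["user_id"]] >= min_transactions]
```

Running a check like this upstream makes it easier to see how many customers will be excluded before the job is submitted.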


### Training Outputs

CLTV AI Signals produces two output tables: a per-customer prediction table and a model metrics table. The columns produced depend on the modeling approach you choose.

**CLTV Prediction Output (all models):**

| **Field**  | **Type**  | **Description**  |
|  --- | --- | --- |
| `user_id` | STRING | Customer identifier, carried forward from input. |
| `pcltv_xmonth` | DOUBLE | Predicted lifetime value over the next X months, where X matches the `prediction_period` parameter (e.g., `pcltv_6month`). |
| `pcltv_xmonth_pctile` | LONG | Percentile rank of the predicted value, from 0 to 100. A value of 92 means the customer's predicted CLTV is higher than 92% of all scored customers. |


**Additional column when using the probabilistic (BGGG) model:**

| **Field**  | **Type**  | **Description**  |
|  --- | --- | --- |
| `pchurn_xmonth` | DOUBLE | Probability that the customer will churn (stop purchasing) within the next X months. Range: 0.0 to 1.0. |


**CLTV Metrics Output:**

| **Field**  | **Type**  | **Description**  |
|  --- | --- | --- |
| `rmse` | DOUBLE | Root Mean Squared Error of the CLTV predictions on the holdout period. |
| `mae` | DOUBLE | Mean Absolute Error of the CLTV predictions on the holdout period. |
| `label_gini` | DOUBLE | Gini coefficient computed on the true holdout labels (a ceiling on rankability). |
| `model_gini` | DOUBLE | Gini coefficient computed on the model's predictions. |
| `normalized_gini` | DOUBLE | `model_gini` / `label_gini`. Values closer to 1.0 indicate better ranking quality. |


**Additional metrics when using the probabilistic (BGGG) model:**

| **Field**  | **Type**  | **Description**  |
|  --- | --- | --- |
| `churn_auc` | DOUBLE | Area under the ROC curve for the churn classifier. |
| `churn_precision` | DOUBLE | Precision at a 0.5 churn-probability threshold. |
| `churn_recall` | DOUBLE | Recall at a 0.5 churn-probability threshold. |
| `churn_f1` | DOUBLE | F1 score at a 0.5 churn-probability threshold. |
| `churn_brier_score` | DOUBLE | Brier score — measures how well-calibrated the churn probabilities are. |
| `churn_ece` | DOUBLE | Expected Calibration Error — another calibration measure for churn probabilities. |


### Example Output

A sample of the CLTV prediction table from a six-month run using the FLAML model:

| **user_id**  | **pcltv_6month**  | **pcltv_6month_pctile**  |
|  --- | --- | --- |
| `user_001` | 450.25 | 90 |
| `user_002` | 185.00 | 80 |
| `user_003` | 12.10 | 20 |


Reading the output, the first customer is predicted to spend roughly $450 over the next six months, placing them in the top 10% of all scored customers. The third customer ranks in the bottom 20%, suggesting they are unlikely to drive significant revenue in that window — a good candidate for low-cost engagement rather than expensive paid media.
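
Percentile ranks like these map naturally onto segment tiers. The sketch below is one possible mapping, assuming five equal-width bands of 20 percentile points each; the cut points and the intermediate tier names are assumptions for illustration, and only the "Very High" and "Very Low" labels come from this guide.

```python
def cltv_tier(pctile):
    """Map a 0-100 percentile rank to one of five illustrative tiers."""
    if not 0 <= pctile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    # Assumed cut points: equal-width bands of 20 percentile points.
    if pctile >= 80:
        return "Very High"
    if pctile >= 60:
        return "High"
    if pctile >= 40:
        return "Medium"
    if pctile >= 20:
        return "Low"
    return "Very Low"


# Percentiles taken from the sample table above.
sample = {"user_001": 90, "user_002": 80, "user_003": 20}
tiers = {user: cltv_tier(p) for user, p in sample.items()}
```

Adjust the boundaries to match how your team sizes each tier; a narrower top band suits VIP programs with limited capacity.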

When using the BGGG model, an additional `pchurn_6month` column appears alongside the prediction columns. A customer with a high predicted CLTV and a high churn probability is your most urgent win-back target — they would be valuable if retained, but they are at risk of leaving.
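
As a rough sketch of how that combination could be turned into a target list, the function below keeps customers whose percentile and churn probability both exceed a threshold. The thresholds (80th percentile, 0.5 churn probability) and the function name `winback_targets` are assumptions chosen for illustration, and the column names assume a six-month run.

```python
def winback_targets(rows, min_pctile=80, min_churn=0.5):
    """Return user_ids with high predicted value and high churn risk,
    ordered from highest to lowest predicted CLTV."""
    hits = [
        r for r in rows
        if r["pcltv_6month_pctile"] >= min_pctile and r["pchurn_6month"] >= min_churn
    ]
    hits.sort(key=lambda r: r["pcltv_6month"], reverse=True)
    return [r["user_id"] for r in hits]
```

Sorting by predicted value puts the customers with the most revenue at stake at the top of the list.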

### Parameters

The pipeline has two stages, training and prediction, and each accepts its own configuration. Parameters fall into two groups: general parameters that apply to both modeling approaches, and FLAML-specific parameters that only apply when using the AutoML option.

**General parameters:**

| **Parameter**  | **Type**  | **Default**  | **Required**  | **Description**  |
|  --- | --- | --- | --- | --- |
| `input_table` | string | — | Yes | Source transaction table in `dbname.table_name` format. |
| `output_table` | string | — | Yes | Destination table for predictions and metrics. |
| `model_name` | string | — | Yes (training) | A name to register the trained model under. The same name is used at prediction time to load the model. |
| `model_type` | string | `bggg` | No | Modeling approach. Use `bggg` for the probabilistic model or `flaml` for the AutoML model. |
| `user_column` | string | `user_id` | No | Name of the customer ID column in your input table. |
| `amount_column` | string | `amount` | No | Name of the transaction amount column. |
| `timestamp_column` | string | `timestamp` | No | Name of the transaction timestamp column. |
| `prediction_period` | int | `3` | No | Forecast horizon in months. Common values are 3, 6, or 12. Used to define both calibration and holdout periods. |
| `min_transactions` | int | `3` | No | Minimum transactions per customer during the calibration period. Customers below this threshold are dropped. The floor is 3. |
| `split_strategy` | string | auto-set | No | Data-splitting approach: `single_cutoff` or `dual_cutoff`. We recommend leaving this unset so it defaults to the right strategy for your `model_type`. |


**FLAML-specific parameters** (used only when `model_type: flaml`):

| **Parameter**  | **Type**  | **Default**  | **Required**  | **Description**  |
|  --- | --- | --- | --- | --- |
| `optimization_goal` | string | `ranking` | No | What the AutoML search optimizes for. Use `value` to minimize dollar error (RMSE), or `ranking` to maximize Gini and rank top customers correctly. |
| `time_budget` | int | `60` | No | Total seconds allocated to the FLAML model search. Longer budgets explore more model configurations. |


**Tuning notes:**

* **`model_type: bggg`**: Pick this when you want interpretable behavioral parameters and a built-in churn signal, and when your customers have steady repeat-purchase patterns. It's the simplest starting point and works well for most use cases.
* **`model_type: flaml`**: Pick this when you have rich transaction history and want to tune training toward a specific business goal, such as customer ranking (Gini) or absolute value accuracy (RMSE). FLAML gives you more customization options, including `optimization_goal`, which is available only with FLAML.
* **`optimization_goal: value`**: Choose when you need accurate dollar predictions for budgeting or revenue forecasting.
* **`optimization_goal: ranking`**: Choose when you only need to identify your top customers correctly, for example to pick the top 10% for a VIP campaign.
* The `prediction_period` and your data horizon are linked. Dual cutoff (used by FLAML) splits the timeline into more pieces than single cutoff, so longer prediction horizons demand longer total histories. See the data splitting section below for the math.
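
The parameter rules above can be captured in a small helper that assembles the arguments before they are placed in a workflow. This is a sketch only: the function name `build_solution_arguments` is hypothetical, it assumes `prediction_period` and `min_transactions` are passed under `solution_arguments` like the other model settings, and rejecting FLAML-only options for BGGG is a stricter client-side choice than the pipeline itself makes.

```python
VALID_MODEL_TYPES = ("bggg", "flaml")
VALID_GOALS = ("value", "ranking")


def build_solution_arguments(model_name, model_type="bggg", prediction_period=3,
                             min_transactions=3, optimization_goal=None,
                             time_budget=None):
    """Assemble a solution_arguments mapping using the documented defaults."""
    if model_type not in VALID_MODEL_TYPES:
        raise ValueError(f"model_type must be one of {VALID_MODEL_TYPES}")
    if min_transactions < 3:
        raise ValueError("min_transactions has a floor of 3")
    if prediction_period < 1:
        raise ValueError("prediction_period must be at least 1 month")

    args = {
        "model_name": model_name,
        "model_type": model_type,
        "prediction_period": prediction_period,
        "min_transactions": min_transactions,
    }
    if model_type == "flaml":
        goal = "ranking" if optimization_goal is None else optimization_goal
        if goal not in VALID_GOALS:
            raise ValueError(f"optimization_goal must be one of {VALID_GOALS}")
        args["optimization_goal"] = goal
        args["time_budget"] = 60 if time_budget is None else time_budget
    elif optimization_goal is not None or time_budget is not None:
        # Assumption: fail early rather than send options BGGG would not use.
        raise ValueError("optimization_goal and time_budget apply only to flaml")
    # split_strategy is left out on purpose so the pipeline picks its default.
    return args
```

For example, `build_solution_arguments("retail_flaml", model_type="flaml", prediction_period=6, optimization_goal="value")` returns a mapping with a 60-second `time_budget` filled in.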


### Data Splitting Strategies

The two model families consume your data differently, and that drives a different splitting strategy for each. You typically will not need to set this by hand — the pipeline binds the right strategy to each `model_type` automatically — but understanding what is happening makes the parameter choices clearer.

**Single cutoff** (used by BGGG) divides the timeline once into a calibration period (used to fit the model) and a holdout period (used to evaluate it). Probabilistic models do not need labeled training examples in the calibration period — they extract behavioral parameters directly from transaction summaries — so a single split is enough.

**Dual cutoff** (used by FLAML) divides the timeline at two points (T1 and T2), producing a feature window, a label window for training, and a final evaluation window. Machine learning models need labeled training examples, and a single cutoff would force the model to peek at outcomes that would not have been available at prediction time. The dual-cutoff structure prevents this temporal leakage by simulating two historical prediction points.

A worked example: suppose your data spans January 2022 through December 2024 (a thirty-six-month horizon) and you choose a six-month prediction period. With single cutoff, the calibration period covers January 2022 through June 2024 and the holdout covers July through December 2024. With dual cutoff, training features come from January 2022 through December 2023, training labels come from January through June 2024, evaluation features extend through June 2024, and final evaluation labels come from July through December 2024. Notice that dual cutoff leaves you with less data to learn from, which is why FLAML benefits from longer histories — at minimum, plan for two years of training data.
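
The date arithmetic in this example can be reproduced with a short sketch. It assumes month-aligned boundaries and represents each window as a half-open `(start, end)` pair, so an end of `2024-07-01` means "through June 2024". The function names are illustrative, and the pipeline's exact boundary handling may differ.

```python
from datetime import date


def add_months(d, months):
    """Shift a first-of-month date by a whole number of months."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def single_cutoff(data_start, data_end, prediction_period):
    """One split: calibration window, then a holdout of prediction_period months."""
    cutoff = add_months(data_end, -prediction_period)
    return {"calibration": (data_start, cutoff), "holdout": (cutoff, data_end)}


def dual_cutoff(data_start, data_end, prediction_period):
    """Two splits (T1, T2): a training example and a later evaluation example."""
    t2 = add_months(data_end, -prediction_period)
    t1 = add_months(t2, -prediction_period)
    return {
        "train_features": (data_start, t1),
        "train_labels": (t1, t2),
        "eval_features": (data_start, t2),
        "eval_labels": (t2, data_end),
    }


# January 2022 through December 2024, six-month prediction period.
start, end = date(2022, 1, 1), date(2025, 1, 1)
single = single_cutoff(start, end, 6)
dual = dual_cutoff(start, end, 6)
```

Here `dual["train_features"]` ends at `2024-01-01`, six months earlier than `single["calibration"]`, which is the reduction in training data described above.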

If your horizon is short and you still want to use FLAML, consider shortening the `prediction_period` (six months instead of twelve, for example). Forcing FLAML to use single cutoff is possible by overriding `split_strategy`, but it introduces temporal leakage and the evaluation metrics become unreliable. We do not recommend this path.

## Example Workflow Code

CLTV AI Signals runs in two stages: a training workflow that fits and registers the model, and a prediction workflow that scores customers using the registered model. The two stages can run on different cadences. For example, you can train monthly and predict weekly, which keeps compute costs down without letting your scores go stale.

The workflow structure is the same for both modeling approaches; the difference is in the `solution_arguments` block. FLAML accepts `optimization_goal` and `time_budget` to control its AutoML search; BGGG ignores those parameters and runs with its default behavioral fitting. Pick the pair that matches the approach you chose.

### BGGG Training Workflow


```yaml
# cltv_train_bggg.dig
# Treasure Workflow: CLTV AI Signals — BGGG Training
# Fits the probabilistic (BG/NBD + Gamma-Gamma) model on your
# transaction history and registers it under `model_name`.
# The trained model produces both CLTV and churn predictions.

_export:
  user_column: user_id
  amount_column: amount
  timestamp_column: timestamp
  input_dataset: uc_irvine_online_retail
  model_type: bggg

# Step 1: Submit the CLTV training job
+train:
  http>: https://ml-batch-api.treasuredata.com/v1/runs
  method: POST
  headers:
    - authorization: ${secret:td.apikey}
    - X-TD-ML-SESSION-ID: ${session_id}
    - X-TD-ML-ATTEMPT-ID: ${attempt_id}
  store_content: true
  content:
    input_table: ml_sample_datasets.${input_dataset}
    output_table: some_db.${input_dataset}_${model_type}_outputs
    solution_name: cltv_train
    solution_arguments:
      model_name: ${input_dataset}_${model_type}
      user_column: ${user_column}
      amount_column: ${amount_column}
      timestamp_column: ${timestamp_column}
      model_type: ${model_type}
      # Note: optimization_goal and time_budget are FLAML-only and
      # are not used by BGGG. The split_strategy defaults to
      # single_cutoff automatically for BGGG.

+print_response:
  echo>: "Training job submitted. Response: ${http.last_content}"

# Step 2: Poll until training completes
+train_poll_status:
  http>: https://ml-batch-api.treasuredata.com/v1/runs/${JSON.parse(http.last_content)['id']}/status
  method: GET
  headers:
    - authorization: ${secret:td.apikey}
```

### BGGG Prediction Workflow


```yaml
# cltv_predict_bggg.dig
# Treasure Workflow: CLTV AI Signals — BGGG Prediction
# Loads the BGGG model registered under `model_name` and scores
# the input customers. Output includes both `pcltv_xmonth` and
# `pchurn_xmonth` per customer.

_export:
  user_column: user_id
  amount_column: amount
  timestamp_column: timestamp
  input_dataset: uc_irvine_online_retail
  model_type: bggg

# Step 1: Submit the prediction job
+pred:
  http>: https://ml-batch-api.treasuredata.com/v1/runs
  method: POST
  headers:
    - authorization: ${secret:td.apikey}
    - X-TD-ML-SESSION-ID: ${session_id}
    - X-TD-ML-ATTEMPT-ID: ${attempt_id}
  store_content: true
  content:
    input_table: ml_sample_datasets.${input_dataset}
    output_table: some_db.${input_dataset}_${model_type}_predictions
    solution_name: cltv_predict
    solution_arguments:
      model_name: ${input_dataset}_${model_type}     # Must match training run
      user_column: ${user_column}
      amount_column: ${amount_column}
      timestamp_column: ${timestamp_column}
      model_type: ${model_type}

+print_response:
  echo>: "Prediction job submitted. Response: ${http.last_content}"

# Step 2: Poll until prediction completes
+pred_poll_status:
  http>: https://ml-batch-api.treasuredata.com/v1/runs/${JSON.parse(http.last_content)['id']}/status
  method: GET
  headers:
    - authorization: ${secret:td.apikey}
```

### FLAML Training Workflow


```yaml
# cltv_train_workflow.dig
# Treasure Workflow: CLTV AI Signals — Training
# Submits a CLTV training job to the ML Batch API and polls for
# completion. Registers the trained model under `model_name` for
# later use in the prediction workflow.

_export:
  user_column: user_id
  amount_column: amount
  timestamp_column: timestamp
  input_dataset: uc_irvine_online_retail
  model_type: flaml                  # Use 'bggg' for the probabilistic model
  optimization_goal: ranking         # Or 'value' for dollar-accurate predictions
  time_budget: 60                    # Seconds for the FLAML search

# Step 1: Submit the CLTV training job
+train:
  http>: https://ml-batch-api.treasuredata.com/v1/runs
  method: POST
  headers:
    - authorization: ${secret:td.apikey}
    - X-TD-ML-SESSION-ID: ${session_id}
    - X-TD-ML-ATTEMPT-ID: ${attempt_id}
  store_content: true
  content:
    input_table: ml_sample_datasets.${input_dataset}
    output_table: some_db.${input_dataset}_${model_type}_outputs
    solution_name: cltv_train
    solution_arguments:
      model_name: ${input_dataset}_${model_type}
      user_column: ${user_column}
      amount_column: ${amount_column}
      timestamp_column: ${timestamp_column}
      model_type: ${model_type}
      optimization_goal: ${optimization_goal}
      time_budget: ${time_budget}

+print_response:
  echo>: "Training job submitted. Response: ${http.last_content}"

# Step 2: Poll until training completes
+train_poll_status:
  http>: https://ml-batch-api.treasuredata.com/v1/runs/${JSON.parse(http.last_content)['id']}/status
  method: GET
  headers:
    - authorization: ${secret:td.apikey}
```

### FLAML Prediction Workflow


```yaml
# cltv_predict_workflow.dig
# Treasure Workflow: CLTV AI Signals — Prediction
# Loads the trained model registered under `model_name` and scores
# the input customers, writing results to `output_table`.

_export:
  user_column: user_id
  amount_column: amount
  timestamp_column: timestamp
  input_dataset: uc_irvine_online_retail
  model_type: flaml

# Step 1: Submit the prediction job
+pred:
  http>: https://ml-batch-api.treasuredata.com/v1/runs
  method: POST
  headers:
    - authorization: ${secret:td.apikey}
    - X-TD-ML-SESSION-ID: ${session_id}
    - X-TD-ML-ATTEMPT-ID: ${attempt_id}
  store_content: true
  content:
    input_table: ml_sample_datasets.${input_dataset}
    output_table: some_db.${input_dataset}_${model_type}_predictions
    solution_name: cltv_predict
    solution_arguments:
      model_name: ${input_dataset}_${model_type}     # Must match training run
      user_column: ${user_column}
      amount_column: ${amount_column}
      timestamp_column: ${timestamp_column}
      model_type: ${model_type}

+print_response:
  echo>: "Prediction job submitted. Response: ${http.last_content}"

# Step 2: Poll until prediction completes
+pred_poll_status:
  http>: https://ml-batch-api.treasuredata.com/v1/runs/${JSON.parse(http.last_content)['id']}/status
  method: GET
  headers:
    - authorization: ${secret:td.apikey}
```

**What these workflows do:** The training workflow submits a job that fits the chosen model type, evaluates it on the holdout period, writes metrics to your output table, and registers the trained model under `model_name`. The prediction workflow loads that registered model and scores the customers in your input table, writing per-customer predictions to a separate output table. The polling steps handle the ML Batch API's behavior of returning HTTP 408 while a job is still running — Treasure Workflow automatically retries until it receives HTTP 200.
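
If you poll the status endpoint from your own code rather than from Treasure Workflow, the same retry behavior has to be written out explicitly. The sketch below shows the loop in Python under stated assumptions: `fetch_status` is a hypothetical callable that wraps the GET request and returns a `(status_code, body)` pair, and the 30-second interval and 120-attempt cap are arbitrary illustrative values.

```python
import time


def poll_until_done(fetch_status, interval_seconds=30.0, max_attempts=120,
                    sleep=time.sleep):
    """Call fetch_status until it reports HTTP 200, waiting while it reports 408."""
    for _ in range(max_attempts):
        code, body = fetch_status()
        if code == 200:
            return body  # job finished
        if code != 408:
            # Anything other than "still running" is treated as a failure.
            raise RuntimeError(f"unexpected HTTP status {code}")
        sleep(interval_seconds)
    raise TimeoutError("job still running after max_attempts polls")
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.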

**Why split training from prediction?** Training is expensive and only needs to run when your customer base or business has changed materially — typically once a month or once a quarter. Prediction is cheap and benefits from running often, usually weekly, so that recently active customers receive fresh scores. Splitting the two stages lets you tune each cadence independently.

**Security:** Store your TD API key as a Workflow secret named `td.apikey`. Never hardcode API keys in a workflow definition or commit them to version control.

## Modeling Approaches

CLTV AI Signals supports two modeling approaches under the same configuration surface. Each has different strengths, and the right choice depends on your data and what you need from the predictions.

### Probabilistic (BG/NBD + Gamma-Gamma)

The probabilistic approach models two underlying processes separately. The transaction process (BG/NBD) describes how often a customer purchases while they are still active and the probability that they have silently churned. The monetary process (Gamma-Gamma) describes the typical value of each transaction. Combining the two gives you both a CLTV forecast and a churn probability for the same window.
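
Both processes are usually fitted from a compact per-customer summary rather than from raw rows. The sketch below computes the summary conventionally used with BG/NBD and Gamma-Gamma models: `frequency` (repeat purchase days), `recency` (days between first and last purchase), `T` (days from first purchase to the end of calibration), and `monetary_value` (average spend per repeat purchase day). It is an illustration of that convention only; the solution performs its own summarization internally, and its exact definitions are not documented here.

```python
from collections import defaultdict
from datetime import date


def summarize(transactions, calibration_end):
    """transactions: iterable of (user_id, purchase_date, amount) tuples."""
    daily = defaultdict(lambda: defaultdict(float))
    for user_id, day, amount in transactions:
        daily[user_id][day] += amount  # same-day purchases count as one

    summary = {}
    for user_id, by_day in daily.items():
        days = sorted(by_day)
        first, last = days[0], days[-1]
        repeat_days = days[1:]  # the first purchase is not a repeat
        frequency = len(repeat_days)
        monetary = (
            sum(by_day[d] for d in repeat_days) / frequency if frequency else 0.0
        )
        summary[user_id] = {
            "frequency": frequency,
            "recency": (last - first).days,
            "T": (calibration_end - first).days,
            "monetary_value": monetary,
        }
    return summary
```

A customer whose `recency` is far smaller than `T` has gone quiet for a long time relative to their history, which is the pattern a churn probability reflects.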

This approach shines when your customers have steady, repeat-purchase patterns — think specialty retail, grocery, or subscription-adjacent categories — because its assumptions about purchase intervals match those settings well. It is also more interpretable than AutoML: each fitted parameter has a clear behavioral meaning, which helps when you need to explain results to stakeholders.

### AutoML (FLAML)

The AutoML approach treats CLTV as a supervised learning problem. It builds RFM-style features from the calibration window and trains a regression model to predict spend in the holdout window, automatically searching across model families and hyperparameters within your `time_budget`.

The optimization goal determines what the search prioritizes. With `optimization_goal: value`, FLAML minimizes RMSE, producing the most accurate dollar amounts — useful for budgeting or revenue forecasting. With `optimization_goal: ranking`, it optimizes for Gini, producing predictions that order customers correctly even if the absolute dollar amounts are off — useful when you only need to find your top customers for a VIP campaign or budget allocation. AutoML does not produce churn probabilities.
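
To make the difference between the two goals concrete, the sketch below computes RMSE and a normalized Gini on a tiny hand-made example. The Gini here is the common Lorenz-curve formulation, in which customers are ordered by a score and their actual spend is accumulated; this is an assumption about the metric, since the exact formula the pipeline uses is not specified in this guide. It also assumes total actual spend is positive.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted spend."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def gini(actual, scores):
    """Gini of actual spend when customers are ordered by scores, highest first."""
    n = len(actual)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    total = sum(actual)
    cumulative = 0.0
    area = 0.0
    for i in order:
        cumulative += actual[i]
        area += cumulative / total
    # Subtract the area expected from a random ordering, then scale by n.
    return (area - (n + 1) / 2) / n


def normalized_gini(actual, predicted):
    """Model Gini divided by the Gini of a perfect ordering (the label Gini)."""
    return gini(actual, predicted) / gini(actual, actual)


actual = [100.0, 50.0, 10.0, 0.0]
right_order_wrong_dollars = [1000.0, 500.0, 100.0, 1.0]
close_dollars_wrong_order = [60.0, 70.0, 10.0, 0.0]
```

In this example `right_order_wrong_dollars` has a normalized Gini of 1.0 despite a very large RMSE, while `close_dollars_wrong_order` has a much smaller RMSE but a normalized Gini below 1.0 because the top two customers are swapped.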

## Limitations and Known Issues

* **Not for first-time or prospective customers.** The model is built for existing repeat customers. Customers with fewer than three transactions in the calibration period are filtered out automatically because the underlying assumptions do not hold for them. For prospect or one-purchase customers, a different scoring approach is needed.
* **Not designed for fixed-price subscriptions.** When monetary value is flat across customers (as with cell phone plans), CLTV reduces to a churn forecast. The BGGG model provides a churn probability, but the solution is not optimized for churn as its primary objective.
* **Forecast accuracy depends on history length and quality.** The probabilistic model needs enough purchase intervals per customer to estimate behavioral parameters reliably, and FLAML with dual cutoff needs at least two years of total horizon for a six-month prediction period. Shorter horizons compress the training window and degrade evaluation reliability.
* **Long prediction horizons amplify uncertainty.** Predicting twelve-month CLTV is meaningfully harder than predicting three-month CLTV because more can change in a customer's life and your business over a longer window. Treat long-horizon predictions as directional rather than precise.
* **Single-cutoff with FLAML causes temporal leakage.** The pipeline allows this configuration to keep all options open, but the resulting metrics should not be trusted. Stick with the default `split_strategy` unless you have a specific reason to override it and understand the tradeoff.
* **Sensitive to data freshness.** Stale transaction data shifts every customer toward looking inactive, which biases predictions downward and inflates apparent churn risk. Schedule prediction runs to match your campaign cadence — weekly for fast-moving retail, monthly or quarterly for lower-frequency categories.


## Glossary

| **Term**  | **Definition**  |
|  --- | --- |
| CLTV | Customer Lifetime Value — the total revenue a single customer is expected to generate over a defined future window. |
| Calibration period | The historical window the model uses to learn customer behavior. |
| Holdout period | The future window held back during training and used to evaluate how well the model predicts. |
| Prediction period | The future window the model produces forecasts for, set by the `prediction_period` parameter (e.g., 6 months). |
| Single cutoff | A splitting strategy that divides the timeline once into a calibration period and a holdout period. Used by the BGGG model. |
| Dual cutoff | A splitting strategy that divides the timeline at two points (T1, T2) to produce a clean training example without temporal leakage. Used by FLAML. |
| Temporal leakage | When a model indirectly sees information that would not have been available at prediction time. Causes evaluation metrics to be artificially inflated. |
| BG/NBD | Beta-Geometric / Negative Binomial Distribution — a probabilistic model of customer purchase frequency and silent churn. |
| Gamma-Gamma | A probabilistic model of average transaction monetary value. Pairs with BG/NBD to produce CLTV forecasts. |
| FLAML | Fast and Lightweight AutoML — an open-source library that automatically searches model families and hyperparameters within a time budget. |
| Optimization goal | The metric FLAML's search tries to optimize. `value` minimizes dollar-error (RMSE); `ranking` maximizes ranking quality (Gini). |
| Gini coefficient | A measure of how well predictions rank customers from highest to lowest value. Higher is better. The Gini of the true labels (`label_gini`) is the ceiling, so a perfect ranking yields a normalized Gini of 1.0. |
| Normalized Gini | `model_gini` / `label_gini`. Values close to 1.0 mean the model ranks customers nearly as well as the true labels would allow. |
| Churn probability | The probability that a customer will stop purchasing within the prediction period. Produced only by the BGGG model. |
| Master Segment | A Treasure AI CDP construct that holds all customers scored by the model. Child Segments are subsets filtered by predicted CLTV percentile. |


## FAQs

| **Question**  | **Answer**  |
|  --- | --- |
| Which model type should I pick — BGGG or FLAML? | Start with BGGG as the default model. It provides built-in churn prediction and works well with steady repeat-purchase patterns or shorter transaction histories. If BGGG doesn't produce the intended results, or if you want to explicitly optimize for either ranking accuracy (identifying top customers) or value accuracy (predicting dollar amounts), switch to FLAML. FLAML requires a longer transaction history. If you're unsure which to use, run both and compare the holdout metrics — the pipeline reports the same evaluation metrics for each model, making the comparison straightforward. |
| What is the difference between `optimization_goal: value` and `optimization_goal: ranking`? | The `value` setting tells FLAML to minimize RMSE, so the model tries to predict dollar amounts that are as close as possible to the true future spend. Use this when you need accurate revenue forecasts or budgeting numbers. The `ranking` setting tells FLAML to maximize Gini, so the model tries to put your top customers at the top and your bottom customers at the bottom, even if the absolute dollar amounts drift. Use this when you only need to identify the top decile for a VIP campaign — the relative order matters more than the exact numbers. |
| How often should I retrain versus repredict? | Training is expensive and only needs to run when your customer base, product mix, or business model has shifted enough to change behavior — typically monthly or quarterly. Prediction is much cheaper and benefits from running often, usually weekly, so customers who have recently transacted get fresh scores. Splitting the two stages lets you tune each cadence independently and keeps compute costs down. |
| Why are some of my customers missing from the prediction output? | The pipeline filters out any customer with fewer than three transactions in the calibration period because the underlying models cannot fit reliable parameters with less data. This is controlled by `min_transactions` (the default and the minimum are both 3). Single-purchase or prospective customers are not in scope for this solution and need a different scoring approach. |
| Can I predict CLTV for prospective customers who have not purchased yet? | Not with this solution — it is designed for existing repeat customers. Prospective customer scoring uses different signals, like ad clicks, page views, and cart activity, rather than transaction history, and is handled by separate prospect-scoring models. |
| Why do my dual-cutoff FLAML metrics look worse than my single-cutoff metrics? | Because the dual-cutoff metrics are honest. Single-cutoff with FLAML produces temporal leakage, which inflates the metrics — the model is implicitly seeing information from the future. The dual-cutoff numbers are what you should expect to see in production. If you are seeing a meaningful gap between the two, that gap is the size of the leakage your model would have benefited from in single-cutoff mode, and it tells you the single-cutoff metrics were not trustworthy to begin with. |
| How do I activate CLTV scores in a campaign tool? | The output table contains `pcltv_xmonth` and `pcltv_xmonth_pctile` for every scored customer. You can build segments directly from these — for example, a "Very High Value" segment as percentile >= 80, or a "VIP" segment as the top 1%. These segments push to downstream activation destinations through standard CDP activation workflows alongside any other audience. |