# Step 1: Prepare Your BigQuery Data

Organize your BigQuery tables to match the Composable Parent Segment data model: **one Customers table and one or more Behaviors tables.** Tables should reside within a BigQuery dataset in your GCP project.

## Customers Table

The Customers table is the **single source of all profile data and attributes.** Every column you want to use for segmentation **must be in this table.**

| Requirement | Description |
|  --- | --- |
| Unique key column | A column with **unique values per customer** (e.g., `cdp_customer_id`) |
| No duplicate keys | Each row must represent a unique customer profile |
| All attributes | All customer properties for segmentation must be columns in this table |


### Example Customers Table

| cdp_customer_id | email | first_name | last_name | city | country | gender | membership_tier | ltv | aov | next_best_channel | next_best_offer |
|  --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 001 | alice@example.com | Alice | Smith | Tokyo | JP | F | Gold | 5000 | 120 | email | discount_20 |
| 002 | bob@example.com | Bob | Jones | Osaka | JP | M | Silver | 2500 | 80 | push | free_shipping |


## Behaviors Tables

Each Behaviors table represents a specific type of customer activity. You can define multiple Behaviors tables (e.g., page views, purchases, email clicks).

| Requirement | Description |
|  --- | --- |
| Key column | Must contain a column with the same customer identifier used in the Customers table (e.g., `cdp_customer_id`) as join key to link back to the Customers table |
| Time column | Timestamp column for the event (e.g., `time`) -- must be Unix timestamp (INT64) or TIMESTAMP type |
| Event columns | Additional columns describing the event details |


### Example Behaviors Table

| cdp_customer_id | time | td_url | td_title |
|  --- | --- | --- | --- |
| 001 | 2025-11-20 10:30:00 | https://example.com/products | Products Page |
| 001 | 2025-11-20 11:00:00 | https://example.com/cart | Shopping Cart |
| 002 | 2025-11-20 12:15:00 | https://example.com/sale | Sale Page |


## BigQuery IAM Permissions

Ensure the service account you plan to use has the required IAM roles. You can grant these via the Google Cloud Console (**IAM & Admin** > **IAM** > **Grant Access**) or via `gcloud`:

```shell
# Grant job execution permission at project level
gcloud projects add-iam-policy-binding <project_id> \
  --member="serviceAccount:<sa_name>@<project_id>.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"

# OR: Grant read+write access at dataset level (required for activations)
bq add-iam-policy-binding \
  --member="serviceAccount:<sa_name>@<project_id>.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor" \
  <project_id>:<dataset_name>
```

| Role | Scope | When to use |
|  --- | --- | --- |
| `roles/bigquery.jobUser` | Project level | Always required -- allows Treasure AI to run query jobs |
| `roles/bigquery.dataEditor` | Dataset level | Zero-copy queries + activations (read + write) |


Also ensure the **BigQuery API** is enabled in your GCP project: **APIs & Services** > **Enabled APIs** > **BigQuery API**.