# Step 1: Prepare Your Databricks Data

Organize your Databricks tables to match the Parent Segment data model: **one Customers table and one or more Behaviors tables.** Tables should be managed or external tables within a Unity Catalog schema.

## Customers Table

The Customers table is the **single source of all profile data and attributes.** Every column you want to use for segmentation **must be in this table.**

| Requirement | Description |
|  --- | --- |
| Unique key column | A column with **unique values per customer** (e.g., `cdp_customer_id`) |
| No duplicate keys | Each row must represent a unique customer profile |
| All attributes | All customer properties for segmentation must be columns in this table |

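Before connecting the table, it can help to confirm that the key column really is unique. The query below is a minimal sketch, assuming the table is named `customers` and the key column is `cdp_customer_id` as in the examples on this page; the `NULL` check is an extra assumption, since a row without a key cannot identify a profile.

```sql
-- Sketch only: lists key values that appear more than once, plus any NULL keys.
-- Table and column names are assumptions; replace them with your own.
SELECT
  cdp_customer_id,
  COUNT(*) AS row_count
FROM <catalog_name>.<schema_name>.customers
GROUP BY cdp_customer_id
HAVING COUNT(*) > 1
    OR cdp_customer_id IS NULL;
```

An empty result suggests that each row maps to exactly one customer key.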

### Example Customers Table

| cdp_customer_id | email | first_name | last_name | city | country | gender | membership_tier | ltv | aov | next_best_channel | next_best_offer |
|  --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 001 | alice@example.com | Alice | Smith | Tokyo | JP | F | Gold | 5000 | 120 | email | discount_20 |
| 002 | bob@example.com | Bob | Jones | Osaka | JP | M | Silver | 2500 | 80 | push | free_shipping |


## Behaviors Tables

Each Behaviors table represents a specific type of customer activity. You can define multiple Behaviors tables (e.g., page views, purchases, email clicks).

| Requirement | Description |
|  --- | --- |
| Key column | A column holding the same customer identifier used in the Customers table (e.g., `cdp_customer_id`), so the two tables can be joined |
| Time column | A timestamp column recording when the event occurred (e.g., `time`) |
| Event columns | Additional columns describing the event details |

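Because each Behaviors table is joined to the Customers table on the key column, events whose key has no matching profile will not join to any customer. The following is a rough check for such rows, assuming a Behaviors table named `behavior_pageviews` and the `cdp_customer_id` key used elsewhere on this page. Whether unmatched rows are a problem depends on your data; this page does not state that they are rejected.

```sql
-- Sketch only: counts events per key that has no matching row in the Customers table.
-- LEFT ANTI JOIN keeps the left-side rows that have no match on the right side.
SELECT
  b.cdp_customer_id,
  COUNT(*) AS unmatched_events
FROM <catalog_name>.<schema_name>.behavior_pageviews AS b
LEFT ANTI JOIN <catalog_name>.<schema_name>.customers AS c
  ON b.cdp_customer_id = c.cdp_customer_id
GROUP BY b.cdp_customer_id;
```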

### Example Behaviors Table

| cdp_customer_id | time | td_url | td_title |
|  --- | --- | --- | --- |
| 001 | 2025-11-20 10:30:00 | https://example.com/products | Products Page |
| 001 | 2025-11-20 11:00:00 | https://example.com/cart | Shopping Cart |
| 002 | 2025-11-20 12:15:00 | https://example.com/sale | Sale Page |

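For reference, a table shaped like the example above could be declared roughly as follows. The column types and `NOT NULL` constraints are assumptions for illustration, not requirements stated on this page, and the name `behavior_pageviews` is borrowed from the permissions example below.

```sql
-- Sketch only: column types and constraints are assumed, not prescribed.
CREATE TABLE IF NOT EXISTS <catalog_name>.<schema_name>.behavior_pageviews (
  cdp_customer_id STRING    NOT NULL,  -- same identifier as in the Customers table
  `time`          TIMESTAMP NOT NULL,  -- when the event occurred
  td_url          STRING,              -- event detail: page URL
  td_title        STRING               -- event detail: page title
);
```

Storing `cdp_customer_id` as a string preserves leading zeros such as `001`.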

## Databricks Permissions

Ensure the service principal you plan to use has the required permissions in Unity Catalog. You can grant these via the Databricks UI (**Catalog** > select your catalog > **Permissions** > **Grant**) or via SQL:


```sql
-- Grant usage on the catalog and schema
GRANT USE CATALOG ON CATALOG <catalog_name> TO `<service_principal_name>`;
GRANT USE SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;

-- Grant SELECT on the Customers table
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.customers TO `<service_principal_name>`;

-- Grant SELECT on each Behaviors table
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.behavior_pageviews TO `<service_principal_name>`;
-- Repeat for each Behaviors table

-- Grant CREATE TABLE on the schema to create and clean up temporary tables during activation
GRANT CREATE TABLE ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;
```
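
After granting, you can inspect what the principal actually holds. This is a minimal sketch using `SHOW GRANTS` with the same placeholders as the grant statements above. Depending on your workspace, the service principal may need to be referenced by its application ID rather than its display name; treat that as an assumption to verify in your environment.

```sql
-- Sketch only: list the privileges held by the service principal at each level.
SHOW GRANTS `<service_principal_name>` ON CATALOG <catalog_name>;
SHOW GRANTS `<service_principal_name>` ON SCHEMA <catalog_name>.<schema_name>;
SHOW GRANTS `<service_principal_name>` ON TABLE <catalog_name>.<schema_name>.customers;
```

The schema-level output should include both `USE SCHEMA` and `CREATE TABLE` if the grants were applied as written.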

Alternatively, use the **Data Reader** privilege preset in the Databricks UI to grant `USE CATALOG`, `USE SCHEMA`, `EXECUTE`, and `SELECT` at the catalog level. The Data Reader preset does not include `CREATE TABLE`, which activations also require, so grant it separately as shown above.