
Step 1: Prepare Your Databricks Data

Organize your Databricks tables to match the Parent Segment data model: one Customers table and one or more Behaviors tables. Tables should be managed or external tables within a Unity Catalog schema.

Customers Table

The Customers table is the single source of all profile data and attributes. Every column you want to use for segmentation must be in this table.

| Requirement | Description |
| --- | --- |
| Unique key column | A column with unique values per customer (e.g., cdp_customer_id) |
| No duplicate keys | Each row must represent a unique customer profile |
| All attributes | All customer properties for segmentation must be columns in this table |
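
The uniqueness requirements above can be checked before you connect the table. The following is a minimal sketch, assuming the key column is named cdp_customer_id and the table is named customers as in the examples in this guide; replace the `<catalog_name>` and `<schema_name>` placeholders with your own values.

```sql
-- Sketch only: table and column names are assumed from this guide's examples.

-- 1. List any key values that appear on more than one row.
--    An empty result means every row is a unique customer profile.
SELECT cdp_customer_id, COUNT(*) AS row_count
FROM <catalog_name>.<schema_name>.customers
GROUP BY cdp_customer_id
HAVING COUNT(*) > 1;

-- 2. Count rows with a missing key, which the GROUP BY check above
--    would report only as a single NULL group.
SELECT COUNT(*) AS rows_without_key
FROM <catalog_name>.<schema_name>.customers
WHERE cdp_customer_id IS NULL;
```

If the first query returns rows, deduplicate the table before using it as the Customers table.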

Example Customers Table

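| cdp_customer_id | email | first_name | last_name | city | country | gender | membership_tier | ltv | aov | next_best_channel | next_best_offer |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 001 | alice@example.com | Alice | Smith | Tokyo | JP | F | Gold | 5000 | 120 | email | discount_20 |
| 002 | bob@example.com | Bob | Jones | Osaka | JP | M | Silver | 2500 | 80 | push | free_shipping |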
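
For reference, a table with the same columns as the example could be declared along these lines. This is an illustrative sketch, not a required schema: the column types are assumptions (this guide does not specify them), and the table name customers is taken from the permission examples in this guide.

```sql
-- Sketch only: all column types below are assumptions, chosen to fit the
-- example rows. Adjust them to match your own data.
CREATE TABLE IF NOT EXISTS <catalog_name>.<schema_name>.customers (
  cdp_customer_id   STRING NOT NULL,  -- unique key; STRING keeps leading zeros such as '001'
  email             STRING,
  first_name        STRING,
  last_name         STRING,
  city              STRING,
  country           STRING,
  gender            STRING,
  membership_tier   STRING,
  ltv               DECIMAL(18, 2),   -- assumed numeric type
  aov               DECIMAL(18, 2),   -- assumed numeric type
  next_best_channel STRING,
  next_best_offer   STRING
);
```

Note that NOT NULL prevents missing keys but does not by itself prevent duplicate keys, so uniqueness still has to be ensured by the process that loads the table.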

Behaviors Tables

Each Behaviors table represents a specific type of customer activity. You can define multiple Behaviors tables (e.g., page views, purchases, email clicks).

| Requirement | Description |
| --- | --- |
| Key column | Must contain a column with the same customer identifier used in the Customers table (e.g., cdp_customer_id) to join the two tables |
| Time column | Timestamp column for the event (e.g., time) |
| Event columns | Additional columns describing the event details |
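
Because each Behaviors table is joined to the Customers table on the shared identifier, it can help to look for events whose key has no matching customer. The query below is a sketch under assumed names (behavior_pageviews, customers, and cdp_customer_id, as used in the examples in this guide). This guide does not state how unmatched events are handled, so treat the result as a diagnostic rather than a pass/fail check.

```sql
-- Sketch only: table and column names are assumed from this guide's examples.
-- Lists identifiers that occur in the Behaviors table but not in the
-- Customers table, with the number of affected events for each.
SELECT b.cdp_customer_id, COUNT(*) AS unmatched_events
FROM <catalog_name>.<schema_name>.behavior_pageviews AS b
LEFT JOIN <catalog_name>.<schema_name>.customers AS c
  ON b.cdp_customer_id = c.cdp_customer_id
WHERE c.cdp_customer_id IS NULL
GROUP BY b.cdp_customer_id;
```

Run the same query against each Behaviors table you plan to use.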

Example Behaviors Table

| cdp_customer_id | time | td_url | td_title |
| --- | --- | --- | --- |
| 001 | 2025-11-20 10:30:00 | https://example.com/products | Products Page |
| 001 | 2025-11-20 11:00:00 | https://example.com/cart | Shopping Cart |
| 002 | 2025-11-20 12:15:00 | https://example.com/sale | Sale Page |

Databricks Permissions

Ensure the service principal you plan to use has the required permissions in Unity Catalog. You can grant these via the Databricks UI (Catalog > select your catalog > Permissions > Grant) or via SQL:

-- Grant usage on the catalog and schema
GRANT USE CATALOG ON CATALOG <catalog_name> TO `<service_principal_name>`;
GRANT USE SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;

-- Grant SELECT on the Customers table
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.customers TO `<service_principal_name>`;

-- Grant SELECT on each Behaviors table
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.behavior_pageviews TO `<service_principal_name>`;
-- Repeat for each Behaviors table

-- Grant CREATE TABLE on the schema to create and clean up temporary tables during activation
GRANT CREATE TABLE ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;

Alternatively, use the Data Reader privilege preset in the Databricks UI to grant USE CATALOG, USE SCHEMA, EXECUTE, and SELECT at the catalog level. The Data Reader preset does not include CREATE TABLE, which activations also require, so grant it separately with the GRANT CREATE TABLE statement shown above.
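
Whichever method you use, you can review the result afterwards with SHOW GRANTS. The statements below are a sketch that reuses the placeholders and example table names from this section; run them as a user who is allowed to view grants on these objects.

```sql
-- Sketch only: placeholders and table names follow the examples above.
-- Each statement lists the privileges reported for the service principal
-- on the named object.
SHOW GRANTS `<service_principal_name>` ON CATALOG <catalog_name>;
SHOW GRANTS `<service_principal_name>` ON SCHEMA <catalog_name>.<schema_name>;
SHOW GRANTS `<service_principal_name>` ON TABLE <catalog_name>.<schema_name>.customers;
SHOW GRANTS `<service_principal_name>` ON TABLE <catalog_name>.<schema_name>.behavior_pageviews;
```

Check that the output covers USE CATALOG, USE SCHEMA, SELECT, and CREATE TABLE before moving on to the next step.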