Adding a Custom Python Script to Your Workflow

You can run Python scripts from Treasure Workflow using the Python operator (py>:). Create the workflow definition either in Treasure Console or with Treasure Workflow from the command line.

In the workflow definition, specify a Docker image to use for running the Python script. When the workflow task starts, a new Docker container is created based on the specified Docker image. Then, the Python script is executed in the container in an isolated environment.

Prerequisites

  • Make sure this feature is enabled for your Treasure account.

  • Basic knowledge of Treasure Workflow syntax.

  • If you intend to use the CLI, you need to do the following:

Python Examples

You might want to view examples for basics such as:

  • How to call functions

  • How to pass parameters to functions

  • How to use environment variables

  • How to import functions
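The basics listed above can be sketched in one small script. This is an illustrative sketch, not an official sample: the function names, the parameter names, and the EXAMPLE_SETTING variable are all hypothetical, and it assumes that the py> operator supplies function arguments by name from the task's parameters.

```python
import os


def format_greeting(name, greeting="Hello"):
    """Build a message from arguments (hypothetical parameter names)."""
    return f"{greeting}, {name}!"


def print_greeting(name="workflow", greeting="Hello"):
    """Entry point that a py> task could reference as <script_filename>.print_greeting."""
    # Calls another function defined in the same script.
    print(format_greeting(name, greeting))


def read_setting(var_name="EXAMPLE_SETTING", default="not set"):
    """Read an environment variable, falling back to a default when it is absent."""
    # EXAMPLE_SETTING is a made-up variable name used only for illustration.
    return os.environ.get(var_name, default)
```

Keeping the message formatting separate from the printing makes the functions easy to try locally before adding them to a workflow.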

Add your Python Script to Treasure Workflow

Using the command line method is recommended if you have more than a few scripts to add.

Using Treasure Console

  1. Navigate to Data Workbench > Workflows.

  2. Select the workflow to which you would like to add the Python scripts.

  3. Select Launch Project Editor.

  4. Select Edit Files.

  5. Select Add New File.

  6. Enter a name for your .dig file.

  7. Add the py> operator and specify the Docker image that you want to use. Your workflow definition might look like this sample:

+py_custom_code:
  py>: tasks.printMessage
  docker:
    image: "treasuredata/customscript-python:3.12.11-td2"

For the latest available images, see Custom Scripts Docker Images.

  8. Add each script as a new file, or copy and paste the text of each script into the new script editor window.

  9. Select Save & Commit.
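The sample task in step 7 refers to tasks.printMessage, that is, a function named printMessage in a script named tasks.py. A minimal sketch of such a script follows; the function body and its default message are assumptions made for illustration.

```python
# tasks.py (file name assumed from "tasks" in "py>: tasks.printMessage")


def printMessage(message="Hello from a custom Python script"):
    """Print a message so that it shows up in the task's log output."""
    # The default message is illustrative only.
    print(message)
```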

Using td CLI

You can add a Python script to your existing workflow using the command line. However, new users may need to create a workflow using the command line first.

  1. Add a workflow definition .dig file and Python script to the workflow directory.

  2. Specify a Docker image you want to use for the py>: operator in the .dig file.

  3. Add syntax similar to the following to your workflow .dig file to add the py> operator and specify the Docker image:

+<wf_task_name>:
  py>: <script_filename>.<function_name>
  docker:
    image: "<image_name>:<version>"

  4. Push the workflow to Treasure Data using the td CLI command td wf push <project_name>.

Docker Images

Treasure Data manages and runs the Python scripts in Treasure Workflow in isolated Docker containers, and provides several base Docker images from which those containers are created. Choose the image for your Python script based on the Python version and libraries that the image provides, and specify it under docker: in the task definition:

+task_name:
  py>: <script_filename>.<function_name>
  docker:
    image: "<image_name>:<version>"

For available image names and versions, see Custom Scripts Docker Images.

Install Your Own Python Libraries

In addition to the libraries provided by the Docker image, you can install third-party libraries by running pip install from within the Python script.

Add the following to your Python script to install a library:

import os
import sys

os.system(f"{sys.executable} -m pip install asn1==3.1.0")

WARNING: Python package version pinning

Install or update Python packages with a specific, pinned version to avoid unexpected issues caused by future releases of those packages.

For example, use the following syntax to install a specific version of a package:

os.system(f"{sys.executable} -m pip install asn1==3.1.0")

Do not use the following syntax without specifying a version, as it may lead to unexpected issues when the package is updated in the future:

os.system(f"{sys.executable} -m pip install -U pytd")
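As one way to enforce the pinning advice, the install call can be wrapped in a small helper that rejects requirements without an exact version. This is a sketch under assumptions: the helper names are hypothetical, the "name==version" check is deliberately strict, and subprocess.check_call is swapped in for os.system so that a failed install raises an error instead of passing silently.

```python
import re
import subprocess
import sys

# Accept only "name==version" requirements (an intentionally strict, illustrative rule).
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[A-Za-z0-9][A-Za-z0-9.!+_-]*$")


def build_pip_command(requirement):
    """Return the pip command for a pinned requirement, or raise ValueError."""
    if not PINNED.match(requirement):
        raise ValueError(f"requirement is not pinned to a version: {requirement!r}")
    # sys.executable targets the same interpreter that runs this script.
    return [sys.executable, "-m", "pip", "install", requirement]


def pip_install(requirement):
    """Install a pinned requirement; raises CalledProcessError if pip fails."""
    subprocess.check_call(build_pip_command(requirement))
```

For example, pip_install("asn1==3.1.0") runs the same command as the pinned example above, while pip_install("pytd") raises ValueError before pip is invoked.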

Using Docker Images on Your Local Laptop

The Docker images are also published on Docker Hub, so you can pull and run them on your laptop for evaluation or testing purposes.

Prerequisite: Docker runtime installed.

Note

The examples below use treasuredata/customscript-python:3.12.11-td2. For the latest available image, see Custom Scripts Docker Images. Replace the image tag in any command before running it.

To confirm the Python version, run the following on your laptop:

$ docker run -it --rm treasuredata/customscript-python:3.12.11-td2 python --version

To start an interactive shell session in the container, run:

$ docker run -it --rm treasuredata/customscript-python:3.12.11-td2 bash
$ whoami
> td-user

The Python interactive shell launches when you run the image without arguments:

$ docker run -it --rm treasuredata/customscript-python:3.12.11-td2
Python 3.12.11 (main, Aug 12 2025, 22:47:31) [GCC 12.2.0] on linux
>>>

You can get a complete list of library versions using pip freeze:

$ docker run -it --rm treasuredata/customscript-python:3.12.11-td2 pip freeze
> aiohappyeyeballs==2.6.1
> aiohttp==3.12.15
> aiosignal==1.4.0
> alembic==1.16.4
>
$ docker run -it --rm treasuredata/customscript-python:3.12.11-td2 pip freeze | grep scikit
> scikit-learn==1.7.1
$ docker run -it --rm treasuredata/customscript-python:3.12.11-td2 pip freeze | grep pytd
> pytd==2.2.0
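When comparing the versions an image provides against the versions a script expects, it can help to turn pip freeze output into a dictionary. The following is a minimal sketch; the function names are hypothetical, and only "name==version" lines are parsed, so other line formats that pip freeze can emit are skipped.

```python
def parse_freeze(text):
    """Map package names to versions from 'name==version' lines."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" not in line:
            continue  # skip blank lines and any non "name==version" entries
        name, version = line.split("==", 1)
        versions[name.strip().lower()] = version.strip()
    return versions


def find_mismatches(installed, expected):
    """Return {name: (expected, installed_or_None)} for versions that differ."""
    return {
        name: (wanted, installed.get(name.lower()))
        for name, wanted in expected.items()
        if installed.get(name.lower()) != wanted
    }
```

For example, save the output of pip freeze to a file, read it in Python, and pass the text to parse_freeze before comparing it with the versions your script was developed against.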