{"templateId":"markdown","sharedDataIds":{"sidebar":"sidebar-sidebars.yaml"},"props":{"metadata":{"markdoc":{"tagList":["admonition"]},"redocly_category":"Integrations","type":"markdown"},"seo":{"title":"Embulk Bulk Import From Csv Files","description":"Treasure Data Product Documentation · Collect and Unify · Segment and Activate · Experiment and Analyze · Decisioning Automate with AI Scale and Trust.","siteUrl":"https://docs.treasuredata.com","lang":"en-US","llmstxt":{"hide":false,"sections":[{"title":"Table of contents","includeFiles":["**/*"],"excludeFiles":[]}],"excludeFiles":[]}},"dynamicMarkdocComponents":[],"compilationErrors":[],"ast":{"$$mdtype":"Tag","name":"article","attributes":{},"children":[{"$$mdtype":"Tag","name":"Heading","attributes":{"level":1,"id":"embulk-bulk-import-from-csv-files","__idx":0},"children":["Embulk Bulk Import From Csv Files"]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["You can import data from CSV files into Treasure Data using Embulk, an open-source bulk data loader. Embulk enables you to transfer data between various databases, storage locations, file formats, and cloud services."]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":[{"$$mdtype":"Tag","name":"img","attributes":{"src":"/assets/image-20191021-194315.1d20dc43b99774785560764470f9f3522e77b6b3188edfe480ebf89b39071dba.75c1e439.png","alt":""},"children":[]}]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["You can also import data from CSV files using the Bulk Import program (td-import). Be advised that td-import is not actively maintained and is a candidate for deprecation. 
Therefore, we strongly recommend using Embulk."]},{"$$mdtype":"Tag","name":"Heading","attributes":{"level":2,"id":"prerequisites","__idx":1},"children":["Prerequisites"]},{"$$mdtype":"Tag","name":"ul","attributes":{},"children":[{"$$mdtype":"Tag","name":"li","attributes":{},"children":["Basic knowledge of Treasure Data."]},{"$$mdtype":"Tag","name":"li","attributes":{},"children":["Basic knowledge of ",{"$$mdtype":"Tag","name":"MarkdownLink","attributes":{"href":"http://www.embulk.org/docs/"},"children":["Embulk"]},"."]},{"$$mdtype":"Tag","name":"li","attributes":{},"children":["Embulk is a Java application. Make sure that Java is installed."]},{"$$mdtype":"Tag","name":"li","attributes":{},"children":["Follow the instructions in ",{"$$mdtype":"Tag","name":"MarkdownLink","attributes":{"href":"/products/customer-data-platform/integration-hub/batch/import/bulk-data-import#installing-bulk-data-import"},"children":["Installing Bulk Data Import"]},"."]}]},{"$$mdtype":"Tag","name":"Heading","attributes":{"level":2,"id":"create-a-seed-configuration-file","__idx":2},"children":["Create a Seed Configuration File"]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["Using your favorite text editor, create an Embulk configuration file (for example, seed.yml) that defines the input file and the Treasure Data output parameters."]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":[{"$$mdtype":"Tag","name":"strong","attributes":{},"children":["Example"]}]},{"$$mdtype":"Tag","name":"CodeBlock","attributes":{"data-language":"yaml","header":{"controls":{"copy":{}}},"source":"in:\n  type: file\n  path_prefix: /path/to/files/sample_\nout:\n  type: td\n  apikey: xxxxxxxxxxxx\n  endpoint: api.treasuredata.com\n  database: dbname\n  table: tblname\n  time_column: time\n  mode: replace\n  default_timestamp_format: '%Y-%m-%d %H:%M:%S'\n","lang":"yaml"},"children":[]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["This is Sample 
Data."]},{"$$mdtype":"Tag","name":"CodeBlock","attributes":{"header":{"controls":{"copy":{}}},"source":"id,account,time,purchase,comment\n1,32864,2015-01-27 19:23:49,20150127,embulk\n2,14824,2015-01-27 19:01:23,20150127,embulk jruby\n3,27559,2015-01-28 02:20:02,20150128,\"Embulk \"\"csv\"\" parser plugin\"\n4,11270,2015-01-29 11:54:36,20150129,NULL\n"},"children":[]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["For further details about additional parameters available for embulk-local-file-input, refer to ",{"$$mdtype":"Tag","name":"MarkdownLink","attributes":{"href":"http://www.embulk.org/docs/built-in.html#local-file-input-plugin"},"children":["Embulk Local file input"]},". For details about embulk-output-td, refer to the ",{"$$mdtype":"Tag","name":"MarkdownLink","attributes":{"href":"https://github.com/treasure-data/embulk-output-td#td-output-plugin-for-embulk"},"children":["TD output plugin for Embulk"]},"."]},{"$$mdtype":"Tag","name":"Heading","attributes":{"level":2,"id":"guess-fields-generate-loadyml","__idx":3},"children":["Guess Fields (Generate load.yml)"]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["The Embulk guess option uses ",{"$$mdtype":"Tag","name":"code","attributes":{},"children":["seed.yml"]}," to read the target file, automatically guess the column types and settings, and create a new file, ",{"$$mdtype":"Tag","name":"code","attributes":{},"children":["load.yml"]},", with this information."]},{"$$mdtype":"Tag","name":"CodeBlock","attributes":{"data-language":"bash","header":{"controls":{"copy":{}}},"source":"embulk guess seed.yml -o load.yml\n","lang":"bash"},"children":[]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["The following is the generated load.yml file."]},{"$$mdtype":"Tag","name":"CodeBlock","attributes":{"data-language":"yaml","header":{"controls":{"copy":{}}},"source":"in:\n  type: file\n  path_prefix: /path/to/files/sample_\n  last_path: /path/to/files/sample_02.csv\n  parser:\n    charset: UTF-8\n    
newline: CRLF\n    type: csv\n    delimiter: ','\n    quote: '\"'\n    escape: '\"'\n    null_string: 'NULL'\n    trim_if_not_quoted: false\n    skip_header_lines: 1\n    allow_extra_columns: false\n    allow_optional_columns: false\n    columns:\n    - {name: id, type: long}\n    - {name: account, type: long}\n    - {name: time, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}\n    - {name: purchase, type: timestamp, format: '%Y%m%d'}\n    - {name: comment, type: string}\nout: {type: td, apikey: xxxxx, endpoint: api.treasuredata.com, database: dbname, table: tblname, time_column: time, mode: replace, default_timestamp_format: '%Y-%m-%d %H:%M:%S'}\n","lang":"yaml"},"children":[]},{"$$mdtype":"Tag","name":"Admonition","attributes":{"type":"info"},"children":[{"$$mdtype":"Tag","name":"p","attributes":{},"children":["Best Practice: Add the \"auto_create_table: true\" parameter to load.yml so that tables that do not exist are automatically created."]}]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["This is a sample of the auto_create_table parameter in a .yml file."]},{"$$mdtype":"Tag","name":"CodeBlock","attributes":{"data-language":"yaml","header":{"controls":{"copy":{}}},"source":"out:\n  type: td\n  apikey: your apikey\n  endpoint: api.treasuredata.com\n  database: dbname\n  table: tblname\n  time_column: created_at\n  auto_create_table: true\n  mode: append\n","lang":"yaml"},"children":[]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["You must create the database and table in TD before executing the load job."]},{"$$mdtype":"Tag","name":"ul","attributes":{},"children":[{"$$mdtype":"Tag","name":"li","attributes":{},"children":[{"$$mdtype":"Tag","name":"p","attributes":{},"children":["Alternative: If you:"]},{"$$mdtype":"Tag","name":"ul","attributes":{},"children":[{"$$mdtype":"Tag","name":"li","attributes":{},"children":["must add a database"]},{"$$mdtype":"Tag","name":"li","attributes":{},"children":["do not add the auto_create_table parameter 
in a .yml file and must add a table"]}]}]}]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["run the following TD commands:"]},{"$$mdtype":"Tag","name":"CodeBlock","attributes":{"data-language":"bash","header":{"controls":{"copy":{}}},"source":"td database:create dbname\ntd table:create dbname tblname\n","lang":"bash"},"children":[]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["You can also create the database and table using ",{"$$mdtype":"Tag","name":"MarkdownLink","attributes":{"href":"https://console.treasuredata.com/app/databases"},"children":["Treasure Console"]},"."]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["You can preview the data using the ",{"$$mdtype":"Tag","name":"code","attributes":{},"children":["embulk preview load.yml"]}," command. If any of the column types or data seem incorrect, you can edit the ",{"$$mdtype":"Tag","name":"code","attributes":{},"children":["load.yml"]}," file directly and preview again to verify. If the ",{"$$mdtype":"Tag","name":"code","attributes":{},"children":["guess"]}," option doesn’t yield satisfactory results, you can manually change the parameters in ",{"$$mdtype":"Tag","name":"code","attributes":{},"children":["load.yml"]}," to suit your requirements by using the ",{"$$mdtype":"Tag","name":"MarkdownLink","attributes":{"href":"http://www.embulk.org/docs/built-in.html#csv-parser-plugin"},"children":["CSV/TSV parser plugin options"]},"."]},{"$$mdtype":"Tag","name":"CodeBlock","attributes":{"data-language":"bash","header":{"controls":{"copy":{}}},"source":"embulk preview load.yml\n","lang":"bash"},"children":[]},{"$$mdtype":"Tag","name":"Heading","attributes":{"level":1,"id":"execute-load-job","__idx":4},"children":["Execute Load Job"]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["Issue the import job by running the following command:"]},{"$$mdtype":"Tag","name":"CodeBlock","attributes":{"data-language":"bash","header":{"controls":{"copy":{}}},"source":"embulk run 
load.yml\n","lang":"bash"},"children":[]},{"$$mdtype":"Tag","name":"p","attributes":{},"children":["It may take a few minutes to several hours for the job to complete, depending on the size of the data."]}]},"headings":[{"value":"Embulk Bulk Import From Csv Files","id":"embulk-bulk-import-from-csv-files","depth":1},{"value":"Prerequisites","id":"prerequisites","depth":2},{"value":"Create a Seed Configuration File","id":"create-a-seed-configuration-file","depth":2},{"value":"Guess Fields (Generate load.yml)","id":"guess-fields-generate-loadyml","depth":2},{"value":"Execute Load Job","id":"execute-load-job","depth":1}],"frontmatter":{"seo":{"title":"Embulk Bulk Import From Csv Files"}},"lastModified":"2026-06-02T03:56:21.000Z","pagePropGetterError":{"message":"","name":""}},"slug":"/int/embulk-bulk-import-from-csv-files","userData":{"isAuthenticated":false,"teams":["anonymous"]},"isPublic":true}