
DataWorks: Snowflake data source

Last Updated: Dec 08, 2025

DataWorks Data Integration supports reading data from and writing data to Snowflake. This topic describes the data synchronization capabilities that are available for Snowflake.

Limits

Snowflake Reader and Snowflake Writer use the snowflake-jdbc:3.20.0 driver. For more information about the capabilities of this driver, see the Snowflake JDBC Driver Release Notes.

Supported field types

Snowflake Reader and Snowflake Writer support most common relational data types, such as numeric and character types. However, some data types are not supported, so verify that every data type you use appears as supported in the following table.

Field type | Offline read (Snowflake Reader) | Offline write (Snowflake Writer)
-----------|---------------------------------|---------------------------------
NUMBER | Supported | Supported
DECIMAL | Supported | Supported
NUMERIC | Supported | Supported
INT | Supported | Supported
INTEGER | Supported | Supported
BIGINT | Supported | Supported
SMALLINT | Supported | Supported
TINYINT | Supported | Supported
BYTEINT | Supported | Supported
FLOAT | Supported | Supported
FLOAT4 | Supported | Supported
FLOAT8 | Supported | Supported
DOUBLE | Supported | Supported
DOUBLE PRECISION | Supported | Supported
CHAR | Supported | Supported
VARCHAR | Supported | Supported
STRING | Supported | Supported
TEXT | Supported | Supported
BINARY | Supported | Supported
VARBINARY | Supported | Supported
BOOLEAN | Supported | Supported
DATE | Supported | Supported
TIME | Supported | Supported
TIMESTAMP | Supported | Supported
TIMESTAMP_NTZ | Supported | Supported
TIMESTAMP_LTZ | Supported | Supported
TIMESTAMP_TZ | Not supported | Supported
OBJECT | Not supported | Not supported
ARRAY | Not supported | Not supported
VARIANT | Not supported | Not supported
GEOGRAPHY | Not supported | Not supported
GEOMETRY | Not supported | Not supported
VECTOR | Not supported | Not supported
FILE | Not supported | Not supported

Create a data source

Before you develop a data synchronization task, you must create a corresponding data source in DataWorks. For more information about the procedure, see Data Source Management. For detailed descriptions of the configuration parameters, see the tooltips on the configuration page.

Develop a data synchronization task

For information about the entry point and the general process for configuring a data synchronization task, see the following guide:

Configure an offline sync task for a single table

Appendix: Script demos and parameter descriptions

Configure an offline task script

If you use the code editor to configure an offline task for a single table, you must write the script in the required format. For more information, see Configure a sync task in the code editor. The following sections describe the parameters for the data source in the code editor.

Reader script demo

{
    "transform": false,
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "snowflake",
            "parameter": {
                "schema": "PUBLIC",
                "envType": 0,
                "datasource": "snowflake_datasource",
                "column": [
                    "ID",
                    "NAME"
                ],
                "where": "",
                "table": "table"
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "stream",
            "parameter": {
                "print": true
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": false,
            "concurrent": 2
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}

Reader script parameters

datasource

The name of the Snowflake data source.

Required: Yes. Default value: None.

table

The name of the table from which you want to sync data.

Required: Yes. Default value: None.

schema

The schema in which the Snowflake table that you want to read resides.

Required: Yes. Default value: None.

column

The columns that you want to sync from the source table, specified as a JSON array of field descriptions. To read all columns, use ["*"].

  • You can select a subset of columns to export.

  • You can export columns in an order that differs from the table schema.

  • You can configure constants in JSON format. Example: ["id", "1", "'bazhen.csy'", "null", "to_char(a + 1)", "2.3", "true"].

    • id is a regular column name.

    • 1 is an integer constant.

    • 'bazhen.csy' is a string constant.

    • null is a null value.

    • to_char(a + 1) is a function expression.

    • 2.3 is a floating-point constant.

    • true is a Boolean value.

  • The column parameter must explicitly specify the columns that you want to sync. It cannot be empty. A configuration sketch follows this description.

Required: Yes. Default value: None.
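For illustration, the following hedged reader-step fragment, in the same format as the demo above, reads a column subset in a custom order and appends a string constant. The table name ORDERS, the column names, and the constant are placeholders, not values defined by the connector:

{
    "stepType": "snowflake",
    "parameter": {
        "schema": "PUBLIC",
        "datasource": "snowflake_datasource",
        "table": "ORDERS",
        "column": [
            "NAME",
            "ID",
            "'batch_a'"
        ],
        "where": ""
    },
    "name": "Reader",
    "category": "reader"
}

Each synced record carries NAME and ID from the source row, followed by the constant string batch_a.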

splitFactor

The split factor. This parameter specifies the number of chunks into which the data is split for synchronization. If you configure multiple concurrent threads, the data is split into Concurrency × splitFactor chunks. For example, if Concurrency is set to 5 and splitFactor is set to 5, the data is split into 25 chunks (5 × 5) and processed by 5 concurrent threads.

Note

Set this parameter to a value from 1 to 100. A value that is too large may cause an out-of-memory (OOM) error.

splitPk

When Snowflake Reader extracts data, you can specify the splitPk parameter to shard data on the specified primary key column. The data synchronization system can then run concurrent tasks to improve efficiency. A combined sketch of splitPk, splitFactor, and the concurrency setting follows this list.

  • Set splitPk to the primary key of the table. The values in a primary key column are typically distributed evenly, which prevents data hot spots in the shards.

  • The splitPk parameter currently supports sharding only on integer columns. If you specify a column of another data type, such as a string, floating-point, or date column, Snowflake Reader reports an error.
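The following minimal sketch shows how these parameters combine with the job-level concurrency setting. ORDERS and ID are placeholder table and primary key names, and the writer step and other job settings are omitted:

{
    "steps": [
        {
            "stepType": "snowflake",
            "parameter": {
                "schema": "PUBLIC",
                "datasource": "snowflake_datasource",
                "table": "ORDERS",
                "column": ["*"],
                "splitPk": "ID",
                "splitFactor": 5
            },
            "name": "Reader",
            "category": "reader"
        }
    ],
    "setting": {
        "speed": {
            "throttle": false,
            "concurrent": 5
        }
    }
}

With concurrent set to 5 and splitFactor set to 5, the reader splits the table into 25 shards on ID and processes them on 5 parallel threads.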

where

The filter condition. Snowflake Reader combines the specified column, table, and where parameters into an SQL statement that it uses to extract data. For example, for testing purposes, you can set the where condition to limit 10.

For incremental synchronization, you can set the where condition to gmt_create>${bizdate}, as in the sketch after this list.

  • The where condition allows for efficient incremental data synchronization.

  • If you do not configure this parameter or leave it empty, all data in the table is synced.

Required: No. Default value: None.
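As a hedged sketch of a daily incremental sync, the following fragment filters on a placeholder gmt_create column of a placeholder ORDERS table; ${bizdate} is resolved by the DataWorks scheduling system at run time:

{
    "stepType": "snowflake",
    "parameter": {
        "schema": "PUBLIC",
        "datasource": "snowflake_datasource",
        "table": "ORDERS",
        "column": ["*"],
        "where": "gmt_create > ${bizdate}"
    },
    "name": "Reader",
    "category": "reader"
}

Because the reader assembles column, table, and where into a single SQL statement, this configuration extracts only the rows created on the business date of the run.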

fetchSize

The number of records to retrieve in each batch from the database server. A larger value reduces the number of network round trips between Data Integration and the server, which improves data extraction performance.

Note

A fetchSize value greater than 2048 may cause an OOM error in the data synchronization process.

Required: No. Default value: 1024.
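For example, on a stable network you might raise fetchSize toward the 2048 guardrail to reduce round trips. This is a hedged fragment with a placeholder ORDERS table:

{
    "stepType": "snowflake",
    "parameter": {
        "schema": "PUBLIC",
        "datasource": "snowflake_datasource",
        "table": "ORDERS",
        "column": ["*"],
        "fetchSize": 2048
    },
    "name": "Reader",
    "category": "reader"
}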

Writer script demo

{
    "transform": false,
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "snowflake",
            "parameter": {
                "schema": "PUBLIC",
                "envType": 0,
                "datasource": "snowflake_datasource",
                "column": [
                    "ID",
                    "NAME"
                ],
                "table": "TABLE"
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "executeMode": null,
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "concurrent": 2,
            "throttle": false
        }
    },
    "order": {
        "hops": []
    }
}

Writer script parameters

datasource

The name of the Snowflake data source.

Required: Yes. Default value: None.

table

The name of the table to which you want to sync data.

Required: Yes. Default value: None.

schema

The schema in which the destination Snowflake table resides.

Required: Yes. Default value: None.

column

The columns in the destination table to which you want to write data. Separate the column names with commas (,). Example: "column": ["id", "name", "age"]. To write data to all columns in order, use an asterisk (*). Example: "column": ["*"].

Required: Yes. Default value: None.

batchSize

The number of records to write in a single batch. A larger value can significantly reduce the number of network round trips between Data Integration and Snowflake and improve overall throughput. However, a value that is too large may cause an OOM error in the Data Integration process.

Required: No. Default value: 1024.
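To close, here is a hedged writer-step fragment that raises batchSize to trade memory for fewer round trips; ORDERS is a placeholder table name:

{
    "stepType": "snowflake",
    "parameter": {
        "schema": "PUBLIC",
        "datasource": "snowflake_datasource",
        "table": "ORDERS",
        "column": ["ID", "NAME"],
        "batchSize": 2048
    },
    "name": "Writer",
    "category": "writer"
}

If the synchronization process runs out of memory, lower batchSize back toward the default of 1024.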