
DataWorks:Amazon Redshift

Last Updated: Mar 25, 2026

The Amazon Redshift data source provides a bidirectional channel to read from and write to Amazon Redshift. Use it to configure data synchronization tasks through the codeless UI or the code editor.

Supported sync modes

Capability                                   | Supported
Incremental sync (cursor-based, using where) | Yes
Concurrent reads (using splitPk)             | Yes
Write mode                                   | Insert only

Supported versions

Amazon Redshift is supported through the redshift-jdbc42 driver, version 2.1.0.1. For driver details, see Configure a JDBC driver for Amazon Redshift.
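If you want the same driver locally (for example, to verify connectivity before configuring the data source), Amazon publishes it to Maven Central. The coordinates below are an assumption inferred from the driver name and version above; verify them against your repository:

```xml
<!-- Assumed Maven coordinates for the Amazon Redshift JDBC 4.2 driver -->
<dependency>
    <groupId>com.amazon.redshift</groupId>
    <artifactId>redshift-jdbc42</artifactId>
    <version>2.1.0.1</version>
</dependency>
```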

Supported field types

The following table lists the supported field types and their SQL and Java mappings. For the full type reference, see the official Amazon Redshift documentation.

Amazon Redshift type | SQL type           | Java type
BIGINT               | SQL_BIGINT         | Long
BOOLEAN              | SQL_BIT            | Boolean
CHAR                 | SQL_CHAR           | String
DATE                 | SQL_TYPE_DATE      | java.sql.Date
DECIMAL              | SQL_NUMERIC        | BigDecimal
DOUBLE PRECISION     | SQL_DOUBLE         | Double
GEOMETRY             | SQL_LONGVARBINARY  | byte[]
INTEGER              | SQL_INTEGER        | Integer
OID                  | SQL_BIGINT         | Long
SUPER                | SQL_LONGVARCHAR    | String
REAL                 | SQL_REAL           | Float
SMALLINT             | SQL_SMALLINT       | Short
TEXT                 | SQL_VARCHAR        | String
TIME                 | SQL_TYPE_TIME      | java.sql.Time
TIMETZ               | SQL_TYPE_TIME      | java.sql.Time
TIMESTAMP            | SQL_TYPE_TIMESTAMP | java.sql.Timestamp
TIMESTAMPTZ          | SQL_TYPE_TIMESTAMP | java.sql.Timestamp
VARCHAR              | SQL_VARCHAR        | String

Prerequisites

Before you begin, ensure that you have:

  • Established network connectivity between Amazon Redshift and a serverless resource group or an exclusive resource group for Data Integration. Use a serverless resource group for most synchronization tasks. For setup instructions, see Network connectivity solutions.

  • The Amazon Redshift host, port, database name, username, and password.

Add the data source

Add Amazon Redshift as a data source in DataWorks by following the steps in Data source management. Check the parameter descriptions in the DataWorks console as you fill in each field.

The following parameters are specific to Amazon Redshift:

  • JDBC URL: The Java Database Connectivity (JDBC) connection string, which must include the host IP address, port number, database name, and any connection parameters. Both public and private IP addresses are supported. If you use a public IP address, confirm that the Data Integration resource group can reach the Amazon Redshift host.

  • Username: The username for the Amazon Redshift database.

  • Password: The password for the specified username.
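For reference, Amazon Redshift JDBC connection strings follow the standard format shown below. The host and database name in the second line are placeholders, not a real endpoint; Amazon Redshift listens on port 5439 by default:

```
jdbc:redshift://<host>:<port>/<database>
jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev
```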

Develop a data sync task

To configure an offline sync task for a single table, use the codeless UI or the code editor in DataWorks Data Integration.

For code editor script parameters and examples, see Appendix: Script examples and parameter descriptions.

Appendix: Script examples and parameter descriptions

The following Reader and Writer scripts follow the unified script format for batch synchronization tasks. For format requirements, see Configure a task in the code editor.

Reader

Script example

{
  "stepType": "redshift",
  "parameter": {
    "datasource": "redshift_datasource",
    "table": "redshift_table_name",
    "where": "xxx=3",
    "splitPk": "id",
    "column": [
      "id",
      "table_id",
      "table_no",
      "table_name",
      "table_status"
    ]
  },
  "name": "Reader",
  "category": "reader"
}

Parameters

Parameter  | Description | Required | Default
datasource | The data source name. Must match the name of the data source you added in DataWorks. | Yes | None
table      | The name of the source table. | Yes | None
column     | The columns to sync, as a list. To sync all columns, use ["*"]. | Yes | None
where      | A filter condition used to build the SQL query. For incremental sync, use a condition such as gmt_create>$bizdate to sync only records from the current day. If left blank, all rows are synced. | No | None
splitPk    | The column used for data partitioning. When specified, the Reader splits the data and runs concurrent tasks to improve sync throughput. | No | None
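To make the splitPk behavior concrete, the sketch below shows one way a reader could shard a numeric primary key into ranges and build one SELECT per shard, with the where condition applied to every shard. This is an illustration of the partitioning idea only, not the actual DataWorks algorithm; all function names are hypothetical.

```python
def split_ranges(min_pk, max_pk, num_splits):
    """Divide [min_pk, max_pk] into contiguous, non-overlapping
    shard ranges (illustrative, not the real splitting algorithm)."""
    step = max(1, (max_pk - min_pk + 1) // num_splits)
    ranges = []
    lo = min_pk
    while lo <= max_pk:
        hi = min(lo + step - 1, max_pk)
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges

def shard_query(table, columns, pk, lo, hi, where=None):
    """Build the SELECT for one shard; the user-supplied `where`
    filter is ANDed with the shard's primary-key range."""
    cond = f"{pk} >= {lo} AND {pk} <= {hi}"
    if where:
        cond = f"({where}) AND {cond}"
    return f"SELECT {', '.join(columns)} FROM {table} WHERE {cond}"
```

With four shards running concurrently, each query scans only its own key range, which is why specifying splitPk improves throughput on large tables.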

Writer

Script example

{
  "stepType": "redshift",
  "parameter": {
    "preSql": ["delete from XXX;"],
    "postSql": ["delete from XXX;"],
    "datasource": "redshift_datasource",
    "table": "redshift_table_name",
    "writeMode": "insert",
    "batchSize": 2048,
    "column": [
      "id",
      "table_id",
      "table_no",
      "table_name",
      "table_status"
    ]
  },
  "name": "Writer",
  "category": "writer"
}

Parameters

Parameter | Description | Required | Default
datasource | The data source name. Must match the name of the data source you added in DataWorks. | Yes | None
table     | The name of the destination table. | Yes | None
column    | The destination columns to write to, as a list. To write to all columns in order, use ["*"]. | Yes | None
preSql    | SQL to run before the sync task starts, such as a statement to clear old data. The codeless UI supports one statement; the code editor supports multiple. | No | None
postSql   | SQL to run after the sync task completes, such as a statement to add a timestamp. The codeless UI supports one statement; the code editor supports multiple. | No | None
batchSize | The maximum number of rows to write per batch. | No | 2048
writeMode | The write mode. Only insert is supported. | No | insert
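The batchSize parameter controls how many rows are buffered before each multi-row INSERT is issued. The sketch below illustrates that buffering pattern; the class and method names are hypothetical and do not reflect the DataWorks implementation.

```python
class BatchedInsertWriter:
    """Illustrative buffer-and-flush writer: rows accumulate until
    batch_size is reached, then the batch is written in one INSERT."""

    def __init__(self, batch_size=2048):
        self.batch_size = batch_size
        self.buffer = []
        self.flushed_batches = []  # stands in for executed INSERT batches

    def write(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            # A real writer would execute a multi-row INSERT here.
            self.flushed_batches.append(list(self.buffer))
            self.buffer.clear()
```

A larger batchSize means fewer round trips to Amazon Redshift at the cost of more memory per batch; the final partial batch is flushed when the task finishes.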