The AnalyticDB for MySQL 3.0 data source lets you read from and write to AnalyticDB for MySQL 3.0 in DataWorks. It supports batch and real-time synchronization in both single-table and full-database modes.
Limitations
- Synchronization tasks for a data lakehouse edition data source cannot be configured or run on a shared resource group.
- If you switch an AnalyticDB for MySQL instance from the data warehouse edition to the data lakehouse edition, synchronization tasks that use this data source and run on a shared resource group will fail. Before switching editions, check for such tasks and reconfigure them to run on a Serverless resource group (recommended) or an exclusive Data Integration resource group.
- Batch synchronization supports reading from views.
Supported data types
The AnalyticDB for MySQL 3.0 Reader and Writer support the same set of data types.
| Type | AnalyticDB for MySQL 3.0 data types |
|---|---|
| Integer types | INT, INTEGER, TINYINT, SMALLINT, BIGINT |
| Floating-point types | FLOAT, DOUBLE, DECIMAL |
| String types | VARCHAR |
| Date and time types | DATE, DATETIME, TIMESTAMP, TIME |
| Boolean types | BOOLEAN |
Add a data source
Add the AnalyticDB for MySQL 3.0 data source to DataWorks before creating any synchronization task. For instructions, see Data source management.
Parameter descriptions are available in the DataWorks console when you add the data source.
Synchronization task types
The following table lists supported synchronization task types for AnalyticDB for MySQL 3.0.
| Task type | Supported | Configuration guide |
|---|---|---|
| Single-table batch synchronization | Yes | Codeless UI / Code editor |
| Single-table real-time synchronization | Yes | Configure a single-table real-time synchronization task |
| Full-database batch synchronization | Yes | Configure a full-database batch synchronization task |
| Full-database real-time synchronization | Yes | Configure a full-database real-time synchronization task |
For single-table batch synchronization configured via the code editor, see Appendix: Script examples and parameters for the full parameter reference and script examples.
Appendix: Script examples and parameters
This section describes the parameters and provides script examples for single-table batch synchronization using the code editor. For an overview of the code editor, see Use the code editor.
Reader
Script example
```json
{
  "type": "job",
  "steps": [
    {
      "stepType": "analyticdb_for_mysql",
      "parameter": {
        "column": [
          "id",
          "value",
          "table"
        ],
        "connection": [
          {
            "datasource": "<data-source-name>",
            "table": [
              "<table-name>"
            ]
          }
        ],
        "where": "",
        "splitPk": "",
        "encoding": "UTF-8"
      },
      "name": "Reader",
      "category": "reader"
    },
    {
      "stepType": "stream",
      "parameter": {},
      "name": "Writer",
      "category": "writer"
    }
  ],
  "version": "2.0",
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  },
  "setting": {
    "errorLimit": {
      "record": "0"
    },
    "speed": {
      "throttle": true,
      "concurrent": 1,
      "mbps": "12"
    }
  }
}
```
Reader parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| datasource | The name of your DataWorks data source. | Yes | None |
| table | The source table name. | Yes | None |
| column | The columns to read from the source table. Use `["*"]` to read all columns, or specify a subset in any order. Columns can include constants and expressions that follow MySQL syntax, for example `["id", "table", "1", "'bazhen.csy'", "null", "to_char(a + 1)", "2.3", "true"]`. The array cannot be empty. | Yes | None |
| splitPk | The column used to split data into shards for concurrent reading. Use an integer-type column; string, floating-point, and date types are not supported. For best performance, use the table's primary key, whose values are typically distributed evenly. If omitted or left empty, the task runs on a single channel. | No | None |
| where | A filter condition that follows MySQL WHERE clause syntax. Use it for incremental synchronization, for example `gmt_create > $bizdate` to synchronize only the current day's data. LIMIT is not supported. If omitted, the task performs a full data synchronization. | No | None |
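Putting the incremental-read parameters together, a Reader step that filters by a scheduling parameter and splits the read by an integer primary key might look like the following sketch. The table name, column names, and the `$bizdate` value are illustrative placeholders, not values from your project:

```json
{
  "stepType": "analyticdb_for_mysql",
  "parameter": {
    "column": ["id", "value"],
    "connection": [
      {
        "datasource": "<data-source-name>",
        "table": ["<table-name>"]
      }
    ],
    "where": "gmt_create > $bizdate",
    "splitPk": "id",
    "encoding": "UTF-8"
  },
  "name": "Reader",
  "category": "reader"
}
```

Because `splitPk` is set to an integer primary key, the task can shard the read across the concurrent channels configured in `setting.speed`; with `splitPk` left empty, the same configuration would read on a single channel.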
Writer
Script example
```json
{
  "type": "job",
  "steps": [
    {
      "stepType": "stream",
      "parameter": {},
      "name": "Reader",
      "category": "reader"
    },
    {
      "stepType": "analyticdb_for_mysql",
      "parameter": {
        "postSql": [],
        "tableType": null,
        "datasource": "<data-source-name>",
        "column": [
          "id",
          "value"
        ],
        "guid": null,
        "writeMode": "insert",
        "batchSize": 2048,
        "encoding": "UTF-8",
        "table": "<table-name>",
        "preSql": []
      },
      "name": "Writer",
      "category": "writer"
    }
  ],
  "version": "2.0",
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  },
  "setting": {
    "errorLimit": {
      "record": "0"
    },
    "speed": {
      "throttle": true,
      "concurrent": 2,
      "mbps": "12"
    }
  }
}
```
Writer parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| datasource | The name of your DataWorks data source. | Yes | None |
| table | The destination table name. | Yes | None |
| writeMode | The write behavior when a primary key or unique index conflict occurs. Valid values: `insert` (skip the conflicting record), `replace` (overwrite the existing record with the new one), and `update` (update the existing record's fields with the new values). The `update` mode is available only in the code editor. | No | insert |
| column | The destination columns to write to. Separate column names with commas, for example `["id", "name", "age"]`. Use `["*"]` to write to all columns in order. Enclose reserved keywords in grave accents, for example `` `item_select_no` ``. | Yes | None |
| preSql | SQL statements to run before the write task begins. The codeless UI supports one statement; the code editor supports multiple. Transactions are not supported for multiple statements. | No | None |
| postSql | SQL statements to run after the write task finishes. The codeless UI supports one statement; the code editor supports multiple. Transactions are not supported for multiple statements. | No | None |
| batchSize | The number of records per batch write. A larger value increases throughput but also increases memory usage, which can lead to out-of-memory (OOM) errors. | No | 1,024 |
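As an illustration of how these parameters combine, the Writer step below clears the current partition of data before loading, resolves conflicts in `update` mode (code editor only), and raises the batch size for higher throughput. The table name, column names, partition column `pt`, and the `${bizdate}` scheduling parameter are placeholders for this sketch; substitute your own values and confirm the cleanup statement matches your table design:

```json
{
  "stepType": "analyticdb_for_mysql",
  "parameter": {
    "datasource": "<data-source-name>",
    "table": "<table-name>",
    "column": ["id", "value"],
    "writeMode": "update",
    "batchSize": 2048,
    "preSql": ["DELETE FROM <table-name> WHERE pt = '${bizdate}'"],
    "postSql": [],
    "encoding": "UTF-8"
  },
  "name": "Writer",
  "category": "writer"
}
```

Because multiple `preSql` or `postSql` statements do not run in a transaction, keep cleanup statements idempotent so that a retried task does not leave partial state behind.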