The AnalyticDB for MySQL 3.0 data source provides a bidirectional channel to read data from and write data to AnalyticDB for MySQL 3.0. This topic describes the data synchronization capabilities of DataWorks for AnalyticDB for MySQL 3.0.
Limits
Sync tasks for AnalyticDB for MySQL Data Lakehouse Edition data sources cannot be configured or run in a public resource group.
If you switch an instance from Data Warehouse Edition to Data Lakehouse Edition, sync tasks that use the data source and run in a public resource group will fail. Before you switch, check for sync tasks that run in the public resource group and move them to a Serverless resource group (recommended) or an exclusive resource group for Data Integration.
Offline sync supports reading data from views.
Supported field types
Offline read
The following table lists the data type mappings for AnalyticDB for MySQL 3.0 Reader.
Type category | AnalyticDB for MySQL 3.0 type |
Integer | INT, INTEGER, TINYINT, SMALLINT, and BIGINT |
Floating-point | FLOAT, DOUBLE, and DECIMAL |
String | VARCHAR |
Date and time | DATE, DATETIME, TIMESTAMP, and TIME |
Boolean | BOOLEAN |
Offline write
The following table lists the data type mappings for AnalyticDB for MySQL 3.0 Writer.
Type category | AnalyticDB for MySQL 3.0 type |
Integer | INT, INTEGER, TINYINT, SMALLINT, and BIGINT |
Floating-point | FLOAT, DOUBLE, and DECIMAL |
String | VARCHAR |
Date and time | DATE, DATETIME, TIMESTAMP, and TIME |
Boolean | BOOLEAN |
Add a data source
Before you develop a synchronization task in DataWorks, add the required data source to DataWorks by following the instructions in Data source management. When you add the data source, you can refer to the parameter descriptions in the DataWorks console.
Develop a data synchronization task
For the entry points and procedures for configuring synchronization tasks, see the following configuration guides.
Configure an offline sync task for a single table
For the procedure, see Configure in the codeless UI and Configure in the code editor.
For a full list of parameters and a sample script for the code editor, see Appendix: Sample script and parameter descriptions.
Configure a real-time sync task for a single table or an entire database
For the procedure, see Configure a real-time sync task in DataStudio.
Configure database-level sync tasks
For the procedure, see Configure a real-time sync task for an entire database.
Appendix: Sample script and parameter descriptions
Configure a batch synchronization task by using the code editor
If you configure a batch synchronization task by using the code editor, you must write the parameters in the script based on the unified script format requirements. For more information, see Configure a task in the code editor. The following sections describe the data source parameters that you must configure for this type of task.
Reader script demo
{
"type": "job",
"steps": [
{
"stepType": "analyticdb_for_mysql", // The plug-in name.
"parameter": {
"column": [ // The column names.
"id",
"value",
"table"
],
"connection": [
{
"datasource": "xxx", // The data source.
"table": [ // The table name.
"xxx"
]
}
],
"where": "", // The filter condition.
"splitPk": "", // The split key.
"encoding": "UTF-8" // The encoding format.
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "stream",
"parameter": {},
"name": "Writer",
"category": "writer"
}
],
"version": "2.0",
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
},
"setting": {
"errorLimit": {
"record": "0" // The maximum number of error records allowed during synchronization.
},
"speed": {
"throttle":true,// If throttle is set to false, the mbps parameter does not take effect and the data rate is not limited. If throttle is set to true, the data rate is limited.
"concurrent":1, // The number of concurrent jobs.
"mbps":"12"// The maximum data rate. 1 mbps = 1 MB/s.
}
}
}
Reader script parameters
Parameter | Description | Required | Default value |
datasource | The name of the data source. The value must be the same as the name of the data source that you added. | Yes | None |
table | The table that you want to synchronize. | Yes | None |
column | The columns in the source table that you want to synchronize. Use a JSON array to specify the columns. By default, all columns are synchronized. Example: [ * ]. | Yes | None |
splitPk | The field used to split the data. If you specify splitPk, AnalyticDB for MySQL 3.0 Reader splits the source data by this field and starts concurrent subtasks to read the data, which improves synchronization efficiency. | No | None |
where | The filter condition. In many business scenarios, you may want to synchronize only the data generated on the current day. To do this, set the where parameter to a filter condition on a date or time column that matches the current day. | No | None |
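For example, the following reader step combines splitPk and where to read only the current day's data with concurrent subtasks. The data source, table, and column names are placeholders, and the snippet assumes a scheduling parameter named bizdate is defined for the task; adapt all of these to your environment.

```json
{
    "stepType": "analyticdb_for_mysql",
    "parameter": {
        "column": [
            "id",
            "buyer_id",
            "gmt_create"
        ],
        "connection": [
            {
                "datasource": "my_adb_source",
                "table": [
                    "orders"
                ]
            }
        ],
        "where": "gmt_create >= '${bizdate}'",
        "splitPk": "id",
        "encoding": "UTF-8"
    },
    "name": "Reader",
    "category": "reader"
}
```

Choosing the primary key as splitPk usually gives the most even shards, because the values are unique and densely distributed.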
Writer script demo
{
"type": "job",
"steps": [
{
"stepType": "stream",
"parameter": {},
"name": "Reader",
"category": "reader"
},
{
"stepType": "analyticdb_for_mysql", // The plug-in name.
"parameter": {
"postSql": [], // The SQL statement to execute after the import.
"tableType": null, // A reserved field. The default value is empty.
"datasource": "hangzhou_ads", // The name of the data source.
"column": [ // The columns to synchronize.
"id",
"value"
],
"guid": null,
"writeMode": "insert", // The write mode. For more information, see the description of the writeMode parameter.
"batchSize": 2048, // The number of records in each batch. For more information, see the description of the batchSize parameter.
"encoding": "UTF-8", // The encoding format.
"table": "t5", // The destination table.
"preSql": [] // The SQL statement to execute before the import.
},
"name": "Writer",
"category": "writer"
}
],
"version": "2.0", // The version number of the configuration file format.
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
},
"setting": {
"errorLimit": {
"record": "0" // The number of error records.
},
"speed": {
"throttle":true,// If throttle is set to false, the mbps parameter does not take effect and the data rate is not limited. If throttle is set to true, the data rate is limited.
"concurrent":2, // The number of concurrent jobs.
"mbps":"12"// The maximum data rate. 1 mbps = 1 MB/s.
}
}
}
Writer script parameters
Parameter | Description | Required | Default value |
datasource | The name of the data source. The value must be the same as the name of the data source that you added. | Yes | None |
table | The destination table to which you want to write data. | Yes | None |
writeMode | The write mode. Valid values: insert, replace, and update. | No | insert |
column | The columns in the destination table to which you want to write data. Separate multiple column names with commas. Note If a field name contains select, enclose the name in backticks. For example, write item_select_no as `item_select_no`. | Yes | None |
preSql | The SQL statement to execute before the data sync task runs. In the codeless UI, you can execute only one SQL statement. In the code editor, you can execute multiple SQL statements, for example, to purge old data. Note Transactions are not supported for multiple SQL statements. | No | None |
postSql | The SQL statement to execute after the data sync task runs. In the codeless UI, you can execute only one SQL statement. In the code editor, you can execute multiple SQL statements, for example, to add a timestamp. Note Transactions are not supported for multiple SQL statements. | No | None |
batchSize | The number of records to submit in a single batch. A larger value reduces the number of network interactions between the data sync process and AnalyticDB for MySQL 3.0 and improves overall throughput. However, an excessively large value may cause an out-of-memory (OOM) error in the data sync process. | No | 1024 |
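As an illustration, the following writer step clears the destination table with preSql before each run, writes in replace mode, and raises batchSize to reduce network round trips. The data source, table, and column names are placeholders, and this is a sketch rather than a drop-in configuration; verify the preSql statement against your own retention requirements before use.

```json
{
    "stepType": "analyticdb_for_mysql",
    "parameter": {
        "datasource": "my_adb_sink",
        "table": "orders_copy",
        "column": [
            "id",
            "buyer_id",
            "gmt_create"
        ],
        "preSql": [
            "TRUNCATE TABLE orders_copy"
        ],
        "postSql": [],
        "writeMode": "replace",
        "batchSize": 4096,
        "encoding": "UTF-8"
    },
    "name": "Writer",
    "category": "writer"
}
```

Because replace mode overwrites rows that have the same primary key, it is a common choice when the task is rerun for the same business date and duplicate inserts must be avoided.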