The AnalyticDB for MySQL 3.0 data source lets you read data from and write data to AnalyticDB for MySQL 3.0. This topic describes the data synchronization capabilities that DataWorks provides for AnalyticDB for MySQL 3.0.
Limitations
You cannot configure or run synchronization tasks for a Data Lakehouse Edition data source on a shared resource group.
If you switch an AnalyticDB for MySQL instance from Data Warehouse Edition to Data Lakehouse Edition, synchronization tasks that use this data source and run on a shared resource group will fail. Before you switch the instance edition, check for such tasks and reconfigure them to run on a Serverless resource group (recommended) or an exclusive resource group for Data Integration.
Batch synchronization supports reading from views.
Supported data types
Batch read
The AnalyticDB for MySQL 3.0 Reader supports the following data types.
| Type | AnalyticDB for MySQL 3.0 data type |
| --- | --- |
| Integer types | INT, INTEGER, TINYINT, SMALLINT, and BIGINT |
| Floating-point types | FLOAT, DOUBLE, and DECIMAL |
| String types | VARCHAR |
| Date and time types | DATE, DATETIME, TIMESTAMP, and TIME |
| Boolean types | BOOLEAN |
Batch write
The AnalyticDB for MySQL 3.0 Writer supports the following data types.
| Type | AnalyticDB for MySQL 3.0 data type |
| --- | --- |
| Integer types | INT, INTEGER, TINYINT, SMALLINT, and BIGINT |
| Floating-point types | FLOAT, DOUBLE, and DECIMAL |
| String types | VARCHAR |
| Date and time types | DATE, DATETIME, TIMESTAMP, and TIME |
| Boolean types | BOOLEAN |
Add a data source
Before you develop a synchronization task in DataWorks, add the required data source by following the instructions in Data source management. When you add the data source, you can refer to the parameter descriptions in the DataWorks console.
Develop a data synchronization task
For information about how to open and complete the configuration of a synchronization task, see the following configuration guides.
Configure a single-table batch synchronization task
For instructions, see Configure in codeless UI and Configure in code editor.
For a list of all parameters and a sample script for the code editor, see Appendix: Script examples and parameters.
Configure a single-table real-time synchronization task
For instructions, see Configure a single-table real-time synchronization task.
Configure full-database batch synchronization
For instructions, see Configure a full-database batch synchronization task.
Configure full-database real-time synchronization
For instructions, see Configure a full-database real-time synchronization task.
Appendix: Script examples and parameters
Configure a batch synchronization task by using the code editor
To configure a batch synchronization task in the code editor, write the script based on the unified script format requirements. For more information, see Use the code editor. The following sections describe the data source parameters that you must configure in the script.
Reader script example
```json
{
    "type": "job",
    "steps": [
        {
            "stepType": "analyticdb_for_mysql", // The plugin name.
            "parameter": {
                "column": [ // The columns to read.
                    "id",
                    "value",
                    "table"
                ],
                "connection": [
                    {
                        "datasource": "xxx", // The data source name.
                        "table": [ // The table name.
                            "xxx"
                        ]
                    }
                ],
                "where": "", // The filter condition.
                "splitPk": "", // The split key.
                "encoding": "UTF-8" // The encoding format.
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Writer",
            "category": "writer"
        }
    ],
    "version": "2.0",
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "0" // The maximum number of error records allowed.
        },
        "speed": {
            "throttle": true, // A value of true enables throttling and enforces the mbps limit. A value of false disables throttling.
            "concurrent": 1, // The concurrency level for the task.
            "mbps": "12" // The rate limit in MB/s.
        }
    }
}
```
Reader parameters
| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| datasource | The name of the data source that you added in DataWorks. | Yes | None |
| table | The name of the source table. | Yes | None |
| column | The columns to synchronize from the source table, specified as an array. | Yes | None |
| splitPk | The field used to shard the data. If you specify splitPk, AnalyticDB for MySQL 3.0 Reader shards the data based on this field and reads it by using concurrent tasks, which improves efficiency. | No | None |
| where | A filter condition. For example, you can specify a condition on a date column so that only the data generated on the current day is synchronized. | No | None |
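As a sketch, the reader parameters above can be combined in the reader's parameter block as follows. The data source, table, and column names are hypothetical, and the `${bizdate}` scheduling variable is assumed to be defined for the task:

```json
"parameter": {
    "column": ["id", "gmt_create", "value"],  // Hypothetical source columns.
    "connection": [
        {
            "datasource": "my_adb_source",    // Hypothetical data source name.
            "table": ["orders"]
        }
    ],
    "where": "gmt_create >= '${bizdate}'",    // Read only the current day's data.
    "splitPk": "id",                          // Shard by this key for concurrent reads.
    "encoding": "UTF-8"
}
```

Choosing a splitPk with evenly distributed values, such as a numeric primary key, helps the concurrent shards carry a similar amount of data.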
Writer script example
```json
{
    "type": "job",
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "analyticdb_for_mysql", // The plugin name.
            "parameter": {
                "postSql": [], // SQL statements to execute after the import.
                "tableType": null, // Reserved field. The default value is null.
                "datasource": "hangzhou_ads", // The data source name.
                "column": [ // The columns to synchronize.
                    "id",
                    "value"
                ],
                "guid": null,
                "writeMode": "insert", // The write mode. See the description of the writeMode parameter.
                "batchSize": 2048, // The number of records in each batch write. See the description of the batchSize parameter.
                "encoding": "UTF-8", // The encoding format.
                "table": "t5", // The destination table name.
                "preSql": [] // SQL statements to execute before the import.
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "version": "2.0", // The version of the configuration file format.
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "0" // The maximum number of error records allowed.
        },
        "speed": {
            "throttle": true, // A value of true enables throttling and enforces the mbps limit. A value of false disables throttling.
            "concurrent": 2, // The concurrency level for the task.
            "mbps": "12" // The rate limit in MB/s.
        }
    }
}
```
Writer parameters
| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| datasource | The name of the data source that you added in DataWorks. | Yes | None |
| table | The name of the destination table. | Yes | None |
| writeMode | The write mode. Valid values: insert, replace, and update. | No | insert |
| column | The columns in the destination table to which data is written. Separate column names with commas. Note: If a column name contains a reserved keyword such as select, enclose the name in backticks. For example, specify item_select_no as \`item_select_no\`. | Yes | None |
| preSql | One or more SQL statements to execute before the write task starts. The codeless UI supports a single statement, and the code editor supports multiple statements. Note: Multiple SQL statements are not executed in a transaction. | No | None |
| postSql | One or more SQL statements to execute after the write task finishes. The codeless UI supports a single statement, and the code editor supports multiple statements. Note: Multiple SQL statements are not executed in a transaction. | No | None |
| batchSize | The number of records to write in each batch. A larger value can improve throughput but also increases memory consumption, which may cause out-of-memory (OOM) errors. | No | 1,024 |
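For example, a writer parameter block that clears the destination table before the import and writes in replace mode might look like the following sketch. The data source, table, and column names are hypothetical:

```json
"parameter": {
    "datasource": "my_adb_target",             // Hypothetical data source name.
    "table": "orders_copy",                    // Hypothetical destination table.
    "column": ["id", "value"],
    "preSql": ["TRUNCATE TABLE orders_copy"],  // Runs once before the import starts.
    "postSql": [],                             // Nothing to run after the import.
    "writeMode": "replace",                    // Replace conflicting rows instead of reporting them.
    "batchSize": 2048,                         // Larger batches raise throughput and memory use.
    "encoding": "UTF-8"
}
```

Because multiple preSql or postSql statements are not executed in a transaction, keep each statement independently safe to rerun in case the task is retried.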