This topic describes the data types and parameters that AnalyticDB for MySQL 3.0 Reader supports and how to configure it by using the codeless user interface (UI) and code editor.
AnalyticDB for MySQL 3.0 Reader allows you to read data from AnalyticDB for MySQL 3.0. Specifically, AnalyticDB for MySQL 3.0 Reader connects to a remote AnalyticDB for MySQL 3.0 database by using Java Database Connectivity (JDBC), and executes a SELECT statement to select and read data from the AnalyticDB for MySQL 3.0 database.
Data types
The following table describes the data types that AnalyticDB for MySQL 3.0 Reader supports.
Category | AnalyticDB for MySQL 3.0 data type |
---|---|
Integer | INT, INTEGER, TINYINT, SMALLINT, and BIGINT |
Floating point | FLOAT, DOUBLE, and DECIMAL |
String | VARCHAR |
Date and time | DATE, DATETIME, TIMESTAMP, and TIME |
Boolean | BOOLEAN |
Parameters
Parameter | Description | Required | Default value |
---|---|---|---|
datasource | The connection name. It must be the same as the name of the created connection. You can create connections in the code editor. | Yes | N/A |
table | The name of the source table. | Yes | N/A |
column | The columns to be synchronized from the source table. The columns are described in
a JSON array. The default value is [*], which indicates all columns.
|
Yes | N/A |
splitPk | The field that is used for data sharding when AnalyticDB for MySQL 3.0 Reader reads
data. If you specify the splitPk parameter, the table is sharded based on the shard key that is specified by this
parameter. Data Integration then runs concurrent threads to synchronize data. This
way, data can be synchronized more efficiently.
|
No | N/A |
where | The WHERE clause. For example, set this parameter to gmt_create>$bizdate .
|
No | N/A |
Configure AnalyticDB for MySQL 3.0 Reader by using the codeless UI
- Configure the connections.
Configure the connections to the source and destination data stores for the sync node.
GUI element Description Connection The datasource parameter in the preceding parameter description. Select a connection type and select the name of a connection that you have configured in DataWorks. Table The table parameter in the preceding parameter description. Filter The condition for filtering the data to be synchronized. AnalyticDB for MySQL 3.0 Reader cannot filter data based on the limit keyword. The SQL syntax is determined by the selected connection. Shard Key The shard key. You can specify a column in the source table as the shard key. We recommend that you use the primary key or an indexed column as the shard key. Only integer fields are supported. If data sharding is performed based on the configured shard key, data can be read concurrently. This way, data can be synchronized more efficiently.Note The Shard Key parameter is displayed only after you select the connection to the source data store for the sync node. - Configure field mapping. It is equivalent to setting the column parameter in the preceding parameter description.
Fields in the source table on the left have a one-to-one mapping with fields in the destination table on the right. You can click Add to add a field. To delete a field, move the pointer over the field and click theDelete icon.
GUI element Description Map Fields with the Same Name Click Map Fields with the Same Name to establish a mapping between fields with the same name. The data types of the fields must match. Map Fields in the Same Line Click Map Fields in the Same Line to establish a mapping between fields in the same row. The data types of the fields must match. Delete All Mappings Click Delete All Mappings to remove mappings that have been established. Auto Layout Click Auto Layout to sort the fields based on specified rules. Change Fields Click the Change Fields icon. In the Change Fields dialog box, you can manually edit the fields in the source table. Each field occupies a row. The first and the last blank rows are included, whereas other blank rows are ignored. Add - Click Add to add a field. You can enter constants. Each constant must be enclosed in single quotation marks (' '), for example, 'abc' and '123'.
- You can use scheduling parameters such as ${bizdate}.
- You can enter functions that are supported by relational databases, for example, now() and count(1).
- Fields that cannot be parsed are indicated by Unidentified.
- Configure channel control policies.
GUI element Description Expected Maximum Concurrency The maximum number of concurrent threads that the sync node uses to read data from or write data to data stores. You can configure the concurrency for the node on the codeless UI. Bandwidth Throttling Specifies whether to enable bandwidth throttling. You can enable bandwidth throttling and set a maximum transmission rate to avoid heavy read workload of the source. We recommend that you enable bandwidth throttling and set the maximum transmission rate to a proper value. Dirty Data Records Allowed The maximum number of dirty data records allowed. Resource Group The resource group that is used to run the sync node. If a large number of nodes including this sync node are deployed on the default resource group, the sync node may need to wait for resources. We recommend that you purchase an exclusive resource group for Data Integration or add a custom resource group. For more information, see DataWorks exclusive resources and Add a custom resource group.
Configure AnalyticDB for MySQL 3.0 Reader by using the code editor
{
"type": "job",
"steps": [
{
"stepType": "analyticdb_for_mysql", // The reader type.
"parameter": {
"column": [ // The columns to be synchronized from the source table.
"id",
"value",
"table"
],
"connection": [
{
"datasource": "",// The connection name.
"table": [ // The name of the source table.
"xxx"
]
}
],
"where": "", // The WHERE clause.
"splitPk": "", // The shard key.
"encoding": "UTF-8" // The encoding format.
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "stream",
"parameter": {},
"name": "Writer",
"category": "writer"
}
],
"version": "2.0",
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
},
"setting": {
"errorLimit": {
"record":"0" // The maximum number of dirty data records allowed.
},
"speed": {
"concurrent": 2,// The maximum number of concurrent threads.
"throttle": false // Specifies whether to enable bandwidth throttling. A value of false indicates that the bandwidth is not throttled. A value of true indicates that the bandwidth is throttled. The maximum transmission rate takes effect only if you set this parameter to true.
}
}
}