This topic describes the data types and parameters that are supported by DRDS Writer and how to configure DRDS Writer by using the codeless user interface (UI) and code editor.
Background information
DRDS Writer uses the REPLACE INTO statement to write data to the DRDS database.
- To execute the REPLACE INTO statement, make sure that your table has a primary key or a unique index to prevent duplicate data.
- Before you configure DRDS Writer, you must add a DRDS data source. For more information, see Add a DRDS data source.
- DataWorks does not support DRDS instances that run MySQL 8.0.
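The primary key requirement can be illustrated with a small local experiment. The following is a minimal sketch, not DRDS itself: it assumes that SQLite (through the Python standard library) is an acceptable stand-in, because SQLite also accepts the REPLACE INTO statement. The table name `demo` and the helper `replace_rows` are hypothetical.

```python
import sqlite3

def replace_rows(with_key: bool) -> list:
    """Run two REPLACE INTO statements for the same id and return the table contents."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; SQLite is only a local stand-in for DRDS here.
    id_column = "id INTEGER PRIMARY KEY" if with_key else "id INTEGER"
    conn.execute(f"CREATE TABLE demo ({id_column}, name TEXT)")
    conn.execute("REPLACE INTO demo (id, name) VALUES (1, 'old')")
    conn.execute("REPLACE INTO demo (id, name) VALUES (1, 'new')")
    rows = conn.execute("SELECT id, name FROM demo ORDER BY rowid").fetchall()
    conn.close()
    return rows

# With a primary key, the second write replaces the first.
print(replace_rows(with_key=True))
# Without a key, nothing can conflict, so a duplicate row appears.
print(replace_rows(with_key=False))
```

Without a primary key or a unique index, the database has no conflict to detect, so REPLACE INTO behaves like a plain insert and duplicates accumulate.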
DRDS Writer is designed for extract, transform, load (ETL) developers to import data in data warehouses to DRDS databases. DRDS Writer can also be used as a data migration tool by users such as database administrators.
DRDS Writer uses the REPLACE INTO statement to write data to the destination database. If no primary key conflict or unique index conflict occurs, data is written directly to the destination table, which is the same behavior as the INSERT INTO statement. If a conflict occurs, the conflicting rows in the destination table are replaced by the new data. DRDS Writer sends data to the DRDS proxy when the amount of buffered data reaches a specific threshold. The proxy determines whether to write the data to one or more tables and how to route the data when it is written to multiple tables.
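The buffer-then-flush behavior described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual DRDS Writer implementation: SQLite stands in for the DRDS proxy, the threshold is modeled as a simple row count, and the class `BufferedReplaceWriter` and the table `target` are hypothetical names.

```python
import sqlite3

class BufferedReplaceWriter:
    """Buffer rows and flush them with REPLACE INTO once a threshold is reached."""

    def __init__(self, conn, batch_size=1024):
        self.conn = conn
        self.batch_size = batch_size  # assumed threshold: number of buffered rows
        self.buffer = []
        self.flush_count = 0

    def write(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One round trip for the whole batch; conflicting rows are replaced.
        self.conn.executemany(
            "REPLACE INTO target (id, name) VALUES (?, ?)", self.buffer
        )
        self.conn.commit()
        self.buffer.clear()
        self.flush_count += 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
writer = BufferedReplaceWriter(conn, batch_size=2)
for row in [(1, "a"), (2, "b"), (1, "c")]:
    writer.write(row)
writer.flush()  # send the last partial batch
result = conn.execute("SELECT id, name FROM target ORDER BY id").fetchall()
print(result, writer.flush_count)
```

In this run the third row conflicts with the first on `id`, so the stored name for `id` 1 ends up as the later value.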
A synchronization node that uses DRDS Writer must have at least the permission to execute the REPLACE INTO statement. Whether other permissions are required depends on the SQL statements that you specify in the preSql and postSql parameters when you configure the node.
Data types
DRDS Writer supports most DRDS data types. Make sure that the data types of your database are supported.
Category | DRDS data type |
---|---|
Integer | INT, TINYINT, SMALLINT, MEDIUMINT, BIGINT, and YEAR |
Floating point | FLOAT, DOUBLE, and DECIMAL |
String | VARCHAR, CHAR, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT |
Date and time | DATE, DATETIME, TIMESTAMP, and TIME |
Boolean | BIT and BOOLEAN |
Binary | TINYBLOB, MEDIUMBLOB, BLOB, LONGBLOB, and VARBINARY |
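To check a table definition against the preceding list before you configure a node, a lookup like the following may help. It is a minimal sketch: the helper names `category_of` and `unsupported_columns` and the sample schema are hypothetical, and the type sets are copied from the table above.

```python
SUPPORTED_TYPES = {
    "Integer": {"INT", "TINYINT", "SMALLINT", "MEDIUMINT", "BIGINT", "YEAR"},
    "Floating point": {"FLOAT", "DOUBLE", "DECIMAL"},
    "String": {"VARCHAR", "CHAR", "TINYTEXT", "TEXT", "MEDIUMTEXT", "LONGTEXT"},
    "Date and time": {"DATE", "DATETIME", "TIMESTAMP", "TIME"},
    "Boolean": {"BIT", "BOOLEAN"},
    "Binary": {"TINYBLOB", "MEDIUMBLOB", "BLOB", "LONGBLOB", "VARBINARY"},
}

def category_of(column_type: str):
    """Return the category for a declared type such as 'varchar(64)', or None."""
    # Drop a length suffix like "(64)" and modifiers like "unsigned".
    parts = column_type.strip().split("(", 1)[0].split()
    if not parts:
        return None
    base = parts[0].upper()
    for category, names in SUPPORTED_TYPES.items():
        if base in names:
            return category
    return None

def unsupported_columns(schema: dict) -> list:
    """List the columns whose declared type is not in the table above."""
    return [name for name, declared in schema.items() if category_of(declared) is None]

# Hypothetical schema used only for illustration.
schema = {"id": "bigint(20)", "name": "varchar(64)", "payload": "json"}
print(unsupported_columns(schema))
```

A column reported by `unsupported_columns` is only "not listed in this topic"; whether it actually fails at run time is not something this sketch can determine.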
Parameters
Parameter | Description | Required | Default value |
---|---|---|---|
datasource | The name of the data source. It must be the same as the name of the added data source. You can add data sources by using the code editor. | Yes | No default value |
table | The name of the table to which you want to write data. | Yes | No default value |
writeMode | The write mode, such as insert ignore. | No | insert ignore |
column | The names of the columns to which you want to write data. Separate the names with commas (,), such as "column": ["id","name","age"]. If you want to write data to all the columns in the destination table, set this parameter to an asterisk (*), such as "column": ["*"]. | Yes | No default value |
preSql | The SQL statement that you want to execute before the synchronization node is run. You can execute only one SQL statement on the codeless UI and multiple SQL statements in the code editor. | No | No default value |
postSql | The SQL statement that you want to execute after the synchronization node is run. You can execute only one SQL statement on the codeless UI and multiple SQL statements in the code editor. | No | No default value |
batchSize | The number of data records to write at a time. A suitably large value reduces the number of interactions between Data Integration and DRDS and increases throughput. If the value is excessively large, an out of memory (OOM) error may occur during data synchronization. | No | 1024 |
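The required flags and default values in the preceding table can be expressed as a small pre-flight check. The following is a sketch under assumptions, not part of DataWorks: the function `normalize_writer_parameter` and the data source name `my_drds` are hypothetical, and the rule that batchSize must be positive is an assumption of this example.

```python
REQUIRED = ("datasource", "table", "column")
# Defaults taken from the parameter table; preSql and postSql have no default.
DEFAULTS = {"writeMode": "insert ignore", "batchSize": 1024}

def normalize_writer_parameter(parameter: dict) -> dict:
    """Check required keys and fill in documented defaults for a writer parameter block."""
    missing = [key for key in REQUIRED if not parameter.get(key)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    merged = {**DEFAULTS, **parameter}
    if not isinstance(merged["column"], list):
        raise ValueError('column must be a list such as ["id", "name"] or ["*"]')
    # The code editor sample passes batchSize as the string "1024".
    batch_size = int(merged["batchSize"])
    if batch_size <= 0:
        raise ValueError("batchSize must be positive")  # assumption of this sketch
    merged["batchSize"] = batch_size
    return merged

example = normalize_writer_parameter(
    {"datasource": "my_drds", "table": "test", "column": ["*"]}
)
print(example)
```

Running a check like this before you paste a configuration into the code editor catches a missing datasource, table, or column early.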
Configure DRDS Writer by using the codeless UI
Create a synchronization node and configure the node. For more information, see Configure a sync node by using the codeless UI.
- Configure data sources.
Configure Source and Target for the synchronization node.
Parameter | Description |
---|---|
Connection | The name of the data source to which you want to write data. This parameter is equivalent to the datasource parameter that is described in the preceding section. |
Table | The name of the table to which you want to write data. This parameter is equivalent to the table parameter that is described in the preceding section. |
Statement Run Before Writing | The SQL statement that you want to execute before the synchronization node is run. This parameter is equivalent to the preSql parameter that is described in the preceding section. |
Statement Run After Writing | The SQL statement that you want to execute after the synchronization node is run. This parameter is equivalent to the postSql parameter that is described in the preceding section. |
Solution to Primary Key Violation | The write mode. This parameter is equivalent to the writeMode parameter that is described in the preceding section. You can select the desired write mode. |
- Configure field mappings. This operation is equivalent to setting the column parameter that is described in the preceding section. Fields in the source on the left have a one-to-one mapping with fields in the destination on the right.
Operation | Description |
---|---|
Map Fields with the Same Name | Click Map Fields with the Same Name to establish mappings between fields with the same name. The data types of the fields must match. |
Map Fields in the Same Line | Click Map Fields in the Same Line to establish mappings between fields in the same row. The data types of the fields must match. |
Delete All Mappings | Click Delete All Mappings to remove the mappings that are established. |
Auto Layout | Click Auto Layout. Then, the system automatically sorts the fields based on specific rules. |
- Configure channel control policies.
Parameter | Description |
---|---|
Expected Maximum Concurrency | The maximum number of parallel threads that the synchronization node uses to read data from the source or write data to the destination. You can configure the parallelism for the synchronization node on the codeless UI. |
Bandwidth Throttling | Specifies whether to enable bandwidth throttling. You can enable bandwidth throttling and specify a maximum transmission rate to prevent heavy read workloads on the source. We recommend that you enable bandwidth throttling and set the maximum transmission rate to an appropriate value based on the configurations of the source. |
Dirty Data Records Allowed | The maximum number of dirty data records allowed. |
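The two field mapping operations in this section, Map Fields with the Same Name and Map Fields in the Same Line, can be sketched as follows. This is only an illustration of the idea: the function names and sample fields are hypothetical, and skipping pairs whose data types differ is an assumption, because this topic states only that the types must match.

```python
def map_same_name(source_fields, target_fields):
    """Pair fields that share a name; skip pairs whose types differ (assumed behavior)."""
    target_types = dict(target_fields)
    return [
        (name, name)
        for name, field_type in source_fields
        if target_types.get(name) == field_type
    ]

def map_same_line(source_fields, target_fields):
    """Pair fields by position; skip pairs whose types differ (assumed behavior)."""
    return [
        (src_name, dst_name)
        for (src_name, src_type), (dst_name, dst_type) in zip(source_fields, target_fields)
        if src_type == dst_type
    ]

# Hypothetical fields as (name, data type) pairs.
source = [("id", "BIGINT"), ("name", "VARCHAR"), ("age", "INT")]
target = [("id", "BIGINT"), ("age", "INT"), ("name", "VARCHAR")]

print(map_same_name(source, target))
print(map_same_line(source, target))
```

With these sample fields, mapping by name pairs all three columns, while mapping by position keeps only the first pair because the second and third rows have different types.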
Configure DRDS Writer by using the code editor
{
  "type": "job",
  "version": "2.0", // The version number.
  "steps": [
    {
      "stepType": "stream",
      "parameter": {},
      "name": "Reader",
      "category": "reader"
    },
    {
      "stepType": "drds", // The writer type.
      "parameter": {
        "postSql": [], // The SQL statement that you want to execute after the synchronization node is run.
        "datasource": "", // The name of the data source.
        "column": [ // The names of the columns to which you want to write data.
          "id"
        ],
        "writeMode": "insert ignore",
        "batchSize": "1024", // The number of data records to write at a time.
        "table": "test", // The name of the table to which you want to write data.
        "preSql": [] // The SQL statement that you want to execute before the synchronization node is run.
      },
      "name": "Writer",
      "category": "writer"
    }
  ],
  "setting": {
    "errorLimit": {
      "record": "0" // The maximum number of dirty data records allowed.
    },
    "speed": {
      "throttle": true, // Specifies whether to enable bandwidth throttling. The value false indicates that bandwidth throttling is disabled, and the value true indicates that bandwidth throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true.
      "concurrent": 1, // The maximum number of parallel threads.
      "mbps": "12" // The maximum transmission rate.
    }
  },
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  }
}
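The sample configuration contains `//` comments, which the Python `json` module does not accept, so the following sketch mirrors the same structure as a plain dictionary and checks that it is internally consistent. The function `check_job` and the data source name `my_drds` are hypothetical, and the checks are illustrative only; this topic does not state which validations DataWorks itself performs.

```python
def check_job(job: dict) -> list:
    """Return a list of structural problems found in a job configuration."""
    problems = []
    steps = job.get("steps", [])
    names = {step.get("name") for step in steps}
    categories = [step.get("category") for step in steps]
    for required in ("reader", "writer"):
        if required not in categories:
            problems.append(f"no step with category {required!r}")
    # Every hop should connect two steps that are actually defined.
    for hop in job.get("order", {}).get("hops", []):
        for end in ("from", "to"):
            if hop.get(end) not in names:
                problems.append(f"hop {end!r} refers to unknown step {hop.get(end)!r}")
    return problems

# Same structure as the sample above, without comments.
job = {
    "type": "job",
    "version": "2.0",
    "steps": [
        {"stepType": "stream", "parameter": {}, "name": "Reader", "category": "reader"},
        {
            "stepType": "drds",
            "parameter": {
                "datasource": "my_drds",  # hypothetical data source name
                "table": "test",
                "column": ["id"],
                "writeMode": "insert ignore",
                "batchSize": "1024",
                "preSql": [],
                "postSql": [],
            },
            "name": "Writer",
            "category": "writer",
        },
    ],
    "setting": {
        "errorLimit": {"record": "0"},
        "speed": {"throttle": True, "concurrent": 1, "mbps": "12"},
    },
    "order": {"hops": [{"from": "Reader", "to": "Writer"}]},
}

print(check_job(job))
```

An empty list means that the job has a reader step and a writer step and that each hop names an existing step.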