This topic describes the data types and parameters that DRDS Writer supports and how to configure it by using the codeless user interface (UI) and code editor.
Background information
REPLACE INTO
statement to write data to the DRDS database.
- To execute the
REPLACE INTO
statement, make sure that your table has the primary key or a unique index to avoid duplicate data. - You must configure a DRDS connection before you configure DRDS Writer. For more information, see Configure a DRDS connection.
- DataWorks does not support MySQL 8.0 for DRDS databases.
DRDS Writer is designed for extract-transform-load (ETL) developers to import data from data warehouses to DRDS databases. Users such as database administrators can use DRDS Writer as a data migration tool.
REPLACE INTO
statement to write the data to the destination database. If no primary key conflict
or unique index conflict occurs, the action is the same as the action of the INSERT INTO
statement. If a conflict occurs, original rows are replaced by new rows. DRDS Writer
sends data to the DRDS proxy when the amount of buffered data reaches a specific threshold.
The proxy determines whether to write the data to one or more tables and how to route
the data when it is written to multiple tables.
REPLACE INTO
statement. Whether other permissions are required depends on the SQL statements that
you specify in the preSql and postSql parameters when you configure the node.
Data types
DRDS Writer supports most DRDS data types. Make sure that your data types are supported.
Category | DRDS data type |
---|---|
Integer | INT, TINYINT, SMALLINT, MEDIUMINT, BIGINT, and YEAR |
Floating point | FLOAT, DOUBLE, and DECIMAL |
String | VARCHAR, CHAR, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT |
Date and time | DATE, DATETIME, TIMESTAMP, and TIME |
Boolean | BIT and BOOLEAN |
Binary | TINYBLOB, MEDIUMBLOB, BLOB, LONGBLOB, and VARBINARY |
Parameters
Parameter | Description | Required | Default value |
---|---|---|---|
datasource | The connection name. It must be the same as the name of the added connection. You can add connections in the code editor. | Yes | N/A |
table | The name of the destination table. | Yes | N/A |
writeMode | The write mode. Valid values:
|
No | insert ignore |
column | The columns in the destination table to which data is written. Separate the columns with commas (,), for example, "column":["id","name","age"]. To write data to all the columns in the destination table, set the value to an asterisk (*), for example, "column":["*"]. | Yes | N/A |
preSql | The SQL statement to execute before the sync node is run. You can execute only one
SQL statement on the codeless UI and multiple SQL statements in the code editor.
For example, to clear data from the xxx table before data synchronization, you can
execute the |
No | N/A |
postSql | The SQL statement to execute after the sync node is run. You can execute only one
SQL statement on the codeless UI and multiple SQL statements in the code editor.
For example, to delete specific data from the xxx table after data synchronization,
you can execute the |
No | N/A |
batchSize | The number of data records to write at a time. Set this parameter to a proper value based on your actual needs. This can greatly reduce the interactions between Data Integration and the DRDS database on the network, and increase the throughput. However, an excessively large value may lead to the out of memory (OOM) error during the data synchronization process. | No | 1024 |
Configure DRDS Writer by using the codeless UI
Create a sync node and configure the node. For more information, see Create a sync node by using the codeless UI.
- Configure the connections.
Configure the connections to the source and destination data stores for the sync node.
Parameter Description Data source The datasource parameter in the preceding parameter description. Select a connection type and select the name of a connection that you have configured in DataWorks. Table The table parameter in the preceding parameter description. Prepare statement before import The preSql parameter in the preceding parameter description. Enter an SQL statement to execute before the sync node is run. Complete statement after import The postSql parameter in the preceding parameter description. Enter an SQL statement to execute after the sync node is run. Constraint conflict The writeMode parameter in the preceding parameter description. Select the required write mode. - Configure field mapping. It is equivalent to setting the column parameter in the preceding parameter description. Fields in the source table on the
left have a one-to-one mapping with fields in the destination table on the right.
Parameter Description The same name mapping Click The same name mapping to establish a mapping between fields with the same name. Note that the data types of the fields must match. Peer mapping Click Peer mapping to establish a mapping between fields in the same row. Note that the data types of the fields must match. Unmap Click Unmap to remove mappings that have been established. Automatic typesetting Click Automatic typesetting to sort the fields based on specified rules. - Configure channel control policies.
Parameter Description Maximum number of concurrent tasks expected The maximum number of concurrent threads that the sync node uses to read data from or write data to data stores. You can configure the concurrency for the node on the codeless UI. Synchronization rate Specifies whether to enable bandwidth throttling. You can enable bandwidth throttling and set a maximum transmission rate to avoid heavy read workload of the source. We recommend that you enable bandwidth throttling and set the maximum transmission rate to a proper value. The number of error records exceeds The maximum number of dirty data records allowed.
Configure DRDS Writer by using the code editor
{
"type":"job",
"version":"2.0", // The version number.
"steps":[
{
"stepType":"stream",
"parameter":{},
"name":"Reader",
"category":"reader"
},
{
"stepType":"drds", // The writer type.
"parameter":{
"postSql":[], // The SQL statement to execute after the sync node is run.
"datasource":"", // The connection name.
"column":[ // The columns to which data is written.
"id"
],
"writeMode":"insert ignore",
"batchSize":"1024", // The number of data records to write at a time.
"table":"test", // The name of the destination table.
"preSql":[] // The SQL statement to execute before the sync node is run.
},
"name":"Writer",
"category":"writer"
}
],
"setting":{
"errorLimit":{
"record":"0" // The maximum number of dirty data records allowed.
},
"speed":{
"throttle":false, // Specifies whether to enable bandwidth throttling.
"concurrent":1 // The maximum number of concurrent threads.
}
},
"order":{
"hops":[
{
"from":"Reader",
"to":"Writer"
}
]
}
}