This topic describes the data types and parameters that are supported by PolarDB Writer and how to configure PolarDB Writer by using the codeless user interface (UI) and code editor.

PolarDB Writer writes data to tables stored in PolarDB databases. PolarDB Writer connects to a remote PolarDB database by using Java Database Connectivity (JDBC) and executes an INSERT INTO or REPLACE INTO statement to write data to the PolarDB database. The PolarDB database must use the InnoDB engine because data is submitted to the PolarDB database in batches.
Note: Before you configure PolarDB Writer, you must configure a PolarDB data source. For more information, see Configure a PolarDB data source.
PolarDB Writer is designed for extract, transform, load (ETL) developers who need to import data from data warehouses into PolarDB databases. Database administrators and other users can also use PolarDB Writer as a data migration tool. PolarDB Writer obtains data from a reader and writes the data to the destination database based on the value of the writeMode parameter.
Note: A synchronization node that uses PolarDB Writer must have at least the permissions to execute INSERT INTO and REPLACE INTO statements. Whether other permissions are required depends on the SQL statements that you specify in the preSql and postSql parameters when you configure the node.
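
The INSERT INTO and REPLACE INTO statements mentioned above can be pictured as parameterized templates built from the table and column parameters. The following is a minimal sketch under that assumption: `build_write_statement` is a hypothetical helper, and the exact SQL text that PolarDB Writer generates is not stated in this topic.

```python
def build_write_statement(table, columns, write_mode="insert"):
    """Return a parameterized single-row statement template.

    Hypothetical helper for illustration. Only the INSERT INTO and
    REPLACE INTO keywords come from this topic; the statement layout
    and the "?" placeholders are assumptions.
    """
    keywords = {"insert": "INSERT INTO", "replace": "REPLACE INTO"}
    if write_mode not in keywords:
        raise ValueError(f"write mode not covered by this sketch: {write_mode!r}")
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(columns)
    # One JDBC-style placeholder per column.
    placeholders = ", ".join("?" for _ in columns)
    return f"{keywords[write_mode]} {table} ({column_list}) VALUES ({placeholders})"


print(build_write_statement("PolarDB_person_copy", ["id", "name", "age"]))
print(build_write_statement("PolarDB_person_copy", ["id", "name", "age"], "replace"))
```

The account that runs the node needs permission to execute whichever of these statements the chosen write mode produces, in addition to any statements in preSql and postSql.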

Data types

Like PolarDB Reader, PolarDB Writer supports most, but not all, PolarDB data types. Before you run a synchronization node, make sure that the data types used in your database are supported.

The following table lists the data types that are supported by PolarDB Writer.
| Category | PolarDB data type |
| --- | --- |
| Integer | INT, TINYINT, SMALLINT, MEDIUMINT, BIGINT, and YEAR |
| Floating point | FLOAT, DOUBLE, and DECIMAL |
| String | VARCHAR, CHAR, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT |
| Date and time | DATE, DATETIME, TIMESTAMP, and TIME |
| Boolean | BOOLEAN |
| Binary | TINYBLOB, MEDIUMBLOB, BLOB, LONGBLOB, and VARBINARY |

Parameters

| Parameter | Description | Required | Default value |
| --- | --- | --- | --- |
| datasource | The name of the data source. It must be the same as the name of the added data source. You can add data sources by using the code editor. | Yes | No default value |
| table | The name of the table to which you want to write data. | Yes | No default value |
| writeMode | The write mode. Valid values: insert, replace, and update. The behavior of each mode is described after this table. | No | insert |
| column | The names of the columns to which you want to write data. Separate the names with commas (,), such as "column": ["id", "name", "age"]. To write data to all the columns in the destination table, set this parameter to an asterisk (*), such as "column": ["*"]. | Yes | No default value |
| preSql | The SQL statement that you want to execute before the synchronization node is run. For example, you can set this parameter to an SQL statement that deletes outdated data. You can execute only one SQL statement on the codeless UI and multiple SQL statements in the code editor. | No | No default value |
| postSql | The SQL statement that you want to execute after the synchronization node is run. For example, you can set this parameter to an SQL statement that adds a timestamp. You can execute only one SQL statement on the codeless UI and multiple SQL statements in the code editor. | No | No default value |
| batchSize | The number of data records to write at a time. Set this parameter to an appropriate value based on your business requirements. A larger value reduces the number of interactions between Data Integration and PolarDB and increases throughput. However, if the value is excessively large, an out of memory (OOM) error may occur during data synchronization. | No | 1,024 |

The writeMode parameter determines how PolarDB Writer handles primary key conflicts and unique index conflicts:
  • insert: If no primary key conflict or unique index conflict occurs, data is written directly to the destination table. If a conflict occurs, the conflicting rows are not written and are regarded as dirty data.
  • replace: If no primary key conflict or unique index conflict occurs, data is written in the same way as in insert mode. If a conflict occurs, the conflicting rows in the destination table are deleted, and the new rows are inserted.
  • update: If no primary key conflict or unique index conflict occurs, data is written in the same way as in insert mode. If a conflict occurs, the data in the conflicting rows in the destination table is replaced by the new data.
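
The interaction between writeMode and batchSize can be sketched with a small in-memory model. This is a conceptual illustration only, not the writer's implementation: the destination table is a Python dict keyed by a primary key, and `iter_batches`, `apply_batch`, and `run_writer` are hypothetical names. One consequence in this model, which this topic does not state explicitly, is that columns the node does not write survive in update mode but are lost in replace mode, because replace deletes the whole row first.

```python
def iter_batches(records, batch_size=1024):
    """Yield lists of at most batch_size records, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


def apply_batch(table, batch, write_mode, key="id"):
    """Apply one batch to the in-memory table and return the dirty rows."""
    dirty = []
    for row in batch:
        pk = row[key]
        if pk not in table:
            # No conflict: every mode behaves like insert.
            table[pk] = dict(row)
        elif write_mode == "insert":
            # Conflict: the row is not written and counts as dirty data.
            dirty.append(row)
        elif write_mode == "replace":
            # Conflict: the old row is dropped and the new row takes its place.
            table[pk] = dict(row)
        else:
            # update: the new data overwrites the matching columns in place.
            table[pk].update(row)
    return dirty


def run_writer(table, records, write_mode="insert", batch_size=1024):
    """Write records batch by batch; return (number of batches, dirty rows)."""
    if write_mode not in ("insert", "replace", "update"):
        raise ValueError(f"unknown write mode: {write_mode!r}")
    batches, dirty = 0, []
    for batch in iter_batches(records, batch_size):
        batches += 1  # one round trip to the database per batch in this model
        dirty.extend(apply_batch(table, batch, write_mode))
    return batches, dirty


existing = {1: {"id": 1, "name": "old", "note": "kept"}}
incoming = [{"id": 1, "name": "new"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
for mode in ("insert", "replace", "update"):
    table = {pk: dict(row) for pk, row in existing.items()}
    batches, dirty = run_writer(table, incoming, write_mode=mode, batch_size=2)
    print(mode, batches, len(dirty), table[1])
```

In this model, 10,000 records take 40 batches at a batch size of 256 and 10 batches at the default of 1,024, which is the reduction in interactions that the batchSize description refers to. The cost is that each batch holds more records in memory at once.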

Configure PolarDB Writer by using the codeless UI

  1. Configure data sources.
    Configure Source and Target for the synchronization node.

    | Parameter | Description |
    | --- | --- |
    | Connection | The name of the data source to which you want to write data. This parameter is equivalent to the datasource parameter that is described in the preceding section. |
    | Table | The name of the table to which you want to write data. This parameter is equivalent to the table parameter that is described in the preceding section. |
    | Statement Run Before Writing | The SQL statement that you want to execute before the synchronization node is run. This parameter is equivalent to the preSql parameter that is described in the preceding section. |
    | Statement Run After Writing | The SQL statement that you want to execute after the synchronization node is run. This parameter is equivalent to the postSql parameter that is described in the preceding section. Example: update table set gmt_modify=now(); |
    | Solution to Primary Key Violation | The write mode. This parameter is equivalent to the writeMode parameter that is described in the preceding section. |
  2. Configure field mappings. This operation is equivalent to setting the column parameter that is described in the preceding section. Fields in the source on the left have a one-to-one mapping with fields in the destination on the right.

    | Operation | Description |
    | --- | --- |
    | Map Fields with the Same Name | Click Map Fields with the Same Name to establish mappings between fields with the same name. The data types of the fields must match. |
    | Map Fields in the Same Line | Click Map Fields in the Same Line to establish mappings between fields in the same row. The data types of the fields must match. |
    | Delete All Mappings | Click Delete All Mappings to remove the mappings that are established. |
    | Auto Layout | Click Auto Layout. Then, the system automatically sorts the fields based on specified rules. |
  3. Configure channel control policies.

    | Parameter | Description |
    | --- | --- |
    | Expected Maximum Concurrency | The maximum number of parallel threads that the synchronization node uses to read data from the source or write data to the destination. You can configure the parallelism for the synchronization node on the codeless UI. |
    | Bandwidth Throttling | Specifies whether to enable bandwidth throttling. You can enable bandwidth throttling and specify a maximum transmission rate to prevent heavy read workloads on the source. We recommend that you enable bandwidth throttling and set the maximum transmission rate to an appropriate value based on the configurations of the source. |
    | Dirty Data Records Allowed | The maximum number of dirty data records allowed. |
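
The three channel control fields appear to correspond to the setting block of the code editor sample in the next section: concurrent, throttle together with mbps, and errorLimit.record. That correspondence is inferred from the comments in the sample rather than stated outright, so treat the following as a sketch. `build_setting` is a hypothetical helper, not part of any DataWorks SDK.

```python
def build_setting(max_concurrency, throttle_mbps=None, dirty_record_limit=""):
    """Translate the channel control fields into a `setting` dict.

    Hypothetical helper. The key names come from the code editor sample;
    the mapping from the codeless UI fields is an assumption.
    """
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    speed = {
        "throttle": throttle_mbps is not None,  # Bandwidth Throttling on or off
        "concurrent": max_concurrency,          # Expected Maximum Concurrency
    }
    if throttle_mbps is not None:
        # mbps takes effect only when throttle is true.
        # The sample writes the rate as a string, so this sketch does the same.
        speed["mbps"] = str(throttle_mbps)
    return {
        # Dirty Data Records Allowed. The sample leaves this as an empty string;
        # what an empty value means is not stated in this topic.
        "errorLimit": {"record": dirty_record_limit},
        "speed": speed,
    }


print(build_setting(max_concurrency=6, throttle_mbps=12))
print(build_setting(max_concurrency=2))
```

Leaving mbps out when throttling is disabled keeps the generated block free of a value that would have no effect.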

Configure PolarDB Writer by using the code editor

For more information about how to configure a synchronization node by using the code editor, see Create a sync node by using the code editor.

In the following code, a synchronization node is configured to write data to PolarDB. For more information about the parameters, see the preceding parameter description.
{
    "type": "job",
    "steps": [
        {
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "parameter": {
                "postSql": [],// The SQL statement that you want to execute after the synchronization node is run. 
                "datasource": "test_005",// The name of the data source. 
                "column": [// The names of the columns to which you want to write data. 
                    "id",
                    "name",
                    "age",
                    "sex",
                    "salary",
                    "interest"
                ],
                "writeMode": "insert",// The write mode. 
                "batchSize": 256,// The number of data records to write at a time. 
                "encoding": "UTF-8",// The encoding format. 
                "table": "PolarDB_person_copy",// The name of the table to which you want to write data. 
                "preSql": []// The SQL statement that you want to execute before the synchronization node is run. 
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "version": "2.0",// The version number. 
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {// The maximum number of dirty data records allowed. 
            "record": ""
        },
        "speed": {
            "throttle": true,// Specifies whether to enable bandwidth throttling. The value false indicates that bandwidth throttling is disabled, and the value true indicates that bandwidth throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true. 
            "concurrent": 6,// The maximum number of parallel threads. 
            "mbps": "12"// The maximum transmission rate. 
        }
    }
}
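
The sample above annotates each key with a // comment, which a strict JSON parser such as Python's json module rejects. If you generate the writer configuration programmatically, one option is to build it as a plain dictionary and serialize it. The following sketch also fills in preSql and postSql as arrays, in line with the statement that the code editor accepts multiple SQL statements. `build_writer_step` is a hypothetical helper, and the SQL statements and the gmt_modify column are illustrative assumptions.

```python
import json


def build_writer_step(datasource, table, columns, write_mode="insert",
                      batch_size=1024, pre_sql=(), post_sql=()):
    """Return the writer step of a synchronization job as a dict.

    Hypothetical helper; the key names mirror the sample above.
    """
    return {
        "parameter": {
            "datasource": datasource,
            "table": table,
            "column": list(columns),
            "writeMode": write_mode,
            "batchSize": batch_size,
            "preSql": list(pre_sql),    # executed before the node is run
            "postSql": list(post_sql),  # executed after the node is run
        },
        "name": "Writer",
        "category": "writer",
    }


step = build_writer_step(
    datasource="test_005",
    table="PolarDB_person_copy",
    columns=["id", "name", "age"],
    write_mode="replace",
    # Illustrative statements only; gmt_modify is an assumed column.
    pre_sql=["DELETE FROM PolarDB_person_copy WHERE age IS NULL"],
    post_sql=["UPDATE PolarDB_person_copy SET gmt_modify = now()"],
)

# json.dumps emits comment-free JSON that round-trips through json.loads.
text = json.dumps(step, indent=4)
print(text)
```

The account that runs the node would also need permission to execute the DELETE and UPDATE statements used here, as noted in the overview.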