This topic describes the data types and parameters that are supported by AnalyticDB for PostgreSQL Writer and how to configure AnalyticDB for PostgreSQL Writer by using the codeless user interface (UI) and code editor.

AnalyticDB for PostgreSQL Writer writes data to AnalyticDB for PostgreSQL databases. AnalyticDB for PostgreSQL Writer connects to a remote AnalyticDB for PostgreSQL database by using Java Database Connectivity (JDBC) and executes an SQL statement to write data to the AnalyticDB for PostgreSQL database.
Note Before you configure AnalyticDB for PostgreSQL Writer, you must configure an AnalyticDB for PostgreSQL data source. For more information, see Add an AnalyticDB for PostgreSQL data source.

Data types

AnalyticDB for PostgreSQL Writer supports most AnalyticDB for PostgreSQL data types. Make sure that the data types of your database are supported.

The following table lists the data type mappings based on which AnalyticDB for PostgreSQL Writer converts data types.

Data Integration data type | AnalyticDB for PostgreSQL data type
LONG | BIGINT, BIGSERIAL, INTEGER, SMALLINT, and SERIAL
DOUBLE | DOUBLE PRECISION, MONEY, NUMERIC, and REAL
STRING | VARCHAR, CHAR, TEXT, BIT, and INET
DATE | DATE, TIME, and TIMESTAMP
BOOLEAN | BOOLEAN
BYTES | BYTEA
Note
  • Data types that are not listed in the preceding table are not supported.
  • The syntax such as a_inet::varchar is required when AnalyticDB for PostgreSQL Writer converts data to the MONEY, INET, or BIT data type.
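For example, the cast mentioned in the note can appear in a column configuration. The following is only a hedged sketch: a_inet is a placeholder name for a column of the INET type, and the exact placement of the cast depends on how your synchronization node is configured.

```json
{
    "column": [
        "id",
        "a_inet::varchar"
    ]
}
```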

Parameters

datasource
    The name of the data source. It must be the same as the name of the added data source. You can add data sources by using the code editor.
    Required: Yes. Default value: No default value.
table
    The name of the table to which you want to write data.
    Required: Yes. Default value: No default value.
writeMode
    The write mode. Valid values:
      • insert: AnalyticDB for PostgreSQL Writer executes the INSERT INTO ... VALUES ... statement to write data to the AnalyticDB for PostgreSQL database. We recommend that you select this mode in most cases.
      • copy: AnalyticDB for PostgreSQL provides the COPY command to copy data between tables and the standard input or standard output. Data Integration supports the COPY FROM statement, which allows you to copy data from a file to a table. We recommend that you use this mode if you encounter performance issues.
        Note If a conflict occurs when data is written in this mode, DataWorks uses the upsert policy specified by the conflictMode parameter to handle the conflict by default.
    Note You can configure the conflictMode parameter to select a policy that handles a primary key conflict or unique index conflict that occurs when data is written.
    Required: No. Default value: insert.
conflictMode
    The policy that handles a primary key conflict or unique index conflict that occurs when data is written. Valid values:
      • report: Data cannot be written to the conflicting rows, and the records that fail to be written to these rows are regarded as dirty data.
      • upsert: Existing data is overwritten.
    Note You can configure this parameter only when you configure a synchronization node by using the code editor.
    Required: No. Default value: report.
column
    The names of the columns to which you want to write data. Separate the names with commas (,), such as "column": ["id", "name", "age"]. If you want to write data to all the columns in the destination table, set this parameter to an asterisk (*), such as "column": ["*"].
    Required: Yes. Default value: No default value.
preSql
    The SQL statement to execute before the synchronization node is run. For example, you can set this parameter to an SQL statement that deletes outdated data. You can specify only one SQL statement on the codeless UI and multiple SQL statements in the code editor.
    Required: No. Default value: No default value.
postSql
    The SQL statement to execute after the synchronization node is run. For example, you can set this parameter to an SQL statement that adds a timestamp. You can specify only one SQL statement on the codeless UI and multiple SQL statements in the code editor.
    Required: No. Default value: No default value.
batchSize
    The number of data records to write at a time. Set this parameter to an appropriate value based on your business requirements. An appropriate value greatly reduces the interactions between Data Integration and AnalyticDB for PostgreSQL and increases throughput. An excessively large value may cause an out of memory (OOM) error during data synchronization.
    Required: No. Default value: 1024.
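Combining the parameters above, a minimal sketch of the writer's parameter block might look like the following. The data source name, table, columns, and SQL statements are placeholders that you would replace with your own values.

```json
{
    "datasource": "my_adb_pg_source",
    "table": "public.person",
    "column": ["id", "name", "age"],
    "writeMode": "copy",
    "conflictMode": "upsert",
    "preSql": ["truncate table public.person"],
    "postSql": [],
    "batchSize": 2048
}
```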

Configure AnalyticDB for PostgreSQL Writer by using the codeless UI

  1. Configure data sources.
    Configure Source and Target for the synchronization node.
    Connection
        The name of the data source to which you want to write data. This parameter is equivalent to the datasource parameter that is described in the preceding section.
    Table
        The name of the table to which you want to write data. This parameter is equivalent to the table parameter that is described in the preceding section.
    Statement Run Before Writing
        The SQL statement to execute before the synchronization node is run. This parameter is equivalent to the preSql parameter that is described in the preceding section.
    Statement Run After Writing
        The SQL statement to execute after the synchronization node is run. This parameter is equivalent to the postSql parameter that is described in the preceding section.
    Write Method
        The write mode. This parameter is equivalent to the writeMode parameter that is described in the preceding section. Valid values: insert and copy.
  2. Configure field mappings. This operation is equivalent to setting the column parameter that is described in the preceding section. Fields in the source on the left have a one-to-one mapping with fields in the destination on the right.
    Map Fields with the Same Name
        Click Map Fields with the Same Name to establish mappings between fields with the same name. The data types of the fields must match.
    Map Fields in the Same Line
        Click Map Fields in the Same Line to establish mappings between fields in the same row. The data types of the fields must match.
    Delete All Mappings
        Click Delete All Mappings to remove the mappings that are established.
    Auto Layout
        Click Auto Layout to automatically sort the fields based on specific rules.
  3. Configure channel control policies.
    Expected Maximum Concurrency
        The maximum number of parallel threads that the synchronization node uses to read data from the source or write data to the destination. You can configure the parallelism for the synchronization node on the codeless UI.
    Bandwidth Throttling
        Specifies whether to enable throttling. You can enable throttling and specify a maximum transmission rate to prevent heavy read workloads on the source. We recommend that you enable throttling and set the maximum transmission rate to an appropriate value based on the configurations of the source.
    Dirty Data Records Allowed
        The maximum number of dirty data records allowed.
    Distributed Execution
        The distributed execution mode splits your node into pieces and distributes them to multiple Elastic Compute Service (ECS) instances for parallel execution, which speeds up synchronization. If you use a large number of parallel threads to run your synchronization node in distributed execution mode, excessive access requests are sent to the data sources. Therefore, before you use the distributed execution mode, evaluate the access load on the data sources. You can enable this mode only if you use an exclusive resource group for Data Integration. For more information, see Exclusive resource groups for Data Integration and Create and use an exclusive resource group for Data Integration.
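In the code editor, these channel control policies correspond to the setting block of the job configuration: errorLimit.record maps to Dirty Data Records Allowed, speed.concurrent to the concurrency, and speed.throttle and speed.mbps to bandwidth throttling. A minimal sketch with placeholder values:

```json
{
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 2,
            "mbps": "1"
        }
    }
}
```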

Configure AnalyticDB for PostgreSQL Writer by using the code editor

For more information about how to configure a synchronization node by using the code editor, see Configure a batch synchronization node by using the code editor.
Note Delete the comments from the following sample code before you run the code.
{
    "type": "job",
    "steps": [
        {
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "parameter": {
                "postSql": [],// The SQL statement that you want to execute after the synchronization node is run. 
                "datasource": "test_004",// The name of the data source. 
                "column": [// The names of the columns to which you want to write data. 
                    "id",
                    "name",
                    "sex",
                    "salary",
                    "age"
                ],
                "table": "public.person",// The name of the table to which you want to write data. 
                "preSql": []// The SQL statement that you want to execute before the synchronization node is run. 
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "version": "2.0",// The version number. 
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {// The maximum number of dirty data records allowed. 
            "record": ""
        },
        "speed": {
            "throttle":true,// Specifies whether to enable bandwidth throttling. The value false indicates that bandwidth throttling is disabled, and the value true indicates that bandwidth throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true. 
            "concurrent":6, // The maximum number of parallel threads. 
            "mbps":"12"// The maximum transmission rate.
        }
    }
}