This topic describes the data types and parameters that PostgreSQL Writer supports and how to configure it by using the codeless user interface (UI) and code editor.

PostgreSQL Writer allows you to write data to a PostgreSQL database. Specifically, PostgreSQL Writer connects to a remote PostgreSQL database by using Java Database Connectivity (JDBC), and executes an SQL statement to write data to the PostgreSQL database.
Note You must configure a PostgreSQL connection before you configure PostgreSQL Writer. For more information, see Configure a PostgreSQL connection.
  • PostgreSQL Writer generates the SQL statement based on the table, column, and writeMode parameters that you specify, and sends the generated SQL statement to the PostgreSQL database. A minimal example of these parameters is sketched after this list.
  • If you set the writeMode parameter to copy, PostgreSQL Writer uses the COPY FROM command instead of INSERT statements to write data to the PostgreSQL database.
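The following is a minimal sketch, not a complete configuration, of a writer parameter block that shows how these parameters determine the generated statement. The connection name, table name, and columns are placeholders rather than values from this topic.
{
    "stepType": "postgresql", // The writer type.
    "parameter": {
        "datasource": "my_pg_connection", // Placeholder connection name.
        "table": "public.orders", // Placeholder destination table.
        "column": ["id", "name", "age"], // Placeholder columns used to build the statement.
        "writeMode": "insert" // insert generates INSERT INTO ... VALUES ...; copy uses COPY FROM.
    },
    "name": "Writer",
    "category": "writer"
}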

Data types

PostgreSQL Writer supports most PostgreSQL data types. Make sure that your data types are supported.

The following table describes the data types that PostgreSQL Writer supports.
Data Integration data type PostgreSQL data type
LONG BIGINT, BIGSERIAL, INTEGER, SMALLINT, and SERIAL
DOUBLE DOUBLE PRECISION, MONEY, NUMERIC, and REAL
STRING VARCHAR, CHAR, TEXT, BIT, and INET
DATE DATE, TIME, and TIMESTAMP
BOOLEAN BOOL
BYTES BYTEA
Note
  • PostgreSQL Writer supports only the data types that are described in the preceding table.
  • You can convert the MONEY, INET, and BIT types by using syntax such as a_inet::varchar.

Parameters

Parameter Description Required Default value
datasource The connection name. It must be the same as the name of the added connection. You can add connections in the code editor. Yes N/A
table The name of the destination table. Yes N/A
writeMode The write mode. Valid values:
  • insert: executes the INSERT INTO...VALUES... statement to write data to the PostgreSQL database. If a primary key conflict or unique index conflict occurs, data cannot be written to the conflicting rows and is regarded as dirty data. We recommend that you select the insert mode.
  • copy: uses the PostgreSQL COPY command, which copies data between a table and standard input or output. Data Integration supports the COPY FROM command, which allows you to copy data from a file to a table. We recommend that you try this mode if performance issues occur.
No insert
column The columns in the destination table to which data is written. Separate the columns with commas (,), for example, "column":["id","name","age"]. To write data to all the columns in the destination table, set the value to an asterisk (*), for example, "column":["*"]. Yes N/A
preSql The SQL statement to execute before the sync node is run. For example, you can clear outdated data before data synchronization. You can execute only one SQL statement on the codeless UI and multiple SQL statements in the code editor. No N/A
postSql The SQL statement to execute after the sync node is run. For example, you can add a timestamp after data synchronization. You can execute only one SQL statement on the codeless UI and multiple SQL statements in the code editor. No N/A
batchSize The number of data records to write at a time. Set this parameter to a proper value based on your business requirements. A proper value can greatly reduce the number of network interactions between Data Integration and the PostgreSQL database and increase throughput. However, an excessively large value may cause an out of memory (OOM) error during data synchronization. For an example that uses this parameter together with preSql and postSql, see the sketch after this table. No 1,024
pgType The PostgreSQL configuration for converting data types. Valid values: bigint[], double[], text[], jsonb, and json. Example:
{
    "job": {
        "content": [{
            "reader": {...},
            "writer": {
                "parameter": {
                    "column": [
                        // The columns in the destination table to which data is written.
                        "bigint_arr",
                        "double_arr",
                        "text_arr",
                        "jsonb_obj",
                        "json_obj"
                    ],
                    "pgType": {
                        // The PostgreSQL configuration for converting data types. In each key-value pair, the key specifies the name of a field in the destination table, and the value specifies the data type of the field.
                        "bigint_arr": "bigint[]",
                        "double_arr": "double[]",
                        "text_arr": "text[]",
                        "jsonb_obj": "jsonb",
                        "json_obj": "json"
                    }
                }
            }
        }]
    }
}
No N/A
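For reference, the following is a minimal sketch of a writer parameter block that uses the writeMode, preSql, postSql, and batchSize parameters together. The table name and SQL statements are placeholders, not values that this topic requires.
"writer": {
    "stepType": "postgresql",
    "parameter": {
        "datasource": "my_pg_connection", // Placeholder connection name.
        "table": "public.orders", // Placeholder destination table.
        "column": ["id", "name", "age"],
        "writeMode": "insert", // insert or copy.
        "preSql": ["delete from orders_tmp"], // Placeholder: clear outdated data before the sync node is run.
        "postSql": ["insert into sync_log values (now())"], // Placeholder: add a timestamp after the sync node is run.
        "batchSize": 1024 // The number of data records to write at a time.
    },
    "name": "Writer",
    "category": "writer"
}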

Configure PostgreSQL Writer by using the codeless UI

  1. Configure the connections.
    Configure the connections to the source and destination data stores for the sync node in the Select data source section.
    Parameter Description
    Data source The datasource parameter in the preceding parameter description. Select a connection type and select the name of a connection that you have configured in DataWorks.
    Table The table parameter in the preceding parameter description.
    Prepare statement before import The preSql parameter in the preceding parameter description. Enter an SQL statement to execute before the sync node is run.
    Complete statement after import The postSql parameter in the preceding parameter description. Enter an SQL statement to execute after the sync node is run.
    Import mode The writeMode parameter in the preceding parameter description. Valid values: insert and copy.
  2. Configure field mapping in the Field Mapping section. This is equivalent to setting the column parameter in the preceding parameter description. Fields in the source table on the left have a one-to-one mapping with fields in the destination table on the right.
    GUI element Description
    The same name mapping Click The same name mapping to establish a mapping between fields with the same name. Note that the data types of the fields must match.
    Peer mapping Click Peer mapping to establish a mapping for fields in the same row. Note that the data types of the fields must match.
    Unmap Click Unmap to remove mappings that have been established.
    Automatic typesetting Click Automatic typesetting to sort the fields based on specified rules.
  3. Configure channel control policies in the Channel control section.
    Parameter Description
    Maximum number of concurrent tasks expected The maximum number of concurrent threads that the sync node uses to read data from or write data to data stores. You can configure the concurrency for the node on the codeless UI.
    Synchronization rate Specifies whether to enable bandwidth throttling. You can enable bandwidth throttling and set a maximum transmission rate to avoid heavy read workload of the source. We recommend that you enable bandwidth throttling and set the maximum transmission rate to a proper value.
    The number of error records exceeds The maximum number of dirty data records allowed.
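These channel control settings correspond to the setting section of a code editor configuration. The following sketch maps each setting to its key, based on the example in the next section; the values are placeholders.
"setting": {
    "errorLimit": {
        "record": "0" // The number of error records exceeds: the maximum number of dirty data records allowed.
    },
    "speed": {
        "throttle": false, // Synchronization rate: specifies whether bandwidth throttling is enabled.
        "concurrent": 1 // Maximum number of concurrent tasks expected.
    }
}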

Configure PostgreSQL Writer by using the code editor

You can configure PostgreSQL Writer by using the code editor. For more information, see Create a sync node by using the code editor.

The following example shows how to configure a sync node to write data to a PostgreSQL database. For more information about the parameters, see the preceding parameter description.
{
    "type":"job",
    "version":"2.0", // The version number.
    "steps":[ 
        {
            "stepType":"stream",
            "parameter":{},
            "name":"Reader",
            "category":"reader"
        },
        {
            "stepType":"postgresql",// The writer type.
            "parameter":{
                "postSql":[],// The SQL statement to execute after the sync node is run.
                "datasource":"// The connection name.
                    "col1",
                    "col2"
                ],
                "table":"",// The name of the destination table.
                "preSql":[]// The SQL statement to execute before the sync node is run.
            },
            "name":"Writer",
            "category":"writer"
        }
    ],
    "setting":{
        "errorLimit":{
            "record":"0" // The maximum number of dirty data records allowed.
        },
        "speed":{
            "throttle":false,// Specifies whether to enable bandwidth throttling. A value of false indicates that the bandwidth is not throttled. A value of true indicates that the bandwidth is throttled. The maximum transmission rate takes effect only if you set this parameter to true.
            "concurrent":1 // The maximum number of concurrent threads.
        }
    },
    "order":{
        "hops":[
            {
                "from":"Reader",
                "to":"Writer"
            }
        ]
    }
}