edit-icon download-icon

Configure PostgreSQL writer

Last Updated: Mar 21, 2018

The PostgreSQL Writer plug‑in provides the ability to write data into PostgreSQL. At the underlying implementation level, PostgreSQL Writer connects to a remote PostgreSQL database through JDBC, and runs the SELECT statements to extract data from PostgreSQL database. On the public cloud, RDS provides PostgreSQL storage engine. You must configure the data source before configuring the PostgreSql Writer plug‑in. For more information, see Configure the PostgreSql Data Source.

In short, PostgreSQL Writer connects to a remote PostgreSQL database through the JDBC connector. The SELECT SQL query statements are generated and sent to the remote PostgreSQL database based on your configuration. Then, the SQL statements run and the returned results are assembled into abstract datasets using the custom data types of Data Integration. Datasets are passed to the downstream writer for processing.

  • PostgreSQL Writer concatenates the table, column, and WHERE information you configured into SQL statements and sends them to the PostgreSQL database.

  • PostgreSQL directly sends the querySql information you configured to the PostgreSQL database.

PostgreSQL Writer supports most data types in PostgreSQL. Check whether your data type is supported.

PostgreSQL Writer converts PostgreSQL data types as follows:

Internal CDP type PostgreSQL data type
Long bigint, bigserial, integer, smallint, serial
Double double precision, money, numeric, real
String varchar, char, text, bit, inet
Date date, time, timestamp
Boolean bool
Bytes bytea

Note:

  • Apart from the field types listed here, other types are not supported.

  • For money, inet, and bit, you must use syntax like a_inet::varchar for conversion.

Parameter description

  • datasource

    • Description: Data source name. It must be identical to the data source name added. Adding data source is supported in script mode.

    • Required: Yes

    • Default value: None

  • table

    • Description: The table selected for synchronization.

    • Required: Yes

    • Default value: None

  • writeMode

    • Description: Select an import mode. “insert” is supported.

      • insert: If the primary key conflicts with the unique index, Data Integration deems the data as dirty data, but the original data is retained.
    • Required: No

    • Default value: insert

  • column

    • Description: The fields of the target table into which data is required to be written. These fields are separated by commas. For example: “column”: [“id”, “name”, “age”]. Use * if it is required to write data into all columns in sequence. For example: “column”: [“*”]

    • Required: Yes

    • Default value: None
  • preSql

    • Description: The SQL statement that runs before the data synchronization task runs. Currently, you can run only one SQL statement in wizard mode, and more than one SQL statement in script mode. For example, clear old data.

    • Required: No

    • Default value: None
  • postSql

    • Description: The SQL statement that runs after the data synchronization task runs. Currently, you can run only one SQL statement in wizard mode, and more than one SQL statement in script mode. For example, add a timestamp.

    • Required: No

    • Default value: None

  • batchSize

    • Description: The quantity of records submitted in one operation. Setting this parameter can greatly reduce the interactions between Data Integration and PostgreSQL, and increase the overall throughout. However, an excessively large value may cause the running process of Data Integration to become out of memory (OOM).

    • Required: No

    • Default value: 1024

Development in wizard mode

TargetPostgreSQL

  • Data Source: datasource in the preceding parameter description. Select the PostgreSQL data source.
  • Table: table in the preceding parameter description. Select the table to be synchronized.

  • Prepared Statement before Import: preSql in the preceding parameter description, namely, the SQL statement that runs before the data synchronization task runs.

  • Prepared Statement after Import: postSql in the preceding parameter description, namely, the SQL statement that runs after the data synchronization task runs.

  • Primary Key Conflict: writeMode in the preceding parameter description. You can select the expected import mode.

  • Field Mapping: column in the preceding parameter description.

Development in script mode

The following is a script configuration sample. For relevant parameters, see Parameter description.

  1. {
  2. "type": "job",
  3. "version": "1.0",
  4. "configuration": {
  5. "reader": {},
  6. "writer": {
  7. "plugin": "postgresql",
  8. "parameter": {
  9. "datasource": "datasourceName",
  10. "table": "table",
  11. "column": ["*"],
  12. "preSql": [
  13. "delete from XXX;"
  14. ]
  15. }
  16. }
  17. }
  18. }
Thank you! We've received your feedback.