PolarDB for PostgreSQL provides the wal2json plug-in to generate logical log files in the JSON format.

Prerequisites

The wal2json plug-in is supported on the PolarDB for PostgreSQL clusters that run the following engines:
  • PostgreSQL 14 (revision version 14.5.1.0 or later)
  • PostgreSQL 11 (revision version 1.1.29 or later)
Note You can execute the following statement to view the minor version that is used by PolarDB for PostgreSQL:
  • PostgreSQL 14
    select version();
  • PostgreSQL 11
    show polar_version;

Background information

wal2json is a logic decoding plug-in that provides the following features:
  • Access tuples generated by executing the INSERT and UPDATE statements.
  • Access UPDATE and DELETE old row versions depending on the specified duplicate identity.
  • Consume changes by using the streaming protocol (logical replication slots) or special SQL APIs.

The wal2json plug-in produces a JSON object for each transaction. All new and old tuples are available in the JSON object. In addition, options include properties such as transaction timestamp, schema-qualified, data type, and transaction ID. For more information, see Execute SQL statements to obtain JSON objects.

Precautions

  • Because PolarDB for PostgreSQL uses the replication method of REPLICA_IDENTITY_FULL, the entire row of data is displayed during UPDATE and DELETE, instead of the columns before and after UPDATE and DELETE. To modify only the columns before and after UPDATE, you must set the polar_create_table_with_full_replica_identity parameter to off. This parameter cannot be modified in the console. You must

    submit a ticket

    to modify the parameter.
  • The wal2json plug-in requires logical encoding and decoding. You must modify the wal_level parameter to replica. This parameter can be modified in the console.

Execute SQL statements to obtain JSON objects

You do not need to execute the create extension statement to create the wal2json plug-in. You can create a logical copy slot to load the wal2json plug-in.

After you create a logical replication slot that contains the wal2json plug-in, execute the following statement to obtain the JSON object in WAL:
-- Create tables with and without primary keys.
CREATE TABLE table2_with_pk (a SERIAL, b VARCHAR(30), c TIMESTAMP NOT NULL, PRIMARY KEY(a, c));
CREATE TABLE table2_without_pk (a SERIAL, b NUMERIC(5,2), c TEXT);

-- Create a logical replication slot of the wal2json type.
SELECT 'init' FROM pg_create_logical_replication_slot('test_slot', 'wal2json');

-- Commit a transaction and write it to WAL.
BEGIN;
INSERT INTO table2_with_pk (b, c) VALUES('Backup and Restore', now());
INSERT INTO table2_with_pk (b, c) VALUES('Tuning', now());
INSERT INTO table2_with_pk (b, c) VALUES('Replication', now());
DELETE FROM table2_with_pk WHERE a < 3;

INSERT INTO table2_without_pk (b, c) VALUES(2.34, 'Tapir');
-- it is not added to stream because there isn't a pk or a replica identity
UPDATE table2_without_pk SET c = 'Anta' WHERE c = 'Tapir';
COMMIT;

-- Obtain the JSON object in WAL.
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL, 'pretty-print', '1');
SELECT 'stop' FROM pg_drop_replication_slot('test_slot');
Sample result:
$ psql -At -f /tmp/example2.sql postgres
CREATE TABLE
CREATE TABLE
init
BEGIN
INSERT 0 1
INSERT 0 1
INSERT 0 1
DELETE 2
INSERT 0 1
UPDATE 1
COMMIT
psql:/tmp/example2.sql:17: WARNING:  table "table2_without_pk" without primary key or replica identity is nothing
{
    "change": [
        {
            "kind": "insert",
            "schema": "public",
            "table": "table2_with_pk",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
            "columnvalues": [1, "Backup and Restore", "2018-03-27 12:05:29.914496"]
        }
        ,{
            "kind": "insert",
            "schema": "public",
            "table": "table2_with_pk",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
            "columnvalues": [2, "Tuning", "2018-03-27 12:05:29.914496"]
        }
        ,{
            "kind": "insert",
            "schema": "public",
            "table": "table2_with_pk",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
            "columnvalues": [3, "Replication", "2018-03-27 12:05:29.914496"]
        }
        ,{
            "kind": "delete",
            "schema": "public",
            "table": "table2_with_pk",
            "oldkeys": {
                "keynames": ["a", "c"],
                "keytypes": ["integer", "timestamp without time zone"],
                "keyvalues": [1, "2018-03-27 12:05:29.914496"]
            }
        }
        ,{
            "kind": "delete",
            "schema": "public",
            "table": "table2_with_pk",
            "oldkeys": {
                "keynames": ["a", "c"],
                "keytypes": ["integer", "timestamp without time zone"],
                "keyvalues": [2, "2018-03-27 12:05:29.914496"]
            }
        }
        ,{
            "kind": "insert",
            "schema": "public",
            "table": "table2_without_pk",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["integer", "numeric(5,2)", "text"],
            "columnvalues": [1, 2.34, "Tapir"]
        }
    ]
}
stop

Parameters

The following table describes the parameters related to wal2json.
ParameterDescription
changeThe WAL entry for each DML, such as the WAL records for INSERT, UPDATE, DELETE, and TRUNCATE.
changesetA set of several changes.
include-xidsSpecifies whether to add xid to each changeset. Default value: false. Valid values:
  • true
  • false
include-timestampSpecifies whether to add a timestamp to each changeset. Default value: false. Valid values:
  • true
  • false
include-schemasSpecifies whether to add a schema to each change. Default value: true. Valid values:
  • true
  • false
include-typesSpecifies whether to add a type to each change. Default value: true. Valid values:
  • true
  • false
include-typmodAdds modifier to types that have it (such as varchar(20) instead of varchar). Default value: true. Valid values:
include-type-oidsSpecifies whether to add the type oids. Default value: false. Valid values:
  • true
  • false
include-not-nullSpecifies whether to add not null information as a columnoptionals. Default value: false. Valid values:
  • true
  • false
pretty-printSpecifies whether to add spaces and indents to the JSON structure for formatting. Default value: false. Valid values:
  • true
  • false
write-in-chunksSpecifies whether to write after each change instead of each changeset. Default value: false. Valid values:
  • true
  • false
include-lsnSpecifies whether to add nextlsn to each changeset. Default value: false. Valid values:
  • true
  • false
filter-tablesExcludes rows from the specified tables. By default, this parameter is empty, indicating that no table is excluded.
Note
  • Separate multiple tables with commas (,). You must specify a schema for each table.
  • *.foo indicates table foo in all schemas, while bar.* indicates all tables in schema bar.
  • Special characters (space, single quote, comma, period, asterisk) must be escaped with backslashes (\).
  • Schemas and tables are case-sensitive.
  • Table Foo bar in schema public must be specified as public.Foo\bar.
add-tablesIncludes only rows from the specified tables. By default, all tables from all schemas are parsed. For more information, see filter-tables.
filter-msg-prefixesExcludes rows with the specified prefix. This parameter is typically used in the pg_logical_slot_peek_changes() function. By default, this parameter is empty, indicating that no row is excluded. Separate multiple prefixes with commas (,).
add-msg-prefixesIncludes rows with the specified prefix. This parameter is typically used in the pg_logical_slot_peek_changes() function. By default, all prefixes are selected. Separate multiple prefixes with commas (,). You must use filter-msg-prefixes before this parameter.
format-versionDefines which output format to use. Default value: 1. Valid values:
  • 1: uses format version 1.
  • 2: uses format version 2.
actionsSpecifies which operations are sent. By default, all operations including INSERT, UPDATE, DELETE, and TRUNCATE are sent. If format version 1 is used, the TRUNCATE operation is not sent.

Examples

include-xids is used in the following examples.

  1. Create a table and a logical replication slots. Insert a row of data.
    DROP TABLE IF EXISTS tbl;
    CREATE TABLE tbl (id int);
    SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'wal2json');
    INSERT INTO tbl VALUES (1);
                            
  2. Enter the parameters and values in the function.
    SELECT    
    count(*) = 1,    
    count(distinct ((data::json)->'xid')::text) = 1
    FROM pg_logical_slot_get_changes(    
    'regression_slot', NULL, NULL,    
    'format-version', '1',    
    'include-xids', '1');

How it works

For more information, see official documentation.