When you use Data Transmission Service (DTS) to migrate or synchronize data to a Kafka cluster, you can select the format in which data records are stored. This topic describes the data formats that are supported by DTS. You can parse data based on the definition of these data formats.

Data formats

DTS allows you to store data written to a Kafka cluster in the following three formats:

  • DTS Avro: A data serialization format into which data structures or objects can be converted to facilitate storage and transmission.
  • SharePlex JSON: The data replication software SharePlex reads the data in the source database and writes the data to the Kafka cluster in the SharePlex JSON format.
  • Canal JSON: Canal parses the incremental logs of the source database and transfers the incremental data to the Kafka cluster in the Canal JSON format.

DTS Avro

DTS Avro is the default data format. Data migrated or synchronized to a Kafka cluster by DTS is stored in the Avro format. You need to parse data based on the schema definition of DTS Avro. For more information, visit GitHub.

SharePlex JSON

Table 1. Parameters
Parameter Description
time The UTC time when the transaction in the database is committed, in the yyyy-MM-ddTHH:mm:ssZ format.
userid The ID of the user who commits the transaction.
op The operation type. Valid values: INSERT, UPDATE, DELETE, TRUNCATE, DROP COLUMN, UPDATE BEFORE, and UPDATE AFTER.
scn The system change number (SCN) that identifies the version of the transaction that the database commits at a specific time. Each committed transaction is assigned a unique SCN.
rowid A relatively unique address value that is used to identify a record in the database.
trans The ID of the transaction.
seq The sequence number of the operation in the transaction, starting from 1.
size The total number of operations in the transaction.
table The name of the table.
idx The index of the operation in the transaction, in the seq/size format. For example, 1/11 indicates that the sequence number of the operation is 1 in the transaction with a total number of 11 operations.
posttime The time when the transaction is committed to the destination database.

Examples:

  • Insert data:
    {
        "meta": {
            "time": "2017-06-16T14:24:34", 
            "userid": 84,                                    
            "op": "ins",                                   
              "scn": "14589063118712",                  
              "rowid": "AAATGpAAIAAItcIAAA",      
            "trans": "7.0.411499",                 
            "seq": 1,                                          
            "size": 11,                                         
            "table": "CL_BIZ1.MIO_LOG",       
              "idx": "1/11",                                       
            "posttime": "2017-06-16T14:33:52"
        },
        "data": {
            "MIO_LOG_ID": "32539737"
         }
    }
  • Update data:
    {
        "meta": {
            "time": "2017-06-16T15:38:13",
            "userid": 84,
            "op": "upd",                             
            "table": "CL_BIZ1.MIO_LOG"
            ...
        },
        "data": {                                          
            "CNTR_NO": "1171201606"
        },
        "key": {                                            
            "MIO_LOG_ID": "32537893",
            "PLNMIO_REC_ID": "31557806",
            "POL_CODE": null,
            "CNTR_TYPE": null,
            "CNTR_NO": "1171201606syui26"
        }
    }
  • Delete data:
    {
        "meta": {
            "time": "2017-06-16T15:51:35",
            "userid": 84,
            "op": "del",                      
         },
        "data": {                                    
            "MIO_LOG_ID": "32539739",
            "PLNMIO_REC_ID": "31557806",
            "POL_CODE": null,
            "CNTR_TYPE": null,
            "CG_NO": null
         }
    }

Canal JSON

Table 2. Parameters
Parameter Description
database The name of the database.
es The time when the operation is performed in the database. The value is a 13-bit UNIX timestamp. Unit: ms.
Note You can use a search engine to obtain a UNIX timestamp converter.
id The serial number of the operation.
isDdl Indicates whether the operation is a data definition language (DDL) operation.
  • true: yes
  • false: no
mysqlType The data type of the field.
old The data before update.
pkNames The name of the primary key.
sql The SQL statement.
sqlType The converted field type. For example, unsigned int is converted to Long, and unsigned long is converted to BigDecimal.
table The name of the table.
ts The time when the operation is written to the destination database. The value is a 13-bit UNIX timestamp. Unit: ms.
Note You can use a search engine to obtain a UNIX timestamp converter.
type The operation type. Valid values: DELETE, UPDATE, and INSERT.

The following script shows an example of UPDATE operation:

  "data": [
    {
      "id": "500000287",
      "shipping_type": null
    }
  ],
  "database": "dbname",
  "es": 1600161894000,
  "id": 58,
  "isDdl": false,
  "mysqlType": {
    "id": "bigint(20)",
    "shipping_type": "varchar(50)"
  },
  "old": [
    {
      "shipping_type": "aaa"
    }
  ],
  "pkNames": [
    "id"
  ],
  "sql": "",
  "sqlType": {
    "id": -5,
    "shipping_type": 12
  },
  "table": "tablename",
  "ts": 1600161894771,
  "type": "UPDATE"
}