DataWorks:GBase8a

Last Updated: Mar 27, 2026

The GBase8a data source lets you read data from and write data to GBase8a. This topic describes the data synchronization capabilities for GBase8a in DataWorks.

GBase8a Reader and GBase8a Writer support:

  • Reading from multiple tables in a single synchronization task

  • Filtering rows with WHERE conditions for incremental synchronization

  • Partitioning large tables by primary key for parallel reads

  • Writing data with pre- and post-execution SQL hooks
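For example, the WHERE-based incremental read listed above is driven by a filter condition that references a scheduling parameter. The following sketch shows a reader parameter fragment for this pattern; the data source, table, and column names are placeholders, and $bizdate is the DataWorks scheduling parameter for the business date:

```json
{
    "datasource": "my_gbase8a_source",
    "column": ["id", "name", "gmt_create"],
    "where": "gmt_create > '$bizdate'",
    "connection": [
        {
            "table": ["orders"],
            "datasource": "my_gbase8a_source"
        }
    ]
}
```

Each scheduled run then reads only the rows created since the business date, instead of performing a full synchronization.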

Limitations

  • GBase8a Reader and GBase8a Writer support Serverless resource groups (recommended) and exclusive resource groups for Data Integration.

  • When an INSERT INTO statement encounters a primary key or unique index conflict, the conflicting rows are not written.

  • Data can be written only to a destination table in the primary database.

  • The task requires at least the INSERT privilege on the destination table. Additional privileges may be required for the statements specified in preSql and postSql.

  • GBase8a Writer does not support the writeMode parameter.

Prerequisites

Add a GBase8a data source to DataWorks before developing a synchronization task. Follow the instructions in Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.

Set up a synchronization task

Configure an offline synchronization task for a single table by using either the codeless UI or the code editor.

For code editor parameter descriptions and script examples, see Appendix: Script examples and parameter descriptions.

Appendix: Script examples and parameter descriptions

The following scripts and parameter tables cover the settings specific to GBase8a Reader and GBase8a Writer. For the unified script format required by the code editor, see Configure a task in the code editor.

Reader script example

{
    "type": "job",
    "steps": [
        {
            "stepType": "gbase8a",
            "parameter": {
                "datasource": "",
                "username": "",
                "password": "",
                "where": "",
                "column": [
                    "id",
                    "name"
                ],
                "splitPk": "id",
                "connection": [
                    {
                        "table": [
                            "table"
                        ],
                        "datasource": ""
                    }
                ]
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "stream",
            "parameter": {
                "print": false,
                "fieldDelimiter": ","
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "version": "2.0",
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    }
}

Reader parameters

  • table (required, no default): The tables from which data is synchronized, specified as a JSON array. Multiple tables can be read in parallel, but all tables must have the same schema. GBase8a Reader does not verify schema consistency across tables. The table parameter must be nested inside the connection configuration block.

  • column (required, no default): The columns to synchronize, specified as a JSON array. Use ["*"] to select all columns. Supports column pruning (selecting specific columns), column reordering (exporting in a different order from the schema), constant values (for example, '123'), and function columns (for example, date('now')). Cannot be blank.

  • datasource (optional, no default): The name of the GBase8a data source added in DataWorks.

  • splitPk (optional, blank by default): The column used to partition data for parallel reads. Use an integer primary key for even data distribution and to avoid data hotspots. Only integer types are supported; if a string, floating-point, or date column is specified, the setting is ignored and data is read over a single channel. Leave blank to disable partitioning.

  • where (optional, no default): A filter condition appended to the SQL query. GBase8a Reader builds a query from column, table, and where to extract data. Use where for incremental synchronization; for example, set it to gmt_create>$bizdate to synchronize the current day's data. If left blank, a full data synchronization is performed.

  • querySql (optional, no default): A custom SQL query that overrides table, column, where, and splitPk. Use this when where alone cannot express the required filter logic. When querySql is set, GBase8a Reader ignores the table, column, where, and splitPk parameters.

  • fetchSize (optional, default 1,024): The number of records fetched from the database per batch. A larger value reduces network round trips and improves read throughput. Note: values greater than 2,048 may cause an out-of-memory (OOM) error during synchronization.
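As a sketch, a reader parameter fragment that combines splitPk and fetchSize for a parallel read of a large table might look like the following; the data source, table, and column names are placeholders:

```json
{
    "datasource": "my_gbase8a_source",
    "column": ["id", "name", "gmt_create"],
    "splitPk": "id",
    "fetchSize": 1024,
    "connection": [
        {
            "table": ["orders"],
            "datasource": "my_gbase8a_source"
        }
    ]
}
```

With splitPk set to an integer primary key, the table is divided into ranges that are read concurrently by the configured number of channels. If querySql were specified instead, the table, column, where, and splitPk settings would all be ignored.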

Writer script example

{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "gbase8a",
            "parameter": {
                "datasource": "Data source name",
                "username": "",
                "password": "",
                "column": [
                    "id",
                    "name"
                ],
                "connection": [
                    {
                        "table": [
                            "Gbase8a_table"
                        ],
                        "datasource": ""
                    }
                ],
                "preSql": [
                    "delete from @table where db_id = -1"
                ],
                "postSql": [
                    "update @table set db_modify_time = now() where db_id = 1"
                ]
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}

Writer parameters

  • datasource (required, no default): The name of the data source added in DataWorks. Must exactly match the name of the added data source.

  • table (required, no default): The destination table for data writes, specified as a JSON array. The table parameter must be nested inside the connection configuration block.

  • column (required, no default): The destination columns to write to, specified as a JSON array with comma-separated entries, for example ["id", "name", "age"]. Cannot be blank.

  • preSql (optional, no default): A SQL statement to run before the data write. Use @table as a placeholder for the destination table name; the system substitutes the actual table name at runtime.

  • postSql (optional, no default): A SQL statement to run after the data write completes.

  • batchSize (optional, default 1,024): The number of records submitted per batch. A larger value reduces network round trips and improves write throughput, but an excessively large value may cause an out-of-memory (OOM) error during synchronization.
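A writer parameter fragment that combines these settings might be sketched as follows. The data source, table, and column names are placeholders, and the preSql statement here illustrates one common pattern: deleting rows for the current business date so that a rerun of the task does not duplicate them:

```json
{
    "datasource": "my_gbase8a_source",
    "column": ["id", "name", "gmt_modified"],
    "batchSize": 1024,
    "preSql": [
        "delete from @table where gmt_modified >= '$bizdate'"
    ],
    "postSql": [
        "update @table set gmt_modified = now() where gmt_modified >= '$bizdate'"
    ],
    "connection": [
        {
            "table": ["orders_copy"],
            "datasource": "my_gbase8a_source"
        }
    ]
}
```

At runtime, @table in both statements is replaced with orders_copy, the preSql runs before the first batch is written, and the postSql runs after the last batch completes.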