DataWorks: Graph Database (GDB)

Last Updated: Mar 26, 2026

A Graph Database data source lets you read data from and write data to Graph Database. This topic describes the data synchronization capabilities for Graph Database in DataWorks.

Prerequisites

Before you develop a synchronization task, make sure you have added GDB as a data source in DataWorks, as described in Add a data source.

Limits

Offline read

  • Configure separate tasks for vertices and edges. Each export task traverses data based on the label names of the vertices or edges being exported.

  • Primary key ID fields for both vertices and edges are of the STRING type. If you configure a numeric type such as LONG, GDB Reader attempts a type conversion. A failed conversion causes the record to be lost.

  • Property values must match the storage class. If they don't match, GDB Reader attempts a type conversion. A failed conversion may cause the record to be lost.

  • When exporting a SET property value from a vertex, the same value is not guaranteed to be exported each time.

  • When all properties are exported in JSON format, a SET property with only one value is output as a regular property.

  • Field names and enumeration values in the examples are case-sensitive unless otherwise specified.

  • The GDB server supports UTF-8 encoding only. All exported data is in UTF-8 format.

  • GDB must be upgraded to version 1.0.20 or later to support SET properties. Verify the instance version before using SET properties.
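
The type-conversion behavior described in the list above can be pictured with a minimal Python sketch. It is an illustration only, not GDB Reader's actual code: the `convert` function and the `CONVERTERS` table are hypothetical names, and the accepted boolean spellings are an assumption.

```python
from typing import Any, Callable, Dict, Tuple


def _to_bool(value: Any) -> bool:
    """Accept real booleans or the strings "true"/"false" in any case (assumed spellings)."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "false"):
        return text == "true"
    raise ValueError(f"not a boolean: {value!r}")


# Hypothetical mapping from the type names used in this topic to Python converters.
CONVERTERS: Dict[str, Callable[[Any], Any]] = {
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "boolean": _to_bool,
    "string": str,
}


def convert(value: Any, type_name: str) -> Tuple[Any, bool]:
    """Try to convert value to the configured type.

    Returns (converted_value, True) on success and (None, False) on failure,
    which models a record that would be dropped or counted as an error.
    """
    try:
        return CONVERTERS[type_name.lower()](value), True
    except (ValueError, TypeError):
        return None, False
```

For example, an ID stored as "42" converts cleanly when the column is configured as LONG, while an ID such as "abc" fails the conversion, which is why keeping ID columns as STRING is the safer choice.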

Offline write

  • Run the vertex sync task first. After it completes successfully, run the edge sync task.

  • Field names and enumeration values in the examples are case-sensitive unless otherwise specified.

  • The GDB server supports UTF-8 encoding only. Source data must also be in UTF-8 format.

Vertex constraints

| Constraint | Details |
| --- | --- |
| Type name | Required. A vertex must have a type name (vertex name) that corresponds to the label. |
| Primary key ID | Required. Must be unique among all vertices and must be of the STRING type. GDB Writer force-converts non-STRING types. |
| idTransRule | If set to none, the vertex ID must be unique among all vertices globally. Choose carefully. |

Edge constraints

| Constraint | Details |
| --- | --- |
| Type name | Required. An edge must have a type name (edge name) that corresponds to the label. |
| Primary key ID | Optional. If specified, it must be globally unique across all edges. If not specified, the GDB server generates a UUID. The type must be STRING; GDB Writer force-converts non-STRING types. |
| idTransRule | If set to none, the edge ID must be unique among all vertices and edges globally. Choose carefully. |
| srcIdTransRule and dstIdTransRule | Required. Must be consistent with the idTransRule used when importing vertices. |
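
The consistency requirement between an edge task and the earlier vertex tasks can be checked before a job runs. The following Python sketch is one possible pre-flight check, not part of DataWorks. The function name `check_edge_rules` and the `vertex_rules` bookkeeping dictionary are hypothetical, and the sketch assumes that srcLabel and dstLabel are set in the edge task.

```python
from typing import Dict, List


def check_edge_rules(vertex_rules: Dict[str, str], edge_parameter: dict) -> List[str]:
    """Compare an edge writer's src/dst ID rules with the rules used for vertices.

    vertex_rules maps a vertex label to the idTransRule used when that label
    was imported (hypothetical bookkeeping kept by the caller).
    edge_parameter is the "parameter" object of the edge writer step.
    Returns a list of human-readable problems; an empty list means consistent.
    """
    problems = []
    for side in ("src", "dst"):
        label = edge_parameter.get(f"{side}Label", "")
        # "none" is the documented default for srcIdTransRule and dstIdTransRule.
        rule = edge_parameter.get(f"{side}IdTransRule", "none")
        expected = vertex_rules.get(label)
        if expected is None:
            problems.append(f"{side}Label {label!r}: no vertex task recorded for this label")
        elif rule != expected:
            problems.append(
                f"{side}IdTransRule is {rule!r} but vertices with label "
                f"{label!r} were imported with {expected!r}"
            )
    return problems
```

An empty result means that both ends of the edge use the same ID rule as the corresponding vertex import.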

Add a data source

Before developing a synchronization task, add GDB as a data source in DataWorks. Follow the instructions in Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.

Develop a data synchronization task

For the entry point and configuration procedure, see the following guides.

Configuration guide for an offline sync task for a single table

Appendix: Script demo and parameter description

Configure a batch synchronization task using the code editor

To configure a batch synchronization task using the code editor, configure the parameters in your script following the unified script format. For more information, see Configure a task in the code editor. The following sections describe the parameters required for GDB data sources.

Reader script demo

GDB Reader exports vertices and edges using separate tasks. All examples use the labelType parameter to specify whether the task targets vertices (VERTEX) or edges (EDGE).

Vertex configuration example

{
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "100"
        },
        "jvmOption": "",
        "speed": {
            "concurrent": 3,
            "throttle": true,
            "mbps": "12"
        }
    },
    "steps": [
        {
            "category": "reader",
            "name": "Reader",
            "parameter": {
                "host": "gdb-xxxxxx.aliyuncs.com",
                "port": 8182,
                "username": "gdb",
                "password": "gdb",
                "labelType": "VERTEX",
                "labels": ["label1", "label2"],
                "column": [
                    {
                        "name": "id",
                        "type": "string",
                        "columnType": "primaryKey"
                    },
                    {
                        "name": "label",
                        "type": "string",
                        "columnType": "primaryLabel"
                    },
                    {
                        "name": "age",
                        "type": "int",
                        "columnType": "vertexProperty"
                    }
                ]
            },
            "stepType": "gdb"
        },
        {
            "category": "writer",
            "name": "Writer",
            "parameter": {
                "print": true
            },
            "stepType": "stream"
        }
    ]
}

Edge configuration example

{
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "100"
        },
        "jvmOption": "",
        "speed": {
            "concurrent": 3,
            "throttle": true,
            "mbps": "12"
        }
    },
    "steps": [
        {
            "category": "reader",
            "name": "Reader",
            "parameter": {
                "host": "gdb-xxxxxx.aliyuncs.com",
                "port": 8182,
                "username": "gdb",
                "password": "gdb",
                "labelType": "EDGE",
                "labels": ["label1", "label2"],
                "column": [
                    {
                        "name": "id",
                        "type": "string",
                        "columnType": "primaryKey"
                    },
                    {
                        "name": "label",
                        "type": "string",
                        "columnType": "primaryLabel"
                    },
                    {
                        "name": "srcId",
                        "type": "string",
                        "columnType": "srcPrimaryKey"
                    },
                    {
                        "name": "srcLabel",
                        "type": "string",
                        "columnType": "srcPrimaryLabel"
                    },
                    {
                        "name": "dstId",
                        "type": "string",
                        "columnType": "dstPrimaryKey"
                    },
                    {
                        "name": "dstLabel",
                        "type": "string",
                        "columnType": "dstPrimaryLabel"
                    },
                    {
                        "name": "weight",
                        "type": "double",
                        "columnType": "edgeProperty"
                    }
                ]
            },
            "stepType": "gdb"
        },
        {
            "category": "writer",
            "name": "Writer",
            "parameter": {
                "print": true
            },
            "stepType": "stream"
        }
    ]
}
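
The two examples above differ mainly in the labelType value and the column array. As a rough illustration, the following Python sketch generates that column array for either case. The helper name `reader_columns` is hypothetical, and the column names (id, label, srcId, and so on) are copied from the examples, not mandated by GDB Reader.

```python
from typing import Dict, List


def reader_columns(label_type: str, properties: Dict[str, str]) -> List[dict]:
    """Build a GDB Reader "column" array for a vertex or an edge export task.

    properties maps a property name to its configured type, for example
    {"age": "int"}. Column names follow the examples in this topic.
    """
    if label_type not in ("VERTEX", "EDGE"):
        raise ValueError(f"labelType must be VERTEX or EDGE, got {label_type!r}")

    # Every export carries the primary key ID and the label.
    columns = [
        {"name": "id", "type": "string", "columnType": "primaryKey"},
        {"name": "label", "type": "string", "columnType": "primaryLabel"},
    ]
    # Edge exports additionally describe both endpoints.
    if label_type == "EDGE":
        for name, role in (
            ("srcId", "srcPrimaryKey"),
            ("srcLabel", "srcPrimaryLabel"),
            ("dstId", "dstPrimaryKey"),
            ("dstLabel", "dstPrimaryLabel"),
        ):
            columns.append({"name": name, "type": "string", "columnType": role})

    property_role = "vertexProperty" if label_type == "VERTEX" else "edgeProperty"
    for name, type_name in properties.items():
        columns.append({"name": name, "type": type_name, "columnType": property_role})
    return columns
```

Calling `reader_columns("VERTEX", {"age": "int"})` yields the three columns of the vertex example, and `reader_columns("EDGE", {"weight": "double"})` yields the seven columns of the edge example.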

Reader script parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| host | The endpoint of the GDB instance. In the Graph Database console, click Manage next to the instance to view the Internal Endpoint. | Yes | None |
| port | The port used to connect to the GDB instance. | Yes | 8182 |
| username | The account name for the GDB instance. | Yes | None |
| password | The password for the GDB instance account. | Yes | None |
| labels | The label names to read. Accepts an array, for example, ["label1", "label2"]. An empty array exports all vertices or edges. | Yes | None |
| labelType | The type of data to read. VERTEX exports vertices; EDGE exports edges. | Yes | None |
| column | The field mapping configuration for the vertex or edge. | Yes | None |
| column.name | The field name for the vertex or edge. For properties, provide the property name. | Yes | None |
| column.type | The type of the field value. Supported types for regular properties: INT, LONG, FLOAT, DOUBLE, BOOLEAN, and STRING. The primary key ID and label are already STRING in GDB; configuring them as any other type forces a conversion that can fail. GDB Reader attempts to convert read data to the configured type; a failed conversion marks the record as an error. | Yes | None |
| column.columnType | The role of the field. See the table below for supported values. | Yes | None |

Supported `columnType` values

| Value | Applies to | Description |
| --- | --- | --- |
| primaryKey | Vertices and edges | The primary key ID. |
| primaryLabel | Vertices and edges | The label name. |
| vertexProperty | Vertices (labelType: VERTEX) | A basic-type property of the vertex. |
| vertexJsonProperty | Vertices (labelType: VERTEX) | All vertex properties packed into a single JSON column. Cannot be combined with other property types in the same column array. See the format below. |
| srcPrimaryKey | Edges (labelType: EDGE) | The primary key ID of the source vertex. |
| dstPrimaryKey | Edges (labelType: EDGE) | The primary key ID of the destination vertex. |
| srcPrimaryLabel | Edges (labelType: EDGE) | The label name of the source vertex. |
| dstPrimaryLabel | Edges (labelType: EDGE) | The label name of the destination vertex. |
| edgeProperty | Edges (labelType: EDGE) | A property of the edge. |
| edgeJsonProperty | Edges (labelType: EDGE) | All edge properties packed into a single JSON column. Cannot be combined with other property types in the same column array. Edges do not support multi-valued properties; there is no c field. |
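
The rules in this table lend themselves to a quick check of a column array before a task is submitted. The Python sketch below is a hypothetical validator, not something DataWorks provides. It assumes that a JSON property column conflicts with any other property column in the same array, which is one reading of the rule above.

```python
from typing import List

COMMON_ROLES = {"primaryKey", "primaryLabel"}
ROLES_BY_LABEL_TYPE = {
    "VERTEX": COMMON_ROLES | {"vertexProperty", "vertexJsonProperty"},
    "EDGE": COMMON_ROLES | {
        "srcPrimaryKey", "dstPrimaryKey",
        "srcPrimaryLabel", "dstPrimaryLabel",
        "edgeProperty", "edgeJsonProperty",
    },
}
JSON_ROLES = {"vertexJsonProperty", "edgeJsonProperty"}


def validate_columns(label_type: str, columns: List[dict]) -> List[str]:
    """Return a list of problems found in a GDB Reader column array."""
    allowed = ROLES_BY_LABEL_TYPE.get(label_type)
    if allowed is None:
        return [f"unknown labelType: {label_type!r}"]

    problems = []
    roles = [column.get("columnType") for column in columns]

    # Each columnType must be valid for the configured labelType.
    for role in roles:
        if role not in allowed:
            problems.append(f"columnType {role!r} is not valid for labelType {label_type}")

    # Assumption: a JSON property column must be the only property column.
    property_roles = [r for r in roles if isinstance(r, str) and r.endswith("Property")]
    if len(property_roles) > 1 and JSON_ROLES & set(property_roles):
        problems.append("a JSON property column cannot be combined with other property columns")
    return problems
```

An empty list means that the array passed both checks; the sketch does not verify names, types, or required columns.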

`vertexJsonProperty` format

{
    "properties": [
        {"k": "name", "t": "string", "v": "tom",  "c": "set"},
        {"k": "name", "t": "string", "v": "jack", "c": "set"},
        {"k": "sex",  "t": "string", "v": "male", "c": "single"}
    ]
}

The name property above is multi-valued (two values). If a multi-valued property in GDB contains only one value, it is exported as a single-valued property.
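
A downstream consumer typically needs to regroup these entries by key. The following Python sketch shows one way to do that, assuming the exported column arrives as a JSON string in the format above. The function name `parse_vertex_properties` is hypothetical, values are kept as the strings found in v, and no conversion based on t is attempted.

```python
import json
from typing import Any, Dict


def parse_vertex_properties(raw: str) -> Dict[str, Any]:
    """Group a vertexJsonProperty payload by property key.

    Entries with "c": "set" are collected into a list; all other entries
    are stored as a single value.
    """
    result: Dict[str, Any] = {}
    for item in json.loads(raw)["properties"]:
        key, value = item["k"], item["v"]
        if item.get("c") == "set":
            # Multi-valued property: accumulate every value under the same key.
            result.setdefault(key, []).append(value)
        else:
            result[key] = value
    return result
```

Applied to the example above, the result maps name to the list ["tom", "jack"] and sex to the single value "male". Because a SET property that holds only one value is exported as single-valued, a consumer cannot rely on the list form to detect SET properties.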

`edgeJsonProperty` format

{
    "properties": [
        {"k": "name", "t": "string", "v": "tom"},
        {"k": "sex",  "t": "string", "v": "male"}
    ]
}

Writer script demo

Vertex configuration example

{
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "100"
        },
        "speed": {
            "throttle": true,
            "concurrent": 3,
            "mbps": "12"
        }
    },
    "steps": [
        {
            "category": "reader",
            "name": "Reader",
            "parameter": {
                "column": ["*"],
                "datasource": "_ODPS",
                "emptyAsNull": true,
                "guid": "",
                "isCompress": false,
                "partition": [],
                "table": ""
            },
            "stepType": "odps"
        },
        {
            "category": "writer",
            "name": "Writer",
            "parameter": {
                "datasource": "testGDB",
                "label": "person",
                "srcLabel": "",
                "dstLabel": "",
                "labelType": "VERTEX",
                "writeMode": "INSERT",
                "idTransRule": "labelPrefix",
                "srcIdTransRule": "none",
                "dstIdTransRule": "none",
                "column": [
                    {
                        "name": "id",
                        "value": "#{0}",
                        "type": "string",
                        "columnType": "primaryKey"
                    },
                    {
                        "name": "person_age",
                        "value": "#{1}",
                        "type": "int",
                        "columnType": "vertexProperty"
                    },
                    {
                        "name": "person_credit",
                        "value": "#{2}",
                        "type": "string",
                        "columnType": "vertexProperty"
                    }
                ]
            },
            "stepType": "gdb"
        }
    ],
    "type": "job",
    "version": "2.0"
}

Edge configuration example

{
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "100"
        },
        "jvmOption": "",
        "speed": {
            "throttle": true,
            "concurrent": 3,
            "mbps": "12"
        }
    },
    "steps": [
        {
            "category": "reader",
            "name": "Reader",
            "parameter": {
                "column": ["*"],
                "datasource": "_ODPS",
                "emptyAsNull": true,
                "guid": "",
                "isCompress": false,
                "partition": [],
                "table": ""
            },
            "stepType": "odps"
        },
        {
            "category": "writer",
            "name": "Writer",
            "parameter": {
                "datasource": "testGDB",
                "label": "use",
                "labelType": "EDGE",
                "srcLabel": "person",
                "dstLabel": "software",
                "writeMode": "INSERT",
                "idTransRule": "labelPrefix",
                "srcIdTransRule": "labelPrefix",
                "dstIdTransRule": "labelPrefix",
                "column": [
                    {
                        "name": "id",
                        "value": "#{0}",
                        "type": "string",
                        "columnType": "primaryKey"
                    },
                    {
                        "name": "id",
                        "value": "#{1}",
                        "type": "string",
                        "columnType": "srcPrimaryKey"
                    },
                    {
                        "name": "id",
                        "value": "#{2}",
                        "type": "string",
                        "columnType": "dstPrimaryKey"
                    },
                    {
                        "name": "person_use_software_time",
                        "value": "#{3}",
                        "type": "long",
                        "columnType": "edgeProperty"
                    },
                    {
                        "name": "person_regist_software_name",
                        "value": "#{4}",
                        "type": "string",
                        "columnType": "edgeProperty"
                    },
                    {
                        "name": "id",
                        "value": "#{5}",
                        "type": "long",
                        "columnType": "edgeProperty"
                    }
                ]
            },
            "stepType": "gdb"
        }
    ],
    "type": "job",
    "version": "2.0"
}

Writer script parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| datasource | The data source name. Must match the name of the data source added in DataWorks. | Yes | None |
| label | The type name (vertex or edge name). Can be read from a source column using #{N}, where N is the 0-based source column index. | Yes | None |
| labelType | The type of the label. VERTEX targets vertices; EDGE targets edges. | Yes | None |
| srcLabel | The source vertex name. Required when label is an edge and srcIdTransRule is not none. Omit for vertices. | No | None |
| dstLabel | The destination vertex name. Required when label is an edge and dstIdTransRule is not none. Omit for vertices. | No | None |
| writeMode | How to handle duplicate IDs during import. INSERT reports an error and increments the error count. MERGE overwrites the old value with the new value. | Yes | INSERT |
| idTransRule | The transform rule for the primary key ID. labelPrefix transforms the mapped value to {label_name}-{source_field}. none uses the mapped value as-is. | Yes | none |
| srcIdTransRule | The transform rule for the source vertex primary key ID. labelPrefix or none. When none, srcLabel is not required. | Required if label is an edge | none |
| dstIdTransRule | The transform rule for the destination vertex primary key ID. labelPrefix or none. When none, dstLabel is not required. | Required if label is an edge | none |
| column | The field mapping configuration for vertices or edges. See the field descriptions below. | Yes | None |
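
To make the effect of idTransRule concrete, here is a minimal Python sketch of the documented transformation. The function name `transform_id` is hypothetical; only the {label_name}-{source_field} pattern comes from the parameter description.

```python
def transform_id(rule: str, label: str, mapped_value: str) -> str:
    """Apply an ID transform rule as described for idTransRule.

    labelPrefix -> "{label}-{mapped_value}"
    none        -> the mapped value unchanged
    """
    if rule == "labelPrefix":
        return f"{label}-{mapped_value}"
    if rule == "none":
        return str(mapped_value)
    raise ValueError(f"unsupported transform rule: {rule!r}")


# A vertex imported with label "person", idTransRule labelPrefix, and source ID "1".
vertex_id = transform_id("labelPrefix", "person", "1")

# An edge that references the same vertex must apply the same rule to its source ID.
matching_src = transform_id("labelPrefix", "person", "1")
mismatched_src = transform_id("none", "person", "1")
```

Here `vertex_id` and `matching_src` are both "person-1", while `mismatched_src` is "1" and would not point at the imported vertex. This is why srcIdTransRule and dstIdTransRule must agree with the idTransRule used for the vertices.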

`column` field descriptions

| Field | Description |
| --- | --- |
| name | The field name of the vertex or edge. |
| value | The mapped value. #{N} maps the Nth source column (0-based). Concatenation is supported: test-#{0} prepends a string, #{0}-#{1} combines two columns, and test-#{0}-test1-#{1}-test2 adds strings at multiple positions. |
| type | The type of the mapped value. The primary key ID accepts STRING only; GDB Writer force-converts other types. Regular properties support: INT, LONG, FLOAT, DOUBLE, BOOLEAN, and STRING. |
| columnType | The role of the mapped field. See the table below for supported values. |
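
The #{N} placeholder syntax can be previewed locally with a few lines of Python. This is a sketch of the substitution as described above, not GDB Writer's implementation; `render_value` is a hypothetical name, and an out-of-range index simply raises IndexError here.

```python
import re
from typing import Sequence

# Matches placeholders such as #{0} or #{12} and captures the column index.
_PLACEHOLDER = re.compile(r"#\{(\d+)\}")


def render_value(template: str, row: Sequence[object]) -> str:
    """Replace every #{N} in template with the Nth (0-based) column of row."""
    return _PLACEHOLDER.sub(lambda match: str(row[int(match.group(1))]), template)


row = ["a", "b"]
prefixed = render_value("test-#{0}", row)                    # "test-a"
combined = render_value("#{0}-#{1}", row)                    # "a-b"
mixed = render_value("test-#{0}-test1-#{1}-test2", row)      # "test-a-test1-b-test2"
```

Previewing a few rows this way can help confirm that concatenated IDs look as intended before the sync task writes them to GDB.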

Supported `columnType` values for Writer

| Value | Applies to | Description |
| --- | --- | --- |
| primaryKey | Vertices and edges | The primary key ID. Required for vertices; optional for edges. |
| vertexProperty | Vertices (labelType: VERTEX) | A regular property of the vertex. |
| vertexJsonProperty | Vertices (labelType: VERTEX) | A JSON property of the vertex. For the value structure, see the properties example below. |
| srcPrimaryKey | Edges (labelType: EDGE) | The primary key ID of the source vertex. |
| dstPrimaryKey | Edges (labelType: EDGE) | The primary key ID of the destination vertex. |
| edgeProperty | Edges (labelType: EDGE) | A regular property of the edge. |
| edgeJsonProperty | Edges (labelType: EDGE) | A JSON property of the edge. For the value structure, see the properties example below. |

`properties` example

{
    "properties": [
        {"k": "name", "t": "string", "v": "tom"},
        {"k": "age",  "t": "int",    "v": "20"},
        {"k": "sex",  "t": "string", "v": "male"}
    ]
}
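
If the source table does not already hold a column in this shape, it can be produced upstream. The Python sketch below builds such a payload from (key, type, value) tuples. The helper name `to_properties_json` is hypothetical, and writing every v as a string is an assumption taken from the example above, where the int value appears as "20".

```python
import json
from typing import Iterable, Tuple


def to_properties_json(props: Iterable[Tuple[str, str, object]]) -> str:
    """Serialize (key, type, value) tuples into the properties JSON structure.

    Values are written as strings, mirroring the example in this topic
    (assumption: the writer converts them according to "t").
    """
    payload = {
        "properties": [
            {"k": key, "t": type_name, "v": str(value)}
            for key, type_name, value in props
        ]
    }
    return json.dumps(payload)


column_value = to_properties_json([
    ("name", "string", "tom"),
    ("age", "int", 20),
    ("sex", "string", "male"),
])
```

The resulting string can then be stored in the source column that is mapped to a vertexJsonProperty or edgeJsonProperty field.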