This topic describes the data types and parameters supported by GDB Reader and how to configure it by using the code editor.

Background information

Graph Database (GDB) is a real-time and reliable online database service that makes it easy to store and navigate the relationships between highly connected datasets. GDB supports the Property Graph model and uses Apache TinkerPop Gremlin as the query language. GDB allows you to build queries that navigate highly connected datasets with improved efficiency.
Note
  • You must configure a GDB connection before you configure GDB Reader. For more information, see Configure a GDB connection.
  • You must configure two sync nodes to synchronize data about vertices and edges separately.

Limits

  • You must configure two sync nodes to synchronize data about vertices and edges separately.
  • The vertices or edges to be synchronized must have labels (names) so that DataWorks can traverse them and obtain the related data.
  • Primary key values in a GDB instance are of the STRING type, so specify STRING as the type of the primary key values that the reader exports. If you specify another data type, such as LONG, GDB Reader forcibly converts the primary key values to the specified type. If the conversion fails, the primary key values are lost.
  • For vertex or edge property values, specify the same data type for the exported values as the original data type in the GDB instance. If you specify a different data type, GDB Reader forcibly converts the property values to the specified type. If the conversion fails, the property values are lost.
  • If you run a sync node multiple times to synchronize the same vertex data, the values obtained for a SET property may differ between runs.
  • If you export all properties in the JSON format, a SET property that contains only one value is regarded as a common (single-value) property.
  • Unless otherwise specified, field names or enumerated values in this topic are case-sensitive.
  • GDB Reader supports only UTF-8 encoding. The exported data must be encoded in UTF-8.
  • Only GDB 1.0.20 or later supports the SET property. Confirm the GDB version before you use the SET property.
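
As an illustration of the type rules above, the following column fragment is a minimal sketch that exports the primary key as STRING and a property with its stored type. The property name age and its INT type are assumptions about the data in the GDB instance, not values that GDB Reader requires.

```json
"column": [
    {
        "name": "id",
        "type": "string",            // Primary key values are STRING in GDB, so export them as string.
        "columnType": "primaryKey"
    },
    {
        "name": "age",               // Hypothetical property name. Replace it with a property in your graph.
        "type": "int",               // Assumes that age is stored as INT in the GDB instance.
        "columnType": "vertexProperty"
    }
]
```

If age held a value that cannot be converted to int, the forced conversion described above would fail and that value would be lost.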

Parameters

Parameter Description Required Default value
host The endpoint for connecting to the GDB instance. To view the endpoint, perform the following steps: Log on to the GDB console. In the left-side navigation pane, click Instances. On the Instances page, find the instance and click Manage in the Action column. On the page that appears, view the endpoint indicated by the Internal Endpoint parameter. Yes N/A
port The port number for connecting to the GDB instance. Yes 8182
username The username for connecting to the GDB instance. Yes N/A
password The password for connecting to the GDB instance. Yes N/A
labels The label, which is the name of the vertex or edge. GDB Reader can read vertices or edges with multiple labels at a time. In this case, specify the labels as an array, for example, ["label1", "label2"]. Yes N/A
labelType The type of the label. Valid values:
  • VERTEX: a vertex.
  • EDGE: an edge.
Yes N/A
column The mappings between the vertices or edges to be synchronized and the exported data. Yes N/A
column -> name The name of the vertex or edge property to be synchronized. This parameter is required when vertex or edge properties are to be synchronized. Yes N/A
column -> type The data type for storing the vertex or edge property to be synchronized.
  • The primary key and label can only be of the STRING type. If you do not set the data type to STRING, data conversion fails.
  • Other properties can be of the INT, LONG, FLOAT, DOUBLE, BOOLEAN, or STRING type.
  • GDB Reader forcibly converts the obtained data to the specified type. If the conversion fails, the data record is lost.
Yes N/A
column -> columnType The category of the vertex or edge property to be synchronized.
  • For both vertices and edges:
    • primaryKey: the primary key.
    • primaryLabel: the label.
  • For vertices:
    • vertexProperty: a common property of the vertex.
    • vertexJsonProperty: a collection of the properties of the vertex, in the JSON format. If you set the columnType parameter to vertexJsonProperty, all properties are listed in this column. Other columns cannot contain any property of the vertex.
      Example of vertexJsonProperty:
      {
          "properties":[
              {"k":"name","t":"string","v":"tom","c":"set"},
              {"k":"name","t":"string","v":"jack","c":"set"},
              {"k":"sex","t":"string","v":"male","c":"single"}
          ]
      }

      The preceding code contains a multi-value property, name, and a single-value property, sex. The name property has two records. Even if sex is defined as a multi-value property, it is regarded as a single-value property in this example because it has only one record.

  • For edges:
    • srcPrimaryKey: the primary key of the start vertex.
    • dstPrimaryKey: the primary key of the end vertex.
    • srcPrimaryLabel: the label of the start vertex.
    • dstPrimaryLabel: the label of the end vertex.
    • edgeProperty: a property of the edge.
    • edgeJsonProperty: a collection of the properties of the edge, in the JSON format. If you set the columnType parameter to edgeJsonProperty, all properties are listed in this column. Other columns cannot contain any property of the edge.
      Example of edgeJsonProperty:
      {
          "properties":[
              {"k":"name","t":"string","v":"tom"},
              {"k":"sex","t":"string","v":"male"}
          ]
      }

      An edge does not support multi-value properties or the c field.

Yes N/A
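
The vertexJsonProperty category described above can be used in a column list as sketched below. This is a minimal example, not a confirmed configuration: the column name properties and the string type are assumptions for illustration.

```json
"column": [
    {"name": "id", "type": "string", "columnType": "primaryKey"},
    {"name": "label", "type": "string", "columnType": "primaryLabel"},
    {
        "name": "properties",              // Hypothetical column name.
        "type": "string",                  // Assumption: the JSON collection is exported as a string.
        "columnType": "vertexJsonProperty" // All vertex properties are exported in this one column.
    }
]
```

Because all properties are listed in the vertexJsonProperty column, no vertexProperty columns are configured alongside it.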

Configure GDB Reader by using the codeless UI

The codeless user interface (UI) is not supported for GDB Reader.

Configure GDB Reader by using the code editor

For more information about how to configure a sync node by using the code editor, see Create a sync node by using the code editor.

In the following code, two sync nodes are configured to read data from a GDB instance. For more information about the parameters, see Parameters.
  • Configure a sync node to read data about vertices from a GDB instance
    {
        "order":{
            "hops":[
                {
                    "from":"Reader",
                    "to":"Writer"
                }
            ]
        },
        "setting":{
            "errorLimit":{
                "record":"100"  // The maximum number of dirty data records allowed.
            },
            "jvmOption":"",
            "speed":{
                "concurrent":3,
                "throttle":false
            }
        },
        "steps":[
            {
                "category":"reader",
                "name":"Reader",
                "parameter":{
                    "host": "gdb-xxxxxx.aliyuncs.com", // The endpoint for connecting to the GDB instance.
                    "port": 8182, // The port number for connecting to the GDB instance.
                    "username": "gdb", // The username for connecting to the GDB instance.
                    "password": "gdb", // The password for connecting to the GDB instance.
                    "labelType": "VERTEX", // The type of the label. The value of VERTEX indicates a vertex.
                    "labels": ["label1", "label2"],  // The labels of the vertices to be synchronized. If this parameter is left empty, all vertices are synchronized.
                    "column": [
                        {
                            "name": "id",               // The name of the vertex property.
                            "type": "string",           // The data type for storing the data to be synchronized.
                            "columnType": "primaryKey"  // The category of the vertex property. The value of primaryKey indicates that the synchronized data is the primary key of the vertex and is of the STRING type in the GDB instance.
                        },
                        {
                            "name": "label",              // The name of the vertex property.
                            "type": "string",             // The data type for storing the data to be synchronized.
                            "columnType": "primaryLabel"  // The category of the vertex property. The value of primaryLabel indicates that the synchronized data is the label of the vertex and is of the STRING type in the GDB instance.
                        },
                        {
                            "name": "age",                   // The name of the vertex property.
                            "type": "int",                   // The data type for storing the data to be synchronized.
                            "columnType": "vertexProperty"   // The category of the vertex property. The value of vertexProperty indicates a common vertex property.
                        }
                    ]
                },
                "stepType":"gdb"
            },
            {
                "category":"writer",
                "name":"Writer",
                "parameter":{
                    "print": true
                },
                "stepType":"stream"
            }
        ]
    }
  • Configure a sync node to read data about edges from a GDB instance
    {
        "order":{
            "hops":[
                {
                    "from":"Reader",
                    "to":"Writer"
                }
            ]
        },
        "setting":{
            "errorLimit":{
                "record":"100"  // The maximum number of dirty data records allowed.
            },
            "jvmOption":"",
            "speed":{
                "concurrent":3,
                "throttle":false
            }
        },
        "steps":[
            {
                "category":"reader",
                "name":"Reader",
                "parameter":{
                    "host": "gdb-xxxxxx.aliyuncs.com", // The endpoint for connecting to the GDB instance.
                    "port": 8182, // The port number for connecting to the GDB instance.
                    "username": "gdb", // The username for connecting to the GDB instance.
                    "password": "gdb", // The password for connecting to the GDB instance.
                    "labelType": "EDGE", // The type of the label. The value of EDGE indicates an edge.
                    "labels": ["label1", "label2"],  // The labels of the edges to be synchronized. If this parameter is left empty, all edges are synchronized.
                    "column": [
                        {
                            "name": "id",               // The name of the edge property.
                            "type": "string",           // The data type for storing the data to be synchronized.
                            "columnType": "primaryKey"  // The category of the edge property. The value of primaryKey indicates that the synchronized data is the primary key of the edge and is of the STRING type in the GDB instance.
                        },
                        {
                            "name": "label",              // The name of the edge property.
                            "type": "string",             // The data type for storing the data to be synchronized.
                            "columnType": "primaryLabel"  // The category of the edge property. The value of primaryLabel indicates that the synchronized data is the label of the edge and is of the STRING type in the GDB instance.
                        },
                        {
                            "name": "srcId",               // The name of the edge property.
                            "type": "string",              // The data type for storing the data to be synchronized.
                            "columnType": "srcPrimaryKey"  // The category of the edge property. The value of srcPrimaryKey indicates that the synchronized data is the primary key of the start vertex and is of the STRING type in the GDB instance.
                        },
                        {
                            "name": "srcLabel",               // The name of the edge property.
                            "type": "string",                 // The data type for storing the data to be synchronized.
                            "columnType": "srcPrimaryLabel"   // The category of the edge property. The value of srcPrimaryLabel indicates that the synchronized data is the label of the start vertex and is of the STRING type in the GDB instance.
                        },
                        {
                            "name": "dstId",                    // The name of the edge property.
                            "type": "string",                   // The data type for storing the data to be synchronized.
                            "columnType": "dstPrimaryKey"       // The category of the edge property. The value of dstPrimaryKey indicates that the synchronized data is the primary key of the end vertex and is of the STRING type in the GDB instance.
                        },
                        {
                            "name": "dstLabel",                 // The name of the edge property.
                            "type": "string",                   // The data type for storing the data to be synchronized.
                            "columnType": "dstPrimaryLabel"     // The category of the edge property. The value of dstPrimaryLabel indicates that the synchronized data is the label of the end vertex and is of the STRING type in the GDB instance.
                        },
                        {
                            "name": "weight",               // The name of the edge property.
                            "type": "double",               // The data type for storing the data to be synchronized.
                            "columnType": "edgeProperty"    // The category of the edge property. The value of edgeProperty indicates a common edge property.
                        }
                    ]
                },
                "stepType":"gdb"
            },
            {
                "category":"writer",
                "name":"Writer",
                "parameter":{
                    "print": true
                },
                "stepType":"stream"
            }
        ]
    }
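
As an alternative to listing each edgeProperty column, the column list of the edge node could export all edge properties in one JSON column by using edgeJsonProperty. The following fragment is a sketch only: the column name properties and the string type are assumptions for illustration and are not confirmed by this topic.

```json
"column": [
    {"name": "id", "type": "string", "columnType": "primaryKey"},
    {"name": "label", "type": "string", "columnType": "primaryLabel"},
    {"name": "srcId", "type": "string", "columnType": "srcPrimaryKey"},
    {"name": "srcLabel", "type": "string", "columnType": "srcPrimaryLabel"},
    {"name": "dstId", "type": "string", "columnType": "dstPrimaryKey"},
    {"name": "dstLabel", "type": "string", "columnType": "dstPrimaryLabel"},
    {
        "name": "properties",            // Hypothetical column name.
        "type": "string",                // Assumption: the JSON collection is exported as a string.
        "columnType": "edgeJsonProperty" // All edge properties are exported in this one column.
    }
]
```

When edgeJsonProperty is used, other columns cannot contain any property of the edge, so no edgeProperty columns appear in this list.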