This topic describes the data types and parameters that are supported by Lindorm Reader and how to configure Lindorm Reader by using the codeless user interface (UI) and code editor.

Background information

Lindorm Reader reads data from tables that are stored in ApsaraDB for Lindorm databases. It connects to a remote ApsaraDB for Lindorm database by using a Java client and calls API operations to read data from tables of the table and wideColumn types. Lindorm Reader then assembles the returned data into abstract datasets of the data types supported by Data Integration and sends the datasets to a writer.
Note
  • The configuration parameter is required for Lindorm Reader. You can go to the ApsaraDB for Lindorm console to obtain the configuration items that are necessary for Data Integration to connect to an ApsaraDB for Lindorm cluster. The configuration data must be in the JSON format, as shown in the example after this note.
  • ApsaraDB for Lindorm is a multimode database. Lindorm Reader reads data from the tables of the table and wideColumn types stored in ApsaraDB for Lindorm databases. For more information about the tables of the table and wideColumn types, see Overview. You can also consult Lindorm engineers on duty by using DingTalk.
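The following is a minimal sketch of the configuration data in the JSON format. It uses the ZooKeeper-related keys from the example in the Parameters section; the actual configuration items depend on your cluster, so obtain them from the ApsaraDB for Lindorm console and replace the ???? placeholders:
    {
        "lindorm.zookeeper.quorum": "????",
        "lindorm.zookeeper.property.clientPort": "????"
    }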

Limits

Lindorm Reader supports only exclusive resource groups for Data Integration, but not shared resource groups or custom resource groups for Data Integration. For more information, see Create and use an exclusive resource group for Data Integration, Use a shared resource group, and Create a custom resource group for Data Integration.

Data types

Lindorm Reader supports most ApsaraDB for Lindorm data types. Make sure that the data types of your database are supported.

The following table lists the data types that are supported by Lindorm Reader.
Category ApsaraDB for Lindorm data type
Integer INT, LONG, and SHORT
Floating point FLOAT and DOUBLE
String STRING
Date and time DATE
Boolean BOOLEAN
Binary BINARYSTRING
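
For a table of the wideColumn type, these data types appear in the columns parameter in the type|column format that is described in the Parameters section. The following sketch is a hypothetical columns configuration that assumes each supported type can be used in this position; the column family f and the qualifier names are examples only:
    [
        "STRING|rowkey",
        "INT|f:age",
        "DOUBLE|f:score",
        "BOOLEAN|f:is_active",
        "DATE|f:birthday",
        "BINARYSTRING|f:avatar"
    ]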

Parameters

Parameter Description Required Default value
configuration The configuration items that are necessary for Data Integration to connect to each ApsaraDB for Lindorm cluster. You can go to the ApsaraDB for Lindorm console to obtain the configuration items and ask the administrator of the ApsaraDB for Lindorm database to convert the configurations to data in the following JSON format: {"key1":"value1","key2":"value2"}.

Example: {"lindorm.zookeeper.quorum":"????","lindorm.zookeeper.property.clientPort":"????"}.

Note If you write the JSON code manually, you must escape each double quotation mark (") in the values as \".
Yes No default value
mode The data read mode. Valid values: FixedColumn and DynamicColumn. Default value: FixedColumn. Yes FixedColumn
tablemode The table type. Valid values: table and wideColumn. Default value: table. You can leave this parameter empty if a table of the table type is used. No This parameter is left empty by default.
table The name of the table from which you want to read data. The table name is case-sensitive. Yes No default value
namespace The namespace of the table from which you want to read data. The namespace of the table is case-sensitive. Yes No default value
encoding The encoding method. Valid values: UTF-8 and GBK. This parameter is used to convert Lindorm byte[] data that is stored in binary mode into strings. No UTF-8
selects The conditions that are used to split the data to be read. If Lindorm Reader reads data from a table of the table type, parallel threads are not supported and a single synchronization thread runs automatically. In this case, you must manually set the selects parameter to split the data. Example:
selects": [
                    "where(compare(\"id\", LESS, 5))",
                    "where(and(compare(\"id\", GREATER_OR_EQUAL, 5), compare(\"id\", LESS, 10)))",
                    "where(compare(\"id\", GREATER_OR_EQUAL, 10))"
                ],
No No default value
columns The columns of the table from which you want to read data. Lindorm Reader allows you to read data from the specified columns in an order different from that specified in the schema of the source table.
  • If Lindorm Reader reads data from a table of the table type in an ApsaraDB for Lindorm database, you need to specify only the column names. The column type information is automatically obtained based on the metadata of the source table and filled in for the destination table. Example:
    For a table of the table type:
    [
        "id",
        "name",
        "age",
        "birthday",
        "gender"
    ]
  • If Lindorm Reader reads data from a table of the wideColumn type stored in an ApsaraDB for Lindorm database, you must specify both the data type and the name of each column in the type|column format. Example:
    For a table of the wideColumn type:
    [
        "STRING|rowkey",
        "INT|f:a",
        "DOUBLE|f:b"
    ]
Yes No default value
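
The following sketch shows how these parameters fit together in a Lindorm Reader step for a table of the table type. This is a minimal, hypothetical configuration: the namespace, table name, columns, and selects ranges are examples only, and the configuration items must be obtained from the ApsaraDB for Lindorm console. Complete job configurations are shown in the following sections.
    {
        "stepType": "lindorm",
        "parameter": {
            "mode": "FixedColumn",
            "configuration": {
                "lindorm.zookeeper.quorum": "????",
                "lindorm.zookeeper.property.clientPort": "????"
            },
            "namespace": "namespace",
            "table": "lindorm_table",
            "encoding": "UTF-8",
            "columns": [
                "id",
                "name",
                "age"
            ],
            "selects": [
                "where(compare(\"id\", LESS, 5))",
                "where(compare(\"id\", GREATER_OR_EQUAL, 5))"
            ]
        },
        "name": "lindormreader",
        "category": "reader"
    }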

Configure Lindorm Reader by using the codeless UI

This method is not supported.

Configure Lindorm Reader by using the code editor

  • For more information about how to configure a job that reads data from a table of the table type stored in an ApsaraDB for Lindorm database to your server by using the code editor, see Create a sync node by using the code editor.
    Note Delete the comments from the code before you run the code.
    {
        "type": "job",
        "version": "2.0",
        "steps": [
            {
                "stepType": "lindorm",
                "parameter": {
                    "mode": "FixedColumn",
                "caching": 128,
                    "configuration": {    // The configuration items that are necessary for Data Integration to connect to each ApsaraDB for Lindorm cluster. The value is in the JSON format.
                        "lindorm.client.username": "",
                        "lindorm.client.seedserver": "seddserver.et2sqa.tbsite.net:30020",
                        "lindorm.client.namespace": "namespace",
                        "lindorm.client.password": ""
                    },
                    "columns": [
                        "id",
                        "name",
                        "age",
                        "birthday",
                        "gender"
                    ],
                    "selects": [
                        "where(compare(\"id\", LESS, 5))",
                        "where(and(compare(\"id\", GREATER_OR_EQUAL, 5), compare(\"id\", LESS, 10)))",
                        "where(compare(\"id\", GREATER_OR_EQUAL, 10))"
                    ],
                    "envType": 1,
                    "datasource": "_LINDORM",
                    "namespace": "namespace",
                    "table": "lindorm_table"
                },
                "name": "lindormreader",
                "category": "reader"
            },
            {
                "stepType": "mysql",
                "parameter": {
                    "postSql": [],
                    "datasource": "_IDB.TAOBAO",
                    "session": [],
                    "envType": 1,
                    "columns": "columns": [
                        "id",
                        "name",
                        "age",
                        "birthday",
                        "gender"
                    ],
             "selects": [
                        "where(compare(\"id\", LESS, 5))",
                        "where(and(compare(\"id\", GREATER_OR_EQUAL, 5), compare(\"id\", LESS, 10)))",
                        "where(compare(\"id\", GREATER_OR_EQUAL, 10))"
                    ],
                    "socketTimeout": 3600000,
                    "guid": "",
                    "writeMode": "insert",
                    "batchSize": 1024,
                    "encoding": "UTF-8",
                    "table": "",
                    "preSql": []
                },
                "name": "Writer",
                "category": "writer"
            }
        ],
        "setting": {
            "jvmOption": "",
            "executeMode": null,
            "errorLimit": {
                "record": "0"
            },
            "speed": {
            // The transmission rate, in Byte/s. Data Integration runs to reach this rate as much as possible but does not exceed it.
            "byte": 1048576
          }
          // The maximum number of dirty data records allowed.
          "errorLimit": {
            // The maximum number of dirty data records allowed. If the value of errorlimit is greater than the maximum value, an error is reported. 
            "record": 0,
            // The maximum percentage of dirty data records. 1.0 indicates 100% and 0.02 indicates 2%.
            "percentage": 0.02
          }
        },
        "order": {
            "hops": [
                {
                    "from": "Reader",
                    "to": "Writer"
                }
            ]
        }
    }
  • For more information about how to configure the job that reads data from a table of the wideColumn type in an ApsaraDB for Lindorm database to your server by using the code editor, see Create a sync node by using the code editor.
    Note Delete the comments from the code before you run the code.
    {
        "type": "job",
        "version": "2.0",
        "steps": [
            {
                "stepType": "lindorm",
                "parameter": {
                    "mode": "FixedColumn",
                    "configuration": {  // The configuration items that are necessary for Data Integration to connect to each ApsaraDB for Lindorm cluster. The value is in the JSON format.
                        "lindorm.client.username": "",
                        "lindorm.client.seedserver": "seddserver.et2sqa.tbsite.net:30020",
                        "lindorm.client.namespace": "namespace",
                        "lindorm.client.password": ""
                    },
                    "columns": "columns": [
                       "STRING|rowkey",
                          "INT|f:a",
                          "DOUBLE|f:b"
                    ],
                    "envType": 1,
                    "datasource": "_LINDORM",
                    "namespace": "namespace",
                    "table": "wideColumn"
                },
                "name": "lindormreader",
                "category": "reader"
            },
            {
                "stepType": "mysql",
                "parameter": {
                    "postSql": [],
                    "datasource": "_IDB.TAOBAO",
                    "session": [],
                    "envType": 1,
                    "column": [
                        "id",
                        "value"
                    ],
                    "socketTimeout": 3600000,
                    "guid": "",
                    "writeMode": "insert",
                    "batchSize": 1024,
                    "encoding": "UTF-8",
                    "table": "",
                    "preSql": []
                },
                "name": "Writer",
                "category": "writer"
            }
        ],
        "setting": {
            "jvmOption": "",
            "executeMode": null,
            "errorLimit": {
                "record": "0"
            },
            "speed": {
            // The transmission rate, in Byte/s. Data Integration runs to reach this rate as much as possible but does not exceed it. 
            "byte": 1048576
          }
            // The maximum number of dirty data records allowed.
            "errorLimit": {
            // The maximum number of dirty data records allowed. If the value of errorlimit is greater than the maximum value, an error is reported. 
            "record": 0,
            // The maximum percentage of dirty data records. 1.0 indicates 100% and 0.02 indicates 2%. 
            "percentage": 0.02
          }
        },
        "order": {
            "hops": [
                {
                    "from": "Reader",
                    "to": "Writer"
                }
            ]
        }
    }