Datahub is a real-time data distribution platform designed to process streaming data. You can publish and subscribe to streaming data in Datahub and distribute the data to other platforms. This allows you to easily analyze streaming data and build applications based on the streaming data.

Datahub Reader reads data from Datahub by using the following version of the Datahub SDK for Java:
<dependency>
    <groupId>com.aliyun.datahub</groupId>
    <artifactId>aliyun-sdk-datahub</artifactId>
    <version>2.9.1</version>
</dependency>

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| endpoint | The endpoint of Datahub. | Yes |
| accessId | The AccessKey ID for accessing Datahub. | Yes |
| accessKey | The AccessKey secret for accessing Datahub. | Yes |
| project | The name of the source Datahub project. A project is the resource management unit in Datahub for resource isolation and control. | Yes |
| topic | The source topic of Datahub. | Yes |
| batchSize | The number of data records to read at a time. Default value: 1024. | No |
| beginDateTime | The start time of data consumption. This parameter defines the left boundary of a left-closed, right-open interval, in the format yyyyMMddHHmmss. The parameter can work with the scheduling time parameter in DataWorks. Note: Specify both the beginDateTime and endDateTime parameters to determine the time range for consuming data. | Yes |
| endDateTime | The end time of data consumption. This parameter defines the right boundary of a left-closed, right-open interval, in the format yyyyMMddHHmmss. The parameter can work with the scheduling time parameter in DataWorks. Note: Specify both the beginDateTime and endDateTime parameters to determine the time range for consuming data. | Yes |
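
To illustrate the left-closed, right-open interval defined by beginDateTime and endDateTime, the following Python sketch builds a pair of yyyyMMddHHmmss strings and checks whether a record time falls inside the window. The helper names consumption_window and in_window are hypothetical and are not part of Datahub or DataWorks. The sketch only shows the interval semantics described above, not how Datahub Reader filters records internally.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Python strftime/strptime equivalent of the yyyyMMddHHmmss pattern.
FMT = "%Y%m%d%H%M%S"


def consumption_window(end: datetime, minutes: int) -> Tuple[str, str]:
    """Return (beginDateTime, endDateTime) strings for the `minutes` before `end`."""
    begin = end - timedelta(minutes=minutes)
    return begin.strftime(FMT), end.strftime(FMT)


def in_window(record_time: datetime, begin: str, end: str) -> bool:
    """Left-closed, right-open check: begin <= record_time < end."""
    return datetime.strptime(begin, FMT) <= record_time < datetime.strptime(end, FMT)


if __name__ == "__main__":
    begin, end = consumption_window(datetime(2018, 9, 10, 11, 16, 14), minutes=4)
    print(begin, end)
    # A record stamped exactly at the left boundary is inside the window.
    print(in_window(datetime(2018, 9, 10, 11, 12, 14), begin, end))
    # A record stamped exactly at the right boundary is outside the window.
    print(in_window(datetime(2018, 9, 10, 11, 16, 14), begin, end))
```

Because the interval is right-open, consecutive windows that share a boundary (the endDateTime of one run equals the beginDateTime of the next) do not overlap.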

Configure Datahub Reader by using the codeless UI

Datahub Reader does not currently support the codeless user interface (UI).

Configure Datahub Reader by using the code editor

The following code configures a node to read data from Datahub.
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "datahubreader",
                    "parameter": {
                        "endpoint": "xxx", // The endpoint of Datahub.
                        "accessId": "xxx", // The AccessKey ID for accessing Datahub.
                        "accessKey": "xxx", // The AccessKey secret for accessing Datahub.
                        "project": "xxx", // The name of the source Datahub project.
                        "topic": "xxx", // The source Datahub topic.
                        "batchSize": 1000, // The number of data records to read at a time.
                        "beginDateTime": "20180910111214", // The start time of data consumption.
                        "endDateTime": "20180910111614", // The end time of data consumption.
                        "column": [
                            "col0",
                            "col1",
                            "col2",
                            "col3",
                            "col4"
                        ]
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "print": false
                    }
                }
            }
        ]
    }
}
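
Before submitting a job, it can help to check the reader parameters against the table in the Parameters section. The following Python sketch is one possible pre-flight check, not a DataWorks feature: the function name check_reader_parameter is hypothetical, and the rule that beginDateTime must be strictly earlier than endDateTime is an assumption drawn from the left-closed, right-open interval (an equal pair would describe an empty range).

```python
from datetime import datetime
from typing import Dict, List

# Parameters marked "Yes" in the Required column of the parameter table.
REQUIRED = (
    "endpoint", "accessId", "accessKey", "project", "topic",
    "beginDateTime", "endDateTime",
)
FMT = "%Y%m%d%H%M%S"  # Python equivalent of yyyyMMddHHmmss


def check_reader_parameter(parameter: Dict[str, object]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = [
        f"missing required parameter: {key}"
        for key in REQUIRED
        if key not in parameter
    ]

    times = {}
    for key in ("beginDateTime", "endDateTime"):
        if key not in parameter:
            continue
        text = str(parameter[key])
        try:
            # Enforce exactly 14 digits, then let strptime reject invalid dates.
            if len(text) != 14 or not text.isdigit():
                raise ValueError(text)
            times[key] = datetime.strptime(text, FMT)
        except ValueError:
            problems.append(f"{key} is not in yyyyMMddHHmmss format: {text}")

    # Assumption: an empty or reversed interval is treated as a configuration error.
    if len(times) == 2 and times["beginDateTime"] >= times["endDateTime"]:
        problems.append("beginDateTime must be earlier than endDateTime")
    return problems


if __name__ == "__main__":
    parameter = {
        "endpoint": "xxx", "accessId": "xxx", "accessKey": "xxx",
        "project": "xxx", "topic": "xxx", "batchSize": 1000,
        "beginDateTime": "20180910111214", "endDateTime": "20180910111614",
    }
    print(check_reader_parameter(parameter))
```

The check works on the already parsed parameter dictionary, so the // comments in the example above would need to be removed before the text is loaded with a standard JSON parser.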