DataHub Reader reads data from DataHub in real time by using the DataHub SDK.
Background information
DataHub Reader keeps running after it is started and reads new data as soon as the
data is written to DataHub. DataHub Reader provides the following features:
- Reads data in real time.
- Reads data concurrently based on the number of shards in DataHub.
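The shard-based concurrency described above can be sketched as follows. This is a minimal illustration that uses only the Python standard library and does not call the DataHub SDK. `InMemoryTopic`, `read_shard`, and `run_readers` are hypothetical names introduced for this example, and the in-memory lists stand in for DataHub shards. The sketch assumes one polling reader per shard, each tracking its own cursor. The actual internals of DataHub Reader may differ.

```python
import queue
import threading
import time


class InMemoryTopic:
    """Hypothetical stand-in for a topic: one append-only list per shard."""

    def __init__(self, shard_count):
        self._shards = [[] for _ in range(shard_count)]
        self._lock = threading.Lock()

    @property
    def shard_count(self):
        return len(self._shards)

    def put(self, shard_id, record):
        with self._lock:
            self._shards[shard_id].append(record)

    def get(self, shard_id, cursor, limit=10):
        """Return up to `limit` records starting at `cursor`, plus the next cursor."""
        with self._lock:
            batch = self._shards[shard_id][cursor:cursor + limit]
        return batch, cursor + len(batch)


def read_shard(topic, shard_id, out, stop, poll_interval=0.01):
    """Poll one shard and forward new records to `out` until `stop` is set."""
    cursor = 0
    while True:
        # Check the stop flag before fetching, so that records written
        # before the stop request are always drained.
        stopping = stop.is_set()
        batch, cursor = topic.get(shard_id, cursor)
        for record in batch:
            out.put((shard_id, record))
        if not batch:
            if stopping:
                return
            time.sleep(poll_interval)


def run_readers(topic, stop):
    """Start one reader thread per shard; return the output queue and threads."""
    out = queue.Queue()
    threads = [
        threading.Thread(target=read_shard, args=(topic, shard_id, out, stop), daemon=True)
        for shard_id in range(topic.shard_count)
    ]
    for thread in threads:
        thread.start()
    return out, threads
```

To try the sketch, create a topic with a few shards, call `run_readers`, write records with `put`, and then set the stop event and join the threads. Records from the same shard arrive on the queue in write order, whereas the order across shards is not guaranteed.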
Procedure
- Go to the DataStudio page.
  - Log on to the DataWorks console.
  - In the left-side navigation pane, click Workspaces.
  - Select the region where the required workspace resides, find the workspace, and then click Data Analytics.
- Move the pointer over the icon and choose the option for creating a real-time sync node. Alternatively, you can click the required workflow, right-click Data Integration, and then choose the same option.
- In the Create Node dialog box, set the Node Name and Location parameters.
  Notice: The node name must be 1 to 128 characters in length and can contain letters, digits,
  underscores (_), and periods (.).
- Click Commit.
- On the configuration tab of the real-time sync node, drag DataHub to the canvas on the right.
- Click the new DataHub node. In the configuration pane that appears, set the required parameters in the
Node configuration section.

Parameter | Description |
Data source | The connection to the DataHub data store. In this example, you can select only a DataHub connection. If no connection is available, click New data source on the right to create one on the page. For more information, see Configure a DataHub connection. |
Topic | The name of the topic from which data is read in DataHub. You can click Data preview on the right to preview the selected topic. |
Output field | The fields from which data is read. |
- Click the icon in the toolbar.
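The node name rule from the Create Node step (1 to 128 characters; letters, digits, underscores, and periods) can be checked before you enter a name in the console. The following is a minimal sketch. `is_valid_node_name` is a hypothetical helper, not part of any DataWorks SDK, and it assumes that "letters" means ASCII letters, which may be narrower than what the console accepts.

```python
import re

# Assumption: only ASCII letters count as "letters". The console may accept more.
_NODE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,128}")


def is_valid_node_name(name: str) -> bool:
    """Return True if `name` satisfies the documented length and character rule."""
    return _NODE_NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_node_name("datahub_reader.v1")` returns `True`, whereas an empty string, a name that contains a space, or a name longer than 128 characters returns `False`.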