A LogHub reader reads data in real time from the LogHub topics that you specify, and it supports shard merging and splitting.

Note: After shards are merged or split, duplicate data records may exist, but no data is lost.
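
Because duplicates are possible after a shard merge or split, a downstream consumer may want to deduplicate records. The following is a minimal sketch, not part of DataWorks or LogHub: the `deduplicate` helper and the `request_id` field are hypothetical, and it assumes each record carries a field that uniquely identifies it.

```python
from typing import Iterable, Iterator


def deduplicate(records: Iterable[dict], key_fields: tuple[str, ...]) -> Iterator[dict]:
    """Yield each record once, keeping the first occurrence of every key.

    Hypothetical helper: key_fields names the fields that are assumed
    to identify a record uniquely.
    """
    seen: set[tuple] = set()
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        if key in seen:
            continue  # same key already emitted, e.g. replayed after a shard split
        seen.add(key)
        yield record


# Illustrative data only; "request_id" is an assumed unique field.
records = [
    {"request_id": "a1", "status": 200},
    {"request_id": "b2", "status": 500},
    {"request_id": "a1", "status": 200},
]
unique = list(deduplicate(records, key_fields=("request_id",)))
```

The set of seen keys grows with the stream, so a long-running consumer would need to bound it, for example by keeping keys only for a limited time window.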

Create a LogHub reader

  1. Log on to the DataWorks console. In the left-side navigation pane, click Workspaces. On the Workspaces page, find the target workspace and click Data Analytics in the Actions column.
  2. On the Data Analytics tab, move the pointer over the Create a sync node icon and choose Data Integration > Real-Time Sync.

    You can also find the target workflow, right-click Data Integration, and choose Create > Real-Time Sync.

  3. In the Create Node dialog box that appears, set Node Name and Location, and then click Commit.
  4. On the configuration tab of the real-time sync node, drag LogHub under Reader to the editing panel.
  5. Click the LogHub node and set parameters in the Node Settings section.
    | Parameter | Description |
    | --- | --- |
    | Connection | The connection to LogHub. In this example, you can only select a LogHub connection. If no connection is available, click Add Connection on the right to create one on the Workspace Manage > Data Source page. |
    | Logstore | The name of the LogHub Logstore from which data is read. You can click Preview on the right to preview the selected Logstore. |
    | Start Offset | The time from which the sync node starts to read data. |
    | Time Zone | The time zone where LogHub resides. |
    | Advanced Settings | Specifies whether to split data in the Logstore. |
    | Output Fields | The fields to read from the Logstore. |
  6. Click Save in the toolbar to save the settings.
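
The Start Offset and Time Zone parameters together identify a single point in time. The sketch below shows how a wall-clock start time and a UTC offset map to a Unix timestamp. It is an illustration only: the `start_offset_to_epoch` helper, the time format, and the assumption that Start Offset is interpreted in the selected time zone do not come from DataWorks.

```python
from datetime import datetime, timedelta, timezone


def start_offset_to_epoch(local_time: str, utc_offset_hours: int) -> int:
    """Convert a wall-clock time in a fixed UTC offset to Unix epoch seconds.

    Hypothetical helper: the "YYYY-MM-DD HH:MM:SS" format is an assumption.
    """
    naive = datetime.strptime(local_time, "%Y-%m-%d %H:%M:%S")
    # Attach the offset without shifting the clock reading.
    aware = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return int(aware.timestamp())


# The same wall-clock time resolves to different instants in different zones.
utc_start = start_offset_to_epoch("2024-01-01 00:00:00", 0)
utc8_start = start_offset_to_epoch("2024-01-01 00:00:00", 8)
print(utc_start - utc8_start)  # difference in seconds between the two instants
```

A mismatch between the chosen time zone and the intended start time shifts the read position by the offset difference, so it is worth checking both fields together.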