Tunnel Service builds on Table Store operations to consume full and incremental data. You can use it to consume and process both historical data and incremental data in tables. The Tunnel client is an automatic data consumption framework for Tunnel Service.

The Tunnel client regularly checks heartbeats to:
  • Detect active channels.
  • Update the status of the Channel and ChannelConnect classes.
  • Initialize, run, and terminate data processing tasks.
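
The effect of one heartbeat on the running tasks can be pictured as a reconciliation step. The following is a minimal sketch, not the internal logic of the Tunnel client: the class HeartbeatReconciler, its methods, and the use of plain string channel IDs are all hypothetical names chosen for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; this class is not part of the Tunnel SDK.
final class HeartbeatReconciler {
    // IDs of the channels that currently have a data processing task.
    private final Set<String> running = new TreeSet<>();

    // Applies the result of one heartbeat: tasks for channels that are no
    // longer active are terminated, and tasks for newly active channels are
    // initialized.
    void onHeartbeat(Set<String> activeChannels) {
        Set<String> toStop = new TreeSet<>(running);
        toStop.removeAll(activeChannels);

        Set<String> toStart = new TreeSet<>(activeChannels);
        toStart.removeAll(running);

        for (String id : toStop) {
            System.out.println("terminate task for channel " + id);
            running.remove(id);
        }
        for (String id : toStart) {
            System.out.println("initialize task for channel " + id);
            running.add(id);
        }
    }

    // Returns a copy so that callers cannot modify the internal state.
    Set<String> runningChannels() {
        return new TreeSet<>(running);
    }

    public static void main(String[] args) {
        HeartbeatReconciler reconciler = new HeartbeatReconciler();

        Set<String> first = new TreeSet<>();
        first.add("channel-1");
        first.add("channel-2");
        reconciler.onHeartbeat(first);

        Set<String> second = new TreeSet<>();
        second.add("channel-2");
        second.add("channel-3");
        reconciler.onHeartbeat(second);

        System.out.println(reconciler.runningChannels());
    }
}
```

In this sketch a channel that stays active across heartbeats keeps its task untouched, and only the difference between two heartbeat results causes work.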

You can use TunnelWorkerConfig to configure the following items for the Tunnel client:

  • The interval and timeout of heartbeats
  • The interval of outputting checkpoints
  • The client tag
  • The custom callback for processing data
  • The configuration of the thread pool for reading and processing data

Configuration details

  • Heartbeats
    • HeartbeatTimeoutInSec: the heartbeat timeout. Default value: 300s. If a heartbeat times out, Tunnel Service considers the current Tunnel client unavailable, and the client must run the ConnectTunnel task again.
    • HeartbeatIntervalInSec: the interval between heartbeats. Default value: 30s. Minimum value: 5s. Heartbeats are used to detect active channels, so a shorter interval detects them sooner. This setting also affects how often channel status is updated and how long it takes to automatically initialize data processing tasks.
  • The interval of outputting checkpoints

    CheckpointIntervalInMillis: the interval at which Tunnel Service outputs checkpoints after data is consumed. Unit: ms. Default value: 5,000 ms.

    Note
    • Reading tasks run on different servers, and various errors may occur during processing. For example, a server may restart due to environmental factors. For this reason, Tunnel Service regularly outputs checkpoints after processing data, and a task resumes from the last checkpoint after a restart. In exceptional conditions, Tunnel Service may synchronize the same data, in sequence, more than once. Make sure that your processing logic can handle repeated data.
    • To reduce the amount of data that is reprocessed after an error, you can output checkpoints more frequently. However, overly frequent checkpoints may reduce system throughput. Determine the checkpoint frequency based on your business requirements.
  • ClientTag: the custom client tag that is used to generate a Tunnel client ID and differentiate Tunnel clients.
  • Custom callbacks for processing data

    ChannelProcessor: the callback that you register to process data. It includes the process and shutdown methods.

  • Configuration of resources in the thread pool for reading and processing data
    • ReadRecordsExecutor: the thread pool resource used to read data. Use the default configuration if you have no special requirements.
    • ProcessRecordsExecutor: the thread pool resource used to process data. Use the default configuration if you have no special requirements.
    Note
    • When you customize the thread pool, we recommend that you set the number of threads in the pool to the number of channels in the tunnel. In this way, each channel can promptly obtain compute resources such as CPU.
    • To guarantee throughput, the default configuration follows these rules:
      • 32 core threads are allocated in advance to guarantee real-time throughput when the data volume or the number of channels is small.
      • The queue length is kept short so that the thread creation policy of the pool is triggered quickly and more compute resources are brought in promptly when the data volume or the number of channels is large.
      • The thread keep-alive time is set to 60s so that thread resources are recycled after the data volume decreases.
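
The thread pool rules above can be sketched with a plain java.util.concurrent.ThreadPoolExecutor. This is a minimal sketch, not the actual default of the SDK: the class name TunnelPoolSketch, the maximum pool size of 64, the queue length of 16, and the CallerRunsPolicy rejection handler are assumptions made for illustration. Only the 32 core threads and the 60s keep-alive time come from the text.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this class is not part of the Tunnel SDK.
final class TunnelPoolSketch {

    // Builds a pool that follows the three rules: core threads started in
    // advance, a short bounded queue, and a 60s keep-alive time for the
    // threads created beyond the core size.
    static ThreadPoolExecutor newPool(int coreThreads, int maxThreads, int queueLength) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads,
                maxThreads,
                60L, TimeUnit.SECONDS,
                // A short queue fills quickly, which makes the pool create
                // extra threads sooner under load.
                new ArrayBlockingQueue<>(queueLength),
                // Assumed rejection policy: the submitting thread runs the
                // task itself when both the queue and the pool are full.
                new ThreadPoolExecutor.CallerRunsPolicy());
        // Start all core threads now instead of on first task submission.
        pool.prestartAllCoreThreads();
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool(32, 64, 16);
        for (int i = 0; i < 100; i++) {
            final int batch = i;
            pool.execute(() -> System.out.println("processed batch " + batch));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

A ThreadPoolExecutor with a bounded queue creates threads beyond the core size only after the queue is full, which is why a shorter queue brings in additional threads earlier. When the load drops, the extra threads that stay idle for longer than the keep-alive time are terminated.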