IoT Platform: Configure Data Aggregation

Last Updated: Oct 31, 2024

The data aggregation node works like a window function in Flink SQL: it aggregates the messages in a parsing task over a time window. The aggregated data can then be analyzed further or sent to an output.

Instructions

The data aggregation node currently supports the tumbling time window function of Flink SQL, identified as TUMBLE. For more information, see TUMBLE.

The data aggregation node offers fixed-length time windows of 10s, 15s, 30s, 1min, 5min, 15min, and 30min.
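
As a rough illustration of how a tumbling window assigns each message to exactly one fixed-length, non-overlapping window, the following Python sketch maps an event time to its window bounds. It assumes that windows are aligned to the Unix epoch, which is how Flink's TUMBLE behaves by default; whether the node uses the same alignment is an assumption. The names `WINDOW_LENGTHS_S` and `tumbling_window` are hypothetical and are not part of IoT Platform.

```python
# Window lengths offered by the data aggregation node, in seconds.
WINDOW_LENGTHS_S = {
    "10s": 10, "15s": 15, "30s": 30,
    "1min": 60, "5min": 300, "15min": 900, "30min": 1800,
}


def tumbling_window(event_time_s: int, window: str) -> tuple[int, int]:
    """Return the [start, end) bounds, in epoch seconds, of the window
    that contains event_time_s. Assumes epoch-aligned windows."""
    size = WINDOW_LENGTHS_S[window]
    start = event_time_s - event_time_s % size  # round down to a multiple of size
    return start, start + size
```

Because the windows do not overlap, a message with event time 125 and a 1min window falls only into the window that starts at 120 and ends before 180.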

Note
  • If you need a time window longer than 30 minutes, use the SQL analysis workbench in Analysis Insight with hourly scheduling. For more information, see Step 3: Set a job scheduling policy and publish.

  • For custom requirements, please contact technical support to submit your request.

Scenarios

Consider a park energy-saving system that aims to reduce energy consumption and costs. One subtask is to find, every minute, the highest air conditioner temperature (temperature) reported for each meeting room. In this case, configure a data aggregation node with room ID (roomId) as the grouping field, a window length of 1 minute, and MAX as the aggregate operation. The node outputs the room ID (roomId) and the highest temperature per minute (max_temperature).
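
The scenario can be sketched in a few lines of Python to show what the node computes conceptually. This is a simplified illustration, not the platform's implementation. The `ts` field (event time in epoch seconds), the `window_start` output field, and the sample messages are assumptions added for the example; only roomId, temperature, and max_temperature come from the scenario.

```python
WINDOW_S = 60  # 1-minute window length from the scenario

# Hypothetical input messages; "ts" is an assumed event time in epoch seconds.
SAMPLE_MESSAGES = [
    {"roomId": "A", "ts": 0, "temperature": 24.0},
    {"roomId": "A", "ts": 30, "temperature": 26.5},
    {"roomId": "B", "ts": 45, "temperature": 22.0},
    {"roomId": "A", "ts": 61, "temperature": 25.0},
]


def max_temperature_per_room(messages):
    """Group messages by (roomId, window start) and keep the highest temperature."""
    maxima = {}
    for msg in messages:
        window_start = msg["ts"] - msg["ts"] % WINDOW_S
        key = (msg["roomId"], window_start)
        if key not in maxima or msg["temperature"] > maxima[key]:
            maxima[key] = msg["temperature"]
    # window_start is included only to tell the windows apart in this sketch.
    return [
        {"roomId": room, "window_start": start, "max_temperature": temp}
        for (room, start), temp in sorted(maxima.items())
    ]
```

Calling `max_temperature_per_room(SAMPLE_MESSAGES)` yields one record per room and window, each carrying the grouping field and the aggregation result.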

Prerequisites

Make sure that the data computing expressions or data filtering conditions are configured. For more information, see Configure data computing and data filtering.

Procedure

  1. Access the data parsing console.

  2. On the canvas in the middle of the page, click the add icon after the current node.

  3. In the node list that appears, click the Data Aggregation node.

  4. Click the Data Aggregation node on the canvas. In the configuration panel on the right, set the data aggregation parameters as described in the following table.

    Configuration Item

    Parameter

    Description

    Example

    Basic Configuration

    Grouping Field

    Select the field by which data is partitioned in the window function. This field determines the aggregation granularity. Choose a field whose value is not affected by the window aggregation logic, such as product key (ProductKey) or device name (DeviceName).

    Refer to "Scenarios": To calculate the highest temperature for each meeting room, choose room ID (roomId) as the grouping field.

    Window Length

    Choose the window length from the following options: 10s, 15s, 30s, 1min, 5min, 15min, 30min.

    • For windows longer than 1 hour, use hourly scheduling in the SQL analysis workbench of Analysis Insight. For more information, see Step 3: Set a job scheduling policy and publish.

    • For custom requirements, please contact technical support to submit your request.

    Refer to "Scenarios": 1min.

    Aggregated Field List

    Configure the following:

    • Aggregated Field: Choose a numeric field for aggregation operations.

    • Aggregation Result Field Name: The name can contain only letters, digits, and underscores (_), cannot start with a digit, cannot duplicate an existing field name, and can be up to 30 characters long.

    • Aggregate Operation: Choose from SUM, COUNT, MIN, MAX, AVG.

    Refer to "Scenarios":

    • Aggregated Field: Select air conditioner temperature (temperature).

    • Aggregation Result Field Name: Enter max_temperature (highest temperature per minute).

    • Aggregate Operation: Select MAX.

    Advanced Configuration

    Latency Toleration

    Unit of measurement: seconds.

    Out-of-order messages can arrive after their window has ended. Setting this parameter allows such delayed data to still be assigned to the window it belongs to, similar to the watermark mechanism in Flink SQL. For more information, see Time attributes.

    Use the default setting.

    Below is a specific configuration example for "Scenarios":

  5. Click Save in the upper-right corner of the data parsing workbench to save the data aggregation node configuration.

    Important

    The output of this node contains only the grouping fields and the aggregation result fields. Other fields from preceding nodes are not passed through this node.
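
To make the Latency Toleration parameter more concrete, the following Python sketch models a watermark-style lateness bound on a tumbling MAX aggregation. It is a toy model under stated assumptions, not the platform's or Flink's implementation: the watermark is taken as the largest event time seen so far minus the tolerance, a window is emitted once the watermark reaches its end, and messages for already emitted windows are dropped. The class `TumblingMax` and its attributes are hypothetical names.

```python
class TumblingMax:
    """Toy tumbling-window MAX aggregator with a lateness tolerance in seconds."""

    def __init__(self, window_s: int, tolerance_s: int):
        self.window_s = window_s
        self.tolerance_s = tolerance_s
        self.watermark = float("-inf")  # assumed: max event time seen minus tolerance
        self.open_windows = {}          # window start -> current maximum
        self.emitted = []               # (window start, maximum) in emission order
        self.dropped = 0                # messages that arrived too late

    def on_message(self, event_time_s: int, value: float) -> None:
        start = event_time_s - event_time_s % self.window_s
        if start + self.window_s <= self.watermark:
            # The window was already emitted, so this message is too late.
            self.dropped += 1
            return
        self.open_windows[start] = max(value, self.open_windows.get(start, value))
        # Advance the watermark, then emit every window that it has passed.
        self.watermark = max(self.watermark, event_time_s - self.tolerance_s)
        for s in sorted(self.open_windows):
            if s + self.window_s <= self.watermark:
                self.emitted.append((s, self.open_windows.pop(s)))
```

In this model, with a 60-second window and a tolerance of 10 seconds, a message with event time 58 that arrives after a message with event time 65 is still counted in the window that starts at 0. With a tolerance of 0, the same message is dropped because that window has already been emitted.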

What to do next

After you configure the data aggregation node, set up other processing nodes or configure the target node to complete the parsing task.