This topic provides a use case to describe how to use Realtime Compute to analyze data from IoT sensors in multiple dimensions.

Background

With the economic tidal wave of globalization sweeping over the world, industrial manufacturers are facing increasingly fierce competition. To increase competitiveness, manufacturers in the automotive, aviation, high-tech, food and beverage, textile, and pharmaceutical industries must innovate and replace the existing infrastructure. These industries have to address many challenges during the innovation process. For example, the existing traditional devices and systems have been used for decades, which results in high maintenance costs. However, replacing these systems and devices may slow down the production process and compromise the product quality.

These industries face two additional challenges, which are high security risks and the urgent need for complex process automation. The manufacturing industry has prepared to replace the existing traditional devices and systems. In this industry, high reliability and availability systems are needed to ensure the safety and stability of real-time operations. A manufacturing process involves a wide range of components, such as robotic arms, assembly lines, and packaging machines. This requires remote applications that can seamlessly integrate each stage of the manufacturing process, including the deployment, update, and end-of-life management of devices. The remote applications also need to handle failover issues.

Another requirement for these next-generation systems and applications is that they be able to capture and analyze the large amounts of data generated by devices, and respond appropriately in a timely manner. To increase competitiveness and accelerate development, manufacturers need to optimize and upgrade their existing systems and devices. The application of Realtime Compute and Alibaba Cloud IoT solutions allows you to analyze device running information, detect faults, and predict yield rates in real time. This topic describes a use case as an example. In this use case, a manufacturer uses Realtime Compute to analyze the large amounts of data collected from sensors in real time. Realtime Compute is also used to cleanse and aggregate data in real time, write data into an online analytical processing (OLAP) system in real time, and monitor the key metrics of devices in real time.

Business description

In this use case, the manufacturer has more than 1,000 devices from multiple factories in many cities. Each device is equipped with 10 types of sensors. These sensors send the collected data every 5 seconds to Log Service. The data collected from each sensor follows the format described in the following table.

s_id s_value s_ts
The ID of the sensor. The current value from the sensor. The time when the data was sent.

The sensors are distributed across devices from multiple factories. The manufacturer creates an RDS dimension table to display the distribution of sensors across devices and factories.

s_id s_type device_id factory_id
The ID of the sensor. The type of the sensor. The ID of the device. The ID of the factory.

The information included in this dimension table is stored in the RDS system. The manufacturer needs to organize the data from sensors based on this dimension table, and sort the data by device. To meet this need, Realtime Compute provides a wide table where the data sent from sensors is logically organized by device every minute.

ts device_id factory_id device_temp device_pres
The time when the data was sent. The ID of the device. The ID of the factory. The temperature of the device. The pressure of the device.

To simplify the computing logic and real-time stream computing, we use only two types of sensors in this use case: temperature and pressure. The computing logic is described as follows:

  1. Realtime Compute identifies the devices whose temperatures are higher than 80°C and triggers alerts at the downstream nodes. In this use case, Realtime Compute sends the data of the identified devices to Message Queue. Message Queue then triggers alerts that the manufacturer has specified in the downstream alerting system.

  2. Realtime Compute writes the data into an OLAP system. In this use case, the manufacturer uses HybridDB for MySQL. To integrate with HybridDB for MySQL, the manufacturer has developed a set of business intelligence (BI) applications for multidimensional data display and analysis.

The following figure shows the overall architecture.



FAQ

  • How is a wide table created?

    In most cases, each sensor only collects the IoT data of one dimension. This poses challenges for subsequent data processing and analysis. To create a wide table, Realtime Compute aggregates data based on windows and organizes data by dimension.

  • Why is Message Queue used to trigger alerts?

    Realtime Compute allows you to write data into any type of storage system. We recommend that you use message storage systems like Message Queue for sending alerts and notifications. This is because the application of these systems helps to prevent the errors encountered by user-defined alerting systems. These errors may cause failures to report certain alerts and notifications.

Code description

Send the data uploaded from sensors to Log Service. The data format of a row is shown as follows:

{
    "sid": "t_xxsfdsad", 
    "s_value": "85.5", 
    "s_ts": "1515228763"
}
			

Define a Log Service source table s_sensor_data.

CREATE TABLE s_sensor_data (
    s_id    VARCHAR,
    s_value VARCHAR,
    s_ts    VARCHAR,
    ts        AS CAST(FROM_UNIXTIME(CAST(s_ts AS BIGINT)) AS TIMESTAMP),
      WATERMARK FOR ts AS withOffset(ts, 10000)
) WITH (
    TYPE='sls',
    endPoint ='http://cn-hangzhou-corp.sls.aliyuncs.com',
      accessId ='yourAccessId',
      accessKey ='yourAccessSecret',
      project ='ali-cloud-streamtest',
      logStore ='stream-test',
);
			

Create an RDS dimension table d_sensor_device_data. This dimension table stores the mappings between sensors and devices.

CREATE TABLE d_sensor_device_data (
    s_id    VARCHAR,
    s_type    VARCHAR,
    device_id BIGINT,
    factory_id BIGINT,
    PERIOD FOR SYSTEM_TIME,
    PRIMARY KEY(s_id)
) WITH (
    TYPE='RDS',
     url='yourDatabaseURL',
     tableName='test4',
     userName='test',
     password='yourDatabasePassword'
);
			

Create a Message Queue result table r_monitor_data. This table specifies the logic for triggering alerts.

CREATE TABLE r_monitor_data (
    ts    VARCHAR,
    device_id    BIGINT,
    factory_id    BIGINT,
    device_TEMP    DOUBLE,
    device_PRES DOUBLE
) WITH (
    TYPE='MQ'
);
			

Create a HybridDB for MySQL result table r_device_data.

CREATE TABLE r_device_data (
    ts    VARCHAR,
    device_id BIGINT,
    factory_id BIGINT,
    device_temp    DOUBLE,
    device_pres DOUBLE,
    PRIMARY KEY(ts, device_id)
) WITH (
    TYPE='HybridDB'
);
			

Aggregate the data collected from sensors by minute and create a wide table based on the aggregated data. To help you understand structured code and facilitate code maintenance, we create views in this use case.

// Create a view to obtain the device and factory mapping each sensor.
CREATE VIEW v_sensor_device_data
AS
SELECT
    s.ts,
    s.s_id,
    s.s_value,
    d.s_type,
    d.device_id,
    d.factory_id
FROM
    s_sensor_data s
JOIN
    d_sensor_device_data FOR SYSTEM_TIME AS OF PROCTIME() as d
ON
    s.s_id = d.s_id;

// Create a wide table.
CREATE VIEW v_device_data
AS
SELECT
    // Specify the start time of a tumbling window as the time for the record.
    CAST(TUMBLE_START(v.ts, INTERVAL '1' MINUTE) AS VARCHAR) as ts,
    v.device_id,
    v.factory_id,
    CAST(SUM(IF(v.s_type = 'TEMP', v.s_value, 0)) AS DOUBLE)/CAST(SUM(IF(v.s_type = 'TEMP', 1, 0)) AS DOUBLE) device_temp, // Compute the average temperature by minute.
    CAST(SUM(IF(v.s_type = 'PRES', v.s_value, 0)) AS DOUBLE)/CAST(SUM(IF(v.s_type = 'PRES', 1, 0)) AS DOUBLE) device_pres // Compute the average pressure by minute.
FROM
    v_sensor_device_data v
GROUP BY
    TUMBLE(v.ts, INTERVAL '1' MINUTE), v.device_id, v.factory_id;    

In the preceding core computing logic, the average temperature and pressure by minute are computed as the output. Because a tumbling window is used here, a new data record is generated every minute. Then, filter and write the generated data into the Message Queue result table and HybridDB for MySQL result table.

// Identify the sensors whose temperatures are higher than 80°C and write the data into the Message Queue result table to trigger alerts.
INSERT INTO r_monitor_data
SELECT
    ts,
    device_id,
    factory_id,
    device_temp,
    device_pres
FROM
    v_device_data
WHERE
    device_temp > 80.0;

// Write the result data into the HybridDB for MySQL result table for analysis.
INSERT INTO r_device_data
SELECT
    ts,
    device_id,
    factory_id,
    device_temp,
    device_pres
FROM
    v_device_data;    

Demo code and source code

The Alibaba Cloud team has created demo code that contains a complete link applicable to the preceding IoT solution.

  • Use the text information collected by Logtail from ECS as data in the source table.
  • Create an RDS dimension table.
  • Create a Message Queue result table and a HybridDB for MySQL result table.

Refer to the complete demo code to register input and output data to develop your own IoT solution. Click Demo code to download the demo code.