Best Practices for IoT Platform Operation Monitoring

1. Introduction to monitoring and alarm functions
The metrics that the IoT Platform exposes through Cloud Monitor fall into two categories: system event alarms and threshold alarms. System event alarms are mainly based on performance indicators of the IoT Platform itself; threshold alarms are mainly based on changes in customer business metrics.

Cloud Monitor Console: https://cloudmonitor.console.aliyun.com/#/alarmservice/product=&searchValue=&searchType=&searchProduct=


2. IoT Platform monitoring configuration in practice

2.1 System event alarm
As a public cloud product, the Alibaba Cloud IoT Platform imposes usage limits on metrics such as device connection frequency, data reporting frequency, downlink command frequency, and message forwarding frequency. Some of these limits are shown in the figure below:

For the complete list of usage limits, see: https://help.aliyun.com/document_detail/30527.html

When a usage limit is triggered, the IoT Platform throttles traffic, which affects the normal operation of our business. With Cloud Monitor, we can detect such anomalies as soon as they occur and adjust the business accordingly.
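One possible business adjustment is to throttle requests on the client side so that the account stays below a platform limit. The following is a minimal sketch of a token-bucket limiter, not part of the IoT Platform SDK. The class name `TokenBucket` and the rate and capacity values are hypothetical; take the real limits from the usage limit document linked above.

```python
import time
from typing import Callable


class TokenBucket:
    """Client-side limiter: allows at most `rate` requests per second on average."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate            # tokens added per second (hypothetical value)
        self.capacity = capacity    # maximum burst size (hypothetical value)
        self._clock = clock         # injectable so the limiter can be tested
        self._tokens = capacity     # start with a full bucket
        self._last = clock()

    def try_acquire(self) -> bool:
        """Return True if a request may be sent now, False if it should wait."""
        now = self._clock()
        # Refill in proportion to the elapsed time, capped at the bucket size.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

A caller would check `try_acquire()` before each publish and delay or queue the message when it returns False.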

Cloud Monitor receives the following system events from the IoT Platform:

The number of connection requests per second for the current account has reached the upper limit
The number of publish requests per second for the current account has reached the upper limit
The number of requests per second from the current account to the rules engine has reached the upper limit
The number of requests per second sent by the current account to devices has reached the upper limit
The uplink message QPS of a single device has reached the upper limit
The downlink message QPS of a single device has reached the upper limit

In the Cloud Monitor console, we select Event Monitoring in the left navigation pane, open the Alarm Rules tab, and under System Events click Create Event Alarm. The detailed configuration is as follows:


Click OK to create the event alarm rule.



2.2 Threshold alarm
The IoT Platform provides the following threshold alarm metrics in Cloud Monitor:

Device online status:

Number of real-time online devices (MQTT)

Thing Specification Language (TSL) model communication:

Number of device event reporting failures
Number of device property reporting failures
Number of device property setting failures
Number of device service call failures

Rules engine data forwarding:

Number of messages forwarded by the rules engine (DATAHUB)
Number of messages forwarded by the rules engine (FC)
Number of messages forwarded by the rules engine (MNS)
Number of messages forwarded by the rules engine (MQ)
Number of messages forwarded by the rules engine (OTS)
Number of messages forwarded by the rules engine (RDS)
Number of messages forwarded by the rules engine (REPUBLISH)
Number of messages forwarded by the rules engine (TSDB)

Uplink messages:

Number of messages sent to the platform (MQTT)
Number of messages sent to the platform (CoAP)
Number of messages sent to the platform (HTTP)
Number of messages sent to the platform (HTTP/2)
Number of messages sent to the platform (LoRa)

Downlink messages:

Number of messages sent by the platform (MQTT)
Number of messages sent by the platform (HTTP/2)
Number of messages sent by the platform (LoRa)

In the Cloud Monitor console, we select Alarm Service in the left navigation pane, click Alarm Rules, and under Threshold Alarm click Create Alarm Rule. For reference:


First, select IoT Platform as the product, then choose the resource scope and the specific instances to monitor according to your business needs.


Second, configure the trigger conditions of the alarm rule. In the example shown in the figure below, the number of online devices is counted over a 1-minute period; when the count is below 15,000 for three consecutive periods, an alarm is triggered.
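The trigger condition above can be sketched as follows. This is only an illustration of the "three consecutive periods below 15,000" logic, not how Cloud Monitor itself is implemented; the names `ConsecutiveThresholdRule` and `evaluate` are hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, List


class ConsecutiveThresholdRule:
    """Fires when the metric is below `threshold` for `periods` samples in a row."""

    def __init__(self, threshold: float, periods: int) -> None:
        self.threshold = threshold
        self.periods = periods
        # Remember only whether each of the last `periods` samples breached.
        self._recent: Deque[bool] = deque(maxlen=periods)

    def observe(self, value: float) -> bool:
        """Record one per-period sample; return True if the alarm condition holds."""
        self._recent.append(value < self.threshold)
        return len(self._recent) == self.periods and all(self._recent)


def evaluate(samples: Iterable[float], threshold: float = 15_000,
             periods: int = 3) -> List[bool]:
    """Return, for each 1-minute sample, whether the rule would be in alarm."""
    rule = ConsecutiveThresholdRule(threshold, periods)
    return [rule.observe(value) for value in samples]
```

For example, with the per-minute counts 16000, 14000, 14500, 14900, 15200, the condition first holds at the fourth sample (three consecutive values below 15,000) and clears again at the fifth.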


Finally, configure the alarm recipients and notification methods. By default, Cloud Monitor supports notification by phone call, text message, email, and DingTalk group robot. You can also configure a webhook so that an alarm triggers a callback to your own service.
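A webhook receiver on the customer side could look roughly like the sketch below, which uses only the Python standard library. The payload format is an assumption: the code accepts either JSON or form-encoded bodies, and the field names `alertName` and `alertState`, the port 8080, and the names `parse_alarm_callback` and `AlarmHandler` are hypothetical. Check the actual callback payload in the Cloud Monitor documentation before relying on any field.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs


def parse_alarm_callback(body: bytes, content_type: str) -> dict:
    """Decode a callback body into a dict, accepting JSON or form encoding."""
    text = body.decode("utf-8")
    if "application/json" in content_type:
        return json.loads(text)
    # Form-encoded body: keep the first value of each field.
    return {key: values[0] for key, values in parse_qs(text).items()}


class AlarmHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        payload = parse_alarm_callback(self.rfile.read(length),
                                       self.headers.get("Content-Type", ""))
        # Hypothetical field names; replace with the fields actually received.
        print("alarm:", payload.get("alertName"), payload.get("alertState"))
        # Acknowledge receipt so the sender does not treat the call as failed.
        self.send_response(200)
        self.end_headers()


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for this sketch.
    HTTPServer(("0.0.0.0", 8080), AlarmHandler).serve_forever()
```

In a real deployment the handler would forward the parsed alarm to the business system (for example, a ticketing or on-call tool) instead of printing it.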


Once the threshold alarm is configured, the rule appears in the alarm rule list, where we can also check the alarm history.
