Build a monitoring system

Last Updated: May 08, 2018

Log Service overview

Log Service is a core piece of Alibaba Cloud infrastructure that supports the collection and distribution of log data for all Alibaba Cloud clusters. Applications such as Table Store, MaxCompute, and CNZZ use the Log Service Logtail to collect log data, and use the API to consume that data and export it to a downstream real-time statistics system or offline system for statistics and analysis. As infrastructure, Log Service provides the following features:

  • Reliability: Proven by Alibaba Group’s internal users and tested by the enormous traffic of each Double 11 shopping festival over the years, Log Service guarantees data reliability with no data loss.
  • Scalability: When data traffic increases, you can add shards to scale up processing capacity quickly and dynamically.
  • Accessibility: Manage log collection from tens of thousands of machines with one click.

Log Service helps users collect logs, unifies the log format, and offers APIs for downstream consumption. Multiple downstream systems can consume the same logs repeatedly, for example by importing them into Spark or Storm for real-time computing, or into Elasticsearch for searching. Users can therefore collect once and consume multiple times. Among the various data consumption scenarios, monitoring is the most common one. This article introduces Alibaba Cloud’s Log Service-based monitoring system.

Log Service collects the monitoring data of all clusters as logs on the server side, which solves the problems of multi-cluster management and of collecting logs from heterogeneous systems. Monitoring data is unified into the same format and sent to Log Service.

Log Service features for a monitoring system

  • Unified machine management: Once Logtail is installed, all the subsequent operations can be performed on the log server.
  • Unified configuration management: You only need to configure which log files to collect once, on the server. The configuration is then automatically distributed to all machines.
  • Structured data: To facilitate downstream consumption, all data is formatted to fit the Log Service data model.
  • Elastic service capacity: Log Service can handle massive volumes of data read and write operations.
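
To make the "structured data" point concrete, the following is a minimal sketch of how heterogeneous monitoring samples could be normalized into one shape: a timestamp plus string key-value contents, grouped under a topic and a source machine. The class names LogRecord and LogBatch, the function names, and the sample field names are assumptions made for this illustration; they are not part of any Log Service SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogRecord:
    """One log entry: a timestamp plus string key-value contents (hypothetical type)."""
    timestamp: int
    contents: Dict[str, str]


@dataclass
class LogBatch:
    """A batch of records that share a topic and a source machine (hypothetical type)."""
    topic: str
    source: str
    records: List[LogRecord] = field(default_factory=list)


def normalize(raw: dict, now: int) -> LogRecord:
    """Flatten one monitoring sample into string key-value pairs.

    The sample's own "time" field is used when present; otherwise `now` is used.
    """
    ts = int(raw.get("time", now))
    contents = {key: str(value) for key, value in raw.items() if key != "time"}
    return LogRecord(timestamp=ts, contents=contents)


def build_batch(topic: str, source: str, samples: List[dict], now: int) -> LogBatch:
    """Normalize every sample and group the results under one topic and source."""
    batch = LogBatch(topic=topic, source=source)
    for sample in samples:
        batch.records.append(normalize(sample, now))
    return batch
```

Because every value is stored as a string under a named key, a downstream consumer can read samples from different systems without knowing which system produced them.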

Monitoring system architecture


How to build a monitoring system

1. Collect the monitoring data

For more information about how to configure log collection in Log Service and verify that the logs have been collected, see Quick Start.

2. Consume the data through the API in the middleware

To select a suitable SDK version, see How to use SDK. Use the SDK PullLog interface to consume log data from Log Service in batches, and synchronize the data to the downstream real-time computing system.
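
The batch consumption loop can be sketched as follows. This is only an illustration of cursor-based pulling: InMemoryShard is a hypothetical stand-in for a shard and does not reproduce the real SDK's PullLog signature, and the consume function and its parameters are likewise assumptions. With the real SDK, the calls that read a batch and advance the cursor would replace InMemoryShard.pull.

```python
from typing import Callable, List, Tuple


class InMemoryShard:
    """Hypothetical stand-in for one shard; not the real SDK client."""

    def __init__(self, logs: List[dict]):
        self._logs = logs

    def pull(self, cursor: int, count: int) -> Tuple[List[dict], int]:
        """Return up to `count` logs starting at `cursor`, plus the next cursor."""
        batch = self._logs[cursor:cursor + count]
        return batch, cursor + len(batch)


def consume(shard: InMemoryShard, cursor: int, batch_size: int,
            forward: Callable[[List[dict]], None]) -> int:
    """Pull batches until no data is left and hand each batch to `forward`.

    Returns the cursor to resume from on the next call.
    """
    while True:
        batch, next_cursor = shard.pull(cursor, batch_size)
        if not batch:
            return cursor
        forward(batch)          # e.g. push the batch to the real-time computing system
        cursor = next_cursor    # advance only after the batch was handed over
```

Advancing the cursor only after a batch has been forwarded means that a failure in the downstream call leaves the cursor at the unprocessed batch, so a restarted consumer pulls that batch again instead of skipping it.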

3. Build a Storm real-time computing system

Select Storm or another type of real-time computing system, configure the computing rules, select the monitoring indicators to compute, and then write the computing results to Table Store.
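
As an illustration of what a computing rule might look like, the sketch below aggregates logs into one-minute windows per cluster and derives three indicators: request count, error rate, and average latency. It is plain Python standing in for the logic a Storm topology would run, not Storm code. The log field names (time, cluster, status, latency_ms) and the rule that a status of 500 or above counts as an error are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_per_minute(logs: Iterable[dict]) -> Dict[Tuple[str, int], dict]:
    """Group logs by (cluster, minute start) and compute monitoring indicators."""
    acc = defaultdict(lambda: {"count": 0, "errors": 0, "latency_sum": 0.0})
    for log in logs:
        # Truncate the Unix timestamp to the start of its one-minute window.
        minute = int(log["time"]) // 60 * 60
        slot = acc[(log["cluster"], minute)]
        slot["count"] += 1
        if int(log["status"]) >= 500:   # assumed error rule for this sketch
            slot["errors"] += 1
        slot["latency_sum"] += float(log["latency_ms"])

    rows = {}
    for key, slot in acc.items():
        rows[key] = {
            "count": slot["count"],
            "error_rate": slot["errors"] / slot["count"],
            "avg_latency_ms": slot["latency_sum"] / slot["count"],
        }
    return rows
```

Each resulting row is keyed by cluster and window start, which maps naturally onto a table whose primary key is those two values.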

4. Display the monitoring information

Read the monitoring data stored in Table Store for front-end display, or read the data and trigger alarms based on the data results.
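
The alarm step can be sketched as a threshold check over the stored results. The row layout (a dictionary keyed by cluster and window start, with error_rate and avg_latency_ms values), the function name find_alarms, and the thresholds are all assumptions for illustration; in practice the rows would be read from Table Store.

```python
from typing import Dict, List, Tuple


def find_alarms(rows: Dict[Tuple[str, int], dict],
                max_error_rate: float,
                max_latency_ms: float) -> List[str]:
    """Return one message per indicator that exceeds its threshold."""
    alarms = []
    # Sort so that messages come out in a stable (cluster, time) order.
    for (cluster, minute), row in sorted(rows.items()):
        if row["error_rate"] > max_error_rate:
            alarms.append(
                f"{cluster}@{minute}: error_rate "
                f"{row['error_rate']:.2f} > {max_error_rate:.2f}"
            )
        if row["avg_latency_ms"] > max_latency_ms:
            alarms.append(
                f"{cluster}@{minute}: avg_latency_ms "
                f"{row['avg_latency_ms']:.1f} > {max_latency_ms:.1f}"
            )
    return alarms
```

The returned messages can be shown on the front end or passed to whatever notification channel the monitoring system uses.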