Log management

Last Updated: Nov 07, 2017

Here is a typical scenario: A server (container) stores a huge volume of application log data generated in different directories.

  • Developers deploy new applications and deprecate old ones.
  • The server fleet scales as needed: out during peak periods and in during slack periods.
  • The log data must be queried, monitored, and warehoused according to different and ever-changing requirements.

Challenges in the process

1. Fast application deployment and go-live, and a growing number of log types

Each application can generate Access, OpLog, Logic, and Error logs. As more applications are added and dependencies form between them, the volume of logs explodes.

Here is an example of an online takeaway website:

Category    Application   Log Name
Web         nginx         wechat-nginx (WeChat server nginx log)
            nginx         alipay-nginx (Alipay server nginx log)
            nginx         server-access (server Access-Log)
Web-Error   nginx-error   alipay-nginx (nginx error log)
            nginx-error
Web-App     tomcat        alipay-app (Alipay server application logic)
            tomcat
App         Mobile App    deliver-app (delivery app status)
App-Error   Mobile App    deliver-error (error log)
Web         H5            Web-click (H5 page click)
server      server        server internal logic log
Syslog      server        server system log

2. Logs are consumed for different purposes

For example, AccessLog can be used for billing and made available for users to download; OpLog is queried by DBAs and is also needed for BI analysis and full-link monitoring.

3. Environment and changes

Because the Internet evolves so quickly, real-world systems must adapt to constantly changing business needs and environments:

  • Application server resizing
  • Servers as machines
  • New application deployment
  • New log consumers

A perfect management architecture requires

  • A well-defined architecture with low cost
  • A stable and highly reliable, preferably unattended mechanism (which, for example, allows for auto-scaling - adding and removing servers as needed)
  • Standardized application deployment without complicated configuration
  • Easy compliance with log processing requirements

Log service solution

The LogHub feature of Log Service defines the following concepts for log access, and uses Logtail to collect logs:

  • Project: a management container
  • LogStore: represents a log source
  • Machine group: represents a group of servers from which logs are collected
  • Config: indicates the path to the logs and their format

The relationships between these concepts are as follows:

  • A project includes multiple LogStores, machine groups and configs, with different projects meeting different business requirements.
  • An application can have multiple types of logs. Each log type has its own LogStore and a fixed directory, which is described by one config.

    1. app --> logstore1, logstore2, logstore3
    2. app --> config1, config2, config3
  • A single application can be deployed on multiple machine groups, and a single machine group can host multiple applications.

    1. app --> machineGroup1, machineGroup2
    2. machineGroup1 --> app1, app2, app3
  • The collection directory defined in a config is applied to machine groups, and the collected logs can be sent to any LogStore.

    1. config1 * machineGroup1 --> logstore1
    2. config1 * machineGroup2 --> logstore1
    3. config2 * machineGroup1 --> logstore2
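
The relationships above can be sketched as a small data model. This is only an illustration under stated assumptions: the classes Project, Config, and MachineGroup, their methods, and the log paths are hypothetical names chosen for this example, not the Log Service SDK or API. The point it shows is that a (config, machine group) pair maps to exactly one LogStore, while one LogStore can receive logs from several machine groups.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Config:
    name: str
    log_path: str  # illustrative path, not taken from the source


@dataclass(frozen=True)
class MachineGroup:
    name: str


@dataclass
class Project:
    """Hypothetical container that records which LogStore each
    (config, machine group) pair writes to."""
    name: str
    bindings: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def apply(self, config: Config, group: MachineGroup, logstore: str) -> None:
        # Applying a config to a machine group targets one LogStore.
        self.bindings[(config.name, group.name)] = logstore

    def logstore_for(self, config: Config, group: MachineGroup) -> str:
        return self.bindings[(config.name, group.name)]

    def groups_feeding(self, logstore: str) -> List[str]:
        # All machine groups whose logs end up in the given LogStore.
        return sorted({g for (_, g), target in self.bindings.items()
                       if target == logstore})


config1 = Config("config1", "/var/log/app/access")  # assumed path
config2 = Config("config2", "/var/log/app/error")   # assumed path
group1 = MachineGroup("machineGroup1")
group2 = MachineGroup("machineGroup2")

project = Project("takeaway")
project.apply(config1, group1, "logstore1")
project.apply(config1, group2, "logstore1")
project.apply(config2, group1, "logstore2")
```

With these bindings, project.groups_feeding("logstore1") lists both machine groups, which mirrors the first two mappings in the list above.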

Advantages

  • Convenient: It provides a web console, SDKs, and other tools for batch management.

  • Large-scale: It manages millions of machines and applications.

  • Real-time: Collection configurations take effect within minutes.

  • Elastic:

    • The machine ID function supports automatic scaling of servers.
    • LogHub supports auto scaling. For details, refer to the shard overview.
  • Stable and reliable: No human intervention is required.

  • For information on real-time computing, offline analysis, indexing and other query capabilities in log processing, refer to Service Introduction.

    • LogHub: Real-time collection and consumption. Uses 30+ methods to collect massive data for real-time downstream consumption.
    • LogShipper: Stable and reliable log shipping. It delivers data from LogHub to storage services (OSS/MaxCompute/Table Store) for storage and big data analysis.
    • LogSearch: Real-time data indexing and querying. It allows centralized log queries regardless of which server the logs are stored on.
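
To illustrate the elasticity point about the machine ID function, here is a minimal sketch. It assumes that a machine group can be defined by an identifier carried by each server rather than by a fixed server list; the Server class, the machine_id field, the members function, and the IP addresses are all hypothetical and are not part of Log Service.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Server:
    ip: str
    machine_id: str  # hypothetical identifier carried by the server


def members(servers: List[Server], group_machine_id: str) -> List[str]:
    # Membership is resolved by matching the identifier, so the group
    # definition itself never changes when servers come and go.
    return sorted(s.ip for s in servers if s.machine_id == group_machine_id)


fleet = [Server("10.0.0.1", "alipay-app"), Server("10.0.0.2", "alipay-app")]
before = members(fleet, "alipay-app")

# Scale out: a new server carrying the same identifier joins the group.
fleet.append(Server("10.0.0.3", "alipay-app"))
after_scale_out = members(fleet, "alipay-app")

# Scale in: a removed server simply drops out of the group.
fleet = [s for s in fleet if s.ip != "10.0.0.1"]
after_scale_in = members(fleet, "alipay-app")
```

Under this assumption, adding or removing servers requires no change to the group or to the configs applied to it.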