Here is a typical scenario: A server (container) stores a huge volume of application log data generated in different directories.
- Developers deploy new applications and retire old ones.
- The server fleet scales as needed, for example, scaling out during peak periods and scaling in during slack periods.
- The log data must be queried, monitored, and warehoused to meet varied and ever-changing requirements.
Each application can generate Access, OpLog, Logic, and Error logs. As more applications are added and dependencies arise between them, the volume of logs explodes.
Here is an example of an online takeaway website:
| Category | Source | Log name (description) |
|---|---|---|
| Web | nginx | wechat-nginx (WeChat server nginx log) |
| Web | nginx | alipay-nginx (Alipay server nginx log) |
| Web | nginx | server-access (server Access-Log) |
| Web-Error | nginx-error | alipay-nginx (nginx error log) |
| Web-app | tomcat | alipay-app (Alipay server application logic) |
| app | Mobile app | deliver-app (delivery app status) |
| app-error | Mobile app | deliver-error (error log) |
| Web | H5 | Web-click (H5 page click) |
| server | server | server internal logic log |
| Syslog | server | server system log |
Different logs serve different consumers. For example, AccessLog can be used for billing and made available for users to download, while OpLog is queried by DBAs and also feeds BI analysis and full-link monitoring.
As the Internet evolves rapidly, real-world systems must adapt to ever-changing business needs and environments:
- Application server resizing
- Servers as machines
- New application deployment
- New log consumers
These changes call for a log management solution that offers:
- A well-defined architecture with low cost
- A stable, highly reliable, and preferably unattended mechanism (for example, one that supports auto scaling, adding and removing servers as needed)
- Standardized application deployment without complicated configuration
- Easy compliance with log processing requirements
The LogHub feature of Log Service defines the following concepts for log access and uses Logtail to collect logs:
- Project: a management container
- LogStore: represents a log source
- Machine group: represents a group of servers from which logs are collected
- Config: indicates the path (directory) and format of the logs to collect
The relationships between these concepts are as follows:
- A project includes multiple LogStores, machine groups and configs, with different projects meeting different business requirements.
An application can have multiple types of logs. Each log type has its own LogStore and a fixed directory, described by a single config.
app --> logstore1, logstore2, logstore3
app --> config1, config2, config3
A single application can be deployed to multiple machine groups, and multiple applications can be deployed to a single machine group.
app --> machineGroup1, machineGroup2
machineGroup1 --> app1, app2, app3
The collection directory defined in a config is applied to one or more machine groups, and the collected logs can be written to any LogStore.
config1 * machineGroup1 --> logstore1
config1 * machineGroup2 --> logstore1
config2 * machineGroup1 --> logstore2
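
The relationships above can be sketched as a small in-memory model. This is only an illustration of the `config * machineGroup --> logstore` mapping, not the Log Service SDK: the `Project` and `Config` classes, the log paths, and the IP addresses are all hypothetical, and the sketch assumes that each config names its target LogStore.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Config:
    name: str
    log_path: str  # collection path (hypothetical value)
    logstore: str  # LogStore that receives the collected logs (assumed to be set per config)


@dataclass
class Project:
    name: str
    logstores: set[str] = field(default_factory=set)
    configs: dict[str, Config] = field(default_factory=dict)
    machine_groups: dict[str, set[str]] = field(default_factory=dict)
    # Each (config name, machine group name) pair is one "config * machineGroup" line.
    bindings: set[tuple[str, str]] = field(default_factory=set)

    def add_config(self, config: Config) -> None:
        self.logstores.add(config.logstore)
        self.configs[config.name] = config

    def apply(self, config_name: str, group_name: str) -> None:
        """Apply a config to a machine group; both must already exist."""
        if config_name not in self.configs:
            raise KeyError(f"unknown config: {config_name}")
        if group_name not in self.machine_groups:
            raise KeyError(f"unknown machine group: {group_name}")
        self.bindings.add((config_name, group_name))

    def routes(self) -> list[tuple[str, str, str]]:
        """Return sorted (machine, log_path, logstore) triples implied by the bindings."""
        result = []
        for config_name, group_name in self.bindings:
            config = self.configs[config_name]
            for machine in self.machine_groups[group_name]:
                result.append((machine, config.log_path, config.logstore))
        return sorted(result)


project = Project("takeaway")
project.machine_groups["machineGroup1"] = {"10.0.0.1", "10.0.0.2"}
project.machine_groups["machineGroup2"] = {"10.0.0.3"}
project.add_config(Config("config1", "/var/log/nginx/access.log", "logstore1"))
project.add_config(Config("config2", "/var/log/nginx/error.log", "logstore2"))

# The three mappings listed above.
project.apply("config1", "machineGroup1")
project.apply("config1", "machineGroup2")
project.apply("config2", "machineGroup1")

for machine, path, logstore in project.routes():
    print(f"{machine}: {path} --> {logstore}")
```

Because a binding refers to a group rather than to individual machines, changing the membership of `machineGroup1` changes the routes without touching any config.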
Convenient: It provides a web console, SDKs, and other tools for batch management.
Large-scale: It manages machines and applications in the millions.
Real-time: Collection configuration takes effect within minutes.
- The machine ID function supports automatic scaling of servers.
- LogHub supports auto scaling. For details, refer to the shard overview.
Stable and reliable: No human intervention is required.
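
One way to picture how a machine ID can support auto scaling is to let servers join a machine group by a shared identifier instead of being listed one by one. The sketch below is a hypothetical model, not Logtail's implementation: `MachineGroup`, `GroupRegistry`, the `web-frontend` identifier, and the IP addresses are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MachineGroup:
    name: str
    machine_id: str  # identifier shared by every server in the group (assumed)
    configs: list[str] = field(default_factory=list)
    members: set[str] = field(default_factory=set)


class GroupRegistry:
    """Hypothetical registry that resolves a server's machine ID to its group."""

    def __init__(self, groups: list[MachineGroup]) -> None:
        self._by_id = {group.machine_id: group for group in groups}

    def server_started(self, host: str, machine_id: str) -> list[str]:
        """Register a new server and return the configs it should collect with."""
        group = self._by_id.get(machine_id)
        if group is None:
            return []  # unknown ID: nothing to collect
        group.members.add(host)
        return list(group.configs)

    def server_stopped(self, host: str, machine_id: str) -> None:
        """Remove a server from its group when it is scaled in."""
        group = self._by_id.get(machine_id)
        if group is not None:
            group.members.discard(host)


web = MachineGroup("machineGroup1", machine_id="web-frontend",
                   configs=["config1", "config2"])
registry = GroupRegistry([web])

# Scale out: new servers carry the same machine ID and pick up the group's configs.
first = registry.server_started("10.0.0.1", "web-frontend")
second = registry.server_started("10.0.0.2", "web-frontend")

# Scale in: the server leaves the group; no config needs to be edited.
registry.server_stopped("10.0.0.1", "web-frontend")
```

In this model, the only per-server setup is the identifier itself, which is what makes unattended scaling possible.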
For information on real-time computing, offline analysis, indexing and other query capabilities in log processing, refer to Service Introduction.
- LogHub: Real-time collection and consumption. It supports more than 30 collection methods for gathering massive volumes of data for real-time downstream consumption.
- LogShipper: Stable and reliable log shipping. It delivers data from LogHub to storage services (OSS/MaxCompute/Table Store) for storage and big data analysis.
- LogSearch: Real-time data indexing and querying. It allows centralized log queries regardless of which server the logs reside on.