This topic describes the basic concepts you need to understand and use Log Service.
A region is a service node of Alibaba Cloud. By deploying services in different Alibaba Cloud regions, you can make your services closer to your clients for lower access latency and better user experience. Alibaba Cloud has multiple regions throughout the country.
The project is a basic management unit in Log Service and is used for resource isolation and control. You can use a project to manage all logs and related log sources of one application.
A logstore is the unit for collecting, storing, and consuming log data in Log Service. Each logstore belongs to one project, and you can create multiple logstores in a project according to your needs. The common practice is to create an independent logstore for each type of log in an application. For example, assume that you have a game application named big-game, and there are three types of logs on the server: operation_log, application_log, and access_log. You can first create a project named big-game, and then create three logstores under this project, one for each type of log, for log collection, storage, and consumption.
A log is the minimum data unit processed in Log Service. Log Service uses a semi-structured data model to define a log. The specific data model is as follows:
- Topic: a custom field that marks a batch of logs (for example, access logs can be marked by site). This field is a null string by default (the null string is also a valid topic).
- Time: a reserved field that indicates when the log was generated, expressed as the number of seconds since 1970-01-01 00:00:00 UTC (precise to the second). It is generally derived directly from the time stamp in the log.
- Content: records the specific content of the log. Content is composed of one or more content items, and each content item is a Key-Value pair.
- Source: source of the log, for example, the IP address of the device generating the log. This field is null by default.
Furthermore, Log Service has different requirements on values of different fields, as described in the following table:
| Field | Requirement |
|---|---|
| time | Integer in standard Unix time format. The minimum unit is the second. |
| topic | Any UTF-8 encoded string of no more than 128 bytes. |
| source | Any UTF-8 encoded string of no more than 128 bytes. |
| content | One or more Key-Value pairs. Key is a UTF-8 encoded string of no more than 128 bytes, which contains only letters, underscores, and numbers and cannot begin with a number. Value is any UTF-8 encoded string of no more than 1024*1024 bytes. |
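The requirements in the table can be checked on the client side before a log is written. The following is a minimal sketch in Python, not Log Service's own validation logic. The function name `validate_log` and the error messages are hypothetical, and the sketch assumes that "letters" means ASCII letters and that time is non-negative.

```python
import re

MAX_SHORT_BYTES = 128            # limit for topic, source, and content keys
MAX_VALUE_BYTES = 1024 * 1024    # limit for content values
# Assumption: "letters" means ASCII letters; a key cannot begin with a digit.
KEY_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_log(time, topic, source, content):
    """Return a list of violated requirements (empty if the log is valid)."""
    errors = []
    # Assumption: time is a non-negative integer number of Unix seconds.
    if isinstance(time, bool) or not isinstance(time, int) or time < 0:
        errors.append("time must be a non-negative integer (Unix seconds)")
    for name, text in (("topic", topic), ("source", source)):
        if len(text.encode("utf-8")) > MAX_SHORT_BYTES:
            errors.append(f"{name} exceeds {MAX_SHORT_BYTES} bytes")
    if not content:
        errors.append("content needs at least one Key-Value pair")
    for key, value in content.items():
        if len(key.encode("utf-8")) > MAX_SHORT_BYTES:
            errors.append(f"key {key!r} exceeds {MAX_SHORT_BYTES} bytes")
        if not KEY_PATTERN.fullmatch(key):
            errors.append(f"key {key!r} has invalid characters")
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            errors.append(f"value of {key!r} exceeds {MAX_VALUE_BYTES} bytes")
    return errors
```

Lengths are measured on the UTF-8 encoded bytes rather than on characters, because the limits in the table are stated in bytes.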
The following keywords cannot be used in the key in content described in the preceding table:
Logs in the same Logstore can be grouped by log topics. You can specify the topic when writing a log. For example, a platform user can use the user ID as the log topic and write it into the log. If there is no need to group the logs in one logstore, the same log topic can be used for all logs.
NOTE: A null string is a valid log topic. The default log topic is a null string.
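As an illustration of grouping by topic, the following Python sketch groups in-memory log records by their topic and treats a missing topic as the default null string. It does not use the Log Service API; the dictionary layout, the user IDs, and the function name `group_by_topic` are hypothetical.

```python
from collections import defaultdict


def group_by_topic(logs):
    """Group log dicts by their "topic" field; a missing topic counts as ""."""
    groups = defaultdict(list)
    for log in logs:
        groups[log.get("topic", "")].append(log)
    return dict(groups)


logs = [
    {"topic": "user_1001", "content": {"action": "login"}},
    {"topic": "user_1002", "content": {"action": "login"}},
    {"topic": "user_1001", "content": {"action": "logout"}},
    {"content": {"action": "heartbeat"}},  # no topic given: default "" applies
]
grouped = group_by_topic(logs)
```

Here `grouped` has three keys: the null string and the two user IDs.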
The following diagram describes the relation among Logstore, log topic, and log:
Various log formats are used in actual application scenarios. For ease of understanding, the following describes how to map an original Nginx access_log to the log data model in Log Service. Assume that the IP address of your Nginx server is 10.249.201.117, and the following is the original log:
10.1.168.193 - - [01/Mar/2012:16:12:07 +0800] "GET /Send?AccessKeyId=8225105404 HTTP/1.1" 200 5 "-" "Mozilla/5.0 (X11; Linux i686 on x86_64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2"
Map the original log to the log data model in Log Service as follows:
| Field | Value | Description |
|---|---|---|
| topic | "" | Use the default value (null string). |
| time | 1330589527 | Precise generation time of the log (precise to the second), which is transformed from the time stamp in the original log. |
| source | "10.249.201.117" | Use the IP address of the server as the log source. |
| content | Key-Value pair | Content of the log. |
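The time value in the mapping can be reproduced from the time stamp in the original log. The following Python sketch is one way to do the conversion; it assumes an English locale so that `%b` parses the month abbreviation `Mar`, and the dictionary layout is only illustrative.

```python
from datetime import datetime

raw_time = "01/Mar/2012:16:12:07 +0800"
# %z parses the +0800 offset, so the parsed value is timezone-aware.
parsed = datetime.strptime(raw_time, "%d/%b/%Y:%H:%M:%S %z")
unix_seconds = int(parsed.timestamp())
print(unix_seconds)  # 1330589527

# Illustrative record holding the three non-content fields of the mapping.
log = {"topic": "", "time": unix_seconds, "source": "10.249.201.117"}
```

Because the offset is part of the parsed value, the result does not depend on the time zone of the machine that runs the conversion.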
You can decide how to extract the original content of the log and combine it into Key-Value pairs. For example, see the following table:
|browser||“Mozilla/5.0 (X11; Linux i686 on x86_64; rv:10.0.2) Gecko/20100101 Firef|
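One possible way to extract Key-Value pairs from the sample log is a regular expression with named groups. The following Python sketch is only an illustration: apart from `browser`, the key names (`ip`, `method`, `url`, `protocol`, `status`, `length`, `ref_url`) are hypothetical choices, and the pattern assumes the Nginx log layout shown in the sample line.

```python
import re

LINE = ('10.1.168.193 - - [01/Mar/2012:16:12:07 +0800] '
        '"GET /Send?AccessKeyId=8225105404 HTTP/1.1" 200 5 "-" '
        '"Mozilla/5.0 (X11; Linux i686 on x86_64; rv:10.0.2) '
        'Gecko/20100101 Firefox/10.0.2"')

# Named groups become the keys of the content items.
PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<length>\d+) '
    r'"(?P<ref_url>[^"]*)" "(?P<browser>[^"]*)"'
)

match = PATTERN.fullmatch(LINE)
if match is None:
    raise ValueError("line does not match the assumed Nginx log layout")

content = match.groupdict()
content.pop("time")  # the time stamp goes to the reserved time field instead
```

All values stay strings, which matches the requirement that each Value is a UTF-8 encoded string.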
- Logs: a collection of multiple logs.
- LogGroup: a group of logs.
- LogGroupList: a group of LogGroups used to return results.
Currently, the system supports the following content encoding method (more methods may be added in the future). At the RESTful API layer, the encoding method is indicated by the Content-Type header.
| Encoding | Description | Content-Type |
|---|---|---|
| ProtoBuf | The data model is encoded by ProtoBuf. | application/x-protobuf |
message Log {
    required uint32 Time = 1; // UNIX time format
    message Content {
        required string Key = 1;
        required string Value = 2;
    }
    repeated Content Contents = 2;
}
message LogGroup {
    repeated Log Logs = 1;
    optional string Reserved = 2; // reserved field
    optional string Topic = 3;
    optional string Source = 4;
}
message LogGroupList {
    repeated LogGroup logGroupList = 1;
}
NOTE: ProtoBuf does not require the keys of the Key-Value pairs in a log to be unique, so you must avoid duplicate keys yourself. Otherwise, the behavior is undefined.
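Because uniqueness is not enforced by the encoding, a client can check for duplicate keys before it encodes a log. The following Python sketch shows one way to do so; the function name `duplicate_keys` is hypothetical, and a plain list of (key, value) tuples stands in for the repeated Contents field rather than generated ProtoBuf classes.

```python
from collections import Counter


def duplicate_keys(contents):
    """Return the keys that occur more than once in a list of (key, value) pairs."""
    counts = Counter(key for key, _ in contents)
    return sorted(key for key, n in counts.items() if n > 1)


contents = [("method", "GET"), ("status", "200"), ("method", "POST")]
print(duplicate_keys(contents))  # ['method']
```

An empty result means that every key in the log is unique and the log is safe to encode.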