Logs are time-ordered records of the changes made within a system. Each record describes an operation on a specific object and the result of that operation.

Log data is stored in different forms such as log files, log events, binary logs, and metric data. Each log file consists of one or more log entries. A log entry is the basic unit of data that can be processed in Log Service. Each log entry describes a single system event.

Log Service uses a semi-structured data model to define a log entry. The model consists of five fields: topic, time, content, source, and tags.
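The five fields can be pictured as a small record type. The following Python sketch is for illustration only: the class name LogEntry and its attribute names are hypothetical and are not part of any Log Service SDK.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LogEntry:
    """Hypothetical in-memory view of one log entry (not an SDK type)."""

    time: int                    # UNIX timestamp, in seconds
    content: Dict[str, str]      # one or more key-value pairs
    topic: str = ""              # user-defined; empty string by default
    source: str = ""             # for example, the IP address of the server
    tags: Dict[str, str] = field(default_factory=dict)  # string keys and values


entry = LogEntry(time=1330589527, content={"method": "GET", "status": "200"})
print(entry.topic == "", entry.tags)
```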

Log Service has different format requirements for different data fields, as described in the following table.

Field Description Format
Topic The user-defined field in a log entry. This field can be used to mark a group of logs. For example, you can specify topics for access logs based on sites. The field value can be a string of up to 128 bytes in length, including an empty string. The default value of this field is an empty string.
Time The time when a log entry is generated. This field is a reserved field. In most cases, the field value is generated based on the time information in the log entry. The value is a UNIX timestamp: the number of seconds that have elapsed since 00:00:00 UTC on January 1, 1970.
Content The specific content of a log entry. The content consists of one or more items. Each item is a key-value pair. The key is a UTF-8 encoded string of up to 128 bytes in length. It can contain letters, digits, and underscores (_). The key cannot start with a digit and cannot contain the following keywords:
  • __time__
  • __source__
  • __topic__
  • __partition_time__
  • _extract_others_
  • __extract_others__
The value can be a string of up to 1024 × 1024 bytes in length.
Source The source of a log entry. For example, the value of this field can be the IP address of the server where the log entry is generated. The value of this field can be a string of up to 128 bytes in length. The default value of this field is an empty string.
Tags Log tags include:
  • User-defined tags: the tags that you add when you call the PutLogs operation to write data to a specified Logstore.
  • System tags: the tags added by Log Service, including __client_ip__ and __receive_time__.
The field value is a dictionary in which both keys and values are strings. In a log entry, the name of each tag field is prefixed with __tag__:.
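The format requirements in the table can be checked on the client before logs are written. The following Python sketch shows one way to do this. The function name validate_entry is hypothetical. The sketch also makes two assumptions that the table does not state outright: "letters" means ASCII letters, and a key is rejected only when it is exactly equal to one of the listed keywords.

```python
import re

MAX_NAME_BYTES = 128               # limit for topic, source, and content keys
MAX_VALUE_BYTES = 1024 * 1024      # limit for a content value

# Assumption: "letters" means ASCII letters.
KEY_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Assumption: a key is rejected only on an exact match with a keyword.
RESERVED_KEYS = {
    "__time__", "__source__", "__topic__",
    "__partition_time__", "_extract_others_", "__extract_others__",
}


def validate_entry(topic, source, content):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if len(topic.encode("utf-8")) > MAX_NAME_BYTES:
        problems.append("topic exceeds 128 bytes")
    if len(source.encode("utf-8")) > MAX_NAME_BYTES:
        problems.append("source exceeds 128 bytes")
    if not content:
        problems.append("content needs at least one key-value pair")
    for key, value in content.items():
        if len(key.encode("utf-8")) > MAX_NAME_BYTES:
            problems.append(f"key {key!r} exceeds 128 bytes")
        if not KEY_PATTERN.fullmatch(key):
            problems.append(f"key {key!r} has invalid characters or starts with a digit")
        if key in RESERVED_KEYS:
            problems.append(f"key {key!r} is a reserved keyword")
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            problems.append(f"value of {key!r} exceeds 1024 * 1024 bytes")
    return problems


print(validate_entry("", "10.249.201.117", {"ip": "10.1.168.193"}))
print(validate_entry("", "", {"1st": "x", "__time__": "0"}))
```

Lengths are measured on the UTF-8 encoding rather than on the character count, because the limits in the table are given in bytes.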

Logs generated in different scenarios may have different formats. The following example describes how to convert a raw NGINX access log into the data model required by Log Service. Assume that the IP address of the NGINX server is 10.249.201.117. A raw log entry generated on this server is as follows:

10.1.168.193 - - [01/Mar/2012:16:12:07 +0800] "GET /Send?AccessKeyId=8225105404 HTTP/1.1" 200 5 "-" "Mozilla/5.0 (X11; Linux i686 on x86_64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2"

The following table describes how to convert the raw log entry into the data model that is required by Log Service.

| Field | Field value | Description |
| --- | --- | --- |
| Topic | "" | An empty string (default value) is used. |
| Time | 1330589527 | The time when the log entry was generated. The value is a UNIX timestamp converted from the time information in the raw log entry: the number of seconds that have elapsed since 00:00:00 UTC on January 1, 1970. |
| Content | Key-value pairs | The specific content of the log entry. |
| Source | "10.249.201.117" | The IP address of the server where the log entry is collected. |
| Tags | N/A | The field that is added by the user or Log Service. |
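The Time value can be reproduced from the time string in the raw log entry. The following Python sketch uses only the standard library. The function name nginx_time_to_unix is hypothetical, and the month abbreviation is parsed with %b, which assumes an English (or C) locale.

```python
from datetime import datetime


def nginx_time_to_unix(raw: str) -> int:
    """Convert an NGINX time string with a UTC offset to a UNIX timestamp."""
    # %z reads the +0800 offset, so the result does not depend on the
    # time zone of the machine that runs this code.
    parsed = datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")
    return int(parsed.timestamp())


print(nginx_time_to_unix("01/Mar/2012:16:12:07 +0800"))  # 1330589527
```

16:12:07 at UTC+8 is 08:12:07 UTC, which is why the same instant maps to a single timestamp regardless of where the log is processed.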

You can decide how to extract the original content of a log entry and convert the extracted content into key-value pairs. The following table provides an example.

| Key | Value |
| --- | --- |
| ip | 10.1.168.193 |
| method | GET |
| status | 200 |
| length | 5 |
| ref_url | - |
| browser | Mozilla/5.0 (X11; Linux i686 on x86_64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2 |
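One possible way to perform this extraction is a regular expression with named groups. The following Python sketch is an illustration, not the method used by Log Service: the pattern and the function name parse_access_log are hypothetical, and the pattern assumes the log layout of the example entry. It is applied here to a line like the example entry.

```python
import re

# Named groups match the keys in the table. The request path and the
# protocol are matched but not captured, because the table omits them.
LINE_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) [^"]* \S+" '
    r'(?P<status>\d{3}) (?P<length>\d+) '
    r'"(?P<ref_url>[^"]*)" "(?P<browser>[^"]*)"'
)


def parse_access_log(line: str) -> dict:
    """Return the extracted key-value pairs, all values as strings."""
    match = LINE_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not match the expected access log layout")
    return match.groupdict()


raw = (
    '10.1.168.193 - - [01/Mar/2012:16:12:07 +0800] '
    '"GET /Send?AccessKeyId=8225105404 HTTP/1.1" 200 5 "-" '
    '"Mozilla/5.0 (X11; Linux i686 on x86_64; rv:10.0.2) '
    'Gecko/20100101 Firefox/10.0.2"'
)

for key, value in parse_access_log(raw).items():
    print(key, value)
```

All values are kept as strings, which matches the content field of the data model, where each item is a key-value pair of strings.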