After logs are shipped from Log Service to Object Storage Service (OSS), the logs can be stored in different formats. This topic describes the Parquet format.
Parameters

The following table describes the parameters.
Parameter | Description |
---|---|
Key Name | The log field that you want to ship to OSS. You can view log fields on the Raw Logs tab of a Logstore. We recommend that you add log fields one by one. The log fields are stored in the Parquet file in the order in which you add them, and the name of each log field is used as the name of the corresponding column. You can ship the fields in the log content and reserved fields such as __time__, __topic__, and __source__. For more information about reserved fields, see Reserved fields. A column value in the Parquet file can be null, for example, when a log field fails to be converted to the specified data type (see Type). |
Type | The data type of the specified log field. A Parquet file can store the following data types: STRING, BOOLEAN, INT32, INT64, FLOAT, and DOUBLE. When logs are shipped from Log Service to OSS, each log field is converted from the string type to the specified Parquet data type. If a log field fails to be converted from the string type to a non-string type, the value of the corresponding column in the Parquet file is null. |
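The conversion rule in the Type row can be illustrated with a short Python sketch. This is not the Log Service implementation: convert_field and to_row are hypothetical names, and the parsing details (integer range checks, which strings count as BOOLEAN, a missing field becoming null) are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple, Union

ParquetValue = Union[str, bool, int, float]

# Inclusive value ranges of the two Parquet integer types.
INT_RANGES = {
    "INT32": (-2**31, 2**31 - 1),
    "INT64": (-2**63, 2**63 - 1),
}


def convert_field(raw: Optional[str], parquet_type: str) -> Optional[ParquetValue]:
    """Convert a log field (always a string) to the configured type.

    Returns None, which stands for a null column value, when the
    conversion fails. Treating a missing field as null is an assumption.
    """
    if raw is None:
        return None
    if parquet_type == "STRING":
        return raw
    try:
        if parquet_type in INT_RANGES:
            value = int(raw)
            low, high = INT_RANGES[parquet_type]
            return value if low <= value <= high else None
        if parquet_type in ("FLOAT", "DOUBLE"):
            return float(raw)
        if parquet_type == "BOOLEAN":
            # Assumption: only "true" and "false" (any case) are accepted.
            lowered = raw.strip().lower()
            return (lowered == "true") if lowered in ("true", "false") else None
    except ValueError:
        return None
    raise ValueError(f"unsupported Parquet type: {parquet_type}")


def to_row(log: dict, columns: List[Tuple[str, str]]) -> list:
    """Build one row; columns keep the order in which the keys were added."""
    return [convert_field(log.get(name), ptype) for name, ptype in columns]


columns = [("__time__", "INT32"), ("status", "INT64"), ("seq", "DOUBLE"), ("ua", "BOOLEAN")]
log = {"__time__": "1490803230", "status": "200", "seq": "n/a", "ua": "Mozilla"}
print(to_row(log, columns))  # [1490803230, 200, None, None]
```

In this sketch, the seq and ua values cannot be converted to DOUBLE and BOOLEAN, so both columns end up null while the other columns keep their converted values.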
URLs of files in OSS
After logs are shipped to OSS, the logs are stored in OSS buckets. The following table provides examples of the URLs of the files that store the logs.
Compression type | File extension | URL example | Description |
---|---|---|---|
Not compressed | .parquet | oss://oss-shipper-shenzhen/ecs_test/2016/01/26/20/54_1453812893059571256_937.parquet | You can download the OSS file to your computer and use the parquet-tools utility to open the file. For more information about the parquet-tools utility, visit parquet-tools. |
Snappy | .snappy.parquet | oss://oss-shipper-shenzhen/ecs_test/2016/01/26/20/54_1453812893059571256_937.snappy.parquet | You can download the OSS file to your computer and use the parquet-tools utility to open the file. For more information about the parquet-tools utility, visit parquet-tools. |
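As a rough illustration of how the file extension maps to the compression type, the following Python sketch splits an OSS URL into its bucket and object key and looks up the compression type from the table above. The function name describe_oss_url is hypothetical, and the sketch handles only the two extensions listed in the table.

```python
from urllib.parse import urlparse

# Extension-to-compression mapping taken from the table above.
# The longer suffix comes first because ".snappy.parquet" also ends with ".parquet".
EXTENSIONS = [
    (".snappy.parquet", "Snappy"),
    (".parquet", "Not compressed"),
]


def describe_oss_url(url: str) -> dict:
    """Return the bucket, object key, and compression type of an oss:// URL."""
    parsed = urlparse(url)
    if parsed.scheme != "oss":
        raise ValueError(f"not an oss:// URL: {url}")
    object_key = parsed.path.lstrip("/")
    for suffix, compression in EXTENSIONS:
        if object_key.endswith(suffix):
            return {
                "bucket": parsed.netloc,
                "object_key": object_key,
                "compression": compression,
            }
    raise ValueError(f"not a Parquet file: {object_key}")


info = describe_oss_url(
    "oss://oss-shipper-shenzhen/ecs_test/2016/01/26/20/"
    "54_1453812893059571256_937.snappy.parquet"
)
print(info["bucket"], info["compression"])  # oss-shipper-shenzhen Snappy
```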
Data consumption
- You can consume data that is shipped to OSS by using E-MapReduce, Spark, or Hive. For more information, see LanguageManual DDL.
- You can also consume data by using inspection tools.
You can use the parquet-tools utility provided by the open source community to inspect Parquet files, view the schema of data in the files, and read the data. You can compile the utility or download the parquet-tools-1.6.0rc3-SNAPSHOT utility that Log Service provides to consume data.
- View the schema of data in a Parquet file
    $ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema -d 00_1490803532136470439_124353.snappy.parquet | head -n 30
    message schema {
      optional int32 __time__;
      optional binary ip;
      optional binary __source__;
      optional binary method;
      optional binary __topic__;
      optional double seq;
      optional int64 status;
      optional binary time;
      optional binary url;
      optional boolean ua;
    }

    creator: parquet-cpp version 1.0.0

    file schema: schema
    --------------------------------------------------------------------------------
    __time__: OPTIONAL INT32 R:0 D:1
    ip: OPTIONAL BINARY R:0 D:1
    .......
- View data in a Parquet file (the example prints the first two records)
    $ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar head -n 2 00_1490803532136470439_124353.snappy.parquet
    __time__ = 1490803230
    ip = 10.200.98.220
    __source__ = *.*.*.*
    method = POST
    __topic__ =
    seq = 1667821.0
    status = 200
    time = 30/Mar/2017:00:00:30 +0800
    url = /PutData?Category=YunOsAccountOpLog&AccessKeyId=*************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=********************************* HTTP/1.1

    __time__ = 1490803230
    ip = 10.200.98.220
    __source__ = *.*.*.*
    method = POST
    __topic__ =
    seq = 1667822.0
    status = 200
    time = 30/Mar/2017:00:00:30 +0800
    url = /PutData?Category=YunOsAccountOpLog&AccessKeyId=*************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=********************************* HTTP/1.1
You can run the java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar -h command to view more information about the parquet-tools utility.
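If you want to process the records that the head command prints, the following Python sketch turns that text into a list of dictionaries. It assumes that the utility prints one name = value pair per line and separates records with a blank line. The function name parse_head_output is hypothetical and is not part of parquet-tools. All values stay strings because the printed text carries no type information.

```python
from typing import Dict, List


def parse_head_output(text: str) -> List[Dict[str, str]]:
    """Parse "name = value" lines into one dictionary per record."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first "=" only, because values such as URLs contain "=".
        name, sep, value = line.partition("=")
        if not sep:
            continue  # Skip lines that are not name = value pairs.
        current[name.strip()] = value.strip()
    if current:
        records.append(current)
    return records


sample = (
    "__time__ = 1490803230\n"
    "status = 200\n"
    "__topic__ = \n"
    "url = /PutData?Category=YunOsAccountOpLog HTTP/1.1\n"
    "\n"
    "__time__ = 1490803230\n"
    "seq = 1667822.0\n"
)
records = parse_head_output(sample)
print(len(records), records[0]["status"])  # 2 200
```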