You can upload log files to Object Storage Service (OSS) for storage and then import the log data from OSS to Log Service, where you can query, analyze, and transform the data. You can import only OSS objects that are no larger than 5 GB. If you want to import a compressed object, the size of the compressed object must be no more than 5 GB.

Prerequisites

  • Log files are uploaded to an OSS bucket. For more information, see Upload objects. A minimal upload sketch appears after this list.
  • A project and a Logstore are created. For more information, see Create a project and Create a Logstore.
  • Log Service is authorized to assume the AliyunLogImportOSSRole role to access your OSS resources. You can complete the authorization on the Cloud Resource Access Authorization page.
    If you use a RAM user, you must grant the PassRole permission to the RAM user. The following example shows a policy that you can use to grant the PassRole permission. For more information, see Create a custom policy and Grant permissions to a RAM user.
    {
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "ram:PassRole",
          "Resource": "acs:ram:*:*:role/aliyunlogimportossrole"
        }
      ],
      "Version": "1"
    }
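
  The following is a minimal sketch of the first prerequisite: uploading a local log file to an OSS bucket with the oss2 Python SDK. The endpoint, bucket name, credentials, and object key are placeholder assumptions, not values from this topic.

    import oss2

    # Placeholder credentials and endpoint; replace with your own values.
    auth = oss2.Auth('<access_key_id>', '<access_key_secret>')
    bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', '<bucket-name>')

    # Upload a local log file. Keys that share a prefix such as logs/ can later
    # be targeted with the Folder Prefix parameter of the import configuration.
    bucket.put_object_from_file('logs/access.log', 'access.log')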

Create a data import configuration

Note OSS objects of the Normal type are tracked with object-level marks: after an object of this type is modified, all data of the object is imported to Log Service again. OSS objects of the Appendable type are tracked with row-level marks: after an object of this type is modified, only the appended data is imported to Log Service.
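
The note above matters if you append to objects. The following hedged sketch shows how an Appendable object is written with the oss2 SDK; only data after the previous write position is new, which is why the import job can pick up just the appended rows. Credentials and names are placeholders.

    import oss2

    auth = oss2.Auth('<access_key_id>', '<access_key_secret>')
    bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', '<bucket-name>')

    # Create an appendable object, then append one more line to it. Each call
    # returns the next write position, so the object grows strictly by appends.
    result = bucket.append_object('app.log', 0, 'first line\n')
    bucket.append_object('app.log', result.next_position, 'appended line\n')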
  1. Log on to the Log Service console.
  2. On the Data Import tab in the Import Data section, click OSS - Data Import.
  3. Select the project and Logstore. Then, click Next.
  4. In the Configure Import Settings step, create a data import configuration.
    1. In the Specify Data Source step, configure the parameters. The following table describes the parameters.
      Config Name: The name of the data import configuration.
      OSS Region: The region where the OSS bucket that stores the OSS objects you want to import resides. If the OSS bucket and the Log Service project reside in the same region, no Internet traffic is generated and data is transferred at a higher speed.
      Bucket: The OSS bucket.
      Folder Prefix: The directory of the OSS objects. If you configure this parameter, the system can find the OSS objects that you want to import more efficiently. For example, if the OSS objects that you want to import are stored in the csv/ directory, set this parameter to csv/. If you leave this parameter empty, the system traverses the entire OSS bucket to find the objects.
      Regular Expression Filter: The regular expression that is used to filter OSS objects. Only the objects whose names, including their paths, match the regular expression are imported. By default, this parameter is empty, which indicates that no filtering is performed. For example, if an OSS object that you want to import is named testdata/csv/bill.csv, you can set this parameter to (testdata/csv/)(.*). A sketch of how Folder Prefix and Regular Expression Filter narrow the traversal appears after this table.
      Data Format: The format of the OSS objects. Valid values:
      • CSV: You can use the first line of an OSS object as field names or specify custom field names. All lines except the first line are parsed as the values of log fields.
      • Single-line JSON: An OSS object is read line by line. Each line is parsed as a JSON object. The fields in JSON objects are log fields.
      • Parquet: An OSS object is automatically parsed into the format that is supported by Log Service. You do not need to configure further settings. If you use this format, you cannot preview data.
      • Single-line Text: Each line in an OSS object is parsed as a log.
      • Multi-line Text: Multiple lines in an OSS object are parsed as one log. You can specify a regular expression to match the first line or the last line of a log.
      Compression Format: The compression format of the OSS objects that you want to import. Log Service decompresses the OSS objects based on the specified format to read data.
      Encoding Format: The encoding format of the OSS objects that you want to import.
      Restore Archived Files: If the OSS objects are Archive objects, Log Service cannot read data from the objects unless the objects are restored. If you turn on this switch, Archive objects are automatically restored.
      Note
      • It takes approximately 1 minute to restore Archive objects, which may cause the first preview to time out. If the first preview times out, wait a moment and try again.
      • Cold Archive objects cannot be imported. If you want to import Cold Archive objects, you must first restore the objects by using OSS API operations. For more information, see Cold Archive.
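
      The following minimal sketch shows how Folder Prefix and Regular Expression Filter narrow the set of imported objects, using the oss2 Python SDK. The credentials, endpoint, and bucket name are placeholders, and the full-match semantics shown here mirror the table's example rather than the service's exact implementation.

        import re
        import oss2

        auth = oss2.Auth('<access_key_id>', '<access_key_secret>')
        bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', '<bucket-name>')

        folder_prefix = 'testdata/csv/'                   # Folder Prefix
        name_filter = re.compile(r'(testdata/csv/)(.*)')  # Regular Expression Filter

        # Listing with a prefix avoids traversing the entire bucket.
        for obj in oss2.ObjectIterator(bucket, prefix=folder_prefix):
            if name_filter.fullmatch(obj.key):
                print('would import:', obj.key)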
    2. Click Preview. The preview results are displayed.
    3. Confirm the preview results and click Next.
    4. In the Specify Data Type step, configure the parameters. The following tables describe the parameters.
      • Parameters related to log time
        Use System Time: Specifies whether to use the system time.
        • If you turn on Use System Time, the time field of a parsed log indicates the system time at which the log is imported.
        • If you turn off Use System Time, you must manually specify the time field and time format.
        Note We recommend that you turn on Use System Time. You can configure an index for the time field and use the index to query logs. If you import historical data that was generated earlier than the current time minus the data retention period of the Logstore, you cannot query the data in the Log Service console. For example, if you import data that was generated seven days ago into a Logstore whose data retention period is seven days, no results are returned when you query the data in the Log Service console.
        Regex to Extract Time: If you select Single-line Text or Multi-line Text for Data Format and turn off Use System Time, you must specify a regular expression to extract the log time. For example, if a sample log from a log file is 127.0.0.1 - - [10/Sep/2018:12:36:49 0800] "GET /index.html HTTP/1.1", you can set Regex to Extract Time to [0-9]{0,2}\/[0-9a-zA-Z]+\/[0-9:,]+. A sketch of this extraction appears after these tables.
        Time Field: If you select CSV, Single-line JSON, or Parquet for Data Format and turn off Use System Time, you must specify a time field. For example, if the preview results of a CSV file contain a column named time_local, you can set Time Field to time_local.
        Time Format: If you turn off Use System Time, you must specify a time format that is supported by Java SimpleDateFormat. The time format is used to parse the time field. For more information about the time format syntax, see Class SimpleDateFormat. For more information about common time formats, see Time formats.
        Note Java SimpleDateFormat does not support UNIX timestamps. If you want to use UNIX timestamps, set this parameter to epoch.
        Time Zone: If you turn off Use System Time, you must specify a time zone. The time zone is used to parse the log time to obtain time zone information. If the extracted log time includes time zone information, this parameter is ignored.
      • Other parameters
        • Unique parameters when you set Data Format to CSV
          Delimiter: The delimiter for logs. The default value is a comma (,).
          Quote: The quote that is used to enclose a log field if the log field contains delimiters. The default value is double quotation marks (").
          Escape Character: The escape character for logs. The default value is a backslash (\).
          Max Lines for Multiline Logging: The maximum number of lines allowed for a log if the original log spans multiple lines. Default value: 1.
          First Line as Field Name: If you turn on First Line as Field Name, the first line of a CSV file is parsed as the field names.
          Custom Fields: If you turn off First Line as Field Name, you can specify custom field names based on your business requirements. Separate multiple field names with commas (,).
          Lines to Skip: The number of lines to skip. For example, if you set this parameter to 1, the first line of a CSV file is skipped, and log collection starts from the second line.
        • Unique parameters when you set Data Format to Multi-line Text
          Position to Match with Regex: How the regular expression is applied.
          • If you select Regex to Match First Line Only, the regular expression is used to match the first line of a log. The unmatched lines that follow are collected as part of the same log until the maximum number of lines that you specify is reached. A sketch of this grouping appears after these tables.
          • If you select Regex to Match Last Line Only, the regular expression is used to match the last line of a log. The unmatched lines that follow are collected as part of the next log until the maximum number of lines that you specify is reached.
          Regular Expression: The regular expression. You can specify the regular expression based on the log content. For more information, see How do I modify a regular expression?
          Max Lines: The maximum number of lines allowed for a log.
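
      The following minimal sketch, using only the Python standard library, shows how the Regex to Extract Time example from the first table behaves on the sample log line. The regular expression is copied verbatim from the table; the Time Format pattern in the comment is an assumption that matches the extracted string.

        import re

        sample = '127.0.0.1 - - [10/Sep/2018:12:36:49 0800] "GET /index.html HTTP/1.1"'
        match = re.search(r'[0-9]{0,2}\/[0-9a-zA-Z]+\/[0-9:,]+', sample)
        # Prints 10/Sep/2018:12:36:49; a Time Format such as dd/MMM/yyyy:HH:mm:ss
        # would then parse this string.
        print(match.group())

      The next sketch illustrates the Regex to Match First Line Only behavior for Multi-line Text: a line that matches the first-line expression starts a new log, and unmatched lines are appended to the current log. The first-line pattern and the sample lines are assumptions chosen for illustration.

        import re

        first_line = re.compile(r'^\d{4}-\d{2}-\d{2}')  # assumed first-line pattern
        lines = [
            '2023-01-01 10:00:00 ERROR something failed',
            '  at com.example.Main.run(Main.java:42)',
            '2023-01-01 10:00:05 INFO recovered',
        ]

        logs, current = [], []
        for line in lines:
            if first_line.match(line) and current:
                logs.append('\n'.join(current))  # a matching line closes the previous log
                current = []
            current.append(line)
        if current:
            logs.append('\n'.join(current))
        # logs[0] is the two-line ERROR log; logs[1] is the single INFO log.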
    5. After you configure the parameters, click Test.
    6. After the test succeeds, click Next.
    7. In the Specify Scheduling Interval step, configure the parameters. The following table describes the parameters.
      Import Interval: The interval at which the OSS objects are imported to Log Service.
      Import Now: If you turn on Import Now, the OSS objects are immediately imported.
    8. Click Next.
  5. Preview data, configure indexes, and then click Next.
    By default, full-text indexing is enabled for Log Service. You can also configure field indexes based on collected logs in manual or automatic mode. For more information, see Configure indexes.
    Note If you want to query and analyze logs, you must enable full-text indexing or field indexing. If you enable both full-text indexing and field indexing, the system uses only field indexes.
  6. Click Next to complete the creation.

View the data import configuration

After you create the data import configuration, you can view the configuration details and related statistical reports in the Log Service console.

  1. In the Projects section, click the project to which the data import configuration belongs.
  2. In the left-side navigation pane, choose Log Storage > Logstores. Click the Logstore to which the data import configuration belongs, choose Data Import > Data Import, and then click the name of the data import configuration.
  3. On the Import Configuration Overview page, view the basic information and statistical reports of the data import configuration.

What to do next

On the Import Configuration Overview page, you can perform the following operations on the data import configuration:

  • Modify the configuration

    Click Modify Settings to modify the data import configuration. For more information, see Create a data import configuration.

  • Delete the configuration
    Click Delete Configuration to delete the data import configuration.
    Warning After the data import configuration is deleted, it cannot be recovered.

FAQ

• Issue: I cannot select an OSS bucket when I create a data import configuration.
  Cause: The AliyunLogImportOSSRole role is not assigned to Log Service.
  Solution: Complete authorization based on the descriptions in the "Prerequisites" section of this topic.
• Issue: Data cannot be imported.
  Cause: The sizes of some OSS objects exceed 5 GB.
  Solution: Reduce the sizes of the OSS objects so that each object is no larger than 5 GB.
• Issue: After data is imported, I cannot query or analyze the data.
  Cause: No indexes are configured, or the configured indexes have not taken effect.
  Solution: Before you import data, we recommend that you configure indexes for the Logstore to which you want to import the data. For more information, see Configure indexes. If the issue has already occurred, reconfigure indexes for the Logstore. For more information, see Reindex logs for a Logstore.
• Issue: Archive objects cannot be imported.
  Cause: Restore Archived Files is turned off.
  Solution: Use one of the following methods. A restore sketch appears after this list.
  • Method 1: Modify the data import configuration and turn on Restore Archived Files.
  • Method 2: Create a data import configuration and turn on Restore Archived Files.
• Issue: The Regular Expression Filter parameter is specified, but no data is collected.
  Cause: The specified regular expression is invalid, or a large number of OSS objects are stored in the OSS bucket and the system cannot finish traversing the bucket to match objects before the operation times out.
  Solution: Reconfigure the Regular Expression Filter parameter. If the issue persists, the cause may be that a large number of OSS objects are stored in the OSS bucket. In this case, specify a more specific directory in the Folder Prefix parameter to reduce the number of objects that must be traversed.
• Issue: Logs are imported, but no data is found in the Log Service console.
  Cause: The log time is beyond the data retention period of the Logstore, and expired data is deleted.
  Solution: Check the time range of the query and the data retention period of the Logstore.
• Issue: The extracted log time is used to query data, but no data is found for that time.
  Cause: The specified time format is invalid.
  Solution: Check whether the time format is supported by Java SimpleDateFormat. For more information, see Class SimpleDateFormat.
• Issue: An error occurred when an OSS object in the Multi-line Text format was parsed.
  Cause: The regular expression that is used to match the first line or the last line of a log is invalid.
  Solution: Specify a valid regular expression.
• Issue: The import speed suddenly slows down.
  Cause: The amount of data that remains to be imported is insufficient, or a large number of OSS objects are stored in the OSS bucket and a large amount of time is consumed traversing the bucket.
  Solution:
  1. Check whether the OSS bucket has sufficient data that needs to be imported.
  2. Check whether a large amount of time is consumed traversing the OSS bucket because it stores a large number of objects. If so, configure the Folder Prefix and Regular Expression Filter parameters to reduce the number of objects that must be traversed. Alternatively, in the OSS console, move the objects that have already been imported to Log Service to a different directory or bucket.
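
As a companion to the Archive-related entries above, here is a hedged sketch of restoring an object through the oss2 Python SDK so that it becomes readable before import. The credentials, endpoint, bucket name, and object key are placeholders.

    import oss2

    auth = oss2.Auth('<access_key_id>', '<access_key_secret>')
    bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', '<bucket-name>')

    # Initiate a restore request for an Archive object. The object becomes
    # readable once the restore completes (roughly a minute for Archive).
    bucket.restore_object('testdata/csv/bill.csv')

    # Check the restore status via the x-oss-restore response header.
    meta = bucket.head_object('testdata/csv/bill.csv')
    print(meta.headers.get('x-oss-restore'))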