You can import log files from Azure Blob into Simple Log Service (SLS) for centralized management, query, and analysis. SLS supports importing individual Azure Blob files up to 5 GB. For compressed files, this limit applies to the compressed size.
This document is the intellectual property of Alibaba Cloud. It describes how Alibaba Cloud services interact with third-party products and may reference third-party company or product names.
Prerequisites
You have uploaded log files to Azure Blob.
You have created a project and a Logstore.
Create a data import configuration
Log on to the Simple Log Service console.
In the Data Ingestion section, on the Data Import tab, select AzureBlob - Data Import.
Select the destination project and Logstore, and then click Next.
In the Import Configuration step, set the following parameters:
Parameter
Description
Job Name
The unique name of the SLS task.
Display Name
The display name of the task.
Job Description
The description of the import task.
ContainerName
The name of the Azure Blob container.
AccountName
The name of the Azure Blob account.
AccountKey
The key for the Azure Blob account.
AzureBlob Endpoint
The endpoint of the Azure Blob service.
Note: You must specify an endpoint in non-public cloud environments. For example, for Azure China Cloud, enter https://<your AccountName>.blob.core.chinacloudapi.cn.
File Path Prefix Filter
Filters Azure Blob files by their path prefix to locate the files to import. For example, if all files to import are in the csv/ directory, set the prefix to csv/.
If you do not set this parameter, the entire Azure Blob container is traversed.
Note: Set this parameter. If a container contains many files, traversing the entire container is highly inefficient.
File Path Regex Filter
Filters Azure Blob files using a regular expression on the file path to locate the files to import. Only files whose names, including the path, match the regular expression are imported. By default, this is empty, which means no files are filtered.
For example, if an Azure Blob file is testdata/csv/bill.csv, you can set the regular expression to (testdata/csv/)(.*). For more information about how to adjust the regular expression, see How to debug regular expressions.
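As a sketch of how such a full-path filter behaves, the following plain-Python snippet applies the example expression above to a few made-up file paths (this illustrates the matching rule only, not the SLS implementation):

```python
import re

# The example expression from this document: matches everything under testdata/csv/.
pattern = re.compile(r"(testdata/csv/)(.*)")

paths = [
    "testdata/csv/bill.csv",
    "testdata/csv/2024/usage.csv",
    "testdata/json/bill.json",
]

# Only paths whose full name (including the directory) matches are imported.
selected = [p for p in paths if pattern.fullmatch(p)]
print(selected)  # ['testdata/csv/bill.csv', 'testdata/csv/2024/usage.csv']
```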
File Modification Time Filter
Filters Azure Blob files by modification time to locate the files to import.
All: Select this option to import all matching files.
From Specific Time: Select this option to import files modified after a specific point in time.
Specific Time Range: Select this option to import files modified within a specific time range.
Data Format
The format used to parse the data in the file.
CSV: A text file that uses a separator. You can use the first row as field names or specify field names manually. Each subsequent row is parsed as the values of log fields.
Single-line JSON: Reads the Azure Blob file line by line and parses each line as a JSON object. The fields in the JSON object correspond to the log fields.
Single-line Text Log: Parses each line in the Azure Blob file as a log entry.
Multi-line Text Logs: A multi-line mode that parses logs using a regular expression for the first or last line.
Compression Format
The compression format of the Azure Blob files to import. SLS decompresses and reads the data based on the specified format.
Encoding Format
The encoding format of the Azure Blob files to import. Only UTF-8 and GBK are supported.
New File Check Cycle
If new files are continuously added to the specified Azure Blob path, configure the New File Check Cycle. The import task then runs in the background to periodically discover and import new files. The backend ensures that data from the same file is never imported more than once.
If no new files will be added, set this to Never Check. The import task automatically exits after it reads all matching files.
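The check cycle described above can be pictured as a polling loop that remembers which files it has already imported. The sketch below is an illustration only, not the SLS implementation; list_files and import_file are hypothetical callables standing in for "list the matching Azure Blob files" and "import one file":

```python
import time

# Minimal sketch of periodic new-file discovery that never imports
# the same file twice across check cycles.
def run_check_cycles(list_files, import_file, cycles, interval_s=0):
    imported = set()  # file names already imported in earlier cycles
    for _ in range(cycles):
        for name in list_files():
            if name not in imported:
                import_file(name)
                imported.add(name)
        time.sleep(interval_s)  # the configured check cycle
    return imported

# Usage: the second cycle sees one new file (c.log) and imports only that one.
snapshots = [["a.log", "b.log"], ["a.log", "b.log", "c.log"]]
it = iter(snapshots)
done = run_check_cycles(lambda: next(it), print, cycles=2)
```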
Log Time Configuration
Time Field
When you set Data Format to CSV or Single-line JSON, this parameter specifies the column name to be used as the log's timestamp during import.
Regular Expression to Extract Time
You can use a regular expression to extract the time from a log.
For example, for the sample log 127.0.0.1 - - [10/Sep/2018:12:36:49 0800] "GET /index.html HTTP/1.1", set Regular Expression to Extract Time to [0-9]{0,2}\/[0-9a-zA-Z]+\/[0-9:,]+.
Note: For other data formats, you can also use a regular expression to extract a specific portion of the time field's value.
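You can verify the expression against the sample log above with plain Python before entering it in the console:

```python
import re

# Sample log line and time-extraction expression from this document.
log = '127.0.0.1 - - [10/Sep/2018:12:36:49 0800] "GET /index.html HTTP/1.1"'
time_re = re.compile(r"[0-9]{0,2}\/[0-9a-zA-Z]+\/[0-9:,]+")

# search() returns the first substring that matches the expression.
m = time_re.search(log)
print(m.group(0))  # 10/Sep/2018:12:36:49
```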
Time Field Format
Specifies the time format to parse the value of the time field.
Supports Java SimpleDateFormat syntax, such as yyyy-MM-dd HH:mm:ss. For more information about the syntax, see Class SimpleDateFormat. For common time formats, see Time formats.
Supports epoch formats, including epoch, epochMillis, epochMicro, and epochNano.
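SimpleDateFormat is Java pattern syntax; as an intuition-building sketch only (SLS itself interprets the Java pattern), the equivalent parse in Python looks like this:

```python
from datetime import datetime, timezone

# A value matching the SimpleDateFormat pattern yyyy-MM-dd HH:mm:ss,
# parsed with the equivalent Python strptime directives.
ts = datetime.strptime("2018-09-10 12:36:49", "%Y-%m-%d %H:%M:%S")

# Epoch variants differ only in precision: epoch is seconds,
# epochMillis is milliseconds, and so on.
epoch_millis = 1536582000123
seconds = epoch_millis / 1000  # convert epochMillis to seconds
print(ts.isoformat(), datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat())
```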
Time Zone
Select the time zone for the time field. You do not need to set a time zone when the time field format is an epoch type.
If parsing the log time needs to account for daylight saving time, select a UTC format. Otherwise, select a GMT format.
Note: The default time zone is UTC+8.
When you set Data Format to CSV, you must configure additional parameters, as described in the following table.
CSV-specific parameters
Parameter
Description
Delimiter
The separator for the logs. The default is a comma (,).
Quote
The quote character used for CSV strings.
Escape Character
The escape character for the logs. The default is a backslash (\).
First Line as Field Names
After you turn on the First line as field name switch, the first row of the CSV file is used as the field names.
Custom Fields
If you disable the First line as field name switch, define custom field names as needed. Separate multiple field names with a comma (,).
Lines to Skip
The number of log lines to skip. For example, a value of 1 means log collection starts from the second line of the CSV file.
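The CSV parameters above map directly onto ordinary CSV parsing. The following sketch shows their effect on a small in-memory file (an illustration, not the SLS parser), with Delimiter set to a comma, Quote set to a double quotation mark, Lines to Skip set to 1, and the first remaining line used as field names:

```python
import csv

# A file with one header-like line to skip, then a field-name row, then data.
raw = 'exported by tool vX\nname,city\n"Doe, J",Paris\n'

lines = raw.splitlines()[1:]  # Lines to Skip = 1
reader = csv.reader(lines, delimiter=",", quotechar='"')
fields = next(reader)  # first line as field names: ['name', 'city']

# Each subsequent row becomes the values of the log fields.
logs = [dict(zip(fields, row)) for row in reader]
print(logs)  # [{'name': 'Doe, J', 'city': 'Paris'}]
```

Note how the quote character keeps the comma inside "Doe, J" from being treated as a delimiter.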
Multi-line text log-specific parameters
Parameter
Description
Position to Match Regular Expression
Set the position for the regular expression to match. The options are:
First-line regular expression: Uses a regular expression to match the first line of a log entry. Unmatched lines are considered part of the current log entry, up to the maximum number of lines.
Last-line regular expression: Uses a regular expression to match the last line of a log entry. Unmatched lines are considered part of the next log entry, up to the maximum number of lines.
Regular Expression
Set the correct regular expression based on the log content.
For more information, see How to debug regular expressions.
Maximum Lines
The maximum number of lines for a single log entry.
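The first-line matching mode above can be sketched as follows (an illustration of the grouping rule, not the SLS implementation; the expression and log lines are made up):

```python
import re

# A line matching the first-line expression starts a new log entry;
# non-matching lines are appended to the current entry.
first_line = re.compile(r"\[\d{4}-\d{2}-\d{2}.*")

lines = [
    "[2024-01-01 10:00:00] error occurred",
    "  at frame 1",
    "  at frame 2",
    "[2024-01-01 10:00:05] recovered",
]

entries, current = [], []
for line in lines:
    if first_line.match(line) and current:
        entries.append("\n".join(current))  # close the previous entry
        current = []
    current.append(line)
if current:
    entries.append("\n".join(current))
print(len(entries))  # 2
```

A production version would also cap each entry at the configured maximum number of lines.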
Click Preview to preview the import results.
After you confirm that the settings are correct, click Next.
Preview the data, create indexes, and then click Next.
By default, full-text indexing is enabled in Simple Log Service. Alternatively, you can manually create field indexes based on the collected logs or click Automatically Generate Indexes to automatically generate them.
Important: To query and analyze logs, you must enable either full-text indexing or field indexing. If both are enabled, field indexing takes precedence.
View the import configuration
After you create an import configuration, you can view it and its statistical reports in the console.
In the Projects section, click the destination project.
In the navigation pane, select the destination Logstore, open its list of data import configurations, and then click the name of the configuration.
View the basic information and statistical reports for the import configuration.
You can also modify the configuration, start or stop the import, or delete the configuration.
WarningA deleted configuration cannot be recovered. Proceed with caution.
Billing
SLS does not charge for the data import feature. However, this feature accesses service provider APIs, which may incur traffic and request fees. The pricing model is as follows. The actual fees are subject to the service provider's bill.

| Field | Description |
| --- | --- |
|  | Total data volume imported per day, in GB. |
|  | Egress traffic fee per GB of data. |
|  | Fee for every 10,000 PUT requests. |
|  | Fee for every 10,000 GET requests. |
|  | New file check interval, in minutes. You can set the new file check interval when you create the data import configuration. |
|  | The number of files that can be listed in the container based on the prefix. |
FAQ
| Problem | Possible cause | Solution |
| --- | --- | --- |
| No data is displayed in the preview. | There are no files in Azure Blob, the files are empty, or no files match the filter conditions. |  |
| The data contains garbled text. | The Data Format, Compression Format, or Encoding Format is configured incorrectly. | Confirm the actual format of the Azure Blob files, and then adjust the Data Format, Compression Format, or Encoding Format settings. To fix existing garbled data, create a new Logstore and a new import configuration. |
| The timestamp displayed in SLS does not match the timestamp in the data. | When the import was configured, the log time field was not specified, or the time format or time zone was set incorrectly. | Specify the log time field and configure the correct time format and time zone. |
| After data is imported, it cannot be queried or analyzed. |  |  |
| The number of imported log entries is less than expected. | Some files contain single lines of data larger than 3 MB, which are dropped during import. For more information, see Collection limits. | When you write data to Azure Blob files, ensure that no single line of data exceeds 3 MB. |
| The number of files and the total data volume are large, but the import speed is slower than expected. The normal import speed can reach 80 MB/s. | The number of Logstore shards is too low. For more information, see Performance limits. | If the number of Logstore shards is low, increase the number of shards to 10 or more and monitor the latency. For more information, see Manage shards. |
| Some files were not imported. | The filter conditions are set incorrectly, or some files are larger than 5 GB. For more information, see Collection limits. |  |
| Multi-line text logs are parsed incorrectly. | The first-line regular expression or last-line regular expression is set incorrectly. | Verify that the first-line regular expression or last-line regular expression is correct. |
| High latency for new file imports. | Too many existing files match the file path prefix filter. | If more than 1 million files match the path prefix, discovering new files is inefficient. To improve efficiency, set a more specific prefix and create multiple import tasks. |
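Two of the problems above (dropped lines over 3 MB and skipped files over 5 GB) can be caught before upload. The following sketch validates a payload against the limits stated in this document; check_payload is a hypothetical helper, not part of any SDK:

```python
# Limits mentioned in this document's FAQ.
MAX_LINE_BYTES = 3 * 1024 * 1024          # single lines above this are dropped
MAX_FILE_BYTES = 5 * 1024 * 1024 * 1024   # files above this are skipped

def check_payload(data: bytes):
    """Return a list of problems that would cause data loss on import."""
    problems = []
    if len(data) > MAX_FILE_BYTES:
        problems.append("file exceeds 5 GB and will be skipped")
    for i, line in enumerate(data.split(b"\n"), start=1):
        if len(line) > MAX_LINE_BYTES:
            problems.append(f"line {i} exceeds 3 MB and will be dropped")
    return problems

# Usage: the second line here is one byte over the per-line limit.
print(check_payload(b"ok line\n" + b"x" * (MAX_LINE_BYTES + 1)))
```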
Error handling
| Error | Description |
| --- | --- |
| Failed to read file | If an incomplete file error occurs while a file is being read due to a network exception, file corruption, or other issues, the import task automatically retries. If the read fails after three retries, the file is skipped. The retry interval is the same as the new file check interval. If the new file check interval is set to Never Check, the retry interval is 5 minutes. |
| Compression format parsing error | If an invalid compression format error occurs when a file is being decompressed, the import task skips the file. |
| Data format parsing error | If data parsing fails, the import task stores the original text content in the content field of the log. |
| Container does not exist | The import task retries periodically. After the container is recreated, the import task automatically resumes. |
| Permission error | If a permission error occurs when reading data from the container or writing data to the SLS Logstore, the import task retries periodically. The task automatically resumes after the permission issue is fixed. When a permission error occurs, the import task does not skip any files. Therefore, after the permission issue is fixed, the task automatically imports data from any unprocessed files in the container into the SLS Logstore. |
Permission error | If a permission error occurs when reading data from the container or writing data to the SLS Logstore, the import task retries periodically. The task automatically resumes after the permission issue is fixed. When a permission error occurs, the import task does not skip any files. Therefore, after the permission issue is fixed, the task automatically imports data from any unprocessed files in the container into the SLS Logstore. |