Import log files from Azure Blob into Simple Log Service (SLS) for centralized management, query, and analysis. The service supports importing individual Azure Blob files up to 5 GB. For compressed files, this limit applies to the post-decompression size.
This document is the intellectual property of Alibaba Cloud. It describes how Alibaba Cloud services interact with third-party products, and as such, may reference third-party company or product names.
Prerequisites
You have uploaded log files to Azure Blob.
You have created a project and logstore.
Create a data import configuration
Log on to the Simple Log Service console.
In the Quick Data Import section, click Import Data. Then select AzureBlob - Data Import.
Select the destination project and logstore, and then click Next.
In the Import Configuration step, set the following parameters:
Parameter
Description
Job Name
The unique name of the SLS task.
Display Name
The display name of the task.
Job Description
The description of the import task.
ContainerName
The name of the Azure Blob container.
AccountName
The name of the Azure Blob account.
AccountKey
The key for the Azure Blob account.
File Path Prefix Filter
Filters Azure Blob files by their path prefix. For example, if all files to import are in the csv/ directory, set the prefix to csv/.
If you do not set this parameter, the entire Azure Blob container is traversed.
Note: Set this parameter. Traversing a container with many files is highly inefficient.
File Path Regex Filter
Filters Azure Blob files by using a regular expression to locate the files to import. Only files whose full paths (file name including the directory path) match the regular expression are imported. By default, this parameter is empty, and no filtering is applied.
For example, if an Azure Blob file is testdata/csv/bill.csv, set the regular expression to (testdata/csv/)(.*). For more information about adjusting regular expressions, see How to debug regular expressions.
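The interaction between the prefix filter and the regex filter can be sketched in plain Python, using the example paths from this topic (the file list below is hypothetical):

```python
import re

# Hypothetical blob paths; the prefix and pattern mirror the examples
# in this topic.
paths = [
    "testdata/csv/bill.csv",
    "testdata/csv/usage.csv",
    "testdata/json/bill.json",
]

prefix = "testdata/csv/"
pattern = re.compile(r"(testdata/csv/)(.*)")

# The prefix filter narrows the directory listing; the regex must then
# match the full path for a file to be imported.
selected = [p for p in paths if p.startswith(prefix) and pattern.fullmatch(p)]
print(selected)  # ['testdata/csv/bill.csv', 'testdata/csv/usage.csv']
```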
File Modification Time Filter
Filter Azure Blob files by modification time to locate the files to import.
All: Imports all matching files.
From Specific Time: Imports files modified after a specific point in time.
Specific Time Range: Imports files modified within a specific time range.
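The three modification-time options behave like the following sketch, where the blob list and timestamps are hypothetical stand-ins for blob metadata:

```python
from datetime import datetime, timezone

# Hypothetical (path, last-modified) pairs standing in for blob metadata.
blobs = [
    ("csv/old.csv", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    ("csv/new.csv", datetime(2024, 6, 1, tzinfo=timezone.utc)),
]

# "From Specific Time": keep files modified at or after the cutoff.
cutoff = datetime(2024, 3, 1, tzinfo=timezone.utc)
from_specific = [p for p, mtime in blobs if mtime >= cutoff]

# "Specific Time Range": keep files modified inside [start, end].
start = datetime(2023, 12, 1, tzinfo=timezone.utc)
end = datetime(2024, 2, 1, tzinfo=timezone.utc)
in_range = [p for p, mtime in blobs if start <= mtime <= end]

print(from_specific, in_range)  # ['csv/new.csv'] ['csv/old.csv']
```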
Data Format
The format used to parse the data in the file.
CSV: A text file that uses a separator. Use the first row as field names or specify field names manually. Each subsequent row is parsed as the values of log fields.
Single-line JSON: Reads the Azure Blob file line by line and parses each line as a JSON object. The fields in the JSON object correspond to the log fields.
Single-line Text Log: Parses each line in the Azure Blob file as a log entry.
Multi-line Text Logs: A multi-line mode that parses logs using a regular expression for the first or last line.
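As an illustration of the Single-line JSON format, the following sketch (plain Python with hypothetical log content) reads a blob's content line by line and turns each JSON object's keys into log fields:

```python
import json

# Hypothetical blob content in Single-line JSON format: one JSON object
# per line; each object's keys become log fields.
raw = '{"level": "INFO", "msg": "started"}\n{"level": "WARN", "msg": "slow"}\n'

logs = [json.loads(line) for line in raw.splitlines() if line.strip()]
print(logs[1]["level"])  # WARN
```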
Compression Format
The compression format of the Azure Blob files to import. SLS decompresses and reads the data based on the specified format.
Encoding Format
The encoding format of the Azure Blob files to import. Only UTF-8 and GBK are supported.
New File Check Cycle
If new files are continuously added to the specified Azure Blob path, configure this parameter. The import task will then run in the background to periodically discover and import new files, ensuring that data from the same file is never imported more than once.
If no new files will be added, set this to Never Check. The import task automatically exits after it reads all matching files.
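The dedup guarantee of the check cycle can be sketched as follows; the function name and state shape are hypothetical, but the idea is the same: remember which files were already imported so that repeated checks never import the same file twice.

```python
def discover(listing, seen):
    """Return paths not yet imported, and record them so that the same
    file is never imported twice across check cycles."""
    todo = [p for p in listing if p not in seen]
    seen.update(todo)
    return todo

seen = set()
print(discover(["csv/a.csv", "csv/b.csv"], seen))  # ['csv/a.csv', 'csv/b.csv']
print(discover(["csv/a.csv", "csv/c.csv"], seen))  # ['csv/c.csv']
```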
Log Time Configuration
Time Field
When Data Format is set to CSV or Single-line JSON, this parameter specifies the column name to be used as the log's timestamp during import.
Regular Expression to Extract Time
Extracts the timestamp from a log by using a regular expression.
For example, if a sample log is 127.0.0.1 - - [10/Sep/2018:12:36:49 +0800] "GET /index.html HTTP/1.1", set this parameter to [0-9]{0,2}\/[0-9a-zA-Z]+\/[0-9:,]+.
Note: For any data format, a regular expression can be used to extract a specific portion of the time field's value.
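Applying the example regular expression to the sample log can be checked with a few lines of Python:

```python
import re

# The sample log line and regular expression from this topic.
log = '127.0.0.1 - - [10/Sep/2018:12:36:49 +0800] "GET /index.html HTTP/1.1"'
time_regex = re.compile(r"[0-9]{0,2}/[0-9a-zA-Z]+/[0-9:,]+")

# search() finds the first span that matches: the day/month/time portion.
match = time_regex.search(log)
print(match.group(0))  # 10/Sep/2018:12:36:49
```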
Time Field Format
The time format used to parse the value of the time field.
Supports Java SimpleDateFormat syntax, such as yyyy-MM-dd HH:mm:ss. For more information about the syntax, see Class SimpleDateFormat. For common time formats, see Time formats.
Supports epoch formats, including epoch, epochMillis, epochMicro, and epochNano.
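For reference, the Python sketch below shows what the two families of formats mean; the console itself uses Java SimpleDateFormat syntax, so the strptime pattern here is only the Python analogue of yyyy-MM-dd HH:mm:ss, and the sample values are hypothetical:

```python
from datetime import datetime, timezone

# yyyy-MM-dd HH:mm:ss (SimpleDateFormat) corresponds to this strptime pattern.
ts = datetime.strptime("2018-09-10 12:36:49", "%Y-%m-%d %H:%M:%S")
print(ts.year, ts.second)  # 2018 49

# epochMillis: integer milliseconds since 1970-01-01 00:00:00 UTC.
epoch_millis = 1536582000000
dt = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
print(dt.isoformat())  # 2018-09-10T12:20:00+00:00
```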
Time Zone
Select the time zone for the time field. You do not need to set a time zone when the time field format is an epoch type.
If parsing the log time needs to account for daylight saving time, select a UTC format. Otherwise, select a GMT format.
Note: The default time zone is UTC+8.
When you set Data Format to CSV, you must configure additional parameters, as described in the following table.
CSV-specific parameters
Parameter
Description
Delimiter
Set the separator for the logs. The default is a comma (,).
Quote
The quote character used for CSV strings.
Escape Character
Configure the escape character for the logs. The default is a backslash (\).
First Line as Field Name
If you enable the First Line as Field Name option, the first row of the CSV file is used as the field names.
Custom Fields
If you disable the First Line as Field Name option, define custom field names as needed. Separate multiple field names with a comma (,).
Lines to Skip
Specify the number of log lines to skip. For example, a value of 1 means log collection starts from the second line of the CSV file.
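How the CSV parameters combine can be sketched with Python's csv module as a stand-in for the importer; the file content below is hypothetical:

```python
import csv
import io

# Hypothetical CSV blob: one comment line to skip, then a header row.
raw = '# exported 2024-06-01\nip,method,status\n127.0.0.1,GET,200\n127.0.0.1,"POST",404\n'

# Delimiter = comma, Quote = double quotation mark.
reader = csv.reader(io.StringIO(raw), delimiter=",", quotechar='"')
rows = list(reader)[1:]             # Lines to Skip = 1: drop the comment line
fields, values = rows[0], rows[1:]  # First Line as Field Name: enabled
logs = [dict(zip(fields, v)) for v in values]
print(logs[1]["method"])  # POST
```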
Multi-line text log-specific parameters
Parameter
Description
Position to Match Regular Expression
Set the position for the regular expression to match. The options are:
Regular Expression to Match First Line: Uses a regular expression to match the first line of a log entry. Unmatched lines are considered part of the current log entry, up to the maximum number of lines.
Regular Expression to Match Last Line: Uses a regular expression to match the last line of a log entry. Unmatched lines are considered part of the next log entry, up to the maximum number of lines.
Regular Expression
Set the correct regular expression based on the log content.
For more information, see How to debug regular expressions.
Maximum Lines
The maximum number of lines for a single log entry.
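First-line matching behaves like the following sketch: a line that matches the regular expression starts a new log entry, and unmatched lines (for example, a stack trace) are appended to the current entry. The log lines and pattern are hypothetical:

```python
import re

# A line starting with a bracketed date begins a new log entry.
first_line = re.compile(r"\[\d{4}-\d{2}-\d{2}")

lines = [
    "[2024-06-01 10:00:00] ERROR boom",
    "  at service.handle()",
    "  at main()",
    "[2024-06-01 10:00:05] INFO recovered",
]

entries = []
for line in lines:
    if first_line.match(line) or not entries:
        entries.append(line)          # start a new entry
    else:
        entries[-1] += "\n" + line    # continuation of the current entry

print(len(entries))  # 2
```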
Click Preview to preview the import results.
After you confirm the settings, click Next.
Preview the data and create indexes, and then click Next.
Full-text indexing is enabled by default. Alternatively, field indexes can be created manually based on the collected logs, or generated automatically via the Automatic Index Generation feature.
ImportantTo query and analyze logs, you must enable either full-text indexing or field indexing. If both are enabled, field indexing takes precedence.
View the import configuration
After you create an import configuration, view it and its statistical reports in the console.
In the Projects section, click the destination project.
In the navigation pane, select the destination logstore, open its data import configurations, and click the configuration name.
View the basic information and statistical reports for the import configuration.
You can also modify the configuration, start or stop the import, or delete the configuration.
WarningA deleted configuration cannot be recovered. Proceed with caution.
Billing
SLS does not charge for the data import feature. However, this feature accesses service provider APIs, which incurs traffic and request fees. The following pricing model applies. The actual fees are subject to the service provider's bill.

Field | Description |
| Total data volume imported per day, in GB. |
| Egress traffic fee per GB of data. |
| Fee for every 10,000 Put requests. |
| Fee for every 10,000 Get requests. |
| New file check interval, in minutes. You can set this interval when you create the data import configuration. |
| Number of files that can be listed in the Container based on the prefix. |
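As a rough illustration of the model above, the sketch below totals the egress traffic fee and the per-10,000 request fees; all unit prices and workload figures are hypothetical, and actual fees are subject to the service provider's bill:

```python
def daily_fee(data_gb, price_per_gb, get_count, get_price_per_10k,
              put_count, put_price_per_10k):
    """Egress traffic fee plus request fees; all prices are hypothetical."""
    return (data_gb * price_per_gb
            + get_count / 10000 * get_price_per_10k
            + put_count / 10000 * put_price_per_10k)

# Example: 100 GB/day, 50,000 Get requests, 1,000 Put requests.
print(round(daily_fee(100, 0.08, 50000, 0.004, 1000, 0.05), 4))  # 8.025
```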
FAQ
Problem | Possible cause | Solution |
No data is displayed in the preview. | There are no files in Azure Blob, the files are empty, or no files match the filter conditions. | |
The data contains garbled characters. | The Data Format, Compression Format, or Encoding Format is configured incorrectly. | Confirm the actual format of the Azure Blob files, and then adjust the Data Format, Compression Format, or Encoding Format settings. To fix existing garbled data, create a new logstore and a new import configuration. |
The timestamp displayed in SLS does not match the timestamp in the data. | When configuring the import, the log time field was not specified, or the time format or time zone was set incorrectly. | Set the specified log time field and the correct time format and time zone. |
After importing data, you cannot query or analyze it. | | |
The number of imported log entries is less than expected. | Some files contain single lines of data larger than 3 MB, which are dropped during import. For more information, see Collection limits. | When writing data to Azure Blob files, ensure that no single line of data exceeds 3 MB. |
The number of files and the total data volume are large, but the import speed is slower than expected. The normal speed can reach 80 MB/s. | The number of logstore shards is too low. For more information, see Performance limits. | If the number of logstore shards is low, try increasing the number of shards to 10 or more and monitor the latency. For more information, see Manage shards. |
Some files were not imported. | The filter conditions are set incorrectly, or some individual files are larger than 5 GB. For more information, see Collection limits. | |
Multi-line text logs are parsed incorrectly. | The first-line or last-line regular expression is set incorrectly. | Verify that the regular expression for the first or last line is correct. |
High latency for new file imports. | Too many existing files match the file path prefix filter. | To avoid inefficient discovery of new files when more than 1 million files match the path prefix, set a more specific prefix and create multiple import tasks. |
Error handling
Error | Description |
Failed to read file | If an incomplete file error occurs while reading a file due to network exceptions, file corruption, or other issues, the import task automatically retries. If the read fails after three retries, the file is skipped. The retry interval is the same as the new file check interval. If the new file check interval is set to Never Check, the retry interval is 5 minutes. |
Compression format parsing error | If an invalid compression format error occurs when decompressing a file, the import task skips the file. |
Data format parsing error | If data parsing fails, the import task stores the original text content in the content field of the log. |
Container does not exist | The import task retries periodically. After the container is recreated, the import task automatically resumes. |
Permission error | If a permission error occurs, the import task retries periodically and automatically resumes after the issue is resolved. When a permission error occurs, the import task does not skip any files. Therefore, after the permission issue is fixed, the task automatically imports data from any unprocessed files in the container into the SLS logstore. |