This topic describes how to import data from Elasticsearch/OpenSearch into Simple Log Service (SLS). Once imported, you can query and analyze the data.
Prerequisites
- You have a running Elasticsearch/OpenSearch cluster.
- You have created a Project and a Logstore. For more information, see Manage a Project and Create a basic LogStore.
Supported versions
This feature supports Elasticsearch 6.3 or later and OpenSearch 1.0.0 or later.
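To confirm that your cluster meets this requirement, you can query the cluster root endpoint, which reports the version number. The following is a minimal sketch using the Python requests library; the endpoint and credentials are placeholders that you must replace with your own values:

```python
import requests

# Placeholder endpoint and credentials; omit auth if authentication is disabled.
resp = requests.get(
    "http://host:9200/",
    auth=("elastic", "your-password"),
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["version"]["number"])  # e.g. "7.10.2"
```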
Create a data import configuration
- Log on to the Simple Log Service console.
- In the Data Collection section, on the Data Import tab, select ES/OpenSearch - Data Import.
- Select the destination Project and Logstore, and then click Next.
- Configure the import settings. In the Import Configuration step, set the following parameters.

Job Name
The unique name of the SLS job.

Display Name
The display name of the job.

Job Description
The description of the import job.

Service Instance URL
The URL of the Elasticsearch/OpenSearch cluster, in the format http://host:port/. You can specify multiple URLs separated by commas (,), for example, http://host1:port1/,http://host2:port2/. The service port of an Elasticsearch/OpenSearch cluster is typically 9200.
Important: If you specify a VPC ID, you must set host to the IPv4 address of the corresponding ECS instance.

Index List
The indexes to import. Separate multiple indexes with commas (,), for example, index1,index2,index3.

User Name
The username for the Elasticsearch/OpenSearch cluster. Required only if authentication is enabled.

User Password
The password for the specified user.

Time Field
The field in your Elasticsearch/OpenSearch index that represents the log time. If you do not specify a time field, Simple Log Service uses the time of import as the log time.
Important: To perform an incremental import, you must specify the Time Field.

Time Field Format
The format used to parse the time field values.
- Supports Java SimpleDateFormat syntax, such as yyyy-MM-dd HH:mm:ss. For more information about the syntax, see Class SimpleDateFormat. For common time formats, see Time formats.
- Supports epoch formats. Valid values are epoch, epochMillis, epochMicro, and epochNano.
Important: To use UNIX timestamps, you must set the Time Field Format to an epoch format.

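If your time field stores UNIX timestamps and you are unsure which epoch variant to select, the magnitude of the values is usually enough to tell. The following is a rough heuristic sketch; the helper name is ours for illustration, not part of SLS, and it assumes current-era timestamps:

```python
def guess_epoch_format(value: int) -> str:
    """Map a numeric timestamp to the matching Time Field Format value."""
    digits = len(str(abs(value)))
    if digits <= 10:
        return "epoch"        # seconds, e.g. 1700000000
    if digits <= 13:
        return "epochMillis"  # milliseconds, e.g. 1700000000000
    if digits <= 16:
        return "epochMicro"   # microseconds
    return "epochNano"        # nanoseconds

print(guess_epoch_format(1700000000000))  # epochMillis
```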
Time Zone
The time zone of the time field. This parameter is not required if the Time Field Format is set to an epoch format.

Query Statement
The query used to filter the data to import. The query must follow the Elasticsearch/OpenSearch query_string format, for example, gender:male AND city:Shanghai. For more information, see Query string query. To test a filter before you create the job, see the sketch below.

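You can run the same query_string filter directly against the cluster to confirm that it matches documents. The following is a minimal sketch using the Python requests library; the endpoint, index name, and filter are placeholders:

```python
import requests

# Placeholder endpoint, index, and filter; add auth=(user, password) if needed.
resp = requests.post(
    "http://host:9200/index1/_search",
    json={"size": 1, "query": {"query_string": {"query": "gender:male AND city:Shanghai"}}},
    timeout=10,
)
resp.raise_for_status()
total = resp.json()["hits"]["total"]
# ES 7+/OpenSearch return {"value": N, ...}; ES 6 returns a plain number.
print(total["value"] if isinstance(total, dict) else total)
```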
Import Mode
The import mode. Valid values:
- Import Only Historical Data: The import job stops automatically after it imports all historical data.
- Automatically Import Incremental Data: The import job runs continuously to import new data.
Important: If you select Automatically Import Incremental Data, you must specify a Time Field.

Start Time
After you specify a start time, data is imported only if the value of the time field is later than or equal to the start time.
Important: This parameter takes effect only when the Time Field is specified.

End Time
After you specify an end time, data is imported only if the value of the time field is earlier than or equal to the end time.
Important: This parameter takes effect only when the Time Field is specified and the Import Mode is set to Import Only Historical Data.

Maximum Data Latency (Seconds)
The maximum allowed delay, in seconds, from when data is generated to when it is indexed in Elasticsearch/OpenSearch.
Important:
- Setting a value lower than the actual latency may result in data loss during the import.
- This parameter takes effect only when the Time Field is specified and the Import Mode is set to Automatically Import Incremental Data.

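One way to picture this parameter (a conceptual model for illustration, not a description of SLS internals): each incremental pass can only safely pick up documents whose indexing delay fits within the configured window, so a document indexed later than that may be missed.

```python
# Conceptual illustration only; SLS's internal scheduling may differ.
checkpoint = 1_700_000_300   # where the previous incremental pass stopped (epoch seconds)
max_data_latency = 60        # the configured Maximum Data Latency

window_start = checkpoint - max_data_latency

# A document whose time field is 1_700_000_250 but that was indexed late:
doc_event_time = 1_700_000_250
if doc_event_time >= window_start:
    print("still inside the re-scan window; it will be imported")
else:
    print("arrived too late; it may be skipped")
```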
Incremental Data Check Interval (Seconds)
The interval, in seconds, at which SLS checks for new data in Elasticsearch/OpenSearch. Default value: 300. Minimum value: 60.

VPC ID
If your source cluster is an Alibaba Cloud Elasticsearch/OpenSearch cluster in a VPC or a self-managed cluster on an ECS instance, specify the VPC ID. This allows SLS to read data over the Alibaba Cloud internal network, which provides better security and network stability.
Important: The source cluster must allow access from the CIDR block 100.104.0.0/16.
- Click Preview to preview the import results.
- After you confirm the preview, click Next.
- Preview the data and configure indexes, and then click Next. By default, Simple Log Service enables a full-text index. You can also manually create field indexes based on the collected logs, or click Automatic Index Generation to have Simple Log Service create them automatically. For more information, see Create indexes.
  Important: We recommend a full-text index if you need to query all fields in your log data. If you need to query only specific fields, use field indexes to reduce index traffic. To analyze a field by using SELECT statements, you must create a field index for it.
- Click Query Log to go to the query and analysis page of your Logstore.
  Wait approximately 1 minute for the indexes to take effect. Then, you can view the collected logs on the Raw Logs tab. For more information about how to query and analyze logs, see Get started.
View a data import configuration
After you create a data import configuration, you can view the configuration details and related statistical reports in the console.
- In the Projects section, click the destination Project.
- Navigate to the destination Logstore. In the left-side navigation pane, find the data import configuration and click its name.
- On the Import Configuration Overview page, view the basic information and statistical reports for the configuration.
Related operations
- Delete a data import configuration
  On the Import Configuration Overview page, you can click Delete Configuration to delete the configuration.
  Warning: This action cannot be undone. Proceed with caution.
- Stop and restart an import job
  After you create a data import configuration, SLS creates a corresponding import job. On the Import Configuration Overview page, you can click Stop to pause the job. You can restart it later.
  Important: A stopped job's state is retained for up to 24 hours. If you do not restart the job within this period, it becomes unavailable and cannot be restarted.
FAQ
| Issue | Possible cause | Solution |
| --- | --- | --- |
| A connection error (…) is reported. | The Service Instance URL is incorrect, or the network settings prevent SLS from reaching the cluster. | Check the Service Instance URL, username, and password, and confirm that the VPC ID and the 100.104.0.0/16 allowlist are configured correctly. |
| A timeout error (…) is reported. | The source Elasticsearch/OpenSearch index contains no data or no data that matches the filter criteria. | Confirm that the source index contains data and that the query statement matches some documents. |
| The log time displayed in Simple Log Service does not match the timestamp in the source data. | The time field was not specified, or the time format or time zone was configured incorrectly. | Specify the correct time field, time format, and time zone. For more information, see Create a data import configuration. |
| Unable to query or analyze imported data. | Indexes are not configured for the Logstore. | Create indexes for the fields that you want to query or analyze. For more information, see Create indexes. |
| The number of imported log entries is less than expected. | Some source data documents may be larger than 3 MB. You can verify this on the Data Processing Insight dashboard. | Reduce the size of individual data documents in the source cluster. For a way to spot oversized documents, see the sketch after this table. |
| When incremental import is enabled, there is a significant delay in importing new data. | | |
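To check from the source side whether oversized documents are the cause, you can sample the index and flag documents whose serialized _source exceeds the 3 MB limit. The following is a minimal sketch; the endpoint and index name are placeholders, and only the first 100 documents are sampled (page with search_after or scroll for a full scan):

```python
import json
import requests

LIMIT = 3 * 1024 * 1024  # 3 MB per-document import limit

# Placeholder endpoint and index; add auth=(user, password) if needed.
resp = requests.post(
    "http://host:9200/index1/_search",
    json={"size": 100, "query": {"match_all": {}}},
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    size = len(json.dumps(hit["_source"]).encode("utf-8"))
    if size > LIMIT:
        print(hit["_id"], size)
```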
Error handling mechanism
| Error | Description |
| --- | --- |
| Communication errors with the Elasticsearch/OpenSearch cluster | The import job uses the scroll API to pull data from Elasticsearch/OpenSearch, with a default keep-alive duration of 24 hours. The job automatically retries if it encounters network connection errors or other communication failures, such as authentication errors. If the connection cannot be restored within 24 hours, the Elasticsearch/OpenSearch cluster clears the scroll session information. This prevents the import job from resuming and causes a "No search context found" error. In this case, you must create a new import job. |
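For reference, the scroll flow described above looks roughly like the following. Each scroll call renews the keep-alive; once the cluster discards the search context, further calls fail with "No search context found". This is a minimal sketch with a placeholder endpoint and index, using a short 1-minute keep-alive instead of the job's 24 hours:

```python
import requests

BASE = "http://host:9200"  # placeholder endpoint

# Open a scroll context with a keep-alive.
resp = requests.post(
    f"{BASE}/index1/_search",
    params={"scroll": "1m"},
    json={"size": 1000, "query": {"match_all": {}}},
    timeout=30,
)
resp.raise_for_status()
body = resp.json()
scroll_id = body["_scroll_id"]

while body["hits"]["hits"]:
    # Each call returns the next page and renews the keep-alive. If the
    # keep-alive lapses first, the cluster clears the search context and
    # this call fails with "No search context found".
    resp = requests.post(
        f"{BASE}/_search/scroll",
        json={"scroll": "1m", "scroll_id": scroll_id},
        timeout=30,
    )
    resp.raise_for_status()
    body = resp.json()
    scroll_id = body.get("_scroll_id", scroll_id)
```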