
Simple Log Service:Limits on data import from Elasticsearch to Simple Log Service

Last Updated:Aug 29, 2023

This topic describes the limits on data import from Elasticsearch to Simple Log Service.

Limits on collection


Size of a single data record

A single data record can be up to 3 MB in size. Records that exceed this limit are discarded.

The Deliver Failed chart on the Data Processing Insight dashboard displays the number of data records that are discarded. For more information, see View a data import configuration.
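As a rough sketch of the 3 MB limit described above, a pre-import check could separate oversized records before they are sent. The threshold comes from this document; the helper name and record format are illustrative assumptions, not part of the product:

```python
# Illustrative pre-import filter for the 3 MB single-record limit.
# The limit value comes from this document; the helper is hypothetical.
MAX_RECORD_SIZE = 3 * 1024 * 1024  # 3 MB, per the collection limits above

def split_importable(records):
    """Split UTF-8 string records into (importable, discarded) lists."""
    importable, discarded = [], []
    for record in records:
        size = len(record.encode("utf-8"))
        (importable if size <= MAX_RECORD_SIZE else discarded).append(record)
    return importable, discarded

# A 4 MB record exceeds the limit and would be discarded.
ok, dropped = split_importable(["small log line", "x" * (4 * 1024 * 1024)])
```

In the actual service, discarded records are counted on the Deliver Failed chart rather than returned to the caller.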

Data latency

If you use the incremental mode of automated data import, the most recent data written to Elasticsearch is not immediately imported to Simple Log Service. The import latency depends on the value of the Maximum Latency in Seconds parameter.

For example, if you set the Maximum Latency in Seconds parameter to 300, the most recent data is imported to Simple Log Service 300 seconds after it is written to Elasticsearch.

Limits on configuration


Number of data import configurations

You can create up to 100 data import configurations in a single project, regardless of configuration type. To increase this quota, submit a ticket.

Bandwidth

When a data import task reads data from an Alibaba Cloud Elasticsearch cluster over a virtual private cloud (VPC), the default maximum network bandwidth is 128 MB/s. If you require higher bandwidth, submit a ticket.

Limits on performance


Number of concurrent tasks

A data import task pulls Elasticsearch data in Scroll mode. In this mode, the total number of shards used by concurrent tasks cannot exceed the maximum number of open Scroll contexts allowed on the Elasticsearch cluster. To change this maximum, reconfigure the search.max_open_scroll_context setting for your Elasticsearch cluster. The default value of search.max_open_scroll_context is 500.
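A capacity check under this Scroll-mode constraint can be sketched as follows. search.max_open_scroll_context is a real Elasticsearch cluster setting; the function and the shard counts in the example are illustrative assumptions:

```python
# Illustrative check: in Scroll mode, the import task opens one Scroll
# request per shard, so the total shard count across concurrently imported
# indexes must stay within search.max_open_scroll_context (default: 500).
DEFAULT_MAX_OPEN_SCROLL_CONTEXT = 500

def fits_scroll_limit(shards_per_index,
                      max_open_scroll_context=DEFAULT_MAX_OPEN_SCROLL_CONTEXT):
    """shards_per_index: shard count of each concurrently imported index."""
    return sum(shards_per_index) <= max_open_scroll_context

# Three indexes with 200 shards each exceed the default limit of 500.
print(fits_scroll_limit([200, 200, 200]))  # False
```

To raise the limit, the setting can be updated dynamically, for example via the Elasticsearch cluster update settings API (PUT _cluster/settings with a persistent search.max_open_scroll_context value); consult the Elasticsearch documentation for your cluster version.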

Capabilities of Elasticsearch servers

The larger the Elasticsearch cluster and the higher the specifications of its machines, the higher the overall throughput.

Query complexity

If a data import task involves a complex query, the Elasticsearch server may take a long time to process the request, which slows down the overall data read speed.

Number of shards in a Logstore

The write performance of Simple Log Service depends on the number of shards in a Logstore. A single shard supports a write speed of 5 MB/s. If an import task writes a large volume of data to Simple Log Service, we recommend that you increase the number of shards in the Logstore. For more information, see Manage shards.
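Given the 5 MB/s per-shard write speed stated above, a back-of-the-envelope shard estimate can be sketched. The per-shard figure comes from this document; the function name and the peak-traffic figure in the example are illustrative:

```python
import math

# Illustrative sizing helper: each Logstore shard sustains about 5 MB/s of
# writes (per the table above), so the shard count should cover peak traffic.
SHARD_WRITE_MBPS = 5

def min_shards(peak_write_mbps: float) -> int:
    """Minimum shard count to absorb the given peak write rate, at least 1."""
    return max(1, math.ceil(peak_write_mbps / SHARD_WRITE_MBPS))

# A task writing 32 MB/s needs at least 7 shards.
print(min_shards(32))  # 7
```

In practice you would also leave headroom for traffic from sources other than the import task.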

Network

If you use an Alibaba Cloud Elasticsearch cluster in a VPC or a self-managed Elasticsearch cluster on an Elastic Compute Service (ECS) instance, you can read data over a VPC. In this case, no Internet traffic is generated, and the transmission speed can exceed 100 MB/s.

When you import data over the Internet, network performance and bandwidth cannot be guaranteed, which may cause import latency.

Impacts on Elasticsearch servers


Excessive search sessions

A data import task reads data from an Elasticsearch server in Scroll mode. In this mode, the task creates a Scroll request for each shard of the indexes that are imported. The Elasticsearch server retains session information for each Scroll request, which consumes memory on the server.

Excessive workload

If a large number of indexes are imported and the data set is large, the overall workload on the Elasticsearch server is high, which can affect the availability of the Elasticsearch service.

If the workload of your Elasticsearch cluster is high, you can submit a ticket to adjust the maximum traffic allowed for data import tasks.