
Import data from Elasticsearch to Simple Log Service

Last Updated: Apr 16, 2024

This topic describes how to import data from Elasticsearch to Simple Log Service. After you import data to Simple Log Service, you can query, analyze, and transform the data in the Simple Log Service console.

Prerequisites

A project and a Logstore are created. For more information, see Create a project and Create a Logstore.

Create a data import configuration

  1. Log on to the Simple Log Service console.

  2. In the Import Data section, click the Data Import tab. Then, click Elasticsearch - Data Import.

  3. Select the project and Logstore. Then, click Next.

  4. Configure the parameters for the data import configuration.

    1. In the Configure Import Settings step, configure the following parameters.


      Configuration Name

      The name of the data import configuration.

      Service Instance URL

      The URL of the Elasticsearch server. Format: http://host:port/.

      You can specify multiple URLs. Separate multiple URLs with commas (,). Example: http://host1:port1/,http://host2:port2/.

      In most cases, the service port of an Elasticsearch server is port 9200.

      Important

      If you configure the VPC-based Instance ID parameter, you must set the host portion of the URL to the IPv4 address of the Elastic Compute Service (ECS) instance that hosts the Elasticsearch cluster.
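
      Before you save the configuration, you can check that the service instance URL is reachable from your environment. The following Java sketch is only an illustration: the URL and the Basic authentication credentials are placeholders, and the Authorization header can be omitted if the cluster does not require authentication. A GET request to the cluster root returns cluster metadata in JSON when the URL is valid.

        import java.net.URI;
        import java.net.http.HttpClient;
        import java.net.http.HttpRequest;
        import java.net.http.HttpResponse;
        import java.util.Base64;

        public class EsConnectivityCheck {
            public static void main(String[] args) throws Exception {
                // Placeholder endpoint; in most cases Elasticsearch listens on port 9200.
                String url = "http://host1:9200/";
                // Placeholder credentials; drop the Authorization header if the
                // cluster does not require user authentication.
                String auth = Base64.getEncoder()
                        .encodeToString("elastic:password".getBytes());

                HttpRequest request = HttpRequest.newBuilder()
                        .uri(URI.create(url))
                        .header("Authorization", "Basic " + auth)
                        .GET()
                        .build();

                HttpResponse<String> response = HttpClient.newHttpClient()
                        .send(request, HttpResponse.BodyHandlers.ofString());

                // A 200 status code with cluster metadata indicates that the URL is valid.
                System.out.println(response.statusCode());
                System.out.println(response.body());
            }
        }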

      Elasticsearch Index List

      The names of the indexes that you want to import. Separate multiple index names with commas (,). Example: index1,index2,index3.

      Elasticsearch User Name

      The username that is used to access the Elasticsearch cluster. This parameter is required only if user authentication is required to access the Elasticsearch cluster.

      Elasticsearch User Password

      The password that is used to access the Elasticsearch cluster.

      Time Field

      The time field that is used to record the log time. You can enter the name of the field that represents time in the Elasticsearch indexes.

      If you do not specify a time field, Simple Log Service uses the system time when data is imported.

      Important

      If you want to import incremental data, you must configure the Time Field parameter.

      Time Field Format

      The time format that is used to parse the value of the time field.

      • You can specify a time format that is supported by Java SimpleDateFormat. Example: yyyy-MM-dd HH:mm:ss. For more information about the time format syntax, see Class SimpleDateFormat. For more information about the common time formats, see Time formats.

      • You can specify an epoch time format. Valid values: epoch, epochMillis, epochMacro, and epochNano.

      Important

      Java SimpleDateFormat does not support UNIX timestamps. If you want to use UNIX timestamps, you must set the Time Field Format parameter to an epoch time format.
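
      As an illustration of the two format families, the following Java sketch parses a time field value with a SimpleDateFormat pattern and converts the same instant to the epoch and epochMillis representations. The sample timestamp and the Asia/Shanghai time zone are assumptions for the example only.

        import java.text.SimpleDateFormat;
        import java.util.Date;
        import java.util.TimeZone;

        public class TimeFieldFormatDemo {
            public static void main(String[] args) throws Exception {
                // Parse a time field that uses the "yyyy-MM-dd HH:mm:ss" format.
                SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
                // The Time Zone parameter plays the same role as this setting: it
                // tells the parser which zone the wall-clock value belongs to.
                format.setTimeZone(TimeZone.getTimeZone("Asia/Shanghai"));

                Date parsed = format.parse("2024-04-16 08:30:00");

                // The same instant expressed as UNIX timestamps.
                long epochMillis = parsed.getTime(); // epochMillis format
                long epoch = epochMillis / 1000;     // epoch format (seconds)
                System.out.println(epoch + " " + epochMillis);
            }
        }

      Because UNIX timestamps identify an absolute instant, the epoch formats do not depend on a time zone, which is why the Time Zone parameter is unnecessary for them.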

      Time Zone

      The time zone of the time field.

      If you set the Time Field Format parameter to an epoch time format, you do not need to configure the Time Zone parameter.

      Elasticsearch Query String

      The query statement that is used to filter data. The query statement must conform to the Elasticsearch query_string syntax. Example: gender:male AND city:Shanghai. For more information, see Query string query.
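
      To test a filter before you create the import configuration, you can run the same query string against the Elasticsearch _search API. In the following Java sketch, the host, port, and index name are placeholders; note that the query_string syntax expects uppercase boolean operators such as AND.

        import java.net.URI;
        import java.net.http.HttpClient;
        import java.net.http.HttpRequest;
        import java.net.http.HttpResponse;

        public class QueryStringCheck {
            public static void main(String[] args) throws Exception {
                // Placeholder host and index; adjust both to your cluster.
                String url = "http://host1:9200/index1/_search";

                // The same query string that you would enter in the import
                // configuration, wrapped in the query_string query DSL.
                String body = "{ \"query\": { \"query_string\": {"
                        + " \"query\": \"gender:male AND city:Shanghai\" } } }";

                HttpRequest request = HttpRequest.newBuilder()
                        .uri(URI.create(url))
                        .header("Content-Type", "application/json")
                        .POST(HttpRequest.BodyPublishers.ofString(body))
                        .build();

                HttpResponse<String> response = HttpClient.newHttpClient()
                        .send(request, HttpResponse.BodyHandlers.ofString());

                // hits.total in the response shows how many documents the
                // filter would select for import.
                System.out.println(response.body());
            }
        }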

      Import Method

      The mode that is used to import data. Valid values:

      • Import Only Historical Data: After data is imported, the import task automatically ends.

      • Automatically Import Incremental Data: The import task continuously runs.

        Important

        If you select Automatically Import Incremental Data, you must configure the Time Field parameter.

      Start At

      The start time. After you specify a start time, data is imported to Simple Log Service only if the value of the time field is greater than or equal to the start time.

      Important

      This parameter takes effect only if you configure the Time Field parameter.

      End Time

      The end time. After you specify an end time, data is imported to Simple Log Service only if the value of the time field is less than or equal to the end time.

      Important

      This parameter takes effect only if you configure the Time Field parameter and set the Import Method parameter to Import Only Historical Data.

      Maximum Latency in Seconds

      The maximum allowed latency between the time when data is generated and the time when the data is written to Elasticsearch.

      Important

      • If you specify a value that is less than the actual latency, some data cannot be imported from Elasticsearch to Simple Log Service.

      • This parameter takes effect only if you configure the Time Field parameter and set the Import Method parameter to Automatically Import Incremental Data.

      Incremental Data Check Interval (Seconds)

      The interval at which Simple Log Service checks for incremental data in Elasticsearch. Unit: seconds. Default value: 300. Minimum value: 60.

      VPC-based Instance ID

      If the Elasticsearch cluster is an Alibaba Cloud Elasticsearch cluster in a virtual private cloud (VPC) or a self-managed Elasticsearch cluster on an ECS instance, you can configure this parameter to allow Simple Log Service to read data from the cluster over the Alibaba Cloud internal network. Reading data over the internal network improves security and network stability.

      Important

      The Elasticsearch cluster must allow access from the CIDR block 100.104.0.0/16.

    2. Click Preview to preview the import result.

    3. After you confirm the result, click Next.

  5. Preview data, configure indexes, and then click Next.

    By default, full-text indexing is enabled for Simple Log Service. You can also manually create field indexes based on the collected logs, or click Automatic Index Generation to have Simple Log Service automatically create field indexes. For more information, see Create indexes.

    Important

    If you want to query and analyze logs, you must enable full-text indexing or field indexing. If you enable both full-text indexing and field indexing, the system uses only field indexes.

  6. Click Log Query. You are redirected to the search and analysis page of the Logstore, where you can check whether the Elasticsearch data is imported.

    Wait for approximately 1 minute. If you can query Elasticsearch data, the import task is successful.
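
    If you prefer a programmatic check, you can query the Logstore through the Simple Log Service SDK. The following Java sketch assumes the aliyun-log-java-sdk dependency; the endpoint, AccessKey pair, project, and Logstore names are placeholders, and method signatures may vary across SDK versions, so treat it as an outline rather than a reference.

        import com.aliyun.openservices.log.Client;
        import com.aliyun.openservices.log.response.GetLogsResponse;

        public class VerifyImport {
            public static void main(String[] args) throws Exception {
                // Placeholder endpoint and credentials.
                Client client = new Client("cn-hangzhou.log.aliyuncs.com",
                        "<accessKeyId>", "<accessKeySecret>");

                int now = (int) (System.currentTimeMillis() / 1000);
                // Query the last 15 minutes of the target Logstore; the "*"
                // query matches all logs.
                GetLogsResponse response = client.GetLogs(
                        "<project>", "<logstore>", now - 900, now, "", "*");

                // A non-zero count indicates that Elasticsearch data arrived.
                System.out.println("logs returned: " + response.getLogs().size());
            }
        }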

View a data import configuration

After you create a data import configuration, you can view the configuration and related reports in the Simple Log Service console.

  1. In the Projects section, click the project to which the data import configuration belongs.

  2. Find and click the Logstore to which the data import configuration belongs, choose Data Import > Data Import, and then click the name of the data import configuration.

  3. On the Import Configuration Overview page, view the basic information about the data import configuration and the related reports.

What to do next

  • Delete a data import configuration

    On the Import Configuration Overview page, you can click Delete Configuration to delete the data import configuration.

    Warning

    After a data import configuration is deleted, it cannot be restored. Proceed with caution.

  • Stop and restart the import task of a data import configuration

    After you create a data import configuration, Simple Log Service creates an import task. On the Import Configuration Overview page, you can click Stop to stop the import task. After the import task is stopped, you can also restart the import task.

    Important

    After an import task is stopped, the task is in the stopped state for up to 24 hours. If the import task is not restarted during this period, the task becomes unavailable. If you restart an unavailable import task, errors may occur.

FAQ

  • Issue: An Elasticsearch connection error occurs during the preview. Error code: failed to connect.

    Possible causes:

      • The specified URL of the Elasticsearch server is invalid.

      • The IP addresses that are used by the import task to access the Elasticsearch cluster are not added to the whitelist of the cluster. As a result, the import task cannot access the cluster.

      • The ID of the VPC in which the source Alibaba Cloud Elasticsearch cluster resides is not specified.

    Solutions:

      • Make sure that the specified URL of the Elasticsearch server is valid.

      • Add the IP addresses that are used by the import task to access the Elasticsearch cluster to the whitelist of the cluster. For more information, see IP address whitelists.

      • If data is imported from an Alibaba Cloud Elasticsearch cluster over an internal network, make sure that the ID of the VPC in which the cluster resides is specified.

  • Issue: A timeout error occurs during the preview. Error code: preview request timed out.

    Possible cause: The Elasticsearch index that you want to import contains no data, or contains no data that meets the specified filter conditions.

    Solutions:

      • If the Elasticsearch index contains no data, write data to the index and preview the data again.

      • Make sure that the specified time field and time format match the actual time field and time format in the data that you want to import.

      • Make sure that the Elasticsearch index contains data that meets the specified filter conditions or time range.

  • Issue: The log time displayed in Simple Log Service is different from the actual time of the imported data.

    Possible cause: The time field is not specified in the data import configuration, or the specified time format or time zone is invalid.

    Solution: Specify a time field, or specify a valid time format or time zone. For more information, see Create a data import configuration.

  • Issue: After data is imported, the data cannot be queried or analyzed.

    Possible causes:

      • The data is not within the query time range.

      • No indexes are configured.

      • The indexes do not take effect.

    Solutions:

      • Check whether the time of the data that you want to query is within the specified query time range. If it is not, adjust the query time range and query the data again.

      • Check whether indexes are configured for the Logstore to which the data is imported. If they are not, configure indexes first. For more information, see Create indexes and Reindex logs for a Logstore.

      • If indexes are configured for the Logstore and the volume of imported data is displayed as expected on the Data Processing Insight dashboard, the indexes may not have taken effect. In this case, rebuild the indexes. For more information, see Reindex logs for a Logstore.

  • Issue: The number of imported data entries is less than expected.

    Possible cause: Elasticsearch contains data entries that are larger than 3 MB in size. You can view these data entries on the Data Processing Insight dashboard.

    Solution: Make sure that each data entry does not exceed 3 MB in size.

  • Issue: After incremental import is enabled, new data is imported with a large latency.

    Possible causes:

      • The value of Maximum Latency in Seconds is excessively large.

      • The bandwidth usage of the Elasticsearch cluster reaches the upper limit.

      • The network is unstable when data is imported over the Internet.

      • The number of shards in the Logstore is excessively small.

      • For other possible causes, see Limits on performance.

    Solutions:

      • Change the value of Maximum Latency in Seconds based on your business requirements.

      • Check whether the bandwidth usage of the Elasticsearch cluster, especially an Alibaba Cloud Elasticsearch cluster, reaches the upper limit. If the usage reaches or approaches the upper limit, upgrade the bandwidth.

      • If Elasticsearch data is imported over the Internet, make sure that the Internet bandwidth is sufficient.

      • If the number of shards in the Logstore is excessively small, increase the number of shards and observe the latency. For more information, see Manage shards.

Error handling

  • Communication with the Elasticsearch cluster is abnormal.

    The import task pulls Elasticsearch data in scroll mode. The default keep-alive duration is 24 hours. If network connection errors, user authentication errors, or other errors that prevent normal communication with Elasticsearch occur, the import task is automatically retried.

    If communication cannot be recovered within 24 hours, the scroll session information on Elasticsearch is deleted. As a result, the import task cannot be resumed even when it is retried, and the system reports the "No search context found" error. In this case, you can only re-create the import task.

  • The Logstore does not exist.

    The import task is retried at regular intervals. If you re-create the Logstore within 24 hours, the import task continues to read Elasticsearch data from the position at which it stopped. Otherwise, the scroll session information on Elasticsearch is deleted, and you can only re-create the import task.
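
For context, the scroll pattern that the first item describes works as in the following Java sketch. The host, index, and page size are placeholders; each page is fetched with the _scroll_id of the previous response, and using a scroll ID after its context expires or is deleted produces the "No search context found" error mentioned above.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class ScrollDemo {
        private static final HttpClient CLIENT = HttpClient.newHttpClient();

        public static void main(String[] args) throws Exception {
            // Open a scroll context with a 24-hour keep-alive, matching the
            // default keep-alive duration described above.
            String firstPage = post("http://host1:9200/index1/_search?scroll=24h",
                    "{ \"size\": 1000, \"query\": { \"match_all\": {} } }");
            System.out.println(firstPage);

            // Fetch the next page with the _scroll_id from the previous
            // response. If the context has expired or been deleted, this call
            // fails with "No search context found", and the only recovery is
            // to start a new scroll, that is, to re-create the import task.
            String scrollId = "..."; // extract _scroll_id from firstPage
            String nextPage = post("http://host1:9200/_search/scroll",
                    "{ \"scroll\": \"24h\", \"scroll_id\": \"" + scrollId + "\" }");
            System.out.println(nextPage);
        }

        private static String post(String url, String body) throws Exception {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(url))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();
            return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
        }
    }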