OpenSearch: DLF data source

Last Updated: Dec 24, 2024

This topic describes how to use a Data Lake Formation (DLF) data source when you add a table in an OpenSearch Retrieval Engine Edition instance.

Prerequisites

  • You are familiar with DLF.

  • A catalog, a database, and a table have been created in DLF. You specify them when you configure data synchronization.

Add a DLF data source

  1. Log on to the OpenSearch Retrieval Engine Edition console. In the left-side navigation pane, click Instances. On the Instances page, find the desired instance and click its ID. On the instance details page, click Table Management in the left-side navigation pane. On the page that appears, click Add Table.

  2. In the Basic Table Information step, configure the following parameters and click Next.

  • Table Name: the name of the table. You can enter a custom table name.

  • Data Shards: the number of data shards. The number must be a positive integer that does not exceed 256. We recommend that you specify a number that does not exceed three times the number of Searcher workers.

  • Number of Resources for Data Updates: the number of resources that are used for data updates. By default, a free quota of two resources for data updates is provided for each table. Each resource consists of 4 vCPUs and 8 GB of memory. You are charged for resources that exceed the free quota.

  3. In the Data Synchronization step, configure the following parameters to add a data source, and then click Check to verify the data source information. If the check passes, click Next.

  • Full Data Source: Select DLF.

  • Catalog ID: the ID of the DLF catalog that you want to access.

  • Database: the name of the database in the catalog.

  • Data Table: the name of the table in the database.

    Note
    • If you want to use DLF data sources for existing instances, you must first upgrade the offline versions of the instances.

    • Only catalogs of the Paimon type are supported.

    • For a primary key table in Paimon, you can add data to the table, delete data from the table, and modify and query data in the table. For an append-only table in Paimon, you can only add data to the table. You are not allowed to modify data or delete data from the table.

  4. In the Index Schema step, configure the parameters. You can use Form Mode or Developer Mode. After you complete the configuration, click Next.

  5. Confirm the creation. The system then automatically creates the configured table. You can view the table creation progress on the Change History page. After the table enters the In Use state, you can perform a query test on the Query Test page.
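
The limits described for Data Shards and Number of Resources for Data Updates can be checked before you fill in the console form. The following Python sketch is only an illustration of those rules. The function and constant names are hypothetical and are not part of any OpenSearch SDK or API. It treats the 256 limit as a hard constraint and the "three times the number of Searcher workers" guidance as a recommendation that produces a warning.

```python
# Hypothetical helper names; this is not an OpenSearch API.
MAX_DATA_SHARDS = 256          # upper limit stated for Data Shards
FREE_UPDATE_RESOURCES = 2      # free quota per table (each resource: 4 vCPUs, 8 GB)


def check_data_shards(shards: int, searcher_workers: int) -> list[str]:
    """Validate a Data Shards value and return any recommendation warnings.

    Raises ValueError if the value is not a positive integer up to 256.
    """
    if isinstance(shards, bool) or not isinstance(shards, int):
        raise ValueError("Data Shards must be an integer")
    if shards < 1 or shards > MAX_DATA_SHARDS:
        raise ValueError(f"Data Shards must be between 1 and {MAX_DATA_SHARDS}")

    warnings = []
    # Recommendation only: at most three times the number of Searcher workers.
    if shards > 3 * searcher_workers:
        warnings.append(
            f"{shards} shards exceeds the recommended maximum of "
            f"{3 * searcher_workers} for {searcher_workers} Searcher workers"
        )
    return warnings


def billable_update_resources(requested: int) -> int:
    """Return how many data update resources exceed the free quota."""
    if requested < 0:
        raise ValueError("requested must not be negative")
    return max(0, requested - FREE_UPDATE_RESOURCES)


if __name__ == "__main__":
    print(check_data_shards(6, searcher_workers=2))
    print(check_data_shards(7, searcher_workers=2))
    print(billable_update_resources(5))
```

In this sketch, a table with 2 Searcher workers passes without warnings at 6 shards and gets a warning at 7, and a request for 5 data update resources leaves 3 outside the free quota. Actual pricing for the extra resources is not covered here.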

Precautions

When new data is written to the Paimon table in DLF, OpenSearch automatically creates an index in real time based on the new data. If you manually write data to the Paimon table by calling an API, data inconsistency may occur. Therefore, proceed with caution.
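
Because writes to the Paimon table feed the index directly, it can help to check which write operations the table type permits before you send them. The following Python sketch encodes the rule from the note in the Data Synchronization step: primary key tables accept add, delete, and modify operations, and append-only tables accept add only. The table-type labels, operation names, and function name are hypothetical and chosen for illustration. They are not Paimon or OpenSearch identifiers.

```python
# Hypothetical labels for illustration; not Paimon or OpenSearch identifiers.
ALLOWED_WRITES = {
    "primary_key": frozenset({"add", "delete", "modify"}),
    "append_only": frozenset({"add"}),
}


def is_write_allowed(table_type: str, operation: str) -> bool:
    """Return True if the write operation is permitted for the table type."""
    try:
        return operation in ALLOWED_WRITES[table_type]
    except KeyError:
        raise ValueError(f"unknown table type: {table_type!r}") from None


if __name__ == "__main__":
    for table_type in ALLOWED_WRITES:
        for operation in ("add", "delete", "modify"):
            allowed = is_write_allowed(table_type, operation)
            print(f"{table_type:12} {operation:7} {'allowed' if allowed else 'not allowed'}")
```

The sketch covers write operations only. Query behavior is left out because the note does not state it explicitly for append-only tables.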