Lindorm: Use an open source client to connect to and use SearchIndex

Last Updated: Oct 23, 2024

LindormTable provides a new type of index named search index. This topic describes how to use the open source HBase API and Elasticsearch API to access and use search indexes.

Background information

Search indexes provide a unified access method by integrating the capabilities of LindormTable, which supports the open source HBase API, and LindormSearch, which supports the open source Elasticsearch API. This way, you can use search indexes to perform various types of queries. The standard method is to execute Lindorm SQL statements to connect to and use search indexes. You can also use an open source client to use search indexes. For more information about search indexes, see Overview.

Note

If a table is created by executing a Lindorm SQL statement, use SQL statements to create and use search indexes for that table.

Prerequisites

  • Java Development Kit (JDK) V1.8 or later is installed.

  • The IP address of your client is added to the whitelist of your Lindorm instance. For more information, see Configure whitelists.

Overview

[Figure: overview diagram]

Procedure

Execute SQL statements to create and use a search index (recommended)

If a table is created by using the HBase API, you can use Lindorm SQL to create a search index for the table, which is the most convenient method. After the index is created, you can use open source clients to write data to the table and query the index.

  1. Use Lindorm Shell to connect to LindormTable. For more information, see Use Lindorm Shell to connect to LindormTable.

  2. Create a wide table named testTable.

    create 'testTable', {NAME => 'f'}
  3. Use Lindorm SQL to map the columns in the wide table to those in the index table.

    ALTER TABLE testTable MAP DYNAMIC COLUMN f:name HSTRING;

    For more information about the mappings between data types, see Data type mappings.

  4. Create a search index named idx.

    CREATE INDEX idx USING SEARCH ON testTable (f:name);
  5. Insert a row of data into testTable.

    put 'testTable', 'row1', 'f:name', 'foo'
    Note

    After the status of the search index becomes ACTIVE, data that is inserted into the wide table is automatically synchronized to the index table. For more information about how to view the status of an index, see SHOW INDEX. Data that was inserted into the wide table before the search index was created is not synchronized automatically. To index that data, execute the ALTER INDEX statement to rebuild the index. In this example, you can execute the ALTER INDEX idx ON testTable REBUILD; statement. In the statement, idx is the name of the search index created for the wide table. For more information, see ALTER INDEX.

  6. Query indexed data.

    1. Use LindormSearch to query the primary key ID of the search index table.

      Note
      • The name of a search index table is concatenated in the following format: <Namespace name>.<Wide table name>.<Index name>.

      • The names of fields in a search index table are formed based on the following rules:

        • If the column family of the wide table is f, the field name in the search index table is the same as the corresponding column name in the wide table. For example, the name of the field that corresponds to the f:name column in the wide table is name.

        • If the column family of the wide table is not f, the field name in the search index table is in the following format: <Column family name>_<Column name>. For example, the name of the field that corresponds to the f1:name column in the wide table is f1_name.

      GET /default.testTable.idx/_search
      {
        "size": 10,
        "query": {
          "match": {
            "name": "foo"
          }
        }
      }

      The following result is returned:

      {
        "took" : 4,
        "timed_out" : false,
        "_shards" : {
          "total" : 1,
          "successful" : 1,
          "skipped" : 0,
          "failed" : 0
        },
        "hits" : {
          "total" : {
            "value" : 1,
            "relation" : "eq"
          },
          "max_score" : 1.0,
          "hits" : [
            {
              "_index" : "default.testTable.idx",
              "_id" : "726f7731",
              "_score" : 1.0,
              "_source" : {
                "update_version_l" : 1720072175536
              }
            }
          ]
        }
      }

      Obtain the primary key ID that is indicated by the _id field in the returned result.
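The naming rules in the preceding note and the extraction of the _id field can be sketched in a few lines of Python. This is a minimal illustration only: the helper names (search_index_name, search_field_name, match_query, extract_ids) are hypothetical and not part of any Lindorm or Elasticsearch client, and the sample response is a trimmed copy of the result shown above. The sketch only builds the request and parses a response; it does not send anything to LindormSearch.

```python
import json


def search_index_name(namespace: str, table: str, index: str) -> str:
    """Build <Namespace name>.<Wide table name>.<Index name>."""
    return f"{namespace}.{table}.{index}"


def search_field_name(family: str, column: str) -> str:
    """Column family f maps to the bare column name; others get a prefix."""
    return column if family == "f" else f"{family}_{column}"


def match_query(field: str, value: str, size: int = 10) -> dict:
    """Build the body of the match query used in this step."""
    return {"size": size, "query": {"match": {field: value}}}


def extract_ids(response: dict) -> list:
    """Collect the _id of every hit in a search response."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


index = search_index_name("default", "testTable", "idx")
body = match_query(search_field_name("f", "name"), "foo")
print(f"GET /{index}/_search")
print(json.dumps(body, indent=2))

# Trimmed copy of the sample response shown in this step.
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "default.testTable.idx",
                "_id": "726f7731",
                "_score": 1.0,
                "_source": {"update_version_l": 1720072175536},
            }
        ],
    }
}
print(extract_ids(sample_response))
```

With any Elasticsearch-compatible client, the printed path and body correspond to the request shown in this step.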

    2. Query data in the wide table. If a search index is created by executing SQL statements, the primary key ID of the search index table is the HEX-encoded primary key of the wide table. Therefore, when you use the primary key ID of the search index table to query data in the wide table, specify the primary key as HEX-escaped bytes. In this example, the _id value 726f7731 is the HEX encoding of row1. For more information, see rowkeyFormatterType. Run the following command in Lindorm Shell:

      get 'testTable', "\x72\x6f\x77\x31"

      The following result is returned:

      COLUMN                                         CELL
       f:name                                        timestamp=1644462597661, value=foo
      1 row(s)
      Took 0.0942 seconds
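The conversion from the HEX-encoded _id to the escaped row key used in the get command can be checked locally. The following is a minimal sketch in Python; the function names hex_id_to_rowkey and shell_escape are hypothetical helpers introduced for illustration, and the script only prints the command instead of running it.

```python
def hex_id_to_rowkey(hex_id: str) -> bytes:
    """Decode the HEX-encoded _id of the search index into raw row key bytes."""
    return bytes.fromhex(hex_id)


def shell_escape(rowkey: bytes) -> str:
    """Render each byte as \\xNN, the form used in the get command above."""
    return "".join(f"\\x{b:02x}" for b in rowkey)


rowkey = hex_id_to_rowkey("726f7731")
print(rowkey)
print("get 'testTable', \"{}\"".format(shell_escape(rowkey)))
```

For the _id returned in this example, the decoded bytes are row1 and the printed command matches the get command shown above.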

Use Lindorm Shell to create a search index

  1. Use Lindorm Shell to connect to LindormTable. For more information, see Use Lindorm Shell to connect to LindormTable.

  2. Create a wide table named testTable in Lindorm Shell.

    create 'testTable', {NAME => 'f'}
  3. Create a search index table named democollection in LindormSearch. For more information, see Connect to LindormSearch.

    Note

    If you do not need to obtain specific field values, you can set _source to false to save resources.

  4. Map columns in the wide table to those in the index table. For example, map the f:name column in the testTable table to the name_s column in the democollection index table. In the f:name column, f specifies the column family name and name specifies the name of the column.

    1. In the bin directory of the folder that is extracted from the HBase shell package, create a JSON file named schema.json. Copy the following sample code to the JSON file:

      {
        "sourceNamespace": "default",
        "sourceTable": "testTable",
        "targetIndexName": "democollection",
        "indexType": "ES",
        "rowkeyFormatterType": "STRING",
        "fields": [
          {
            "source": "f:name",
            "targetField": "name_s",
            "type": "STRING"
          }
        ]
      }
      Note
      • Make sure that all columns specified in the mappings are explicitly defined in LindormSearch and that their names and data types match those of the actual columns.

      • For more information about the parameters in the JSON file, see Configure column mapping.
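If the mapping is maintained for several columns, generating the JSON from a script can help avoid typos. The following Python sketch rebuilds the sample mapping above and performs one local sanity check. The helper names build_mapping and check_mapping are hypothetical, and the check only verifies the <column family>:<column> form of each source described in this step. It does not replace the validation that LindormTable performs.

```python
import json


def build_mapping(namespace, table, index, fields, rowkey_formatter="STRING"):
    """Assemble a mapping with the same keys as the sample schema.json."""
    return {
        "sourceNamespace": namespace,
        "sourceTable": table,
        "targetIndexName": index,
        "indexType": "ES",
        "rowkeyFormatterType": rowkey_formatter,
        "fields": [
            {"source": source, "targetField": target, "type": data_type}
            for source, target, data_type in fields
        ],
    }


def check_mapping(mapping):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    for field in mapping["fields"]:
        family, sep, column = field["source"].partition(":")
        if not (family and sep and column):
            problems.append(
                f"source {field['source']!r} is not <column family>:<column>"
            )
    return problems


mapping = build_mapping(
    "default", "testTable", "democollection",
    [("f:name", "name_s", "STRING")],
)
print(check_mapping(mapping))
# Save this output as schema.json to use it in the next sub-step.
print(json.dumps(mapping, indent=2))
```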

    2. Run the following command in Lindorm Shell to map columns in the wide table to those in the index table:

      alter_external_index 'testTable', 'schema.json'
      Note

      For more information about how to manage column mappings between a wide table and an index table, see Manage a column mapping relationship.

  5. Write a row of data to testTable. After column mappings between the wide table and the index table are configured, the data that is written to the wide table is automatically synchronized to the index table in real time. Data that was inserted into the wide table before the mappings were configured is not synchronized automatically. To synchronize it to the index table, you must manually create indexes for full data. For more information, see Create indexes for full data.

    put 'testTable', 'row1', 'f:name', 'foo'
  6. Query indexed data.

    1. Use LindormSearch to query the primary key ID of the search index table.

      GET /democollection/_search
      {
        "size": 10,
        "query": {
          "match": {
            "name_s": "foo"
          }
        }
      }

      The following result is returned:

      {
        "took" : 4,
        "timed_out" : false,
        "_shards" : {
          "total" : 1,
          "successful" : 1,
          "skipped" : 0,
          "failed" : 0
        },
        "hits" : {
          "total" : {
            "value" : 1,
            "relation" : "eq"
          },
          "max_score" : 1.0,
          "hits" : [
            {
              "_index" : "democollection",
              "_id" : "row1",
              "_score" : 1.0,
              "_source" : {
                "update_version_l" : 1720072175536
              }
            }
          ]
        }
      }

      Obtain the primary key ID that is indicated by the _id field in the returned result.

    2. Convert the primary key ID of the search index table to the primary key of the wide table based on the rowkeyFormatterType parameter defined in the mappings. For more information, see rowkeyFormatterType. In this example, the STRING method is used. Therefore, the primary key ID of the search index table is the same as the primary key of the wide table, which is row1.

    3. Run the following command in Lindorm Shell to query data in the wide table:

      get 'testTable','row1'

      The following result is returned:

      COLUMN                                         CELL
       f:name                                        timestamp=1644462597661, value=foo
      1 row(s)
      Took 0.0942 seconds
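The conversion described in sub-step 2 depends on the rowkeyFormatterType value in the mapping. The following Python sketch shows one way to express it. It rests on assumptions: only the STRING and HEX values that appear in this topic are handled, and STRING row keys are assumed to be UTF-8 text. The function name index_id_to_rowkey is hypothetical.

```python
def index_id_to_rowkey(index_id: str, formatter_type: str) -> bytes:
    """Convert a search index _id to wide table row key bytes."""
    if formatter_type == "STRING":
        # Assumption: the row key is UTF-8 text and equals the _id.
        return index_id.encode("utf-8")
    if formatter_type == "HEX":
        # The _id is the HEX encoding of the row key bytes.
        return bytes.fromhex(index_id)
    raise ValueError(f"unsupported rowkeyFormatterType: {formatter_type}")


print(index_id_to_rowkey("row1", "STRING"))
print(index_id_to_rowkey("726f7731", "HEX"))
```

Both calls yield the same row key, row1, which is the key used in the get command of this step.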
  7. View the status of real-time data synchronization tasks on the Lindorm Tunnel Service (LTS) web UI.

    1. To log on to the LTS web UI, click the instance ID in the Lindorm console. In the left-side navigation pane, choose Wide Table Engine > SearchIndex. In the Data Synchronization Management section, click ClusterManager Internet or ClusterManager VPC.

    2. In the left-side navigation pane, choose Lindorm Search > Real time replication to Solr.

      [Figure: View task status]

      Note

      You can configure an alert on the latency metric for real-time data synchronization in the CloudMonitor console. The alert threshold for the maximum task latency can be set to 600,000 milliseconds (10 minutes). For more information, see Cloud service monitoring.