All Products
Search
Document Center

Lindorm:Query data in a wide table by using a search index

Last Updated:May 30, 2025

This topic describes how to use a search index to query data in a Lindorm wide table in multidimensional queries, paged queries, queries with results sorted, tokenized text queries, fuzzy queries, and aggregate queries.

Prerequisites

Run a multidimensional query

Execute the following statements to run multidimensional queries:

  • Example 1

    SELECT * FROM search_table WHERE gender='M' AND city='Hangzhou' OR city='Beijing';

    The following result is returned:

    +---------+--------+-----+--------+--------------+------------------+------+
    | user_id |  name  | age | gender |   address    |      email       | city |
    +---------+--------+-----+--------+--------------+------------------+------+
    | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou |
    | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing |
    | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
    +---------+--------+-----+--------+--------------+------------------+------+
  • Example 2

    SELECT * FROM search_table WHERE age > 30 AND city ! ='Hangzhou ';

    The following result is returned:

    +---------+--------+-----+--------+--------------+------------------+------+
    | user_id |  name  | age | gender |   address    |      email       | city |
    +---------+--------+-----+--------+--------------+------------------+------+
    | 28 | Chen | 36 | F | Nanshan District, Shenzhen | a***@example.net | Shenzhen |
    +---------+--------+-----+--------+--------------+------------------+------+

Run a paged query and sort query results

  • A search index allows you to sort data based on each of the columns in your table. Execute the following statement to sort the output:

    SELECT * FROM search_table  WHERE gender='M' ORDER BY age DESC;

    The following result is returned:

    +---------+--------+-----+--------+--------------+------------------+------+
    | user_id |  name  | age | gender |   address    |      email       | city |
    +---------+--------+-----+--------+--------------+------------------+------+
    | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou |
    | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
    | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing |
    +---------+--------+-----+--------+--------------+------------------+------+
  • A search index allows you to use limit or offset to perform data pagination. Execute the following statement to run a paged query:

    SELECT * FROM search_table WHERE gender='M' ORDER BY age DESC LIMIT 1,10;

    The following result is returned:

    +---------+--------+-----+--------+--------------+------------------+------+
    | user_id |  name  | age | gender |   address    |      email       | city |
    +---------+--------+-----+--------+--------------+------------------+------+
    | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
    | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing |
    +---------+--------+-----+--------+--------------+------------------+------+

Run a fuzzy query

Execute the following statement to query data that includes example in the email column:

SELECT * FROM search_table WHERE email LIKE '%example%' ORDER BY user_id;

The following result is returned:

+---------+--------+-----+--------+--------------+------------------+------+
| user_id |  name  | age | gender |   address    |      email       | city |
+---------+--------+-----+--------+--------------+------------------+------+
| 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing |
| 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou |
| 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
| 28 | Chen | 36 | F | Nanshan District, Shenzhen | a***@example.net | Shenzhen |
+---------+--------+-----+--------+--------------+------------------+------+
Note

The performance of fuzzy queries is limited due to the amount of data in tables. Therefore, we recommend that you run tokenized text queries instead of fuzzy queries.

Run an aggregate query

Search indexes support common aggregate functions such as COUNT, SUM, AVG, MIN, and MAX. Search indexes also support the DISTINCT and GROUP BY clauses.

  • Execute the following statement to query the number of data rows that include Hangzhou in the city column:

    SELECT COUNT(*) FROM search_table WHERE city='Hangzhou';

    The following result is returned:

    +----------+
    | EXPR$0   |
    +----------+
    | 2        |
    +----------+
  • Execute the following statement to query data and remove duplicate query results:

    SELECT distinct(age) FROM search_table WHERE gender='M';

    The following result is returned:

    +---------------+
    | DISTINCT(age) |
    +---------------+
    | 18            |
    | 28            |
    | 32            |
    +---------------+
  • Execute the following statement to query data and group the query results:

    SELECT city,count(*) AS cnt FROM search_table WHERE gender='M' GROUP BY city ORDER BY cnt DESC;

    The following result is returned:

    +------+-----+
    | city | cnt |
    +------+-----+
    | Hangzhou | 2 |
    | Beijing | 1 |
    +------+-----+
Note

If you do not specify a WHERE condition in an aggregate query, the corresponding search index cannot be hit. In this case, you can add the _l_force_index parameter before the WHERE keyword to force the query to hit a specified index. For more information, see the "Parameters of hintOptions" and "Example 4: Specify _l_force_index_ to forcibly use an index in the query" sections of the HINT topic.

Run a phrase-based query

Execute the following statement to run a query with phrases specified as a condition:

SELECT name,age,city,address FROM search_table WHERE age>1 AND age<100 AND address='"Hangzhou"' ORDER BY user_id;

The following result is returned:

+--------+-----+------+--------------+
|  name  | age | city |   address    |
+--------+-----+------+--------------+
| Li | 32 | Hangzhou | Yuhang District, Hangzhou |
| Wang | 28 | Hangzhou | Binjiang District, Hangzhou |
+--------+-----+------+--------------+

Run a tokenized query

Important

Tokenized queries rely on the MATCH function which is available only in LindormTable 2.7.2 and later versions. To view or upgrade your LindormTable version, see Release notes of LindormTable and Upgrade the minor engine version of a Lindorm instance.

For tokenized columns, you must use the MATCH function to implement tokenized queries. The following example queries data rows where age ranges from 10 to 30 and address matches Hangzhou, with the address column being a tokenized column.

SELECT * FROM search_table WHERE age > 10 AND age <= 30 AND MATCH (address) AGAINST ('Hangzhou') ORDER BY user_id;

The following result is returned:

+---------+--------+-----+--------+--------------+------------------+------+
| user_id |  name  | age | gender |   address    |      email       | city |
+---------+--------+-----+--------+--------------+------------------+------+
| 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
+---------+--------+-----+--------+--------------+------------------+------+

Run a tokenized text query

Important

To use the tokenized text query feature, the LindormTable version must be 2.8.0 or later, and the LindormSearch version must be compatible with ElasticSearch. If your version does not meet the preceding requirements, update the minor version.

Background information

If you set the type=text attribute for a VARCHAR column in a wide table when you create a search index, the system will split the text in the column into tokens and use tokens in the search index when you perform queries such as equivalent queries and fuzzy queries. This may result in unexpected search results.

For example, if you configure the address column as a tokenized column by setting the address(type=text, analyzer=ik)address(type=text,analyzer=ik) attribute, the system will split the text in the column into individual tokens based on the tokenization policy when you create an index. For example, the text Binjiang District, Hangzhou is split into individual tokens Hangzhou and Binjiang District. In this case, the fuzzy query fails if you use the LIKE operation.

In this example, no result is returned because District and Hangzhou are separated.

SELECT * FROM search_table WHERE city LIKE '%District, Hangzhou%' LIMIT 10;

Enable tokenized text queries

If you want a VARCHAR column to support tokenized queries, fuzzy queries, and equivalent queries at the same time, you can enable tokenized text queries.

To enable this feature, you must set the attribute of the corresponding column to type=string,textSubField=true, analyzer=ik when you create an index. You can configure the analyzer type specified by the analyzer parameter based on your business requirements. Sample code:

CREATE INDEX idx USING SEARCH ON search_table(user_id,address(type=string, textSubField=true, analyzer=ik));
Note

Only one search index can be created for each wide table. To enable tokenized text queries when a search index already exists, you must first delete the current search index, and then you can enable tokenized text queries when you create a new index.

Query methods

By default, the equivalent query and fuzzy query use the complete string for matching, while the tokenized query uses the MATCH function for matching. The following examples describe three common query methods and their query results.

  • Equivalent queries

    Example 1:

    SELECT * FROM search_table WHERE address='Hangzhou' LIMIT 10;

    No result is returned because no data record in the address column exactly matches the string Hangzhou.

    Example 2:

    SELECT * FROM search_table WHERE address='Binjiang District, Hangzhou' LIMIT 10;

    The following result is returned:

    +---------+--------+-----+--------+--------------+------------------+------+
    | user_id |  name  | age | gender |   address    |      email       | city |
    +---------+--------+-----+--------+--------------+------------------+------+
    | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
    +---------+--------+-----+--------+--------------+------------------+------+
  • Fuzzy queries

    Example:

    SELECT * FROM search_table WHERE address like '%Binjiang%' LIMIT 10;

    The following result is returned:

    +---------+--------+-----+--------+--------------+------------------+------+
    | user_id |  name  | age | gender |   address    |      email       | city |
    +---------+--------+-----+--------+--------------+------------------+------+
    | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
    +---------+--------+-----+--------+--------------+------------------+------+
  • Tokenized queries

    Example:

    SELECT * FROM search_table WHERE MATCH (address) AGAINST ('Hangzhou') LIMIT 10;

    The following result is returned:

    +---------+--------+-----+--------+--------------+------------------+------+
    | user_id |  name  | age | gender |   address    |      email       | city |
    +---------+--------+-----+--------+--------------+------------------+------+
    | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou |
    | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
    +---------+--------+-----+--------+--------------+------------------+------+