This topic describes how to use a search index to query data in a Lindorm wide table using multidimensional queries, paged and sorted queries, tokenized text queries, fuzzy queries, and aggregate queries.
Prerequisites
You have connected to LindormTable using Lindorm-cli. For more information, see Connect to and use LindormTable with Lindorm-cli.
You have prepared test data and created a search index. For more information, see Manage search indexes.
Run a multidimensional query
Run the following statements to perform multidimensional queries.
Example 1
SELECT * FROM search_table WHERE gender='M' AND city='Hangzhou' OR city='Beijing';The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou | | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing | | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+Example 2
SELECT * FROM search_table WHERE age > 30 AND city != 'Hangzhou';The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 28 | Chen | 36 | F | Nanshan District, Shenzhen | a***@example.net | Shenzhen | +---------+--------+-----+--------+--------------+------------------+------+
Sorting and pagination
A search index supports sorting on any column. Run the following statement to sort the query results.
SELECT * FROM search_table WHERE gender='M' ORDER BY age DESC;The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou | | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing | +---------+--------+-----+--------+--------------+------------------+------+A search index supports pagination using
limitoroffset. Run the following statement to perform a paged query.SELECT * FROM search_table WHERE gender='M' ORDER BY age DESC LIMIT 1,10;The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing | +---------+--------+-----+--------+--------------+------------------+------+
Run a fuzzy query
Run the following statement to query for data that contains `example` in the email column.
SELECT * FROM search_table WHERE email LIKE '%example%';The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+
| user_id | name | age | gender | address | email | city |
+---------+--------+-----+--------+--------------+------------------+------+
| 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing |
| 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou |
| 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
| 28 | Chen | 36 | F | Nanshan District, Shenzhen | a***@example.net | Shenzhen |
+---------+--------+-----+--------+--------------+------------------+------+The performance of fuzzy queries depends on the amount of data in a table. We recommend that you perform tokenized text queries instead.
Aggregation and analysis
Search indexes support common aggregate functions such as COUNT, SUM, AVG, MIN, and MAX. Search indexes also support DISTINCT and GROUP BY.
Run the following statement to count the rows where the
citycolumn isHangzhou.SELECT COUNT(*) FROM search_table WHERE city='Hangzhou';The following result is returned:
+----------+ | EXPR$0 | +----------+ | 2 | +----------+Run the following statement to remove duplicate values from the query result.
SELECT distinct(age) FROM search_table WHERE gender='M';The following result is returned:
+---------------+ | DISTINCT(age) | +---------------+ | 18 | | 28 | | 32 | +---------------+Run the following statement to group the query results.
SELECT city,count(*) AS cnt FROM search_table WHERE gender='M' GROUP BY city ORDER BY cnt DESC;The following result is returned:
+------+-----+ | city | cnt | +------+-----+ | Hangzhou | 2 | | Beijing | 1 | +------+-----+
If you do not specify a WHERE condition in an aggregate query, the query does not use the search index. In this case, you can add the HINT parameter _l_force_index_ before the WHERE keyword to force the query to use the search index. For more information, see Parameters of hintOptions and Example 4: Specify _l_force_index_ to forcibly use an index in the query.
Run a tokenized text query
For tokenized columns, you can use the MATCH function to perform tokenized text queries.
ImportantThe MATCH function is supported in LindormTable 2.7.2 and later versions. To view or upgrade your LindormTable version, see Release notes for LindormTable and Minor version update.
The following example queries rows where the
agevalue is in the range(10,30]and theaddressvalue matchesHangzhou. Theaddresscolumn is tokenized.SELECT * FROM search_table WHERE age > 10 AND age <= 30 AND MATCH (address) AGAINST ('Hangzhou');The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+The following example queries rows where
ageis in the range(30,35]andaddressmatches the exact phraseYuhang District, Hangzhou, and theaddresscolumn is tokenized.SELECT * FROM search_table WHERE age > 30 AND age <= 35 AND MATCH (address) AGAINST ('"Yuhang District, Hangzhou"');The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+
In LindormTable versions earlier than 2.7.2, you can use the equality operator
=to perform tokenized text queries.ImportantEquality-based tokenized text queries have been deprecated. Upgrade to a newer version and use the MATCH function instead. To view or upgrade your LindormTable version, see Release notes for LindormTable and Minor version update.
Starting from LindormTable 2.8.2.3, new search indexes no longer support equality-based tokenized text queries. Existing search indexes still support them.
The following example queries rows where the
agevalue is in the range(10, 30]and theaddressvalue matchesHangzhou, with theaddresscolumn tokenized.SELECT * FROM search_table WHERE age > 10 AND age <= 30 AND address = 'Hangzhou';The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+
Run a tokenized text query with subfields
The tokenized text subfield feature requires LindormTable 2.8.0 or later and LindormSearch compatible with Elasticsearch. If your version does not meet these requirements, update the minor version.
Background
When you create a search index, if you set a VARCHAR column to the tokenized type (type=text), the system uses tokenization for matching in equivalent and fuzzy queries. This may produce unexpected results.
For example, if you configure the address(type=text,analyzer=ik) column, the system splits text into tokens based on the tokenizer policy when it creates the index. For instance, Binjiang District, Hangzhou becomes two tokens: Hangzhou and Binjiang District. As a result, a fuzzy query that uses LIKE may fail because the target term is split across tokens.
In the following example, District and Binjiang are split into separate tokens, so a match does not occur. The query returns no results.
SELECT * FROM search_table WHERE city LIKE '%Binjiang District%' LIMIT 10;Enable Method
To allow a VARCHAR column to support tokenized text queries, fuzzy queries, and equivalent queries simultaneously, you can enable the tokenized text subfield feature.
To enable this feature, set the column attribute to type=string,textSubField=true,analyzer=ik when you create the index. Select an analyzer based on your use case. For more information, see analyzer type. The following is an example:
CREATE INDEX idx USING SEARCH ON search_table(user_id,address(type=string, textSubField=true, analyzer=ik));If a search index already exists and you want to enable the tokenized text subfield feature, you can use Single-table multiple search indexes (public preview) to create a new search index with tokenized text subfields, and then switch queries to the new index.
Query methods
By default, equivalent and fuzzy queries use full-string matching. Tokenized text queries use the MATCH function. The following examples show three common query methods and their results.
Equivalent queries
Example 1:
SELECT * FROM search_table WHERE address='Hangzhou' LIMIT 10;The query returns no results because no record in the
addresscolumn exactly matches the stringHangzhou.Example 2:
SELECT * FROM search_table WHERE address='Binjiang District, Hangzhou' LIMIT 10;The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+Fuzzy queries
Example:
SELECT * FROM search_table WHERE address like '%Binjiang%' LIMIT 10;The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+Tokenized text queries
Example:
SELECT * FROM search_table WHERE MATCH (address) AGAINST ('Hangzhou') LIMIT 10;The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou | | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+