This topic describes how to use a search index to query data in a Lindorm wide table in multidimensional queries, paged queries, queries with results sorted, tokenized text queries, fuzzy queries, and aggregate queries.
Prerequisites
You are connected to LindormTable by using Lindorm-cli. For more information, see Use Lindorm-cli to connect to and use LindormTable.
Test data is generated and a search index is created. For more information, see Manage search indexes.
Run a multidimensional query
Execute the following statements to run multidimensional queries:
Example 1
SELECT * FROM search_table WHERE gender='M' AND city='Hangzhou' OR city='Beijing';
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou | | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing | | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+
Example 2
SELECT * FROM search_table WHERE age > 30 AND city ! ='Hangzhou ';
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 28 | Chen | 36 | F | Nanshan District, Shenzhen | a***@example.net | Shenzhen | +---------+--------+-----+--------+--------------+------------------+------+
Run a paged query and sort query results
A search index allows you to sort data based on each of the columns in your table. Execute the following statement to sort the output:
SELECT * FROM search_table WHERE gender='M' ORDER BY age DESC;
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou | | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing | +---------+--------+-----+--------+--------------+------------------+------+
A search index allows you to use
limit
oroffset
to perform data pagination. Execute the following statement to run a paged query:SELECT * FROM search_table WHERE gender='M' ORDER BY age DESC LIMIT 1,10;
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | | 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing | +---------+--------+-----+--------+--------------+------------------+------+
Run a fuzzy query
Execute the following statement to query data that includes example
in the email
column:
SELECT * FROM search_table WHERE email LIKE '%example%' ORDER BY user_id;
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+
| user_id | name | age | gender | address | email | city |
+---------+--------+-----+--------+--------------+------------------+------+
| 1 | Zhang | 18 | M | Chaoyang District, Beijing | a***@example.net | Beijing |
| 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou |
| 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
| 28 | Chen | 36 | F | Nanshan District, Shenzhen | a***@example.net | Shenzhen |
+---------+--------+-----+--------+--------------+------------------+------+
The performance of fuzzy queries is limited due to the amount of data in tables. Therefore, we recommend that you run tokenized text queries instead of fuzzy queries.
Run an aggregate query
Search indexes support common aggregate functions such as COUNT, SUM, AVG, MIN, and MAX. Search indexes also support the DISTINCT and GROUP BY clauses.
Execute the following statement to query the number of data rows that include
Hangzhou
in thecity
column:SELECT COUNT(*) FROM search_table WHERE city='Hangzhou';
The following result is returned:
+----------+ | EXPR$0 | +----------+ | 2 | +----------+
Execute the following statement to query data and remove duplicate query results:
SELECT distinct(age) FROM search_table WHERE gender='M';
The following result is returned:
+---------------+ | DISTINCT(age) | +---------------+ | 18 | | 28 | | 32 | +---------------+
Execute the following statement to query data and group the query results:
SELECT city,count(*) AS cnt FROM search_table WHERE gender='M' GROUP BY city ORDER BY cnt DESC;
The following result is returned:
+------+-----+ | city | cnt | +------+-----+ | Hangzhou | 2 | | Beijing | 1 | +------+-----+
If you do not specify a WHERE
condition in an aggregate query, the corresponding search index cannot be hit. In this case, you can add the _l_force_index
parameter before the WHERE
keyword to force the query to hit a specified index. For more information, see the "Parameters of hintOptions" and "Example 4: Specify _l_force_index_
to forcibly use an index in the query" sections of the HINT topic.
Run a phrase-based query
Execute the following statement to run a query with phrases specified as a condition:
SELECT name,age,city,address FROM search_table WHERE age>1 AND age<100 AND address='"Hangzhou"' ORDER BY user_id;
The following result is returned:
+--------+-----+------+--------------+
| name | age | city | address |
+--------+-----+------+--------------+
| Li | 32 | Hangzhou | Yuhang District, Hangzhou |
| Wang | 28 | Hangzhou | Binjiang District, Hangzhou |
+--------+-----+------+--------------+
Run a tokenized query
Tokenized queries rely on the MATCH
function which is available only in LindormTable 2.7.2 and later versions. To view or upgrade your LindormTable version, see Release notes of LindormTable and Upgrade the minor engine version of a Lindorm instance.
For tokenized columns, you must use the MATCH
function to implement tokenized queries. The following example queries data rows where age
ranges from 10 to 30 and address
matches Hangzhou
, with the address
column being a tokenized column.
SELECT * FROM search_table WHERE age > 10 AND age <= 30 AND MATCH (address) AGAINST ('Hangzhou') ORDER BY user_id;
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+
| user_id | name | age | gender | address | email | city |
+---------+--------+-----+--------+--------------+------------------+------+
| 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou |
+---------+--------+-----+--------+--------------+------------------+------+
Run a tokenized text query
To use the tokenized text query feature, the LindormTable version must be 2.8.0 or later, and the LindormSearch version must be compatible with ElasticSearch. If your version does not meet the preceding requirements, update the minor version.
Background information
If you set the type=text
attribute for a VARCHAR column in a wide table when you create a search index, the system will split the text in the column into tokens and use tokens in the search index when you perform queries such as equivalent queries and fuzzy queries. This may result in unexpected search results.
For example, if you configure the address
column as a tokenized column by setting the address(type=text, analyzer=ik)
address(type=text,analyzer=ik)
attribute, the system will split the text in the column into individual tokens based on the tokenization policy when you create an index. For example, the text Binjiang District, Hangzhou
is split into individual tokens Hangzhou
and Binjiang District
. In this case, the fuzzy query fails if you use the LIKE
operation.
In this example, no result is returned because District
and Hangzhou
are separated.
SELECT * FROM search_table WHERE city LIKE '%District, Hangzhou%' LIMIT 10;
Enable tokenized text queries
If you want a VARCHAR column to support tokenized queries, fuzzy queries, and equivalent queries at the same time, you can enable tokenized text queries.
To enable this feature, you must set the attribute of the corresponding column to type=string,textSubField=true, analyzer=ik
when you create an index. You can configure the analyzer type specified by the analyzer
parameter based on your business requirements. Sample code:
CREATE INDEX idx USING SEARCH ON search_table(user_id,address(type=string, textSubField=true, analyzer=ik));
Only one search index can be created for each wide table. To enable tokenized text queries when a search index already exists, you must first delete the current search index, and then you can enable tokenized text queries when you create a new index.
Query methods
By default, the equivalent query and fuzzy query use the complete string for matching, while the tokenized query uses the MATCH
function for matching. The following examples describe three common query methods and their query results.
Equivalent queries
Example 1:
SELECT * FROM search_table WHERE address='Hangzhou' LIMIT 10;
No result is returned because no data record in the
address
column exactly matches the stringHangzhou
.Example 2:
SELECT * FROM search_table WHERE address='Binjiang District, Hangzhou' LIMIT 10;
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+
Fuzzy queries
Example:
SELECT * FROM search_table WHERE address like '%Binjiang%' LIMIT 10;
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+
Tokenized queries
Example:
SELECT * FROM search_table WHERE MATCH (address) AGAINST ('Hangzhou') LIMIT 10;
The following result is returned:
+---------+--------+-----+--------+--------------+------------------+------+ | user_id | name | age | gender | address | email | city | +---------+--------+-----+--------+--------------+------------------+------+ | 6 | Li | 32 | M | Yuhang District, Hangzhou | a***@example.net | Hangzhou | | 20 | Wang | 28 | M | Binjiang District, Hangzhou | a***@example.net | Hangzhou | +---------+--------+-----+--------+--------------+------------------+------+