This topic describes the syntax of Data Manipulation Language (DML) statements and provides examples.
SELECT
Run a query.
Syntax
select_statement ::= SELECT [ JSON | HOTDATA ] ( select_clause | '*' )
FROM table_name
[ WHERE where_clause ]
[ GROUP BY group_by_clause ]
[ ORDER BY ordering_clause ]
[ PER PARTITION LIMIT (integer | bind_marker) ]
[ LIMIT (integer | bind_marker) ]
[ ALLOW FILTERING ]
select_clause ::= selector [ AS identifier ] ( ',' selector [ AS identifier ] )
selector ::= column_name
| term
| CAST '(' selector AS cql_type ')'
| function_name '(' [ selector ( ',' selector )* ] ')'
| COUNT '(' '*' ')'
where_clause ::= relation ( AND relation )*
relation ::= column_name operator term
'(' column_name ( ',' column_name )* ')' operator tuple_literal
TOKEN '(' column_name ( ',' column_name )* ')' operator term
operator ::= '=' | '<' | '>' | '<=' | '>=' | '!=' | IN | CONTAINS | CONTAINS KEY
group_by_clause ::= column_name ( ',' column_name )*
ordering_clause ::= column_name [ ASC | DESC ] ( ',' column_name [ ASC | DESC ] )*A SELECT statement reads one or more columns from one or more rows in a table and returns a result set of rows that match the query. Each row contains the selected values. You can also apply functions, including aggregate functions, to the results.
A SELECT statement must include at least one select clause and the name of the table to query. Note that Lindorm CQL does not currently support joins or subqueries, so SELECT statements apply only to a single table. Most SELECT statements also include a WHERE clause, and you can optionally add other clauses to sort or limit the results. If you include the
ALLOW FILTERINGclause, queries that require filtering are permitted.
Examples
CREATE TABLE persioninfo (c1 int primary key, c2 text, c3 text);
SELECT c1, c2 FROM persioninfo WHERE c1 IN (199, 200, 207);
SELECT JSON c1, c2 FROM persioninfo WHERE c1 = 199;
SELECT c1 AS name, c2 AS address FROM persioninfo;
CREATE TABLE persioninfo (c1 int , c2 text, c3 text, PRIMARY KEY(c1,c2));
SELECT *
FROM persioninfo
WHERE c1 = 12
AND c2 > '2011-02-03'
AND c2 <= '2012-01-01';
SELECT COUNT (*) AS user_count FROM persioninfo;Select clause
The select clause determines which columns to query and return in the result set. It also specifies any transformations to apply to the results. The clause consists of a comma-separated list of selectors or a wildcard character (*) to select all columns defined in the table. The select clause includes the following components:
Selectors
The following types are available:
column_name: The name of a column in the queried table, used to retrieve the column’s value.term: A term, typically nested within other selectors such as functions. If you select a term directly, the corresponding column in the result set contains only that term’s value for every returned row.CAST '(' selector AS cql_type ')': A cast operation that converts a nested selector to a compatible type.function_name '(' [ selector ( ',' selector )* ] ')': A function invocation where the arguments are selectors.COUNT '(' '*' ')': A special invocation of the COUNT function—COUNT(*)—that counts all non-empty results.
WRITETIME function
The select clause supports only one special function—the
WRITETIMEfunction—which is not allowed anywhere else. This function takes exactly one argument, which must be a column name. It retrieves the timestamp stored as metadata for that column, indicating when the column’s value was written.WHERE clause
The
WHEREclause specifies the rows to query. It consists of relations on columns that are part of the primary key (partition key or clustering columns) or have a secondary index defined on them. Not all relational queries are allowed. For example, non-equality relations on apartition keyare unsupported (the IN operator is treated as an equality relation). For a givenpartition key, rows are sorted by theclustering columns, and relations on these columns are restricted to those that select a contiguous set of rows based on that sort order. For example, create a table namedposts:CREATE TABLE posts ( userid text, blog_title text, posted_at timestamp, entry_title text, content text, category int, PRIMARY KEY (userid, blog_title, posted_at) );You can run the following query:
SELECT entry_title, content FROM posts WHERE userid = 'john doe' AND blog_title='John''s Blog' AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31';However, the following query is not allowed because it does not select a contiguous set of rows. This assumes that no secondary index is set on the table.
SELECT entry_title, content FROM posts WHERE userid = 'john doe' AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31';Additionally, the IN relation is allowed only on the last column of the
partition keyand the last column of theprimary key. You can also use tuple notation to groupclustering columns. For example:SELECT * FROM posts WHERE userid = 'john doe' AND (blog_title, posted_at) > ('John''s Blog', '2012-01-01');This query requests all rows that come after `blog_title` = 'John''s Blog' and `posted_at` = '2012-01-01' in the clustering order. Specifically, a row is returned if its `blog_title` is greater than 'John''s Blog', even if its `posted_at` is less than or equal to '2012-01-01'. This is different from the following query:
SELECT * FROM posts WHERE userid = 'john doe' AND blog_title > 'John''s Blog' AND posted_at > '2012-01-01';Tuple notation can also be used in the IN clause for
clustering columns:SELECT * FROM posts WHERE userid = 'john doe' AND (blog_title, posted_at) IN (('John''s Blog', '2012-01-01'), ('Extreme Chess', '2014-06-01'));Grouping results
The GROUP BY option condenses all selected rows that share identical values for a specified set of columns into a single row. With GROUP BY, you can group rows only at the
partition keylevel or theclustering columnlevel. Therefore, GROUP BY accepts only primary key column names as parameters, in primary key order. If a primary key column is constrained by an equality condition, it does not need to appear in the GROUP BY clause.NoteBecause the CQL GROUP BY clause pulls data to a single server node for processing, the operation can be time-consuming if the data volume is large.
Ordering results
The ORDER BY clause lets you choose the order of returned results. It takes a list of column names and a sort direction as parameters (ASC for ascending, DESC for descending; unspecified defaults to ASC). The allowed sort orders are constrained by the
clustering orderdefined on the table:If the table was created without a specific
CLUSTERING ORDER, the allowed sort orders are the natural order of theclustering columnsor its reverse.Otherwise, the allowed sort orders are the order specified in the
CLUSTERING ORDERoption or its reverse.
Limiting results
The LIMIT option of a SELECT statement restricts the number of rows that the query returns.
Allowing filtering
By default, Lindorm CQL allows only select queries that do not involve server-side data filtering. This means that all records read are guaranteed to be returned in the result set, either fully or partially. These non-filtering queries have predictable performance because they can be completed in a time proportional to the volume of data returned by the query, which you can control with LIMIT.
The ALLOW FILTERING option lets you explicitly run certain queries that require filtering. Note that a query that uses ALLOW FILTERING may cause unpredictable server-side performance issues. Even a query that selects a small number of records can perform poorly, depending on the total amount of data stored in the cluster.
For example, create the following table, which contains user profiles with their birth year (with a secondary index) and country of residence:
CREATE TABLE users ( username text PRIMARY KEY, firstname text, lastname text, birth_year int, country text ); CREATE INDEX ON users(birth_year);You can run the following valid queries:
SELECT * FROM users; SELECT * FROM users WHERE birth_year = 1981;In both cases, Lindorm CQL guarantees that the performance of these queries is proportional to the amount of data returned. Specifically, if no users were born in 1981, the performance of the second query does not directly depend on the number of user profiles stored in the database. However, due to how secondary indexes are implemented, the query might still depend on the number of nodes in the cluster and indirectly on the amount of data stored in the cluster. In practice, both queries can return very large result sets. However, you can always control the amount of data returned by adding a LIMIT. The following query is rejected:
SELECT * FROM users WHERE birth_year = 1981 AND country = 'FR';This is because Lindorm CQL cannot guarantee that it will not scan a large amount of data, even if the query result is small. It typically scans the index entries for all users born in 1981, even if only a few are from France. However, you can use ALLOW FILTERING to force the execution of this query. Therefore, the following query is valid:
SELECT * FROM users WHERE birth_year = 1981 AND country = 'FR' ALLOW FILTERING;HOTDATA
In scenarios where you specify hot and cold data separation when you create a table, you can specify HOTDATA in the SELECT statement to query only hot data:
SELECT HOTDATA * FROM persioninfo;If you do not specify this option, the query searches both hot and cold data and then merges the results.
SEARCH INDEX SELECT operation
Lindorm supports fuzzy match queries, multi-dimensional queries, and sorting for columns that have a SEARCH INDEX.
Common fuzzy matches
When you perform a SELECT operation on columns with a SEARCH INDEX, you can use fuzzy matching. The wildcard character % represents zero or more characters. Common fuzzy match types include prefix match, suffix match, prefix and suffix match, and exact match. The following examples show how to use these match types:
SELECT * FROM persioninfo WHERE c2 LIKE 'v2';// Exact match. The value in column c2 is 'v2'. SELECT * FROM kss.tb WHERE c2 LIKE '%v2'; // Suffix match. Matches values that end with 'v2'. SELECT * FROM kss.tb WHERE c2 LIKE 'v2%' ;// Prefix match. Matches values that start with 'v2'. SELECT * FROM kss.tb WHERE c2 LIKE '%v2%';// Contains match. Matches values that contain 'v2'.Specify a search_query
You can use the search_query keyword to run any native Solr query statement within a SELECT statement. This supports more complex query and retrieval needs. The following example shows the syntax:
SELECT selectors FROM table WHERE (indexed_column_expression | search_query = 'search_expression') [ LIMIT n ] [ ORDER BY column_name ] ;In a SELECT statement, the search_query keyword is followed by a search expression. For more information about the character restrictions for the search expression, see Local Parameters in Queries. The following options are commonly used with search_query:
{ "q": query_expression (string), "fq": filter_query_expression(s) (string_or_array_of_strings, ...), "facet": facet_query_expression (object) "sort": sort_expression (string), "start": start_index(number), timeAllowed: search_time_limit_ms, "TZ": zoneID), // Any valid zone ID in java TimeZone class "paging": "driver" (string), "distrib.singlePass": true|false (boolean), "shards.failover": true|false (boolean), // Default: true "shards.tolerant": true|false (boolean), // Default: false "commit": true|false (boolean), "route.partition": partition_routing_expression (array_of_strings), "route.range": range_routing_expression (array_of_strings), "query.name": query_name (string), }An example follows:
CREATE TABLE persioninfo (name text PRIMARY KEY, city text,id int) WITH extensions = {'CONSISTENCY_TYPE':'strong', 'MUTABILITY':'IMMUTABLE'}; CREATE SEARCH INDEX sidx on persioninfo WITH COLUMNS (id,city); INSERT INTO persioninfo (name,city, id) VALUES ('MY','hangzhou', 18); SELECT name,id FROM persioninfo WHERE search_query = 'city:hangzhou';NoteBecause
search_querysupports rich semantics, you cannot currently combine asearch_queryclause with other conditions. For example,select * from table where search_query = 'c1:name' AND column = 'a'is not supported.
UPDATE
Update the data.
Syntax
update_statement ::= UPDATE table_name
[ USING update_parameter ( AND update_parameter )* ]
SET assignment ( ',' assignment )*
WHERE where_clause
update_parameter ::= ( integer | bind_marker )
assignment ::= simple_selection '=' term
| column_name '=' column_name ( '+' | '-' ) term
| column_name '=' list_literal '+' column_name
simple_selection ::= column_name
| column_name '[' term ']'
| column_name '.' `field_name
condition ::= simple_selection operator termAn UPDATE statement updates one or more columns for specified rows in a table. The WHERE clause selects the rows to update and must include all columns that form the
PRIMARY KEY. Use the SET keyword to assign values to non-primary-key columns.Unlike in standard SQL, an UPDATE statement does not check whether a row exists before the operation. If the row does not exist, a new row is created. Otherwise, the existing row is updated. Therefore, you cannot determine whether a row was created or updated. In Lindorm CQL, counter columns can coexist with non-counter columns. This behavior differs from that of Apache Cassandra CQL.
In an UPDATE statement, all updates for the same partition key are
atomicandisolated.For an
assignment, use expressions like c = c + 3 to increment or decrement a counter. The column name after the equals (=) sign must match the one before it. Note that increment and decrement operations are allowed only on counter columns and are the only permitted update operations for counters.
Example
CREATE TABLE product (productname text PRIMARY KEY, company text, subdepartment text);
UPDATE product
SET company = 'aliyun cloud',
subdepartment = 'database nosql'
WHERE productname = 'lindorm';DELETE
You can delete a line or part of a line.
Syntax
delete_statement ::= DELETE [ simple_selection ( ',' simple_selection ) ]
FROM table_name
[ USING update_parameter ( AND update_parameter )* ]
WHERE where_clauseThe
DELETEstatement removes columns or entire rows. If you specify column names immediately after theDELETEkeyword, only those columns are deleted from the rows indicated by the WHERE clause. Otherwise, entire rows are deleted.The WHERE clause specifies the rows to delete. You can use the IN operator to delete multiple rows in a single statement. Deleting a range of rows is not currently supported.
In a
DELETEstatement, all delete operations affecting rows with the same partition key are applied atomically and in isolation.
Example
INSERT INTO t1 (c1,c2,c3) VALUES ( '11', '22', '33');
DELETE c2 FROM t1 WHERE c1 IN ( '11');