All Products
Search
Document Center

ApsaraDB for HBase:DDL

Last Updated:Mar 06, 2024

This topic describes the DDL syntax. This topic also provides examples to show how to use the DDL syntax.

Overview

Lindorm Cassandra Query Language (CQL) stores data in tables. The schema of a table defines the layout of data that is stored in the table. The tables are grouped into keyspaces. Multiple options are available for keyspaces. If you specify an option for a keyspace, the option is applied to all tables in the keyspace. The replication strategy is the most frequently used option. The replication strategy and the number of replicas use the default settings. These settings in Lindorm CQL cannot be modified. Lindorm CQL allows you to use CQL to manage high-performance secondary indexes and full-text indexes of ApsaraDB for Lindorm (Lindorm). This helps you use indexes with ease.

Syntax of keyspaces and tables

Lindorm CQL keyspaces use the following syntax:

keyspace_name ::=  name
name          ::=  unquoted_name | quoted_name
unquoted_name ::=  re('[a-zA-Z_0-9]{1, 48}')
quoted_name   ::=  '"' unquoted_name '"'

Lindorm CQL tables use the following syntax:

table_name    ::=  [ keyspace_name '.' ] name
name          ::=  unquoted_name | quoted_name
unquoted_name ::=  re('[a-zA-Z_0-9]{1, 48}')
quoted_name   ::=  '"' unquoted_name '"'
Note
  • The name of a keyspace or table can contain only letters, digits, and special characters.

  • The name of a keyspace or table must be 1 to 48 characters in length. This limit ensures that each specified keyspace or table name conforms to the predefined naming rules of file systems.

  • By default, the names of keyspaces and tables are not case-sensitive. For example, myTable and mytable specify the same table. However, if a name is quoted by using double quotation marks ("), the name is case-sensitive. For example, "myTable" and "mytable" are different tables.

  • Tables are stored in keyspaces. You can specify that the name of a table must be fully-qualified by the keyspace to which the table belongs. For example, both the ks and gs keyspaces contain a table named table. However, ks.table is different from gc.table.

Indexes

Lindorm CQL supports secondary indexes and full-text indexes that are created by using the engine-based Search Index.

  • Secondary indexes

    You can use Lindorm CQL to create secondary indexes on a table. Then, you can use the secondary indexes to query data from the table. The following statement is used to specify the name of a secondary index.

    index_name ::=  re('[a-zA-Z_0-9]+')
  • Search Index

    Search Index can be used to perform the following operations:

    • Multidimensional queries: Lindorm aggregates data across columns and returns the query result in an efficient way.

    • Sequencing: Lindorm supports the ORDER BY keyword. You can use this keyword to sort results based on the specified columns.

    • Fuzzy match.

    Before you use Search Index in Lindorm CQL, you must enable the Search Index feature and Lindorm Tunnel Service (LTS). For more information, see Overview.

CREATE KEYSPACE

Creates a keyspace.

Syntax

CREATE KEYSPACE [ IF NOT EXISTS ] keyspace_name WITH options
Note

Keyspaces of Lindorm are similar to namespaces. Lindorm CQL does not allow you to modify the replication strategy and the number of replicas. The default values of the related parameters are used.

Parameters

Parameter

Example

Description

keyspace_name

testks

The name of the keyspace.

options

replication

Valid options: replication and durable_writes.

  • replication: the MAP data type. This option specifies the number of replicas generated. By default, Lindorm uses double-replica storage.

  • durable_writes: the BOOLEAN data type. This option specifies whether to enable durable writes. Default value: True.

Examples

CREATE KEYSPACE testks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
Note

You cannot configure replication and durable_writes at the compatibility layer of Lindorm CQL keyspaces. The following default settings are used: the number of replicas is 2 and durable_writes is set to True.

ALTER KEYSPACE

Modifies the options of a keyspace.

Syntax

ALTER KEYSPACE keyspace_name WITH options
Note

In Lindorm CQL, the options parameter supported by ALTER KEYSPACE statements is the same as the options parameter supported by CREATE KEYSPACE statements. However, you do not need to configure the options parameter in ALTER KEYSPACE statements because the default settings are always used.

Parameters

The options parameter used in ALTER KEYSPACE statements is the same as the options parameter used in CREATE KEYSPACE statements. For more information, see the Parameters section for CREATE KEYSPACE.

Examples

ALTER KEYSPACE testks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}

USE

Uses a keyspace. In Lindorm CQL, objects such as tables, user-defined types (UDTs), and user-defined functions (UDFs) are bound to a keyspace.

Syntax

USE keyspace_name

Parameters

Parameter

Example

Description

keyspace_name

testks

The name of the keyspace.

Examples

USE testks;

DROP KEYSPACE

Deletes a keyspace.

Syntax

DROP KEYSPACE [ IF EXISTS ] keyspace_name
Note
  • When you execute the DROP KEYSPACE statement to delete a keyspace, the keyspace is immediately deleted and cannot be restored. The data of the keyspace is also deleted, including the tables, UDTs, and UDFs in the keyspace.

  • You can use the IF EXISTS option to prevent the system from returning an error when the keyspace that you want to delete does not exist. This way, if the keyspace does not exist, the delete operation is not performed.

  • Only the root user can delete keyspaces.

Parameters

Parameter

Example

Description

keyspace_name

testks

The name of the keyspace.

Examples

DROP KEYSPACE testks;

CREATE TABLE

Creates a table.

Syntax

CREATE TABLE [ IF NOT EXISTS ] table_name
                  '('
                       column_definition
                       ( ',' column_definition )*
                       [ ',' PRIMARY KEY '(' primary_key ')' ]
                 ')' [ WITH table_options ]
column_definition      ::=  column_name cql_type [ STATIC ] [ PRIMARY KEY]
primary_key            ::=  partition_key [ ',' clustering_columns ]
partition_key          ::=  column_name
                            | '(' column_name ( ',' column_name )* ')'
clustering_columns     ::=  column_name ( ',' column_name )*
table_options          ::=  CLUSTERING ORDER BY '(' clustering_order ')' [ AND options ]
                            | options
clustering_order       ::=  column_name (ASC | DESC) ( ',' column_name (ASC | DESC) )*

Description

Each Lindorm CQL table has a name and consists of a set of rows. When you create a table, you define the columns of which each row consists, the columns of which the primary key consists, and the table options. You can use the IF NOT EXISTS option to prevent the system from returning an error when the table that you want to create already exists. This way, if the table already exists, the create operation is not performed.

Each row in a Lindorm CQL table has a set of predefined columns. The predefined columns are specified when you create the table. You can execute ALTER statements to add predefined columns.

The column_definition parameter specifies the name and the data type of a column. This parameter also specifies the values that the column can accept. You can also specify a modifier for a column.

  • STATIC: declares that the column is a static column. The STATIC modifier is not supported.

  • PRIMARY KEY: declares that the column is the sole component of the primary key for the table.

In a table, a row is uniquely identified by the primary key. You must specify a unique primary key for each table. Each table has only one primary key. A primary key consists of one or more defined columns in the table. In syntax, the primary key is specified by using the PRIMARY KEY phrase that is followed by a string of column names within parentheses. If the primary key contains only one column, you can add the PRIMARY KEY phrase to the column in the table definition. The order of the columns in the primary key definition is important. This order affects the distribution and storage order of the data.

A Lindorm CQL primary key consists of the following two parts:

  • partition key: the first part of a primary key. A partition key can contain a single column or multiple columns that are enclosed in a pair of parentheses (). A table contains at least one partition key.

  • clustering columns: the columns that follow the partition key in the primary key definition. The order of these columns defines the clustering order. In Lindorm CQL, the clustering columns are optional.

The following items provide examples of how to define a primary key:

  • In PRIMARY KEY(a), a is a partition key. No clustering columns are specified.

  • In PRIMARY KEY(a, b, c), a is a partition key. b and c are clustering columns.

  • In PRIMARY KEY((a, b), c), a and b compose a composite partition key. c is the clustering column.

In Lindorm CQL, primary keys are important. A primary key consists of a partition key and clustering columns. In actual scenarios, a partition key must be used in combination with clustering columns.

Lindorm CQL defines partitions for tables. A partition is a set of rows that share the same partition key. If a partition key consists of several columns, the rows of the columns belong to the same partition. The values of these partition key columns are the same. The following statements provide an example of how to create a table and query all the data of the table:

CREATE TABLE persioninfo (
    a int,
    b int,
    c int,
    d int,
    PRIMARY KEY ((a, b), c, d)
);

SELECT * FROM persioninfo;
   a | b | c | d
  ---+---+---+---
   0 | 0 | 0 | 0    // row 1
   0 | 0 | 1 | 1    // row 2
   0 | 1 | 2 | 2    // row 3
   0 | 1 | 3 | 3    // row 4
   1 | 1 | 4 | 4    // row 5

In Lindorm CQL, a primary key consists of a partition key and clustering columns. A partition key is as important as clustering columns. A partition key is used in combination with clustering columns to identify the node to which a row of data belongs. A partition key alone cannot identify the node to which a row of data belongs. In the preceding example, row 1 and row 2 reside in the same node in Lindorm CQL. However, row 1 and row 2 may reside in different nodes in the traditional CQL. This is the main difference between Lindorm CQL and the traditional CQL.

Examples

CREATE TABLE tb (name text PRIMARY KEY , age int);   // By default, the statement is used to create a table whose primary key is the name column. 
CREATE TABLE ttltb (name text PRIMARY KEY, age int) WITH default_time_to_live = 1000;  // Sets the Time-to-Live (TTL) to 1000s. 
CREATE TABLE cptb (name text PRIMARY KEY, age int) WITH compression = {'sstable_compression': 'LZ4Compressor'};   // Sets the compression algorithm to LZ4Compressor. By default, SNAPPY is used. 
CREATE TABLE hct ( name text PRIMARY KEY , age int  ) WITH extensions = {'COLD_BOUNDARY':'10'};  // Sets the timeline of cold and hot data separation for Lindorm to 10s. 

Features that are unsupported by Lindorm CQL tables

PARTITION KEY: determines the related node to which distributed data belongs. Lindorm CQL cannot identify the physical node to which a row of data belongs based only on the partition key.

Supported table options

The table options supported by Lindorm CQL are different from the table options supported by Cassandra. The following table describes the options that are supported by Lindorm CQL.

Option

Type

Description

default_time_to_live

int

The default TTL of the table. Unit: seconds. Default value: 0.

compression

map

The compression algorithm for the SSTables. Valid values: LZ4, ZSTD, and SNAPPY. For more information, see the "Compression" section.

extensions

map

Specifies the required extensions. These extensions include the cold storage, cold and hot data separation, and consistency level of tables. For more information, see the "Extensions" section.

Compression

Lindorm CQL supports the following compression algorithms. Use the default coefficient to configure each compression algorithm.

  • LZ4Compressor (LZ4). Execute the following statement to create a table that is compressed by using the LZ4 algorithm:

    CREATE TABLE persioninfo (
       id int,
       name text,
       address text,
       PRIMARY KEY (id, name)
    ) with compression = {'class': 'LZ4Compressor'};
  • ZstdCompressor (Zstd). Execute the following statement to create a table that is compressed by using the Zstd algorithm:

    CREATE TABLE persioninfo (
       id int,
       name text,
       address text,
       PRIMARY KEY (id, name)
    ) with compression = {'class': 'ZstdCompressor'};
  • SnappyCompressor (SNAPPY). Execute the following statement to create a table that is compressed by using the SNAPPY algorithm:

    CREATE TABLE persioninfo (
       id int,
       name text,
       address text,
       PRIMARY KEY (id, name)
    ) with compression = {'class': 'SnappyCompressor'};
Note

You can execute the ALTER TABLE statement to modify the value of the compression property.

Extensions

You can use the extensions option to configure the following properties that are specific to Lindorm:

  • Cold storage. This property is specified by the STORAGE_POLICY keyword. The value COLD specifies cold storage. The value DEFAULT specifies hot storage.

    CREATE TABLE persioninfo (name text PRIMARY KEY, age int) WITH extensions = {'STORAGE_POLICY' : 'COLD'};  // Specifies cold storage for the table. 
    ALTER TABLE persioninfo WITH extensions = {'STORAGE_POLICY' : 'DEFAULT'};  // Changes the table property from cold storage to hot storage. 
  • Cold and hot data separation. This property is specified by the COLD_BOUNDARY keyword. If you specify to enable cold and hot data separation, you do not need to set the STORAGE_POLICY parameter for a table or a column family to COLD. If the STORAGE_POLICY parameter is set to COLD, you must disable cold and hot data separation. For more information, see Overview.

    CREATE TABLE persioninfo (name text PRIMARY KEY, age int) with extensions = {'COLD_BOUNDARY':'86400'};  // Sets the timeline of cold and hot data separation for the table to 86400s. Data that is written to the table after the specified time is written to the cold medium. 
    ALTER TABLE persioninfo with extensions = {'COLD_BOUNDARY':''};    // Disables cold and hot data separation. 
    ALTER TABLE persioninfo with extensions = {'COLD_BOUNDARY':'1000'};   // Modifies the timeline to 1000s. 
  • CONSISTENCY_TYPE: specifies the consistency level for a table in the multi-zone scenarios of Lindorm. Valid values: eventual, timestamp, basic, and strong.

    CREATE TABLE persioninfo (name text PRIMARY KEY, age int) with extensions = {'CONSISTENCY_TYPE':'strong'};  // Sets the consistency level to strong. 
    ALTER TABLE persioninfo with extensions = {'CONSISTENCY_TYPE':'eventual'};  // Modifies the consistency level to eventual. 
  • MUTABILITY: Before you use secondary indexes and full-text indexes, you must configure this parameter. Valid values: IMMUTABLE, IMMUTABLE_ROWS, MUTABLE_LATEST, and MUTABLE_ALL. The IMMUTABLE_ROWS, MUTABLE_LATEST, and MUTABLE_ALL values are valid only when the CONSISTENCY_TYPE parameter is set to strong. Default value: MUTABLE_ALL.

    CREATE TABLE persioninfo (name text PRIMARY KEY, age int) with extensions = {'MUTABILITY':'IMMUTABLE'};  // Sets the MUTABILITY parameter to IMMUTABLE. 
    ALTER TABLE persioninfo with extensions = {'MUTABILITY':'MUTABLE_LATEST'};   // Modifies the value of the MUTABILITY parameter to MUTABLE_LATEST. 

Unsupported table options

Option

Type

Description

comment

string

The description of the table. Default value: f.

speculative_retry

simple simple

Default value: 99PERCENTILE.

cdc

boolean

Specifies whether to create a change data capture (CDC) log for the table. Default value: false.

gc_grace_seconds

int

The amount of time to wait before the data is marked with a tombstone. The tombstone is used as a deletion mark. Default value: 86400.

bloom_filter_fp_chance

float

The probability of a false positive for each stable Bloom filter. The system determines the probability based on the size of each Bloom filer. If you reduce the value of this parameter, the size of each Bloom filer that resides in memory or on disks is affected. Default value: 0.00075.

compaction

map

Default value: STCS.

caching

map

None

memtable_flush_period_in_ms

map

Default value: 0.

read_repair

int

Default value: BLOCKING.

ALTER TABLE

Modifies a table.

Syntax

ALTER TABLE table_name alter_table_instruction
alter_table_instruction ::=  ADD column_name cql_type ( ',' column_name cql_type )*
                             | DROP column_name ( column_name )*
                             | WITH options

Description

The following items describe the ALTER TABLE statement:

  • You can use the ADD instruction to add columns to a table. The primary key of a table cannot be changed after the table is created. Therefore, you cannot add a column as a part of a primary key. Take note of the limits when you add columns to compact tables.

  • Lindorm CQL does not support the DROP operation on columns.

  • You can use the WITH instruction to modify the settings of a table. You can modify all options that are specified when the table is created, except for CLUSTERING ORDER. The default_time_to_live, compression and extensions options can be modified.

Examples

CREATE TABLE persioninfo (name text PRIMARY KEY, age int);
ALTER TABLE persioninfo ADD address text;

DROP TABLE

Deletes a table.

Syntax

DROP TABLE [ IF EXISTS ] table_name
Note
  • If you use the DROP TABLE statement to delete a table, the table and the data are immediately deleted and cannot be restored.

  • You can use the IF EXISTS option to prevent the system from returning an error when the table that you want to delete does not exist. This way, if the table does not exist, the delete operation is not performed.

  • Only the root user can delete tables.

Parameters

Parameter

Example

Description

table_name

persioninfo

The name of the table.

Examples

DROP TABLE persioninfo;

TRUNCATE

Clears the data of a table.

Syntax

TRUNCATE [ TABLE ] table_name
Note
  • You can execute the TRUNCATE statement to permanently delete all data of a table. However, the schema status of the table is not deleted.

  • Only the root user can clear data of tables.

Parameters

Parameter

Example

Description

table_name

persioninfo

The name of the table.

Examples

TRUNCATE TABLE persioninfo;

CREATE INDEX

Creates a secondary index.

Syntax

CREATE [ CUSTOM ] INDEX [ IF NOT EXISTS ] [ index_name ]
                                ON table_name '(' index_identifier ')'
                                [ USING string [ WITH OPTIONS = map_literal ] ]
index_identifier       ::=  column_name
                           | '(' column_name ')'

Description

You can execute the CREATE INDEX statement to create a secondary index for a column in a table. You can specify an index name before the ON keyword. If data already exists for a column, the column is asynchronously indexed. After the index is created, new data for the column is automatically indexed.

If the index that you want to create already exists, an error message is returned. To prevent the system from returning an error message, you can use the IF NOT EXISTS option. In this case, your query has no result.

The CREATE INDEX statement can only be used to create indexes for a single column. If you want to create indexes for multiple columns in a table, specify USING 'com.alibaba.lindorm.cserver.schema.index.LindormSecondaryIndex' in the CREATE CUSTOM INDEX statement.

Examples

CREATE INDEX myindex ON persioninfo (c2);
CREATE INDEX ON persioninfo (c2);
CREATE CUSTOM INDEX myindex ON persioninfo (c1,c2) USING 'com.alibaba.lindorm.cserver.schema.index.LindormSecondaryIndex';

DROP INDEX

Deletes a secondary index.

Syntax

DROP INDEX [ IF EXISTS ] index_name
Note
  • You can execute the DROP INDEX statement to delete an existing secondary index. The index_name parameter in the statement specifies the keyspace of the index.

  • You can use the IF EXISTS option to prevent the system from returning an error when the index that you want to delete does not exist. This way, if the index does not exist, the delete operation is not performed.

Examples

DROP INDEX myindex;

CREAT SEARCH INDEX

Creates a full-text index.

Syntax

CREATE SEARCH INDEX [ IF NOT EXISTS ] index_name ON [keyspace_name.]table_name
| [ WITH [ COLUMNS (column1,...,columnn) ]
| [ WITH [ COLUMNS (*) ]
Note
  • You can execute the CREATE SEARCH INDEX statement to create a full-text index on some columns of the source table.

  • WITH COLUMNS (column) specifies one or more columns on which a full-text index is created. Separate multiple columns with commas (,).

  • WITH COLUMNS (*) uses the * wildcard to specify all columns on which a full-text index is created.

  • You can extend the properties of the source table by using the extensions keyword when you create a full-text index for a table. By default, you must specify the CONSISTENCY_TYPE and the MUTABILITY properties. This rule applies when you use Lindorm in multi-zone scenarios. This rule does not apply to single-zone scenarios.

Examples

CREATE SEARCH INDEX schidx ON persioninfo WITH COLUMNS (c2, c3); 

DROP SEARCH INDEX

Deletes a full-text index.

Syntax

DROP SEARCH INDEX [IF EXISTS] ON [keyspace_name.]table_name;

Examples

DROP SEARCH INDEX ON testks.persioninfo;

REBUILD SEARCH INDEX

Sets the indexing status to active.

Syntax

REBUILD SEARCH INDEX [ASYNC] [IF EXISTS] ON [keyspace_name.]table_name;
Note

After you create a full-text index, the indexing status is INACTIVE. You must execute the REBUILD SEARCH INDEX statement to activate the index. The REBUILD SEARCH INDEX statement can also be used to create a full-text index for existing data. The process may require an extended period of time. You can configure the ASYNC parameter to run the process in an asynchronous manner.

Examples

REBUILD SEARCH INDEX ON persioninfo;
REBUILD SEARCH INDEX ASYNC ON persioninfo;

ALTER SEARCH INDEX

Modifies the status of a full-text index. This statement can also be used to manually add or delete an indexed column.

Syntax

ALTER SEARCH INDEX SCHEMA [IF EXISTS] ON [keyspace_name.]table_name
  ( ADD FIELD column_name
  | DROP FIELD column_name) ;
Note
  • After you execute the ALTER SEARCH INDEX statement to modify the status of a full-text index, you must execute the REBUILD SEARCH INDEX statement to reset the status of the full-text index.

  • If you execute the ADD SEARCH INDEX statement to add an indexed column or the DROP SEARCH INDEX statement to delete an indexed column, make sure that the indexed column exists in the table.

Examples

ALTER SEARCH INDEX SCHEMA ON persioninfo ADD  (c3);
ALTER SEARCH INDEX SCHEMA ON persioninfo DROP  (c2);