Scan all or some of the table shards in all or some of the database shards - PolarDB

This topic describes the hint syntax and sample code that can be used to scan all or some of the table shards in all or some of the database shards.

Information provided in this topic is applicable to PolarDB-X 1.0 V5.3 and later.

DRDS provides the capability to route an SQL statement to one or more database shards for execution. PolarDB-X 1.0 also provides SCAN HINT to scan all or some of the table shards in all or some of the database shards. You can use SCAN HINT to route an SQL statement to all database shards at a time. For example, you can query all the table shards in a specified database shard or query the amount of the data in each physical table that corresponds to a specified logical table.

You can use SCAN HINT to execute SQL statements in the following manners:

Execute an SQL statement on all the table shards in all database shards.
Execute an SQL statement on all the table shards in the specified database shards.
Execute an SQL statement on the specified table shards in the specified database shards. The names of the physical tables are calculated based on the given conditions.
Execute an SQL statement on the specified table shards in the specified database shards. The table shards are explicitly specified by using the names of the physical tables.

SCAN HINT can be used in DML statements, DDL statements, and some Data Access Language (DAL) statements.

Syntax

# SCAN HINT

# Route an SQL statement to all the table shards in all database shards.
SCAN()                               

# Route an SQL statement to all the table shards in the specified database shards.
SCAN(NODE="node_list")               # Specify the database shards.

# Route an SQL statement to the specified table shards in the specified database shards. The names of the physical tables are calculated based on the given conditions.
SCAN(
  [TABLE=]"table_name_list"          # Specify the names of the logical tables.
  , CONDITION="condition_string"     # Calculate the names of the physical tables based on the values of the TABLE and CONDITION parameters.
  [, NODE="node_list"] )             # Filter the results obtained based on the value of the CONDITION parameter to retain only the names of the tables that are in the specified physical databases.

# Route an SQL statement to the specified table shards in the specified database shards. The table shards are explicitly specified by using the names of the physical tables.
SCAN(
  [TABLE=]"table_name_list"          # Specify the names of the logical tables.
  , REAL_TABLE=("table_name_list")   # Specify the names of the physical tables. These physical table names are used to query data from all the physical databases.
  [, NODE="node_list"] )             # Filter the results obtained based on the value of the CONDITION parameter to retain only the names of the tables that are in the specified physical databases.


# Specify the names of physical table shards or logical tables.
table_name_list: 
    table_name [, table_name]...

# Specify physical databases by using GROUP_KEY and GROUP_INDEX. To obtain the values of GROUP_KEY and GROUP_INDEX, you can execute the SHOW NODE statement. 
node_list: 
    {group_key | group_index} [, {group_key | group_index}]...
    
# Specify an SQL WHERE clause. You must specify conditions for each table, such as t1.id = 2 and t2.id = 2.
condition_string: 
    where_condition

Note

You can specify PolarDB-X 1.0 hints in the following formats: /*+TDDL:hint_command*/ and /!+TDDL:hint_command*/.
If you specify hints in the /*+TDDL:hint_command*/ format, add the -c parameter to the command used to log on to the MySQL command-line client: mysql. This way, you can execute SQL statements that contain the PolarDB-X 1.0 hints on the client. If you do not add the -c parameter, the client deletes comments in the SQL statements before it sends the SQL statements to servers for execution. PolarDB-X 1.0 hints in this format are defined as MySQL comments. Therefore, the PolarDB-X 1.0 hints are deleted and cannot take effect. For more information, see mysql client options.

Examples

Execute an SQL statement on all the table shards in all database shards.
```
SELECT /*+TDDL:scan()*/ COUNT(1) FROM t1      
```
When the SQL statement is executed, DRDS routes the SQL statement to all the physical tables of logical table t1. Then, DRDS merges the result sets and returns the final result.
Execute an SQL statement on all the table shards in the specified database shards.
```
SELECT /*+TDDL:scan(node='0,1,2')*/ COUNT(1) FROM t1          
```
When the SQL statement is executed, DRDS calculates the names of the physical tables of logical table t1 in database shards 0000, 0001, and 0002. Then, DRDS routes the SQL statement to the table shards. After the SQL statement is executed, DRDS merges the result sets and returns the final result.
Execute an SQL statement on the specified table shards based on the given conditions.
```
SELECT /*+TDDL:scan('t1', condition='t1.id = 2')*/ COUNT(1) FROM t1           
```
When the SQL statement is executed, DRDS calculates the names of all the physical tables correspond to logical table t1 and meet the conditions. Then, DRDS routes the SQL statement to the specified table shards. After the SQL statement is executed, DRDS merges the result sets and returns the final result.
Execute an SQL statement that contains a JOIN clause on the specified table shards based on the given conditions.
```
SELECT /*+TDDL:scan('t1, t2', condition='t1.id = 2 and t2.id = 2')*/ * FROM t1 a JOIN t2 b ON a.id = b.id WHERE b.name = "test"
            
```
When the SQL statement is executed, DRDS calculates the names of the physical tables correspond to logical tables t1 and t2 and meet the conditions. Then, DRDS routes the SQL statement to the specified table shards. After the SQL statement is executed, DRDS merges the result sets and returns the final result. Note: When you use this hint, you must make sure that the two tables belong to the same database shard. You must also make sure that the number of shards in one of the table is the same as the number of shards in the other table. Otherwise, the table shard names obtained by PolarDB-X 1.0 represent table shards that are not in the same database shard. If this issue occurs, DRDS reports an error.
Execute an SQL statement on the specified table shards in the specified database shards. The table shards are explicitly specified by using the names of the physical tables.
```
SELECT /*+TDDL:scan('t1', real_table=("t1_00", "t1_01"))*/ COUNT(1) FROM t1
            
```
When the SQL statement is executed, DRDS routes the SQL statement to table shards t1_00 and t1_01 in all database shards. After the SQL statement is executed, DRDS merges the result sets and returns the final result.
Execute an SQL statement that contains a JOIN clause on the specified table shards in the specified database shards. The table shards are explicitly specified by using the names of the physical tables.
```
SELECT /*+TDDL:scan('t1, t2', real_table=("t1_00,t2_00", "t1_01,t2_01"))*/ * FROM t1 a JOIN t2 b ON a.id = b.id WHERE b.name = "test";
            
```
When the SQL statement is executed, DRDS routes the SQL statement to table shards t1_00, t2_00, t1_01, and t2_01 in all database shards. After the SQL statement is executed, DRDS merges the result sets and returns the final result.