Distributed Relational Database Service (PolarDB-X 1.0) supports efficient data scanning and uses aggregate functions for statistical summary during full table scans.

Common scanning scenarios

  • Scan of table without database or table shards: PolarDB-X 1.0 transmits the original SQL statement to the backend ApsaraDB RDS for MySQL database for execution. In this case, PolarDB-X 1.0 supports any aggregate functions..
  • Non-full table scan: SQL passes through PolarDB-X 1.0 transmits the original SQL statement to each single ApsaraDB RDS for MySQL database for execution. For example, when the shard key in a WHERE clause indicates "equal to", a non-full table scan is performed. In this case, PolarDB-X 1.0 also supports all aggregate functions..
  • Full table scan: Currently, supported aggregate functions are COUNT, MAX, MIN, and SUM. In addition, LIKE, ORDER BY, LIMIT, and GROUP BY are also supported during full table scan.
  • Parallel scan of all table shards: If you need to export data from all databases, you can run the SHOW command to view the table topology and scan all table shards in parallel. For more information, see the following.

Traverse tables by using a hint

  1. Run SHOW TOPOLOGY FROM TABLE_NAME to obtain the table topology.
    mysql> SHOW TOPOLOGY FROM DRDS_USERS;
    +------+-------------------+--------------+
    | ID   | GROUP_NAME        | TABLE_NAME   |
    +------+-------------------+--------------+
    | 0 | DRDS_00_RDS | DRDS_USERS |
    | 1 | DRDS_01_RDS | DRDS_USERS |
    +------+-------------------+--------------+
    2 rows in set (0.06 sec)       
    Note
    • By default, the non-sharding tables are stored in database shard 0.
    • We recommend that you execute SHOW TOPOLOGY FROM TABLE_NAME to obtain the latest table topology before each scanning operation.
  2. Traverse each table based on the topology.
    • Execute the current SQL statement on database shard 0.
      /! TDDL:node='DRDS_00_RDS'*/ SELECT * FROM DRDS_USERS;
    • Execute the current SQL statement on database shard 1.
      /! TDDL:node='DRDS_01_RDS'*/ SELECT * FROM DRDS_USERS;

Parallel scans

PolarDB-X 1.0 allows you to run mysqldump to export data. However, if you want to scan data faster, you can enable multiple sessions for each table shard to scan tables in parallel.

mysql> SHOW TOPOLOGY FROM LJLTEST;
+------+----------------+------------+
| ID   | GROUP_NAME     | TABLE_NAME |
+------+----------------+------------+
|    0 | TDDL5_00_GROUP | ljltest_00 |
|    1 | TDDL5_00_GROUP | ljltest_01 |
|    2 | TDDL5_00_GROUP | ljltest_02 |
|    3 | TDDL5_01_GROUP | ljltest_03 |
|    4 | TDDL5_01_GROUP | ljltest_04 |
|    5 | TDDL5_01_GROUP | ljltest_05 |
|    6 | TDDL5_02_GROUP | ljltest_06 |
|    7 | TDDL5_02_GROUP | ljltest_07 |
|    8 | TDDL5_02_GROUP | ljltest_08 |
|    9 | TDDL5_03_GROUP | ljltest_09 |
|   10 | TDDL5_03_GROUP | ljltest_10 |
|   11 | TDDL5_03_GROUP | ljltest_11 |
+------+----------------+------------+
12 rows in set (0.06 sec)
As shown above, the table has four database shards, and each database shard has three table shards. Execute the following SQL statement on the table shards of the TDDL5_00_GROUP database.
/! TDDL:node='TDDL5_00_GROUP'*/ select * from ljltest_00;
Note
  • TDDL5_00_GROUP in HINT corresponds to the GROUP_NAME column in the output of the SHOW TOPOLOGY command.
  • The table name in the SQL statement is the table shard name.
  • At this time, you can enable up to 12 sessions (corresponding to 12 table shards respectively) to process data in parallel.