All Products
Search
Document Center

Best practices

Last Updated: Aug 15, 2020

After the new data definition language (DDL) execution engine is enabled, if the DDL statement execution fails or is unexpectedly interrupted, the corresponding DDL job enters the Pending state. At this time, if you attempt to execute another DDL statement on the same object such as a table before the job in the Pending state is processed, the execution is prohibited and an error occurs. In this case, to resume the service, you must properly process the pending job to exit the Pending state.

DDL execution scenarios may be very complex and cannot be all illustrated for explanation. This topic describes best practices to summarize the common principles for processing pending jobs.

  1. Execute the SHOW [FULL] DDL statement to view the information and failure causes of a DDL job. The REMARK field records the error information.

  2. Perform common handling: (It is suggested to select the most suitable method as required.)

    • Analyze the failure causes and rectify or eliminate the factors that cause the failure. For example, if the failure is caused by data problems, correct the data, for example, deduplicate the data. If the failure is caused by other constraints, check whether the constraints can be removed. After the failure is rectified, execute the RECOVER DDL statement to resume the pending job.

    • If the failure factors cannot be removed and the DDL job cannot be executed due to the failure, you can execute the REMOVE DDL statement to delete the job. Before the deletion, you must ensure that the DDL job is not executed. Otherwise, inconsistent states may be caused. After the DDL job is deleted, the service is resumed.

    • If you only want to delete the table associated with the failed DDL job (for example, if the table does not contain any data, you can directly delete the table and recreate a table), you can execute the REMOVE DDL statement to delete the job. Then, execute the DROP TABLE IF EXISTS statement to delete the table. Before the deletion, ensure that no data exists in the table or the data in the table is no longer used. In addition, ensure that the IF EXISTS syntax is specified in the DROP TABLE statement so that the data can be force deleted.

The following example describes how to process a DDL job that enters the Pending state after it is failed to be executed.

Execute the following statement to create a table without specifying any primary key, and insert data rows with duplicate values into the table. In this example, the value of id is 1 in two rows.

  1. mysql> create table test_pending (id int not null, age int) dbpartition by hash(id);
  2. Query OK, 0 rows affected (0.33 sec)
  3. mysql> insert into test_pending values(1,10),(1,20),(2,20),(3,30);
  4. Query OK, 4 rows affected (0.10 sec)
  5. mysql> select * from test_pending order by id;
  6. +------+------+
  7. | id | age |
  8. +------+------+
  9. | 1 | 10 |
  10. | 1 | 20 |
  11. | 2 | 20 |
  12. | 3 | 30 |
  13. +------+------+
  14. 4 rows in set (0.10 sec)

Assume that subsequently you want to add the primary key to the id column. The DDL execution fails because existing data in the table violates the uniqueness constraint.

  1. mysql> alter table test_pending add primary key (id);
  2. ERROR 4636 (HY000): [f5be83373466000][10.81.69.55:3306][ddltest]ERR-CODE: [TDDL-4636][ERR_DDL_JOB_ERROR] Not all physical operations have been done successfully: expected 9,
  3. but done 8. Caused by: 1062:DDLTEST_1562056402230OYMK_7WW7_0001:Duplicate entry '1' for key 'PRIMARY' on `test_pending`;.

Execute the SHOW FULL DDL statement to view the job status and the failure cause. You can learn that the data in a physical table contains duplicate values, resulting in the failure in adding the primary key.

  1. mysql> show full ddl\G
  2. *************************** 1. row ***************************
  3. JOB_ID: 1106733441212637184
  4. PARENT_JOB_ID: 0
  5. SERVER: 1:102:10.81.69.55
  6. OBJECT_SCHEMA: ddltest
  7. OBJECT_NAME: test_pending
  8. NEW_OBJECT_NAME:
  9. JOB_TYPE: ALTER_TABLE
  10. PHASE: EXECUTE
  11. STATE: PENDING
  12. PROGRESS: 77%
  13. START_TIME: 2019-09-06 17:17:55.002
  14. END_TIME: 2019-09-06 17:17:55.273
  15. ELAPSED_TIME(MS): 271
  16. DDL_STMT: alter table test_pending add primary key (id)
  17. REMARK: ERR-CODE: [TDDL-4636][ERR_DDL_JOB_ERROR] Not all physical operations have been done successfully: expected 9, but done 8. Caused by: 1062:DDLTEST_1562056402
  18. 230OYMK_7WW7_0001:Duplicate entry '1' for key 'PRIMARY' on `test_pending`;.

Details about the REMARK field:

  • Not all physical operations have been done successfully: expected 9, but done 8.:The DDL statement for this logical table involves nine physical DDL statements, of which eight are executed and one failed. This failed physical DDL statement causes the entire logical DDL statement to fail. As a result, the job enters the Pending state.

  • Caused by: 1062:DDLTEST_1562056402 230OYMK_7WW7_0001:Duplicate entry '1' for key 'PRIMARY' on 'test_pending';: Root cause of failure is that in the physical database DDLTEST_1562056402 230OYMK_7WW7_0001, the physical table test_pending contains duplicate data 1 of the id field. As a result, the primary key cannot be added.

At this time, the logical table is in an inconsistent state:

  1. mysql> check table test_pending;
  2. +----------------------------------------+-------+----------+-------------------------------------------------------------------------------------------------------------+
  3. | TABLE | OP | MSG_TYPE | MSG_TEXT |
  4. +----------------------------------------+-------+----------+-------------------------------------------------------------------------------------------------------------+
  5. | ddltest_1562056402230oymk.test_pending | check | Error | Table 'DDLTEST_1562056402230OYMK_7WW7_0001.test_pending' find incorrect columns 'id', please recreate table |
  6. +----------------------------------------+-------+----------+-------------------------------------------------------------------------------------------------------------+
  7. 1 row in set (0.04 sec)

Other DDL statements also cannot be executed and the corresponding error is received:

  1. mysql> drop table test_pending;
  2. ERROR 4644 (HY000): [f5beae39d466000][10.81.69.55:3306][ddltest]ERR-CODE: [TDDL-4644][ERR_PENDING_DDL_JOB_EXISTS] Another DDL job '1106733441212637184' with operation 'ALTER_
  3. TABLE' is pending on ddltest.test_pending in ddltest. Please use SHOW DDL to check it, and then recover or rollback it using RECOVER DDL or ROLLBACK DDL, or just remove it us
  4. ing REMOVE DDL if you confirm that the pending job can be discarded.

Based on the common principles described above, you can select any of the following operations to process the pending job. The results are shown separately.

Option 1: Deduplicate the data to resume the DDL job and then continue to add the primary key.

Deduplicate the data to retain only one copy of the data as needed. You can delete the data in the DRDS instance or directly connect to the underlying ApsaraDB RDS for MySQL database and delete the duplicate data based on error messages:

  1. mysql> delete from test_pending where id=1 and age=20;
  2. Query OK, 1 row affected (0.07 sec)
  3. mysql> select * from test_pending order by id;
  4. +------+------+
  5. | id | age |
  6. +------+------+
  7. | 1 | 10 |
  8. | 2 | 20 |
  9. | 3 | 30 |
  10. +------+------+
  11. 3 rows in set (0.02 sec)

After you ensure that the table contains no duplicate data, resume the execution of the pending DDL job. If the execution is successful, the pending job is automatically cleared and the primary key is added.

  1. mysql> recover ddl 1106733441212637184;
  2. Query OK, 0 rows affected (1.28 sec)
  3. mysql> show full ddl\G
  4. Empty set (0.00 sec)
  5. mysql> show create table test_pending\G
  6. *************************** 1. row ***************************
  7. Table: test_pending
  8. Create Table: CREATE TABLE `test_pending` (
  9. `id` int(11) NOT NULL,
  10. `age` int(11) DEFAULT NULL,
  11. PRIMARY KEY (`id`),
  12. KEY `auto_shard_key_id` (`id`) USING BTREE
  13. ) ENGINE=InnoDB DEFAULT CHARSET=utf8 dbpartition by hash(`id`)
  14. 1 row in set (0.02 sec)
  15. mysql> check table test_pending;
  16. +----------------------------------------+-------+----------+----------+
  17. | TABLE | OP | MSG_TYPE | MSG_TEXT |
  18. +----------------------------------------+-------+----------+----------+
  19. | ddltest_1562056402230oymk.test_pending | check | status | OK |
  20. +----------------------------------------+-------+----------+----------+
  21. 1 row in set (0.10 sec)

Option 2: Directly delete the job and then delete the corresponding table (ensure that the test data is no longer used). You can subsequently re-create the table as needed.

  1. mysql> remove ddl 1106733441212637184;
  2. Query OK, 1 row affected (0.02 sec)
  3. mysql> drop table if exists test_pending;
  4. Query OK, 0 rows affected (0.44 sec)
  5. mysql> show tables like 'test_pending';
  6. Empty set (0.01 sec)