Starting September 1, 2024, the SQL template algorithm in Database Autonomy Service (DAS) is optimized to improve template accuracy and aggregation.
Background information
DAS uses the SQL template algorithm to process SQL statements into templates for aggregation and analysis in slow query and SQL audit scenarios. The previous algorithm cannot handle truncated SQL statements well. For example, truncation at unpredictable positions can cause template expansion, making aggregation difficult.
Effective date
Starting September 1, 2024, the SQL template algorithm for slow queries and SQL audits is optimized.
After September 1, 2024, no further notifications are sent for iterative optimizations to the SQL template algorithm. API responses reflect the latest changes.
Effective scope
The optimization applies to the following database engines: ApsaraDB RDS for MySQL, ApsaraDB RDS for PostgreSQL, ApsaraDB RDS for SQL Server, ApsaraDB RDS for MariaDB, and PolarDB for MySQL.
Content
The following optimizations are provided:
-
$$is recognized as a string constant delimiter for ApsaraDB RDS for PostgreSQL, and identifiers are no longer incorrectly replaced with question marks (?).For example, the following original SQL statement is used:
UPDATE "study" SET "name" = 'xiaoming', "ext" = $${"math":90,"english":91}$$ where id=128;Before optimization, the following SQL template is generated:
UPDATE ? SET ? = ?, ? = $${?:?,?:?}$$ where id=?;After optimization, the following SQL template is generated:
UPDATE "study" SET "name"=?,"ext"=? WHERE id=?; -
Numeric suffixes in table and column names are replaced with placeholders to reduce the number of templates.
For example, the following original SQL statement is used:
select * from [school_3].[class].[student_25];Before optimization, the following SQL template is generated:
select * from [school_3].[class].[student_25];After optimization, the following SQL template is generated:
SELECT * FROM [school_?].[class].[student_?]; -
For ApsaraDB RDS for SQL Server instances, SQL statement prefixes are removed to improve statement type parsing.
For example, the following original SQL statement is used:
(@P0 nvarchar(4000))select id, name from student WHERE name = @P0;Before optimization, the following SQL template is generated:
Generated SQL template: (@P0 nvarchar(?))select id, name from student WHERE name = @P0; Parsed type of the SQL statement: p0After optimization, the following SQL template is generated:
Generated SQL template: SELECT id,name FROM student WHERE name=?; Parsed type of the SQL statement: select -
Unnecessary spaces are removed and the original keyword case is preserved without affecting syntax.
For example, the following original SQL statement is used:
select `name` from `student` where `id` = 1 and (`name` = 'xiaoming' or `class` = 2);Before optimization, the following SQL template is generated:
SELECT `name` FROM `student` WHERE `id` = ? AND (`name` = ? OR `class` = ?)After optimization, the following SQL template is generated:
SELECT `name` FROM `student` WHERE `id`=? AND (`name`=? OR `class`=?); -
All parentheses (()) in the original SQL statement are retained.
For example, the following original SQL statement is used:
select `name` from `student` where `id` = 1 and (`name` = 'xiaoming');Before optimization, the following SQL template is generated:
SELECT `name` FROM `student` WHERE `id` = ? AND `name` = ?After optimization, the following SQL template is generated:
SELECT `name` FROM `student` WHERE `id`=? AND (`name`=?); -
The parentheses (()) that surround the case expression are no longer converted to "AS".
For example, the following original SQL statement is used:
select `name`, ( CASE WHEN score > 90 THEN 'A' END ) `grade` from `student`;Before optimization, the following SQL template is generated:
SELECT `name` , CASE WHEN score > ? THEN ? END AS `grade` FROM `student`After optimization, the following SQL template is generated:
SELECT `name`,(CASE WHEN score>? THEN ? END)`grade` FROM `student`; -
The content after a number sign (#) can be correctly parsed.
For example, the following original SQL statement is used:
select `name`, `#grade` from `student`;Before optimization, the following SQL template is generated:
SELECT `name`, `After optimization, the following SQL template is generated:
SELECT `name`,`#grade` FROM `student`; -
For truncated SQL statements, content in unpaired parentheses is discarded to reduce the number of templates.
For example, the following original SQL statement is used:
select `name`, `grade` from `student` where id = (select uid fromBefore optimization, the following SQL template is generated:
select `name`, `grade` from `student` where id = (select uid fromAfter optimization, the following SQL template is generated:
SELECT `name`,`grade` FROM `student` WHERE id= -
Similar expressions are merged to prevent template expansion.
For example, the following original SQL statement is used:
SELECT CASE WHEN score >= 90 THEN 'A' WHEN score >= 80 THEN 'B' WHEN score >= 70 THEN 'C' WHEN score >= 60 THEN 'D' ELSE 'F' END AS grade FROM students;Before optimization, the following SQL template is generated:
SELECT CASE WHEN score >= ? THEN ? WHEN score >= ? THEN ? WHEN score >= ? THEN ? WHEN score >= ? THEN ? ELSE ? END AS grade FROM studentsAfter optimization, the following SQL template is generated:
SELECT CASE WHEN score>=? THEN ? ELSE ? END AS grade FROM students;
Impact
-
After the optimization, the SQLHash value for a given SQLText changes when you call the DescribeSlowLogs or DescribeSlowLogRecords operation.
-
After the optimization, the SqlId value changes when you call the GetFullRequestStatResultByInstanceId or GetAsyncErrorRequestListByCode operation.