Reads:39987Replies:0
Solution of high CPU usage of Alibaba Cloud RDS SQL Server - non-SARG queries
Abstract
The most-encountered issue for Alibaba Cloud RDS SQL Server customers is high CPU usage which leads to slow responses of SQL Server services and query timeout, and even service suspension. This article, as the fourth one in this series, analyzes the solution to high CPU usage because of non-SARG queries. Introduction of issues "New guy, have you heard of the RDBMS non-SARG query? I heard of it for the first time today," asked Laruence confusedly. "Haha, Laruence. You are out of the loop. It's another important cause of high CPU usage in RDBMS databases. Let me explain what it is," New guy gets excited. Analysis of scenarios SARG is the abbreviation for Search Argument. Non-SARG, in simple words, means that the query predicate (especially the WHERE clause or the ON clause for connection operations) causes the index to be invalid. The query optimizer is not able to use the efficient Index Seek operation and has to switch to Index Scan operation with a lower efficiency, resulting in increased CPU usage and reduced query performance. Common non-SARG operations include the use of scalar functions in the WHERE statement such as Datediff, Dateadd, Year, Rtrim, Upper, Lower, and LIKE full fuzzy matching (in a format such as LIKE '% XXX%') or user-defined functions. Solution The core idea of solving non-SARG queries is to avoid the use of scalar functions in WHERE statements by means of equivalent rewriting of the query logic or optimization of functional designs. We will only illustrate the solution with the following common non-SARG operations as examples and other operations can follow suit. DATEDIFF For example, we need to query the order information records for the day. The non-SARG statement is as follows (the Datediff function is used in the WHERE statement): USE TestDb GO CREATE INDEX IX_OrderDate ON dbo.SalesOrder([OrderDate]) WITH (FILLFACTOR = 90); GO SET STATISTICS TIME ON SET STATISTICS IO ON -- find out sales info of today. SELECT ItemID, OrderQty, Price,OrderDate FROM dbo.SalesOrder WHERE DATEDIFF(DAY, OrderDate, GETDATE()) = 0 SET STATISTICS TIME OFF SET STATISTICS IO OFF From the performance index, the I/O consumption is 8406, CPU consumption is 171 milliseconds, the execution time is 211 milliseconds, and the screenshot is shown below: ![]() The execution plan window. The SQL Server optimizer adopts the Index Scan, and the screenshot is shown below: ![]() In fact, we can change our thinking logic. The data for the day is actually the records of the time greater than or equal to 0:00 of the day, and smaller than or equal to the current time. Then the equivalent logic code can be rewritten as follows: USE TestDb GO SET STATISTICS TIME ON SET STATISTICS IO ON DECLARE @today DATETIME = CONVERT(CHAR(10), GETDATE(), 120) ; SELECT ItemID, OrderQty, Price,OrderDate FROM dbo.SalesOrder WHERE OrderDate >= CONVERT(CHAR(10), GETDATE(), 120) AND OrderDate <= GETDATE(); SET STATISTICS TIME OFF SET STATISTICS IO OFF Here is the optimized performance index. The I/O consumption is 3, CPU consumption is 0 milliseconds, and the execution time is 109 milliseconds. ![]() The query optimizer adopted the more efficient Index Seek for execution plan operations and the screenshot is shown below: ![]() From the performance index comparison, we can see that the I/O dropped from 8406 to 3, the CPU consumption from 171 milliseconds down to 0 milliseconds, and the execution time from 211 to 109 milliseconds. In particular, the I/O and CPU improvement is very remarkable. UPPER/LOWER Because the default sorting rule in SQL Server is case-insensitive, in other words, SQL Server deems UPPER as identical with upper, the UPPER or LOWER functions can be ignored during character string comparison operations to avoid non-SARG queries and increased CPU usage. Of course, it's very important that you confirm that your database is case-insensitive before rewriting non-SARG queries, that is, case-insensitive keywords are contained. The method to check is as follows: USE TestDb GO EXEC sys.sp_helpsort The result is shown in the figure below: ![]() RTRIM Some people like to use the RTRIM function to cancel the space at the rightmost of the character string before comparison when writing character string comparison. In fact, this approach is superfluous and completely unnecessary. The reason is that SQL Server will automatically ignore the rightmost space when doing string comparisons. As a result, this operation will result in non-SARG queries which will increase CPU usage. We can write a very simple example to prove this point. In the IF statement, the RTRIM on the left of the equal sign is closely followed by a space character, and the TRIM string on the right of the equal sign does not include a space, but the result of the IF statement is true. The code is shown as below: IF 'TRIM ' = 'TRIM' PRINT 'equals' ELSE PRINT 'not equal' The result is shown in the figure below: ![]() LIKE full fuzzy matching The LIKE full fuzzy matching in the WHERE statement will also cause invalidity of the SQL Server indexes, so it is also a type of non-SARG query Optimization of this kind of scenario includes two methods: the first method is to optimize the design. For example, we often encounter the scenario of carrying out full fuzzy matching through the phone number, but the input telephone numbers in customers' queries are often complete ones. Here we can change the design to only support the query of complete phone numbers, that is, changing the LIKE full fuzzy matching into the equate operation, avoiding non-SARG operations. The second solution is that if full fuzzy matching queries are unavoidable in the design, we can choose the SQL Server Fulltext technology to solve the full fuzzy queries of LIKE statements, so as to avoid non-SARG operations and reduce CPU usage. Summary This article analyzes the reasons for non-SARG query statements to cause increased Alibaba Cloud RDS SQL Server CPU usage and determines that the core idea of solving such problems is to rewrite the semantic logic to avoid non-SARG queries that result in high CPU usage. It also illustrates several typical methods to rewrite non-SARG queries. Reference Alibaba Cloud RDS [Adolph edited the post at Jul 13, 2017 16:12 PM]
|
|