Adolph
Engineer
Engineer
  • UID623
  • Fans2
  • Follows1
  • Posts72
Reads:765Replies:0

Solution of high CPU usage of Alibaba Cloud RDS SQL Server - data type conversion

Created#
More Posted time:May 3, 2017 9:18 AM
Abstract
In the first two articles, we discussed two important reasons leading to high CPU usage: missing indexes and index fragmentation. The third article of this series will discuss the implicit conversion of data types.
Analysis of scenarios
In SQL Server, the data types at both ends of the comparison operator (greater than, smaller than, equal to, or connect) must be consistent for the comparison to work. Otherwise, SQL Server will implicitly convert the data type according to the priority level from low to high before the comparison. This behavior can be seen by the CONVERT_IMPLICIT keyword in the execution plan. We can also witness this point in later examples. Unfortunately, the implicit conversion of data types in the formal table fields of SQL Server will introduce several problems:
The data has been converted, so the execution plan cannot adopt the better Index Seek, but instead uses Index Scan which has slightly inferior performance
 The adoption of Index Scan makes the I/O consumption very high
 Data type conversion calculation and soaring I/O result in very high CPU usage
The reference table of the SQL Server data type conversion is shown below (picture from Microsoft's official website):
 
Solve the problem
Here, we break down this example into five small steps:
 Test environment: Build a simple test environment.
Execute the query: SQL Server will implicitly convert the data in the formal table field into the INT type
 Data type implicit conversion: The expression form of implicit conversion of data types
 Solve the problem: Modify the type of data on the right of the comparison operator for the query statement to the string format
 Efficiency comparison: Query the performance comparison before and after the modification
Test environment
We will make slight modifications on the test table in Solution to high CPU usage of Alibaba Cloud RDS SQL Server - missing index, one article in this series: delete the index > modify the field ItemID into the VARCHAR data type > create the index. The code is as follows:
USE TestDb
GO

--===Drop index included
DROP INDEX IX_UserID_OrderDate_@ItemID_@OrderQty_@Price
ON dbo.SalesOrder;
GO

--===Change Data type for Testing Data convertion
ALTER TABLE dbo.SalesOrder
ALTER COLUMN ItemID VARCHAR(8) NOT NULL
GO

--=== Create Indexes
CREATE INDEX IX_ItemID_UserID_OrderDate_@OrderQty_@Price
ON dbo.SalesOrder([ItemID],[UserID],[OrderDate])
INCLUDE ([OrderQty], [Price])
WITH (FILLFACTOR = 85);
GO

EXEC sys.sp_help 'dbo.SalesOrder'
GO

From the result, the ItemID has changed to the VARCHAR data type, as shown below:



Execute the query
Now we can execute the following query statement. Activate the actual execution plan and pay attention to the WHERE ItemID = 250 clause. The data type on the right of the equal sign is INT which has a higher priority level than the VARCHAR data type on the left. Therefore, SQL Server must first implicitly convert the types of all the data stored in this field in the formal table to INT, and then compare the data with 250. This conversion operation will result in soaring I/O, failure of indexes and high CPU usage.
USE TestDb
GO
SET STATISTICS TIME ON
SET STATISTICS IO ON

SELECT
    ItemID, OrderQty, Price
FROM dbo.SalesOrder
WHERE ItemID = 250
    AND OrderDate >= DATEADD(MONTH, -1, GETDATE())
    AND OrderDate <= GETDATE();

SET STATISTICS TIME OFF
SET STATISTICS IO OFF


From the performance index page, we can see that the I/O consumption is 14950, the CPU consumption is 327 milliseconds, the execution time is 362 milliseconds, and the screenshot is shown below:


Implicit conversion of data types
We can see from the CONVERT_IMPLICIT keyword in the execution plan that SQL Server made implicit conversion of data types for the data in the formal table. The execution plan screenshot is shown as follows:


Solve the problem
After clear analysis of the problem, the solution becomes very simple. We only need to ensure the consistency of data types on both ends of the comparison operator so that SQL Server will not implicitly convert the types of data in the formal table fields, so that the I/O and CPU overhead can be greatly reduced. Therefore, we can modify the WHERE ItemID = 250 clause to WHERE ItemID = '250'. The query after the modification is as follows:
USE TestDb
GO
SET STATISTICS TIME ON
SET STATISTICS IO ON

SELECT
    ItemID, OrderQty, Price
FROM dbo.SalesOrder
WHERE ItemID = '250'        --correct here to ItemID = '250'
    AND OrderDate >= DATEADD(MONTH, -1, GETDATE())
    AND OrderDate <= GETDATE();

SET STATISTICS TIME OFF
SET STATISTICS IO OFF


The performance index page shows that the I/O consumption is 61, the CPU consumption is 16 milliseconds, the execution time is 190 milliseconds, and the screenshot is shown below:


The execution plan after the query data is modified adopted the Index Seek and there was no CONVERT_IMPLICIT keyword. The execution plan screenshot is shown as follows:

Performance comparison
We compared the performance index data before and after the query statement was modified. The execution time is shortened by 47.51%, the CPU consumption improved by 95.1%, the I/O consumption improved by 99.6%, and the average performance increased by 80.74%. The comparison data diagram is as follows:


Summary
The implicit conversion of data types in the formal table fields by SQL Server is another major cause of high CPU usage. This article details the causes, representation and solutions to the problem. This serves as a reminder to always remember to maintain data type consistency at both ends of the comparison operator when writing query statements.
References
Data Type Conversion
Alibaba Cloud RDS
[Adolph edited the post at Jul 13, 2017 16:12 PM]
Guest