• UID623
  • Fans4
  • Follows1
  • Posts72

Solution to high CPU usage of Alibaba Cloud RDS SQL Server - index fragmentation

More Posted time:May 2, 2017 14:06 PM
In the previous article we analyzed that one of the reasons for high CPU usage is a missing index. Next in this second article in this series we will discuss another common cause of high CPU usage, "index fragmentation", Removing index fragmentation is another edge tool for solving resolving slow responses of SQL Server services and query timeout.

Introduction of issues
"Laruence, I shared the topic of high CPU usage because of missing indexes in my last article and received good responses. Next, I plan to share the topic of high CPU usage caused by index fragmentation," the new guy took the initiative to report the work to Laruence.
For details of the last article, see the link: Solution to high CPU usage of Alibaba Cloud RDS SQL Server - missing index.

What is the index fragmentation?
Index fragmentation refers to both blank space in the index file page and the index page split by pages, as well as data pages with disorderly indexes. We call the first two types the internal fragmentation of the index, and the last one the external fragmentation of the index.
Having finished the literal theory descriptions, we can use a phone book as an example to help you easily understand it. If the phone book stores the contact phone numbers of all people in the form of city name, name and telephone number combinations, the phone book will need to be updated in any of the following situations:
 When a person changes his/her phone number or changes their name (Western women will often change their last name to their husband's after getting married, such as: Obama's wife is named Michelle Obama): An update is required
 When a person moves from one city to another: The old record should be deleted and a new record of the new city should be inserted
 When a new contact completes the telephone business procedures: A record should be inserted in the corresponding city
 When a telephone number is not used any more: The corresponding entry needs to be deleted
The consequence of these operations is that update operations may cause out-of-order data; delete operations may cause an empty space; and insert operations will cause a page split. The final result will be the internal fragmentation and external fragmentation of the phone book (similar to the index).

Hazards of the index fragmentation
Now that we know what index fragmentation is, let's look at the hazards resulting from index fragmentation.
For example, if there is a fully-used phone book (without any fragmentation) containing 1,000 pages, and the various operations mentioned above caused 10% fragmentation, we will ultimately need 1,100 pages to store the content. For each phone book, we will waste 100 printed pages, as well as 10% of the searching and reading time of every user. In terms of SQL Server index fragmentation, the principle is the same: because the minimal unit for SQL Server to read data is by data page, rather than a single record, the same query statement will need SQL Server to read from more of the disk. In addition, index fragmentation will waste more memory resources to store the read data. Therefore, the higher the degree of fragmentation, the more memory waste and the lower query performance. Microsoft recommends that the index fragmentation rate should be controlled between 5% to 30%, in which scenario index reorganization is required. When the fragmentation rate exceeds 30%, re-indexing is required.

We will describe the solution from the following aspects:
Simulate the generation of index fragmentation
 Get the index fragmentation information
 Re-indexing
 Comparison

Simulate the generation of index fragmentation
I assume that we need to change 1 million data records, including UPDATE, DELETE, and INSERT operations. Based on the Table dbo.SalesOrder created in the previous article, we can use the following methods to change the data:
USE TestDb


    @do INT = 0
    , @loop INT = 1000000
    , @sql NVARCHAR(MAX) = N''

WHILE @do < @loop
        SET @sql =
            WHEN @do % 3 = 0
                THEN N'DELETE TOP(1) FROM dbo.SalesOrder;'
            WHEN @do % 3 = 1
                THEN N'UPDATE TOP(1) A SET OrderQty = OrderQty + 1, Price = Price - 1, OrderDate = GETDATE() FROM dbo.SalesOrder AS A'
            WHEN @do % 3 = 2
                THEN N'INSERT INTO dbo.SalesOrder(OrderID, ItemID, UserID, OrderQty, Price, OrderDate, LastUpdateTime, OrderComment)
                            SELECT TOP 1 NEWID(),ItemID, UserID, OrderQty, Price, OrderDate, LastUpdateTime, OrderComment
                            FROM dbo.SalesOrder
            ELSE N''

        EXEC sys.sp_executesql @sql

        IF @do % 2000 = 0
            RAISERROR('%d rows', 10, 1, @do) WITH NOWAIT

        SET @do = @do + 1;

Get the index fragmentation information
We can use the system function sys.dm_db_index_physical_stats to get the index fragmentation information. The method for querying index fragmentation is as follows:
USE TestDb

        db_name(database_id) AS db_name
        ,object_name(ix.object_id) AS object_name
FROM sys.dm_db_index_physical_stats(db_id(),object_id('dbo.SalesOrder','U'),NULL,NULL,'LIMITED') AS fra
    CROSS APPLY sys.indexes AS ix WITH (NOLOCK)
WHERE ix.object_id = fra.object_id
    and ix.index_id = fra.index_id

EXEC sys.sp_spaceused SalesOrder

The query result is shown in the figure below:

Having identified how to solve the issue, the process will be simple. From the query result, I found the primary key fragmentation rate reached 92.2%, already very high. Re-indexing method:
USE TestDb

ON dbo.SalesOrder REBUILD

Execute the index fragmentation query again and the result is as follows:

After the index fragmentation is reorganized, the primary key fragmentation rate dropped from 92.2% to 0.1%. The index space, data space and total space sizes reduced by 4.3%, 52.8% and 72.4% respectively, and the average space was reduced by 43.17%.

You must be very careful when re-indexing a production environment, because of the following reasons:
 Re-indexing will consume a lot of system I/O reading and writing resources.
 Re-indexing will cause deadlocks or lock waits of the query process, especially in non-enterprise editions of SQL Server (The enterprise edition can use the ONLINE option to avoid this issue as much as possible).
 Re-indexing will lead to the database log files filling up quickly, and impose pressure on Database Mirroring, Log Shipping, and Backup.
Therefore, it is recommended you select a slow business period for index fragmentation reorganization operations.

This elaborates on index fragmentation – another common reason for high CPU usage – from the definition of index fragmentation, its hazards, solutions and notes of attention.

Stop Worrying About SQL Server Fragmentation
Reorganize Indexes
Alibaba Cloud RDS
[Cloudy edited the post at Jun 16, 2021 17:53 PM]