Online indexing in SQLServer - Alibaba Cloud Developer Forums: Cloud Discussion Forums

Elizabeth
Engineer
Engineer
  • UID625
  • Fans3
  • Follows1
  • Posts68
Reads:1692Replies:0

[MySQL]Online indexing in SQLServer

Created#
More Posted time:Oct 19, 2016 16:38 PM
Supported versions
First, SQLServer only supports online indexing in the Enterprise, Developer and Evaluation versions.
Process
Suppose we want to create a clustered index for a heap table online:


There are three phases: the preparation phase, the build phase and the final phase;
Preparation phase: This phase is very short. The row version of the table for operation is created, the metadata is updated to create an index and the S lock and IS lock are implemented on the table;
Build phase: This phase is the primary phase. It scans, sorts and merges the data in the original table, and inserts the target index through bulk load and implements the IS lock on the table. The users’ add, delete, edit and query operations in this phase will act on the old and the new indexes, which is also the root cause as to why online indexing will increase the overhead;
Because both the source and target structures are maintained during the online index operation, the resource usage for insert, update, and delete transactions is increased, potentially up to double. This could cause a decrease in performance and greater resource usage, especially CPU time, during the index operation.
Final phase: This phase is very short. The metadata will be updated and the original table may be deleted, with a heavy lock implemented (SCH-M). In addition, the uncommitted update operations before the start of this phase must be completed (otherwise, the operation will be congested in this phase forever).
Whether new DML operations will be congested is dependent on whether there is a conflict between the lock required by the operation and the SCH-M lock (conflict always exists except it is nolock).
Notes worth attention
1. Online indexing: If I want to modify an existing index or the operation influences an existing index (for example, modifying a clustered index will rebuild the non-clustered index), will the existing index still be available?
A: Yes.
2. In which cases is online indexing not supported?
o When the created or re-built index is a clustered index and the table for the operation contains big fields, including:  image, ntext, text, varchar (max), nvarchar (max), varbinary (max), and xml.
o When the created or re-built index is a non-clustered index, but the key or nonkey (include) columns involved contain big fields.
o When the created or re-built index acts on a local temporary table.
3. The online indexing may lead to a deadlock. If so, the victim will be other transactions.
Test
1. The test_1 is a heap table with no indexes, but only a small amount of data.
In session 111, execute the following SQL statements:

begin tran
create clustered index CIX_test_1_c1
on test_1 (c1)
with(online=on)
--rollback tran


2. View the locks obtained by session 111.
select * from sys.dm_tran_locks where request_session_id=111


3. Other DML operations are congested.


This test extends the architecture modification lock in the 3rd phase and congests the DML operations. In actual production environments, most of the time spent should be in the 2nd phase. So what we call “no influence to DML” is not totally influence-free.
So what we call “no influence to DML” is not totally influence-free, just most of the time.
Issues from online indexing
DBAs should be clear-minded about the potential issues from online indexing to tables with a few locks, with the CASE for example.
I have a scenario where I need to create a Clustered Index (CI) on a very large SQL Server 2012 database table. This table has about approximately 10 billion rows, 500 GB in size. The job ran for about 20 hours into it and then fails with error: "Out of disk space in tempdb". My tempDB size is 1.8TB, but yet it's still not enough.
I have exhausted Google search and decided to post it here to get some extra help.
CREATE CLUSTERED INDEX CI_IndexName

ON TableName(Column1,Column2)

WITH (MAXDOP= 4, ONLINE=ON, SORT_IN_TEMPDB = ON, DATA_COMPRESSION=PAGE)

ON sh_WeekDT(Day_DT)

GO


10 billion rows of records, 500GB in size, adding an index for 20 hours and an error prompted, saying the tempdb space is insufficient. How can we solve the issue?
After you understand the online indexing, you can give a suggestion: give up online indexing and change to offline indexing. The one who raised the question made the change, and ran the job for 7.45 hours, and then succeeded.
Guest