DBA daily management-Data Archiving (Archiving-Data)-Alibaba Cloud Developer Community

 

as databases become larger and larger, performance and management challenges will also become greater and greater. Each query may need to find more data pages, especially when the query has scan operations, resulting in slower queries. At the same time, more and more data needs to be backed up, and the backup operation lasts longer and longer. The data files used by backup files and databases will become larger and larger, and a series of problems will arise.

Data archiving becomes very important. Data archiving is closely related to capacity planning, but capacity planning will be described in other articles.

The goal of data archiving is to monitor the size and growth rate of data. It is particularly important that even small databases should be well planned, because you can hardly guarantee whether small databases will grow rapidly or become very important in a certain period of time in the future, planning well will have a very important impact on future management and optimization. In addition, archiving means that data needs to be saved for later use, that is, you cannot delete it directly. Many data warehouses need these data, but business databases do not always need to be saved, which reflects the importance of archiving.

In addition, the benefits of archiving, as mentioned at the beginning, reduce the size of backup files, speed up database restoration and resource overhead, reduce the overhead of database management, improve running performance, etc. When sorting data, you only need to sort out the activity data.

In short, data archiving and capacity planning are one of the key tasks of DBA or database management personnel.

 

 

1. Locate which data needs to be retained?

Generally, archived data has obvious time or business columns, which can better identify the data to be processed. However, it depends on the business requirements.

2. How to access these data when necessary?

You can access data in different ways based on the archive method.

3. What are the security requirements for archived data?

Because archived data no longer exists in corresponding tables, databases, or even servers, security issues should be taken into account when archiving, and storage of archived data should also be taken into consideration.

 

for the above questions, there are several aspects to consider:

  • if the data must be stored in the same database:

you can move the data to be archived to a new table. In practice, the archived table should have a prefix or suffix for daily use. At the same time, these tables are placed in a separate filegroup, because these files almost only have the "read-only" feature, so they can be put into a filegroup, on the one hand, the backup size can be reduced, you only need to use the file backup or partial backup function to back up the filegroup where the active data is located. For archived data, you only need to perform a regular backup. The restoration is also fast. On the other hand, because these filegroups are read-only, some performance improvement technologies that are effective for static data can be used, such as Index views and column storage indexes, in addition, read-only files and filegroups do not need to be locked, which can reduce lock contention.

However, if you perform a full backup, this part of data is still included in the backup file without reducing the file size. Therefore, for this type of archive, we recommend that you use partial backup or file backup. In addition, when you need to access the data, you usually use the view, merge the required data, and then display it. Generally, the performance of this display will not be greatly improved, because views are not used to improve performance and permission access is also considered, the impact is not very obvious because they are all in the same library.

For archiving in this case, 2005 and later versions have a very useful function-table partitioning, through which no logical changes can be made, however, it is physically divided into several zones. Ideally, data operations can only occur in a few or even one zone, with partition indexes, improved I/O utilization. If you add the appropriate data compression function (remember that it is not shrink, the function appeared in 2008), it can improve the I/O utilization and reduce the space utilization.

 The following example shows how to move unnecessary data to a new filegroup and perform access operations. This example includes two methods:

  1. for versions 2000 or 2005 or later that do not support Table partitions, such as the standard version, this method can only be used.
  2. It demonstrates how to use table partitions.
The following example demonstrates how to use table partitions:

1. Create a filegroup and create a file to store archived data separately. In this example, the Sales.SalesOrderHeader table in the sample database AdventureWorks2012 is used to archive the data of the table in 2006:

   1:  USE master
   2:   
   3:  GO
   4:   
   5:  ALTERDATABASE AdventureWorks2012
   6:   
   7:  ADD FILEGROUP Test1FG1;
   8:   
   9:  GO
  10:   
  11:  ALTERDATABASE AdventureWorks2012 
  12:   
  13:  ADDFILE
  14:   
  15:  (
  16:   
  17:  NAME = test1dat3,
  18:   
  19:  FILENAME = 'D:\DB_Data\t1dat3.ndf',
  20:   
  21:  SIZE = 5MB,
  22:   
  23:  MAXSIZE = 100MB,
  24:   
  25:  FILEGROWTH = 5MB
  26:   
  27:  ),
  28:   
  29:  (
  30:   
  31:  NAME = test1dat4,
  32:   
  33:  FILENAME = 'D:\DB_Data\t1dat4.ndf',
  34:   
  35:  SIZE = 5MB,
  36:   
  37:  MAXSIZE = 100MB,
  38:   
  39:  FILEGROWTH = 5MB
  40:   
  41:  )
  42:   
  43:  TO FILEGROUP Test1FG1;
  44:   
  45:  GO

 

view filegroups and file information:

   1:  SELECT * FROM sys.filegroups

the result is as follows:

view File information:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  SELECT file_guid,name,physical_name 
   6:   
   7:  FROM sys.database_files

 

the result is as follows:

you can see that two files have been created.

2. Create an archive table in the file of the new filegroup and insert the following data:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  /*
   6:   
   7:  由于需要指定表所在的文件,所以不能用select * into 来创建表
   8:   
   9:  */
  10:   
  11:  IF OBJECT_ID(N'Sales.Ar_SalesOrderHeader','U') ISNULL
  12:   
  13:  CREATETABLE Sales.Ar_SalesOrderHeader
  14:   
  15:  (
  16:   
  17:  [SalesOrderID] [INT] IDENTITY(1, 1) NOTFORREPLICATIONNOTNULL,
  18:   
  19:  [RevisionNumber] [TINYINT] NOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_RevisionNumber] DEFAULT ((0)),
  20:   
  21:  [OrderDate] [DATETIME] NOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_OrderDate] DEFAULT (Getdate()),
  22:   
  23:  [DueDate] [DATETIME] NOTNULL,
  24:   
  25:  [ShipDate] [DATETIME] NULL,
  26:   
  27:  [Status] [TINYINT] NOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_Status] DEFAULT ((1)),
  28:   
  29:  [OnlineOrderFlag] [dbo].[FLAG] NOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_OnlineOrderFlag] DEFAULT ((1)),
  30:   
  31:  [SalesOrderNumber] AS ( Isnull(N'SO'
  32:   
  33:  + CONVERT([NVARCHAR](23), [SalesOrderID]), N'*** ERROR ***') ),
  34:   
  35:  [PurchaseOrderNumber] [dbo].[ORDERNUMBER] NULL,
  36:   
  37:  [AccountNumber] [dbo].[ACCOUNTNUMBER] NULL,
  38:   
  39:  [CustomerID] [INT] NOTNULL,
  40:   
  41:  [SalesPersonID] [INT] NULL,
  42:   
  43:  [TerritoryID] [INT] NULL,
  44:   
  45:  [BillToAddressID] [INT] NOTNULL,
  46:   
  47:  [ShipToAddressID] [INT] NOTNULL,
  48:   
  49:  [ShipMethodID] [INT] NOTNULL,
  50:   
  51:  [CreditCardID] [INT] NULL,
  52:   
  53:  [CreditCardApprovalCode] [VARCHAR](15) NULL,
  54:   
  55:  [CurrencyRateID] [INT] NULL,
  56:   
  57:  [SubTotal] [MONEY] NOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_SubTotal] DEFAULT ((0.00)),
  58:   
  59:  [TaxAmt] [MONEY] NOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_TaxAmt] DEFAULT ((0.00)),
  60:   
  61:  [Freight] [MONEY] NOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_Freight] DEFAULT ((0.00)),
  62:   
  63:  [TotalDue] AS ( Isnull(( [SubTotal] + [TaxAmt] ) + [Freight], ( 0 )) ),
  64:   
  65:  [Comment] [NVARCHAR](128) NULL,
  66:   
  67:  [rowguid] [UNIQUEIDENTIFIER] ROWGUIDCOLNOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_rowguid] DEFAULT (Newid()),
  68:   
  69:  [ModifiedDate] [DATETIME] NOTNULLCONSTRAINT [DF_Arc_SalesOrderHeader_ModifiedDate] DEFAULT (Getdate()),
  70:   

71: CONSTRAINT [PK_Arc_SalesOrderHeader_SalesOrderID] PRIMARY  KEY  CLUSTERED ([SalesOrderID] ASC )

 

  72:   WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON
  73:  )
  74:   
  75:  ON [Test1FG1]

 

insert data:

   1:  USE AdventureWorks2012 --使用目标数据库,特别是对于有自增列的表归档操作,更需要使用目标数据库
   2:   
   3:  GO
   4:   
   5:  SETIDENTITY_INSERT Sales.Ar_SalesOrderHeader ON;--由于这个表存在自增主键,所以需要使用这个SET选项
   6:   
   7:  INSERT INTO Sales.Ar_SalesOrderHeader
   8:   
   9:  ([SalesOrderID],
  10:   
  11:  [RevisionNumber],
  12:   
  13:  [OrderDate],
  14:   
  15:  [DueDate],
  16:   
  17:  [ShipDate],
  18:   
  19:  [Status],
  20:   
  21:  [OnlineOrderFlag],
  22:   
  23:  --[SalesOrderNumber],--这个列在表中是计算列,所以不用插入
  24:   
  25:  [PurchaseOrderNumber],
  26:   
  27:  [AccountNumber],
  28:   
  29:  [CustomerID],
  30:   
  31:  [SalesPersonID],
  32:   
  33:  [TerritoryID],
  34:   
  35:  [BillToAddressID],
  36:   
  37:  [ShipToAddressID],
  38:   
  39:  [ShipMethodID],
  40:   
  41:  [CreditCardID],
  42:   
  43:  [CreditCardApprovalCode],
  44:   
  45:  [CurrencyRateID],
  46:   
  47:  [SubTotal],
  48:   
  49:  [TaxAmt],
  50:   
  51:  [Freight],
  52:   
  53:  --[TotalDue],--这个列在表中是计算列,所以不用插入
  54:   
  55:  [Comment],
  56:   
  57:  [rowguid],
  58:   
  59:  [ModifiedDate])
  60:   
  61:  SELECT [SalesOrderID],
  62:   
  63:  [RevisionNumber],
  64:   
  65:  [OrderDate],
  66:   
  67:  [DueDate],
  68:   
  69:  [ShipDate],
  70:   
  71:  [Status],
  72:   
  73:  [OnlineOrderFlag],
  74:   
  75:  --[SalesOrderNumber],
  76:   
  77:  [PurchaseOrderNumber],
  78:   
  79:  [AccountNumber],
  80:   
  81:  [CustomerID],
  82:   
  83:  [SalesPersonID],
  84:   
  85:  [TerritoryID],
  86:   
  87:  [BillToAddressID],
  88:   
  89:  [ShipToAddressID],
  90:   
  91:  [ShipMethodID],
  92:   
  93:  [CreditCardID],
  94:   
  95:  [CreditCardApprovalCode],
  96:   
  97:  [CurrencyRateID],
  98:   
  99:  [SubTotal],
 100:   
 101:  [TaxAmt],
 102:   
 103:  [Freight],
 104:   
 105:  --[TotalDue],
 106:   
 107:  [Comment],
 108:   
 109:  [rowguid],
 110:   
 111:  [ModifiedDate]
 112:   
 113:  FROM [Sales].[SalesOrderHeader]
 114:   
 115:  WHERECONVERT(NVARCHAR(30), OrderDate, 121) BETWEEN'2006-01-01 00:00:00.000'AND'2006-12-31 23:59:59.997'
 116:   
 117:  SETIDENTITY_INSERT Sales.Ar_SalesOrderHeader OFF;

 

verification data:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  SELECT * FROM Sales.Ar_SalesOrderHeader
   6:   
   7:  EXCEPT
   8:   
   9:  SELECT * FROM Sales.SalesOrderHeader

the result is as follows:

note that the EXCEPT set operator appeared in 2005 is used here. If there is data, it is proved that there is inconsistent data. No data proves that the two tables are exactly the same. For 2000, NOT EXISTS can be used.

3. Delete the data in the source table and archive the data:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  DELETEFROM Sales.SalesOrderHeader
   6:   
   7:  WHERECONVERT(NVARCHAR(30), OrderDate, 121) BETWEEN'2006-01-01 00:00:00.000'AND'2006-12-31 23:59:59.997'

4. We can look at the filegroup where the archived table is located:

you can set this filegroup to read-only during the non-Archive period. You can ignore this part of backup and restoration. The knowledge about backup and restoration will be described in other articles.

Now Let's demonstrate how to archive data with table partitions:

this section will only be a demonstration without too much explanation. The detailed description of partitions will be reflected in other articles. One of the advantages of using partitions is that you do not need to modify programs and query statements. In other ways, you need to modify query statements by using views or adding union/union all.

1. Check the table to be queried. Because there is no suitable table in the sample database for demonstration, and there are already partitions on it, create a new table:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  DELETEFROM Sales.SalesOrderHeader
   6:   
   7:  WHERECONVERT(NVARCHAR(30), OrderDate, 121) BETWEEN'2006-01-01 00:00:00.000'AND'2006-12-31 23:59:59.997'

2. Check the year segment on the table:

   1:  SELECTDISTINCT DATEPART(yyyy,OrderDate)
   2:   
   3:  FROM Sales.SalesOrderHeader_Demo
   4:   
   5:  ORDERBY DATEPART(yyyy,OrderDate)

result:

assume that we only need to query the data of the current year, so that we can use it as the basis for partitioning over the past few years.

Step 1: Create a partition function:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  /*
   6:   
   7:  注意:所有用户自定义数据类型、别名数据类型、时间戳、图像、XML、varchar(max)、nvarchar(max)及varbinary(max)都不能做为分区列
   8:   
   9:  这里使用OrderDate作为分区依据,但是分区函数中输入的是数据类型,而不是列名,详见联机丛书说明
  10:   
  11:  */
  12:   
  13:  CREATE PARTITION FUNCTION PFL_Years(DATETIME)
  14:   
  15:  AS RANGE LEFT --LFET代表包含VALUES中的值,也就是说这个例子中,第一个分区包含小于等于2005-12-31 23:59:59.997的数据,
  16:   
  17:  FORVALUES('2005-12-31 23:59:59.997','2006-12-31 23:59:59.997','2007-12-31 23:59:59.997','2008-12-31 23:59:59.997','2009-12-31 23:59:59.997')
  18:   
  19:  GO

 

step 2: Create a filegroup. You may not create partitions. However, as a best practice, you can put archived data into a new filegroup, you can share the overhead (the filegroups of the same database are placed on different physical disks), reduce the impact of database damage, and so on. Move the filegroup where archived data is stored to a lower I/O disk and move the filegroup where active data is stored to a higher I/O physical disk can improve performance.

   1:  USE [master]
   2:   
   3:  GO
   4:   
   5:  ALTERDATABASE [AdventureWorks2012] ADD FILEGROUP [CY00]
   6:   
   7:  GO
   8:   
   9:  ALTERDATABASE [AdventureWorks2012] ADD FILEGROUP [CY02]
  10:   
  11:  GO
  12:   
  13:  ALTERDATABASE [AdventureWorks2012] ADD FILEGROUP [CY04]
  14:   
  15:  GO
  16:   
  17:  ALTERDATABASE [AdventureWorks2012] ADD FILEGROUP [CY05]
  18:   
  19:  GO
  20:   
  21:  ALTERDATABASE [AdventureWorks2012] ADD FILEGROUP [CY06]
  22:   
  23:  GO
  24:   
  25:  ALTERDATABASE [AdventureWorks2012] ADD FILEGROUP [CY07]
  26:   
  27:  GO

Create an object to each filegroup:

   1:  USE [master]
   2:   
   3:  GO
   4:   
   5:  ALTERDATABASE [AdventureWorks2012] ADDFILE ( NAME = N'test', FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\test.ndf' , SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [CY00]
   6:   
   7:  GO
   8:   
   9:  ALTERDATABASE [AdventureWorks2012] ADDFILE ( NAME = N'test1', FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\test1.ndf' , SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [CY02]
  10:   
  11:  GO
  12:   
  13:  ALTERDATABASE [AdventureWorks2012] ADDFILE ( NAME = N'test2', FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\test2.ndf' , SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [CY04]
  14:   
  15:  GO
  16:   
  17:  ALTERDATABASE [AdventureWorks2012] ADDFILE ( NAME = N'test3', FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\test3.ndf' , SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [CY05]
  18:   
  19:  GO
  20:   
  21:  ALTERDATABASE [AdventureWorks2012] ADDFILE ( NAME = N'test4', FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\test4.ndf' , SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [CY06]
  22:   
  23:  GO
  24:   
  25:  ALTERDATABASE [AdventureWorks2012] ADDFILE ( NAME = N'test5', FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\test5.ndf' , SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [CY07]
  26:   
  27:  GO

step 3: Create a partition scheme:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  CREATE PARTITION SCHEME CYScheme
   6:   
   7:  AS
   8:   
   9:  PARTITION PFL_Years
  10:   
  11:  TO ([CY00],[CY02],[CY04],[CY05],[CY06],[CY07])

step 4: partition the target table:

   1:  USE [AdventureWorks2012]
   2:   
   3:  GO
   4:   
   5:  BEGINTRANSACTION
   6:   
   7:  CREATECLUSTEREDINDEX [ClusteredIndex_on_CYScheme_635139020318369500] ON [Sales].[SalesOrderHeader_Demo]
   8:   
   9:  (
  10:   
  11:  [OrderDate]
  12:   
  13:  )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [CYScheme]([OrderDate])
  14:   
  15:  DROPINDEX [ClusteredIndex_on_CYScheme_635139020318369500] ON [Sales].[SalesOrderHeader_Demo]
  16:   
  17:  COMMITTRANSACTION

step 5: check the partition status:

   1:  SELECT * FROM sys.partitions
   2:   
   3:  WHERE object_id=OBJECT_ID(N'[Sales].[SalesOrderHeader_Demo]')

at this point, partitioning is not the focus of this article, so many details are not explained, just tell readers that you can use partitions to archive data.

  • If data does not need to exist in the same database:

in this case, the performance is significantly improved. You can simply query the archived data to the corresponding table in the new database, the new database can be on the same server or different servers, and then delete the data from the original database. In this way, the data volume on the active database is reduced, and the "slimming" effect is also achieved. Because archiving is usually initiated periodically, you can set the archive library to read-only during two archiving periods, which reduces the overhead of accessing the library and does not need to back up the library frequently. Since the activity Library has been reduced, the overhead in query and management can be maintained to a certain extent, and will not increase significantly with the increase of time.

As above, when you need to archive data, you can use views or linked servers to merge data. However, you need to add additional permissions on different databases or even servers, generally, read-only permissions are required. We recommend that you do not use a role with high permissions.

Example:

1. Locate the data to be archived. In this example, assume that the table Sales in the sample database AdventureWorks2012. SalesOrderHeader need to be archived, this table has obvious archive columns: OrderDate, we assume that the orders generated in 2005 are archived (the 2012 database contains the order data from 2005 to 2008), you need to create an archive library first, and then create an identical archive table:

   1:  USE master
   2:   
   3:  GO
   4:   
   5:  /*
   6:   
   7:  判断是否存在库,不存在才创建,记住不要使用“存在则删除”的逻辑,否则会把归档数据删掉
   8:   
   9:  */
  10:   
  11:  IF DB_ID(N'Arc_AdventureWorks2012') ISNULL
  12:   
  13:  BEGIN
  14:   
  15:  CREATEDATABASE Arc_AdventureWorks2012
  16:   
  17:  END
  18:   
  19:  /*
  20:   
  21:  判断表是否存在,不存在则创建,由于示例数据库并不是dbo架构,
  22:   
  23:  所以这里先创建一个架构,如果需要归档的表是dbo,那么没必要额外创建
  24:   
  25:  */
  26:   
  27:  USE Arc_AdventureWorks2012
  28:   
  29:  GO
  30:   
  31:  CREATESCHEMA [Sales] AUTHORIZATION [db_owner]--这一步要单独执行
  32:   
  33:  GO
  34:   
  35:  IF OBJECT_ID(N'Sales.Arc_SalesOrderHeader') ISNULL
  36:   
  37:  BEGIN
  38:   
  39:  SELECT * INTO Sales.Arc_SalesOrderHeader FROM AdventureWorks2012.Sales.SalesOrderHeader WHERE 1=2 --创建表结构ENDND
  40:   
  41:  END

enable SSMS check:

2. Check the 2005 data in the activity Library and insert it into the archive library. After the insertion is successful, delete the data in the activity Library:

   1:  USE Arc_AdventureWorks2012 --使用目标数据库,特别是对于有自增列的表归档操作,更需要使用目标数据库
   2:   
   3:  GO
   4:   
   5:  SETIDENTITY_INSERT [Sales].[Arc_SalesOrderHeader] ON;--由于这个表存在自增主键,所以需要使用这个SET选项
   6:   
   7:  INSERT INTO [Sales].[Arc_SalesOrderHeader]
   8:   
   9:  ([SalesOrderID],
  10:   
  11:  [RevisionNumber],
  12:   
  13:  [OrderDate],
  14:   
  15:  [DueDate],
  16:   
  17:  [ShipDate],
  18:   
  19:  [Status],
  20:   
  21:  [OnlineOrderFlag],
  22:   
  23:  [SalesOrderNumber],
  24:   
  25:  [PurchaseOrderNumber],
  26:   
  27:  [AccountNumber],
  28:   
  29:  [CustomerID],
  30:   
  31:  [SalesPersonID],
  32:   
  33:  [TerritoryID],
  34:   
  35:  [BillToAddressID],
  36:   
  37:  [ShipToAddressID],
  38:   
  39:  [ShipMethodID],
  40:   
  41:  [CreditCardID],
  42:   
  43:  [CreditCardApprovalCode],
  44:   
  45:  [CurrencyRateID],
  46:   
  47:  [SubTotal],
  48:   
  49:  [TaxAmt],
  50:   
  51:  [Freight],
  52:   
  53:  [TotalDue],
  54:   
  55:  [Comment],
  56:   
  57:  [rowguid],
  58:   
  59:  [ModifiedDate])
  60:   
  61:  SELECT [SalesOrderID],
  62:   
  63:  [RevisionNumber],
  64:   
  65:  [OrderDate],
  66:   
  67:  [DueDate],
  68:   
  69:  [ShipDate],
  70:   
  71:  [Status],
  72:   
  73:  [OnlineOrderFlag],
  74:   
  75:  [SalesOrderNumber],
  76:   
  77:  [PurchaseOrderNumber],
  78:   
  79:  [AccountNumber],
  80:   
  81:  [CustomerID],
  82:   
  83:  [SalesPersonID],
  84:   
  85:  [TerritoryID],
  86:   
  87:  [BillToAddressID],
  88:   
  89:  [ShipToAddressID],
  90:   
  91:  [ShipMethodID],
  92:   
  93:  [CreditCardID],
  94:   
  95:  [CreditCardApprovalCode],
  96:   
  97:  [CurrencyRateID],
  98:   
  99:  [SubTotal],
 100:   
 101:  [TaxAmt],
 102:   
 103:  [Freight],
 104:   
 105:  [TotalDue],
 106:   
 107:  [Comment],
 108:   
 109:  [rowguid],
 110:   
 111:  [ModifiedDate]
 112:   
 113:  FROM AdventureWorks2012.[Sales].[SalesOrderHeader]
 114:   
 115:  WHERECONVERT(NVARCHAR(30), OrderDate, 121) BETWEEN'2005-01-01 00:00:00.000'AND'2005-12-31 23:59:59.997'
 116:   
 117:  SETIDENTITY_INSERT [Sales].[Arc_SalesOrderHeader] OFF;

verification data:

   1:  USE Arc_AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  SELECT *
   6:   
   7:  FROM [AdventureWorks2012].[Sales].[SalesOrderHeader]
   8:   
   9:  WHERECONVERT(NVARCHAR(30), OrderDate, 121) BETWEEN'2005-01-01 00:00:00.000'AND'2005-12-31 23:59:59.997'
  10:   
  11:  EXCEPT
  12:   
  13:  SELECT *
  14:   
  15:  FROM [Arc_AdventureWorks2012].[Sales].[Arc_SalesOrderHeader]
  16:   
  17:  WHERECONVERT(NVARCHAR(30), OrderDate, 121) BETWEEN'2005-01-01 00:00:00.000'AND'2005-12-31 23:59:59.997'

the result is as follows:

note that the EXCEPT set operator appeared in 2005 is used here. If there is data, it is proved that there is inconsistent data. No data proves that the two tables are exactly the same. For 2000, NOT EXISTS can be used.

After verification, delete the corresponding data of the active library. Note that the selected library does not use the archive library:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  /*
   6:   
   7:  切记反复验证WHERE 条件和表名
   8:   
   9:  */
  10:   
  11:  DELETEFROM Sales.SalesOrderHeader
  12:   
  13:  WHERECONVERT(NVARCHAR(30), OrderDate, 121)
  14:   
  15:  BETWEEN'2005-01-01 00:00:00.000'AND'2005-12-31 23:59:59.997'
  16:   
  17:  GO
  18:   
  19:  /*
  20:   
  21:  验证是否删除成功
  22:   
  23:  */
  24:   
  25:  IF (SELECTCOUNT(1) FROM Sales.SalesOrderHeader
  26:   
  27:  WHERECONVERT(NVARCHAR(30), OrderDate, 121)
  28:   
  29:  BETWEEN'2005-01-01 00:00:00.000'AND'2005-12-31 23:59:59.997')=0
  30:   
  31:  BEGIN
  32:   
  33:  SELECT'删除成功'
  34:   
  35:  END

at this point, the data has been archived successfully. For some data with foreign key associations or business associations, more operations are required, but the steps are similar to those shown here.

3. To use archived data, assume that you have been granted the permission to access the corresponding table of the archive database for a specific user on the AdventureWorks2012, you only need to create a view to merge the query, of course, you can also not use the view:

   1:  USE AdventureWorks2012
   2:   
   3:  GO
   4:   
   5:  /*
   6:   
   7:  创建视图
   8:   
   9:  */
  10:   
  11:  IF OBJECT_ID(N'Arc_View','V') ISNOTNULL
  12:   
  13:  DROPVIEW Arc_View
  14:   
  15:  GO
  16:   
  17:  CREATEVIEW Arc_View
  18:   
  19:  AS
  20:   
  21:  SELECT *
  22:   
  23:  FROM AdventureWorks2012.Sales.SalesOrderHeader
  24:   
  25:  UNIONALL
  26:   
  27:  SELECT *
  28:   
  29:  FROM Arc_AdventureWorks2012.Sales.Arc_SalesOrderHeader
  30:   
  31:  go
  32:   
  33:  /*
  34:   
  35:  查询视图
  36:   
  37:  */
  38:   
  39:  SELECT * FROM arc_view
  40:   
  41:  ORDERBY OrderDate 

the result is as follows:

4. You can modify the script to be automated, and then put it into the SQL Agent for regular execution. For archive databases, you only need to back up the data before and after the data changes.

  • Data is not required to be saved:

this situation is extreme, but it is not uncommon. This part may not be used in the visible future, so you can back up these data and delete them from the active database, the backup here can be a full backup, which can be used to back up data into files (such as text files) and save them. Import it as needed. In this case, there is hardly any security problem in the database.

In this case, no demonstration is required. You can use SSIS, BCP, and SQLServer import and export tools to export the data to be deleted into files, and then delete the corresponding data in the active Library.

 

the above three situations depend on the actual situation, especially the business requirements. There is no better statement. However, data must be backed up and saved in either way to avoid permanent data loss. In the preceding three cases, you can use proxy jobs to implement automated operations. Only regular inspection and monitoring are required.

Selected, One-Stop Store for Enterprise Applications
Support various scenarios to meet companies' needs at different stages of development

Start Building Today with a Free Trial to 50+ Products

Learn and experience the power of Alibaba Cloud.

Sign Up Now