Data archiving moves cold data from your source database to an Object Storage Service (OSS) bucket, freeing up production storage and reducing costs. After archiving, you can query the data in OSS and, for MySQL-based sources, restore it back to the original database.
Prerequisites
Before you begin, ensure that you have:
A source database of one of the following types:
MySQL: ApsaraDB RDS for MySQL, PolarDB for MySQL, or AnalyticDB for MySQL V3.0
PostgreSQL: ApsaraDB RDS for PostgreSQL or PolarDB for PostgreSQL
ApsaraDB for MongoDB 3.6 or later (public preview)
Note: ApsaraDB for MongoDB is in public preview as a source database for data archiving. Data that is archived to dedicated storage can only be queried and cannot be restored to ApsaraDB for MongoDB databases.
PolarDB-X 2.0
Source and destination database instances managed in Stable Change or Security Collaboration mode
Important: For periodic archiving tasks with instances in Stable Change mode, enable security hosting or switch the control mode to Security Collaboration to prevent task failures caused by instance logon expiration. See Enable security hosting and Change the control mode of an instance.
A Standard or Infrequent Access (IA) OSS bucket. To create a bucket, see Get started by using the OSS console.
The OSS bucket registered with Data Management (DMS) over a public endpoint
Important:
- Registering the OSS bucket over an internal endpoint prevents data archiving. If you have questions, contact technical support in the DingTalk group (ID: 19040023327).
- Before registering the OSS bucket, obtain the AccessKey ID and AccessKey secret of your Alibaba Cloud account. See Create an AccessKey pair. For registration steps, see Register an Alibaba Cloud database instance.
Source tables with a primary key or unique key
A timestamp field in the source tables that you can use as a filter condition to select the data to archive, for example, records older than six months.
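As an illustration of such a filter, the sketch below computes a cutoff date and builds a sample condition string. The table column name gmt_create is hypothetical, and "six months" is approximated as 180 days here; DMS time variables use calendar-month offsets.

```python
from datetime import datetime, timedelta, timezone

# Approximate "six months ago" as 180 days; DMS time variables use
# calendar-month offsets, so treat this as an illustration only.
cutoff = datetime.now(timezone.utc) - timedelta(days=180)

# Hypothetical filter condition on a timestamp column named gmt_create.
condition = f"gmt_create < '{cutoff.strftime('%Y-%m-%d')}'"
print(condition)
```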
Limitations
Region: The data archiving feature is available only in the Singapore and Indonesia (Jakarta) regions.
Composite primary key: If a source table has a composite primary key that consists of three columns, or three or more unique keys, the archiving task may fail. Test thoroughly before archiving such tables, or use an alternative archiving method such as Data Disaster Recovery.
MongoDB schema consistency: OSS converts schemaless MongoDB data to structured data. All documents within the same MongoDB collection must have identical field names, quantity, and data types, or queries on the archived data may fail.
MongoDB temporary tables: After archiving MongoDB data, check for schema issues in the archived data, fix any issues, and clean up temporary tables.
Time zone: The data archiving feature operates on UTC+0 for data archiving and post-behavior operations, regardless of the source database's time zone.
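The MongoDB schema-consistency requirement above can be checked before archiving. A minimal sketch, with documents represented as plain Python dicts (in practice you would read them from the collection):

```python
def consistent_schema(docs):
    """Return True if all documents share the same field names and
    the same value type for each field."""
    if not docs:
        return True
    reference = {name: type(value) for name, value in docs[0].items()}
    for doc in docs[1:]:
        if {n: type(v) for n, v in doc.items()} != reference:
            return False
    return True

# Example: the second collection mixes int and str for "age".
ok = consistent_schema([{"name": "a", "age": 1}, {"name": "b", "age": 2}])
bad = consistent_schema([{"name": "a", "age": 1}, {"name": "b", "age": "2"}])
print(ok, bad)  # True False
```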
Archive data to an OSS bucket
Step 1: Create an archiving ticket
Log on to the DMS console V5.0.
In the top navigation bar, choose Solution > Data Archiving.
In the upper-right corner of the Data Archiving Tickets page, click Data Archiving.
In the Apply step, configure the following parameters.
Parameter descriptions:
Task Name: A descriptive name for the archiving task.
Archiving Destination: Select OSS.
OSS Instance: The destination OSS instance. Select a Standard or IA OSS instance.
OSS BUCKET: The OSS bucket within the selected instance. Archived data is stored in the archive folder; do not delete this folder. If the bucket you need is not listed, log on to the OSS instance in the DMS console first. See Register an OSS bucket.
Source Database: The source database to archive data from.
Archive Configuration: The source tables and optional filter conditions. To archive data from a specific time range (for example, data generated more than six months ago), configure time variables in the Variable Configuration section first, then reference them here. Click Add to include more source tables.
Archive Table Mapping: Destination table settings in the OSS bucket. Find a table and click Edit in the Actions column to configure it.
Variable Configuration (Optional): Time variables used in filter conditions. For example, a variable named 6_month_ago in the yyyy-MM-dd format with an offset of -6 months resolves to a date six months before the current date. See Configure time variables.
Post-behavior: Controls what happens after archiving:
- Clean up the archived data of the original table (delete-No Lock): DMS deletes archived data from the source tables without locking them, and creates temporary backup tables in the source database. Clear this option to keep archived data in the source tables for manual deletion later.
- Verify the archived data (time-consuming): DMS compares the archived data against the original data and displays any discrepancies. Verification time depends on data volume.
Operation Mode: How the task runs after the ticket is approved:
- Single execution: Runs once.
- Cyclic scheduling: Runs on a recurring schedule. See Periodic scheduling.
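As a sketch of how a time variable such as 6_month_ago resolves, the function below applies a calendar-month offset and clamps the day to the target month's length, rendering the result in %Y-%m-%d (the Python equivalent of the yyyy-MM-dd pattern). This mimics, but is not, the DMS implementation.

```python
import calendar
from datetime import date

def resolve_time_variable(today: date, offset_months: int,
                          fmt: str = "%Y-%m-%d") -> str:
    """Shift a date by a number of calendar months (negative = past)
    and render it in the given format."""
    total = today.year * 12 + (today.month - 1) + offset_months
    year, month = divmod(total, 12)
    month += 1
    # Clamp the day so that e.g. March 31 - 6 months lands on September 30.
    day = min(today.day, calendar.monthrange(year, month)[1])
    return date(year, month, day).strftime(fmt)

print(resolve_time_variable(date(2024, 3, 31), -6))  # 2023-09-30
```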
Click Submit.
After the ticket is approved, DMS automatically runs the archiving task.
DMS runs archiving tasks and post-behavior operations based on UTC+0, regardless of the source database's time zone.
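Because scheduling uses UTC+0, a task meant to run at a given local time must be configured in UTC. A small sketch of the conversion, using Singapore (UTC+8) as an example:

```python
from datetime import datetime, timedelta, timezone

sgt = timezone(timedelta(hours=8))  # Singapore time, UTC+8

# A task intended to run at 02:00 Singapore time...
local_run = datetime(2024, 6, 1, 2, 0, tzinfo=sgt)

# ...corresponds to 18:00 UTC on the previous day.
utc_run = local_run.astimezone(timezone.utc)
print(utc_run.isoformat())  # 2024-05-31T18:00:00+00:00
```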
Step 2: (Optional) Clean up the source tables
After verifying that the archived data is correct, remove it from the source tables and reclaim storage space.
Delete the archived data from the source tables:
If you selected Clean up the archived data of the original table (delete-No Lock), DMS already deleted the archived data and created temporary backup tables. To drop those backup tables, create a Normal Data Modify ticket.
In the lower section of the Ticket Details panel, click Click here to view the DROP statement used to drop the temporary table and copy the SQL statements.
If you cleared this option, manually delete the archived data by creating a Normal Data Modify ticket.
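If you delete the archived data manually, deleting in small batches keeps transactions short on a busy source table. A minimal sketch of the batching pattern, using an in-memory SQLite table for illustration; the table orders, the column gmt_create, and the cutoff value are hypothetical, and in DMS you would submit the equivalent SQL statements in the ticket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, gmt_create TEXT)")
# Even ids get an old (archived) timestamp, odd ids a recent one.
rows = [(i, "2023-06-01" if i % 2 == 0 else "2024-06-01") for i in range(10)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

cutoff = "2024-01-01"
BATCH = 3  # small batches keep each transaction short

while True:
    cur = conn.execute(
        "DELETE FROM orders WHERE id IN ("
        "SELECT id FROM orders WHERE gmt_create < ? LIMIT ?)",
        (cutoff, BATCH),
    )
    conn.commit()
    if cur.rowcount == 0:
        break

remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(remaining)  # 5
```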
Reclaim storage space on the source tables:
To optimize storage after deletion, create a lock-free change ticket. See Perform lock-free DDL operations.
In the lower section of the Ticket Details panel, click View SQL Statements Used to Optimize Storage Usage of Source Table and copy the SQL statements.
Step 3: Query the archived data
Restore the archived data to the source database
Billing
Restoring archived data to the source database is free of charge.
Limitations
Only one restoration task can run at a time. Wait for the current task to complete before starting another.
Restoration is supported only when the source database engine is ApsaraDB RDS for MySQL, PolarDB for MySQL, or AnalyticDB for MySQL V3.0.
Restore archived data
On the Data Archiving Tickets page, set View to Ticket View.
Find the ticket and click Details in the Actions column.
On the Ticket Details page, click Execute. In the Execute section, click Restore to Source Database in the Actions column.
In the confirmation dialog, click OK.
DMS creates and automatically runs a data restoration task. Duration depends on the amount of data to restore.
After restoration, DMS creates temporary tables in the source database. This may consume significant storage space. Proceed with caution.
Query the restored data
In the Basic Information section of the Ticket Details page, hover over the database name next to Database to Change, then click Query in the pop-up window.
Double-click the table name and click Execute.
The restored data appears on the Execution History tab in the SQLConsole panel.