All Products
Search
Document Center

Data Management:Archive data to dedicated storage

Last Updated:Mar 26, 2024

This topic describes how to archive data to dedicated storage in Data Management (DMS), transfer data from dedicated storage to Object Storage Service (OSS), and delete the archived data from dedicated storage.

Prerequisites

  • The source database from which you want to archive data is of one of the following types:

    • MySQL: ApsaraDB RDS for MySQL, PolarDB for MySQL, and AnalyticDB for MySQL V3.0

      Note

      If the source database is a MySQL database, the database account of the source database must have the REPLICATION CLIENT permission.

    • PostgreSQL: ApsaraDB RDS for PostgreSQL and PolarDB for PostgreSQL

    • ApsaraDB for MongoDB V3.6 and later

    Note

    Both physical and logical databases are supported.

  • The source database is managed in Stable Change or Security Collaboration mode.

    Note

    If you want to configure a periodic data archiving task and the source database instance is managed in Stable Change mode, we recommend that you enable security hosting for the database instance or change the control mode of the database instance to Security Collaboration. Otherwise, the data archiving task may fail due to instance logon expiration For more information, see the Enable security hosting section of the "Security hosting" topic and Change the control mode of an instance.

  • The source tables whose data you want to archive must have a primary key or a unique key.

    Note

    We recommend that you provide a field that indicates the time of each data modification operation in the source tables. You can use this field as a filter condition when you archive data from the source tables.

Billing

You can use the data archiving feature free of charge during public preview. You will be notified before you are charged for this feature.

Usage notes

  • Data can be archived to dedicated storage without limits on the storage capacity of dedicated storage.

  • The data archiving feature is supported only in the Singapore and Indonesia (Jakarta) regions.

Procedure

  1. Log on to the DMS console V5.0.
  2. In the top navigation bar, choose Solution > Data Archiving.

    Note

    If you use the DMS console in simple mode, move the pointer over the 2023-01-28_15-57-17.png icon in the upper-left corner and choose All functions > Solution > Data Archiving.

  3. In the upper-right corner of the Data ArchivingTickets page, click Data Archiving.

  4. On the Ticket Application page, configure parameters to create a data archiving ticket. The following table describes the parameters.

    Parameter

    Required

    Description

    Archiving Destination

    Yes

    The destination to which you want to archive data. Select Dedicated Storage.

    Source Database

    Yes

    The source database from which you want to archive data.

    Archiving Configuration

    Yes

    The one or more tables whose data you want to archive.

    • Optional. Specify one or more filter conditions for querying data to be archived in the specified tables. Example: gmt_modified<='${6_month_ago}'.

      Note

      If you want to use time variables in scenarios such as archiving data generated six months ago, you can set the variables in the Variable Configuration section before you configure the parameters in the Archive configuration section.

    • Optional. Click Add to add more source tables.

    Archive Table Mapping

    No

    The table settings to be used in dedicated storage. You can view information about the table data archived to dedicated storage, such as the table names, columns, and database shard keys.

    Variable Configuration

    No

    The variables that are used when you configure conditions for filtering archived data. For example, you have created a variable named 6_month_ago in the yyyy-MM-dd format and set the offset to - 6 Month. In this case, if the current date is August 12, 2021, the value of the ${6_month_ago} variable is 2021-02-11, which indicates February 11, 2021. For more information about how to configure time variables, see the Configure time variables section of the "Variables" topic.

    Post-behavior

    No

    • Specify whether to delete the archived data from the source tables. If you select Clean up the archived data of the original table (delete-No Lock), the archived data is automatically deleted from the source tables. You can execute the DELETE statement to delete temporary backup tables. The temporary backup tables are generated in the source database to store the archived data when the source tables are deleted. Make sure that the source database has sufficient storage space to prevent instance unavailability caused by insufficient storage space.

      After the data is archived and you verify that the archived data is correct, you can create a regular data change ticket to clear the temporary backup tables from the source database.

    • If you do not select Clean up the archived data of the original table (delete-No Lock), the archived data is not deleted from the source tables. In this case, you need to manually delete data from the source tables and optimize the storage usage.

      1. To delete the archived data from the source tables, create a regular data change ticket. For more information, see Perform regular data change.

      2. To optimize the storage usage of the source tables, create a lock-free change ticket. For more information, see Perform lock-free DDL operations.

    Operation Mode

    Yes

    The method to be used for running the data archiving task. Valid values:

    • Single execution: After the data archiving ticket is approved, DMS runs the data archiving task only once.

    • Cyclic scheduling: After the data archiving ticket is approved, DMS runs the data archiving task based on the scheduling cycle that you specify. For more information, see the Periodic scheduling section of the "Archive data to a Lindorm instance" topic.

  5. Click Submit.

  6. After the data archiving ticket is approved, DMS automatically executes the data archiving task. Wait until the data archiving task is complete.

    If the data archiving task fails to be executed, you can click Details in the Actions column in the Execute step to view the logs of the data archiving task and identify the cause of the task failure. If an issue occurs due to a network or database connection failure, you can click Retry Breakpoint to restart the task.

    Note

    The archived data is stored in the Parquet format and the data volume is compressed.

  7. Query the archived data. For more information, see the Query data archived to dedicated storage section of the "Query data archived to OSS buckets or dedicated storage" topic.

Restore the archived data to the source database

Billing

You can restore the archived data to the source database free of charge.

Limits

You can use this feature only if the source database engine is ApsaraDB RDS for MySQL, PolarDB for MySQL, or AnalyticDB for MySQL V3.0.

Procedure

  1. On the Data ArchivingTickets page, set the View parameter to Ticket View.

  2. Find the ticket that you want to manage and click Details in the Actions column.

  3. On the Ticket Details page, click Execute. In the Execute section, click Restore to Source Database in the Actions column.

  4. In the message that appears, click OK.

    A data restoration task is created. The system automatically executes the data restoration task. Wait until the task is complete. The duration of this process depends on the amount of data to be restored.

    Important

    After you restore the archived data to the source database, the system automatically creates temporary tables in the source database. This may cause insufficient storage space in the source database. Proceed with caution.

  5. Query the restored data.

    1. In the Basic Information section of the Ticket Details page, move the pointer over the database name next to Database to Change and click Query in the pop-up window.

    2. Double-click the table name and click Execute.

      You can view the restored data on the Execution History tab that appears in the lower part of the SQLConsole tab.

Merge the archived data files

Important

The file merging operation is irreversible. Proceed with caution.

The file merging feature allows you to reorganize and sort the data files that are archived at a time. For example, you can merge two data files into one file. After the files are merged, the query performance of the archived data is improved.

Billing

You can merge the archived data files free of charge.

Usage notes

  • After the archived data files are merged, the merged data file occupies more storage space. In most cases, the merged data file occupies 1.5 to 2 times the storage space that is occupied by the original archived data files.

  • During file merging, you can query the archived data as expected.

Procedure

  1. On the Data ArchivingTickets page, set the View parameter to Ticket View.

  2. Find the ticket that you want to manage and click Details in the Actions column.

  3. On the Ticket Details page, click Execute. In the Execute section, choose More > File Merging in the Actions column.

  4. In the message that appears, click OK.

    A file merging task is created. The system automatically executes the file merging task. Wait until the task is complete. The duration of this process depends on the amount of archived data.

    Note

    After the file merging task is complete, the system automatically replaces the data source with the merged file.

  5. Query the data of the merged file. For more information, see the Query data archived to dedicated storage section of the "Query data archived to OSS buckets or dedicated storage" topic.

Delete the archived data from dedicated storage

Important

After the archived data is deleted from dedicated storage, you cannot restore the deleted data. Proceed with caution.

After you physically delete data that is archived by using a data archiving ticket from dedicated storage, the storage space occupied by the archived data is released.

Usage notes

After the archived data is deleted, you cannot query the archived data or restore the archived data to the source database.

Procedure

  1. On the Data ArchivingTickets page, set the View parameter to Ticket View.

  2. Find the ticket that you want to manage and click Details in the Actions column.

  3. On the Ticket Details page, click Execute. In the Execute section, choose More > Delete in the Actions column.

  4. In the message that appears, click OK.

FAQ

Q: Are tables in a logical database merged into one physical table when I query the archived data of the logical database?

A: Yes. When the data of a logical database is archived, all tables in the logical database are merged into one physical database table. Therefore, on the Data ArchivingTickets page, if you set the View parameter to Archive View, you can select a logical table to query all data of the tables in the logical database.

Q: If periodic archiving is configured for a data archiving ticket, am I able to query the data that is archived within a specific period of time?

A: Yes. If periodic archiving is configured for a data archiving ticket, data is archived within different periods of time based on the scheduling cycle that you specify. Therefore, you can query the data archived within a specific period of time based on a specific point in time when the data archiving task is run.

Q: How do I query all data that is periodically archived from a table or all data that is archived from a table across data archiving tickets?

A: On the Logical data warehouse page, click the Data source tab in the left sidebar. On the Data source tab, choose Dedicated Data Lakehouse > archive. In the database list, find the source database and double-click the name of the table whose data you want to query or the ticket number. The query SQL statement is automatically filled in the statement execution area on the SQLConsole tab. In the upper-left corner of the SQLConsole tab, click Execute. Then, you can view the archived data on the Execution History tab that appears in the lower part of the Logical data warehouse page.

Q: How much storage space does data of the temporary backup tables in the source database occupy after I select Clean up the archived data of the original table (delete-No Lock) when I archive data from an ApsaraDB RDS database to dedicated storage?

A: The data of the temporary tables occupies up to twice the storage space that is occupied by the data in the source database.