All Products
Search
Document Center

Historical data cleaning

Last Updated: Jun 28, 2020

To prevent the accumulation of historical data from affecting the stability of the production environment, Data Management Service (DMS) provides the historical data cleaning feature to delete historical data on a regular basis. This topic describes how to use the historical data cleaning feature.

Precautions

  • This feature supports both physical and logical databases.
  • Currently, this feature can only be used for MySQL databases.

Procedure

  1. Log on to the DMS console.

  2. In the top navigation bar, choose Data Plans > Data Changes > History Data Clean.

  3. On the page that appears, set relevant parameters and click Submit to create a data change ticket. The following table describes the parameters.

    Parameter Description
    Reason Category
    The reason for this data change, which helps you easily locate the ticket in subsequent operations.
    Business Background The purpose or objective of this data change. This helps reduce communication costs and accelerate the approval process.
    Change Stakeholder The stakeholders of the data change. All stakeholders can view the ticket details and assist R&D engineers in advancing the approval process. Except for the DMS administrator and database administrator (DBA), users irrelevant to the ticket are not allowed to view the ticket details.
    Database The database on which you have the change permission. You cannot submit a data change ticket if you only have the permission to query databases or change data in tables.
    Deletion Settings The table name, time field, and retention period. The unit of the retention period is days. The time field records the time when a row of data is inserted. The time field is used together with the retention period to determine whether a row of data has been stored for longer than the retention period and needs to be deleted.
    If the table is a logical table, you must enter its logical name rather than its physical name. Based on the preceding three fields, the system will generate an SQL script to delete historical data. For example, if the table name is api_call_record_11, the time field is gmt_create, and the retention period is seven days, the system will generate the following SQL statement:
    DELETE FROM `api_call_record_11` WHERE `gmt_create` < SUBDATE(CURDATE(),INTERVAL 7 DAY);.
    Schedule The time point at which the system begins to run the task. After you set this parameter, the system will generate a standard CRON expression correspondingly.
    In DMS, if you use a scheduled task to delete data, the task scans the full table and batch deletes data based on the primary key or non-null unique key of the table. Therefore, we recommend that you set a task execution time that is during off-peak hours and a reasonable interval at which the scheduled task is run. The minimum time interval is 1 hour. For more information, see Change schemas without locking tables.
    Policy Configuration The policy for running the task. You can specify an end time for task execution, so that the task is automatically suspended after this end time and will not affect business during peak hours.

    1

  4. After you submit the ticket, wait until the system verifies that your configurations are valid. Then, you need to submit the ticket for approval. After the ticket is approved, a scheduled task is generated as configured.

    Before you submit the ticket for approval, you can change the deletion settings and schedule that you have configured. You will not be able to change these configurations after you submit the ticket for approval.

  5. After your ticket is approved, the system automatically generates the scheduled task and sends an email to the ticket owner. You can view details of the scheduled task. You can also perform the following operations:

    • Suspend execution.
    • Restart execution.

      Disable execution: To disable execution, you only need to close the ticket. After you close the ticket, you cannot restart execution unless you submit a new ticket.

    • Change the ticket owner.

      By default, the owner of a ticket is the user who submits the ticket.
      Only the ticket owner has the permissions to suspend and restart execution of a scheduled task. Each time the scheduled task is run, the system sends an email only to the ticket owner.

    2

  6. After the scheduled task is generated, the system will run the SQL script that is generated based on your configurations at the specified task execution time. You can view scheduling information about the task and details of each execution in the ticket.

    Note that before each execution, the system checks if the last execution has been completed. If the last execution is still in progress, the system will not run the SQL script for this execution. Therefore, you need to set a reasonable interval at which the scheduled task is run.