
Data Management: Back up an on-premises database or a cloud database from a third-party provider

Last Updated: Mar 12, 2025

Data Disaster Recovery supports Alibaba Cloud databases, self-managed databases hosted on Elastic Compute Service (ECS) instances, on-premises databases, and cloud databases from third-party providers. This topic describes how to back up an on-premises database or a cloud database from a third-party provider.

Usage notes

If the databases and tables that you want to back up have issues such as unreasonable table schemas, large tables, or large fields, a backup schedule with low specifications may run out of resources and the backup may fail. We recommend that you select a higher specification type when you create a backup schedule to prevent such backup exceptions.

Procedure

Automatic backup for services

Purchase a backup schedule (logical backup)

  1. Log on to the DMS console V5.0.
  2. In the top navigation bar, choose Security and Specifications (DBS) > Disaster Recovery for Data (DBS) > Disaster Recovery Data Source.

    Note

    If you use the DMS console in simple mode, move the pointer over the icon in the upper-left corner of the DMS console and choose All Features > Security and Specifications (DBS) > Disaster Recovery for Data (DBS) > Disaster Recovery Data Source.

  3. In the upper part of the page, select a region. On the Disaster Recovery Data Source page, choose On-premise Database and Cloud Database from Third-party Provider > Automatic Backup for Users, and click the ID of the data source that you want to use to go to the details page.

  4. On the Backup Policies page, click Configure Backup Policy.

  5. In the Select Backup Schedule step, click Purchase Backup Schedule to go to the buy page.

  6. Configure the following parameters and click Buy Now in the lower-right corner of the page.

    • Product Type: Select Backup Schedule.

    • Region: The region in which you want to store the backup data.

      Note: Make sure that the backup schedule and the Elastic Compute Service (ECS) instance reside in the same region.

    • Data Source Type: Set the value to MySQL.

    • Specification: The backup schedule specifications that you want to use. Higher specifications offer higher backup and restoration performance. DBS supports the following backup schedule specifications: micro, small, medium, large, and xlarge. The xlarge specification type provides extra large specifications without an upper limit on the amount of backup data.

      Note:
      • To ensure fast backup and restoration of specific database instances, such as database instances in the production environment, we recommend that you select xlarge or large.

      • If you do not require high backup and restoration performance, you can select the most cost-effective backup schedule type based on your business requirements. For more information, see Select a backup schedule type.

      • If the databases and tables that you want to back up involve issues such as unreasonable schemas, large tables, or large fields, the resources of a backup schedule of lower specifications may be insufficient to complete the backup. As a result, a backup error occurs. We recommend that you purchase a backup schedule of higher specifications to prevent backup errors.

    • Backup Method: Select Logical Backup.

    • Storage Size: You do not need to select a capacity when you create the backup schedule. You are charged based on the amount of data that is stored in Data Disaster Recovery. For more information, see Storage fees.

    • Resource Group: The resource group that is used by the backup schedule. You can use the default resource group or select a resource group based on your business requirements.

    • Quantity: The number of backup schedules that you want to purchase. To back up multiple database instances, you must purchase multiple backup schedules. For example, if you want to back up Database Instance A and Database Instance B, you must purchase two backup schedules.

    • Subscription Duration: The subscription duration of the backup schedule that you want to purchase.

  7. On the Confirm Order page, confirm the order information, read and agree to the terms of service, and then click Pay.

    After you complete the payment, go back to the Select Backup Schedule step and click Paid to view the created backup schedule.


Configure the backup policy

  1. In the Select Backup Schedule step, select the backup schedule that you want to configure and click Next.


  2. In the Select Database and Table step, select the databases and tables that you want to back up, click the image icon to move them to the Selected Objects section, and then click Submit.


  3. On the Backup Policy page, click the Logical Backup tab and then click Start to initiate backups.

    After you click Start, the system immediately initiates a full backup and an incremental backup.


    Note

    If you want to perform other operations, such as modifying the backup policy, you can skip this step. Data Disaster Recovery automatically initiates the backup later based on the backup policy.

Automatic backup for users

Important
  • Only database instances that run MySQL 5.5 are supported.

  • Only the China (Hangzhou) region is supported.

Configure the backup source and upload the backup file

  1. Log on to the DMS console V5.0.
  2. In the top navigation bar, choose Security and Specifications (DBS) > Disaster Recovery for Data (DBS) > Disaster Recovery Data Source.

    Note

    If you use the DMS console in simple mode, move the pointer over the icon in the upper-left corner of the DMS console and choose All Features > Security and Specifications (DBS) > Disaster Recovery for Data (DBS) > Disaster Recovery Data Source.

  3. In the upper part of the page, select a region. On the Disaster Recovery Data Source page, click the On-premise Database and Cloud Database from Third-party Provider tab, and add the data source based on the data source type.

  4. Click Add Data Source. In the dialog box that appears, configure the following parameters, select the backup schedule that you want to use, and then click Next.


    • Data Source Name: The name of the data source. We recommend that you use a descriptive name that is easy to identify.

    • Engine Type: The type of the database engine. Only MySQL is supported.

    • Engine Version: The engine version of the database that you want to back up.

    • Engine Parameters: The parameters of the database engine, in the JSON format. Example: {"lower_case_table_names":1}.

    If no backup schedule is available, click Purchase Backup Schedule to go to the buy page and purchase a backup schedule. The following list describes the parameters on the buy page:

    • Product Type: Select Backup Schedule. The pay-as-you-go billing method is not supported.

    • Region: The region in which you want to store the backup data.

    • Data Source Type: Select MySQL.

    • Specification: Select xmicro. For more information about the free quota provided by the xmicro specification type, see Backup schedule types.

    • Backup Method: Select Physical Backup.

    • Storage Size: You do not need to select a capacity when you create the backup schedule. You are charged based on the amount of data that is stored in Data Disaster Recovery. For more information, see Storage fees.

    • Resource Group: The resource group that is used by the backup schedule. You can use the default resource group or select a resource group based on your business requirements.

    • Quantity: The number of backup schedules that you want to purchase. To back up multiple database instances, you must purchase multiple backup schedules. For example, if you want to back up Database Instance A and Database Instance B, you must purchase two backup schedules.

    • Subscription Duration: The subscription duration of the backup schedule that you want to purchase.

  5. Upload the backup set to the specified bucket. For more information, see Upload data (automatic backup for users).

  6. Click OK.

View the backup information

On the Automatic Backup for Users tab, click the ID of the data source that you want to manage.


Configure the backup policy

  1. On the Automatic Backup for Users tab, find the data source that you want to manage and click View Backup Policy in the Actions column.


  2. Click OK.

View and download backup data

  1. On the Automatic Backup for Users tab, click the ID of the data source that you want to manage.

  2. In the left-side navigation pane, click Backup Data.

    Note

    After you create a data source and upload data by using the scripts, the system automatically synchronizes each newly generated backup set to the Backup Data page.

  3. Click Download in the Actions column of the backup set to download the backup set.

Create a restoration task

Note

To restore data, the backup set must be displayed on the Backup Data page and must be in the Completed state.

  1. On the Automatic Backup for Users tab, click the ID of the data source that you want to manage.

  2. In the left-side navigation pane, click Backup Data. On the Logical Backup tab, click Create Restore Task to configure the following restoration parameters.

    • Task Name: The name of the restoration task. We recommend that you use a descriptive name that is easy to identify.

    • Restore To: The destination database instance. Default value: New Instance (RDS).

    • Database Location: The location of the destination database instance. Default value: RDS.

    • Instance Region: The region in which the destination database instance resides. Only the China (Hangzhou) region is supported.

    • VPC: The virtual private cloud (VPC) in which the destination database instance resides.

    • VSwitch: The vSwitch that is connected to the destination database instance.

    • Instance Edition: The edition of the destination database instance.

    • Instance Specifications: The instance type of the destination database instance.

    • Storage Space: The storage space that is required for the destination database instance.

    • Restore Mode: Only restoration to a specific point in time is supported.

    • Restore Time: The point in time to which you want to restore the backup sets. The available time range is displayed next to the Restore Mode parameter.

  3. After the configuration is complete, click Submit. A restoration task is created. The task information is displayed on the Restore Tasks page.

View a restoration task

  1. On the Automatic Backup for Users tab, click the ID of the data source that you want to manage.

  2. In the left-side navigation pane, choose Task Management > Restore Tasks.

  3. Click the instance ID in the Restoration Result column to go to the Basic Information page of the RDS instance to which the backup sets are restored.

View the recovery drill

  1. On the Automatic Backup for Users tab, click the ID of the data source that you want to manage.

  2. In the left-side navigation pane, click Recovery Drill.

    Recovery drill metrics

    The recovery drill metrics are Recovery Task Success Rate, Average Recovery Duration, Data Backup Drill Coverage Rate, and Log Backup Drill Coverage Rate.

    • Recovery Task Success Rate: The success rate of restoration tasks that restore data to points in time within the specified period of time.

    • Average Recovery Duration: The average duration of successful restoration tasks within the specified period of time.

    • Data Backup Drill Coverage Rate: The coverage rate of the data recovery drills that are performed on open source MySQL instances or ApsaraDB RDS for MySQL instances within the specified period of time.

    • Log Backup Drill Coverage Rate: The coverage rate of the log recovery drills that are performed on open source MySQL instances or ApsaraDB RDS for MySQL instances within the specified period of time.

    Recovery drill timeline

    The timeline displays the recovery drill details at each point in time within the specified period of time. You can click a point on the timeline to view the drill information at that point in time.

    Recovery drill details

    Click the Data Backup tab to view the recovery drill details for data backup. Click the instance ID in the Drill Result column to go to the Basic Information page of the restored RDS instance.

    Click the Log Backup tab to view the recovery drill details for log backup.

Upload data (automatic backup for users)

Preparations

  • Configure a data source and obtain the ID of the data source. For more information about how to configure a data source, see Add a data source.

  • Create a RAM user, grant the RAM user the required permissions to manage the specified instance, and prepare the AccessKey ID and AccessKey secret. For more information, see Create a RAM user and Grant permissions to a RAM user.

Dependencies

  • Command-line tools: Bash and Python 3.

  • Python libraries: oss2, alibabacloud_openapi_util, alibabacloud_tea_openapi, and alibabacloud_tea_util.

## Install the OSS SDK and the Alibaba Cloud OpenAPI libraries.
pip3 install --upgrade pip
pip3 install -i https://mirrors.aliyun.com/pypi/simple/ oss2 alibabacloud_openapi_util alibabacloud_tea_openapi alibabacloud_tea_util
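
Optionally, you can verify that the libraries import correctly with a short Python snippet such as the following. The modules are the same ones imported by the upload script below; the version print is only for troubleshooting.

# Imported only to confirm that the installation succeeded.
import oss2
from alibabacloud_openapi_util.client import Client as OpenApiUtilClient
from alibabacloud_tea_openapi.client import Client as OpenApiClient
from alibabacloud_tea_util import models as util_models

print("oss2 version:", oss2.__version__)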

Upload the complete scripts

You need to upload backup data by using the following scripts:

Bash script

Replace the placeholders with the actual parameters of your data source.

boot_backup.sh: simulates the process of a full backup by using the xtrabackup tool.

#!/bin/bash
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations is a high-risk operation. We recommend that you create and use a RAM user for API access or routine O&M.
## Do not save the AccessKey pair in the code. This may lead to key leakage. You can save the AccessKey pair in a configuration file based on your business requirements.

AK=<ALIBABA_CLOUD_ACCESS_KEY_ID>
SK=<ALIBABA_CLOUD_ACCESS_KEY_SECRET>
DBS_API=dbs-api.<ID of the region where your data source resides, for example, cn-hangzhou>.aliyuncs.com
DATASOURCE_ID=<Your data source ID>

## Obtain the current path of the script.
BASE_PATH=$(cd `dirname $0`; pwd)

TS=`date +%Y%m%d_%H%M%S`
XTRA_DATA_DIR=~/tool/xtrabackup_data/$TS

mkdir -p $XTRA_DATA_DIR

## Run the xtrabackup backup command to output the error log to the xtrabackup.log file.
~/innobackupex --defaults-file=/etc/my.cnf --user='root' --password='root' --host='localhost' --port=3306 --socket='/var/lib/mysql/mysql.sock' --parallel=4 $XTRA_DATA_DIR/ 2>$XTRA_DATA_DIR/xtrabackup.log

## Obtain the directory of the current xtrabackup backup.
backup_dir=`ls $XTRA_DATA_DIR | grep -v xtrabackup.log | head -n1`
echo -e "\033[33mexecute innobackupex success, backup_dir: $backup_dir" && echo -n -e "\033[0m" && chmod 755 $XTRA_DATA_DIR/$backup_dir

## Package the data in a tar.gz file.
cd $XTRA_DATA_DIR/$backup_dir && tar -czvf ../$backup_dir.tar.gz . && file ../$backup_dir.tar.gz
echo -e "\033[33mpackage to $backup_dir.tar.gz" && echo -n -e "\033[0m" && sleep 2

## Use the Python script to upload the backup data and register the backup set metadata.
python3 $BASE_PATH/upload_and_register_backup_set.py --access_key_id $AK --access_key_secret $SK --endpoint $DBS_API --datasource_id $DATASOURCE_ID --region_code=cn-hangzhou --data_type=FullBackup --file_path=$XTRA_DATA_DIR/$backup_dir.tar.gz --xtrabackup_log_path=$XTRA_DATA_DIR/xtrabackup.log

Python script

  • upload_and_register_backup_set.py: uploads full backup and log backup data, parses the corresponding metadata, and registers the backup set metadata.

    # -*- coding: utf-8 -*-
    # This file is auto-generated, don't edit it. Thanks.
    import os
    import argparse
    import re
    import time
    import json
    from datetime import datetime
    
    import oss2
    from alibabacloud_openapi_util.client import Client as OpenApiUtilClient
    from alibabacloud_tea_openapi import models as open_api_models
    from alibabacloud_tea_openapi.client import Client as OpenApiClient
    from alibabacloud_tea_util import models as util_models
    
    import xtrabackup_info_parser
    import xtrabackup_log_parser
    
    def init_command_args():
        parser = argparse.ArgumentParser(description="A sample command-line parser.")
        parser.add_argument("--access_key_id", help="Aliyun AccessKeyId.")
        parser.add_argument("--access_key_secret", help="Aliyun AccessKeySecret.")
        parser.add_argument("--endpoint", help="Aliyun API Endpoint.")
        parser.add_argument("--region_code", help="Aliyun DataSource RegionCode.")
        parser.add_argument("--datasource_id", help="Aliyun DataSourceId.")
        parser.add_argument("--data_type", help="BackupSet DataType: FullBackup | LogBackup.")
        parser.add_argument("--file_path", help="BackupSet File Path.")
        parser.add_argument("--xtrabackup_info_path", help="Xtrabackup Info Path.")
        parser.add_argument("--xtrabackup_log_path", help="Xtrabackup Log Path.")
        parser.add_argument("--begin_time", help="Binlog Begin Time.")
        parser.add_argument("--end_time", help="Binlog End Time.")
    
        args = parser.parse_args()
        if args.access_key_id:
            print(f"access_key_id: ************")
        if args.access_key_secret:
            print(f"access_key_secret: ************")
        if args.endpoint:
            print(f"endpoint: {args.endpoint}")
        if args.region_code:
            print(f"region_code: {args.region_code}")
        if args.datasource_id:
            print(f"datasource_id: {args.datasource_id}")
        if args.data_type:
            print(f"data_type: {args.data_type}")
        if args.file_path:
            print(f"file_path: {args.file_path}")
        if args.xtrabackup_info_path:
            print(f"xtrabackup_info_path: {args.xtrabackup_info_path}")
        if args.xtrabackup_log_path:
            print(f"xtrabackup_log_path: {args.xtrabackup_log_path}")
        if args.begin_time:
            print(f"begin_time: {args.begin_time}")
        if args.end_time:
            print(f"end_time: {args.end_time}")
    
        print('\n')
        return args
    
    
    def date_to_unix_timestamp(date_str):
        dt_obj = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S")
        # Use the .timetuple() method to obtain the time tuple, and use time.mktime to convert it to a timestamp in seconds.
        timestamp_seconds = time.mktime(dt_obj.timetuple())
        return int(timestamp_seconds) * 1000
    
    
    def create_oss_client(params):
        # Alibaba Cloud Object Storage Service (OSS) authentication information
        access_key_id = params['AccessKeyId']
        access_key_secret = params['AccessKeySecret']
        security_token = params['SecurityToken']
        bucket_name = params['BucketName']
        endpoint = params['OssEndpoint']
    
        # Initialize the OSS client.
        auth = oss2.StsAuth(access_key_id, access_key_secret, security_token)
        return oss2.Bucket(auth, endpoint, bucket_name)
    
    
    def upload_oss_file(oss_client, file_path, object_key):
        """
        Run a multipart upload task to upload a large object to OSS.
        :param oss_client:
        :param file_path: the local file path.
        :param object_key: the object key in OSS, which is the file name.
        """
        # Set the part size in bytes. This example uses 5 MB parts.
        part_size = 1024 * 1024 * 5
        # Initiate the multipart upload task.
        upload_id = oss_client.init_multipart_upload(object_key).upload_id
    
        # Open the file and read the content.
        with open(file_path, 'rb') as file_obj:
            parts = []
            while True:
                data = file_obj.read(part_size)
                if not data:
                    break
                # Upload parts.
                result = oss_client.upload_part(object_key, upload_id, len(parts) + 1, data)
                parts.append(oss2.models.PartInfo(len(parts) + 1, result.etag))
    
            # Complete multipart upload.
            oss_client.complete_multipart_upload(object_key, upload_id, parts)
    
    
    class OssUploader:
    
        def __init__(self, access_key_id, access_key_secret, endpoint, region_code, datasource_id):
            self.access_key_id = access_key_id
            self.access_key_secret = access_key_secret
            self.endpoint = endpoint
            self.region_code = region_code
            self.datasource_id = datasource_id
    
            config = open_api_models.Config(access_key_id, access_key_secret)
            # For more information about endpoints, visit https://api.aliyun.com/product/Rds.
            config.endpoint = endpoint
            self.client = OpenApiClient(config)
    
        """
        Register backup set metadata.
        """
        def configure_backup_set_info(self, req_param):
            params = open_api_models.Params(
                # The operation that you want to call.
                action='ConfigureBackupSetInfo',
                # The version number of the operation.
                version='2021-01-01',
                # The protocol of the operation.
                protocol='HTTPS',
                # The HTTP method of the operation.
                method='POST',
                auth_type='AK',
                style='RPC',
                # The URL of the operation.
                pathname='/',
                # The format of the request body.
                req_body_type='json',
                # The format of the response body.
                body_type='json'
            )
            # runtime options
            runtime = util_models.RuntimeOptions()
            request = open_api_models.OpenApiRequest(
                query=OpenApiUtilClient.query(req_param)
            )
            # The response is of the MAP type, which contains the response body, response headers, and HTTP status code. 
            print(f"ConfigureBackupSetInfo request: {req_param}")
            data = self.client.call_api(params, request, runtime)
    
            print(f"ConfigureBackupSetInfo response: {data}")
            return data['body']['Data']
    
        """
        Obtain the OSS upload information.
        """
        def describe_bak_datasource_storage_access_info(self, req_param):
            params = open_api_models.Params(
                # The operation that you want to call.
                action='DescribeBakDataSourceStorageAccessInfo',
                # The version number of the operation.
                version='2021-01-01',
                # The protocol of the operation.
                protocol='HTTPS',
                # The HTTP method of the operation.
                method='POST',
                auth_type='AK',
                style='RPC',
                # The URL of the operation.
                pathname='/',
                # The format of the request body.
                req_body_type='json',
                # The format of the response body.
                body_type='json'
            )
            # runtime options
            runtime = util_models.RuntimeOptions()
            request = open_api_models.OpenApiRequest(
                query=OpenApiUtilClient.query(req_param)
            )
            # The response is of the MAP type, which contains the response body, response headers, and HTTP status code. 
            print(f"DescribeBakDataSourceStorageAccessInfo request: {req_param}")
            data = self.client.call_api(params, request, runtime)
    
            print(f"DescribeBakDataSourceStorageAccessInfo response: {data}")
            return data['body']['Data']
    
        def _fetch_oss_access_info(self, params):
            info = self.describe_bak_datasource_storage_access_info({
                'RegionId': params['RegionId'],
                'DataSourceId': params['DataSourceId'],
                'RegionCode': params['RegionCode'],
                'BackupType': params['BackupType'],
                'BackupSetId': params['BackupSetId']
            })
            return info['OssAccessInfo']
    
        def upload_and_register_backup_set(self, file_path, data_type, extra_meta):
            filename = os.path.basename(file_path)
            params = {'BackupMode': 'Automated', 'BackupMethod': 'Physical', 'RegionId': self.region_code,
                      'RegionCode': self.region_code, 'DataSourceId': self.datasource_id, 'BackupSetName': filename,
                      'ExtraMeta': extra_meta, 'BackupType': data_type, 'UploadStatus': 'WaitingUpload'}
    
            # Register a backup set for the first time. The ID of the backup set is returned.
            data = self.configure_backup_set_info(params)
            params['BackupSetId'] = data['BackupSetId']
            print(f"------ configure_backup_set_info success: {file_path}, {data_type}, {params['BackupSetId']}, WaitingUpload\n")
    
            # Upload data to OSS.
            oss_info = self._fetch_oss_access_info(params)
            oss_client = create_oss_client(oss_info)
            upload_oss_file(oss_client, file_path, oss_info['ObjectKey'])
            print(f"------ upload_oss_file success: {file_path}, {data_type}, {params['BackupSetId']}\n")
    
            # Mark that the backup set is uploaded.
            params['UploadStatus'] = 'UploadSuccess'
            self.configure_backup_set_info(params)
            print(f"------ configure_backup_set_info success: {file_path}, {data_type}, {params['BackupSetId']}, UploadSuccess\n")
    
    
    if __name__ == '__main__':
        args = init_command_args()
        uploader = OssUploader(args.access_key_id, args.access_key_secret,
                               args.endpoint, args.region_code, args.datasource_id)
    
        # Construct extraMeta in different methods for full backup and log backup.
        extra_meta = '{}'
        if args.data_type == 'FullBackup':
            obj = {}
            if args.xtrabackup_log_path is not None:
                obj = xtrabackup_log_parser.analyze_slave_status(logpath=args.xtrabackup_log_path)
            elif args.xtrabackup_info_path is not None:
                parser = xtrabackup_info_parser.ExtraMetaParser(file_path=args.xtrabackup_info_path)
                obj = parser.get_extra_meta()
            extra_meta = {'BINLOG_FILE': obj.get('BINLOG_FILE'),
                          'version': obj.get("SERVER_VERSION"),
                          'dataBegin': date_to_unix_timestamp(obj.get("START_TIME")),
                          'dataEnd': date_to_unix_timestamp(obj.get("END_TIME")),
                          'consistentTime': int(date_to_unix_timestamp(obj.get("END_TIME")) / 1000)}
            extra_meta = json.dumps(extra_meta)
    
        elif args.data_type == 'LogBackup':
            obj = {'dataBegin': date_to_unix_timestamp(args.begin_time),
                   'dataEnd': date_to_unix_timestamp(args.end_time)}
            extra_meta = json.dumps(obj)
        print(f"get extra meta json string: {extra_meta}")
    
        # Upload data and register backup set metadata.
        uploader.upload_and_register_backup_set(file_path=args.file_path, data_type=args.data_type, extra_meta=extra_meta)
  • xtrabackup_info_parser.py: parses metadata by using the xtrabackup_info file.

    # -*- coding: utf-8 -*-
    # This file is auto-generated, don't edit it. Thanks.
    import re
    import json
    
    
    class ExtraMetaParser:
        def __init__(self, file_path):
            self.file_path = file_path
            pass
    
        def _parse_xtrabackup_info(self):
            config_data = {}
            with open(self.file_path, 'r') as file:
                for line in file:
                    line = line.strip()
                    if line and not line.startswith('#'):
                        key, value = line.split('=', 1)
                        config_data[key.strip()] = value.strip()
            return config_data
    
        def get_extra_meta(self):
            config_data = self._parse_xtrabackup_info()
            print(f"xtrabackup_info: {config_data}")
    
            binlog_pos = config_data.get("binlog_pos")
            pattern = f"filename '(.*)', position (.*)"
            match = re.search(pattern, binlog_pos)
    
            return {'BINLOG_FILE': match.group(1),
                    'SERVER_VERSION': config_data.get("server_version"),
                    'START_TIME': config_data.get("start_time"),
                    'END_TIME': config_data.get("end_time")}
  • xtrabackup_log_parser.py: parses metadata by using the xtrabackup.log file.

    #!/usr/bin/python
    # coding:utf8
    
    import io
    import re
    import sys
    from datetime import datetime
    
    from six import binary_type, text_type
    
    
    def parse_date_part(date_part, time_part):
        """
        Parse and return the complete datetime string based on the given date part and time part. 
        """
        # Obtain the first two digits of the current year.
        current_century = datetime.now().strftime("%Y")[:2]
    
        year_short = date_part[:2]
        # Obtain the complete year value.
        year_full = current_century + year_short
        date_full = year_full + date_part[2:]
        datetime_str = date_full + " " + time_part
    
        dt = datetime.strptime(datetime_str, "%Y%m%d %H:%M:%S")
        formatted_datetime = dt.strftime("%Y-%m-%d %H:%M:%S")
        return text_type(formatted_datetime)
    
    
    def analyze_slave_status(logpath=None):
        slave_status = {}
        start_time_pattern = (
            r"(\d{6}) (\d{2}:\d{2}:\d{2}) .*Connecting to MySQL server host:"
        )
        """
            240925 17:46:58 completed OK!
            240925 02:22:58 innobackupex: completed OK!
            240925 02:22:58 xtrabackup: completed OK!
        """
        end_time_pattern = r"(\d{6}) (\d{2}:\d{2}:\d{2}) .*completed OK!"
        with io.open(logpath, "rb") as fp:
            lines = fp.read().splitlines()
            for i in reversed(range(len(lines))):
                line = lines[i]
                if isinstance(line, binary_type):
                    line = line.decode("utf-8")
    
                m = re.search(start_time_pattern, line)
                if m:
                    # Extract the date part and time part.
                    date_part = m.group(1)
                    time_part = m.group(2)
                    slave_status["START_TIME"] = parse_date_part(date_part, time_part)
                    continue
    
                m = re.search(r"Using server version (\S*)", line)
                if m:
                    slave_status["SERVER_VERSION"] = text_type(m.group(1))
                    continue
    
                m = re.search("MySQL binlog position:", line)
                if m:
                    binlog_line = line
                    m = re.search(r"filename '(\S*)'", binlog_line)
                    if m:
                        slave_status["BINLOG_FILE"] = text_type(m.group(1))
                    m = re.search(r"position (\d+)", binlog_line)
                    m2 = re.search(r"position '(\d+)'", binlog_line)
                    if m:
                        try:
                            slave_status["BINLOG_POS"] = int(m.group(1))
                        except ValueError:
                            pass
                    elif m2:
                        try:
                            slave_status["BINLOG_POS"] = int(m2.group(1))
                        except ValueError:
                            pass
                    continue
    
                m = re.search("consistent_time (\d+)", line)
                if m:
                    try:
                        slave_status["CONSISTENT_TIME"] = int(m.group(1))
                    except ValueError:
                        pass
                    continue
    
                m = re.search(end_time_pattern, line)
                if m:
                    date_part = m.group(1)
                    time_part = m.group(2)
                    slave_status["END_TIME"] = parse_date_part(date_part, time_part)
                    continue
    
        return slave_status
    
    if __name__ == "__main__":
        logpath = sys.argv[1]
        slave_status = analyze_slave_status(logpath)
        print(slave_status)
    

Process the Bash script

  1. Configure the AccessKey ID, AccessKey secret, DBS_API endpoint, and data source ID.

    #!/bin/bash
    # The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations is a high-risk operation. We recommend that you create and use a RAM user for API access or routine O&M.
    ## Do not save the AccessKey pair in the code. This may lead to key leakage. You can save the AccessKey pair in a configuration file based on your business requirements.
    
    AK=<ALIBABA_CLOUD_ACCESS_KEY_ID>
    SK=<ALIBABA_CLOUD_ACCESS_KEY_SECRET>
    DBS_API=dbs-api.<ID of the region where your data source resides, for example, cn-hangzhou>.aliyuncs.com
    DATASOURCE_ID=<Your data source ID>
  2. Run the xtrabackup backup command to save the backup data to the specified directory and output the error log to the xtrabackup.log file.

    ## Obtain the current path of the script.
    BASE_PATH=$(cd `dirname $0`; pwd)
    
    TS=`date +%Y%m%d_%H%M%S`
    XTRA_DATA_DIR=~/tool/xtrabackup_data/$TS
    
    mkdir -p $XTRA_DATA_DIR
    
    ## Run the xtrabackup backup command to output the error log to the xtrabackup.log file.
    ~/innobackupex --defaults-file=/etc/my.cnf --user='<Your database account, for example, root>' --password='<Your database password, for example, root>' --host='<Your database IP address, for example, localhost>' --port=<Your database port number, for example, 3306> --socket='<Your database socket file, for example, /var/lib/mysql/mysql.sock>' --parallel=4 $XTRA_DATA_DIR/ 2>$XTRA_DATA_DIR/xtrabackup.log
    • The xtrabackup.log file is parsed to obtain the metadata of the full backup set. The metadata includes the backup start time, the backup end time, the consistency point in time, and the name of the corresponding binary log file. A parsing sketch is provided after this list.

    • Limits:

      The xtrabackup backup command does not support the compression and encryption parameters (--compress, --compress-threads, --encrypt, --encrypt-key-file, and --encrypt-threads). Remove these parameters before you run the xtrabackup backup command.
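
    The following minimal sketch shows how the metadata is extracted, assuming the xtrabackup_log_parser.py file shown above is in the current directory. The log path is illustrative; substitute the $XTRA_DATA_DIR directory of your own run.

    import xtrabackup_log_parser

    # Parse the xtrabackup.log file that was written by the backup command above.
    status = xtrabackup_log_parser.analyze_slave_status(
        logpath="/root/tool/xtrabackup_data/20240925_174658/xtrabackup.log"
    )
    # Expected keys (values depend on your backup): START_TIME, END_TIME, SERVER_VERSION,
    # BINLOG_FILE, and, if present in the log, BINLOG_POS and CONSISTENT_TIME.
    print(status)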

  3. Package the full backup directory into a tar.gz file by using gzip compression.

    ## Obtain the directory of the current xtrabackup backup.
    backup_dir=`ls $XTRA_DATA_DIR | grep -v xtrabackup.log | head -n1`
    echo -e "\033[33mexecute innobackupex success, backup_dir: $backup_dir" && echo -n -e "\033[0m" && chmod 755 $XTRA_DATA_DIR/$backup_dir
    
    ## Package the data in a tar.gz file.
    cd $XTRA_DATA_DIR/$backup_dir && tar -czvf ../$backup_dir.tar.gz . && file ../$backup_dir.tar.gz
    echo -e "\033[33mpackage to $backup_dir.tar.gz" && echo -n -e "\033[0m" && sleep 2
    • You can upload backup data only in the following formats:

      • tar: The backup directory is packaged into a tar file.

      • tar.gz: The backup directory is packaged into a tar file and then compressed by using gzip.

    • Precautions:

      • Before you run the tar command to package the directory, run the chmod 755 command to modify the permissions of the backup directory.

      • You must change to the root of the backup directory and then run the tar command to package its contents. A Python-based packaging sketch is provided after this list.
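
    The following sketch, which uses the Python tarfile module, is an optional alternative to the tar command above under the same rules: the archive is built from the entries inside the backup directory so that the files sit at the archive root. The paths are illustrative.

    import os
    import tarfile

    # Illustrative paths; use the $XTRA_DATA_DIR/$backup_dir directory of your own run.
    backup_dir = "/root/tool/xtrabackup_data/20240925_174658/2024-09-25_17-46-58"
    archive_path = backup_dir + ".tar.gz"

    with tarfile.open(archive_path, "w:gz") as tar:
        for name in os.listdir(backup_dir):
            # arcname keeps each entry relative to the directory root, as required.
            tar.add(os.path.join(backup_dir, name), arcname=name)
    print("packaged:", archive_path)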

  4. Use the upload_and_register_backup_set.py script to upload backup data and register backup set metadata.

    ## Use the Python script to upload backup data and register backup set metadata.
    python3 $BASE_PATH/upload_and_register_backup_set.py --access_key_id $AK --access_key_secret $SK --endpoint $DBS_API --datasource_id $DATASOURCE_ID --region_code=cn-hangzhou --data_type=FullBackup --file_path=$XTRA_DATA_DIR/$backup_dir.tar.gz --xtrabackup_log_path=$XTRA_DATA_DIR/xtrabackup.log

    The following list describes the parameters:

    • --access_key_id: The AccessKey ID.

    • --access_key_secret: The AccessKey secret.

    • --endpoint: The endpoint of the API operation for Data Disaster Recovery. For more information, see Endpoints.

    • --datasource_id: The ID of the data source in Data Disaster Recovery.

    • --region_code: The code of the region in which the data source resides.

    • --data_type: The type of the backup data. Valid values: FullBackup and LogBackup.

    • --file_path: The path of the full backup data.

    • --xtrabackup_log_path: The path of the xtrabackup.log file that is generated by the xtrabackup backup command.
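
    The Bash script above uploads only a full backup. The following hypothetical sketch shows a log backup upload that reuses the OssUploader class and the date_to_unix_timestamp helper from upload_and_register_backup_set.py. It assumes that the three Python files shown above are in the same directory, that the AccessKey pair is exported in the ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variables, and that the endpoint, time range, and binary log path are replaced with your own values.

    import json
    import os

    from upload_and_register_backup_set import OssUploader, date_to_unix_timestamp

    # Read the AccessKey pair from environment variables instead of hardcoding it.
    uploader = OssUploader(
        os.environ["ALIBABA_CLOUD_ACCESS_KEY_ID"],
        os.environ["ALIBABA_CLOUD_ACCESS_KEY_SECRET"],
        "dbs-api.cn-hangzhou.aliyuncs.com",   # Data Disaster Recovery API endpoint
        "cn-hangzhou",                        # region code of the data source
        "<Your data source ID>",
    )

    # For log backups, extra_meta only needs the start time and end time in milliseconds.
    extra_meta = json.dumps({
        "dataBegin": date_to_unix_timestamp("2024-09-25 00:00:00"),
        "dataEnd": date_to_unix_timestamp("2024-09-25 06:00:00"),
    })

    uploader.upload_and_register_backup_set(
        file_path="/var/lib/mysql/mysql-bin.000123",  # illustrative binary log path
        data_type="LogBackup",
        extra_meta=extra_meta,
    )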

Process the Python script

  1. Use the main function as the entry function.

    Set the data_type parameter to FullBackup or LogBackup and construct the extra_meta information on which metadata registration depends.

    Note
    • BINLOG_FILE: the name of the binary log.

    • dataBegin: the start time of the backup, accurate to milliseconds.

    • dataEnd: the end time of the backup, accurate to milliseconds.

    • consistentTime: the point in time at which the data in the backup set is consistent, accurate to seconds.

    • The following describes the extra_meta format for full backup:

      {
        'BINLOG_FILE':'mysql-bin.001',
        'version':'5.5',
        'dataBegin':17274********,
        'dataEnd':17274********,
        'consistentTime':17274********
      }
    • The following describes the extra_meta format for log backup:

      {
        'dataBegin':17274********,
        'dataEnd':17274********
      }
  2. Use the OssUploader.upload_and_register_backup_set method to upload backup data and register backup set metadata.

    if __name__ == '__main__':
        args = init_command_args()
        uploader = OssUploader(args.access_key_id, args.access_key_secret,
                               args.endpoint, args.region_code, args.datasource_id)
    
        # Construct extraMeta in different methods for full backup and log backup.
        extra_meta = '{}'
        if args.data_type == 'FullBackup':
            obj = {}
            if args.xtrabackup_log_path is not None:
                obj = xtrabackup_log_parser.analyze_slave_status(logpath=args.xtrabackup_log_path)
            elif args.xtrabackup_info_path is not None:
                parser = xtrabackup_info_parser.ExtraMetaParser(file_path=args.xtrabackup_info_path)
                obj = parser.get_extra_meta()
            extra_meta = {'BINLOG_FILE': obj.get('BINLOG_FILE'),
                          'version': obj.get("SERVER_VERSION"),
                          'dataBegin': date_to_unix_timestamp(obj.get("START_TIME")),
                          'dataEnd': date_to_unix_timestamp(obj.get("END_TIME")),
                          'consistentTime': int(date_to_unix_timestamp(obj.get("END_TIME")) / 1000)}
            extra_meta = json.dumps(extra_meta)
    
        elif args.data_type == 'LogBackup':
            obj = {'dataBegin': date_to_unix_timestamp(args.begin_time),
                   'dataEnd': date_to_unix_timestamp(args.end_time)}
            extra_meta = json.dumps(obj)
        print(f"get extra meta json string: {extra_meta}")
    
        # Upload data and register backup set metadata.
        uploader.upload_and_register_backup_set(file_path=args.file_path, data_type=args.data_type, extra_meta=extra_meta)
  3. Use the OssUploader class to upload the backup data. The upload_and_register_backup_set method registers the backup set metadata, uploads the backup file to OSS, and then marks the backup set as uploaded.

    class OssUploader:
    
        def __init__(self, access_key_id, access_key_secret, endpoint, region_code, datasource_id):
            self.access_key_id = access_key_id
            self.access_key_secret = access_key_secret
            self.endpoint = endpoint
            self.region_code = region_code
            self.datasource_id = datasource_id
    
            config = open_api_models.Config(access_key_id, access_key_secret)
            # For more information about endpoints, visit https://api.aliyun.com/product/Rds.
            config.endpoint = endpoint
            self.client = OpenApiClient(config)
    
        """
        Register backup set metadata.
        """
        def configure_backup_set_info(self, req_param):
            params = open_api_models.Params(
                # The operation that you want to call.
                action='ConfigureBackupSetInfo',
                # The version number of the operation.
                version='2021-01-01',
                # The protocol of the operation.
                protocol='HTTPS',
                # The HTTP method of the operation.
                method='POST',
                auth_type='AK',
                style='RPC',
                # The URL of the operation.
                pathname='/',
                # The format of the request body.
                req_body_type='json',
                # The format of the response body.
                body_type='json'
            )
            # runtime options
            runtime = util_models.RuntimeOptions()
            request = open_api_models.OpenApiRequest(
                query=OpenApiUtilClient.query(req_param)
            )
            # The response is of the MAP type, which contains the response body, response headers, and HTTP status code. 
            print(f"ConfigureBackupSetInfo request: {req_param}")
            data = self.client.call_api(params, request, runtime)
    
            print(f"ConfigureBackupSetInfo response: {data}")
            return data['body']['Data']
    
        """
        Obtain the OSS upload information.
        """
        def describe_bak_datasource_storage_access_info(self, req_param):
            params = open_api_models.Params(
                # The operation that you want to call.
                action='DescribeBakDataSourceStorageAccessInfo',
                # The version number of the operation.
                version='2021-01-01',
                # The protocol of the operation.
                protocol='HTTPS',
                # The HTTP method of the operation.
                method='POST',
                auth_type='AK',
                style='RPC',
                # The URL of the operation.
                pathname='/',
                # The format of the request body.
                req_body_type='json',
                # The format of the response body.
                body_type='json'
            )
            # runtime options
            runtime = util_models.RuntimeOptions()
            request = open_api_models.OpenApiRequest(
                query=OpenApiUtilClient.query(req_param)
            )
            # The response is of the MAP type, which contains the response body, response headers, and HTTP status code. 
            print(f"DescribeBakDataSourceStorageAccessInfo request: {req_param}")
            data = self.client.call_api(params, request, runtime)
    
            print(f"DescribeBakDataSourceStorageAccessInfo response: {data}")
            return data['body']['Data']
    
        def _fetch_oss_access_info(self, params):
            info = self.describe_bak_datasource_storage_access_info({
                'RegionId': params['RegionId'],
                'DataSourceId': params['DataSourceId'],
                'RegionCode': params['RegionCode'],
                'BackupType': params['BackupType'],
                'BackupSetId': params['BackupSetId']
            })
            return info['OssAccessInfo']
    
        def upload_and_register_backup_set(self, file_path, data_type, extra_meta):
            filename = os.path.basename(file_path)
            params = {'BackupMode': 'Automated', 'BackupMethod': 'Physical', 'RegionId': self.region_code,
                      'RegionCode': self.region_code, 'DataSourceId': self.datasource_id, 'BackupSetName': filename,
                      'ExtraMeta': extra_meta, 'BackupType': data_type, 'UploadStatus': 'WaitingUpload'}
    
            # Register a backup set for the first time. The ID of the backup set is returned.
            data = self.configure_backup_set_info(params)
            params['BackupSetId'] = data['BackupSetId']
            print(f"------ configure_backup_set_info success: {file_path}, {data_type}, {params['BackupSetId']}, WaitingUpload\n")
    
            # Upload data to OSS.
            oss_info = self._fetch_oss_access_info(params)
            oss_client = create_oss_client(oss_info)
            upload_oss_file(oss_client, file_path, oss_info['ObjectKey'])
            print(f"------ upload_oss_file success: {file_path}, {data_type}, {params['BackupSetId']}\n")
    
            # Mark that the backup set is uploaded.
            params['UploadStatus'] = 'UploadSuccess'
            self.configure_backup_set_info(params)
            print(f"------ configure_backup_set_info success: {file_path}, {data_type}, {params['BackupSetId']}, UploadSuccess\n")