This topic describes how to use the data analysis feature in Data Management (DMS).

Background information

A conventional practice of analyzing data is to use offline SQL scripts to periodically collect data from a source database to a data warehouse. The data warehouse can be a Hadoop or MaxCompute data warehouse. Then, the extract, transform, and load (ETL) process is implemented. This method is time-consuming and requires heavy investment in technology. After the data is analyzed, you must wait until the next day to obtain a data analysis report.

To resolve the preceding issues, DMS provides the data analysis feature. This feature combines the synchronization channels of Data Transmission Service (DTS) and the high-performance database analysis engine of AnalyticDB for MySQL. You can use this feature to create a data warehouse for real-time data synchronization with ease. Data can be synchronized to an AnalyticDB for MySQL instance with seconds of latency. This helps you obtain accurate and up-to-date information about your business and facilitates business analysis and decision-making. This way, better business performance is achieved.
Note
  • DTS enables data transmission by using highly stable channels and ensures high data reliability. For more information, see What is DTS?.
  • AnalyticDB for MySQL is a cloud-based data warehouse for massively parallel processing (MPP). You can use AnalyticDB for MySQL to process petabytes of data in a fast, easy, and flexible way and perform highly concurrent write operations. For more information, see What is AnalyticDB for MySQL?

Billing

  • AnalyticDB for MySQL: If you purchase an AnalyticDB for MySQL instance, you are charged for the new instance. For more information, see Billing methods and prices in AnalyticDB for MySQL documentation.
    Note By default, an AnalyticDB for MySQL V3.0 instance with a three-month subscription is created.
  • DTS: You are charged for data synchronization. For more information, see Pricing in DTS documentation.
    Note If you enable the data analysis feature before March 31, 2021, you can use a data synchronization instance of DTS free of charge for two years.

Prerequisites

  • An ApsaraDB RDS for MySQL instance is purchased by using your Alibaba Cloud account. An ApsaraDB RDS for MySQL database is created in the instance as the source database.
  • You have logged on to the ApsaraDB RDS for MySQL database. You do not need to log on to the database if the database instance is managed in Security Collaboration mode in DMS. For more information about control modes, see Control modes.
    Note If you need to synchronize data to an existing AnalyticDB for MySQL instance, register the instance with DMS and log on to the instance. For more information, see Register an ApsaraDB instance.
  • You are a DMS administrator, database administrator (DBA), or regular user.
    Note If you are a regular user, you must have the export permissions on the ApsaraDB RDS for MySQL database and the change permissions on an existing AnalyticDB for MySQL database. You do not need to obtain the change permissions on a database in a new AnalyticDB for MySQL instance. For more information about permissions, see Permission management.

Procedure

  1. Log on to the DMS console V5.0.
    Note To switch to the previous version of the DMS console, click the 5租户头像 icon in the lower-right corner of the page. For more information, see Switch to the previous version of the DMS console.
  2. Set the Data source and Analytics Engine parameters.
    Data analysis configurations
    Parameter Description
    Data source The source database whose data is to be analyzed.
    Analytics Engine The AnalyticDB for MySQL instance to be used for data analysis. Valid values:
    • Purchase AnalyticDB Instance (8 Cores and 32 GB of Memory): indicates that a new AnalyticDB for MySQL instance is to be purchased. If you set the Analytics Engine parameter to this value, set the following parameters:
      • Region: the region where the AnalyticDB for MySQL instance resides. You cannot change the region after the instance is created. To maximize the access speed, we recommend that you select a region in close proximity to the geographic location of your business.
      • Zone: an independent geographical area in the region. The zones that are deployed in the same region are similar.
      • VPC: a virtual private cloud (VPC), which is an isolated network with higher security. For more information, see Default VPC and default vSwitch.
      • vSwitch: a vSwitch in the VPC.
      Note By default, an AnalyticDB for MySQL V3.0 instance with a three-month subscription is created.
    • Select Existing AnalyticDB Instance: the ID of an existing AnalyticDB for MySQL instance.
      Note Only AnalyticDB for MySQL V3.0 instances are supported.
  3. Optional:Set the parameters in the Advanced Settings section.
    Parameter Description
    Destination Database Name The name of the destination database. Valid values:
    • Automatically Create Database with the Same Name: indicates that a database with the same name as the source database is to be created in the destination database instance. This is the default value.
    • Rename Database: the name that you specify for the destination database.
    Synchronization Mode The method to be used to synchronize data. You can use multiple methods at the same time. Valid values:
    • Initial Schema Synchronization: indicates that table schemas in the source database are synchronized to the destination database. If the name of a table in the destination database is the same as that in the source database, an error message is returned. This is the default value.
    • Full Data: indicates that all of the data in the source database is synchronized to the destination database. This is also the default value.
    • Incremental Data: indicates that the incremental data in the source database is synchronized to the destination database. By default, this option is selected and cannot be cleared.
    Synchronization Range The tables to be synchronized in the source database. Valid values:
    • Full Database: indicates that all of the tables in the source database are synchronized to the destination database. This is the default value.
    • Specific Tables: indicates that one or more specified tables are synchronized to the destination database.
  4. Click Submit.
    DMS checks whether you have required permissions on the databases.
  5. Click Submit.
    After the ticket is approved, DMS runs the synchronization task based on the configurations. You can view the progress of the synchronization task in the Data Synchronization step.
    Note After you purchase an AnalyticDB for MySQL instance, the database instance is automatically registered with DMS. In addition, a default account and its password are created for you to log on to the database instance in DMS. To view the information about the new database instance or reset the password, go to the AnalyticDB for MySQL console.

What to do next

You can perform the following operations in the Data analysis step on the details tab of the data analysis ticket:

  • Click SQL Console to analyze data in the data warehouse in real time. For more information, see Overview.
  • Click Task Orchestration to implement the ETL process by using SQL statements. For more information, see Basic concepts.
  • Click Data Visualization to create reports based on real-time data. For more information, see Overview.
  • Click Data Service to create API operations based on real-time data. For more information, see Overview.
Data analysis follow-ups