This topic describes how to use the data analysis feature in Data Management (DMS).

Background information

To perform data analysis, a conventional practice is to use offline SQL scripts to periodically collect data from a source database to a data warehouse. The data warehouse can be a Hadoop or MaxCompute data warehouse. Then, the extract, transform, and load (ETL) process is implemented. This method is time-consuming and incurs high technical costs. After the data analysis is performed, you must wait until the next day to obtain a data analysis report.

To resolve the preceding issues, DMS provides the data analysis feature. This feature combines the synchronization channels of Data Transmission Service (DTS) and the high-performance database analysis engine of AnalyticDB for MySQL. You can use this feature to create a data warehouse for real-time data synchronization with ease. Data can be synchronized to an AnalyticDB for MySQL instance at a second-level latency. This helps you accurately track business in real time and facilitates business analysis and decision-making. This way, better business performance is achieved.
Note
  • DTS enables data transmission by using highly stable channels and ensures high data reliability. For more information, see What is DTS?.
  • AnalyticDB for MySQL is a massively parallel processing (MPP)-based data warehouse that is managed on the cloud and can be used to process petabytes of data. You can use AnalyticDB for MySQL to process a huge amount of data in a fast, easy, and flexible way and perform highly concurrent write operations. For more information, see What is AnalyticDB for MySQL?

Billing information

  • AnalyticDB for MySQL: If you do not use an existing AnalyticDB for MySQL instance, you are charged for a new AnalyticDB for MySQL instance. For more information, see Billing methods and prices in AnalyticDB for MySQL documentation.
    Note By default, an AnalyticDB for MySQL V3.0 instance with a three-month subscription is created. If you purchase an AnalyticDB for MySQL instance for the first time, you need only to pay CNY 1 for using the instance for three months.
  • DTS: You are charged for data synchronization. For more information, see Pricing in DTS documentation.
    Note If you enable the data analysis feature before March 31, 2021, you can use a DTS data synchronization instance free of charge for two years.

Prerequisites

  • An ApsaraDB RDS for MySQL database is created as the source database. The ApsaraDB RDS for MySQL instance is purchased by using your Alibaba Cloud account.
  • You have logged on to the ApsaraDB RDS for MySQL database. You do not need to log on to the database if the database instance is managed in Security Collaboration mode in DMS. For more information about control modes, see Control modes.
    Note If you need to synchronize data to an existing AnalyticDB for MySQL instance, register the instance with DMS and log on to the instance. For more information, see Register an ApsaraDB instance.
  • You are a DMS administrator, database administrator (DBA), or regular user.
    Note If you are a regular user, you must have the export permission on the ApsaraDB RDS for MySQL database and the change permission on an existing AnalyticDB for MySQL database. You do not need to obtain the change permission on a database in a new AnalyticDB for MySQL instance. For more information about permissions, see Permission management.

Procedure

  1. Log on to the DMS console.
  2. In the top navigation bar, move the pointer over the More icon and choose Data Plans > Data Analysis.
    Note You can also click the Data analysis icon icon on the SQLConsole tab of the source database.
  3. Set the Data source and Analytics Engine parameters.
    Data analysis configurations
    Parameter Description
    Data source The source database to be analyzed.
    Analytics Engine The AnalyticDB for MySQL engine to be used for data analysis. Valid values:
    • Purchase AnalyticDB Instance (8 Cores and 32 GB of Memory): indicates that a new AnalyticDB for MySQL instance is to be purchased. If you set the Analytics Engine parameter to this value, set the following parameters:
      • Region: the region where the cluster resides. You cannot change the region after the cluster is created. To maximize the access speed, we recommend that you select a region in close proximity to the geographic location of your business.
      • Zone: an independent geographical area in the region. The zones that are deployed in the same region are similar.
      • VPC: the virtual private cloud (VPC) for data analysis, which is an isolated network with higher security. For more information, see Default VPC and default vSwitch.
      • vSwitch: a vSwitch in the VPC.
      Note By default, an AnalyticDB for MySQL V3.0 instance with a three-month subscription is created. If you purchase an AnalyticDB for MySQL instance for the first time, you need only to pay CNY 1 for using the instance for three months.
    • Select Existing AnalyticDB Instance: the ID of an existing AnalyticDB for MySQL instance.
      Note Only AnalyticDB for MySQL V3.0 instances are supported.
  4. Optional:Set the parameters in the Advanced Settings section.
    Parameter Description
    Destination Database Name The name of the destination database. Valid values:
    • Automatically Create Database with the Same Name: indicates that a database with the same name as the source database is to be created in the destination database instance. This value is the default value.
    • Rename Database: the name that you specify for the destination database.
    Synchronization Mode The method to be used to synchronize data. You can use multiple methods at the same time. Valid values:
    • Initial Schema Synchronization: indicates that table schemas in the source database are synchronized to the destination database. If the name of a table in the destination database is the same as that in the source database, an error message is returned. This value is the default value.
    • Full Data: indicates that all of the data in the source database is synchronized to the destination database. This value is also the default value.
    • Incremental Data: indicates that the incremental data in the source database is synchronized to the destination database. By default, this option is selected and cannot be cleared.
    Synchronization Range The tables to be synchronized in the source database. Valid values:
    • Full Database: indicates that all of the tables in the source database are synchronized to the destination database. This value is the default value.
    • Specific Tables: indicates that one or more specified tables are synchronized to the destination database.
  5. Click Submit.
    DMS checks whether you have required permissions on the databases.
  6. Submit the ticket for approval.
    After the ticket is approved, DMS runs the synchronization task based on the configurations. You can view the progress of the synchronization task in the Data Synchronization step.
    Note After you purchase an AnalyticDB for MySQL instance, the database instance is automatically registered with DMS. In addition, a default account and its password are created for you to log on to the database instance in DMS. To view the information about the new database instance or reset the password, go to the AnalyticDB for MySQL console.

What to do next

You can perform the following operations in the Data Analysis step on the details tab of the data analysis ticket:

  • Click SQL Console to analyze data in the data warehouse in real time. For more information, see Overview.
  • Click Task Orchestration to implement the ETL process by using SQL statements. For more information, see Basic concepts.
  • Click Data Visualization to create reports based on real-time data. For more information, see Overview.
  • Click Data Service to create API operations based on real-time data. For more information, see Overview.
Data analysis follow-ups