All Products
Search
Document Center

Report analysis

Last Updated: Oct 09, 2020

This topic describes how to use the task orchestration feature of Data Management Service (DMS) to generate data analysis reports.

Background

Business growth of enterprises brings increasing report requirements. You may need to use reports in the following scenarios:

  • Report the business status and performance to leaders.
  • Analyze business scenarios to find growth points.
  • Collect business statistics, such as daily revenue and daily waybill amount.

Big data platforms have been developed for many years. Enterprises usually use big data platforms to generate data reports. Alternatively, you can use the task orchestration feature of DMS.

We recommend that you use the task orchestration feature of DMS to generate reports because it has the following advantages:

  • Unified management for business and analysis databases. You can manage both business databases and analysis databases in DMS without the need to go to other big data platforms.
  • Multiple database engines for data analysis. You can select the engine of the analysis database based on the size of business data. For example, you can select ApsaraDB for RDS to analyze gigabytes of data and select AnalyticDB to analyze petabytes of data. DMS supports multiple database engines that can meet most data analysis requirements. For more information about the engines supported by DMS, see Instance management.
  • Centralized permission management. DMS manages permissions in a centralized manner to guarantee data security.

Procedure

To use the task orchestration feature of DMS to generate data analysis reports, follow these steps:

  • Import data from the business database to the analysis database.
  • Analyze data in the analysis database.
  • Return analysis results to the business database.

This process requires the following DMS tasks:

  • Single Instance SQL
  • Cross Database SQL

Preparations

  • Prepare a business database that stores data to be analyzed. This topic uses the example data provided by TPC Benchmark™H.
  • Obtain the export permission on the business database, the change permission on the analysis database, and the permission to query the data dictionary. You can apply for the permissions in the Permission module of the DMS console.
  • Create a DMS task flow.

Import data from the business database to the analysis database

You can create a Cross Database SQL task to import data from the business database to the analysis database.

  • Create a Cross Database SQL task named table_sync.
    table_sync

  • Write and execute an SQL statement to import data from the business database to the analysis database.
    7

  • Set variables for the task. The following figure shows the order_date variable in the preceding SQL statement.
    Variable

  • After you set variables as needed, preview the SQL statement to check the logic.

Analyze data in the analysis database

This example aims to obtain the following statistics:

  1. Query the number of customers who place orders, the number of orders, and the total amount of orders per day.
  2. Perform a join query with the lineitem table to query the number of parts, the total discount amount, and the total tax amount of orders per day.

Query the number of customers who place orders, the number of orders, and the total amount of orders per day

  • Create a Single Instance SQL task named daily_orders.
    daily_orders

  • Select the analysis database. Write SQL statements to create a table named daily_orders_summary for storing the analysis results and design the statistic logic.
    8

  • Set variables and save the SQL statements.

Perform a join query with the lineitem table

Perform a join query with the lineitem table to query the number of parts, the total discount amount, and the total tax amount of orders per day.

  • Import data from the lineitem table to the analysis database. For more information, see the “Import data from the business database to the analysis database” section.
  • Write SQL statements and set variables as required to design the statistic logic. For more information, see the statistic logic of the daily_orders task.

Return analysis results to the business database

Create a Cross Database SQL task to return analysis results to the business database. The operations are similar to those of importing data from the business database to the analysis database.

Connect tasks in the DAG

DAG

Set scheduling parameters

  • Click a blank area on the canvas. The property tabs of the task flow appear in the right-side pane.
  • Click the Scheduling tab.
  • Turn on the scheduling switch and set scheduling parameters as required.
    Scheduling tab

Run the task flow

  • Before you run the task flow, you can test the task flow by using the try run feature.
  • To learn about the running status of the task flow, go to the Operation Center tab.