In most cases, Alibaba Cloud users run their business system database on ApsaraDB RDS or on a self-managed database hosted on an Elastic Compute Service (ECS) instance. The volume of data stored in this database grows as business data grows, but the computing capabilities of ApsaraDB RDS or a self-managed database hosted on an ECS instance are limited. If you build a data warehouse directly on such a database, the data warehouse workload competes with online business for the same computing resources, and online business is affected. If you instead use a self-managed open source big data ecosystem such as Hive or Spark, you need professional big data engineers for operation and maintenance, which is more complex and more expensive than maintaining a MySQL database.

Solution

The one-click data warehousing feature of Data Lake Analytics (DLA) allows you to configure a data source, such as ApsaraDB RDS or a self-managed database hosted on an ECS instance, and to use Object Storage Service (OSS) as the destination data warehouse. DLA automatically synchronizes data from the data source to OSS at the time that you specify. DLA then creates a schema in the DLA console and OSS that matches the table schema of the data source, and runs analysis on the data in OSS. Because the analysis reads from OSS rather than from the data source, online business at the data source end is not affected.
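
The flow described above (copy the tables, recreate the same schema at the destination, then analyze the copy) can be illustrated with a minimal local sketch. This is not the DLA API: two in-memory SQLite databases stand in for the data source and for the OSS-backed data warehouse, and `sync_tables`, the table names, and the sample rows are all hypothetical names chosen for illustration.

```python
import sqlite3


def sync_tables(source, warehouse):
    """Copy every user table from `source` into `warehouse`.

    Hypothetical helper for illustration only; it is not part of DLA.
    Each table is recreated with the same schema as in the source.
    """
    copied = []
    tables = source.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for name, create_sql in tables:
        # Recreate the table with the schema taken from the source.
        warehouse.execute(f'DROP TABLE IF EXISTS "{name}"')
        warehouse.execute(create_sql)
        rows = source.execute(f'SELECT * FROM "{name}"').fetchall()
        if rows:
            placeholders = ", ".join("?" * len(rows[0]))
            warehouse.executemany(
                f'INSERT INTO "{name}" VALUES ({placeholders})', rows
            )
        copied.append(name)
    warehouse.commit()
    return copied


# Stand-in for the online business database (assumed sample data).
source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 7.5);
""")

# Stand-in for the destination data warehouse.
warehouse = sqlite3.connect(":memory:")
copied = sync_tables(source, warehouse)

# The analytical join runs against the copy, not against the source.
totals = warehouse.execute("""
SELECT c.name, SUM(o.amount)
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name
""").fetchall()
print(copied, totals)
```

In this sketch the join never touches `source` after the copy, which mirrors why analysis on the synchronized data does not consume the computing resources of the online database.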

Benefits

The one-click data warehousing feature provides the following benefits:

  • Synchronizes data from thousands of tables in a data source such as ApsaraDB RDS or a self-managed database hosted on an ECS instance with one click. No additional configuration is required.
  • Provides services based on the serverless architecture. No instance maintenance is required and no maintenance fee is incurred.
  • Stores the synchronized data of the data source in OSS. Because data warehousing works on the copy in OSS, business at the data source end is not affected.
  • Allows you to configure one-click data warehousing tasks in the DLA console. You can specify the time for data transfer.

    You can use your data warehouse after you configure a one-click data warehousing task. For more information about how to use a data warehouse, see the documentation on how to use ApsaraDB RDS for MySQL.

  • Provides improved computing capabilities. DLA offers large memory and concurrent computation, which support complex multi-table join operations and other operations that are related to data warehousing.