Data Management (DMS) provides the Data Warehouse Development feature. When you use this feature, a data warehouse project is the basic unit where data is developed and managed. A data warehouse project must be created based on a data warehouse engine and bound to a database.
Data warehouse engine
A data warehouse engine is a database instance that is added to DMS. In DMS, only the following types of instances can be used as data warehouse engines: AnalyticDB for MySQL V3.0, AnalyticDB for PostgreSQL, and ApsaraDB RDS for MySQL. An instance to be used as a data warehouse engine must be in Security Collaboration mode. For more information, see Control modes.
The data integration feature supports data transmission between different databases and allows you to synchronize data to a data warehouse for development. Data synchronization is divided into the following two types:
- Real-time synchronization
The data synchronization feature of Alibaba Cloud Data Transmission Service (DTS) is used to synchronize data to a database in real time. This is the basis for building a real-time data warehouse. For more information, see Overview of data synchronization scenarios.
- Batch synchronization
The data migration feature of Alibaba Cloud DTS is used to migrate a large amount of data to a database at a time. You can also configure scheduling properties to synchronize incremental data periodically, for example, on a daily basis.
Table in a data warehouse
- Internal table
Internal tables are data tables in a data warehouse engine. Most of the data to be processed or transmitted in the process of data warehouse development is stored in internal tables. You can associate internal tables with themes that reflect business purposes. You can also put different internal tables at different layers to implement hierarchical management in the data warehouse.
- External table
External tables are tables that you want to use in a data warehouse but are not in the data warehouse engine. When you develop a data warehouse, you may need to use data in a table that is not in the data warehouse engine. In this case, you must define the table as an external table in the data warehouse. This way, you can import the data from the external table to the data warehouse. You can also perform join queries to query data in both the external table and internal tables.
Themes are used to identify the business purposes of internal tables. The Data Warehouse Development feature of DMS supports two-level themes.
By default, a data warehouse is divided into three layers. You can modify the default layers as needed. Based on the development logic, the following default layers are provided:
- Operational data store (ODS): The ODS layer stores raw data in the data warehouse. The data structure is consistent with that in the source system. The ODS layer serves as the data staging area of the data warehouse.
- Common data model (CDM): The CDM layer is also called the general data model layer. It consists of the dimension data (DIM), data warehouse detail (DWD), and data warehouse summary (DWS) layers. The CDM layer processes and integrates the data of the ODS layer to define conformed dimensions, create reusable detailed fact tables for data analysis and statistics collection, and aggregate common metrics.
- Application data store (ADS): The ADS layer stores personalized statistical metrics of data products. It processes the data of the CDM and ODS layers.
Folders are used to manage internal tables and task nodes.
Other terms include task flow, custom task variable, and Operation Centre. For more information, see Basic concepts.