Data layering defines the structure of data models and divides the physical layer based on a comprehensive analysis of business scenarios, data scenarios, and system scenarios. Each data layer serves a specific purpose, which helps you organize, manage, and maintain data efficiently. This topic describes how to create and manage data layers.

Background information

A data warehouse is a collection of various types of data, such as logs, database data, text data, and external data. During data modeling, data layers, data domains, and business processes determine the logical structure of a data warehouse. Data layers ensure that disorganized raw data is cleansed and filtered before it enters the data warehouse. This optimizes data queries and improves the efficiency of data retrieval, calculation, and analysis. Data layers also associate data of different dimensions to facilitate multidimensional analysis and decision-making.

Plan data layers

You can plan data layers for data models based on your business requirements.

By default, a data warehouse is divided into the following layers: operational data store (ODS), dimension (DIM), data warehouse detail (DWD), data warehouse summary (DWS), and application data service (ADS).
  • ODS
    This layer receives and processes raw data that needs to be stored in a data warehouse. The structure of a data table at the ODS layer is consistent with the structure of the source table that stores the raw data. The ODS layer serves as the staging area for the data warehouse. The following operations are performed on the raw data at the ODS layer:
    • Synchronize incremental or full structured raw data to the data warehouse.
    • Structure unstructured raw data, such as logs, and store the results in MaxCompute.
    • Record changes to raw data or cleanse raw data based on your business requirements.
    The name of a data table at the ODS layer must start with ods and the time-to-live (TTL) of the table must be 366 days.
  • DWD

    At this layer, data models are built based on the business activities that an enterprise performs. You can create fact tables at the finest granularity based on the characteristics of a specific business activity. Based on how the enterprise uses its data, you can duplicate some key dimension attribute fields in fact tables to create wide tables. This minimizes the number of associations between fact tables and dimension tables and makes fact tables easier to use.

  • DWS

    At this layer, data models are built based on the objects of a specific theme to be analyzed. You can create a general aggregate table based on the metric requirements of upper-layer applications and products.

    For example, general dimensions such as time, IP address, and ID can be abstracted at the ODS layer based on a preliminary classification and summary of user behavior. You can use these dimensions to obtain statistics, such as the number of products that users purchase from different logon IP addresses in each time period. At the DWS layer, you can add multi-granularity aggregate tables on top of general aggregate tables to improve calculation efficiency. For example, calculating user behavior over the last 7, 30, or 90 days in advance saves a lot of calculation time.

  • ADS

    This layer is used to store product-specific metric data and generate various reports. For example, the ADS layer can store statistics on the sales volume and the ranking of each type of ball sports goods in the Hangzhou region from June 9 to June 19 for an e-commerce enterprise.

  • DIM

    At this layer, data models are built based on dimensions. At the DIM layer, you can define dimensions, determine the primary keys, add dimension attributes, and associate different dimensions. This ensures data consistency in data analysis and reduces the risk of inconsistent data calculation specifications and algorithms.
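
The multi-granularity aggregation described for the DWS layer can be sketched as follows. This is a minimal illustration in plain Python, not DataWorks or MaxCompute code. The `window_totals` function, the `daily_purchases` data, and the record layout are hypothetical and exist only to show the idea of summarizing one metric over trailing windows of 7, 30, and 90 days.

```python
from datetime import date, timedelta


def window_totals(daily_counts, as_of, windows=(7, 30, 90)):
    """Sum per-day counts over trailing windows that end on `as_of` (inclusive).

    daily_counts: dict that maps a date to a count for one user (hypothetical shape).
    Returns a dict that maps each window length in days to its total.
    """
    totals = {}
    for days in windows:
        start = as_of - timedelta(days=days - 1)
        totals[days] = sum(
            count for day, count in daily_counts.items() if start <= day <= as_of
        )
    return totals


# Hypothetical daily purchase counts for one user.
as_of = date(2024, 6, 30)
daily_purchases = {
    as_of: 2,
    as_of - timedelta(days=6): 1,    # inside the 7-day window
    as_of - timedelta(days=7): 4,    # outside 7 days, inside 30 days
    as_of - timedelta(days=45): 3,   # only inside the 90-day window
    as_of - timedelta(days=120): 5,  # outside every window
}

totals = window_totals(daily_purchases, as_of)
print(totals)  # {7: 3, 30: 7, 90: 10}
```

In a real warehouse, these totals would typically be written to an aggregate table once per day so that upper-layer applications read the stored result instead of rescanning detail data.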

Create a data layer

By default, the system creates the following layers for you: ODS, DIM, DWD, DWS, and ADS. These layers can meet your requirements in most scenarios. If you have special requirements, you can perform the following steps to create a data layer.

Example of a special requirement: You want to abstract a temporary (TMP) layer to store temporary tables. You can also specify standards and verification rules, such as table naming conventions, for each layer. This ensures that tables created at a layer conform to the rules specified for that layer.

  1. Go to the Data Modeling page.
  2. Go to the Data Layer page.
    In the top navigation bar of the Data Modeling page, click Data Warehouse Planning. In the left-side navigation pane of the Data Warehouse Planning page, click Data Layer.
  3. Create a data layer.
    1. Click Create. In the Create Data Layer dialog box, configure basic information for the data layer.
      • Abbreviation: The abbreviation of the data layer. The abbreviation uniquely identifies a data layer. The abbreviation can be up to 128 characters in length, and can contain letters, digits, and underscores (_). It must start with a letter.
      • Name: The name of the data layer. The name can be up to 2,048 characters in length, and can contain letters, digits, underscores (_), and ampersands (&). It must start with a letter or a digit.
      • Display Name: The display name of the data layer. The display name can be up to 2,048 characters in length, and can contain letters, digits, underscores (_), ampersands (&), and parentheses (). It must start with a letter or a digit.
      • Owner: The owner of the data layer. The owner of the data layer is the owner of the workspace.
      • Description: The description of the data layer. You can select a data layer to store specific business data based on the description of the data layer. The description can be up to 2,048 characters in length.

  4. Click Confirm.
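
The parameter rules above can be checked before you fill in the dialog box. The following is a minimal sketch, not part of the product: the `validate_layer` function and the field keys are hypothetical, and "letters" is assumed to mean ASCII letters, although the console may accept a wider range.

```python
import re

# Patterns derived from the parameter rules above. "Letters" is assumed to mean
# ASCII letters; this is an assumption, not documented behavior.
RULES = {
    "abbreviation": (re.compile(r"[A-Za-z][A-Za-z0-9_]*"), 128),
    "name": (re.compile(r"[A-Za-z0-9][A-Za-z0-9_&]*"), 2048),
    "display_name": (re.compile(r"[A-Za-z0-9][A-Za-z0-9_&()]*"), 2048),
}
MAX_DESCRIPTION_LENGTH = 2048


def validate_layer(fields):
    """Return a list of error messages; an empty list means every check passed."""
    errors = []
    for key, (pattern, max_len) in RULES.items():
        value = fields.get(key, "")
        if len(value) > max_len:
            errors.append(f"{key}: longer than {max_len} characters")
        elif not pattern.fullmatch(value):
            errors.append(f"{key}: empty, or has an invalid character or first character")
    if len(fields.get("description", "")) > MAX_DESCRIPTION_LENGTH:
        errors.append(f"description: longer than {MAX_DESCRIPTION_LENGTH} characters")
    return errors


# A hypothetical TMP layer that satisfies every rule.
ok = validate_layer({
    "abbreviation": "tmp",
    "name": "TMP",
    "display_name": "Temporary_layer(TMP)",
    "description": "Stores temporary tables.",
})

# The abbreviation starts with a digit and the name contains a space.
bad = validate_layer({
    "abbreviation": "1tmp",
    "name": "TMP layer",
    "display_name": "TMP",
})

print(ok)
print(bad)
```

The function collects all violations instead of stopping at the first one, so a single pass reports every field that needs to be corrected.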

What to do next

After the data layer is created, you must add a data layer checker to specify the naming conventions of tables at the data layer. For more information, see Configure a data layer checker.
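
Conceptually, a naming convention check compares each table name with the rule of its layer. The following minimal sketch illustrates that idea only; it is not the data layer checker and does not call any DataWorks API. Only the `ods` prefix is stated in this topic. The other prefixes, the `follows_convention` and `find_violations` functions, and the sample table names are hypothetical.

```python
# Prefix per layer. Only "ods" is stated in this topic; the other prefixes are
# assumptions that reuse each layer's abbreviation in lowercase.
LAYER_PREFIXES = {
    "ODS": "ods",
    "DIM": "dim",
    "DWD": "dwd",
    "DWS": "dws",
    "ADS": "ads",
    "TMP": "tmp",
}


def follows_convention(table_name, layer):
    """Check whether a table name starts with the prefix assumed for its layer."""
    return table_name.startswith(LAYER_PREFIXES[layer])


def find_violations(tables):
    """Return the names of tables that do not follow their layer's prefix rule.

    tables: iterable of (table_name, layer) pairs.
    """
    return [name for name, layer in tables if not follows_convention(name, layer)]


# Hypothetical tables and the layers they are assigned to.
tables = [
    ("ods_order_log", "ODS"),
    ("order_detail", "DWD"),    # missing the assumed "dwd" prefix
    ("tmp_user_stage", "TMP"),
]

violations = find_violations(tables)
print(violations)  # ['order_detail']
```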