Data layering defines the structure of a data model and divides the physical layer into data layers based on a comprehensive analysis of business scenarios, data scenarios, and system scenarios. Each data layer serves a specific purpose, which helps you organize, manage, and maintain data efficiently. This topic describes how to plan and create data layers.

Background information

A data warehouse is a collection of various types of data, such as logs, database data, text data, and external data. In data modeling, the logical structure of a data warehouse is built based on data layers, data domains, business processes, data marts, and subject areas. Data domains and business processes are used at a common layer to build data models for the common layer. Data marts and subject areas are used at an application layer to build data models for specific business applications.

Before raw data is stored in a data warehouse, the raw data is cleansed and filtered at a data layer. This helps optimize the data query process and improves the efficiency of obtaining, calculating, and analyzing data. Data layers associate data of different dimensions for multidimensional analysis and decision-making.

Plan data layers

You must design and plan data layers based on your business requirements and a comprehensive analysis of business scenarios, data scenarios, and system scenarios.

By default, a data warehouse is divided into the following layers: operational data store (ODS), dimension (DIM), data warehouse detail (DWD), data warehouse summary (DWS), and application data service (ADS).
  • ODS
    This layer is used to receive and process raw data that needs to be stored in a data warehouse. The structure of a data table at the ODS layer is the same as the structure of a data table in which the raw data is stored. The ODS layer serves as the staging area for the data warehouse. The following operations are performed on the raw data at the ODS layer:
    • Synchronize incremental or full structured raw data to the data warehouse.
    • Structure unstructured raw data, such as logs, and store the outputs in MaxCompute.
    • Record changes in raw data or cleanse raw data based on your business requirements.
    The name of a data table at the ODS layer must start with ods and the time to live (TTL) of the table must be 366 days.
  • DWD

    At this layer, data models are built based on the business activities of an enterprise. You can create a fact table at the finest granularity level based on the characteristics of a specific business activity. You can duplicate some key dimension attribute fields in fact tables and create wide tables based on the data usage habits of the enterprise. This minimizes the need to associate fact tables with dimension tables and improves the usability of fact tables.

  • DWS

    At this layer, data models are built based on specific subject objects that you want to analyze. You can create a general aggregate table based on the metric requirements of upper-layer applications and products.

    Some general dimensions can be abstracted at the ODS layer based on preliminary classification and summary of user behavior. Examples of such dimensions are time, IP address, and ID. You can use these dimensions to obtain statistical data, such as the number of products purchased by users at different logon IP addresses in each time period. At the DWS layer, you can add multi-granularity aggregate tables on top of general aggregate tables to improve calculation efficiency. For example, aggregate tables at intervals of 7 days, 30 days, or 90 days can save considerable computing time when you evaluate user behavior.

  • ADS

    This layer is used to store the metric data of products and generate various reports. For example, the ADS layer can be used by an e-commerce enterprise to store statistical information about the sales volume and the ranking of each type of ball sports goods in the Hangzhou region from June 9 to June 19.

  • DIM

    At this layer, data models are built based on dimensions. At the DIM layer, you can define dimensions, determine the primary keys, add dimension attributes, and associate different dimensions. This ensures data consistency in data analysis and mitigates the risks of inconsistent data calculation specifications and algorithms.
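
As a rough illustration of the multi-granularity aggregate tables described for the DWS layer above, the following Python sketch rolls a daily aggregate up into trailing 7-day, 30-day, and 90-day windows. The row layout (user_id, day, purchases), the sample data, and the rollup function are assumptions made for this example. They are not DataWorks or MaxCompute APIs.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily aggregate rows: (user_id, day, purchases).
DAILY = [
    ("u1", date(2024, 6, 19), 2),
    ("u1", date(2024, 6, 15), 1),
    ("u1", date(2024, 5, 25), 4),
    ("u2", date(2024, 6, 18), 3),
    ("u2", date(2024, 3, 1), 5),
]


def rollup(daily, as_of, windows=(7, 30, 90)):
    """Sum purchases per user over each trailing window that ends on as_of."""
    totals = defaultdict(lambda: {w: 0 for w in windows})
    for user_id, day, purchases in daily:
        age = (as_of - day).days
        if age < 0:
            continue  # ignore rows dated after as_of
        for w in windows:
            if age < w:  # a w-day window covers ages 0 .. w-1
                totals[user_id][w] += purchases
    return {user: dict(by_window) for user, by_window in totals.items()}


result = rollup(DAILY, as_of=date(2024, 6, 19))
```

Computing all windows in a single pass over the daily rows is the point of the sketch: upper-layer queries can read the precomputed totals instead of rescanning detail data for each interval.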

The following figures show two display modes for data layers: Tiled Display and Hierarchy Display.
• Tiled Display: Data layers are displayed in tiled mode.
• Hierarchy Display: DataWorks provides you with the Data Import Layer, Common Layer, Application Layer, and Others data layer categories. You can create a data layer and add the data layer to a data layer category.
  • Data Import Layer: A data layer of this category is used to ingest basic data such as database data, logs, and messages. You can store only ODS tables at a data layer of the data import layer category.
  • Common Layer: A data layer of this category is used to process and integrate common data to define unified dimensions, create reusable detailed fact tables for data analysis and statistics collection, and aggregate common metrics. You can store fact tables, dimension tables, and aggregate tables at a data layer of the common layer category.
  • Application Layer: A data layer of this category is used to reconstruct the data that is processed and integrated at a data layer of the common layer category based on your business requirements. You can store application tables and dimension tables at a data layer of the application layer category.
  • Others: Data layers that are automatically created by the system and data layers that are customized by users are added to this data layer category.
    • System data layer: This type of data layer is added to the Others data layer category. If you want to change the data layer category to which a system data layer belongs from Others to another data layer category, contact Alibaba Cloud technical support.
    • Custom data layer: This type of data layer is added to the Others data layer category. If you want to change the data layer category to which a custom data layer belongs from Others to another data layer category, modify the data layer.
    Note The supported model types vary based on the data layer category. Before you add a model table to a data layer of your desired data layer category, make sure that you change the data layer category to which the data layer belongs from Others to your desired data layer category that supports the model type of the table you create.
Note You can change the value of the Data Layer parameter only once after you configure this parameter. Select a suitable data layer category based on your business requirements.
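
The relationship between data layer categories and the model tables they can store can be summarized as a lookup table. The following Python sketch only restates the rules listed above. The names ALLOWED_MODEL_TYPES, can_store, and categories_for are hypothetical, and the empty set for Others is an assumption, because the text does not list any model types for that category.

```python
# Model types that each data layer category can store, as listed above.
ALLOWED_MODEL_TYPES = {
    "Data Import Layer": {"ODS Table"},
    "Common Layer": {"Fact Table", "Dimension Table", "Aggregate Table"},
    "Application Layer": {"Application Table", "Dimension Table"},
    # Assumption: no model types are listed for Others, so none are allowed here.
    "Others": set(),
}


def can_store(category: str, model_type: str) -> bool:
    """Return True if a layer of the given category can store the model type."""
    return model_type in ALLOWED_MODEL_TYPES.get(category, set())


def categories_for(model_type: str) -> list[str]:
    """Return the categories that accept the model type, in alphabetical order."""
    return sorted(
        category
        for category, types in ALLOWED_MODEL_TYPES.items()
        if model_type in types
    )
```

For example, categories_for("Dimension Table") returns both Application Layer and Common Layer, which matches the fact that dimension tables are the only model type shared by two categories.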

Create a data layer

By default, the system creates the following layers for you: ODS, DIM, DWD, DWS, and ADS. These layers can meet your business requirements in most scenarios. If you have special requirements, you can perform the steps described in this section to create a data layer.

Sample scenario for a special requirement: Define a temporary (TMP) layer and store temporary tables at this layer. Specify standards and verification rules for this layer, such as table naming conventions and TTL. This ensures that tables created at this layer conform to the standards and rules specified for the layer.
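
To make the idea of per-layer standards concrete, the following Python sketch checks a table name and TTL against a rule for its layer. The ODS rule (names start with ods, TTL of 366 days) comes from the ODS description earlier in this topic. The TMP prefix and TTL are hypothetical values chosen only for illustration. In DataWorks, such rules are configured with a data layer checker. This sketch does not use any DataWorks API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerRule:
    prefix: str    # required table-name prefix
    ttl_days: int  # required time to live, in days


RULES = {
    # From the ODS description: names start with "ods" and the TTL is 366 days.
    "ODS": LayerRule(prefix="ods", ttl_days=366),
    # Hypothetical values for a custom TMP layer; replace them with your own.
    "TMP": LayerRule(prefix="tmp", ttl_days=7),
}


def violations(layer: str, table_name: str, ttl_days: int) -> list[str]:
    """Return the rule violations for a table, or an empty list if it conforms."""
    rule = RULES[layer]
    problems = []
    if not table_name.startswith(rule.prefix):
        problems.append(f"name must start with '{rule.prefix}'")
    if ttl_days != rule.ttl_days:
        problems.append(f"TTL must be {rule.ttl_days} days")
    return problems
```

A conforming table such as violations("ODS", "ods_orders", 366) yields an empty list, whereas a table that breaks both rules yields one message per rule.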

  1. Go to the Data Modeling page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where the workspace that you want to manage resides. Find the workspace and click Data Development in the Actions column.
    4. In the upper-left corner of the DataStudio page, click the icon and choose All Products > Data Modeling > Data Warehouse Planning. The Data Layer page appears.
  2. Create a data layer.
    1. Click Create. In the Create Data Layer dialog box, configure basic parameters for the data layer.
      • Abbreviation: The abbreviation of the name of the data layer. An abbreviation uniquely identifies a data layer. The abbreviation can be up to 128 characters in length, and can contain letters, digits, and underscores (_). It must start with a letter.
      • Name: The name of the data layer. The name can be up to 2,048 characters in length, and can contain letters, digits, underscores (_), and ampersands (&). It must start with a letter or a digit.
      • Display Name: The display name of the data layer. The display name can be up to 2,048 characters in length, and can contain letters, digits, underscores (_), ampersands (&), and parentheses (). It must start with a letter or a digit.
      • Owner: The owner of the data layer. The default value is the current logon account.
      • Data Layer: The value of this parameter determines the value of the Model Type parameter. This parameter specifies the data layer category to which the data layer belongs.
        • Data Import Layer: A data layer of this category is used to ingest basic data such as database data, logs, and messages. You can store only ODS tables at a data layer of the data import layer category.
        • Common Layer: A data layer of this category is used to process and integrate common data to define unified dimensions, create reusable detailed fact tables for data analysis and statistics collection, and aggregate common metrics. You can store fact tables, dimension tables, and aggregate tables at a data layer of the common layer category.
        • Application Layer: A data layer of this category is used to reconstruct the data that is processed and integrated at a data layer of the common layer category based on your business requirements. You can store application tables and dimension tables at a data layer of the application layer category.
        • Others: Data layers that are automatically created by the system and data layers that are customized by users are added to this data layer category.
          • System data layer: This type of data layer is added to the Others data layer category. If you want to change the data layer category to which a system data layer belongs from Others to another data layer category, contact Alibaba Cloud technical support.
          • Custom data layer: This type of data layer is added to the Others data layer category. If you want to change the data layer category to which a custom data layer belongs from Others to another data layer category, modify the data layer.
          Note The supported model types vary based on the data layer category. Before you add a model table to a data layer of your desired data layer category, make sure that you change the data layer category to which the data layer belongs from Others to your desired data layer category that supports the model type of the table you create.
      • Model Type: The value of this parameter is determined by the value of the Data Layer parameter. This parameter specifies the type of the model table that you want to store.
        • ODS Table: You can set the Model Type parameter to this value only if you set Data Layer to Data Import Layer.
        • Fact Table: You can set the Model Type parameter to this value only if you set Data Layer to Common Layer.
        • Dimension Table: You can set the Model Type parameter to this value if you set Data Layer to Common Layer or Application Layer.
        • Aggregate Table: You can set the Model Type parameter to this value only if you set Data Layer to Common Layer.
        • Application Table: You can set the Model Type parameter to this value only if you set Data Layer to Application Layer.
      • Description: The description of the data layer. You can select a data layer to store specific business data based on the description of the data layer. The description of the data layer can be up to 2,048 characters in length.

  3. Click Confirm.
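
The format limits for the Abbreviation, Name, Display Name, and Description parameters can be checked before you open the dialog box. The following Python sketch encodes those limits as regular expressions. It assumes that "letters" means ASCII letters, which the parameter descriptions do not state, and check_layer_params is a hypothetical helper, not part of DataWorks.

```python
import re

# Assumption: "letters" means ASCII letters; the console may accept more.
# Up to 128 characters; letters, digits, underscores; starts with a letter.
ABBREVIATION = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,127}")
# Up to 2,048 characters; also allows ampersands; starts with a letter or digit.
NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_&]{0,2047}")
# Same as NAME, plus parentheses.
DISPLAY_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_&()]{0,2047}")
MAX_DESCRIPTION_LENGTH = 2048


def check_layer_params(abbreviation, name, display_name, description=""):
    """Return the names of the parameters that break the documented limits."""
    invalid = []
    if not ABBREVIATION.fullmatch(abbreviation):
        invalid.append("abbreviation")
    if not NAME.fullmatch(name):
        invalid.append("name")
    if not DISPLAY_NAME.fullmatch(display_name):
        invalid.append("display_name")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        invalid.append("description")
    return invalid
```

The sketch uses fullmatch so that the whole string must satisfy the pattern. A value such as "a b" is rejected because a space is not among the permitted characters.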

What to do next

After you create the data layer, you must add a data layer checker to specify the naming conventions of tables at the data layer. For more information, see Configure a data layer checker.