A dimension is the basic unit of dimensional modeling. In dimensional modeling, measures are referred to as facts and environments are referred to as dimensions. A dimension table contains the details about a specific attribute in a fact table. Common dimensions include the date dimension and the city dimension. This topic describes how to create a dimension table.
Prerequisites
A common layer is used to process and integrate common data to define unified dimension tables. An application layer is used to reconstruct the data that is processed and integrated at a common layer based on your business requirements. You can create a dimension table at a common layer or an application layer based on your business requirements. The following table describes the prerequisites that must be met before you create a dimension table at a specific data layer.
Data layer category | Prerequisite | References |
Common layer | A common layer is created. A dimension table must belong to a common layer or an application layer. | |
A data domain is created. The data domain is used to determine the perspective of business data stored in a dimension table. | ||
Application layer | An application layer is created. A dimension table must belong to a common layer or an application layer. | |
A data mart is created. A dimension table at an application layer must belong to a data mart, which is used to organize data for a specific product or scenario. |
Background information
Extract all the dimensions that possibly exist in each data domain, and store the dimensions and attributes of the dimensions in dimension tables. For example, when you analyze e-commerce business data, possible dimensions (attributes of each dimension) include order (order ID, order creation time, buyer ID, and seller ID), user (gender and birthdate), and commodity (commodity ID, commodity name, and commodity put-on-shelf time). In this case, you can create the following dimension tables: order dimension table, user dimension table, and commodity dimension table. The attributes of each dimension are used as the fields in the dimension table. You can deploy the dimension tables in a data warehouse and perform extract, transform, and load (ETL) operations to store dimension data in the format defined in the dimension table. This allows business personnel to access the data for subsequent data analysis.
As illustrated in the preceding figure:
When you create a dimension table, you can perform the following operations:
You can specify a business category or data domain in which you want to create the dimension table. You can view all dimension tables in a specific data domain or data mart.
You can specify the data layer at which you want to create the dimension table. In most cases, a dimension table is stored at the Dimension (DIM) layer, which belongs to the common layer category.
After you create a dimension table, you can add dimension attributes as fields of the dimension table. You can also associate the dimension table with fields of other dimension tables, partition the dimension table, and use the same field standard to define the value range of the fields. This ensures consistent attributes for the dimension data across the entire data domain.
After you create and configure a dimension table, you can perform the following operations:
You can publish and materialize the dimension table to a compute engine instance. The dimension table can be used in the compute engine instance for data analysis.
When you design and create derived metrics and aggregate tables, you can directly associate and use the fields of the dimension table.
Create a dimension table
Go to the Data Modeling page.
Log on to the DataWorks console. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Modeling.
In the top navigation bar of the Data Modeling page, click Dimensional Modeling to go to the Dimensional Modeling page.
Create a dimension table.
On the Dimensional Modeling page, move the pointer over the icon and choose .
In the Basic Information section of the configuration tab that appears, configure basic information for the dimension table.
You can configure parameters such as Data Layer, Data Domain, Business Category, and Mart for the dimension table. After the dimension table is created, you can view it in the specified data layer, data domain, business category, or data mart. In this example, a common dimension table is created. The following table describes key parameters.
Parameter
Description
Data Domain
The data domain to which the dimension table belongs.
Business Category
The business category to which the dimension table belongs.
Storage Policy
The policy based on which you want to store data in the dimension table, such as the storage period of data and the volume range of data.
Dimension Table
The dimension with which you want to associate the dimension table. You can analyze data in the dimension table based on this dimension.
NoteFor information about how to create a dimension, see Create a conceptual model: dimension.
Naming Rule
The checker that is used to check whether the name of the dimension table conforms to the configured naming conventions. After you select a checker, you must configure the Name parameter based on the naming conventions configured in the checker.
NoteFor information about how to configure a checker, see Configure and use a checker at a data layer.
Name
The name of the dimension table. If you configure the Naming Rule parameter, the name that you specify must conform to the naming conventions defined in the checker specified by the Naming Rule parameter.
Display Name
The display name of the dimension table.
Lifecycle
The lifecycle of the dimension table. Unit: day. The maximum value of this parameter is 36000.
Owner
The owner of the dimension table. The default owner is the creator of the dimension table.
Description
The description of the dimension table.
Table Type
The type of the dimension table. Valid values:
Common Dimension Table: a common dimension table. You can select this option if you do not have special requirements.
Enumeration Dimension Table: a dimension table that supports enumeration, such as a gender dimension table.
Hierarchy Dimension Table: a dimension table in which fields are at different hierarchy levels. This type of dimension table supports metric roll-up operations and drill-down operations. For example, a hierarchy dimension table can contain the country, province, city, and county fields.
After you complete the configuration, click Save in the upper part of the configuration tab.
You can view and manage the dimension table in the specified data domain or business category in the left-side navigation tree on the Dimensional Modeling page.
Add fields for the dimension table
After you configure basic information for the dimension table, you must add fields for the dimension table.
You can use fast modeling language (FML) statements to add fields for a table, configure attributes for the fields, and configure associations and partitions for the table. For more information, see Use FML statements to configure and manage data tables.
You can add fields for the dimension table in shortcut mode or script mode. The shortcut mode supports the following method to add fields for a dimension table: Import from Table/View. To use this method to add fields for a dimension table, you must import all or specific fields from a physical table or view that is selected from the Search for Existing Table/View drop-down lists to the dimension table.
You can import fields only from physical tables or views that are created within MaxCompute, Hologres, and E-MapReduce (EMR) Hive compute engine instances.
Shortcut Mode: Import from Table/View
Click Expand next to Import from Table/View.
In the Search for Existing Table/View drop-down lists, select a compute engine type from the left one, and enter a keyword and select the desired physical table or view from the displayed search results from the right one. Click the following icons next to the name of the selected physical table or view to import all or specific fields from the physical table or view to the dimension table.
NoteWhen you search for the desired physical table or view, keyword-based fuzzy match is supported. After you enter a keyword, all physical tables or views whose names contain the keyword are displayed.
You can search for only tables in the production environment. You cannot search for tables in the development environment.
: imports all fields from the selected physical table or view to the dimension table.
: imports specific fields from the selected physical table or view to the dimension table.
If you choose to import specific fields, select the fields from the physical table or view in the dialog box that appears. Then, click Import.
NoteIf one of the imported fields has no display name, you can fill the description of the field as the display name of the field as prompted.
Script Mode
The script mode allows you to edit code to manage the fields in the table. After you click Script Mode, the statement that is used to create the table is displayed in the code editor in the Script Mode dialog box. You can modify the statement based on your business requirements. Then, click OK.
Configure attributes and associations for fields in the dimension table
After fields are added for the table, you can configure Associated Field Standard, Associated Lookup Table, Redundant Field, and Associated Granularity/Metric for the fields.
Configure attributes for fields in the table.
By default, the following basic attributes are displayed for fields: Field Name, Type, Field Display Name, Description, Primary Key, Not Empty, Measurement Unit, and Actions. You can click Field Display Settings in the upper-right corner of the list of the fields that are added and select the attributes that you want to display. You can also modify the values of the attributes based on your business requirements.
Configure Associated Field Standard and Associated Lookup Table for fields in the table.
You can associate field standards and lookup tables with the added fields by configuring Associated Field Standard and Associated Lookup Table for fields in the table.
Associated Field Standard: allows you to manage the values of the fields that have the same meaning but different names in a centralized manner. A field standard can also be used to define the value range and measurement unit for the fields.
Associated Lookup Table: allows you to define the value range of the fields that use a specific field standard.
Configure Redundant Field for fields in the fact table.
In a traditional star model, dimensions are stored in dimension tables. Foreign keys of fact tables are used to obtain dimensions to reduce the storage space that is required. In the dimension model design in DataWorks Data Modeling, some frequently used fields, such as the user ID and the commonly used analysis dimension, can be configured as redundant fields to improve downstream query efficiency, simplify data acquisition operations, and reduce the number of tables that need to be associated.
Example 1: You can configure the "consignee address" and "consignee mobile phone number" attributes of the "delivery address" dimension in the "delivery address" dimension table as the redundant fields of the "order creation" fact table.
Example 2: You can configure the "procurement information" and "brand information" attributes in the "commodity information" table as the redundant fields of the "commodity" dimension table.
You can find the field for which you want to configure redundant fields and click Redundant Field in the Actions column. In the dialog box that appears, select one or more fields that you want to associate with the field and click Save.
Then, click Save in the upper part of the configuration tab of the table.
What to do next
After the preceding configuration is complete, you must configure partitions for the table and publish the table to the required environment. For more information, see Publish and materialize a table.