DataWorks Data Modeling follows the dimensional modeling methodology by Ralph Kimball. Using the dimensional modeling feature in DataWorks, you can design a data warehouse model by creating dimension tables, fact tables, aggregate tables, and application tables based on your business needs. You can then quickly publish the models to the appropriate computing resource. You can also use reverse modeling to generate models from existing physical tables.
You can use Data Modeling for data warehouse planning and design and to quickly create physical tables from models in a computing resource. Deleting models does not affect the physical tables.
Modeling perspectives
Dimensional Modeling lets you classify created model tables into the ODS and Common Layers and the Application Layer. You can select the appropriate layer for modeling based on your needs. Different layers support different management perspectives.
Common Layer: Processes and integrates common data from the ODS layer. At this layer, you can establish unified metric dimensions and build reusable detailed fact data and summary data for analysis and statistics. You can manage model tables from the Data Domain or Business Category perspective.
Application Layer: Based on specific application requirements, this layer retrieves processed and integrated data from the common layer. It is used for personalized data statistics for specific application scenarios or products. You can manage model tables at this layer only from the Business Category perspective.
Unlayered: If a created model table is not assigned to a layer (either the Common Layer or the Application Layer), it is placed in the Unlayered category.
After selecting a layer, you can create target tables only in that layer. You can also switch to a management perspective supported by that layer in the directory tree to view and manage the model tables within it.
Dimensions
A dimension is an angle from which to measure and observe a business. It is a perspective used to describe the environment when analyzing data. In dimensional modeling, dimensions define the background or conditions of a business process. They help us understand data from different angles. For example, in an E-commerce transaction scenario, product, region, and time can all be dimensions.
For more information about how to design and create a dimension, see Create a conceptual model: Dimension.
Dimension table
Extract all the dimensions that possibly exist in each data domain, and store the dimensions and attributes of the dimensions in dimension tables. For example, when you analyze e-commerce business data, possible dimensions (attributes of each dimension) include order (order ID, order creation time, buyer ID, and seller ID), user (gender and birthdate), and commodity (commodity ID, commodity name, and commodity put-on-shelf time). You can create the following dimension tables: order dimension table, user dimension table, and commodity dimension table. The attributes of each dimension are used as the fields in the dimension table. You can deploy the dimension tables in a data warehouse and perform extract, transform, and load (ETL) operations to store dimension data in the format defined in the dimension table. This allows business personnel to access the data for subsequent data analysis.
For more information about how to design and create a dimension table, see Create a logical model: Dimension table.
Fact table
Sort and analyze data that is generated in each business process, and store the data in fact tables as fields. For example, you can create a fact table for the business process of placing an order, and record the following information as fields in the fact table: order ID, order creation time, commodity ID, number of commodities, and sales amount. You can deploy the fact tables in a data warehouse and perform ETL operations to summarize and store data in the format defined in the fact table. This allows business personnel to access the data for subsequent data analysis.
For more information about how to design and create a fact table, see Create a logical model: Fact table.
Aggregate table
An aggregate table organizes statistical data for multiple derived metrics that share the same period and dimensions within a data domain. It is the result of a high-level business abstraction and provides a basis for subsequent business queries, online analytical processing (OLAP) analysis, and data distribution. For more information about how to design and create an aggregate table, see Create a logical model: Aggregate table.
Application table
An application table is oriented toward specific business scenarios. It organizes statistical data for multiple atomic metrics, derived metrics, or statistical granularities that share the same period and dimensions. This provides a basis for subsequent business queries, OLAP analysis, and data distribution. You can design application tables based on your business needs and application scenarios. For more information about how to design and create an application table, see Create a logical model: Application table.
Reverse modeling
Reverse modeling generates logical models from existing physical tables. This feature lets you quickly create models without having to perform the modeling operations again, which saves a significant amount of time. For more information about reverse modeling, see Reverse modeling: Perform reverse modeling on physical tables.