As businesses grow rapidly, their data increases exponentially. This data is often large, complex, and lacks consistent standards, making it difficult to manage. The intelligent data modeling service of DataWorks provides a way to manage this large, complex, and unstructured data in a structured and organized manner. This helps enterprises generate more value from their data and maximize its potential.
Prerequisites
The intelligent data modeling service of DataWorks is a value-added service. You must activate the service before you can use its features. For more information about the specifications and billing, see Intelligent data modeling billing.
You can use data modeling only on a PC with Chrome 69 or later.
Limits
The permissions for using intelligent data modeling vary based on roles in a DataWorks workspace:
All roles in a DataWorks workspace can view data model details. These roles include Visitor, Workspace Administrator, Model Designer, and Project Owner.
Only users with the Workspace Administrator, Developer, O&M, or Model Designer role can edit model information.
Only users with the Workspace Administrator or O&M role can publish a data model.
To perform these operations, grant the required role permissions to the user. For more information, see Manage permissions on workspace-level services.
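The role limits above can be sketched as a small lookup table. This is only an illustration of the rules as stated on this page. The `Action` enum, the role strings, and `can_perform` are hypothetical names, not part of any DataWorks API.

```python
from enum import Enum


class Action(Enum):
    VIEW = "view"
    EDIT = "edit"
    PUBLISH = "publish"


# Roles mentioned in the limits above. The strings are illustrative labels,
# not identifiers taken from a DataWorks API.
ALL_ROLES = {
    "Visitor", "Workspace Administrator", "Model Designer",
    "Project Owner", "Developer", "O&M",
}

ALLOWED_ROLES = {
    Action.VIEW: ALL_ROLES,
    Action.EDIT: {"Workspace Administrator", "Developer", "O&M", "Model Designer"},
    Action.PUBLISH: {"Workspace Administrator", "O&M"},
}


def can_perform(role: str, action: Action) -> bool:
    """Return True if the role may perform the action under the rules above."""
    return role in ALLOWED_ROLES[action]
```

For example, `can_perform("Developer", Action.EDIT)` is true, while `can_perform("Developer", Action.PUBLISH)` is false because only the Workspace Administrator and O&M roles can publish.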
Overview
DataWorks data modeling supports data warehouse planning, the creation and accumulation of enterprise data standards, dimensional modeling, and the definition of data metrics. You can use data modeling to materialize the dimension tables, fact tables, and aggregate tables created during the modeling process into a compute engine for further use.

Data Warehouse Planning
When you use DataWorks for data modeling, you can design data layers, business categories, data domains, business processes, data marts, and subject areas on the Data Warehouse Planning page.
Data Layer
You can design the data layers of your data warehouse based on your business and data scenarios. By default, DataWorks provides the following five common data warehouse layers:
Operational data store (ODS)
Data warehouse detail (DWD)
Data warehouse summary (DWS)
Application data service (ADS)
Dimension (DIM)
You can also create other data layers as needed. For more information, see Custom layers.
Business Category
If your business is complex and different business types need to share data domains, you can plan business categories to quickly locate data for a specific business during model design and application. You can then associate these categories with the corresponding dimension and fact tables. For more information, see Business categories.
Data Domain
A data domain is a high-level standard for data classification. It is a collection of abstracted, refined, and combined business processes. A data domain helps business personnel quickly identify their business data from a large amount of data.
Data domains are for business analysis. A data domain corresponds to a macro analysis area, such as procurement, supply chain, HR, or e-commerce. We recommend that data domains be managed and set by a unified organization or personnel, such as data architects or model team members. Data domain designers must have a deep understanding of the enterprise's business and be able to express their interpretation and abstraction of the business. For more information, see Data domains.
Business Process
A business process describes the flow of a business activity. For example, in e-commerce, adding an item to a shopping cart, placing an order, and making a payment can each be a business process. Business processes are typically used in business performance analysis. For example, in a common funnel analysis, the business activity of purchasing a product is broken down into business processes such as browsing products, adding to the shopping cart, placing an order, paying, and confirming receipt. You can then count the number of orders that reach each business process and perform a funnel analysis on this metric. For more information, see Business processes.
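The funnel analysis described above can be sketched in a few lines. The counts are made-up sample values and `funnel_conversion` is a hypothetical helper. It only shows how a per-process count turns into step-to-step conversion rates.

```python
# Hypothetical counts per business process, listed in funnel order.
FUNNEL_COUNTS = [
    ("browse products", 1000),
    ("add to shopping cart", 400),
    ("place order", 200),
    ("pay", 150),
    ("confirm receipt", 120),
]


def funnel_conversion(steps):
    """Return (process, count, rate relative to the previous step) per step."""
    result = []
    previous = None
    for name, count in steps:
        if previous is None:
            rate = 1.0  # the first step is the baseline
        elif previous == 0:
            rate = 0.0  # avoid division by zero when the previous step is empty
        else:
            rate = count / previous
        result.append((name, count, rate))
        previous = count
    return result


for name, count, rate in funnel_conversion(FUNNEL_COUNTS):
    print(f"{name}: {count} ({rate:.0%} of previous step)")
```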
Data Mart
A data mart defines detailed business subjects for a business category. It uses subject areas to partition data in the target mart based on different analysis perspectives. The final data is used for statistical analysis in business applications. An example is an O&M platform data mart. For more information, see Data marts.
Subject Area
A subject area partitions a data mart based on an analysis perspective. It is usually a collection of closely related data subjects. You can divide these data subjects into different subject areas based on your business focus. For example, the e-commerce industry is often divided into transaction, member, and product subject areas. For more information, see Subject areas.
Data Standard
DataWorks data modeling supports the planning and creation of data standards before modeling. You can also accumulate enterprise data standards based on business conditions during modeling. By standardizing constraints for lookup tables, measurement units, field standards, and naming dictionaries, you can ensure consistency in data processing during subsequent modeling and application.
For example, consider a registration table and a logon table. The registration table stores a member ID in a field named user_id. The logon table also stores a member ID, but in a field named userid. In this case, you can create a unified field standard for the member ID by specifying a lookup table for data processing, defining field properties such as data type, length, and default value, and specifying the measurement unit for the data. After the field standard is created, you can directly associate it with any member ID field during modeling. This ensures that all member ID fields are consistent.
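A minimal sketch of the member ID example follows. `FieldStandard`, `Column`, and `apply_standard` are hypothetical names, and the data type, length, and default value are assumed sample values. The sketch only illustrates the effect of associating one standard with differently named fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldStandard:
    """A unified definition that every associated field should follow."""
    name: str
    data_type: str
    length: int
    default: Optional[str] = None


@dataclass
class Column:
    table: str
    name: str
    data_type: str
    length: int
    default: Optional[str] = None


def apply_standard(column: Column, standard: FieldStandard) -> Column:
    """Return a copy of the column that conforms to the field standard."""
    return Column(
        table=column.table,
        name=standard.name,
        data_type=standard.data_type,
        length=standard.length,
        default=standard.default,
    )


# Assumed sample values: the type and length are illustrative only.
MEMBER_ID = FieldStandard(name="user_id", data_type="STRING", length=32)

registration = Column("registration", "user_id", "STRING", 64)
logon = Column("logon", "userid", "VARCHAR", 20)

standardized = [apply_standard(c, MEMBER_ID) for c in (registration, logon)]
```

After the standard is applied, both tables expose the member ID under the same name, type, and length.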
For more information, see Field standards.
Dimensional Modeling
The data modeling concept in DataWorks follows the dimensional modeling principle. When you use the dimensional modeling feature of DataWorks to design a data warehouse model, you work with the following table types and features:
Dimension Table
Based on the data domain plan for your business, you can extract the dimensions that might be used for data analysis in each data domain and store these dimensions and their properties in dimension tables. For example, when you analyze e-commerce business data, the available dimensions and their properties might include an order dimension (order ID, order creation time, buyer ID, and seller ID), a user dimension (gender and date of birth), and a product dimension (product ID, product name, and product listing time). You can create dimension tables for orders, users, and products, and use the dimension properties as fields in the tables. You can then deploy these dimension tables to the data warehouse and use an extract, transform, and load (ETL) process to store the actual dimension data according to the dimension table definitions. This makes it easy for business personnel to access the data for future analysis.
Fact Table
Based on the business process plan, you can sort and analyze the actual data that might be generated in each business process and store these data fields in fact tables. For example, for the order placement business process, you can create an order placement fact table to record data fields that might be generated, such as order ID, order creation time, product ID, quantity, and amount. You can then deploy these fact tables to the data warehouse. You can use an ETL process to summarize and store the real data according to the fact table definitions. This makes it easy for business personnel to access the data for analysis.
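How a fact table relates to a dimension table can be sketched with plain Python records. The field names follow the examples above, while the class names, the sample rows, and `enrich_orders` are hypothetical. In practice the tables live in the compute engine and are populated by ETL.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProductDim:
    """One row of a product dimension table."""
    product_id: str
    product_name: str
    listing_time: date


@dataclass(frozen=True)
class OrderFact:
    """One row of an order placement fact table."""
    order_id: str
    created_at: date
    product_id: str
    quantity: int
    amount: float


def enrich_orders(facts: List[OrderFact], products: Dict[str, ProductDim]) -> List[dict]:
    """Look up each fact row's product in the dimension table by product ID."""
    rows = []
    for fact in facts:
        product: Optional[ProductDim] = products.get(fact.product_id)
        rows.append({
            "order_id": fact.order_id,
            # None marks a fact row whose product is missing from the dimension.
            "product_name": product.product_name if product else None,
            "quantity": fact.quantity,
            "amount": fact.amount,
        })
    return rows


# Made-up sample data.
PRODUCTS = {"p1": ProductDim("p1", "stroller", date(2023, 12, 1))}
ORDERS = [
    OrderFact("o1", date(2024, 1, 1), "p1", 2, 40.0),
    OrderFact("o2", date(2024, 1, 2), "p9", 1, 15.0),
]
```

The fact table holds only the measures and the keys, and the descriptive properties are looked up in the dimension table when needed.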
Aggregate Table
Based on business data analysis and data warehouse layers, you can summarize and analyze detailed fact data and dimension data to create aggregate tables. For subsequent data analysis, you can directly use the data in the aggregate tables instead of accessing data from fact tables and dimension tables.
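As a rough illustration, an aggregate table can be thought of as fact rows pre-summarized by a chosen grain. The sketch below assumes a grain of one row per order date and product. The sample rows and `build_aggregate` are hypothetical.

```python
from collections import defaultdict

# Made-up detailed fact rows.
ORDER_FACTS = [
    {"order_id": "o1", "order_date": "2024-01-01", "product_id": "p1", "quantity": 2, "amount": 40.0},
    {"order_id": "o2", "order_date": "2024-01-01", "product_id": "p1", "quantity": 1, "amount": 20.0},
    {"order_id": "o3", "order_date": "2024-01-01", "product_id": "p2", "quantity": 5, "amount": 10.0},
    {"order_id": "o4", "order_date": "2024-01-02", "product_id": "p1", "quantity": 1, "amount": 20.0},
]


def build_aggregate(facts):
    """Summarize fact rows into one row per (order_date, product_id)."""
    totals = defaultdict(lambda: {"order_count": 0, "quantity": 0, "amount": 0.0})
    for row in facts:
        key = (row["order_date"], row["product_id"])
        totals[key]["order_count"] += 1
        totals[key]["quantity"] += row["quantity"]
        totals[key]["amount"] += row["amount"]
    return dict(totals)


DAILY_PRODUCT_SALES = build_aggregate(ORDER_FACTS)
```

Later analysis can read `DAILY_PRODUCT_SALES` directly instead of scanning the detailed rows again.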
Application Table
An application table is for specific business scenarios. It organizes statistical data from multiple atomic metrics, derived metrics, or statistic granularities that share the same period and dimensions. This provides a foundation for subsequent business queries, online analytical processing (OLAP) analysis, and data distribution. You can design application tables based on your needs and application scenarios.
Reverse Modeling
Reverse modeling is mainly used to reverse-engineer models generated by other modeling tools into the dimensional modeling module of DataWorks. For example, if you have already generated a model with another tool and want to switch to the intelligent modeling of DataWorks for subsequent work, you can use the reverse modeling feature. This feature helps you quickly apply your existing model to DataWorks dimensional modeling without performing the modeling operations again, which saves a significant amount of time.
For more information, see Create a logical model: Dimension table, Create a logical model: Fact table, Create a logical model: Aggregate table, and Create a logical model: Application table. For more information about reverse modeling, see Perform reverse modeling on physical tables.
Data Metric
The data modeling feature of DataWorks provides a data metric function that offers a unified way to build a metric system.
A metric system consists of Atomic Metrics, Modifiers, Periods, and Derived Metrics.
Atomic Metric: A measure within a business process, such as "payment amount" in a "pay for order" business process.
Modifier: A qualifier that narrows the scope of a metric, such as limiting the "payment amount" to "maternity and infant products".
Period: The time range or point in time for which metrics are calculated. For example, the period for the "payment amount" metric can be set to "last 7 days".
Derived Metric: A combination of an atomic metric, modifiers, and a period. For example, the payment amount for maternity and infant products in the last 7 days.
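The composition of a derived metric can be sketched as follows. All function names and sample rows are hypothetical, and the sketch assumes that "last 7 days" means the 7 calendar days ending on and including the reference date.

```python
from datetime import date, timedelta

# Made-up payment records.
PAYMENTS = [
    {"paid_on": date(2024, 1, 10), "category": "maternity and infant", "amount": 30.0},
    {"paid_on": date(2024, 1, 9), "category": "electronics", "amount": 100.0},
    {"paid_on": date(2024, 1, 5), "category": "maternity and infant", "amount": 20.0},
    {"paid_on": date(2024, 1, 1), "category": "maternity and infant", "amount": 50.0},
]


def payment_amount(rows):
    """Atomic metric: total payment amount."""
    return sum(row["amount"] for row in rows)


def category_is(category):
    """Modifier: keep only rows of the given product category."""
    return lambda row: row["category"] == category


def last_n_days(n, today):
    """Period: the n days ending on `today`, inclusive (an assumption)."""
    start = today - timedelta(days=n - 1)
    return lambda row: start <= row["paid_on"] <= today


def derived_metric(rows, atomic, modifiers, period):
    """Derived metric = atomic metric over rows matching all modifiers and the period."""
    selected = [r for r in rows if period(r) and all(m(r) for m in modifiers)]
    return atomic(selected)


result = derived_metric(
    PAYMENTS,
    atomic=payment_amount,
    modifiers=[category_is("maternity and infant")],
    period=last_n_days(7, today=date(2024, 1, 10)),
)
```

Reusing the same atomic metric with a different modifier or period yields a different derived metric without redefining the measure.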
For more information, see Data Metric.
Why data modeling is necessary
Standardized management of massive data
The larger an enterprise's business, the more complex its data structure becomes. The volume of data grows rapidly with business development. Managing and storing data in a structured and organized manner is a challenge that every enterprise faces.
Interconnection of business data to break down information barriers
Data silos form when data from different businesses and departments within a company is managed independently. This prevents decision-makers from obtaining a clear and quick overview of the company's data. Breaking down these information silos between departments or business domains is a major challenge in enterprise data management.
Integration of data standards for unified and flexible connections
Different descriptions for the same data make enterprise data difficult to manage and can lead to duplication and inaccurate results. A key part of standardized management is creating a unified data standard that can flexibly connect to upstream and downstream businesses without disrupting the existing system architecture.
Maximization of data value and enterprise profit
Enterprises need to make the best use of all types of data to maximize its value and to provide more efficient data services.