DataWorks: Overview

Last Updated: Mar 20, 2024

As business data grows rapidly, large volumes of complex data that follow different data standards become difficult to manage. DataWorks Data Modeling structures and manages this disordered, complex data, and helps enterprises gain more value from their business data.

Limits

Limits on the usage of DataWorks Data Modeling vary based on roles in a DataWorks workspace.

  • Browse model details: All roles such as Visitor, Workspace Administrator, Model Designer, and Workspace Owner in a DataWorks workspace can browse the details of a data model. For more information about roles in a DataWorks workspace, see Manage permissions on workspace-level services.

  • Edit model information: Only the Workspace Administrator, Development, O&M, and Model Designer roles can edit model information. You can assign one of these roles to a user if you want to allow the user to edit model information. For more information about how to assign a role to a user, see Manage permissions on workspace-level services.

  • Publish a data model: Only the Workspace Administrator and O&M roles can publish a data model. You can assign one of these roles to a user if you want to allow the user to publish a data model. For more information about how to assign a role to a user, see Manage permissions on workspace-level services.

Overview

In DataWorks Data Modeling, you can plan and design a data warehouse, formulate and summarize data standards, perform dimensional modeling, and define data metrics. You can use Data Modeling to materialize the dimension tables, fact tables, and aggregate tables generated from data modeling into compute engines and use the materialized tables for further processing.

[Figure: Architecture diagram]

  • Data warehouse planning

    You can design data layers, business categories, data domains, and business processes on the Data Warehouse Planning page.

    • Data layer

      You can design data layers in a data warehouse based on business scenarios and data scenarios. By default, DataWorks creates the following common layers for you:

      • Operational data store (ODS)

      • Data warehouse detail (DWD)

      • Data warehouse summary (DWS)

      • Application data service (ADS)

      • Dimension (DIM)

      You can create data layers based on your business requirements. For more information about how to create a data layer, see Create a data layer.

    • Business category

      If your business is complex and its parts need to share the same data domain, you can divide the business into business categories and associate these categories with dimension tables and fact tables during subsequent data modeling. This helps you quickly locate the data of a specific business during model design and application. For more information about how to create a business category, see Business category.

    • Data domain

      A data domain is a high-level data classification standard. It is a collection of business processes that are abstracted, refined, and combined. A data domain is the first data grouping entry for business personnel. It helps business personnel quickly locate the desired business data from large amounts of data.

      Data domains are often used for business analysis and can serve as analysis domains, such as procurement, supply chain, human resources, and e-commerce. We recommend that data domains be managed and configured centrally by an experienced organization or team, such as data architects or a model design team. Data domain designers must have a deep understanding of the enterprise's business and be able to fully express their interpretation and abstraction of it. For more information about how to plan and create a data domain in DataWorks, see Data domain.

    • Business process

      A business process describes a business activity, such as adding commodities to the shopping cart, placing an order, or paying for an order. Business processes are typically used in business effect analysis, such as funnel analysis of commodity purchases. For example, you can break down a commodity purchase into the following business processes: browsing commodities, adding commodities to the shopping cart, placing an order, paying for the order, and confirming the receipt of a commodity. You can then use the number of orders as the metric for each business process and perform funnel analysis on that metric. For more information about how to create a business process in DataWorks, see Business process.
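
The funnel analysis described for business processes can be sketched in a few lines of Python. This is only an illustration and does not use any DataWorks API: the step counts are made-up numbers and `funnel_conversion` is a hypothetical helper. It computes, for each business process, the conversion rate relative to the previous step and relative to the first step.

```python
# Hypothetical metric values per business process (made-up numbers, in funnel order).
funnel_steps = [
    ("browse commodities", 1000),
    ("add to shopping cart", 400),
    ("place order", 200),
    ("pay for order", 150),
    ("confirm receipt", 120),
]


def funnel_conversion(steps):
    """Return (name, count, rate_vs_previous_step, rate_vs_first_step) per step."""
    rows = []
    first = steps[0][1]
    prev = first
    for name, count in steps:
        step_rate = count / prev if prev else 0.0
        overall_rate = count / first if first else 0.0
        rows.append((name, count, step_rate, overall_rate))
        prev = count
    return rows


for name, count, step_rate, overall_rate in funnel_conversion(funnel_steps):
    print(f"{name:<22} {count:>5} {step_rate:>7.1%} {overall_rate:>7.1%}")
```

A large drop in the step-to-step rate points to the business process where most users leave the purchase flow.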

  • Data standard

    DataWorks Data Modeling allows you to plan and formulate data standards before data modeling, or summarize data standards based on business conditions during data modeling. Standardizing lookup tables, measurement units, field standards, and naming dictionaries ensures consistent data processing during subsequent modeling and application.

    For example, suppose that a registration table and a logon table both contain a member ID column. The column is named user_id in the registration table and userid in the logon table. You can create a unified field standard for the member ID columns. The field standard can specify a lookup table for data processing, the attribute requirements for the field (such as the data type, length, and default value), and the measurement unit of the data. During data modeling, you can apply the field standard directly to the member ID fields so that all of them follow the same standard.

    For more information about how to create a field standard, see Field standard.
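
As a rough illustration of the member ID example, the following Python sketch models a field standard as a small data class and applies it to the two differently defined columns. It is not the DataWorks implementation: `FieldStandard`, `Column`, `apply_standard`, the standardized name `member_id`, and the data types and lengths are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldStandard:
    """Hypothetical field standard: attribute requirements for one field."""
    name: str
    data_type: str
    length: int
    default: str


@dataclass
class Column:
    """Hypothetical column definition in a table."""
    table: str
    name: str
    data_type: str
    length: int
    default: str = ""


# Assumed standard for member IDs; the concrete values are illustrative only.
MEMBER_ID = FieldStandard(name="member_id", data_type="STRING", length=32, default="")


def apply_standard(column, standard):
    """Return a new column definition that follows the field standard."""
    return Column(column.table, standard.name, standard.data_type,
                  standard.length, standard.default)


# The two inconsistent definitions from the example: user_id and userid.
registration = Column("registration", "user_id", "STRING", 64)
logon = Column("logon", "userid", "BIGINT", 20)

standardized = [apply_standard(c, MEMBER_ID) for c in (registration, logon)]
for col in standardized:
    print(col)
```

After the standard is applied, both tables expose the member ID with the same name, data type, length, and default value.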

  • Dimensional modeling

    DataWorks Data Modeling follows the dimensional modeling methodology. When you use the dimensional modeling feature to design data models in a data warehouse, take note of the following points:

    • Dimension table

      Extract all the dimensions that may exist in each data domain, and store the dimensions and their attributes in dimension tables. For example, when you analyze e-commerce business data, possible dimensions and their attributes include order (order ID, order creation time, buyer ID, and seller ID), user (gender and birthdate), and commodity (commodity ID, commodity name, and commodity listing time). You can create the following dimension tables: order dimension table, user dimension table, and commodity dimension table. The attributes of each dimension are used as the fields in the dimension table. You can deploy the dimension tables in a data warehouse and perform extract, transform, and load (ETL) operations to store dimension data in the format defined in the dimension table. This allows business personnel to access the data for subsequent data analysis.

    • Fact table

      Sort and analyze data that is generated in each business process, and store the data in fact tables as fields. For example, you can create a fact table for the business process of placing an order, and record the following information as fields in the fact table: order ID, order creation time, commodity ID, number of commodities, and sales amount. You can deploy the fact tables in a data warehouse and perform ETL operations to summarize and store data in the format defined in the fact table. This allows business personnel to access the data for subsequent data analysis.

    • Aggregate table

      Summarize and analyze fact data and dimension data based on business data analysis and data layering, and create an aggregate table. This way, you can directly access data in the aggregate table for subsequent data analysis, without the need to access the data in fact tables and dimension tables.

    • Reverse modeling

      Reverse modeling applies models that were generated by other modeling tools to DataWorks Dimensional Modeling. For example, if you generated a model by using another modeling tool and want to use DataWorks Data Modeling for subsequent modeling, you can use the reverse modeling feature of DataWorks. This feature eliminates the need to build the model again and lets you quickly apply existing models to DataWorks Dimensional Modeling.

    For more information about how to create a dimension table, a fact table, and an aggregate table, see Create a logical model: dimension table, Create a logical model: fact table, and Create a logical model: aggregate table. For more information about reverse modeling, see Perform reverse modeling on physical tables.
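
The relationship between dimension tables, fact tables, and aggregate tables can be sketched with plain Python data structures. This is a conceptual illustration only and not how DataWorks materializes tables in a compute engine: the table names, field names, sample rows, and the `build_aggregate` helper are all hypothetical.

```python
from collections import defaultdict

# Dimension table: one row per commodity, keyed by commodity ID.
dim_commodity = {
    "c1": {"commodity_name": "stroller"},
    "c2": {"commodity_name": "bottle"},
}

# Fact table: one row per event of the "placing an order" business process.
fact_order = [
    {"order_id": "o1", "commodity_id": "c1", "quantity": 1, "sales_amount": 120.0},
    {"order_id": "o2", "commodity_id": "c2", "quantity": 3, "sales_amount": 30.0},
    {"order_id": "o3", "commodity_id": "c2", "quantity": 2, "sales_amount": 20.0},
]


def build_aggregate(facts, dim):
    """Summarize fact rows per commodity name by joining them to the dimension table."""
    totals = defaultdict(lambda: {"order_count": 0, "quantity": 0, "sales_amount": 0.0})
    for row in facts:
        name = dim[row["commodity_id"]]["commodity_name"]
        agg = totals[name]
        agg["order_count"] += 1
        agg["quantity"] += row["quantity"]
        agg["sales_amount"] += row["sales_amount"]
    return dict(totals)


# Aggregate table: analysts read this instead of re-joining facts and dimensions.
agg_sales_by_commodity = build_aggregate(fact_order, dim_commodity)
print(agg_sales_by_commodity)
```

Because the join and summation are done once when the aggregate is built, later analysis can read the summarized rows directly.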

  • Data metric

    DataWorks Data Modeling provides the Data Metric feature, which allows you to establish a unified metric system.

    A metric system consists of atomic metrics, modifiers, periods, and derived metrics.

    • Atomic metric: a measurement used for a business process. For example, you can create an atomic metric named Payment Amount for the Order Placing business process.

    • Modifier: limits the scope of business based on which a specific metric collects data. For example, you can create a modifier named Maternity and Infant Products to limit the statistical scope of the Payment Amount atomic metric.

    • Period: specifies the time range within which or the point in time at which a metric collects data. For example, you can create a period named Last Seven Days for the Payment Amount atomic metric.

    • Derived metric: consists of an atomic metric, a period, and one or more modifiers. For example, you can create a derived metric named Payment Amount of Maternity and Infant Products in Last Seven Days.

    For more information about how to build a metric system, see Overview.
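
The composition of a derived metric from an atomic metric, a period, and modifiers can be sketched as follows. This is a minimal illustration under stated assumptions rather than the way DataWorks evaluates metrics: the `DerivedMetric` class, the row fields (`pay_date`, `category`, `payment_amount`), and the sample data are hypothetical, and the sketch assumes that "Last Seven Days" covers the seven calendar days ending on, and including, the reference date.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DerivedMetric:
    """Hypothetical derived metric = atomic metric + period + modifiers."""
    atomic_metric: str   # name of the field to sum, for example the payment amount
    period_days: int     # period expressed as a number of trailing days
    modifiers: tuple     # (field, value) pairs that limit the statistical scope

    def compute(self, rows, today):
        # Assumption: the period includes `today` and the period_days - 1 days before it.
        start = today - timedelta(days=self.period_days)
        total = 0.0
        for row in rows:
            in_period = start < row["pay_date"] <= today
            in_scope = all(row.get(field) == value for field, value in self.modifiers)
            if in_period and in_scope:
                total += row[self.atomic_metric]
        return total


# Made-up payment records.
payments = [
    {"pay_date": date(2024, 3, 19), "category": "maternity_infant", "payment_amount": 50.0},
    {"pay_date": date(2024, 3, 14), "category": "maternity_infant", "payment_amount": 25.0},
    {"pay_date": date(2024, 3, 13), "category": "maternity_infant", "payment_amount": 10.0},
    {"pay_date": date(2024, 3, 18), "category": "electronics", "payment_amount": 99.0},
]

# "Payment Amount of Maternity and Infant Products in Last Seven Days"
metric = DerivedMetric(
    atomic_metric="payment_amount",
    period_days=7,
    modifiers=(("category", "maternity_infant"),),
)
result = metric.compute(payments, today=date(2024, 3, 20))
print(result)
```

Keeping the atomic metric, period, and modifiers as separate parts means that a new derived metric, such as the same payment amount over a different period, only requires changing one component.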

Importance of data modeling

  • Standardize management of massive data

    The larger an enterprise is, the more complex its data structures are. Managing and storing data in a structured and orderly manner is a challenge that every large enterprise faces.

  • Break information barriers by interconnecting business data

    If the data of each business or department in an enterprise is isolated from the data of the others, decision-makers cannot clearly or fully understand the data. Breaking down data silos between departments or business domains is a major challenge in business data management.

  • Integrate data standards to achieve unified and flexible data interconnection

    Inconsistent descriptions of the same data result in duplicate data, incorrect calculation results, and difficulties in business data management. Formulating a unified data standard without changing the original system architecture, and enabling flexible interconnection between upstream and downstream business, is one of the core focuses of standardized management.

  • Maximize data value to maximize profit

    Make full use of the various types of enterprise data to maximize data value and deliver more efficient data services for the enterprise.