If you have existing physical tables in a compute engine and want to manage them in DataWorks Data Modeling, use reverse modeling to import them into the dimensional modeling framework. This eliminates the need to recreate each model manually.
Supported engines
Reverse modeling supports physical tables in the production environments of the following compute engines:
| Compute engine | Supported execution methods |
|---|---|
| MaxCompute | Full Update, Incremental Update |
| E-MapReduce (EMR) Hive | Full Update, Incremental Update |
Prerequisites
Before you begin, ensure that you have:
- A data source configured for the target compute engine in your DataWorks workspace. See Manage data sources.
- Data warehouse layers defined to classify models. See Define data warehouse layers.
- The following resources set up for the target data layer:
| Data layer | Required resources |
|---|---|
| Common Layer | A data domain (see Data domains) and a business process (see Business processes) |
| Application Layer | A data mart (see Data marts) and a subject area (see Subject areas) |
How it works
Reverse modeling reads physical tables from a compute engine and imports them as managed models in the DataWorks dimensional modeling module. The process follows four steps:
- Configure a reverse modeling policy: specify which tables to import, how to name the generated models, and whether to run a full or incremental import.
- Parse tables and match models: DataWorks scans the target compute engine, applies the table name matching rule, and identifies which models to create.
- Confirm model information: review and adjust the auto-generated model metadata (such as data domain and business process) before committing.
- Generate models: DataWorks creates the final models. Successfully generated models are already materialized in the compute engine and stored in the dimensional modeling module, so no further publishing is needed.
A reverse modeling policy cannot be modified after it is created and used to generate models. Plan your policy before starting.
Run reverse modeling
Step 1: Go to the Reverse Modeling page
- Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose Data Development and O&M > Data Modeling.
- Select the target workspace from the drop-down list and click Go to Data Modeling.
- In the top navigation bar, click Dimensional Modeling.
- In the left-side navigation pane, click Reverse Modeling.
Step 2: Start a modeling task
- First-time use: On the Reverse Modeling page, click Start Now.
- Subsequent use: On the Modeling Tasks list, click Start Reverse Modeling in the upper-right corner.
Step 3: Configure a reverse modeling policy
A reverse modeling policy cannot be modified after it is created and used to generate models. Plan your policy before proceeding.
Configure the following parameters:
| Parameter | Description |
|---|---|
| Workspace | The DataWorks workspace where the source tables are located. You can only select workspaces that your account belongs to. To use a different workspace, you must be added as a member first. See Manage permissions on modules at the workspace level. |
| Compute Engine Type | The type of compute engine. Only MaxCompute and EMR Hive production environments are supported. |
| Compute Engine Instance | The specific compute engine instance where the source tables are located. |
| Table Name Matching Rule | How DataWorks identifies which tables to import: Fuzzy Match (enter a keyword to match all tables whose names contain it) or Exact Match (enter the full table name). To specify multiple tables, separate the names with semicolons (;) and do not add spaces after the semicolons. If no tables match, the task fails and no models are generated. |
| Data Layer of Model After Reverse Modeling | The data layer for the generated models: Common Layer (fact tables, dimension tables, and aggregate tables) or Application Layer (application tables and dimension tables). |
| Table Naming Rule | How DataWorks parses matched table names and assigns the generated models to the correct data warehouse layer. A table name can contain up to nine underscores, and each underscore-separated segment can map to an attribute such as Business Process, Data Domain, or Custom content. If a segment cannot be matched, the corresponding attribute is left blank, and you can fill it in during the model information confirmation step. Choose one of the following parsing methods: Table Name Checker (select a pre-configured checker; see Configure data warehouse layer checkers and Use checkers) or Custom Rule (define a custom combination of Business Process, Data Domain, Business Category, and Custom content). |
| Execution Method | Whether to import all matched tables or only new ones. See Choose an execution method below. |
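The two Table Name Matching Rule modes can be illustrated with a short sketch. This is not DataWorks code: `match_tables` is a hypothetical helper, and the sketch assumes that matching is case-sensitive, that Fuzzy Match takes a single keyword, and that the semicolon-separated list applies to Exact Match.

```python
def match_tables(tables: list[str], rule_value: str, mode: str) -> list[str]:
    """Return the tables selected by a matching rule (illustrative only)."""
    if mode == "fuzzy":
        # Fuzzy Match: keep every table whose name contains the keyword.
        matched = [t for t in tables if rule_value in t]
    elif mode == "exact":
        # Exact Match: full names separated by semicolons, no spaces.
        names = {n for n in rule_value.split(";") if n}
        matched = [t for t in tables if t in names]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    if not matched:
        # Mirrors the documented behavior: no match means the task fails
        # and no models are generated.
        raise LookupError("no tables matched the rule")
    return matched


# Hypothetical table names, used only for illustration.
tables = ["dwd_trade_order_di", "dws_trade_order_1d", "dim_user"]
fuzzy_result = match_tables(tables, "trade", "fuzzy")
exact_result = match_tables(tables, "dim_user;dwd_trade_order_di", "exact")
```

With these inputs, the fuzzy rule selects both `trade` tables, while the exact rule selects only the two names listed. A value such as `dim_user; dwd_trade_order_di` (with a space) would fail to match the second name in this sketch, which is why the rule asks for no spaces after semicolons.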
Choose an execution method
| | Full Update | Incremental Update |
|---|---|---|
| What it does | Imports all matched tables. If a model for a matched table already exists, deletes it and creates a new one. | Filters out tables that already have a corresponding model. Imports only tables without an existing model. |
| Use when | You want to regenerate models for all matched tables from scratch. | Some models already exist and do not need to be re-created. |
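The difference between the two execution methods in the table above can be sketched as a small planning function. The names `ImportPlan` and `plan_import` are hypothetical and exist only for this illustration; the sketch models the documented behavior, not the DataWorks implementation.

```python
from dataclasses import dataclass


@dataclass
class ImportPlan:
    delete: list[str]  # existing models removed before re-creation
    create: list[str]  # tables for which a model is generated


def plan_import(matched: list[str], existing: set[str], method: str) -> ImportPlan:
    """Decide which models to delete and create for a given method."""
    if method == "full":
        # Full Update: every matched table is imported; models that
        # already exist are deleted and created again.
        return ImportPlan(
            delete=[t for t in matched if t in existing],
            create=list(matched),
        )
    if method == "incremental":
        # Incremental Update: tables that already have a model are skipped.
        return ImportPlan(
            delete=[],
            create=[t for t in matched if t not in existing],
        )
    raise ValueError(f"unknown method: {method!r}")


# Hypothetical inputs: three matched tables, one of which already has a model.
matched = ["dim_user", "dwd_trade_order_di", "dws_trade_order_1d"]
existing = {"dwd_trade_order_di"}
full_plan = plan_import(matched, existing, "full")
incremental_plan = plan_import(matched, existing, "incremental")
```

In this example the full plan deletes and re-creates the `dwd_trade_order_di` model, so any manual adjustments made to it would be lost; the incremental plan leaves it untouched.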
After configuring all parameters, click Create Model to parse the tables.
Step 4: Confirm model information
DataWorks generates initial models based on the policy. Before generating the final models:
- Adjust Table Type and Data Layer assignments as needed.
- Assign Data Domain, Business Process, or other attributes that were left blank.
- Remove any tables you do not want to model.
Click Generate Model to create the final models.
Step 5: View the results
After the task completes, view the count of successfully created models by type.
For failed tasks, click Error Logs to see the detailed error information and resolve the issue.
Successfully generated models are already materialized in the corresponding compute engine — no further publishing is required. Go to the Dimensional Modeling page to view and manage them. See Publish and manage a table.
View modeling tasks
On the Reverse Modeling > Modeling Tasks page, view the details and operation logs of all tasks.
| Section | Description |
|---|---|
| 1 | Filter tasks by Task ID, Operated By, or Operation Date. |
| 2 | View task details, including the reverse modeling rules and results. For a completed task, click View Logs to see its logs. For an incomplete task, click View Task to return to the task details page and continue. |
What's next
- Go to the model management tree on the Dimensional Modeling page to view the created models. See Publish and materialize a table.
- Go to DataStudio to work with your data. See Introduction to DataStudio features.