All Products
Search
Document Center

Dataphin:Standard modeling process based on ordering business

Last Updated:Dec 17, 2025

This practice, utilizing the Dataphin system, describes the comprehensive process of segmenting data into subject areas, defining business processes, constructing logical models, and establishing statistical metrics, including atomic and derived metrics.

Prerequisites

You must purchase Alibaba Cloud's MaxCompute and Dataphin (Intelligent R&D Edition) services. It is advisable to select the same region for both services during purchase.

Background information

  • Standard definition leverages dimensional modeling to categorize and define subject areas, business processes, dimensions, atomic metrics, statistical periods, and derived metrics.

  • Develop various metrics tailored to business scenarios for insightful analysis of the E-commerce order management process.

Scenario introduction

This tutorial presents the complete lifecycle of the E-commerce order management process, detailing the transactional interactions between buyers and sellers.

  • ① The buyer selects product specifications, agrees with the seller on the price, payment method, and awaits deposit payment.

  • ② Following the deposit payment, an order is generated.

  • ③ The seller arranges for the shipment of goods as per the buyer's requirements.

  • ④ Upon receiving and verifying the goods, the buyer pays the remaining balance.

  • ⑤ With both parties confirming the order accuracy, the transaction is concluded, and the ordering process is completed.

image

Based on the scenario described, you can create a derived metric in Dataphin for orders exceeding a total amount of 500,000 in the past 7 days, as illustrated below.

image

Practice steps

Step 1: Resource preparation

  • Activate the Dataphin system. For more information, see Activate Dataphin--Semi-managed.

  • Create a project in Dataphin to ensure multi-user isolation and data access control. For more information, see Create a general project.

    Note

    The project name should be set as dataphin_order, ideally matching the compute source name.

  • Create a MaxCompute project via the MaxCompute console. SQL and logical table tasks will utilize MaxCompute's computing capabilities for data processing. For activation instructions, see Create a MaxCompute project.

Step 2: Prepare physical tables and data

  1. On the Dataphin home page, navigate to R&D > Data R&D, and select the dataphin_order project in the development environment.

  2. In the left-side navigation pane, select Data Processing > Tables. In the table management list, click Newimage icon.

  3. In the New Physical Table dialog box, configure the s_user physical table as shown below, and click Confirm.

    image

  4. On the Physical Table Details page, click Import From Create Table Statement button. Add the create table statement to Dataphin, click Submit, click Confirm And Submit, and then click Go To Publish.

    The create table statement is as follows:

    CREATE TABLE IF NOT EXISTS s_user (
      id BIGINT COMMENT 'ID'
      ,gmt_create STRING COMMENT 'Creation time'
      ,gmt_modified STRING COMMENT 'Modification time'
      ,reg_date STRING COMMENT 'Registration date'
      ,user_name STRING COMMENT 'nick nickname'
      ,certify_status BIGINT COMMENT 'Authentication status: 0: Unauthenticated 1: First-level authentication, real-name authentication 2: Second-level authentication, based on first-level + bank account authentication'
      ,user_level BIGINT COMMENT 'User level: 1: One star 2: Two stars 3: Three stars 4: Four stars 5: Five stars'
      ,tp_account STRING COMMENT 'Third-party source account'
      ,id_card_number STRING COMMENT 'ID card number'
      ,dob STRING COMMENT 'Date of birth'
      ,user_gender STRING COMMENT 'Gender: F: Female, M: Male'
      ,full_name STRING COMMENT 'Real name'
      ,address_id BIGINT COMMENT 'Address'
      ,mobile_phone STRING COMMENT 'Mobile phone number'
      ,email STRING COMMENT 'Email address'
    ) PARTITIONED BY (
      ds STRING
    ) STORED AS ALIORC TBLPROPERTIES (
      'columnar.nested.type' = 'true'
    );
    CREATE TABLE IF NOT EXISTS s_order (
      id BIGINT COMMENT 'ID'
      ,gmt_create STRING COMMENT 'Creation time'
      ,gmt_modified STRING COMMENT 'Modification time'
      ,order_time STRING COMMENT 'Order time'
      ,pay_time STRING COMMENT 'Payment time'
      ,ship_time STRING COMMENT 'Shipping time'
      ,end_time STRING COMMENT 'Transaction completion/cancellation time'
      ,buyer_id BIGINT COMMENT 'Buyer ID'
      ,total_items_amount DOUBLE COMMENT 'Total item amount, sum of unit price × quantity, unit: cents'
      ,discount_amount DOUBLE COMMENT 'Total discount amount, unit: cents'
      ,delivery_amount DOUBLE COMMENT 'Shipping fee, unit: cents'
      ,total_amount DOUBLE COMMENT 'Total amount paid, unit: cents'
      ,`status` BIGINT COMMENT 'Status: -1: Cancelled 1: New 2: Paid 3: Shipped 4: Confirmed receipt'
      ,delivery_address_id BIGINT COMMENT 'Delivery address'
    ) PARTITIONED BY (
      ds STRING
    ) STORED AS ALIORC TBLPROPERTIES (
      'columnar.nested.type' = 'true'
    );
  5. In the Objects To Publish list, publish the physical table to the production environment.

    The creation method for the s_order physical table is the same as for s_user.

  6. On the Data Development page, switch to the production environment, and click the s_user physical table to enter the offline physical table details page. Click the Import Data button, and configure the data import as shown below.

    Upload the data from the s_order.csv and s_user.csv files to Dataphin.

    image

  7. Click Next. In the Import Data dialog box, click Same Name Mapping, then click Start Import to synchronize the data to the s_user and s_order tables in the production environment.

Step 3: Standard modeling

Create data block

  1. Log on to the Dataphin console, click Enter Dataphin>> to access the Dataphin home page.

  2. From the top menu bar on the Dataphin home page, select Planning > Data Architecture. On the Business Unit page, click the New Data Block button to open the New Data Block dialog box.

  3. In the New Data Block dialog box, select the Dev_Prod mode, configure the LD_retail data block as shown below, click Next, and then click Confirm to complete the creation of the LD_retail block.

    image

Create subject area

  1. In the top menu bar on the Dataphin home page, select Planning > Data Architecture.

  2. On the Business Unit page, click the LD_retail data block to enter the Subject Area Management tab, and click the New Subject Area button.

  3. In the New Subject Area panel, create the sales domain according to the parameters shown below, and click Confirm to complete the creation of the sales domain.

    image

Create business object

  1. Click the Business Entity tab, and then click the New Business Entity button.

  2. In the New Business Entity panel, create the user business object, configure it according to the parameters shown below, click Confirm, and then click Publish to publish the user business object.

    The configuration method for the seller business object, buyer business object, and product business object is the same as for the user business object.

    image

Create business event

  1. Click the Business Entity tab, and then click the New Business Entity button.

  2. In the Create Business Entity panel, create a place order business event. Configure the parameters as shown in the following figure. Click OK, and then click Publish to publish the place order business event.

    The configuration method for the payment business event, closing business event, viewing business event, and clicking business event is the same as for the ordering business event.

    image

Create business process

  1. Click the Business Entity tab, and then click the New Business Entity button.

  2. In the New Business Entity panel, create the order business process, configure it according to the parameters shown below, click Confirm, and then click Publish to publish the order business process.

    The configuration method for the browsing business process (associated entities are user and product business objects, process events are viewing and clicking business events) is the same as for the order business process.

    image

After creation, you can view the following sales order relationship diagram.

image

Create logical dimension table

  1. In the top menu bar on the Dataphin home page, select R&D > Data Development, and select the dataphin_order project in the development environment.

  2. In the left menu bar, select Standard Modeling > Logical Dimension Table, in the Logical Dimension Table list, click Newimageicon.

  3. In the New Logical Dimension Table dialog box, create the [User] logical dimension table.

    image

  4. Add fields to the logical dimension table. It is recommended that the primary key be non-empty and unique to avoid data duplication or dirty data flowing into the downstream, causing data calculation errors.

    image

  5. In the Computation LogicCalculation logic configuration, click Source Configuration, select the buyer physical table${dataphin_order}.s_user, associate the primary key with id, click Confirm, and then click Next. The constraint configuration uses the default configuration, click Next.

  6. In the Scheduling & Parameter Configuration, click Auto Parse, then click Node (task) Name under the column, click Click To Supplement, in the Click To Supplement dialog box, select the virtual_root_node node, click Save And Submit, and then click Go To Publish.

Create logical fact table

  1. In the left menu bar, select Standard Modeling > Logical Fact Table, in the Logical Fact Table list, click Newimageicon.

  2. In the New Logical Fact Table dialog box, create the [Order] logical fact table.

    image

  3. Add fields to the logical fact table. It is recommended that the primary key be non-empty and unique to avoid data duplication or dirty data flowing into the downstream, causing data calculation errors.

    image

  4. To establish a model relationship for the buyer_id field, click on Associate Dimension.

    image

  5. In the Computation LogicCalculation logic configuration, click Source Configuration, select the buyer physical table${dataphin_order}.s_user, associate the primary key with id, click Confirm, and then click Next. The constraint configuration uses the default configuration, click Next.

  6. In the Scheduling & Parameter Configuration, click Auto Parse, then click Node (task) Name under the column, click Click To Supplement, in the Click To Supplement dialog box, select the virtual_root_node node, click Save And Submit, and then click Go To Publish.

Create atomic metric

  1. In the left menu bar, select Standard Modeling > Metrics, in the Metrics list, click Newimageicon.

  2. In the New Atomic Metric dialog box, create the order count metric (order count can be accumulated) according to the parameters shown below, click Save And Submit, and then click Confirm And Submit to complete the creation of the [Order Count] atomic metric.

    image

Create business filter

  1. In the left menu bar, select Standard Modeling > Business Condition, in the Business Condition list, click Newimageicon.

  2. In the New Business Condition dialog box, create the business filter for amounts exceeding 500,000 according to the parameters shown below, click Save And Submit, and then click Confirm And Submit to complete the creation of the [Amount Exceeding 500,000] business filter.

    image

Create derived metric

  1. In the left menu bar, select Standard Modeling > Metrics, in the Metrics list, click Newimageicon.

  2. In the New Derived Metric page, create the order count for orders with a total amount exceeding 500,000 in the last 7 days according to the parameters shown below, click Next, and then click Submit, and then click Confirm And Submit to complete the creation of the [Order Count for Orders with a Total Amount Exceeding 500,000 in the Last 7 Days] derived metric.

    image

You can view the created derived metrics in the dws_all table.

Step 4: Publish and maintain data backfill

  1. In the top menu bar on the Dataphin home page, select R&D > Task Publish, in the Objects To Publish list, publish the user logical dimension table, order logical fact table, order count atomic metric, amount exceeding 500,000 business filter, order count for orders with a total amount exceeding 500,000 in the last 7 days derived metric, and dws_all logical aggregate table to the production environment.

  2. Click the Task Maintenance in the top menu bar, in the left-side navigation pane, select Task Maintenance > Recurring Task, and then select the dataphin_order project in the production environment.

  3. On the Recurring Task page, click the Details And Aggregate Table tab, select More-Data Backfill-Backfill Current Task and Downstream in the operation column of the order real-time logical table, perform data backfill for the logical table task. For details, see Data backfill for current and downstream tasks.

    image

  4. You can view the running results on the data backfill instance page.

Step 5: Verify data

  1. In the top menu bar on the Dataphin home page, select R&D > Data R&D, and select the dataphin_order project in the development environment.

  2. Click the Ad Hoc Query in the left-side navigation pane, click the New icon in the ad hoc query list to enter the New Ad Hoc Query dialog box to configure parameters and create an ad hoc query task.

  3. On the Code Writing page, write code and verify whether the data meets expectations.

    For example, after writing the following command, click Run at the top of the page to verify the data in the logical aggregate table.

    SELECT * FROM LD_retail.dws_all WHERE ds = '${bizdate}';

    image