This topic describes how to choose a purchase plan for business scenarios in which nodes must run on schedule and data must be output in a timely manner.

Note
  • The system may send you notifications when you have an overdue payment. In this case, you must complete the overdue payment at the earliest opportunity to ensure service continuity.

    If you have other questions, submit a ticket.

  • Shared resource groups are default resource groups.
  • The peak hours for DataWorks tenants to run nodes are from 00:00 to 09:00 each day. If you use default resource groups during the peak hours, you share resources with other tenants.
  • When tenants share resources, some tenants may preempt the resources. If your nodes must be completed in time, use exclusive resource groups to run the nodes. DataWorks does not charge you additional fees for the node instances that are run on exclusive resource groups. For more information, see Exclusive resource mode.
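
The peak-hour rule in the note above can be sketched as a small check. This is only an illustration, not a DataWorks API: the function name `in_peak_hours` is hypothetical, and treating 09:00 itself as outside the peak window is an assumption, because the note does not say whether the boundary is inclusive.

```python
from datetime import time

# Peak window taken from the note above: 00:00 to 09:00 each day.
PEAK_START = time(0, 0)
PEAK_END = time(9, 0)  # assumption: the end boundary is exclusive


def in_peak_hours(scheduled: time) -> bool:
    """Return True if a node's scheduled time falls inside the peak window."""
    return PEAK_START <= scheduled < PEAK_END


if __name__ == "__main__":
    for t in (time(2, 30), time(9, 0), time(14, 0)):
        print(t.strftime("%H:%M"), in_peak_hours(t))
```

Nodes whose scheduled time falls inside this window and that must finish on time are the candidates for exclusive resource groups.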

Scenario 1: Run nodes as scheduled every day

  • Description

    After the data warehouse of an enterprise is migrated to the cloud, a basic scheduling system is required to schedule hundreds of nodes, and the cost needs to be controlled.

  • Analysis

    When big data compute engines such as MaxCompute and Flink of Alibaba Cloud are used, most enterprises require a stable and robust scheduling system to run their data production nodes based on node dependencies and scheduled times. Developing such a system in-house incurs high labor and maintenance costs.

  • Purchase plan

    Required: DataWorks (pay-as-you-go). For more information, see Pay-as-you-go.

    After you activate DataWorks based on the pay-as-you-go billing method, you can use the features of DataWorks Basic Edition for free. In this case, you can use not only the basic scheduling features for nodes, but also the basic features of all DataWorks services to complete the all-in-one data development process at a low cost. For more information about the features of DataWorks services, see Feature comparison among DataWorks editions.

Scenario 2: Run a specific number of instances concurrently on a daily basis

  • Description

    A report needs to be viewed at 09:00 every morning due to business needs.

  • Analysis

    In a business scenario with strict requirements on the timeliness of data output, each descendant node must be run at its specified time after its ancestor node is run.

  • Purchase plan
    • Required: DataWorks (pay-as-you-go) and DataWorks exclusive resources for scheduling (subscription).
    • Optional: DataWorks advanced editions. You can purchase Standard Edition, Professional Edition, Enterprise Edition, or Ultimate Edition as needed.

Scenario 3: Run a specific number of instances concurrently on a daily basis, and transmit data concurrently by using multiple threads

  • Description

    A report needs to be viewed at 09:00 every morning due to business needs. The report mainly covers Content Delivery Network (CDN) access logs and client distribution by type. The raw data is stored in ApsaraDB for RDS databases that are managed by administration experts. The daily data increment is about 30 GB. Therefore, data synchronization is required.

  • Analysis

    Building on Scenario 2, Scenario 3 adds a timeliness requirement for a large number of sync nodes. Therefore, in addition to making sure that the sync nodes are run as scheduled, you also need to deploy fixed computing and network resources to support concurrent data transmission by using multiple threads.

  • Purchase plan
    • Required: DataWorks (pay-as-you-go), DataWorks exclusive resources for scheduling, and DataWorks exclusive resources for Data Integration.

      Assume that 1,500 computing nodes and 600 data integration nodes are run every day, and different types of nodes are run in different periods. You are billed based on the following calculation logic:

      Computing nodes

      Data integration nodes

      Note: The preceding results are calculated based on the business volume and the expected running period. We recommend that you adjust the purchase quantity based on your actual business volume.
    • Optional: DataWorks advanced editions (subscription). You can purchase Standard Edition, Professional Edition, Enterprise Edition, or Ultimate Edition as needed.
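
As a rough aid for adjusting the purchase quantity in Scenario 3, the following sketch estimates how many instances must run in parallel and what average transfer rate the 30 GB daily increment implies. It is not the DataWorks billing formula. The average run times (5 and 10 minutes) and the running windows (540 and 120 minutes) are hypothetical values chosen for illustration, and the function names are not part of any DataWorks API. Only the node counts and the 30 GB figure come from the scenario.

```python
import math


def required_concurrency(node_count: int, avg_minutes_per_node: float,
                         window_minutes: float) -> int:
    """Minimum number of instances that must run in parallel so that all
    nodes finish inside the window, assuming evenly spread work."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    total_minutes = node_count * avg_minutes_per_node
    return math.ceil(total_minutes / window_minutes)


def required_throughput_mb_per_s(data_gb: float, window_minutes: float) -> float:
    """Average transfer rate needed to move data_gb inside the window."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    return data_gb * 1024 / (window_minutes * 60)


if __name__ == "__main__":
    # Node counts are from the scenario; run times and windows are assumed.
    print("computing:", required_concurrency(1500, 5, 540))
    print("data integration:", required_concurrency(600, 10, 120))
    print("MB/s for 30 GB:", round(required_throughput_mb_per_s(30, 120), 2))
```

The estimate ignores node dependencies and uneven run times, so treat the result as a lower bound and compare it with the run times observed in your own workflows. The throughput figure is a target only, because exclusive resources for Data Integration do not guarantee a synchronization rate.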

Purchase description

  • A node to be run in Operation Center of DataWorks requires computing resources for scheduling. If the node is a data integration node, it also requires resources for data transmission. Therefore, you can purchase both exclusive resources for scheduling and exclusive resources for Data Integration to ensure the proper running of nodes.
  • DataWorks exclusive resources for Data Integration ensure that a sufficient number of concurrent threads for data integration nodes can start at the same time, but they cannot guarantee the synchronization rate.
  • DataWorks (pay-as-you-go) uses shared resource groups for scheduling. If you activate DataWorks based on the pay-as-you-go billing method, you cannot ensure that all nodes can be run as scheduled during peak hours. For more information, see Pay-as-you-go.
  • DataWorks Standard Edition and more advanced editions support the intelligent monitoring feature. After you configure monitoring rules, you can monitor large workflows globally and ensure that all nodes are completed on time.