DataWorks: Use service orchestration

Last Updated: Aug 17, 2023

The service orchestration feature of DataService Studio allows you to configure workflows by dragging nodes to directed acyclic graphs (DAGs). You can arrange APIs and functions in a serial, parallel, or branch structure based on the business logic.

Prerequisites

  • DataWorks Enterprise Edition or a more advanced edition is activated. For more information, see Billing of DataWorks advanced editions.
  • A DataWorks workspace is created in the China (Shanghai) region.

Background information

When you run a workflow to call APIs, DataWorks runs the nodes in the workflow in sequence, passes parameters among the nodes, and changes the status of each node. The service orchestration feature simplifies the process of calling multiple APIs or functions and reduces development and O&M costs. This way, you can focus on business development.

The service orchestration feature provides the following benefits:
  • Reduced cost of developing APIs

    After you drag nodes to a DAG, you can arrange APIs and functions in a serial, parallel, or branch structure without the need to write code. This reduces the cost of developing APIs.

  • Higher performance in calling APIs and functions

    A workflow allows you to call multiple APIs and functions in a container. Compared with writing code to call APIs and functions, the service orchestration feature reduces the latency of calling APIs and functions and greatly improves the calling performance.

  • Serverless architecture

    The service orchestration feature is built on a serverless architecture, which automatically scales resources based on your business requirements. You do not need to manage the underlying runtime environment and can focus on the business logic.

Values of request and response parameters

DataService Studio uses JSONPath to obtain parameter values. JSONPath is a query language for extracting data from JSON documents. For more information, see JSONPath.

For example, three nodes are run in the following order: A, B, and then C. Node C needs to use the response parameters of Node A and Node B.
  • Response parameter of Node A: {"namea":"valuea"}

    Expression for obtaining the value of the response parameter of Node A: ${A.namea}

  • Response parameter of Node B: {"nameb":"valueb"}

    Expression for obtaining the value of the response parameter of Node B: $.nameb or ${B.nameb}

The built-in start node provides request parameters for the whole workflow. For example, a request parameter of a workflow is {"namewf":"valuewf"}. All nodes of the workflow can obtain the value of the request parameter by using the ${START.namewf} expression.
Note: The start node and the end node are built-in nodes of a workflow. You can rename the nodes but cannot delete them. The start node of a workflow is equivalent to Node 0 of the workflow.
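
The reference expressions described above can be modeled with a short Python sketch. This is only an illustration of the lookup rules, not the DataService Studio implementation: the resolve function, the outputs dictionary, and the parent_id argument are hypothetical names, and Node B is assumed to be the parent of Node C because the nodes run in the order A, B, C.

```python
import re

# Hypothetical patterns for the expression forms used in this topic.
NODE_REF = re.compile(r"^\$\{(\w+)(?:\.(\w+))?\}$")   # ${NodeID} or ${NodeID.param}
PARENT_REF = re.compile(r"^\$\.(\w+)$")               # $.param


def resolve(expression, outputs, parent_id):
    """Resolve an expression against node outputs keyed by node ID."""
    match = NODE_REF.match(expression)
    if match:
        node_id, param = match.groups()
        node_output = outputs[node_id]
        # ${NodeID} returns the whole output; ${NodeID.param} returns one value.
        return node_output if param is None else node_output[param]
    match = PARENT_REF.match(expression)
    if match:
        # $.param reads from the output of the parent node.
        return outputs[parent_id][match.group(1)]
    raise ValueError(f"Unsupported expression: {expression}")


# Outputs taken from the example in this section.
outputs = {
    "START": {"namewf": "valuewf"},
    "A": {"namea": "valuea"},
    "B": {"nameb": "valueb"},
}

# Expressions evaluated from the point of view of Node C.
print(resolve("${A.namea}", outputs, parent_id="B"))       # valuea
print(resolve("$.nameb", outputs, parent_id="B"))          # valueb
print(resolve("${B.nameb}", outputs, parent_id="B"))       # valueb
print(resolve("${START.namewf}", outputs, parent_id="B"))  # valuewf
```

In this model, $.nameb and ${B.nameb} return the same value for Node C only because Node B is its parent node. A value from Node A must be referenced by node ID.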

Parameters

  • Request parameters of a workflow
    On the configuration tab of a workflow, click the Request Param tab in the right-side navigation pane. Then, you can configure request parameters in manual adding or automatic parsing mode.
    • Manual adding: Click Add Parameter and manually add a request parameter for the workflow.
    • Automatic parsing: If the first node of the workflow is an API node, click Automatically parse request parameters to automatically map the request parameters of this API node to the request parameters of the workflow.
  • Request parameters of an API node
    Click an API node. In the panel that appears, click Input Request Parameters and specify values for request parameters.
    • If you do not specify a value for a request parameter, DataService Studio obtains the value of the same parameter in the first layer of the JSON string that is returned by the parent node, and assigns the value to the request parameter.
      Note: If the current node is the first node of a workflow, the values of the request parameters of this node are assigned to the same parameters of the workflow.
    • If you specify a value for a request parameter, DataService Studio uses the value that you specify.
      Note: To reference the value of a specified parameter that is returned by a specified ancestor node, you must use a JSONPath expression.
  • Response parameters of an API node
    Click an API node. In the panel that appears, select set output results and customize the output of the node by using JSONPath expressions. The following sample code provides an example:
    {
      "return1":"$.data.rows.user_id",
      "return2":"$.data.rows.user_name"
    }
  • Request parameters of a Python node

    Click a Python node. In the panel that appears, specify request parameters in the Request Parameters field.

  • Response parameters of a Python node
    Click a Python node. In the panel that appears, select set output results and customize the output of the node by using JSONPath expressions. The following sample code provides an example:
    {
      "return1":"$.data.rows.user_id",
      "return2":"$.data.rows.user_name"
    }
The following table describes common JSONPath expressions that are used to obtain parameter values.
| JSONPath expression | Role in request parameters | Role in response parameters |
| --- | --- | --- |
| $. | Obtains the root object of the output of the parent node. | Obtains the root object of the output of the current node. |
| $.param | Obtains the value of the param parameter in the output of the parent node. | Obtains the value of the param parameter in the output of the current node. |
| ${START} | Obtains the output of the start node. | Obtains the output of the start node. |
| ${NodeID} | Obtains the output of the node with the specified ID. | Obtains the output of the node with the specified ID. |
| ${NodeID.param} | Obtains the value of the param parameter in the output of the node with the specified ID. | Obtains the value of the param parameter in the output of the node with the specified ID. |
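
The rules for the request parameters of an API node can be sketched in Python as follows. This is a minimal illustration under assumptions, not the actual DataService Studio logic: bind_request_params and its arguments are hypothetical names, and a parameter that has neither a specified value nor a same-name key in the first layer of the parent output is simply left unset in this sketch.

```python
def bind_request_params(param_names, specified, parent_output):
    """Return the request parameter values for a node.

    param_names: names of the request parameters of the node.
    specified: values that are explicitly specified on the node.
    parent_output: JSON object returned by the parent node, as a dict.
    """
    bound = {}
    for name in param_names:
        if name in specified:
            # A specified value always takes precedence.
            bound[name] = specified[name]
        elif name in parent_output:
            # Only first-layer keys of the parent output are considered.
            bound[name] = parent_output[name]
    return bound


# Hypothetical parent output: user_name exists only in a nested object.
parent_output = {"user_id": 7, "data": {"user_name": "nested"}}

params = bind_request_params(
    ["user_id", "user_name", "limit"],
    specified={"limit": 10},
    parent_output=parent_output,
)
print(params)  # {'user_id': 7, 'limit': 10}
```

In this sketch, user_name is not bound because it appears only inside the nested data object. To obtain such a value, a JSONPath expression would have to be specified for the parameter.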

Example

Before you perform the following steps, make sure that a data source is added. For more information, see Configure a data source. In this example, a MySQL data source is used.

  1. Go to the DataService Studio page.

    Log on to the DataWorks console. In the left-side navigation pane, click DataService Studio. On the page that appears, select the desired workspace from the drop-down list and click Go to DataService Studio.

  2. Register an API.
    In this example, the registration method is used to create an API.
    1. On the Service Development tab, move the pointer over the Create icon and choose Create API > Register API.
      You can also expand the desired business process, right-click API, and then choose Create API > Register API.
    2. In the Register API dialog box, configure the parameters based on your business requirements. For more information, see Register an API.
    3. Click Determine.
  3. Register a function.
    1. On the Service Development tab, move the pointer over the Create icon and choose Create Function > Create Python Function.
      You can also expand the desired business process, right-click Function, and then choose Create Function > Create Python Function.
    2. In the Create Python Function dialog box, configure the parameters based on your business requirements. For more information, see Manage functions.
    3. Click OK.
    4. On the configuration tab of the function, enter the following code in the Edit Code section:
      # -*- coding: utf-8 -*-
      # event (str): in a filter, the API result; in other cases, the parameters that you pass
      # context: environment information; currently unused
      # Only the following modules can be imported: json, time, random, pickle, re, math
      import json


      def handler(event, context):
          # Parse the JSON string into a Python object.
          obj = json.loads(event)
          # Add your code here.
          # End of your code.
          return obj
    5. In the Environment Configuration section, configure the Memory and Function Timeout parameters.
    6. Click the Save icon in the toolbar.
  4. Create a workflow.
    1. On the Service Development tab, move the pointer over the Create icon and select Create Workflow.
      You can also expand the desired business process, right-click Workflow, and then select Create Workflow.
    2. In the Create Workflow dialog box, configure the parameters based on your business requirements.
      [Figure: Create Workflow]
      | Parameter | Description |
      | --- | --- |
      | API Name | The name of the API. The name must be 4 to 50 characters in length and can contain letters, digits, and underscores (_). The name must start with a letter. |
      | API Path | The path for storing the API, such as /user. Note: The path can be up to 200 characters in length and can contain letters, digits, forward slashes (/), underscores (_), and hyphens (-). The path must start with a forward slash (/). |
      | Protocol | The protocol used by the API. Valid values: HTTP and HTTPS. If you need to call the API by using HTTPS, you must bind an independent domain name to the API in the API Gateway console after the API is published to API Gateway. You must also upload a Secure Sockets Layer (SSL) certificate in the API Gateway console. For more information, see Enable HTTPS for an API operation. |
      | Request Method | The request method. Valid values: GET and POST. |
      | Response Content Type | The response format of the API. Set the value to JSON. |
      | Visible Range | The range of users to whom the API is visible. Valid values: Work Space (the API is visible to all members in the current workspace) and Private (the API is visible only to its owner, and permissions on the API cannot be granted to other members). Note: If you set this parameter to Private, other members in the workspace cannot view the API in the API list. |
      | Label | Select tags from the Label drop-down list. Note: A tag can be up to 20 characters in length and can contain letters, digits, and underscores (_). You can configure a maximum of five tags for a workflow. |
      | Description | The description of the API. The description can be up to 2,000 characters in length. |
      | Destination Folder | The folder for storing the workflow. |
    3. Click Determine.
  5. Configure the workflow.
    1. On the configuration tab of the workflow, drag nodes to the DAG and connect the nodes. The following figure shows an example.
      [Figure: Connect the nodes]
    2. Click the API1 node. In the panel that appears, select the API that you registered from the Select API drop-down list, select set output results, and then enter {"user_id":"$.data[0].id"}.
      [Figure: Set output results]

      Use JSONPath expressions to configure response parameters. The syntax for obtaining the value of a parameter is ${NodeA.namea}, which is the same as that for configuring request parameters. {"user_id":"$.data[0].id"} assigns the value of the id parameter of the first element in the data array to the user_id parameter. Then, the API1 node returns {"user_id":"value"} in JSON format.

    3. Click the PYTHON1 node. In the panel that appears, select the function that you registered from the Select Function drop-down list.
    4. Click the SWITCH1 node. In the panel that appears, click Set branch conditions.
      You can enter conditional expressions based on the response parameter of the parent node. For example, you can enter expressions in the ${Node ID.Parameter}>1 or $.Parameter>1 format. Conditional expressions support the following operators: ==, !=, >=, >, <=, <, &&, !, (), +, -, *, /, and %.
      In this example, the user_id parameter is the response parameter of the API1 node and is used as the request parameter of the SWITCH1 node.
      Branch Node 1: $.user_id != 1, indicating that Branch Node 1 is run if the value of the user_id parameter is not 1. 
      Branch Node 2: $.user_id == 1, indicating that Branch Node 2 is run if the value of the user_id parameter is 1. 
    5. Click the end node. Then, click the Response Param tab in the right-side navigation pane and configure the response parameters.
  6. Test the workflow.
    1. Click Test in the upper-right corner.
    2. In the Test APIs dialog box, click Determine.
    3. View the run logs and execution results on the Operation Log and Execution results tabs in the lower part of the configuration tab.
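
The data flow of this example can be approximated locally with the following Python sketch. It is a simplified model under stated assumptions, not the behavior of the DataWorks runtime: json_path is a hypothetical helper that supports only dotted keys and [n] indexes, api_response is made-up sample data, choose_branch is a hypothetical stand-in for the SWITCH1 branch conditions, and the handler function repeats the pass-through template from the function that you registered.

```python
import json
import re


def json_path(expression, document):
    """Evaluate a small JSONPath subset: dotted keys and [n] indexes."""
    value = document
    # "$.data[0].id" yields the tokens ("data", ""), ("", "0"), ("id", "").
    for key, index in re.findall(r"\.(\w+)|\[(\d+)\]", expression):
        value = value[key] if key else value[int(index)]
    return value


def handler(event, context):
    # Pass-through template: parse the JSON string and return the object.
    obj = json.loads(event)
    return obj


def choose_branch(output):
    """Mirror the two branch conditions configured on SWITCH1."""
    if output["user_id"] != 1:   # $.user_id != 1
        return "Branch Node 1"
    return "Branch Node 2"       # $.user_id == 1


# Hypothetical response of the registered API (sample data only).
api_response = {"data": [{"id": 1, "name": "alice"}]}

# Output setting of API1: {"user_id": "$.data[0].id"}
output_mapping = {"user_id": "$.data[0].id"}
api1_output = {
    name: json_path(expression, api_response)
    for name, expression in output_mapping.items()
}
print(api1_output)                 # {'user_id': 1}
print(choose_branch(api1_output))  # Branch Node 2

# The function receives its input as a JSON string (event) and returns an object.
print(handler(json.dumps(api1_output), None))  # {'user_id': 1}
```

Changing the id value in api_response to any value other than 1 makes choose_branch return Branch Node 1, which matches the first branch condition.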