DataWorks supports ODPS Spark nodes. This topic uses the JAR resource type as an example to describe how to create and configure an ODPS Spark node.

Create and upload a resource

  1. Log on to the DataWorks console. In the left-side navigation pane, click Workspaces. On the Workspaces page, find the target workspace and click Data Analytics in the Actions column.
  2. On the DataStudio page, create a JAR resource.
    You can create a JAR resource in either of the following ways:
    • Move the pointer over the Create icon and choose MaxCompute > Resource > JAR.
    • Find the target workflow, click MaxCompute, right-click Resource, and choose Create > JAR.
  3. In the Create Resource dialog box that appears, enter the resource name and select the target folder.
    Note: If multiple MaxCompute computing engines are bound to the current workspace, you must select one from the Engine Instance MaxCompute drop-down list.
  4. Click Upload and select the target file to upload.
  5. Click OK.

Create an ODPS Spark node

  1. Create an ODPS Spark node.
    You can create an ODPS Spark node in either of the following ways:
    • On the DataStudio page, move the pointer over the Create icon and choose MaxCompute > ODPS Spark.
    • Find the target workflow, right-click MaxCompute, and choose Create > ODPS Spark.
  2. In the Create Node dialog box that appears, enter the node name, select the target folder, and click Commit.
    Note: A node name can be up to 128 characters in length.
  3. On the ODPS Spark page that appears, set the relevant parameters.

    Set Spark Version and Language as needed. The remaining parameters depend on the value of the Language parameter; set them as prompted on the page.

    The following table describes the parameters that appear after you set the Language parameter to Java/Scala.
    Parameter | Description
    Spark Version | The Spark version of the node. Valid values: Spark1.x and Spark2.x.
    Language | The programming language of the node. Valid values: Java/Scala and Python. Select Java/Scala.
    Main JAR Resource | The main JAR resource referenced by the node. Select a JAR resource that you uploaded from the drop-down list.
    Configuration Items | The configuration items of the node. To add a configuration item, click Add and specify its key and value.
    Main Class | The name of the main class of the node.
    Arguments | The arguments used to assign values to variables in the code during node scheduling. For example, enter ${bizdate} ${yesterday}. Separate multiple arguments with spaces.
    JAR Resources | The JAR resources referenced by the node. Select a JAR resource that you uploaded from the drop-down list. The ODPS Spark node automatically finds the uploaded JAR resources based on the resource type.
    File Resources | The file resources referenced by the node. Select a file resource that you uploaded from the drop-down list. The ODPS Spark node automatically finds the uploaded file resources based on the resource type.
    Archive Resources | The archive resources referenced by the node. Select an archive resource that you uploaded from the drop-down list. The ODPS Spark node automatically finds the uploaded archive resources based on the resource type. Only compressed resources appear.

    After the configuration is completed, you can save and commit the node. For more information, see ODPS Spark node configuration tab.

  4. Configure the node properties.

    Click the Properties tab in the right-side navigation pane. On the Properties tab that appears, set the relevant parameters. For more information, see Properties.

  5. Commit the node.

    After the node properties are configured, click the Save icon in the upper-left corner. Then commit the node, or commit and unlock it, to the development environment.

  6. Deploy the node.

    For more information, see Deploy a node.

  7. Test the node in the production environment.
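
To illustrate how the Main Class and Arguments parameters from step 3 relate to the code in the main JAR, the following is a minimal sketch rather than an official template. The class name SparkArgsDemo, the JobArgs holder, and the parseArgs helper are hypothetical names chosen for illustration. The sketch assumes that Arguments is set to ${bizdate} ${yesterday} and that the scheduler replaces both variables with concrete values before they reach main as space-separated arguments. The Spark logic itself is omitted so that the example has no external dependencies.

```java
import java.util.Arrays;

/**
 * Hypothetical main class for an ODPS Spark node (illustration only).
 * With this class in the main JAR, the Main Class parameter would be
 * set to "SparkArgsDemo" (or the fully qualified name if it is in a package).
 */
public class SparkArgsDemo {

    /** Simple holder for the two scheduling arguments assumed in this sketch. */
    public static final class JobArgs {
        public final String bizdate;
        public final String yesterday;

        JobArgs(String bizdate, String yesterday) {
            this.bizdate = bizdate;
            this.yesterday = yesterday;
        }
    }

    /**
     * Maps positional arguments to named fields.
     * Assumes exactly two arguments, in the order given in the Arguments field.
     */
    public static JobArgs parseArgs(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException(
                "Expected 2 arguments (bizdate yesterday), got "
                    + args.length + ": " + Arrays.toString(args));
        }
        return new JobArgs(args[0], args[1]);
    }

    public static void main(String[] args) {
        JobArgs jobArgs = parseArgs(args);
        // A real job would create its Spark session here and use the
        // values, for example to select a partition to process.
        System.out.println(
            "bizdate=" + jobArgs.bizdate + ", yesterday=" + jobArgs.yesterday);
    }
}
```

Validating the argument count up front makes a misconfigured Arguments field fail with a clear message instead of an ArrayIndexOutOfBoundsException later in the job.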