In the DataWorks console, you can create an EMR JAR resource by uploading a Java Archive (JAR) file that contains user-defined functions (UDFs) or MapReduce code. After you commit the resource, you can reference it in compute nodes such as an EMR MR node. This topic describes how to create an EMR JAR resource by uploading a file, commit the resource, and reference it in compute nodes.

Prerequisites

  • An EMR cluster is created. The security group to which the cluster belongs has an inbound rule with the following settings:
    • Action: Allow
    • Protocol type: Custom TCP
    • Port range: 8898/8898
    • Authorization object: 100.104.0.0/16
  • An EMR compute engine instance is bound to the required workspace. The EMR option is displayed only after you bind an EMR compute engine instance to the workspace on the Workspace Management page. For more information, see Configure a workspace.
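
The security group prerequisite above can be checked mechanically. The following is a minimal sketch, not an official tool: the function name `rule_allows_dataworks` and the dictionary keys (`action`, `protocol`, `port_range`, `source_cidr`) are hypothetical and only mirror the fields listed above. It uses the Python standard library to test whether an inbound rule allows TCP port 8898 from the whole 100.104.0.0/16 block.

```python
import ipaddress

# Values taken from the prerequisite list above.
REQUIRED_PORT = 8898
REQUIRED_SOURCE = ipaddress.ip_network("100.104.0.0/16")


def rule_allows_dataworks(rule: dict) -> bool:
    """Return True if a (hypothetical) inbound rule dict satisfies the prerequisite."""
    if rule.get("action", "").lower() != "allow":
        return False
    if rule.get("protocol", "").lower() != "tcp":
        return False
    # Port ranges are written as "start/end", for example "8898/8898".
    start, end = (int(p) for p in rule["port_range"].split("/"))
    if not (start <= REQUIRED_PORT <= end):
        return False
    # The rule's source block must contain the whole required CIDR block.
    source = ipaddress.ip_network(rule["source_cidr"])
    return REQUIRED_SOURCE.subnet_of(source)
```

A broader rule, such as one that opens ports 8000 to 9000 to 100.64.0.0/10, also passes this check because it contains the required port and address block.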

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides, find the workspace, and then click Data Analytics in the Actions column.
  2. In the Data Analytics pane, move the pointer over the Create icon and choose EMR > Resource > EMR JAR.
    Alternatively, you can click a workflow in the Business Flow section, right-click EMR, and then choose Create > Resource > EMR JAR.
  3. In the Create Resource dialog box, set the parameters as required.
    • Resource Name: The name of the resource to create. The name must end with the .jar suffix.
    • Location: The folder in which to store the resource. The default value is the path of the current folder. You can change the path.
    • Resource Type: The type of the resource. Set the value to EMR JAR.
    • Engine Instance: The E-MapReduce cluster where the resource resides. Select a cluster from the drop-down list.
    • Storage Path: The storage path of the resource.
      • If you select OSS, authorize DataWorks and E-MapReduce to access Object Storage Service (OSS) and then select a folder.
      • If you select HDFS, enter a storage path.
    • File: Click Upload, select a local JAR package, and then click Open.
  4. Click Create.
  5. Click the Save and Submit icons in the toolbar to save and commit the resource to the development environment.
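
Before you upload a file in step 3, it can help to verify locally that the resource name and the file meet the basic requirements. The following is a minimal sketch under stated assumptions: the function `check_jar_before_upload` is a hypothetical helper, not part of DataWorks. It relies only on the rule stated above that the resource name must end with .jar, and on the fact that a JAR file is a ZIP archive.

```python
import zipfile
from pathlib import Path


def check_jar_before_upload(resource_name: str, local_path: str) -> list:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    # The Resource Name parameter must carry the .jar suffix.
    if not resource_name.endswith(".jar"):
        problems.append("resource name must end with .jar")
    path = Path(local_path)
    if not path.is_file():
        problems.append("local file not found")
    # A JAR file is a ZIP archive, so a ZIP check catches corrupt or mislabeled files.
    elif not zipfile.is_zipfile(path):
        problems.append("local file is not a valid JAR (ZIP) archive")
    return problems
```

This check only confirms that the file is a readable archive. It does not verify that the JAR contains the expected UDF or MapReduce classes.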

What to do next

After you create an EMR JAR resource, you can reference the resource in the code of compute nodes such as an EMR MR node. For more information, see Create an EMR MR node.