DataWorks: Create and use EMR resources

Last Updated: Jun 21, 2026

DataWorks lets you visually create E-MapReduce (EMR) JAR and EMR FILE resources. You can upload custom functions or open-source MapReduce (MR) sample code as resources and reference them in data development tasks that run on EMR compute nodes. This topic describes how to create, upload, and commit a resource.

Prerequisites

Prerequisites vary by engine type. You must complete the required preparations in both EMR and DataWorks.

Create an EMR resource

  1. Log on to the DataWorks console. In the target region, click Data Development and O&M > Data Development in the left-side navigation pane. Select a workspace from the drop-down list and click Go to Data Development.

  2. Move the pointer over the Create icon and choose Create Resource > EMR > EMR JAR or Create Resource > EMR > EMR File.

    Alternatively, find the workflow, right-click the workflow, and choose Create Resource > EMR > EMR JAR or Create Resource > EMR > EMR File.

  3. In the Create Resource dialog box, configure the parameters.

    Engine Type: The engine type is EMR by default and cannot be changed.

    Engine Instance: Select the engine instance from the drop-down list.

      Note: This list displays the EMR engines bound to the workspace in Data Development.

    Resource Type: Only the EMR JAR and EMR FILE resource types are supported.

    Path: The workflow path where the resource will be located.

    Storage Path: Select a storage path for the resource. Supported storage types include OSS and HDFS.

      • If you select OSS, you must first grant authorization and then select a directory.

        Note: You must use an Alibaba Cloud account to grant the permissions.

      • If you select HDFS, you must manually enter the storage path.

      Note: Task JAR packages can be stored only in the following locations:

      • The master node of the EMR cluster.

      • Object Storage Service (OSS). We recommend that you store JAR packages in OSS. For more information about how to store JAR packages in OSS, see Operations in the OSS console.

    File Source: The source of the target file. Supported sources include Local and OSS.

      • If you select Local, click Click Upload in the Upload File field to upload a local file.

      • If you select OSS, select an OSS file from the Select file drop-down list, or click Create in OSS to create an OSS file.

    Name: The name of the new EMR resource. If you upload a JAR resource, the name must include the .jar extension.

  4. In the Create Resource dialog box, click Create.

  5. Click the Save and Submit icons in the toolbar to save and commit the resource.

    Note

    When you commit the resource, you must select a scheduling resource group. If you use a serverless resource group, DataWorks sends a task to the engine to create the resource and prints the execution logs. If a problem occurs during the commit, use the logs for troubleshooting. If you do not have an available serverless resource group, you must purchase and configure one. For more information, see Use serverless resource groups.
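
The parameter rules in step 3 can be checked before you open the dialog box. The following is a minimal Python sketch, not a DataWorks API: the function name `validate_emr_resource` and its argument names are hypothetical, and the empty-name check is an assumption. Otherwise it encodes only the constraints stated in this topic (the two resource types, the OSS and HDFS storage types, the Local and OSS file sources, and the .jar extension for JAR resources).

```python
# Hypothetical pre-flight check; it mirrors only the rules stated in this topic
# and does not call any DataWorks or EMR service.
RESOURCE_TYPES = ("EMR JAR", "EMR FILE")
STORAGE_TYPES = ("OSS", "HDFS")
FILE_SOURCES = ("Local", "OSS")


def validate_emr_resource(name, resource_type, storage_type, file_source):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if resource_type not in RESOURCE_TYPES:
        problems.append(f"unsupported resource type: {resource_type}")
    if storage_type not in STORAGE_TYPES:
        problems.append(f"unsupported storage type: {storage_type}")
    if file_source not in FILE_SOURCES:
        problems.append(f"unsupported file source: {file_source}")
    if not name:
        # Assumption: an empty name is rejected.
        problems.append("name must not be empty")
    elif resource_type == "EMR JAR" and not name.endswith(".jar"):
        problems.append("JAR resource name must include the .jar extension")
    return problems


# A valid JAR resource, then one whose name lacks the extension.
print(validate_emr_resource("xc_ip2region-emr.jar", "EMR JAR", "OSS", "Local"))
print(validate_emr_resource("xc_ip2region", "EMR JAR", "HDFS", "Local"))
```

Returning a list of problems rather than raising on the first one lets you report every invalid field at once.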

Use a resource to register a function

DataWorks lets you register a function from an uploaded resource in the UI. After you upload the required resource, open the Register Function form in Data Development and configure the parameters. For example:

• Function Type: Other Function.

• EMR Engine Instance: the target instance, such as xc_emr2.

• EMR Engine Type: Hive.

• EMR Database: default.

• Function Name: a name such as xc_ip2region.

• Full class name of the UDF: for example, org.alidata.emr.udf.Ip2Region.

• Resource List: the uploaded JAR file, such as xc_ip2region-emr.jar, selected from the resource tree on the left.
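
For orientation, the example values above correspond closely to the fields of Hive's `CREATE FUNCTION ... AS ... USING JAR` statement. The Python sketch below only renders such a statement as a string. Whether DataWorks issues this exact statement when you register the function is an assumption, and the helper name `hive_create_function` and the JAR URI are hypothetical placeholders.

```python
def hive_create_function(database, function_name, class_name, jar_uri):
    """Render a Hive CREATE FUNCTION statement from the form values."""
    # Hive takes the class name and the JAR URI as quoted string literals.
    return (
        f"CREATE FUNCTION {database}.{function_name} "
        f"AS '{class_name}' USING JAR '{jar_uri}';"
    )


# Values from the example above; the JAR location is a made-up placeholder.
statement = hive_create_function(
    database="default",
    function_name="xc_ip2region",
    class_name="org.alidata.emr.udf.Ip2Region",
    jar_uri="oss://example-bucket/path/xc_ip2region-emr.jar",
)
print(statement)
```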

Use a resource in a node

After you create an EMR JAR resource, you can reference it directly in a node. Open the node, right-click the resource in the Resources folder, and choose Insert Resource Path. You can also right-click the resource file in the resource tree on the left and choose Insert Resource Path.

Note

After you insert the resource path, a line of code in the format @resource_reference{"resourcename"} is automatically added to the node, which references the resource.

For detailed steps, see Create an EMR MR node.
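
To make the reference format concrete, the following Python sketch builds a reference line in the format described in the note and extracts resource names from node code. It is illustrative only: the helper names `resource_reference` and `referenced_resources` are hypothetical, the second line of the sample node code is a made-up placeholder, and the sketch assumes the inserted line matches the format in the note exactly.

```python
import re

# Matches the reference format described in the note above.
_REFERENCE = re.compile(r'@resource_reference\{"([^"]+)"\}')


def resource_reference(resource_name):
    """Build the line that Insert Resource Path adds for a resource."""
    return '@resource_reference{"%s"}' % resource_name


def referenced_resources(node_code):
    """Return the resource names referenced in the node code, in order."""
    return _REFERENCE.findall(node_code)


# Sample node code; the command on the second line is a placeholder.
node_code = "\n".join([
    resource_reference("xc_ip2region-emr.jar"),
    "xc_ip2region-emr.jar <main class and arguments>",
])
print(referenced_resources(node_code))
```

Listing the referenced names this way can help you confirm that every resource a node depends on has been committed.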

Manage resource versions

A new resource version is generated each time you commit a resource. To view or download earlier versions, right-click the target resource file in the resource directory on the left, such as xc_ip2region.jar, and select View Historical Versions. The Version Information dialog box displays the File ID, Version Number, Submitter, Submission Time, Change Type, and Status for each version. Click Download Code for a specific version to obtain its historical code, or select multiple versions and click Compare at the bottom to compare them.