This topic describes how to create, reference, and download JAR and Python resources.

Prerequisites

A MaxCompute compute engine is bound to your workspace on the Workspace Management page. The MaxCompute service is displayed in the workspace only after the compute engine is bound. For more information, see Configure a workspace.

Background information

If your code or function requires resource files such as .jar files, you can upload resources to your workspace and reference them.
If the existing built-in functions do not meet your requirements, DataWorks allows you to create user-defined functions (UDFs) and customize processing logic. You can upload the required JAR packages to your workspace so that you can reference them when you create UDFs.
Note
  • You can view built-in functions on the Built-In Functions tab. For more information, see Functions.
  • You can view the UDFs that you have committed or deployed in DataWorks on the MaxCompute Functions tab. For more information, see MaxCompute functions.

You can upload different types of resources to MaxCompute, including text files, Python code, and compressed packages in the .zip, .tgz, .tar.gz, .tar, and .jar formats. You can read or use these resources when you run UDFs or MapReduce jobs.

MaxCompute provides API operations for you to read and use resources. The following types of resources are supported:
  • Python: the Python code you have written. You can use Python code to register Python UDFs.
  • JAR: the compiled Java JAR packages.
  • Archive: the compressed files that can be identified by the file name extension. Supported file types include .zip, .tgz, .tar.gz, .tar, and .jar.
  • File: the files in the .zip, .so, or .jar format.
JAR resources and file resources have the following differences:
  • You can write Java code in an offline Java environment, compress the code to a JAR package, and then upload the package as a JAR resource to DataWorks.
  • You can create and edit a small-sized file resource in the DataWorks console.
  • To upload a resource file that is larger than 500 KB in size from your local device, you can select Large File (over 500 KB) when you create a file resource.
    Note Each resource file to be uploaded cannot exceed 30 MB. You can use the MaxCompute client to upload a resource file that is larger than 30 MB in size. Then, commit it to DataWorks on the MaxCompute Resources tab. For more information, see MaxCompute resources.
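
The size rules above can be sketched as a small helper that picks an upload path from a file size. This is only an illustration: the function name, the returned labels, and the use of binary units (1 KB = 1,024 bytes) are assumptions and are not part of DataWorks.

```python
# Assumption: sizes are measured in binary units (1 KB = 1024 bytes).
KB = 1024
MB = 1024 * KB

LARGE_FILE_THRESHOLD = 500 * KB   # above this, select Large File (over 500 KB)
CONSOLE_UPLOAD_LIMIT = 30 * MB    # above this, use the MaxCompute client


def choose_upload_path(size_bytes):
    """Return a hypothetical label for how to upload a file of this size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes > CONSOLE_UPLOAD_LIMIT:
        # Too large for the console: upload with the MaxCompute client,
        # then commit the resource on the MaxCompute Resources tab.
        return "maxcompute-client"
    if size_bytes > LARGE_FILE_THRESHOLD:
        return "console-large-file"
    return "console"


if __name__ == "__main__":
    for size in (100 * KB, 600 * KB, 31 * MB):
        print(size, choose_upload_path(size))
```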

Create a JAR resource

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides, find the workspace, and then click Data Analytics in the Actions column.
  2. Move the pointer over the Create icon and choose MaxCompute > Resource > JAR.
    Alternatively, you can click the required workflow in the Business Flow section, right-click MaxCompute, and then choose Create > Resource > JAR.

    For more information about how to create a workflow, see Create a workflow.

  3. In the Create Resource dialog box, set the Resource Name and Location parameters.
    Note
    • If the selected JAR package has been uploaded from the MaxCompute client, clear Upload to MaxCompute. If you do not clear it, an error occurs during the upload process.
    • The resource name can be different from the name of the uploaded file.
    • The resource name can contain letters, digits, underscores (_), and periods (.), and is not case-sensitive. It must be 1 to 128 characters in length. A JAR resource name must end with .jar, and a Python resource name must end with .py.
  4. Click Upload and select the file to upload.
  5. Click Create.
  6. Click the Commit icon in the toolbar to commit the resource to the development environment.
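
The naming rules in step 3 can be checked before you upload a file. The following is a minimal sketch, not a DataWorks API: the function name is hypothetical, and matching the extension without regard to case is an assumption based on the statement that resource names are not case-sensitive.

```python
import re

# Letters, digits, underscores, and periods; 1 to 128 characters in total.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,128}")


def is_valid_resource_name(name, extension=".jar"):
    """Check a resource name against the rules listed in step 3."""
    if _NAME_PATTERN.fullmatch(name) is None:
        return False
    # Assumption: the extension check ignores case.
    return name.lower().endswith(extension.lower())


if __name__ == "__main__":
    print(is_valid_resource_name("my_udf.jar"))        # True
    print(is_valid_resource_name("my udf.jar"))        # False: contains a space
    print(is_valid_resource_name("ipint.py", ".py"))   # True
```

The sketch follows only the character set stated in this procedure. The Python resource procedure lists hyphens (-) as additional valid characters, which this pattern does not accept.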

Create a Python resource and register a UDF

  1. Create a Python resource.
    1. On the Data Analytics tab, move the pointer over the Create icon and choose MaxCompute > Resource > Python.
      Alternatively, you can click the required workflow in the Business Flow section, right-click MaxCompute, and then choose Create > Resource > Python.
    2. In the Create Resource dialog box, set the Resource Name and Location parameters.
      Notice The resource name can contain letters, digits, periods (.), underscores (_), and hyphens (-). It must end with .py.
    3. Click Create.
    4. Enter the code of the created Python resource in the code editor. Sample code:
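      from functools import reduce  # reduce is not a built-in in Python 3

      from odps.udf import annotate

      @annotate("string->bigint")
      class ipint(object):
          def evaluate(self, ip):
              # Convert a dotted IPv4 string such as '1.2.3.4' to an integer.
              try:
                  return reduce(lambda x, y: (x << 8) + y, map(int, ip.split('.')))
              except Exception:
                  # Return 0 for NULL or malformed input.
                  return 0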
    5. Click the Commit icon in the toolbar.
  2. Register a UDF.
    1. On the Data Analytics tab, move the pointer over the Create icon and choose MaxCompute > Function.
      Alternatively, you can click the required workflow, right-click MaxCompute, and then choose Create > Function.
    2. In the Create Function dialog box, set the Function Name and Location parameters.
    3. Click Create.
    4. In the Register Function section, enter the class name of the function and the name of the Python resource that has been created, and then click the Commit icon in the toolbar. The class name is in the format <resource name without the .py extension>.<class name>. In this example, the class name is ipint.ipint, which assumes that the resource is named ipint.py.
    5. Verify that the ipint function works as expected. To test it, create an ODPS SQL node in the DataWorks console and run an SQL statement that calls the function.
    You can also create an ipint.py file on your local device and upload it by using the MaxCompute client. For more information, see Client.
    odps@ MaxCompute_DOC>add py D:/ipint.py;
    OK: Resource 'ipint.py' have been created.
    odps@ MaxCompute_DOC>create function ipint as ipint.ipint using ipint.py;
    Success: Function 'ipint' have been created.

    In the preceding commands, add py uploads the resource file and create function registers the UDF on the MaxCompute client. For more information, see Functions operations. You can use the UDF after it is registered.
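
Before you upload the resource, you can check the conversion logic on your local device without the odps package. The following sketch copies the body of evaluate into a plain function. The name ip_to_int is hypothetical and used only for this local check.

```python
from functools import reduce


def ip_to_int(ip):
    """Mirror of ipint.evaluate: fold the dotted parts into one integer."""
    try:
        # Shift the running value left by 8 bits and add the next part.
        return reduce(lambda x, y: (x << 8) + y, map(int, ip.split(".")))
    except Exception:
        # None or non-numeric input falls back to 0, as in the UDF.
        return 0


if __name__ == "__main__":
    print(ip_to_int("1.2.3.4"))      # 16909060
    print(ip_to_int("192.168.0.1"))  # 3232235521
    print(ip_to_int(None))           # 0
```

The logic does not verify that the input has exactly four parts or that each part is in the range 0 to 255. For example, "1.2" returns 258 instead of 0.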

Reference and download resources

To download a resource, double-click Resource, select the required resource, and then click Download. For more information about how to download a resource by using the MaxCompute client, see Resource operations.