To use MaxCompute resource files in your code or functions, you must first create the resources in a workspace or upload existing resources to the workspace. You can run MaxCompute SQL commands to upload and manage MaxCompute resources, or you can create the resources in the DataWorks console. This topic describes how to create MaxCompute resources in the DataWorks console, use the resources in nodes, and register functions based on the resources.
Prerequisites
- A compute engine is associated with a DataWorks workspace.
After you associate a MaxCompute compute engine with a DataWorks workspace on the Workspace Management page, the MaxCompute folder is displayed in DataStudio. For more information, see Create and manage workspaces.
- A workflow is created.
DataWorks uses workflows to store resources. Therefore, you must create a workflow before you create resources. For more information, see Create a workflow.
- A node is created.
Created resources must be referenced by nodes. You must create a node based on your business requirements before you reference resources in the node. For information about how to create a node, see Create an ODPS SQL node.
Background information
You can create the following types of MaxCompute resources in DataWorks:
- Python: the Python code that you write. You can use Python code to register Python UDFs.
- JAR: a compiled JAR package that is used to run Java programs.
- Archive: a package. You can determine the compression type of a package based on the file name extension. Packages in the following formats are supported: .zip, .tgz, .tar.gz, .tar, and .jar.
- File: a file resource. File resources in the following formats are supported: .zip, .so, and .jar.
Note If you want to upload a resource file that is larger than 500 KB from your on-premises machine when you create a file resource, select Large File (> 500 KB).
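The Archive rule above, that the compression type follows from the file name extension, can be sketched as a small check. This is an illustrative helper only: the name `archive_format` is hypothetical and is not part of DataWorks or MaxCompute, and the match is case-sensitive by assumption.

```python
# Extensions that the list above names for Archive resources.
ARCHIVE_EXTENSIONS = (".tar.gz", ".tgz", ".tar", ".zip", ".jar")


def archive_format(file_name):
    """Return the matching archive extension, or None if none matches.

    Hypothetical helper for illustration; not a DataWorks API.
    """
    for ext in ARCHIVE_EXTENSIONS:
        if file_name.endswith(ext):
            return ext
    return None


print(archive_format("deps.tar.gz"))
print(archive_format("notes.txt"))
```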
This topic describes the following operations:
- Create resources or upload existing resources
- Enable a node to use the resource
- Use resources to register functions
- Other resource management operations
Limits
- Resource size
You can directly upload a resource of up to 200 MB to DataWorks. For more information, see Manage MaxCompute resources.
- Resource deployment
If you use a workspace in standard mode, you must deploy resources to the production environment so that projects in the production environment can use them.
Note The compute engine information varies based on the environment of the workspace with which the compute engine is associated. Make sure that you know which compute engine is associated with the workspace in each environment so that you can query valid table and resource data in subsequent operations. For more information about the compute engine that is associated with a workspace in a specific environment, see Associate a MaxCompute compute engine with a workspace.
- Resource management
In the DataWorks console, you can view and manage only the resources that were uploaded by using the DataWorks console. If you add resources to a MaxCompute compute engine by using other tools such as MaxCompute Studio, you must use the MaxCompute Resources feature in DataWorks DataStudio to manually load the resources to DataWorks. You can view and manage the resources in DataWorks after the loading is complete. For more information, see Manage MaxCompute resources.
Create resources or upload existing resources
DataWorks allows you to create resources or upload existing resources. Select a method based on the options that the GUI provides for each type of resource.
- Go to the DataStudio page.
  - Log on to the DataWorks console.
  - In the left-side navigation pane, click Workspaces.
  - In the top navigation bar, select the region in which the workspace that you want to manage resides. Find the workspace and click DataStudio in the Actions column.
- Create a resource. You can create the desired type of resource in the desired workflow based on your business requirements. The following figure shows the entry points for creating resources and the creation procedure.
Note If no workflow is available, create one. For information about how to create a workflow, see Create a workflow.
- Configure information about the resource based on your business requirements. In this topic, a Python resource is created. Configuration items vary based on the type of resource that you create.
Note
  - If you create a JAR resource that has never been uploaded to the MaxCompute client, you must select Upload to MaxCompute. If the JAR resource has already been uploaded to the MaxCompute client, clear Upload to MaxCompute. Otherwise, an error is reported when you upload the JAR resource.
  - The resource name can be different from the name of the uploaded file.
  - The name of a JAR resource must end with .jar. The name of a Python resource must end with .py.
- Optional: Write code for the resource. If you upload an existing resource, skip this step. The following code is an example of what a Python resource can contain. You can replace the code based on your business requirements.
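```python
from functools import reduce  # built in on Python 2; must be imported on Python 3

from odps.udf import annotate


@annotate("string->bigint")
class ipint(object):
    """Convert a dotted IPv4 string to its integer value."""

    def evaluate(self, ip):
        try:
            # Fold the octets into one integer, 8 bits at a time.
            return reduce(lambda x, y: (x << 8) + y, map(int, ip.split('.')))
        except Exception:
            # Return 0 for NULL or malformed input.
            return 0
```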
- Commit and deploy the resource. Click the commit icon in the top toolbar to commit the resource to the development environment.
Note If nodes in the production environment need to use this resource, you also need to deploy the resource to the production environment. For more information, see Deploy nodes.
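Before you commit a Python resource such as the ipint example in the steps above, it can help to check the conversion logic locally. The following is a minimal sketch that copies only the body of `evaluate` into a plain function so that it runs without the `odps` package. The function name `ip_to_int` is hypothetical and used for illustration only.

```python
from functools import reduce


def ip_to_int(ip):
    """Same logic as ipint.evaluate, without the MaxCompute UDF wrapper."""
    try:
        # Shift the running value left by 8 bits and add the next octet.
        return reduce(lambda x, y: (x << 8) + y, map(int, ip.split('.')))
    except Exception:
        # NULL (None) or non-numeric input falls back to 0.
        return 0


print(ip_to_int("192.168.0.1"))  # 3232235521
print(ip_to_int("not-an-ip"))    # 0
print(ip_to_int(None))           # 0
```

Note that this logic does not verify that the input has exactly four octets or that each octet is in the range 0 to 255. It only falls back to 0 when parsing fails.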
Enable a node to use the resource
After a node references the resource, code in the @resource_reference{"Resource name"} format is displayed in the node. The display format of the code varies based on the type of node that references the resource. For example, if a PyODPS 2 node references the resource, the code is displayed in the ##@resource_reference{"Resource name"} format.
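As an illustration of the PyODPS 2 format, the following sketch shows one way a node might import a referenced Python resource as a module. It assumes that the referenced file is available under its resource name in the node's working directory. The resource name `ipint_util.py` and the helper `import_resource_module` are hypothetical, so verify the behavior in your own workspace.

```python
##@resource_reference{"ipint_util.py"}
import os
import sys


def import_resource_module(file_name):
    """Import a referenced .py resource as a module.

    Assumption: the referenced file can be found at file_name, relative to
    the node's working directory.
    """
    resource_dir = os.path.dirname(os.path.abspath(file_name))
    if resource_dir not in sys.path:
        # Make the directory that holds the resource importable.
        sys.path.append(resource_dir)
    module_name = os.path.splitext(os.path.basename(file_name))[0]
    return __import__(module_name)


# Import the module only if the referenced file is actually present.
if os.path.exists("ipint_util.py"):
    ipint_util = import_resource_module("ipint_util.py")
```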
Use resources to register functions
To register a function based on a resource, create a MaxCompute function as described in Create and use a MaxCompute function. On the function configuration tab, enter the name of the desired resource, as shown in the following figure.
For information about MaxCompute built-in functions, see Functions.
For information about how to view the functions in a MaxCompute compute engine and the change history of the functions, and perform other operations, see MaxCompute functions.
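The console procedure above is the method that this topic describes. For comparison only, the following sketch shows how the same registration might look with PyODPS. It assumes that `o` is a PyODPS entry object (`odps.ODPS`) and that the resource already exists in the project. The helper name `register_python_udf` is hypothetical. For a Python UDF, the class name is assumed to take the form `<file name without .py>.<class name>`.

```python
def register_python_udf(o, function_name, resource_name, class_name):
    """Register a Python UDF from an existing .py resource.

    o: a PyODPS entry object (assumed to be odps.ODPS).
    resource_name: for example "ipint.py".
    class_name: the UDF class defined in that file, for example "ipint".
    """
    resource = o.get_resource(resource_name)
    # "ipint.py" + "ipint" -> "ipint.ipint"
    module_name = resource_name.rsplit(".", 1)[0]
    class_type = "{0}.{1}".format(module_name, class_name)
    return o.create_function(
        function_name, class_type=class_type, resources=[resource]
    )


# Example call, assuming an entry object `o` is available:
# register_python_udf(o, "ipint", "ipint.py", "ipint")
```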
Other resource management operations
Right-click the name of the desired resource to perform other operations on the resource.
- Delete a resource
Only a resource that is used by the compute engine associated with the workspace in the development environment can be deleted directly. To delete the resource from the workspace in the production environment, deploy the deletion operation from the development environment to the production environment. The deletion takes effect in the production environment after the operation is deployed. For more information, see Deploy nodes.
- Compare resource versions and roll back a resource version
You can click Versions to view the saved or committed versions of a resource and compare the changes between versions.
Note When you compare resource versions, you must select at least two versions.
Appendix 1: View resources used by a compute engine by using commands
Command | Description |
---|---|
list resources; | Views all resources in a compute engine in the development environment. |
use <name of the compute engine in the production environment>; list resources; | Views all resources in a compute engine in the production environment. |
desc resource <resource_name>; | Views the details of a specified resource. |
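The commands in the table can also be approximated from Python. The following sketch assumes that `o` is a PyODPS entry object (`odps.ODPS`). The helper names are hypothetical, and passing `project` is assumed to play the role of the `use` command for the production compute engine.

```python
def list_resource_names(o, project=None):
    """Roughly equivalent to `list resources;`.

    o: a PyODPS entry object (assumed to be odps.ODPS).
    project: optional project name; None means the default project of `o`.
    """
    return sorted(resource.name for resource in o.list_resources(project=project))


def get_resource_details(o, resource_name, project=None):
    """Roughly equivalent to `desc resource <resource_name>;`.

    Returns the resource object so that the caller can inspect it.
    """
    return o.get_resource(resource_name, project=project)


# Example calls, assuming an entry object `o` is available:
# print(list_resource_names(o))
# print(get_resource_details(o, "ipint.py"))
```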
Appendix 2: Add compute engine resources to DataWorks for management
You can use the MaxCompute Resources feature in DataStudio to load a MaxCompute compute engine resource whose size is no more than 200 MB to DataWorks for visualized management. For more information, see Manage MaxCompute resources.