Data Studio supports uploading CDH Jar and File resources to a CDH cluster and registering them as custom functions (UDFs) for use in Flink jobs and Apache Hive SQL queries. This topic describes how to create, use, and manage CDH resources and functions through Resource Management.
Prerequisites
Before you begin, ensure that you have:
Registered a CDH cluster with DataWorks. Resource and function creation is based on Flink computing resources.
Completed development of the resource files to upload.
Access Resource Management
Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a region. Find the target workspace and choose Shortcuts > Data Studio in the Actions column.
In the left-side navigation pane, click the Resource Management button to open the Resource Management page.
Click the create button to create a resource or function. Optionally, click Create Folder first to organize your directory, then right-click the folder and select Create to choose the type.
Create and use resources
Resource types
DataWorks supports uploading the following resource types to a CDH cluster. Use them to develop Flink jobs or create custom functions.
| Resource type | Description | Local upload | OSS upload |
|---|---|---|---|
| CDH Jar | A compiled JAR package for running Java programs. File extension: .jar. | Supported | Supported |
| CDH File | Any file type. Actual usage depends on engine support. |
Limits
Size: Maximum resource size is 500 MB.
Publishing: In a workspace in standard mode, publish the resource to make it available in the production environment. Data source configurations may differ between the development and production environments, so confirm the correct data source before operating in each environment.
Management scope: DataWorks only supports viewing and managing resources that were uploaded through DataWorks.
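The size limit and the .jar extension requirement can be checked locally before uploading. The following is a minimal sketch, not part of DataWorks: the class name ResourcePreflight and its methods are hypothetical, and it assumes that "500 MB" means 500 × 1024 × 1024 bytes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical local pre-upload check; DataWorks does not provide this class.
class ResourcePreflight {
    // Assumption: the documented 500 MB limit is read as 500 * 1024 * 1024 bytes.
    static final long MAX_BYTES = 500L * 1024 * 1024;

    /** Returns true if a resource of the given size is within the assumed limit. */
    static boolean withinSizeLimit(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= MAX_BYTES;
    }

    /** A CDH Jar resource is expected to use the .jar file extension. */
    static boolean looksLikeJar(String fileName) {
        return fileName.length() > 4 && fileName.toLowerCase().endsWith(".jar");
    }

    /** Checks that the path is an existing regular file within the size limit. */
    static boolean readyToUpload(Path file) throws IOException {
        return Files.isRegularFile(file) && withinSizeLimit(Files.size(file));
    }

    public static void main(String[] args) throws IOException {
        // "example.jar" is a placeholder; pass a real path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "example.jar");
        System.out.println(file + " ready to upload: " + readyToUpload(file));
    }
}
```

Running the check before opening the upload dialog avoids waiting on a transfer that would be rejected for size.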
Create a resource
On the Resource Management page, click the create button to open the Create Resource And Function dialog box. Configure the Type, Path, and Name.
Upload a local file as the resource source. Configure the following parameters:
| Parameter | Description |
|---|---|
| Storage Path | The HDFS path where the file is stored on the CDH cluster. Default: /user/admin/lib. If Kerberos authentication is enabled, grant the current user write permission to this directory before uploading. |
| Data Source | Select the CDH data source to upload to. |
| Resource Group | Select a Serverless resource group with connectivity to the CDH cluster. |

Click Save and then Publish in the toolbar. Only published resources are available for use in data development.
Use a resource
After creating a resource, reference it in a data development node:
In the node editor, click Resource Management in the left-side navigation pane.
Find the target resource, right-click it, and select Reference Resource.
After referencing, the resource appears as a code snippet in the following format:
##@resource_reference{"resource name"}
For example, in a CDH Hive node:
##@resource_reference{"example"}
Note: The display format varies by node type. Check the DataWorks console to confirm the format for your node type.
Alternatively, register resources as functions and use them in development nodes or SQL queries. See Create and use functions.
Create and use functions
When to use custom functions
Use a custom function (User-Defined Function, or UDF) when Apache Hive built-in functions cannot express the logic you need, such as custom data encryption, hashing, JSON parsing, or domain-specific string transformations. For standard aggregations, filters, and string operations, use built-in Hive functions instead.
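As an illustration of the kind of domain-specific string transformation that might justify a UDF, the sketch below shows the core logic of a hypothetical masking function. The class name MaskStringSketch and its masking rule are assumptions made for this example. A class deployed to Hive would typically extend a Hive UDF base class such as org.apache.hadoop.hive.ql.exec.UDF; the base class is omitted here so that the sketch compiles without Hive on the classpath.

```java
// Core logic of a hypothetical masking UDF (illustrative only).
// A deployable Hive UDF would extend a Hive base class; that is omitted here.
class MaskStringSketch {
    /**
     * Keeps the first and last `keep` characters and replaces the rest with '*'.
     * Inputs too short to leave any masked characters are masked completely.
     */
    public String evaluate(String input, int keep) {
        if (input == null) {
            return null; // propagate null input as null output
        }
        if (keep < 0) {
            throw new IllegalArgumentException("keep must not be negative");
        }
        boolean maskAll = input.length() <= 2L * keep;
        StringBuilder masked = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            boolean visible = !maskAll && (i < keep || i >= input.length() - keep);
            masked.append(visible ? input.charAt(i) : '*');
        }
        return masked.toString();
    }

    public static void main(String[] args) {
        MaskStringSketch udf = new MaskStringSketch();
        System.out.println(udf.evaluate("13812345678", 3)); // prints 138*****678
    }
}
```

Keeping the logic in a plain method like this makes it easy to unit test before packaging the class into a CDH Jar resource.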
Function types
CDH functions support the following types. Select the type that matches the return behavior of your UDF:
| Function type | Description | Example use case |
|---|---|---|
| MATH | Mathematical operations on numeric values | Custom rounding, unit conversion |
| AGGREGATE | Aggregates multiple rows into a single result (equivalent to a UDAF) | Custom sum with business rules |
| STRING | String processing and transformation | Custom masking, transliteration |
| DATE | Date and time operations | Custom calendar conversions |
| ANALYTIC | Window functions that operate over a partition of rows | Custom running totals |
| OTHER | Functions that do not fit the above categories | — |
Create a function
Before creating a function, make sure:
The CDH engine is registered as a computing resource in DataWorks.
The CDH Jar or CDH File resource to be used is already uploaded.
To create a function:
On the Resource Management page, click the create button to open the Create Resource And Function dialog box. Select the function Type, configure the Path, and enter the function Name.
Click OK. Configure the function parameters:
| Parameter | Description |
|---|---|
| Function Type | Select the type that matches your UDF: MATH, AGGREGATE, STRING, DATE, ANALYTIC, or OTHER. |
| Data Source | Select the CDH data source from the dropdown. |
| Class Name | The class name of the UDF, in the format resource_name.class_name. See Determine the class name below. |
| Resource List | Select the CDH Jar or CDH File resource to use. CDH functions only support visual mode. |
| Command Format | An example showing how to call the UDF. |

Click Save and then Publish in the toolbar. Only published functions are available in data development.
Determine the class name
You can create custom functions from both CDH Jar and CDH File resources. The Class Name format depends on the resource type used:
CDH Jar resource: Use the fully qualified Java class name in the format Java_package_name.actual_class_name. Do not include the .jar suffix.
Example: If your Maven project has:
Package: com.aliyun.cdh.examples.udf
Class: UDAFExample
Then Class Name is: com.aliyun.cdh.examples.udf.UDAFExample
To find this value in IntelliJ IDEA, right-click the class and select Copy Reference.
CDH File resource: Use the file resource name as the resource name portion, in the format file_resource_name.class_name.
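For the CDH Jar case, the Class Name value can be assembled and sanity-checked before it is entered in the dialog. The helper below is a hypothetical sketch, not a DataWorks API: ClassNameCheck, qualifiedName, and looksValid are names invented for this example. The check is deliberately simple: it does not reject Java keywords and it treats any value ending in .jar as a mistake.

```java
// Hypothetical helper for building and checking a Class Name value
// for a CDH Jar resource. Not part of DataWorks.
class ClassNameCheck {
    /** Joins a Java package name and a simple class name with a dot. */
    static String qualifiedName(String packageName, String simpleName) {
        return packageName.isEmpty() ? simpleName : packageName + "." + simpleName;
    }

    /**
     * Returns true if the value looks like a dot-separated Java class name
     * and does not carry a .jar suffix.
     */
    static boolean looksValid(String value) {
        if (value == null || value.endsWith(".jar")) {
            return false;
        }
        // The -1 limit keeps trailing empty segments, so "a.b." is rejected.
        for (String part : value.split("\\.", -1)) {
            if (part.isEmpty() || !Character.isJavaIdentifierStart(part.charAt(0))) {
                return false;
            }
            for (int i = 1; i < part.length(); i++) {
                if (!Character.isJavaIdentifierPart(part.charAt(i))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String name = qualifiedName("com.aliyun.cdh.examples.udf", "UDAFExample");
        System.out.println(name + " valid: " + looksValid(name));
    }
}
```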
Use a function
After publishing a function, reference it in a data development node or SQL query:
In a development node: In the node editor, click Resource Management in the left-side navigation pane. Find the function, right-click it, and select Reference Function. The function name is inserted into the editor automatically, for example: example_function().
In a SQL query: Call the function directly by name:
SELECT example_function(column_name) FROM table_name;
Manage resources and functions
After uploading resources or creating functions, manage them from the Resource Management page by clicking the target resource or function.
View version history: Click the version button on the resource or function editing page to view and compare saved versions. Select at least two versions for comparison.
Delete a resource or function: Right-click the target resource or function and select Delete. The deletion takes effect in the production environment only after you publish the deletion operation to it.