DataWorks: CDH resources and functions

Last Updated: Mar 26, 2026

Data Studio supports uploading CDH Jar and File resources to a CDH cluster and registering them as custom functions (UDFs) for use in Flink jobs and Apache Hive SQL queries. This topic describes how to create, use, and manage CDH resources and functions through Resource Management.

Prerequisites

Before you begin, ensure that you have:

Access Resource Management

  1. Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a region. Find the target workspace and choose Shortcuts > Data Studio in the Actions column.

  2. In the left-side navigation pane, click the Resource Management button to open the Resource Management page.

  3. Click the create button to create a resource or function. Optionally, click Create Folder first to organize your directory, then right-click the folder and select Create to choose the type.

Create and use resources

Resource types

DataWorks supports uploading the following resource types to a CDH cluster. Use them to develop Flink jobs or create custom functions.

Resource type | Description | Local upload | OSS upload
CDH Jar | A compiled JAR package for running Java programs. File extension: .jar. | Supported | Supported
CDH File | Any file type. Actual usage depends on engine support.

Limits

  • Size: Maximum resource size is 500 MB.

  • Publishing: In a workspace in standard mode, publish the resource to make it available in the production environment. Data source configurations may differ between the development and production environments, so confirm that the correct data source is selected before operating in each environment.

  • Management scope: DataWorks only supports viewing and managing resources that were uploaded through DataWorks.
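
The size and extension limits can be checked locally before you upload. The following is a minimal sketch and not part of DataWorks: the helper name check_resource_file is hypothetical, and the code assumes that 500 MB means 500 × 1024 × 1024 bytes, which the limit above does not specify.

```python
import os

# Assumption: "500 MB" is interpreted as 500 * 1024 * 1024 bytes.
MAX_RESOURCE_BYTES = 500 * 1024 * 1024


def check_resource_file(path, resource_type):
    """Return a list of problems found for a local file before upload.

    resource_type is "CDH Jar" or "CDH File", the two types listed above.
    An empty list means this local check found nothing wrong.
    """
    problems = []
    if resource_type not in ("CDH Jar", "CDH File"):
        problems.append(f"unknown resource type: {resource_type}")
    # A CDH Jar resource is expected to carry the .jar file extension.
    if resource_type == "CDH Jar" and not path.lower().endswith(".jar"):
        problems.append("a CDH Jar resource must have the .jar extension")
    size = os.path.getsize(path)
    if size > MAX_RESOURCE_BYTES:
        problems.append(
            f"file is {size} bytes, above the {MAX_RESOURCE_BYTES}-byte limit"
        )
    return problems
```

An empty result only means the file passed these local checks. The upload itself can still fail for other reasons, such as missing HDFS write permission.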

Create a resource

  1. On the Resource Management page, click the create button to open the Create Resource And Function dialog box. Configure the Type, Path, and Name.

  2. Upload a local file as the resource source. Configure the following parameters:

    Parameter | Description
    Storage Path | The HDFS path where the file is stored on the CDH cluster. Default: /user/admin/lib. If Kerberos authentication is enabled, grant the current user write permission to this directory before uploading.
    Data Source | Select the CDH data source to upload to.
    Resource Group | Select a Serverless resource group with connectivity to the CDH cluster.
  3. Click Save and then Publish in the toolbar. Only published resources are available for use in data development.

Use a resource

After creating a resource, reference it in a data development node:

  1. In the node editor, click Resource Management in the left-side navigation pane.

  2. Find the target resource, right-click it, and select Reference Resource.

After referencing, the resource appears as a code snippet in the following format:

##@resource_reference{"resource name"}

For example, in a CDH Hive node:

##@resource_reference{"example"}
Note: The display format varies by node type. Check the DataWorks console to confirm the format for your node type.
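
If you generate node code with a script, the reference line can be built from the resource name. This is a minimal sketch covering only the format shown above. The helper name resource_reference is hypothetical, and rejecting names that contain a double quote or a line break is an assumption made to keep the output unambiguous, not a documented DataWorks rule.

```python
def resource_reference(resource_name):
    """Build a reference line in the ##@resource_reference{"..."} format."""
    # Assumption: names with quotes or line breaks are rejected, because
    # the source does not describe how such characters would be escaped.
    if not resource_name or '"' in resource_name or "\n" in resource_name:
        raise ValueError(f"unsupported resource name: {resource_name!r}")
    return '##@resource_reference{"' + resource_name + '"}'


print(resource_reference("example"))
# prints: ##@resource_reference{"example"}
```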

Alternatively, register resources as functions and use them in development nodes or SQL queries. See Create and use functions.

Create and use functions

When to use custom functions

Use a custom function (User-Defined Function, or UDF) when Apache Hive built-in functions cannot express the logic you need, for example custom data encryption, hashing, JSON parsing, or domain-specific string transformations. For standard aggregations, filters, and string operations, use built-in Hive functions instead.

Function types

CDH functions support the following types. Select the type that best describes what your UDF does:

Function type | Description | Example use case
MATH | Mathematical operations on numeric values | Custom rounding, unit conversion
AGGREGATE | Aggregates multiple rows into a single result (equivalent to a UDAF) | Custom sum with business rules
STRING | String processing and transformation | Custom masking, transliteration
DATE | Date and time operations | Custom calendar conversions
ANALYTIC | Window functions that operate over a partition of rows | Custom running totals
OTHER | Functions that do not fit the above categories

Create a function

Before creating a function, make sure:

  • The CDH engine is registered as a computing resource in DataWorks.

  • The CDH Jar or CDH File resource to be used is already uploaded.

To create a function:

  1. On the Resource Management page, click the create button to open the Create Resource And Function dialog box. Select the function Type, configure the Path, and enter the function Name.

  2. Click OK. Configure the function parameters:

    Parameter | Description
    Function Type | Select the type that matches your UDF: MATH, AGGREGATE, STRING, DATE, ANALYTIC, or OTHER.
    Data Source | Select the CDH data source from the dropdown.
    Class Name | The class name of the UDF, in the format resource_name.class_name. See Determine the class name below.
    Resource List | Select the CDH Jar or CDH File resource to use. CDH functions only support visual mode.
    Command Format | An example showing how to call the UDF.
  3. Click Save and then Publish in the toolbar. Only published functions are available in data development.

Determine the class name

You can create custom functions from both CDH Jar and CDH File resources. The Class Name format depends on the resource type:

  • CDH Jar resource: Use the fully qualified Java class name in the format Java_package_name.actual_class_name. Do not include the .jar suffix.

    Example: If your Maven project has:

    • Package: com.aliyun.cdh.examples.udf

    • Class: UDAFExample

    Then Class Name is: com.aliyun.cdh.examples.udf.UDAFExample

    To find this value in IntelliJ IDEA, right-click the class and select Copy Reference.

  • CDH File resource: Prefix the class name with the file resource name, in the format file_resource_name.class_name.
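
The two naming rules above can be sketched as small helpers. This is an illustration only: the function names jar_class_name and file_class_name are hypothetical, the identifier check covers ASCII names only, and treating any value that ends in .jar as a mistake is a heuristic rather than a DataWorks rule.

```python
import re

# One Java identifier segment: a letter, underscore, or dollar sign,
# followed by letters, digits, underscores, or dollar signs (ASCII only).
_SEGMENT = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jar_class_name(package, class_name):
    """Join a Java package and class into a fully qualified class name."""
    qualified = f"{package}.{class_name}" if package else class_name
    # Heuristic: a trailing ".jar" usually means the file name was pasted.
    if qualified.endswith(".jar"):
        raise ValueError("do not include the .jar suffix in the class name")
    for segment in qualified.split("."):
        if not _SEGMENT.fullmatch(segment):
            raise ValueError(f"invalid segment in class name: {segment!r}")
    return qualified


def file_class_name(file_resource_name, class_name):
    """Build the file_resource_name.class_name form used for CDH File."""
    return f"{file_resource_name}.{class_name}"


print(jar_class_name("com.aliyun.cdh.examples.udf", "UDAFExample"))
# prints: com.aliyun.cdh.examples.udf.UDAFExample
```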

Use a function

After publishing a function, reference it in a data development node or SQL query:

  • In a development node: In the node editor, click Resource Management in the left-side navigation pane. Find the function, right-click it, and select Reference Function. The function name is inserted into the editor automatically, for example: example_function().

  • In a SQL query: Call the function directly by name:

    SELECT example_function(column_name) FROM table_name;

Manage resources and functions

After uploading resources or creating functions, manage them from the Resource Management page by clicking the target resource or function.

  • View version history: Click the version button on the resource or function editing page to view and compare saved versions. Select at least two versions for comparison.

  • Delete a resource or function: Right-click the target resource or function and select Delete. To remove it from the production environment as well, publish the deletion operation to the production environment.