Dataphin: Create a User-Defined Function

Last Updated: Aug 13, 2025

Offline computing functions are the SQL functions that you use when developing code for offline computing tasks. They include the common functions that the compute engine supports by default and user-defined functions. Default functions cannot be edited. This topic describes how to create a user-defined function.

Prerequisites

The resource that the function references has been created. For more information, see upload resources and references.

Background information

  • Dataphin organizes functions into directories based on their types to help you better manage functions.

  • Different compute engines support different types of functions.

    Compute Engine Type                     Supported Functions
    --------------------------------------  ---------------------------------------------------
    Offline Engine    MaxCompute            MAXC functions
                      Hologres              Custom functions are not supported
                      Hadoop                Hadoop functions (Hive functions), Impala functions
                      TDH Inceptor          Custom functions are not supported
                      ADB for PostgreSQL    ADB functions
                      SelectDB              Custom functions are not supported
                      Doris                 Custom functions are not supported
    Real-time Engine  Alibaba Blink         FLINK functions
                      Ververica Flink       FLINK functions
                      Open-source Flink     FLINK functions

  • External MaxCompute projects do not support creating custom functions.

Procedure

  1. In the top menu bar of the Dataphin homepage, choose Develop > Data Development.

  2. In the top menu bar, select a Project. In Dev-Prod mode, also select an Environment.

  3. In the navigation pane on the left, choose Data Processing > Function.

  4. In the function list on the right, click the image icon and select the target function type.

  5. In the Create Function dialog box, configure the parameters.

    Parameter

    Description

    Name

    Enter a function name. The name can contain letters, numbers, and underscores (_), and must start with a letter.

    Note

    Within the same project, Impala functions and Hive functions cannot have the same name as any custom function in any directory of the project.

    Select Resource

    Select the resource file. The drop-down list provides resource names that match the current project.

    Note
    • Only JAR files are supported for function definition.

    • If you select multiple resources, they must all be of the same type.

    • If you don't have resources yet, you need to create them. For more information, see upload resources and references.

    Programming Language

    Impala supports functions defined in C++ and Java. To define an Impala function, select the corresponding programming language based on your resource type.

    Class Name

    Enter the class name. This is the name of the class in the resource that implements the function, including its package, such as test_udf.UDFGETSrcId.

    Type

    Select the type. The drop-down list includes Window, Statistics, Numeric, String, Time, IP Address Related Functions, URL, Encoding, Business, and Others.

    Register Function

    If you are defining an Impala function and the programming language of the resource is C++, enter the statement that creates the Impala function. The statement must follow the syntax below. The backend handles the LOCATION clause by substituting the resource file.

    • Create C++ scalar function

      CREATE FUNCTION [IF NOT EXISTS] [db_name.]function_name([arg_type[, arg_type...]])
        RETURNS return_type
        SYMBOL='symbol_name'

    • Create C++ aggregate function

      CREATE [AGGREGATE] FUNCTION [IF NOT EXISTS] [db_name.]function_name([arg_type[, arg_type...]])
        RETURNS return_type
        [INTERMEDIATE type_spec]
        [INIT_FN='function']
        UPDATE_FN='function'
        MERGE_FN='function'
        [PREPARE_FN='function']
        [CLOSEFN='function']
        [SERIALIZE_FN='function']
        [FINALIZE_FN='function']

    For more information, see User-Defined Functions (UDF).

    Syntax

    Enter the syntax, which is the format used to call the function, such as: bigint weekday(datetime date).

    Usage Documentation

    Enter the function usage description, for example:

    select
      get_week_date("20170810",0,2) --Query the date of Tuesday in the week of August 10.
    from cndata.dual

    Select Directory

    By default, the system selects the directory of the current function type. You can change the directory only to a subdirectory under that function type directory.

    For example, if you are creating a MAXC function, the system automatically selects the MAXC function directory, and you can change it only to a subdirectory under the MAXC function directory.
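    The naming rule for the Name parameter (letters, numbers, and underscores, starting with a letter) can be checked before you open the dialog box. The following is a minimal sketch, not part of Dataphin: the class FunctionNameCheck and its method are hypothetical, and it assumes that "letters" means ASCII letters, which this topic does not state.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: checks a candidate function name against the rule stated for the Name parameter. */
final class FunctionNameCheck {
    // A letter first, then any number of letters, digits, or underscores.
    // Assumption: "letters" means ASCII A-Z and a-z.
    private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    static boolean isValid(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }
}
```

    Under these assumptions, get_week_date passes the check, while _tmp and 2fast fail because they do not start with a letter.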

  6. After completing the configuration, click Submit. In the dialog box that appears, enter Submission Notes, and then click Confirm And Submit.

    Note
    • If the resources referenced by the custom function are updated, you need to resubmit the custom function so that the custom function registered with the compute engine is updated.

    • After a successful submission, tasks that reference the function automatically use the new version, which may cause those tasks to become unavailable. Check them promptly.

    To verify that the function behaves as expected, use Ad Hoc Query (see query and download data) to write SQL code that references the function. The following is an example SQL query:

    select
      get_week_date("20170810",0,2) --Query the date of Tuesday in the week of August 10.
    from cndata.dual
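    This topic does not show the implementation of get_week_date. The following is a minimal sketch of the kind of Java class that the JAR resource might contain. It rests on assumptions that the topic does not confirm: the second argument is a week offset, the third is the day of the week with 1 = Monday, and the result uses the same yyyyMMdd format as the input. The class name GetWeekDate is hypothetical. A real UDF would also extend the base class required by the compute engine (for example, org.apache.hadoop.hive.ql.exec.UDF for Hive), which is omitted so that the sketch compiles with the JDK alone.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch of the logic behind get_week_date; not the actual implementation. */
final class GetWeekDate {
    // BASIC_ISO_DATE reads and writes dates as yyyyMMdd, such as 20170810.
    private static final DateTimeFormatter FMT = DateTimeFormatter.BASIC_ISO_DATE;

    /**
     * @param date       a date formatted as yyyyMMdd
     * @param weekOffset number of weeks to shift (assumed meaning of the second argument)
     * @param dayOfWeek  1 = Monday through 7 = Sunday (assumed meaning of the third argument)
     * @return the requested day in the shifted week, formatted as yyyyMMdd
     */
    public String evaluate(String date, int weekOffset, int dayOfWeek) {
        LocalDate shifted = LocalDate.parse(date, FMT).plusWeeks(weekOffset);
        // Moves to the given day within the same Monday-to-Sunday week.
        return shifted.with(DayOfWeek.of(dayOfWeek)).format(FMT);
    }
}
```

    Under these assumptions, evaluate("20170810", 0, 2) returns 20170808, because August 10, 2017 is a Thursday and the Tuesday of that week is August 8. If the class were placed in a package, the Class Name parameter would take the package-qualified name.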

What to do next

  • If the project mode is Dev-Prod, you need to publish the resource to the production environment. For more information, see manage publishing tasks.

  • If your development mode is Basic mode, you can use the custom function for computing task development after successful submission. For more information, see data development overview.