All Products
Search
Document Center

Hologres:User-defined functions (Remote UDFs)

Last Updated:Feb 04, 2026

Hologres V3.1 and later supports remote calls to user-defined functions (UDFs) in Function Compute (FC). You can use these functions to process complex business logic or perform advanced data operations. This topic describes how to use UDFs in Hologres.

Function overview

Remote functions allow you to extend data processing and analysis capabilities by integrating external functions into Hologres. In Hologres V3.1, remote functions support calls to the Alibaba Cloud Function Compute (FC) service. This feature lets you dynamically call FC to process complex business logic or perform advanced data operations while you query Hologres data.

Remote functions support the following three scenarios:

  • Real-time data processing: Call external functions in real time during data queries to perform data cleaning, format conversion, or complex calculations.

  • Third-party service integration: Use Function Compute to interact with other Alibaba Cloud services or third-party APIs using Hologres data.

  • Advanced analytics: Use Function Compute to implement advanced analytics algorithms, such as model inference or machine learning, and write the results directly to Hologres.

Prerequisites

  • You have a Hologres instance that is V3.1 or later. For more information, see Purchase a Hologres instance.

  • You have activated Function Compute, and your FC instance is V3.0. To do this, log on to the Function Compute console.

  • You have granted the AliyunServiceRoleForHologresRemoteUDF service-linked role.

    1. Log on to the Hologres console. In the navigation pane on the left, click Create a Service-linked Role.

    2. Select AliyunServiceRoleForHologresRemoteUDF and click Authorize Now. The following figure shows an example.

      yuyuyuuyueeeee

Note
  • The Hologres instance and the Function Compute service must be in the same region.

  • You must activate the Alibaba Cloud Function Compute service, and then develop and deploy the function that Hologres calls. For more information, see Code development.

Precautions

  • Only scalar user-defined functions (UDFs) and user-defined table-valued functions (UDTFs) are supported. User-defined aggregate functions (UDAFs) are not supported.

  • Only the following data types and their corresponding array types are supported: BOOLEAN, INTEGER, BIGINT, REAL, DOUBLE PRECISION, and TEXT.

  • Remote functions do not support constants as input parameters.

  • This feature incurs fees in Function Compute (FC). For more information, see Billing.

  • This feature does not incur extra fees in Hologres.

Manage user-defined functions

Create an extension

Before you can use this feature for the first time, you must create an extension. The command is as follows:

CREATE EXTENSION [ IF NOT EXISTS ] function_compute;

Create a function

Syntax

CREATE [ OR REPLACE ] FUNCTION <function_name>
  ( [ [ argmode ] [ argname ] <argtype> [ { DEFAULT | = } default_expr ] [, ...] ] )
    [ RETURNS [SETOF] rettype | RETURNS TABLE ( column_name column_type) ]
  LANGUAGE function_compute
  AS '<fc_endpoint>/<func_name>'
  {
    { CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT }
    | SET function_compute.qualifier TO <qualifier>
    | SET function_compute.compression TO <compression>
    | SET function_compute.max_batch_size TO <max_batch_size>
  } ...

Parameters

Parameter category

Parameter

Required

Description

Function name

function_name

Yes

The name of the function.

  • The function name must be unique within the same schema and must not conflict with existing functions that have the same parameter types. However, you can overload functions with the same name if they have different parameter types, such as rf_sum(int) and rf_sum(float).

  • To avoid conflicts, add the rf_ prefix to all remote functions.

  • If you specify a schema, such as schema1.func, the function is created in that schema.

  • When you use CREATE OR REPLACE FUNCTION to replace a function, you cannot change the function name, parameter types, or return type. To make these changes, you must delete and recreate the function.

Function parameters

argtype

Yes

The data type of the parameter.

argmode

No

The mode can be IN, OUT, or INOUT. The default is IN. To avoid ambiguity, do not use INOUT. A function can have only one OUT or INOUT parameter.

argname

No

The name of the parameter.

  • For input parameters, the name is for documentation purposes only.

  • For output parameters, the name determines the column name in the result set. If omitted, the system generates a default name.

default_expr

No

The default value. This applies only to input parameters. If a parameter has a default value, all subsequent parameters must also have default values.

FC Endpoint

LANGUAGE function_compute

Yes

A fixed value that declares the use of the Function Compute service.

fc_endpoint

Yes

The internal network endpoint of FC. For more information, see Service endpoints.

func_name

Yes

The name of the target function. You must create this function in the FC console beforehand.

Return type

rettype

No

The return value type.

If the function has an OUT or INOUT parameter, you can omit the RETURNS clause. If the clause is present, it must match the result type implied by the output parameters.

column_name

No

The name of an output column in the RETURNS TABLE syntax.

This is another way to declare a named OUT parameter. The difference is that RETURNS TABLE also implies RETURNS SETOF, which means the function returns a set.

column_type

No

The data type of an output column in the RETURNS TABLE syntax.

Null-handling policy

CALLED ON NULL INPUT

No

The default behavior. The function is called even if some of its arguments are null. The function must handle null values.

RETURNS NULL ON NULL INPUT

No

The function automatically returns null if any of its arguments are null.

STRICT

No

An alias for RETURNS NULL ON NULL INPUT.

Function versioning

qualifier

No

Specifies the function version or alias to call. The default value is LATEST.

Request compression algorithm

compression

No

Specifies the compression algorithm for the request and response when calling the function. Valid values:

  • None: Disables compression. This is the default.

  • GZIP: Enables GZIP compression.

Maximum batch size per request

max_batch_size

No

Specifies the maximum number of rows to send to FC in each batch.

The main purpose is to set an upper limit on the number of rows for batch processing for FC functions that have memory limits or other constraints. If you do not specify this parameter, Hologres automatically calculates and uses the optimal batch size. Manual intervention is usually not required.

Examples

  • Create a scalar UDF.

    CREATE OR REPLACE FUNCTION rf_add (  
      a INTEGER,  
      b INTEGER DEFAULT 1 
    )  
    RETURNS BIGINT  
    LANGUAGE function_compute  
    AS 'xxxxxxxxxxxxx.cn-shanghai-internal.fc.aliyuncs.com/add'  
    ;
  • Create a UDTF.

    -- Form 1
    CREATE OR REPLACE FUNCTION rf_unnest(TEXT [])
    RETURNS TABLE (item TEXT )
    LANGUAGE function_compute
    AS 'xxxxxxxxxxxxxx.cn-hangzhou-internal.fc.aliyuncs.com/unnest'
    STRICT
    SET function_compute.max_batch_size TO 1024;
    
    -- Form 2
    CREATE OR REPLACE FUNCTION rf_unnest(TEXT []) 
    RETURNS SETOF TEXT 
    LANGUAGE function_compute 
    AS 'xxxxxxxxxxxxxx.cn-hangzhou-internal.fc.aliyuncs.com/unnest' 
    STRICT
    SET function_compute.max_batch_size TO 1024;

View a function

You can view the remote functions that you created.

SELECT
  CASE
    WHEN p.proretset = 'f' THEN 'scalar UDF'
    WHEN p.proretset = 't' THEN 'UDTF'
  END AS function_type,
  n.nspname AS schema_name,
  p.proname AS function_name,
  pg_get_function_arguments(p.oid) AS arguments,
  pg_get_function_result(p.oid) AS return_type,
  p.proisstrict AS is_strict,
  p.proconfig AS config,
  pg_get_functiondef(p.oid) AS definition
FROM
  pg_proc p
  JOIN pg_language l ON p.prolang = l.oid
  JOIN pg_namespace n ON p.pronamespace = n.oid
WHERE
  l.lanname = 'function_compute'
  AND p.prokind != 'p'
ORDER BY
  function_type,
  schema_name,
  function_name;

Delete a function

Syntax

 DROP FUNCTION [ IF EXISTS ] <function_name> [ ( [ [ argmode ] [ argname ] <argtype> [, ...] ] ) ] [, ...]

Parameters

Parameter

Required

Description

function_name

Yes

The name of the function to delete.

argtype

Yes

The data type of the parameter.

argmode

No

The mode can be IN, OUT, or INOUT. The default is IN. To avoid ambiguity, do not use INOUT. A function can have only one OUT or INOUT parameter.

argname

No

The name of the parameter. For input parameters, the name is for documentation purposes only. For output parameters, the name determines the column name in the result set. If omitted, the system generates a default name.

Example

DROP FUNCTION rf_add(INTEGER, INTEGER);

Data interaction format

Request format from Hologres to FC

Hologres calls the FC InvokeFunction API with a POST request. The request body is a JSON object containing a data key, which holds a two-dimensional array. Each element in the array represents a row of data from the batch and contains the parameters for the function call.

Data serialization rules

  • BOOLEAN is serialized to the JSON boolean type.

  • INTEGER and BIGINT are serialized to the JSON number type.

  • REAL and DOUBLE PRECISION are serialized to the JSON number type.

  • TEXT is serialized to the JSON string type.

  • NULL is serialized to JSON null.

Example

The following code provides an example of a serialized request for a remote function with the signature rf_demo(TEXT, INTEGER, BOOLEAN).

{  
  "data": [  
    ["foo", 100, true],
    [null, null, false],
    ["bar", 200, false]
  ]
}  

Response format from FC to Hologres

After FC finishes processing a batch of data, it must return the results to Hologres in a JSON format that adheres to the following specifications:

  • Scalar UDFs

    • The top-level object must contain a results field. The value of this field is an array, and each element corresponds to the result for one row of input data.

    • Each result must be an array. The results must be in the same order as the input data. The Nth element in the results array corresponds to the Nth row of the input data.

    The following code provides an example of the return value for a batch of four rows:

    {
      "results": [
        ["Beijing"],
        ["Shanghai"],
        ["Shenzhen"],
        ["Guangzhou"]
      ]
    }
  • UDTFs: A UDTF can generate multiple output rows from a single input row. The row number (row_num) is used to associate the output with the original input row.

    • The top-level object must contain a results field. The value of this field is an array, and each element corresponds to one row of output data.

    • Each element in the results array must be an array that contains two elements:

      • row_num (first element): The 0-based index of the original input row. This number is used to associate the output with the input. The row_num values must be returned in ascending order.

      • result (second element): A row of the processed return value.

    Here is an example:

    {
      "results": [
        [0, "Beijing"],
        [1, "Shanghai"],
        [3, "Shenzhen"],
        [3, "Guangzhou"],
      ]
    }

Example of a user-defined function

This example uses the unnest function.

  1. Activate Function Compute.

    Log on to the Function Compute console. You can also claim a free resource plan with a specific quota based on the on-screen instructions. For more information, see Trial quotas.

  2. Create an FC event-triggered function.

    1. In the navigation pane on the left, click Functions. Switch to the region where your Hologres instance resides.

    2. On the Functions page, click Create Function. You are then redirected to the Create Function page.

    3. Select Event Function and configure the following parameters. You can keep the default values for the other parameters. For more information, see Create an event-triggered function.

      Parameter

      Description

      Function Name

      Enter a custom name. For example, unnest.

      Runtime

      Select Built-in Runtimes / Python / Python 3.10.

      Code Upload Method

      Select Upload Code via ZIP Package.

      Code Package

      Upload the code package that you have written and packaged.

      Save the following code to a file named unnest.py and compress it into unnest.zip.

      import json
      
      def unnest(event, context):
          evt = json.loads(event)
          data = evt.get('data', None)
          if data is None:
              raise ValueError('no "data" key in event.')
          if not isinstance(data, list):
              raise ValueError('data is not a list.')
      
          res = list()
          for i in range(len(data)):
              if len(data[i]) != 1 or not isinstance(data[i], list):
                  raise ValueError('the item in data is not a list.')
              for item in data[i][0]:
                  res.append([i, item])
      
          return json.dumps({'results': res})

      Handler

      Enter unnest.unnest.

  3. Create a Hologres remote function.

    Note

    You can complete the following steps on the HoloWeb platform. For more information, see Connect to HoloWeb.

    CREATE EXTENSION IF NOT EXISTS function_compute;
    
    CREATE OR REPLACE FUNCTION rf_unnest(INTEGER [])
      RETURNS SETOF INTEGER
    STRICT
    LANGUAGE function_compute
    AS 'xxxxxxxxxxxxxxxxx.cn-shanghai-internal.fc.aliyuncs.com/unnest';
  4. Prepare test data.

    CREATE TABLE test_array (
        numbers INTEGER[]
    );
    
    INSERT INTO test_array (numbers) VALUES
      (ARRAY[1, 3]),
      (ARRAY[2, 4]),
      ('{}'),  
      (ARRAY[]::INTEGER[]),  
      (NULL);
  5. Call the remote function.

    SELECT numbers, rf_unnest(numbers) FROM test_array;
    
     numbers | rf_unnest
    ---------+-----------
     {2,4}   |         2
     {2,4}   |         4
     {1,3}   |         1
     {1,3}   |         3
    (4 rows)