Realtime Compute for Apache Flink: Python user-defined scalar functions (UDSFs)

Last Updated: Mar 26, 2026

A user-defined scalar function (UDSF) maps zero, one, or more scalar values to a single scalar value. Each input row produces exactly one output value.

This topic describes how to create, register, and use a Python UDSF in Realtime Compute for Apache Flink.

Limits

The following constraints apply when you develop Python user-defined functions (UDFs) in Realtime Compute for Apache Flink:

Constraint | Requirement
Apache Flink version | 1.12 and later
Python version | Pre-installed on every workspace. VVR earlier than 8.0.11: Python 3.7.9. VVR 8.0.11 and later: Python 3.9.21.
JDK version | JDK 8 and JDK 11. Third-party JAR packages must be compatible with JDK 8 or JDK 11.
Scala version | Open-source Scala 2.11 only. Third-party JAR packages must be compatible with Scala 2.11.

Important

After upgrading to VVR 8.0.11 or later, test, deploy, and run your existing PyFlink drafts again to confirm compatibility.
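
Before retesting drafts, it can help to confirm that your local interpreter is on the same minor Python version as the target VVR release. The following is a minimal sketch based only on the version mapping in the table above. The helper names (parse_version, expected_python_for_vvr, local_python_matches) are hypothetical, and the sketch assumes the VVR version is passed as a plain dotted string such as "8.0.11".

```python
import sys
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '8.0.11' into a comparable tuple (8, 0, 11)."""
    return tuple(int(part) for part in text.split("."))


def expected_python_for_vvr(vvr_version: str) -> str:
    """Return the pre-installed Python version listed in the Limits table."""
    # Assumption: vvr_version is a plain dotted string with no prefix or suffix.
    if parse_version(vvr_version) >= parse_version("8.0.11"):
        return "3.9.21"
    return "3.7.9"


def local_python_matches(vvr_version: str) -> bool:
    """Compare only major.minor of the local interpreter with the expected version."""
    expected = parse_version(expected_python_for_vvr(vvr_version))
    return tuple(sys.version_info[:2]) == expected[:2]
```

For example, local_python_matches("8.0.11") is true only when the local interpreter is Python 3.9.x. Comparing major.minor rather than the full patch version is a deliberate simplification.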

Create a UDSF

The following steps use Windows as the example environment. Flink provides a sample repository that includes implementations for UDSFs, user-defined aggregate functions (UDAFs), and user-defined table-valued functions (UDTFs).
  1. Download and decompress python_demo-master to your local machine.

    This is a third-party GitHub repository. Access may be slow or intermittent.
  2. In PyCharm, choose File > Open and open the decompressed python_demo-master directory.

  3. Open udfs.py in the \python_demo-master\udx path and define your UDSF.

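    from pyflink.table import DataTypes
    from pyflink.table.udf import udf
    
    # Scalar UDF: returns the slice s[begin:end] as a STRING.
    @udf(result_type=DataTypes.STRING())
    def sub_string(s: str, begin: int, end: int):
        # begin is zero-based and inclusive; end is exclusive.
        return s[begin:end]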

    The sub_string example returns the characters of the input string from index begin (zero-based, inclusive) up to, but not including, index end.

  4. From the \python_demo-master directory, run the following command to package the udx directory:

    zip -r python_demo.zip udx

    When python_demo.zip appears in \python_demo-master\, the package is ready.
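
If the zip command is not available on your machine, which is common on a default Windows installation, the same package can be built with the Python standard library. The following is a minimal sketch, not part of the sample repository. The function name package_directory is hypothetical, and the sketch assumes the archive should contain entries under a top-level udx/ folder, as zip -r python_demo.zip udx produces.

```python
import zipfile
from pathlib import Path
from typing import List


def package_directory(source_dir: str, archive_path: str) -> List[str]:
    """Zip source_dir so that entries keep the folder name, for example udx/udfs.py."""
    source = Path(source_dir).resolve()
    archive = Path(archive_path).resolve()
    written = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            # Only files are added; skip the archive itself if it sits inside source.
            if path.is_file() and path != archive:
                # Make entry names relative to the parent so they start with "udx/".
                arcname = path.relative_to(source.parent).as_posix()
                zf.write(path, arcname)
                written.append(arcname)
    return written
```

Calling package_directory("udx", "python_demo.zip") from the python_demo-master directory writes python_demo.zip next to the udx folder and returns the entry names it stored. Unlike zip -r, this sketch stores file entries only and does not add separate directory entries.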

Register a UDSF

After creating the package, register the UDSF in the Realtime Compute for Apache Flink console. For registration steps, see Manage user-defined functions (UDFs).

Use a UDSF

After registering the UDSF, use it in a Flink SQL job.

  1. Create a draft using Flink SQL. For details, see Job development overview. The following example calls ASI_UDSF (the registered name of your UDSF) with begin = 2 and end = 4 on the a field of the source table. With the sub_string implementation shown earlier, each call returns the characters at zero-based indexes 2 and 3:

    CREATE TEMPORARY TABLE ASI_UDSF_Source (
      a VARCHAR,
      b INT,
      c INT
    ) WITH (
      'connector' = 'datagen'
    );
    
    CREATE TEMPORARY TABLE ASI_UDSF_Sink (
      a VARCHAR
    ) WITH (
      'connector' = 'blackhole'
    );
    
    INSERT INTO ASI_UDSF_Sink
    SELECT ASI_UDSF(a, 2, 4)
    FROM ASI_UDSF_Source;
  2. In the left-side navigation pane of the development console, choose O&M > Deployments. Find the deployment, then click Start in the Actions column. After the deployment starts, the result of ASI_UDSF(a, 2, 4) for each row of ASI_UDSF_Source is written to ASI_UDSF_Sink. With the sub_string implementation, that result is the characters at zero-based indexes 2 and 3 of the a field.
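
To see what a call such as ASI_UDSF(a, 2, 4) returns for a given value, the slice logic can be tried locally without PyFlink. The following is a minimal sketch: sub_string_plain is a hypothetical stand-in that repeats the body of sub_string without the @udf decorator. The None guard is an addition that is not in the original function, based on the assumption that a SQL NULL reaches a Python UDF as None.

```python
from typing import Optional


def sub_string_plain(s: Optional[str], begin: int, end: int) -> Optional[str]:
    """Plain-Python stand-in for the UDSF body: same slice, no PyFlink decorator."""
    if s is None:
        # Assumption: a NULL input should yield a NULL output instead of an error.
        return None
    # begin is zero-based and inclusive; end is exclusive.
    return s[begin:end]


print(sub_string_plain("abcdef", 2, 4))  # cd
print(sub_string_plain("abc", 2, 4))     # c
print(repr(sub_string_plain("ab", 2, 4)))  # ''
```

Python slicing does not raise an error when end exceeds the string length, so short inputs yield a shorter or empty string rather than a failed record.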