All Products
Search
Document Center

Realtime Compute for Apache Flink:Manage UDFs

Last Updated:Mar 18, 2024

You must register a user-defined function (UDF) before you can use the UDF in an SQL deployment. This topic describes how to register, update, and delete a UDF.

Precautions

  • If you want to use a RAM user or a RAM role to perform operations, such as UDF management, in a namespace, you must make sure that the RAM user or RAM role is granted the access permissions on the namespace. For more information, see Authorize an account to perform operations in a namespace.

  • To prevent conflicts between JAR file dependencies, take note of the following items when you develop UDFs:

    • Make sure that the Flink version that you select on the SQL Editor page is the same as the Flink version that is specified in the POM dependency.

    • Specify <scope>provided</scope> for Flink-related dependencies.

    • Use the Shade plug-in to package other third-party dependencies. For more information, see Apache Maven Shade plug-in.

    Note

    For more information about how to handle Flink dependency conflicts between JAR files, see How do I troubleshoot dependency conflicts of Flink?

Register a UDF

Register a catalog UDF

  1. Go to the Register UDF Artifact dialog box.

    1. Log on to the Realtime Compute for Apache Flink console.

    2. On the Fully Managed Flink tab, find the workspace that you want to manage and click Console in the Actions column.

    3. In the left-side navigation pane, click SQL Editor.

    4. On the left side of the SQL Editor page, click the UDFs tab.

  2. Click Register UDF Artifact.

  3. In the Register UDF Artifact dialog box, upload a UDF JAR file.注册UDF

    You can use one of the following methods to upload a UDF JAR file:

    • Upload a file: Click Click to select next to Select a file to upload the UDF JAR file that you want to upload. If you want to upload a dependency file, click Click to select next to Dependencies to upload the file that your UDF JAR file depends on.

      Note
      • Your UDF JAR file is uploaded and stored in the sql-artifacts directory of the Object Storage Service (OSS) bucket that you select.

      • The dependency of a Java UDF can be packaged into the UDF JAR file or separately uploaded as a dependency file. For a Python UDF, we recommend that you separately upload the dependency as a dependency file.

    • External URL: Enter an external URL. If the size of the UDF JAR file exceeds 200 MB or you want to use a UDF JAR file that is used by another service, you can use the external URL to obtain the UDF file.

      Note
      • If the size of the UDF JAR file exceeds 200 MB, you can save the UDF JAR file in the sql-artifacts/namespaces/{namespace} directory of the OSS bucket that is associated with the fully managed Flink workspace. Then, use the HTTPS path of the file as the external URL.

      • You can use the HTTP path of another service as the external URL if the service and the fully managed Flink workspace reside in the same virtual private cloud (VPC). You can also establish a network connection between the service and the fully managed Flink workspace over the Internet and then use the public endpoint of the service. For more information, see How does the fully managed Flink service access the Internet?

  4. Click Confirm.

  5. In the Available Functions section of the Manage Functions dialog box, select one or more UDFs that you want to register and click Create Functions.

    In the console of fully managed Flink, fully managed Flink parses the UDF JAR file and checks whether classes of the UDF, user-defined aggregate function (UDAF), and user-defined table-valued function (UDTF) interfaces of fully managed Flink are used in the file. Then, fully managed Flink automatically extracts the class names and fills the class names in the Function Name column in the Manage Functions dialog box. After UDFs are registered, you can view all UDFs that you registered in the UDFs pane on the left side of the SQL Editor page. The fx characters that are highlighted in yellow are displayed on the left side of the name of each UDF that you registered.

    image.png

    Note

    By default, the latest Flink version is used for data parsing when you register a catalog UDF. If you use a catalog UDF in a deployment and select an earlier Flink engine version in the Basic section of the Configuration tab, an incompatibility issue may occur. To resolve this issue, you can implement UDF code based on the engine version that is required by your deployment and use deployment-level UDFs. For more information, see Register a deployment-level UDF.

Register a deployment-level UDF

Important
  • Only Realtime Compute for Apache Flink that uses Ververica Runtime (VVR) 8.0.3 or later supports deployment-level Python UDFs.

  • If you use a deployment-level Python UDF, you can configure parameters such as python.files or python.archives in the Parameters section of the Configuration tab to specify the related dependency file.

  • If you use a deployment-level Python UDF, you cannot perform a syntax check on a draft. You must skip the syntax check before you deploy a draft as a deployment.

To register a deployment-level UDF, perform the following steps:

  1. Upload the JAR or Python file of the UDF.

    In the left-side navigation pane, click Artifacts. On the Artifacts page, click Upload Artifact to upload the JAR or Python file of the UDF.

  2. Specify a deployment-level UDF in your deployment.

    On the Configurations tab of the SQL Editor page, select the JAR or Python file of the UDF from Additional Dependencies.

  3. Register a deployment-level UDF.

    • Java UDF

      CREATE TEMPORARY FUNCTION yourfunctionname;
    • Python UDF

      CREATE TEMPORARY FUNCTION yourfunctionname LANGUAGE Python;

Update a UDF

If you add a UDF to a UDF JAR file or change the code of a registered UDF in the file, you can perform the following operations to update the UDF JAR file:

  1. Go to the Register UDF Artifact dialog box.

    1. Log on to the Realtime Compute for Apache Flink console.

    2. On the Fully Managed Flink tab, find the workspace that you want to manage and click Console in the Actions column.

    3. In the left-side navigation pane, click SQL Editor.

    4. On the left side of the SQL Editor page, click the UDFs tab.

  2. In the UDFs pane on the left side of the SQL Editor page, move the pointer over the name of the UDF whose JAR file you want to update and click the 更新JAR icon.

  3. In the Register UDF Artifact dialog box, upload a UDF JAR file.更新JAR

    Important
    • The UDF JAR file that you upload must contain all the classes of the registered UDFs.

    • The code in the new UDF JAR file takes effect only when you restart the draft for the deployment or publish a new draft. The code in the new UDF JAR file does not take effect on jobs that are running. The jobs that are running continue to use the original UDF JAR file.

  4. Click Update.

Delete a UDF

Note

Before you delete a UDF JAR file, make sure that the UDF that is registered by using the UDF JAR file is not referenced by a deployment or an SQL file.

If you no longer need a UDF JAR file, perform the following operations to delete the UDF JAR file:

  1. Go to the Register UDF Artifact dialog box.

    1. Log on to the Realtime Compute for Apache Flink console.

    2. On the Fully Managed Flink tab, find the workspace that you want to manage and click Console in the Actions column.

    3. In the left-side navigation pane, click SQL Editor.

    4. On the left side of the SQL Editor page, click the UDFs tab.

  2. In the UDFs pane on the left side of the SQL Editor page, move the pointer over the name of the UDF JAR file that you want to delete and click the 删除JAR icon.

  3. Select Delete associated files.

    If you want to delete the UDF JAR file, you must delete all registered UDFs from the file to avoid dirty data.

  4. Click Confirm.

References

  • For more information about the classification of Java UDFs, how to pass Java UDF parameters in the console of fully managed Flink, and how to develop and use Java UDFs, see Java and UDAFs.

  • For more information about the classification and dependency of Python UDFs and how to debug, tune, develop, and use Python UDFs, see Python and UDAFs.

  • For more information about how to use a UDAF to sort and aggregate data, see Use a UDAF to sort and aggregate data.