
Realtime Compute for Apache Flink:Get started with Paimon catalogs

Last Updated: Mar 09, 2026

This tutorial walks you through the core Apache Paimon operations in Realtime Compute for Apache Flink: creating a catalog, building a table, writing data, consuming changes in streaming mode, and cleaning up resources.

What you will learn:

| Step | Operation | Where |
| --- | --- | --- |
| 1 | Create a Paimon catalog backed by OSS | Scripts tab |
| 2 | Create a partitioned table with a primary key | Scripts tab |
| 3 | Write sample data through a streaming job | ETL > Drafts |
| 4 | Consume data in streaming mode | ETL > Drafts |
| 5 | Update existing rows and observe changes | ETL > Drafts |
| 6 | Cancel jobs and clean up resources | Scripts tab |

Note

Steps 1 and 2 use the Scripts tab for one-time DDL operations (catalog and table management). Steps 3 through 5 use streaming drafts under ETL > Drafts for data processing jobs that go through a deploy-and-start lifecycle.

Prerequisites

Before you begin, make sure you have:

  - A Realtime Compute for Apache Flink workspace.
  - An OSS bucket to serve as the warehouse directory, ideally in the same region as the workspace.

Step 1: Create an Apache Paimon catalog

A catalog connects Realtime Compute for Apache Flink to your OSS warehouse directory, where all Paimon databases and tables are stored.

  1. Log on to the Realtime Compute for Apache Flink console.

  2. Find the target workspace and click Console in the Actions column.

  3. In the left-side navigation pane, choose Development > Scripts. Create a script on the Scripts tab.

  4. Enter the following SQL in the script editor, replacing the placeholders with your actual values:

    Note

    To store the Apache Paimon table in OSS-HDFS, configure fs.oss.endpoint, fs.oss.accessKeyId, and fs.oss.accessKeySecret. Set the endpoint in the format cn-<region>.oss-dls.aliyuncs.com, for example cn-hangzhou.oss-dls.aliyuncs.com.
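    For example, a catalog definition that targets OSS-HDFS might look like the following sketch. The catalog name, region, and credential placeholders are illustrative, not values from this tutorial:

       CREATE CATALOG `my-dls-catalog` WITH (
         'type' = 'paimon',
         'metastore' = 'filesystem',
         'warehouse' = 'oss://<bucket>/<object>',
         -- OSS-HDFS endpoints use the oss-dls domain
         'fs.oss.endpoint' = 'cn-hangzhou.oss-dls.aliyuncs.com',
         'fs.oss.accessKeyId' = '<accessKeyId>',
         'fs.oss.accessKeySecret' = '<accessKeySecret>'
       );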

    | Parameter | Required | Description |
    | --- | --- | --- |
    | type | Yes | Catalog type. Set to paimon. |
    | metastore | Yes | Metadata storage type. This example uses filesystem. For other types, see Manage Apache Paimon catalogs. |
    | warehouse | Yes | OSS path for the data warehouse directory, in the format oss://<bucket>/<object>. Find the bucket name and object path in the OSS console. |
    | fs.oss.endpoint | Conditional | OSS endpoint. Required when the OSS bucket is in a different region from the workspace or belongs to a different Alibaba Cloud account. For endpoint values, see Regions and endpoints. |
    | fs.oss.accessKeyId | Conditional | AccessKey ID with read and write permissions on OSS. Required in the same scenarios as fs.oss.endpoint. For details, see Create an AccessKey pair. |
    | fs.oss.accessKeySecret | Conditional | AccessKey secret corresponding to the AccessKey ID above. Required in the same scenarios as fs.oss.endpoint. |

       CREATE CATALOG `my-catalog` WITH (
         'type' = 'paimon',
         'metastore' = 'filesystem',
         'warehouse' = '<warehouse>',
         'fs.oss.endpoint' = '<fs.oss.endpoint>',
         'fs.oss.accessKeyId' = '<fs.oss.accessKeyId>',
         'fs.oss.accessKeySecret' = '<fs.oss.accessKeySecret>'
       );
  5. Select the CREATE CATALOG statement and click Run on the left side of the script editor. Expected output:

       The following statement has been executed successfully!
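To confirm that the catalog is registered, you can optionally run the standard Flink SQL listing statement in the same script editor:

    SHOW CATALOGS;
    -- The result list should now include my-catalog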

Step 2: Create an Apache Paimon table

  1. In the same script editor on the Scripts tab, enter the following SQL to create a database named my_db and a table named my_tbl. The table uses a composite primary key (dt, id) and is partitioned by dt. The changelog-producer option is set to lookup, which enables downstream jobs to consume changes in streaming mode. For details, see Change data generation mechanism.

       CREATE DATABASE `my-catalog`.`my_db`;
    
       CREATE TABLE `my-catalog`.`my_db`.`my_tbl` (
         dt STRING,
         id BIGINT,
         content STRING,
         PRIMARY KEY (dt, id) NOT ENFORCED
       ) PARTITIONED BY (dt) WITH (
         'changelog-producer' = 'lookup'
       );
  2. Select both statements and click Run. Expected output:

       The following statement has been executed successfully!
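To inspect the new table, you can optionally run the standard Flink SQL describe statement; the exact output layout may vary by engine version:

    DESCRIBE `my-catalog`.`my_db`.`my_tbl`;
    -- Shows the columns dt, id, content, the (dt, id) primary key,
    -- and the dt partition key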

Step 3: Write data to the Apache Paimon table

Writing data requires a streaming job. Create the job under ETL > Drafts.

  1. In the left-side navigation pane, choose Development > ETL. On the Drafts tab, click New. On the SQL Scripts tab of the New Draft dialog box, click Blank Stream Draft. For details, see Develop an SQL draft.

  2. Copy the following SQL into the SQL editor:

       -- Paimon commits data only after each checkpoint completes.
       -- A 10-second interval lets you see results quickly in this tutorial.
       -- In production, set the interval to 1 to 10 minutes based on your latency requirements.
       SET 'execution.checkpointing.interval' = '10s';
    
       INSERT INTO `my-catalog`.`my_db`.`my_tbl`
       VALUES
         ('20240108', 1, 'apple'),
         ('20240108', 2, 'banana'),
         ('20240109', 1, 'cat'),
         ('20240109', 2, 'dog');
  3. In the upper-right corner of the SQL editor, click Deploy. In the Deploy draft dialog box, configure the parameters and click Confirm.

  4. In the left-side navigation pane, choose O&M > Deployments. Find the deployment and click Start in the Actions column.

  5. In the Start Job panel, select Initial Mode and click Start.

  6. Wait for the deployment status to change to FINISHED. This confirms that the data has been written.

Step 4: Consume data in streaming mode

This step creates a streaming job that reads all rows from the Paimon table and outputs them to the Task Manager logs through the Print connector.

  1. Create a new Blank Stream Draft (under Development > ETL > Drafts) and enter the following SQL:

       CREATE TEMPORARY TABLE Print (
         dt STRING,
         id BIGINT,
         content STRING
       ) WITH (
         'connector' = 'print'
       );
    
       INSERT INTO Print SELECT * FROM `my-catalog`.`my_db`.`my_tbl`;
  2. Click Deploy, configure the parameters in the Deploy draft dialog box, and click Confirm.

  3. On the O&M > Deployments page, find the deployment and click Start in the Actions column. In the Start Job panel, select Initial Mode and click Start.

  4. View the output:

    1. On the O&M > Deployments page, click the deployment name.

    2. On the Logs tab, under Running Task Managers, click the value in the Path, ID column.

    3. Click the Stdout tab. Expected output:

       +I[20240108, 1, apple]
       +I[20240108, 2, banana]
       +I[20240109, 1, cat]
       +I[20240109, 2, dog]

Step 5: Update data in the Apache Paimon table

Paimon tables with primary keys support upserts. When you insert a row with an existing primary key, the old value is replaced.

  1. Create a new Blank Stream Draft and enter the following SQL, which replaces apple with hello for key (20240108, 1) and dog with world for key (20240109, 2):

       SET 'execution.checkpointing.interval' = '10s';
    
       INSERT INTO `my-catalog`.`my_db`.`my_tbl`
       VALUES
         ('20240108', 1, 'hello'),
         ('20240109', 2, 'world');
  2. Click Deploy, configure the parameters, and click Confirm.

  3. On the O&M > Deployments page, find the deployment and click Start. Select Initial Mode and click Start.

  4. Wait for the deployment status to change to FINISHED.

  5. Go to the Stdout tab of the streaming consumption job started in Step 4. The updated rows appear in the log. The -U prefix marks the retracted old value, and +U marks the new value. Expected output:

       -U[20240108, 1, apple]
       +U[20240108, 1, hello]
       -U[20240109, 2, dog]
       +U[20240109, 2, world]
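A streaming read started after these updates (repeating Step 4 with a fresh deployment) would first scan the table's latest snapshot, so the merged values arrive as +I rows rather than as the -U/+U pairs above. Assuming the default scan behavior, the output would resemble the following; row order may vary:

    +I[20240108, 1, hello]
    +I[20240108, 2, banana]
    +I[20240109, 1, cat]
    +I[20240109, 2, world]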

(Optional) Step 6: Cancel the streaming job and clean up resources

After testing, cancel the streaming consumption job and remove the Paimon resources.

  1. On the O&M > Deployments page, find the streaming consumption deployment and click Cancel in the Actions column.

  2. Go to the Scripts tab. In the script editor, run the following SQL to delete the database and catalog:

       -- Delete the database and all associated data files in OSS
       DROP DATABASE `my-catalog`.`my_db` CASCADE;
    
       -- Remove the catalog metadata from Realtime Compute for Apache Flink
       -- Data files in OSS are not deleted by this statement
       DROP CATALOG `my-catalog`;

     Expected output:

       The following statement has been executed successfully!

Next steps