
OpenLake: Migrate a Paimon FileSystem catalog to DLF

Last Updated: Jan 15, 2026

This topic describes how to deploy a JAR job in Realtime Compute for Apache Flink to migrate a Paimon FileSystem catalog to DLF.

Prerequisites

Procedure

Step 1: Create a JAR job

  1. Log on to the Realtime Compute for Apache Flink management console.

  2. In the list of fully managed Flink workspaces, click the name of your workspace.

  3. In the navigation pane on the left, choose Operation Center > Deployments.

  4. Click Deploy Job, select JAR Job, and configure the following parameters.

    • Deployment Mode: The deployment mode of the job. This parameter is fixed to Batch Mode.

    • Deployment Name: The name of the JAR job. Example: migrate_paimon.

    • Engine Version: The version of the real-time computing engine. Example: vvr-8.0.11-flink-1.17.

    • JAR URI: The paimon-flink-action JAR package. Upload the paimon-flink-action-1.3-SNAPSHOT-for-clone-20250909.jar package. If you have uploaded it before, select it from the drop-down list.

    • Entry Point Class: The entry point class of the program. Leave this parameter empty.

    • Entry Point Main Arguments: The parameters that are passed to the main method. Leave this parameter empty for now. The required arguments depend on the migration goal and are configured in Step 2.

    • Additional Dependencies: The dependency file to attach. Upload the paimon-ali-vvr-8.0-vvp-1.3-ali-SNAPSHOT-for-clone-20250909.jar package. If you have uploaded it before, select it from the drop-down list.

    Note

    For more information about deployment parameters, see Deploy a JAR job.

  5. Click Deploy to create the JAR job.

Step 2: Adjust parameters and start the job

A Flink job can migrate an entire catalog, an entire database, or a single table. Adjust the Entry Point Main Arguments parameter based on your migration goal.

  1. On the Job O&M page, find the JAR job that you created and click Details.

  2. On the Deployment Details page, click Edit in the upper-right corner and specify the Entry Point Main Arguments parameter.

    Adjust the arguments in the following template based on your migration scope:

    clone
    --parallelism '<parallelism>'
    --database '<database-name>'
    --table '<table-name>'
    --catalog_conf 'metastore=filesystem'
    --catalog_conf 'warehouse=<warehouse>'
    --catalog_conf 'fs.oss.endpoint=<fs.oss.endpoint>'
    --catalog_conf 'fs.oss.accessKeyId=<fs.oss.accessKeyId>'
    --catalog_conf 'fs.oss.accessKeySecret=<fs.oss.accessKeySecret>'
    --target_database '<target-database-name>'
    --target_table '<target-table-name>'
    --target_catalog_conf 'metastore=rest'
    --target_catalog_conf 'warehouse=<target-warehouse>'
    --target_catalog_conf 'uri=<dlf.next.endpoint>'
    --target_catalog_conf 'token.provider=dlf'
    --target_catalog_conf 'dlf.access-key-id=<dlf.access-key-id>'
    --target_catalog_conf 'dlf.access-key-secret=<dlf.access-key-secret>'
    --clone_from 'paimon'
    --where '<filter-spec>'

    The following list describes the configuration items.

    • parallelism (optional): The parallelism of the job. Example: 16.

    • database-name (optional): The name of the database in the FileSystem catalog to clone. Example: my_database.

    • table-name (optional): The name of the table in the FileSystem catalog to clone. Example: my_table.

    • warehouse (required): The OSS path of the warehouse of the FileSystem catalog to clone, in the format oss://<bucket>/<object>, where bucket is the name of your OSS bucket and object is the path in which your data is stored. You can view the bucket and object names in the OSS console.

    • fs.oss.endpoint (required): The endpoint of the OSS service. For more information about how to obtain the endpoint, see Regions and endpoints. OSS example: oss-cn-hangzhou-internal.aliyuncs.com. OSS-HDFS example: cn-hangzhou.oss-dls.aliyuncs.com.

    • fs.oss.accessKeyId (required): The AccessKey ID of the Alibaba Cloud account or RAM user that has read and write permissions on OSS. Use an existing AccessKey or create a new one. For more information, see Create an AccessKey.

      Note: To reduce the risk of a leak, the AccessKey secret is displayed only when you create it and cannot be retrieved later. Store your AccessKey secret securely.

    • fs.oss.accessKeySecret (required): The AccessKey secret of the Alibaba Cloud account or RAM user that has read and write permissions on OSS.

    • target-database-name (optional): The name of the destination database in DLF. Example: target_database.

    • target-table-name (optional): The name of the destination table in DLF. Example: target_table.

    • target-warehouse (required): The name of the destination DLF catalog. You can view the catalog name in the DLF console. For more information, see Data catalogs.

    • dlf.next.endpoint (required): The endpoint of the DLF service. For more information, see Endpoints. Example: cn-hangzhou-vpc.dlf.aliyuncs.com.

    • dlf.access-key-id (required): The AccessKey ID that is used to access the DLF service. Use an existing AccessKey or create a new one. For more information, see Create an AccessKey. The same note about storing the AccessKey secret applies.

    • dlf.access-key-secret (required): The AccessKey secret that is used to access the DLF service.

    • clone_from (required): The type of the source table to clone. Set the value to 'paimon'.

    • filter-spec (optional): The partition filter condition that is applied during cloning. Example: dt = '2024-10-01'.

    Important
    • If you want to migrate an entire database, do not set the table-name and target-table-name parameters.

    • If you want to migrate an entire data catalog, do not set the database-name and target-database-name parameters.

    • When you migrate an entire data catalog or database, you can exclude specific tables by setting the --excluded_tables <excluded-tables-spec> parameter. Example: my_db.my_tbl,my_db2.my_tbl2. Do not set this parameter for single-table migrations.
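    For reference, the following filled-in example migrates the single table my_database.my_table to target_database.target_table. The bucket name my-bucket and the catalog name my_dlf_catalog are hypothetical, and the AccessKey values are left as placeholders; replace them with your own values.

    clone
    --parallelism '16'
    --database 'my_database'
    --table 'my_table'
    --catalog_conf 'metastore=filesystem'
    --catalog_conf 'warehouse=oss://my-bucket/warehouse'
    --catalog_conf 'fs.oss.endpoint=oss-cn-hangzhou-internal.aliyuncs.com'
    --catalog_conf 'fs.oss.accessKeyId=<fs.oss.accessKeyId>'
    --catalog_conf 'fs.oss.accessKeySecret=<fs.oss.accessKeySecret>'
    --target_database 'target_database'
    --target_table 'target_table'
    --target_catalog_conf 'metastore=rest'
    --target_catalog_conf 'warehouse=my_dlf_catalog'
    --target_catalog_conf 'uri=cn-hangzhou-vpc.dlf.aliyuncs.com'
    --target_catalog_conf 'token.provider=dlf'
    --target_catalog_conf 'dlf.access-key-id=<dlf.access-key-id>'
    --target_catalog_conf 'dlf.access-key-secret=<dlf.access-key-secret>'
    --clone_from 'paimon'

    To migrate an entire database instead, remove the --table and --target_table lines. To migrate the entire catalog, also remove the --database and --target_database lines.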

  3. After you configure the parameters, click Save on the Deployment Details page.

  4. On the Job O&M page, click Start next to the JAR job, and start the job with the default parameters.

Step 3: Verify the result

When the job status changes to Finished, log on to the DLF console and verify that the migration was successful.

  • For a full catalog migration: Check that the catalog structure, databases, and tables in DLF are consistent with those in the FileSystem catalog.

  • For a full database migration: Check that the database and table structures in DLF are consistent with those in the FileSystem catalog.

  • For a single table migration: Check that the table structure in DLF is consistent with that in the FileSystem catalog.
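The structure checks above can also be performed from a Flink SQL session instead of the DLF console. The following is a sketch, assuming that the migrated DLF catalog can be registered as a Paimon catalog with the same options that are used in target_catalog_conf above; the catalog name dlf_catalog is illustrative, and the placeholders must be replaced with your own values:

    CREATE CATALOG dlf_catalog WITH (
      'type' = 'paimon',
      'metastore' = 'rest',
      'warehouse' = '<target-warehouse>',
      'uri' = '<dlf.next.endpoint>',
      'token.provider' = 'dlf',
      'dlf.access-key-id' = '<dlf.access-key-id>',
      'dlf.access-key-secret' = '<dlf.access-key-secret>'
    );
    USE CATALOG dlf_catalog;
    SHOW DATABASES;   -- expect the migrated databases
    USE target_database;
    SHOW TABLES;      -- expect the migrated tables
    DESCRIBE target_table;  -- compare the schema with the source table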