This topic describes how to deploy a JAR job in Realtime Compute for Apache Flink to migrate a Paimon FileSystem catalog to DLF.
Prerequisites
A fully managed Flink workspace is created. For more information, see Activate Realtime Compute for Apache Flink.
A DLF data catalog is created. For more information, see Create a data catalog.
Procedure
Step 1: Create a JAR job
Log on to the Realtime Compute for Apache Flink management console.
In the list of fully managed Flink workspaces, click the name of your workspace.
In the navigation pane on the left, choose .
Click Deploy Job, select JAR Job, and configure the following parameters:

| Parameter | Description | Example |
| --- | --- | --- |
| Deployment Mode | This parameter is fixed to Batch Mode. | Batch Mode |
| Deployment Name | The name of the JAR job. | migrate_paimon |
| Engine Version | The Realtime Compute engine version. | vvr-8.0.11-flink-1.17 |
| JAR URI | The paimon-flink-action JAR package. | Upload the paimon-flink-action-1.3-SNAPSHOT-for-clone-20250909.jar package. If you have uploaded it before, select it from the drop-down list. |
| Entry Point Class | The entry point class of the program. | Leave this parameter empty. |
| Entry Point Main Arguments | The parameters passed to the main method. | Leave this parameter empty for now. The values depend on the migration scope. For more information, see Step 2. |
| Additional Dependencies | The path or file name of the dependency file to attach. | Upload the paimon-ali-vvr-8.0-vvp-1.3-ali-SNAPSHOT-for-clone-20250909.jar package. If you have uploaded it before, select it from the drop-down list. |

Note: For more information about deployment parameters, see Deploy a JAR job.
Click Deploy to create the JAR job.
Step 2: Adjust parameters and start the job
A Flink job can migrate an entire catalog, an entire database, or a single table. Adjust the Entry Point Main Arguments parameter based on your migration goal.
On the Job O&M page, find the JAR job that you created and click Details.
On the Deployment Details page, click Edit in the upper-right corner and specify the Entry Point Main Arguments parameter. The arguments vary based on your migration goal. The full argument template is as follows:

clone --parallelism '<parallelism>' --database '<database-name>' --table '<table-name>' --catalog_conf 'metastore=filesystem' --catalog_conf 'warehouse=<warehouse>' --catalog_conf 'fs.oss.endpoint=<fs.oss.endpoint>' --catalog_conf 'fs.oss.accessKeyId=<fs.oss.accessKeyId>' --catalog_conf 'fs.oss.accessKeySecret=<fs.oss.accessKeySecret>' --target_database '<target-database-name>' --target_table '<target-table-name>' --target_catalog_conf 'metastore=rest' --target_catalog_conf 'warehouse=<target-warehouse>' --target_catalog_conf 'uri=<dlf.next.endpoint>' --target_catalog_conf 'token.provider=dlf' --target_catalog_conf 'dlf.access-key-id=<dlf.access-key-id>' --target_catalog_conf 'dlf.access-key-secret=<dlf.access-key-secret>' --clone_from 'paimon' --where '<filter-spec>'

The following table describes the configuration items.
| Configuration Item | Description | Required | Remarks |
| --- | --- | --- | --- |
| parallelism | The parallelism of the job. | No | Example: 16 |
| database-name | The name of the database in the FileSystem catalog to clone. | No | Example: my_database |
| table-name | The name of the table in the FileSystem catalog to clone. | No | Example: my_table |
| warehouse | The OSS path of the FileSystem catalog to clone. | Yes | The format is oss://<bucket>/<object>, where bucket is the name of your OSS bucket and object is the path where your data is stored. You can view the bucket and object names in the OSS console. |
| fs.oss.endpoint | The endpoint of the OSS service. | Yes | For more information about how to obtain the endpoint, see Regions and endpoints. OSS example: oss-cn-hangzhou-internal.aliyuncs.com. OSS-HDFS example: cn-hangzhou.oss-dls.aliyuncs.com. |
| fs.oss.accessKeyId | The AccessKey ID of the Alibaba Cloud account or RAM user that has read and write permissions on OSS. | Yes | Use an existing AccessKey or create a new one. For more information, see Create an AccessKey. Note: To reduce the risk of a leak, the AccessKey secret is displayed only when you create it and cannot be retrieved later. Store it securely. |
| fs.oss.accessKeySecret | The AccessKey secret of the Alibaba Cloud account or RAM user that has read and write permissions on OSS. | Yes | |
| target-database-name | The name of the target DLF database. | No | Example: target_database |
| target-table-name | The name of the target DLF table. | No | Example: target_table |
| target-warehouse | The name of the target DLF data catalog. | Yes | You can view the data catalog name in the DLF console. For more information, see Data catalogs. |
| dlf.next.endpoint | The endpoint of the DLF service. | Yes | For more information, see Endpoints. Example: cn-hangzhou-vpc.dlf.aliyuncs.com |
| dlf.access-key-id | The AccessKey ID used to access the DLF service. | Yes | Use an existing AccessKey or create a new one. For more information, see Create an AccessKey. Note: To reduce the risk of a leak, the AccessKey secret is displayed only when you create it and cannot be retrieved later. Store it securely. |
| dlf.access-key-secret | The AccessKey secret used to access the DLF service. | Yes | |
| clone_from | The type of the source table to clone. | Yes | Set the value to 'paimon'. |
| filter-spec | The partition filter condition used during cloning. | No | Example: dt = '2024-10-01' |
Important:
- To migrate an entire database, do not set the table-name and target-table-name parameters.
- To migrate an entire data catalog, do not set the database-name and target-database-name parameters.
- When you migrate an entire data catalog or database, you can exclude specific tables by setting the --excluded_tables <excluded-tables-spec> parameter. Example: my_db.my_tbl,my_db2.my_tbl2. Do not set this parameter for single-table migrations.
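For illustration, the following is an example argument string for a single-table migration. The database, table, bucket, catalog, and partition values shown here are hypothetical placeholders; substitute your own values, and keep the AccessKey placeholders until you fill in real credentials.

```
clone --parallelism '16' --database 'my_database' --table 'my_table' --catalog_conf 'metastore=filesystem' --catalog_conf 'warehouse=oss://my-bucket/paimon-warehouse' --catalog_conf 'fs.oss.endpoint=oss-cn-hangzhou-internal.aliyuncs.com' --catalog_conf 'fs.oss.accessKeyId=<fs.oss.accessKeyId>' --catalog_conf 'fs.oss.accessKeySecret=<fs.oss.accessKeySecret>' --target_database 'target_database' --target_table 'target_table' --target_catalog_conf 'metastore=rest' --target_catalog_conf 'warehouse=my_dlf_catalog' --target_catalog_conf 'uri=cn-hangzhou-vpc.dlf.aliyuncs.com' --target_catalog_conf 'token.provider=dlf' --target_catalog_conf 'dlf.access-key-id=<dlf.access-key-id>' --target_catalog_conf 'dlf.access-key-secret=<dlf.access-key-secret>' --clone_from 'paimon' --where "dt = '2024-10-01'"
```

Because the --where value contains single quotes, it is wrapped in double quotes here. For a full-database migration, omit --table, --target_table, and typically --where; for a full-catalog migration, also omit --database and --target_database.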
After you configure the parameters, click Save on the Deployment Details page.
On the Job O&M page, click Start next to the JAR job. Then, start the job with the default parameters.
Step 3: Verify the result
When the job status changes to Finished, log on to the DLF console and verify that the migration was successful.
For a full catalog migration: Check that the catalog structure, databases, and tables in DLF are consistent with those in the FileSystem catalog.
For a full database migration: Check that the database and table structures in DLF are consistent with those in the FileSystem catalog.
For a single table migration: Check that the table structure in DLF is consistent with that in the FileSystem catalog.
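As a sketch of one way to inspect the result, you can also query the target catalog from a Flink SQL session. The catalog options below mirror the target_catalog_conf values passed to the clone job; the catalog name dlf_catalog and the 'type' = 'paimon' option are assumptions for illustration, not values from this topic.

```
-- Hypothetical verification from a Flink SQL session; the WITH options
-- mirror the target_catalog_conf values used by the clone job.
CREATE CATALOG dlf_catalog WITH (
  'type' = 'paimon',                -- assumed catalog type
  'metastore' = 'rest',
  'uri' = '<dlf.next.endpoint>',
  'warehouse' = '<target-warehouse>',
  'token.provider' = 'dlf',
  'dlf.access-key-id' = '<dlf.access-key-id>',
  'dlf.access-key-secret' = '<dlf.access-key-secret>'
);
USE CATALOG dlf_catalog;
SHOW DATABASES;                     -- expect the migrated databases
-- USE <target-database-name>; then SHOW TABLES; to list migrated tables
```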