JindoTable's MoveTo command migrates Hive tables or partitions to a new destination path and automatically updates the metadata afterward, keeping the table queryable without manual intervention. Use filter conditions to migrate a subset of partitions at once, or pass -fullTable to move an entire table in one operation.
Quick start
Preview which partitions match your filter, then run the migration:
# Step 1: Preview partitions to be migrated (no data is moved)
jindo table -moveTo \
-t mydb.events \
-d oss://my-bucket/archive/events \
-c "ds < '2023-01-01'" \
-e
# Step 2: Run the migration
jindo table -moveTo \
-t mydb.events \
-d oss://my-bucket/archive/events \
-c "ds < '2023-01-01'"
Prerequisites
Before you begin, ensure that you have:
- Java Development Kit (JDK) 8 installed on your computer
- An E-MapReduce (EMR) cluster running EMR V3.36.0 or later, or EMR V5.2.0 or later
How it works
When MoveTo runs, it:
- Copies the underlying data to the destination path.
- Updates the table or partition metadata to point to the new location.
- Optionally removes the source data after a successful migration (requires -r).
MoveTo uses a process lock stored in Hadoop Distributed File System (HDFS) to prevent concurrent runs. Only one MoveTo process can run at a time in an EMR cluster. If you start a second process while one is running, the request is rejected with a message that identifies the running process. Either wait for it to finish or stop it before starting a new one.
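Because only one MoveTo process can hold the lock, a scheduled job may want to wait and try again instead of failing outright. The following is a minimal sketch, not part of JindoTable: run_with_retry and its two parameters are hypothetical names, and it assumes that a rejected request exits with a non-zero status, which this topic does not confirm. It also cannot tell a lock rejection apart from any other failure, so inspect the logs before relying on it.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# Assumption: a rejected MoveTo request exits with a non-zero status.
run_with_retry() {
    max_attempts=$1   # hypothetical parameter: maximum number of tries
    wait_seconds=$2   # hypothetical parameter: pause between tries
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${wait_seconds}s" >&2
        sleep "$wait_seconds"
        attempt=$((attempt + 1))
    done
    return 0
}

# Example call (commented out so that nothing runs by accident):
# run_with_retry 10 60 jindo table -moveTo -t mydb.events \
#     -d oss://my-bucket/archive/events -c "ds < '2023-01-01'"
```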
Migrate tables or partitions
Follow this three-step workflow to migrate data safely:
- Preview: run with -e to verify which partitions will be moved.
- Migrate: run the actual migration command.
- Clean up: add -r to remove source data only after you confirm the migration succeeded.
-r/-removeSource permanently removes source data after migration. When combined with -skipTrash, data is deleted immediately without going to the HDFS trash. Always run with -e/-explain first to verify which partitions will be moved before using these flags.
Do not start a MoveTo process on a cluster where one is already running. The new request will be rejected.
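The preview-then-migrate part of this workflow can be wrapped in a small helper so that the real run never happens without a preview. This is a sketch under assumptions: preview_then_move is a hypothetical function name, and it assumes that a failed preview exits with a non-zero status. It uses only the -e flag described in this topic and passes every other argument through unchanged.

```shell
# Sketch: preview with -e, ask for confirmation, then run the same
# command without -e. All arguments go to "jindo table -moveTo".
preview_then_move() {
    echo "Preview (no data is moved):"
    jindo table -moveTo "$@" -e || return 1

    # The prompt goes to stderr so that stdout holds only command output.
    printf 'Proceed with the migration? [y/N] ' >&2
    read -r answer
    case "$answer" in
        y|Y) jindo table -moveTo "$@" ;;
        *)   echo "Aborted; nothing was moved."; return 2 ;;
    esac
}

# Example call (commented out):
# preview_then_move -t mydb.events -d oss://my-bucket/archive/events \
#     -c "ds < '2023-01-01'"
```

Leaving -r out of the first real run keeps the source data in place until the migrated table has been checked.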
Step 1: Log on to your EMR cluster
Log on to your EMR cluster in SSH mode. For more information, see Log on to a cluster.
Step 2: Preview the migration (recommended)
Run the command with -e to print the list of matching partitions without moving any data:
jindo table -moveTo \
-t <dbName.tableName> \
-d <destination path> \
-c "<condition>" \
-e
Example: Preview all partitions whose ds value is earlier than 2023-01-01:
jindo table -moveTo \
-t mydb.events \
-d oss://my-bucket/archive/events \
-c "ds < '2023-01-01'" \
-e
The command prints the list of matching partitions without moving any data.
Step 3: Run the migration
To view all available options for the MoveTo command, run:
jindo table -help moveTo
The full command syntax is:
jindo table -moveTo \
-t <dbName.tableName> \
-d <destination path> \
[-c "<condition>" | -fullTable] \
[-b/-before <before days>] \
[-p/-parallel <parallelism>] \
[-s/-storagePolicy <OSS storage policy>] \
[-o/-overWrite] \
[-r/-removeSource] \
[-skipTrash] \
[-e/-explain] \
[-l/-logDir <log directory>]
| Parameter | Description | Required |
|---|---|---|
| -t <dbName.tableName> | The table to migrate, in database.table format. Supports both partitioned and non-partitioned tables. | Yes |
| -d <destination path> | The table-level destination path. For partitioned tables, the full partition path is composed as <destination path>/p1=v1/p2=v2/. | Yes |
| -c "<condition>" \| -fullTable | Use -fullTable to move the entire table. Use -c "<condition>" to filter partitions by a condition (supports standard operators such as >). Example: -c "ds > 'd'". You must specify one of these options. | Yes (one of the two) |
| -b/-before <before days> | Migrate only tables or partitions created at least the specified number of days ago. | No |
| -p/-parallel <parallelism> | Maximum number of concurrent partition copy operations. Defaults to 1. | No |
| -s/-storagePolicy <OSS storage policy> | Target storage class for Object Storage Service (OSS) destinations. Valid values: Standard (default), IA, Archive, ColdArchive. Not applicable for non-OSS destinations. | No |
| -o/-overWrite | Clear the destination path before writing. For partitioned tables, only the destination path of the migrated partition is cleared. | No |
| -r/-removeSource | Remove the source path after the migration and metadata update succeed. For partitioned tables, only the source path of the migrated partition is removed. | No |
| -skipTrash | Delete source data immediately, bypassing the HDFS trash. Only valid when -r/-removeSource is specified. | No |
| -e/-explain | Print the list of partitions to be migrated without moving any data. Use this to validate your filter conditions before running the actual migration. | No |
| -l/-logDir <log directory> | Directory for log files. Defaults to /tmp/<current user>/. | No |
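To make the -d path rule concrete, the sketch below composes the destination of a single partition from the table-level path, following the <destination path>/p1=v1/p2=v2/ pattern described for -d. The function name partition_dest and the hour partition key are illustrative assumptions. MoveTo builds these paths itself, so the helper is only useful for checking where data should land.

```shell
# Sketch: compose <destination path>/p1=v1/p2=v2/ for one partition.
partition_dest() {
    dest=${1%/}   # drop one trailing slash from the table-level path
    shift
    for kv in "$@"; do
        dest="$dest/$kv"
    done
    printf '%s/\n' "$dest"
}

partition_dest oss://my-bucket/archive/events ds=2022-12-31 hour=23
# prints: oss://my-bucket/archive/events/ds=2022-12-31/hour=23/
```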
Examples
Migrate all partitions in mydb.events to an OSS path in the Archive storage class, with up to 4 concurrent partition copy operations, and remove the source data after a successful migration:
jindo table -moveTo \
-t mydb.events \
-d oss://my-bucket/archive/events \
-fullTable \
-s Archive \
-p 4 \
-r
Before using the ColdArchive storage class, make sure that Cold Archive is enabled on the destination OSS bucket.
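Several parameter rules in this topic can be checked before a command is submitted: one of -c or -fullTable must be given, -skipTrash is only valid together with -r/-removeSource, and -s accepts four values. The following sketch encodes those rules. check_moveto_args is a hypothetical helper that only inspects its arguments and never calls jindo, and treating -c and -fullTable as mutually exclusive is an assumption drawn from the syntax line.

```shell
# Sketch: pre-flight check of a MoveTo argument list.
check_moveto_args() {
    has_c=0; has_full=0; has_r=0; has_skip=0; policy=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -c) has_c=1 ;;
            -fullTable) has_full=1 ;;
            -r|-removeSource) has_r=1 ;;
            -skipTrash) has_skip=1 ;;
            -s|-storagePolicy) policy=${2:-MISSING} ;;
        esac
        shift
    done
    if [ $((has_c + has_full)) -ne 1 ]; then
        echo "specify exactly one of -c or -fullTable" >&2
        return 1
    fi
    if [ "$has_skip" -eq 1 ] && [ "$has_r" -eq 0 ]; then
        echo "-skipTrash is only valid together with -r/-removeSource" >&2
        return 1
    fi
    case "$policy" in
        ""|Standard|IA|Archive|ColdArchive) ;;
        *) echo "unknown storage policy: $policy" >&2; return 1 ;;
    esac
    return 0
}

# Example call (commented out):
# check_moveto_args -t mydb.events -d oss://my-bucket/archive/events \
#     -fullTable -s Archive -p 4 -r && echo "arguments look consistent"
```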
Migrate partitions whose ds value is earlier than 2023-06-01 and that were created at least 90 days ago, and store them in the IA storage class:
jindo table -moveTo \
-t mydb.logs \
-d oss://my-bucket/cold/logs \
-c "ds < '2023-06-01'" \
-b 90 \
-s IA
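For a recurring archive job, the date in the -c condition usually has to move with the calendar. The sketch below builds such a condition. It assumes GNU date (the usual date on Linux nodes) and a partition column named ds that holds yyyy-MM-dd strings. ds_cutoff_condition and its optional reference-date argument are hypothetical names.

```shell
# Sketch: build a -c condition for partitions older than N days.
# Assumptions: GNU date; partition column ds formatted as yyyy-MM-dd.
ds_cutoff_condition() {
    days=$1
    ref=${2:-today}   # hypothetical optional reference date
    cutoff=$(date -u -d "$ref $days days ago" +%F) || return 1
    printf "ds < '%s'\n" "$cutoff"
}

# Example call (commented out):
# jindo table -moveTo -t mydb.logs -d oss://my-bucket/cold/logs \
#     -c "$(ds_cutoff_condition 90)" -s IA -e
```

Note that this filters on the partition value, whereas -b filters on when the partition was created. The two are independent.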
Configure a custom lock directory
MoveTo uses a process lock stored in HDFS to prevent concurrent runs. The default lock path is hdfs:///tmp/jindotable-lock/.
The lock path must be an HDFS path. If you do not have write permission on the default path, follow these steps to set a custom path.
Before changing the lock directory, make sure no MoveTo process is running on the cluster. Changing the lock directory while a process is active may cause the process to fail and could result in data corruption.
- Go to the HDFS service page in the EMR console.
  - Log on to the Alibaba Cloud EMR console.
  - In the top navigation bar, select the region where your cluster resides and select a resource group.
  - Click the Cluster Management tab.
  - On the Cluster Management page, find your cluster and click Details in the Actions column.
  - In the left-side navigation pane of the Cluster Overview page, choose Cluster Service > HDFS.
- Add a custom configuration item.
  - Click the Configure tab, then click hdfs-site or core-site in the Service Configuration section.
  - In the upper-right corner of the Service Configuration section, click Custom Configuration.
  - In the Add Configuration Item dialog box, add the jindotable.moveto.tablelock.base.dir parameter and set its value to an existing HDFS path.
- Save the configuration.
  - In the upper-right corner of the Service Configuration section, click Save.
  - In the Confirm Changes dialog box, fill in Description and click OK.
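Because jindotable.moveto.tablelock.base.dir must point to an existing HDFS path, it can help to confirm the directory from the command line before saving the setting. The sketch below uses the standard hdfs dfs -test -d and hdfs dfs -mkdir -p commands. ensure_lock_dir is a hypothetical helper name, and the path in the example call is a placeholder.

```shell
# Sketch: make sure the intended lock directory exists in HDFS.
ensure_lock_dir() {
    dir=$1
    if hdfs dfs -test -d "$dir"; then
        echo "exists: $dir"
    else
        # Create the directory, including missing parents.
        hdfs dfs -mkdir -p "$dir" && echo "created: $dir"
    fi
}

# Example call (commented out; the path is a placeholder):
# ensure_lock_dir hdfs:///user/hadoop/jindotable-lock
```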