E-MapReduce: JindoTable Instructions

Last Updated: Mar 26, 2026

JindoTable collects access-frequency statistics for tables and partitions, manages tiered storage, and optimizes how table data is organized at the storage layer.

Prerequisites

Before you begin, make sure that you have:

  • Java Development Kit (JDK) 8 installed on your on-premises machine

  • An E-MapReduce (EMR) cluster running V3.30.0 or later

For details on creating an EMR cluster, see Create a cluster.

Commands

JindoTable provides the following commands. Specify tables in the format dbName.tableName and partitions as comma-separated column=value pairs in the format partitionCol1=val1,partitionCol2=val2,...

| Command | Description |
| --- | --- |
| -accessStat | Query the most frequently accessed tables or partitions in a time window |
| -leastUseStat | Query the tables or partitions that have been idle the longest |
| -cache | Cache table or partition data to local disk |
| -uncache | Remove cached table or partition data from local disk |
| -archive | Lower the storage class of table or partition data to Archive or Infrequent Access |
| -unarchive | Restore archived data to Standard or Infrequent Access storage class |
| -status | View the storage status of a table or partition |
| -optimize | Optimize table data organization at the storage layer |
| -showTable | List all partitions in a partitioned table, or show storage details of a non-partitioned table |
| -showPartition | Show storage details of a specific partition |
| -listTables | List all tables in a database |
| -dumpmc | Dump a MaxCompute table to an EMR cluster or Object Storage Service (OSS) |

For SDK-mode archive operations and data migration, see -archiveTable, -unarchiveTable, and -moveTo.
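The partition format described above is easy to get wrong when a spec is assembled inside a script. As a minimal sketch, a small Bash helper can join column=value pairs into the comma-separated form that -p expects. The function name partition_spec is hypothetical and is not part of JindoTable.

```shell
# Hypothetical helper (not part of JindoTable): joins its arguments with commas
# to produce a partition spec such as date=2020-03-16,category=1.
partition_spec() {
  local IFS=,            # "$*" joins arguments with the first character of IFS
  printf '%s\n' "$*"
}

# Possible use with a command from this page:
#   jindo table -status -t db1.t1 -p "$(partition_spec date=2020-03-16 category=1)"
```

Quoting the command substitution keeps the spec as a single argument even if a partition value contains characters the shell would otherwise split on.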

-accessStat

Query the tables or partitions accessed most frequently within a specified number of days, along with their access counts.

Syntax

jindo table -accessStat -d <days> [-n <topNums>]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -d <days> | Yes | None | Number of past days to include in the query. Must be a positive integer. If set to 1, the query covers from 00:00 to the current time on the current day. |
| -n <topNums> | No | All results | Number of top results to return. Must be a positive integer. |

Example

Query the 20 most-accessed tables or partitions over the last seven days:

jindo table -accessStat -d 7 -n 20
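Because both -d and -n must be positive integers, a script that takes these values from user input may want to check them first. The following Bash sketch validates the values and prints the resulting command line instead of running it. The function names is_positive_int and access_stat_cmd are hypothetical, and how JindoTable itself reports invalid values is not covered here.

```shell
# Hypothetical helpers (not part of JindoTable).
# Succeeds only if $1 consists of digits and is greater than zero.
is_positive_int() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "$1" -gt 0 ]
}

# Prints an -accessStat command line for <days> and an optional <topNums>.
access_stat_cmd() {
  local days=$1 top=${2:-}
  is_positive_int "$days" || { echo "-d must be a positive integer" >&2; return 1; }
  if [ -n "$top" ]; then
    is_positive_int "$top" || { echo "-n must be a positive integer" >&2; return 1; }
    echo "jindo table -accessStat -d $days -n $top"
  else
    echo "jindo table -accessStat -d $days"
  fi
}
```

For example, access_stat_cmd 7 20 prints the same command as the example above, while access_stat_cmd 0 returns a non-zero status.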

-leastUseStat

Query the tables or partitions that have been idle the longest, ranked by time since last access.

Syntax

jindo table -leastUseStat -n <num> [-i | -ignoreNever]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -n <num> | Yes | None | Number of results to return. Must be a positive integer. |
| -i / -ignoreNever | No | Include all | When specified, excludes tables or partitions that have never been accessed. |

Example

Query the 20 tables or partitions that have been idle the longest:

jindo table -leastUseStat -n 20
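The example above includes tables and partitions that have never been accessed. As a sketch of the optional flag from the syntax line, the following Bash function prints the command with or without -i. The function name least_use_cmd and its yes/no argument are hypothetical conventions chosen for this illustration.

```shell
# Hypothetical helper (not part of JindoTable): prints a -leastUseStat command.
# $1 = number of results, $2 = "yes" to exclude never-accessed tables and partitions.
least_use_cmd() {
  local n=$1 ignore_never=${2:-no}
  local cmd="jindo table -leastUseStat -n $n"
  if [ "$ignore_never" = yes ]; then
    cmd="$cmd -i"          # -i is the short form of -ignoreNever
  fi
  echo "$cmd"
}
```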

-cache

Cache the data of a table or partition from OSS or JindoFS to local disk, speeding up subsequent reads.

To remove cached data, use -uncache.

Syntax

jindo table -cache -t <dbName.tableName> [-p <partitionSpec>] [-pin]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -t <dbName.tableName> | Yes | None | The table to cache. The data must be stored in OSS or JindoFS. |
| -p <partitionSpec> | No | Entire table | The partition to cache. Format: partitionCol1=val1,partitionCol2=val2,... |
| -pin | No | Not pinned | When specified, prevents the cached data from being evicted when cache space runs low. |

Example

Cache the March 16, 2020 partition of db1.t1 to local disk:

jindo table -cache -t db1.t1 -p date=2020-03-16
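When several partitions need to be cached and pinned, the command can be generated in a loop. The sketch below prints one -cache command per partition value so the list can be reviewed before anything runs. The function name cache_cmds is hypothetical, and the sketch assumes a table partitioned by a single column.

```shell
# Hypothetical helper (not part of JindoTable): prints one pinned -cache command
# per partition value. $1 = table, $2 = partition column, remaining args = values.
cache_cmds() {
  local table=$1 col=$2
  shift 2
  local value
  for value in "$@"; do
    echo "jindo table -cache -t $table -p $col=$value -pin"
  done
}

# Review the generated commands first:
#   cache_cmds db1.t1 date 2020-03-16 2020-03-17
# Then run them by piping the output to a shell:
#   cache_cmds db1.t1 date 2020-03-16 2020-03-17 | sh
```

Printing instead of executing is a deliberate choice here: pinned data is not evicted when cache space runs low, so it is worth checking the list before committing local disk to it.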

-uncache

Remove the cached data of a table or partition from local disk.

To cache data, use -cache.

Syntax

jindo table -uncache -t <dbName.tableName> [-p <partitionSpec>]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -t <dbName.tableName> | Yes | None | The table whose cached data to remove. The data must be stored in OSS or JindoFS. |
| -p <partitionSpec> | No | Entire table | The partition whose cached data to remove. Format: partitionCol1=val1,partitionCol2=val2,... |

Examples

Remove cached data for the entire db1.t2 table:

jindo table -uncache -t db1.t2

Remove cached data for a specific partition of db1.t1:

jindo table -uncache -t db1.t1 -p date=2020-03-16,category=1

-archive

Lower the storage class of table or partition data. The default target is Archive storage class. To use Infrequent Access (IA) instead, add -i.

To restore archived data, use -unarchive. For SDK-mode archiving that does not rely on the Jindo Namespace Service, see -archiveTable.

Syntax

jindo table -archive [-a | -i] -t <dbName.tableName> [-p <partitionSpec>]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -t <dbName.tableName> | Yes | None | The table to archive. |
| -a | No | Archive storage class | When specified, explicitly archives data to Archive storage class (default behavior). |
| -i | No | Archive storage class | When specified, archives data to Infrequent Access (IA) storage class instead of Archive. |
| -p <partitionSpec> | No | Entire table | The partition to archive. Format: partitionCol1=val1,partitionCol2=val2,... |

Example

Archive the October 12, 2020 partition of db1.t1:

jindo table -archive -t db1.t1 -p date=2020-10-12

-unarchive

Restore archived table or partition data to a higher storage class.

  • No flag: Converts archived data to Standard storage class.

  • -o: Temporarily restores an archived object. The object remains in Archive storage class after the restore window expires.

  • -i: Converts an archived object to Infrequent Access (IA) storage class.

To archive data, use -archive.

Syntax

jindo table -unarchive [-o | -i] -t <dbName.tableName> [-p <partitionSpec>]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -t <dbName.tableName> | Yes | None | The table to unarchive. |
| -o | No | Standard | Temporarily restores an archived object without permanently changing its storage class. |
| -i | No | Standard | Converts archived data to Infrequent Access (IA) storage class. |
| -p <partitionSpec> | No | Entire table | The partition to unarchive. Format: partitionCol1=val1,partitionCol2=val2,... |

Examples

Temporarily restore a specific partition of db1.t1:

jindo table -unarchive -o -t db1.t1 -p date=2020-03-16,category=1

Convert all partitions of db1.t2 from Archive to Infrequent Access:

jindo table -unarchive -i -t db1.t2
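The three restore behaviors described for -unarchive differ only in which flag is passed. The following Bash sketch maps a descriptive mode name to the matching flag and prints the command. The function name unarchive_cmd and the mode names standard, temporary, and ia are hypothetical labels chosen for this illustration.

```shell
# Hypothetical helper (not part of JindoTable): prints an -unarchive command.
# $1 = mode (standard | temporary | ia), $2 = table, $3 = optional partition spec.
unarchive_cmd() {
  local mode=$1 table=$2 spec=${3:-}
  local flag
  case $mode in
    standard)  flag="" ;;      # no flag: convert to Standard storage class
    temporary) flag="-o" ;;    # temporary restore, object stays in Archive
    ia)        flag="-i" ;;    # convert to Infrequent Access
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  # ${var:+text} expands to text only when var is non-empty.
  echo "jindo table -unarchive${flag:+ $flag} -t $table${spec:+ -p $spec}"
}
```

For example, unarchive_cmd ia db1.t2 prints the second command shown above, and unarchive_cmd standard db1.t2 prints the same command without a flag.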

-status

View the data storage status of a table or partition.

Syntax

jindo table -status -t <dbName.tableName> [-p <partitionSpec>]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -t <dbName.tableName> | Yes | None | The table to inspect. |
| -p <partitionSpec> | No | Entire table | The partition to inspect. Format: partitionCol1=val1,partitionCol2=val2,... |

Examples

View the storage status of db1.t2:

jindo table -status -t db1.t2

View the storage status of the March 16, 2020 partition of db1.t1:

jindo table -status -t db1.t1 -p date=2020-03-16

-optimize

Optimize the data organization of a table at the storage layer, improving read efficiency for downstream queries.

Syntax

jindo table -optimize -t <dbName.tableName>

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -t <dbName.tableName> | Yes | None | The table to optimize. |

Example

Optimize the storage layout of db1.t1:

jindo table -optimize -t db1.t1

-showTable

For a partitioned table, list all partitions and their storage details. For a non-partitioned table, show storage details of the table itself.

Syntax

jindo table -showTable -t <dbName.tableName>

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -t <dbName.tableName> | Yes | None | The table to inspect. |

Example

List all partitions in db1.t1:

jindo table -showTable -t db1.t1

-showPartition

Show the storage details of a specific partition.

Syntax

jindo table -showPartition -t <dbName.tableName> [-p <partitionSpec>]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -t <dbName.tableName> | Yes | None | The partitioned table to inspect. |
| -p <partitionSpec> | No | None | The partition to inspect. Format: partitionCol1=val1,partitionCol2=val2,... |

Example

Show the storage details of the October 12, 2020 partition of db1.t1:

jindo table -showPartition -t db1.t1 -p date=2020-10-12

-listTables

List all tables in a database. If no database is specified, tables in the default database are listed.

Syntax

jindo table -listTables [-db <dbName>]

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -db <dbName> | No | Default database | The database to list tables from. |

Examples

List tables in the default database:

jindo table -listTables

List tables in db1:

jindo table -listTables -db db1

-dumpmc

Dump a MaxCompute table to an EMR cluster or OSS. Supported output formats are CSV and TFRECORD.

Important

Do not hardcode your AccessKey ID and AccessKey secret in commands. Use environment variables or a secure credential store instead.

Syntax

jindo table -dumpmc -i <accessId> -k <accessKey> -m <numMaps> -t <tunnelUrl> -project <projectName> -table <tableName> [-p <partitionSpec>] -f <csv|tfrecord> -o <outputPath>

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -i <accessId> | Yes | None | The AccessKey ID of your Alibaba Cloud account. |
| -k <accessKey> | Yes | None | The AccessKey secret of your Alibaba Cloud account. |
| -m <numMaps> | Yes | None | The number of map tasks. |
| -t <tunnelUrl> | Yes | None | The Tunnel endpoint of the virtual private cloud (VPC) where the MaxCompute project resides. |
| -project <projectName> | Yes | None | The name of the MaxCompute project. |
| -table <tableName> | Yes | None | The name of the MaxCompute table. |
| -p <partitionSpec> | No | All partitions | Partition filter. Example: pt=xxx. For multi-level partitions, separate the column=value pairs with commas: pt=xxx,dt=xxx. |
| -f <csv\|tfrecord> | Yes | None | Output file format. Valid values: csv, tfrecord. |
| -o <outputPath> | Yes | None | The output path. Use a local path (for example, /tmp/output) for an EMR cluster, or an OSS path (for example, oss://bucket/path) for OSS. |

Examples

Dump a MaxCompute table in TFRECORD format to an EMR cluster:

jindo table -dumpmc -m 10 -project mctest_project -table t1 \
  -t http://dt.xxx.maxcompute.aliyun-inc.com \
  -k xxxxxxxxx -i XXXXXX \
  -o /tmp/outputtf1 -f tfrecord

Dump a MaxCompute table in CSV format to OSS:

jindo table -dumpmc -m 10 -project mctest_project -table t1 \
  -t http://dt.xxx.maxcompute.aliyun-inc.com \
  -k xxxxxxxxx -i XXXXXX \
  -o oss://bucket1/tmp/outputcsv -f csv
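The examples above use placeholder values for -i and -k. In line with the Important note, the following Bash sketch reads the AccessKey pair from environment variables instead of literal values in the command. The variable names MC_ACCESS_KEY_ID and MC_ACCESS_KEY_SECRET, the JINDO override, and the function name dumpmc_from_env are all hypothetical; they are not defined by JindoTable.

```shell
# Hypothetical wrapper (not part of JindoTable).
# Assumes the AccessKey pair was exported beforehand, for example by a
# credential store, as MC_ACCESS_KEY_ID and MC_ACCESS_KEY_SECRET.
dumpmc_from_env() {
  if [ -z "${MC_ACCESS_KEY_ID:-}" ] || [ -z "${MC_ACCESS_KEY_SECRET:-}" ]; then
    echo "MC_ACCESS_KEY_ID and MC_ACCESS_KEY_SECRET must be set" >&2
    return 1
  fi
  # JINDO defaults to the jindo binary; set JINDO=echo for a dry run.
  "${JINDO:-jindo}" table -dumpmc \
    -i "$MC_ACCESS_KEY_ID" -k "$MC_ACCESS_KEY_SECRET" \
    "$@"
}

# Possible use, mirroring the CSV example above:
#   dumpmc_from_env -m 10 -project mctest_project -table t1 \
#     -t http://dt.xxx.maxcompute.aliyun-inc.com \
#     -o oss://bucket1/tmp/outputcsv -f csv
```

This keeps the secret out of scripts and shell history. The values are still passed to the process as command-line arguments, so they may be visible to other users on the same machine through process listings while the command runs.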