E-MapReduce: Collect JindoTable access frequency

Last Updated: Jun 24, 2026

Collecting access frequency for JindoTable tables or partitions helps you distinguish between cold and hot data, reduce storage costs, and improve cache efficiency.

Prerequisites

You need an Alibaba Cloud E-MapReduce cluster. For more information, see Create a cluster.

Background

JindoTable supports collecting access records for Hive tables. The collected data is stored in the namespace of the SmartData service.

SmartData 3.2.x and later can collect access frequency for the Spark, Hive, and Presto engines. Collection is enabled by default for Spark and Presto. To turn it off, see Disable access frequency collection. Collection is disabled by default for Hive. To turn it on, see Enable access frequency collection for Hive.

Query data

You can use a JindoTable command to query access frequency.

  • Syntax

    jindo table -accessStat <-d [days]> <-n [topNums]>

    The values for days and topNums must be positive integers. If you set days to 1, the command queries all access records from 00:00 (local time) on the current day to the current time.

  • Function

    Queries the top N most frequently accessed tables or partitions within a specified time range.

  • Example: Query the 20 most frequently accessed tables or partitions in the last seven days.

    jindo table -accessStat -d 7 -n 20
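
The positive-integer rule for days and topNums can be checked before the command runs. The following is a minimal sketch in POSIX shell. The helper names is_positive_int and access_stat_cmd are hypothetical and are not part of JindoTable. The sketch only prints the command line, so you can review it before running it on a cluster node, and it also rejects values with leading zeros such as 07, which is a choice made for this example.

```shell
# Hypothetical helpers (not part of JindoTable): validate the arguments,
# then print the jindo command so it can be reviewed before it is run.

is_positive_int() {
  # Reject empty strings, anything containing a non-digit, and values
  # that start with 0 (this also rejects 0 itself).
  case "$1" in
    ''|*[!0-9]*|0*) return 1 ;;
    *) return 0 ;;
  esac
}

access_stat_cmd() {
  # $1: days, $2: topNums
  if ! is_positive_int "$1" || ! is_positive_int "$2"; then
    echo "days and topNums must be positive integers" >&2
    return 1
  fi
  printf 'jindo table -accessStat -d %s -n %s\n' "$1" "$2"
}
```

For example, access_stat_cmd 7 20 prints jindo table -accessStat -d 7 -n 20, and access_stat_cmd 0 20 returns a non-zero status without printing a command.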

For more information about JindoTable, see the JindoTable User Guide.

Enable access frequency collection for Hive

  1. Log on to the Alibaba Cloud E-MapReduce console.

  2. In the top navigation bar, select your region and resource group.

  3. Click the Clusters tab.

  4. On the Clusters page, find your cluster and click Details in the Actions column.

  5. Modify the Hive service configuration.

    1. In the navigation pane on the left, choose Services > Hive.

    2. On the Hive service page, click the Configure tab.

    3. Search for the hive.exec.post.hooks parameter and append com.aliyun.emr.table.hive.HivePostHook to its value. If the value already contains other hooks, separate the entries with commas.

  6. Save the configuration.

    1. In the upper-right corner, click Save.

    2. In the Confirm dialog box, enter an Execution Reason and enable auto-update configuration.

    3. Click OK.

  7. Restart the service.

    1. On the Hive service page, choose Actions > Restart HiveServer2 in the upper-right corner.

    2. In the Execute Cluster Operation dialog box, enter an Execution Reason.

    3. Click OK.

    4. In the Confirm dialog box, click OK.
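
The value of hive.exec.post.hooks is a comma-separated list of class names, so appending the hook means adding one more entry without disturbing existing ones. The following sketch shows that string operation in POSIX shell. The function name append_hook is hypothetical; it only computes the new value for you to paste into the console and does not modify any cluster configuration.

```shell
# Hypothetical helper: compute the new value of a comma-separated
# hook list. It does not touch the cluster configuration.

append_hook() {
  # $1: current parameter value (may be empty)
  # $2: class name to append
  case ",$1," in
    *",$2,"*) printf '%s\n' "$1" ;;     # already listed: leave unchanged
    ",,")     printf '%s\n' "$2" ;;     # empty value: the hook is the only entry
    *)        printf '%s\n' "$1,$2" ;;  # otherwise append after a comma
  esac
}

# Example: a.B stands in for any hook that is already configured.
append_hook "a.B" "com.aliyun.emr.table.hive.HivePostHook"
```

Because the function checks for an existing entry first, applying it twice does not add the hook a second time.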

Disable access frequency collection

  1. Log on to the Alibaba Cloud E-MapReduce console.

  2. In the top navigation bar, select your region and resource group.

  3. Click the Clusters tab.

  4. On the Clusters page, find your cluster and click Details in the Actions column.

  5. Modify the parameter of each service for which you want to disable collection.

    • Hive service:

      1. In the navigation pane on the left, choose Services > Hive.

      2. On the Hive service page, click the Configure tab.

      3. In the Configuration Search box, enter hive.exec.post.hooks. The parameter is in the hive-site section. Remove com.aliyun.emr.table.hive.HivePostHook from its value.

    • Spark service:

      1. In the navigation pane on the left, choose Services > Spark.

      2. On the Spark service page, click the Configure tab.

      3. In the Configuration Search box, enter spark.sql.queryExecutionListeners. The parameter is in the spark-defaults section, and its current value is com.aliyun.emr.table.spark.SparkSQLQueryListener. Remove this class name from the value.

    • Presto service:

      1. In the navigation pane on the left, choose Services > Presto.

      2. On the Presto service page, click the Configure tab.

      3. Search for the event-listener.name parameter and clear its value.

  6. Save the configuration.

    1. In the upper-right corner, click Save.

    2. In the Confirm dialog box, enter an Execution Reason and enable auto-update configuration.

    3. Click OK.

  7. Restart the services.

    • Hive service:

      1. On the Hive service page, choose Actions > Restart HiveServer2 in the upper-right corner.

      2. In the Execute Cluster Operation dialog box, enter an Execution Reason.

      3. Click OK.

      4. In the Confirm dialog box, click OK.

    • Spark service:

      1. On the Spark service page, choose Actions > Restart ThriftServer in the upper-right corner.

      2. In the Execute Cluster Operation dialog box, enter an Execution Reason.

      3. Click OK.

      4. In the Confirm dialog box, click OK.

    • Presto service:

      1. On the Presto service page, choose Actions > Restart All Components in the upper-right corner.

      2. In the Execute Cluster Operation dialog box, enter an Execution Reason.

      3. Click OK.

      4. In the Confirm dialog box, click OK.
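
For Hive and Spark, disabling collection means removing one class name from a comma-separated parameter value while keeping any other entries. The following sketch shows that operation in POSIX shell. The function name remove_entry is hypothetical; it only computes the new value for you to enter in the console and does not change any cluster configuration.

```shell
# Hypothetical helper: drop one entry from a comma-separated list.
# The body runs in a subshell, so the IFS and globbing changes stay local.

remove_entry() (
  # $1: current parameter value, $2: entry to remove
  IFS=,
  set -f                      # no filename expansion while splitting
  result=""
  for item in $1; do
    if [ "$item" != "$2" ]; then
      result="${result:+$result,}$item"
    fi
  done
  printf '%s\n' "$result"
)

# Example: a.B stands in for any other listener that should be kept.
remove_entry "a.B,com.aliyun.emr.table.spark.SparkSQLQueryListener" \
  "com.aliyun.emr.table.spark.SparkSQLQueryListener"
```

If the class name is the only entry, the result is an empty value, which corresponds to clearing the parameter.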