Data Lake Formation: Upgrade the EMR-HOOK component in an EMR gateway

Last Updated: Mar 25, 2026

Upgrade the EMR-HOOK component in an E-MapReduce (EMR) gateway to enable access frequency metrics on the Data Overview tab and data access frequency rules on the Lifecycle Management page in the Data Lake Formation (DLF) console.

Note

The upgrade does not affect running computing tasks. After the upgrade, newly started tasks pick up the changes automatically.

Before you begin

Review the following before running any commands.

JAR file names — Hive

| Hive version | EMR-5.10.1, EMR-3.44.1, and later | Earlier than EMR-3.44.1 |
| --- | --- | --- |
| Hive 2 | hive-hook-hive23.jar | hive-hook-<emrhook-version>-hive23.jar |
| Hive 3 | hive-hook-hive31.jar | hive-hook-<emrhook-version>-hive31.jar |

JAR file names — Spark

| Spark version | EMR-5.10.1, EMR-3.44.1, and later | Earlier than EMR-3.44.1 |
| --- | --- | --- |
| Spark 2 | spark-hook-spark24.jar | spark-hook-<emrhook-version>-spark24.jar |
| Spark 3 | spark-hook-spark30.jar | spark-hook-<emrhook-version>-spark30.jar |

For versions earlier than EMR-3.44.1, replace <emrhook-version> with the EMR-HOOK minor version for your EMR release. For example, EMR-3.43.1 ships EMR-HOOK version 1.1.4, so the Hive 2 JAR file is hive-hook-1.1.4-hive23.jar.
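The naming rule in the two tables above can be sketched as a small shell function. This is an illustration only: the function name `emrhook_jar_name` and its argument order are hypothetical and are not part of EMR or DLF.

```shell
# Build the EMR-HOOK JAR file name for a component.
# Hypothetical helper, not shipped with EMR.
# Usage: emrhook_jar_name <hive|spark> <engine-suffix> [emrhook-version]
emrhook_jar_name() {
    component=$1    # hive or spark
    engine=$2       # hive23, hive31, spark24, or spark30
    version=${3:-}  # leave empty for EMR-5.10.1, EMR-3.44.1, and later
    if [ -n "$version" ]; then
        printf '%s-hook-%s-%s.jar\n' "$component" "$version" "$engine"
    else
        printf '%s-hook-%s.jar\n' "$component" "$engine"
    fi
}

emrhook_jar_name hive hive23        # prints hive-hook-hive23.jar
emrhook_jar_name hive hive23 1.1.4  # prints hive-hook-1.1.4-hive23.jar
```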

Configuration value separators

| Component | Separator |
| --- | --- |
| Hive | Comma (,) |
| Spark | Colon (:) |
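Because the separator differs by component, it can help to build the new value with a helper that takes the separator as an argument and skips entries that are already listed. The following is a minimal sketch; `append_entry` is a hypothetical name, and the function only computes a string without editing any file.

```shell
# Append an entry to a delimited value unless it is already present.
# Hypothetical helper; it prints the new value and changes no files.
# Usage: append_entry <current-value> <separator> <entry>
append_entry() {
    current=$1
    sep=$2
    entry=$3
    case "$sep$current$sep" in
        *"$sep$entry$sep"*) printf '%s\n' "$current" ;;   # already listed
        "$sep$sep")         printf '%s\n' "$entry" ;;     # value was empty
        *)                  printf '%s%s%s\n' "$current" "$sep" "$entry" ;;
    esac
}

# Hive values use a comma, Spark values use a colon:
# append_entry "$hive_value"  , /opt/apps/EMRHOOK/emrhook-current/hive-hook-hive23.jar
# append_entry "$spark_value" : /opt/apps/EMRHOOK/emrhook-current/spark-hook-spark30.jar
```

Running the helper twice with the same entry returns the value unchanged, which avoids duplicate classpath entries if the procedure is repeated.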

Upgrade procedure

Select the procedure for your EMR version:

EMR-5.10.1, EMR-3.44.1, and later versions

Step 1: Upgrade the JAR files

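Log in to the gateway via SSH using an account with root privileges. Replace <region> with your region ID (for example, cn-hangzhou), then run the following commands to download and extract the latest EMR-HOOK JAR files and copy them to the active directory: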

sudo mkdir -p /opt/apps/EMRHOOK/upgrade/
sudo wget https://dlf-repo-<region>.oss-<region>-internal.aliyuncs.com/emrhook/latest/emrhook.tar.gz -P /opt/apps/EMRHOOK/upgrade
sudo tar -p -zxf /opt/apps/EMRHOOK/upgrade/emrhook.tar.gz -C /opt/apps/EMRHOOK/upgrade/
sudo cp -p /opt/apps/EMRHOOK/upgrade/emrhook/* /opt/apps/EMRHOOK/emrhook-current/
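After the copy, you may want to confirm that the JAR files you plan to reference are actually in the active directory. The following is a sketch of such a check; `missing_jars` is a hypothetical helper name, and the JAR names in the example depend on your Hive and Spark versions.

```shell
# Print each expected JAR that is missing from a directory.
# Returns non-zero if anything is missing. Hypothetical helper.
# Usage: missing_jars <dir> <jar-name>...
missing_jars() {
    dir=$1
    shift
    status=0
    for jar in "$@"; do
        if [ ! -f "$dir/$jar" ]; then
            printf 'missing: %s\n' "$dir/$jar"
            status=1
        fi
    done
    return "$status"
}

# Example for Hive 2 and Spark 3:
# missing_jars /opt/apps/EMRHOOK/emrhook-current \
#     hive-hook-hive23.jar spark-hook-spark30.jar
```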

Step 2: Update Hive configuration

Important

Use the correct <hive-jar> for your Hive version. See Before you begin for the file name.

| Configuration file | Configuration item | Value to append/add |
| --- | --- | --- |
| hive-site.xml (/etc/taihao-apps/hive-conf/hive-site.xml) | hive.aux.jars.path | Append ,/opt/apps/EMRHOOK/emrhook-current/<hive-jar> (separator: ,) |
| hive-site.xml | hive.exec.post.hooks | Add com.aliyun.emr.meta.hive.hook.LineageLoggerHook |
| hive-env.sh (/etc/taihao-apps/hive-conf/hive-env.sh) | hive_aux_jars_path | Append ,/opt/apps/EMRHOOK/emrhook-current/<hive-jar> (separator: ,) |
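Once the Hive files are edited, a quick text search can confirm that the JAR path and the hook class are both present. The sketch below assumes the files are edited directly at the paths listed in the table. `file_mentions_all` is a hypothetical helper, and it only checks that the strings occur somewhere in the file, not that they sit under the right configuration item.

```shell
# Report whether a file contains every given fixed string.
# Hypothetical helper; stops at the first string that is not found.
# Usage: file_mentions_all <file> <string>...
file_mentions_all() {
    file=$1
    shift
    for needle in "$@"; do
        if ! grep -qF -- "$needle" "$file"; then
            printf 'not found in %s: %s\n' "$file" "$needle"
            return 1
        fi
    done
}

# Example for Hive 2 on EMR-5.10.1, EMR-3.44.1, or later:
# file_mentions_all /etc/taihao-apps/hive-conf/hive-site.xml \
#     /opt/apps/EMRHOOK/emrhook-current/hive-hook-hive23.jar \
#     com.aliyun.emr.meta.hive.hook.LineageLoggerHook
```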

Step 3: Update Spark configuration

Important

Use the correct <spark-jar> for your Spark version. See Before you begin for the file name.

| Configuration file | Configuration item | Value to append/add |
| --- | --- | --- |
| spark-defaults.conf (/etc/taihao-apps/spark-conf/spark-defaults.conf) | spark.driver.extraClassPath | Append :/opt/apps/EMRHOOK/emrhook-current/<spark-jar> (separator: :) |
| spark-defaults.conf | spark.executor.extraClassPath | Append :/opt/apps/EMRHOOK/emrhook-current/<spark-jar> (separator: :) |
| spark-defaults.conf | spark.sql.queryExecutionListeners | Add com.aliyun.emr.meta.spark.listener.EMRQueryLogger |
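The two classpath changes can be previewed before you edit the file. The sketch below assumes that each line of spark-defaults.conf holds a key and a value separated by whitespace and that the value contains no spaces. `spark_conf_append` is a hypothetical helper that prints the updated content to standard output and leaves the file itself untouched. It handles only the colon-separated classpath items, not spark.sql.queryExecutionListeners.

```shell
# Print a copy of a spark-defaults.conf-style file with <entry> appended
# (colon-separated) to the value of <key>. If the key is absent, add it.
# Hypothetical helper; the input file is not modified.
# Usage: spark_conf_append <file> <key> <entry>
spark_conf_append() {
    awk -v key="$2" -v entry="$3" '
        $1 == key {
            found = 1
            sub(/[ \t]+$/, "")              # drop trailing whitespace
            if (NF < 2) {
                $0 = key " " entry          # key had no value yet
            } else if (index(":" $2 ":", ":" entry ":") == 0) {
                $0 = $0 ":" entry           # append only if not listed
            }
        }
        { print }
        END { if (!found) print key " " entry }
    ' "$1"
}

# Example for Spark 3 (review the output before replacing the file):
# spark_conf_append /etc/taihao-apps/spark-conf/spark-defaults.conf \
#     spark.driver.extraClassPath \
#     /opt/apps/EMRHOOK/emrhook-current/spark-hook-spark30.jar
```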

Versions earlier than EMR-3.44.1

Step 1: Upgrade the JAR files

  1. Log in to the gateway via SSH using an account with root privileges. Replace <region> with your region ID (for example, cn-hangzhou), then run the following script to download and extract the latest EMR-HOOK JAR files:

    sudo mkdir -p /opt/apps/EMRHOOK/upgrade/
    sudo wget https://dlf-repo-<region>.oss-<region>-internal.aliyuncs.com/emrhook/latest/emrhook.tar.gz -P /opt/apps/EMRHOOK/upgrade
    sudo tar -p -zxf /opt/apps/EMRHOOK/upgrade/emrhook.tar.gz -C /opt/apps/EMRHOOK/upgrade/
  2. Rename the extracted JAR files to match the EMR-HOOK minor version for your EMR release. The following example uses EMR-3.43.1, which ships EMR-HOOK version 1.1.4:

    cd /opt/apps/EMRHOOK/upgrade/emrhook
    sudo mv hive-hook-hive20.jar hive-hook-1.1.4-hive20.jar
    sudo mv hive-hook-hive23.jar hive-hook-1.1.4-hive23.jar
    sudo mv hive-hook-hive31.jar hive-hook-1.1.4-hive31.jar
    sudo mv spark-hook-spark24.jar spark-hook-1.1.4-spark24.jar
    sudo mv spark-hook-spark30.jar spark-hook-1.1.4-spark30.jar

  3. Copy the renamed JAR files to the active directory:

    sudo cp -p /opt/apps/EMRHOOK/upgrade/emrhook/* /opt/apps/EMRHOOK/emrhook-current/
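The rename in step 2 can also be written as a loop, so that the EMR-HOOK version is typed only once. The following is a sketch: `version_hook_jars` is a hypothetical helper, and it must be run as a user who can write to the directory (for example, from a root shell). Run it before the copy in step 3.

```shell
# Insert an EMR-HOOK version into every hook JAR name in a directory,
# for example hive-hook-hive23.jar -> hive-hook-1.1.4-hive23.jar.
# Hypothetical helper; files that already carry the version are skipped.
# Usage: version_hook_jars <dir> <emrhook-version>
version_hook_jars() {
    dir=$1
    version=$2
    for path in "$dir"/*-hook-*.jar; do
        [ -f "$path" ] || continue       # glob matched nothing
        name=${path##*/}                 # file name without directory
        prefix=${name%%-hook-*}          # hive or spark
        suffix=${name#*-hook-}           # for example hive23.jar
        case "$suffix" in
            "$version"-*) continue ;;    # already renamed
        esac
        mv -- "$path" "$dir/$prefix-hook-$version-$suffix"
    done
}

# Example for EMR-3.43.1, which ships EMR-HOOK version 1.1.4:
# version_hook_jars /opt/apps/EMRHOOK/upgrade/emrhook 1.1.4
```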

Step 2: Update Hive configuration

Important

Use the correct <hive-jar> for your Hive version. The file name includes the EMR-HOOK version. See Before you begin for the file name.

| Configuration file | Configuration item | Value to append/add |
| --- | --- | --- |
| hive-site.xml (/etc/taihao-apps/hive-conf/hive-site.xml) | hive.aux.jars.path | Append ,/opt/apps/EMRHOOK/emrhook-current/<hive-jar> (separator: ,) |
| hive-site.xml | hive.exec.post.hooks | Add com.aliyun.emr.meta.hive.hook.LineageLoggerHook |
| hive-env.sh (/etc/taihao-apps/hive-conf/hive-env.sh) | hive_aux_jars_path | Append ,/opt/apps/EMRHOOK/emrhook-current/<hive-jar> (separator: ,) |

Step 3: Update Spark configuration

Important

Use the correct <spark-jar> for your Spark version. The file name includes the EMR-HOOK version. See Before you begin for the file name.

| Configuration file | Configuration item | Value to append/add |
| --- | --- | --- |
| spark-defaults.conf (/etc/taihao-apps/spark-conf/spark-defaults.conf) | spark.driver.extraClassPath | Append :/opt/apps/EMRHOOK/emrhook-current/<spark-jar> (separator: :) |
| spark-defaults.conf | spark.executor.extraClassPath | Append :/opt/apps/EMRHOOK/emrhook-current/<spark-jar> (separator: :) |
| spark-defaults.conf | spark.sql.queryExecutionListeners | Add com.aliyun.emr.meta.spark.listener.EMRQueryLogger |