E-MapReduce: JindoSDK upgrade and rollback process for EMR clusters (X86)

Last Updated: Mar 26, 2026

This topic describes how to upgrade JindoSDK on E-MapReduce (EMR) clusters that use the X86 architecture. It covers upgrading a running cluster, applying the upgrade to newly added nodes or new clusters, and rolling back to the default version after a failed upgrade.

Supported versions: EMR V3.40.0 or later, EMR V5.6.0 or later.

Which scenario applies to you?

Scenario | When to use
Upgrade an existing cluster | Your cluster is running and you need to apply a patch or move to a newer JindoSDK version.
Scale out or create a new cluster | You're adding nodes to an existing cluster or creating a cluster from scratch.
Roll back to the default version | An upgrade caused issues and you need to revert to the default JindoSDK version.

Scenario 1: Upgrade an existing cluster

Use this approach when your cluster is running and you want to upgrade JindoSDK in place.

Important

If you're upgrading from JindoSDK 4.6.8 or earlier to 4.6.9 or later, or to any 6.x version, the default temporary job path used by JindoCommitter changes. To prevent data loss, complete both actions in sequence:

Before the upgrade — add one of the following configuration items:

  • fs.jdo.committer.allow.concurrent=false in Hadoop-Common > Configuration > core-site.xml

  • spark.hadoop.fs.jdo.committer.allow.concurrent=false in Spark > Configuration > spark-defaults.conf

After the upgrade completes on all nodes (including Gateway nodes) — set the same parameter back to true.
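
Whether the note above applies depends only on the source and target versions. The following is a minimal Bash sketch of that check, not part of the patch package: `version_le` and `needs_committer_guard` are hypothetical helper names, the comparison relies on GNU `sort -V`, and treating every target at or above 4.6.9 as affected is an assumption read from the rule above.

```shell
# Hypothetical helpers for illustration; they are not shipped with JindoSDK.

# Succeeds when version $1 <= version $2 (GNU sort -V ordering).
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Succeeds when upgrading from $1 to $2 crosses the 4.6.8 -> 4.6.9 boundary,
# that is, when the source is 4.6.8 or earlier and the target is 4.6.9 or later.
needs_committer_guard() {
  version_le "$1" "4.6.8" && version_le "4.6.9" "$2"
}
```

For example, `needs_committer_guard 4.6.8 6.8.2` succeeds, which means the parameter should be set to false before the upgrade and back to true afterwards.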

Step 1: Prepare the software package and upgrade script

Before downloading, decide which JindoSDK version you need:

  • Direct OSS or OSS-HDFS access only: Check your local Hadoop dependency version. If it's below 2.7, additional compatibility steps may be required.

  • Semi-managed services (JindoCache, JindoAuth, JindoFSx): Contact Alibaba Cloud EMR technical support to confirm version compatibility before proceeding.

  1. Log in to the master node. See Log on to a cluster.

  2. Download the patch package and decompress it:

    su - emr-user
    cd /home/emr-user/
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/resources/emr-taihao/jindosdk-patches.tar.gz
    tar zxf jindosdk-patches.tar.gz
  3. Download the JindoSDK package into the jindosdk-patches directory. The following example uses version 6.8.2:

    cd jindosdk-patches
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/release/6.8.2/jindosdk-6.8.2-linux.tar.gz
    ls -l

    The directory should contain:

    -rwxrwxr-x 1 emr-user emr-user      2439 May 01 00:00 apply_all.sh
    -rwxrwxr-x 1 emr-user emr-user      7315 May 01 00:00 apply.sh
    -rw-rw-r-- 1 emr-user emr-user        40 May 01 00:00 hosts
    -rw-r----- 1 emr-user emr-user xxxxxxxxx May 01 00:00 jindosdk-6.8.2-linux.tar.gz
    -rwxrwxr-x 1 emr-user emr-user      1112 May 01 00:00 revert_all.sh
    -rwxrwxr-x 1 emr-user emr-user      2042 May 01 00:00 revert.sh
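
Before moving on, it can help to confirm that every file in the listing above is present. A minimal Bash sketch follows; `check_patch_dir` is a hypothetical helper name, and the file list is taken from the listing above.

```shell
# Hypothetical helper: reports each missing file on stderr and fails
# if any expected file is absent from directory $1 for JindoSDK version $2.
check_patch_dir() {
  local dir="$1" version="$2" missing=0 f
  for f in apply_all.sh apply.sh hosts revert_all.sh revert.sh \
           "jindosdk-${version}-linux.tar.gz"; do
    if [ ! -e "${dir}/${f}" ]; then
      echo "missing: ${f}" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}
```

A call such as `check_patch_dir /home/emr-user/jindosdk-patches 6.8.2` returns a non-zero status if, for example, the JindoSDK package was downloaded into the wrong directory.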

Step 2: Configure node information

Populate the hosts file with all cluster node hostnames. Choose either method:

Automatic (recommended):

cat /usr/local/taihao-executor-all/data/cache/.cluster_context | jq --raw-output '.nodes[].hostname.alias[]' > hosts

If this command fails to populate the file, use the manual method instead.

Manual:

vim hosts

Add one hostname per line:

master-1-1
core-1-1
core-1-2
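
Because the automatic method can leave the hosts file empty, a quick sanity check before running the upgrade may be useful. The sketch below is an illustration only: `validate_hosts` is a hypothetical helper that accepts a file when it is non-empty and every line is a single token without whitespace.

```shell
# Hypothetical helper: succeeds when file $1 is non-empty and contains
# exactly one whitespace-free hostname per line (no blank lines).
validate_hosts() {
  local file="$1"
  # -s is true only for an existing file with a size greater than zero.
  [ -s "$file" ] || return 1
  # grep -q succeeds if it finds an empty line or a line containing whitespace.
  if grep -qE '^$|[[:space:]]' "$file"; then
    return 1
  fi
  return 0
}
```

One way to use it is `validate_hosts hosts || vim hosts`, which opens the editor for the manual method only when the generated file does not look right.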

Step 3: Run the upgrade

Run apply_all.sh with the target version:

./apply_all.sh 6.8.2

The script is complete when ### DONE appears in the output:

>>> updating ...  master-1-1
>>> updating ...  core-1-1
>>> updating ...  core-1-2
### DONE
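
If the script output is captured to a file, the completion marker and the per-node lines can be checked mechanically. The following Bash sketch assumes the output format shown above (one "updating ..." line per node followed by "### DONE"); `upgrade_finished` and `missing_nodes` are hypothetical helper names.

```shell
# Hypothetical helper: succeeds when the captured output in file $1
# contains the final "### DONE" line.
upgrade_finished() {
  grep -q '^### DONE' "$1"
}

# Hypothetical helper: prints every hostname from hosts file $1 that has
# no matching "updating ..." line in log file $2.
missing_nodes() {
  local hosts="$1" log="$2"
  awk '
    # First file (the log): remember the last field of each "updating ..." line.
    FILENAME == ARGV[1] { if ($0 ~ /updating \.\.\./) seen[$NF] = 1; next }
    # Second file (hosts): print non-blank entries that were never seen.
    NF && !($1 in seen) { print $1 }
  ' "$log" "$hosts"
}
```

For example, run `./apply_all.sh 6.8.2 2>&1 | tee apply.log`, then `upgrade_finished apply.log` and `missing_nodes hosts apply.log`. An empty result from `missing_nodes` suggests that every listed node was processed.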

Step 4: Update the plugin directory (OSS Ranger compatibility)

Skip this step if EMR OSS Ranger authentication is not enabled.

This step applies if OSS Ranger authentication is enabled and the cluster runs EMR-3.51.2 or earlier (V3 series) or EMR-5.17.2 or earlier (V5 series). In that case, do not choose a target JindoSDK version in the range [6.5.0, 6.7.2]; upgrade to 6.7.3 or later instead to avoid compatibility issues. Then update the plugin directory path:

  1. Open the Configuration page for the HADOOP-COMMON service and click the core-site.xml tab.

  2. Find and update fs.jdo.plugin.dir:

    Parameter | Before | After
    fs.jdo.plugin.dir | /opt/apps/RANGER/jindoauth-current/plugins | /opt/apps/JINDOSDK/jindosdk-current/plugins
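
After the change is saved, the effective value can be read back on a node with the standard Hadoop command `hdfs getconf -confKey`. The sketch below is an illustration only: `plugin_dir_is_updated` is a hypothetical helper, and it assumes that the `hdfs` client is on the PATH and that the node's client configuration already reflects the change made in the console.

```shell
# Hypothetical helper: succeeds when the client configuration on this node
# reports the new plugin directory for fs.jdo.plugin.dir.
plugin_dir_is_updated() {
  local expected="/opt/apps/JINDOSDK/jindosdk-current/plugins" actual
  # hdfs getconf -confKey prints the configured value of the given key.
  actual="$(hdfs getconf -confKey fs.jdo.plugin.dir 2>/dev/null)"
  [ "$actual" = "$expected" ]
}
```

If the check fails right after the console change, the configuration may not have been deployed to the node yet.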

Step 5: Handle special nodes

Gateway nodes created via EMR CLI

Gateway nodes created through the EMR console are already covered by the previous steps. If any Gateway nodes were created via EMR CLI, run the upgrade script on those nodes manually. For elastic nodes, run the upgrade script during node initialization so that the upgrade is complete before the nodes are put to use.

Trino, Presto, and Impala (versions below EMR-3.53.0 / EMR-5.19.0)

For clusters running EMR versions earlier than EMR-3.53.0 (V3 series) or earlier than EMR-5.19.0 (V5 series), JindoSDK is not automatically upgraded for Trino, Presto, or Impala. Manually replace the JindoSDK JAR files in the plugins path of each service with the target version, then restart those services.
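
The version rule above can be expressed as a small check. This is a sketch under stated assumptions: `needs_manual_engine_upgrade` is a hypothetical helper, and it assumes the EMR version is given as `EMR-<major>.<minor>.<patch>` or `<major>.<minor>.<patch>`.

```shell
# Hypothetical helper: succeeds when EMR version $1 (for example "EMR-3.52.0"
# or "5.18.1") is earlier than EMR-3.53.0 (V3) or EMR-5.19.0 (V5), which means
# the Trino, Presto, and Impala JAR files must be replaced manually.
needs_manual_engine_upgrade() {
  local v="${1#EMR-}" major minor
  major="${v%%.*}"          # text before the first dot
  minor="${v#*.}"           # text after the first dot
  minor="${minor%%.*}"      # keep only the minor component
  case "$major" in
    3) [ "$minor" -lt 53 ] ;;
    5) [ "$minor" -lt 19 ] ;;
    *) return 2 ;;          # other series are not covered by the rule above
  esac
}
```

For instance, `needs_manual_engine_upgrade EMR-3.52.0` succeeds, while `needs_manual_engine_upgrade EMR-5.19.0` does not.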

Step 6: Verify the upgrade

ls -l /opt/apps/JINDOSDK/jindosdk-current/lib

After a successful upgrade from 6.2.0 to 6.8.2, the symlinks in lib point to the new version:

lrwxrwxrwx 1 emr-user emr-user 64 Apr 12 11:08 jindo-core-6.2.0.jar -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/jindo-core-6.8.2.jar
lrwxrwxrwx 1 emr-user emr-user 82 Apr 12 11:08 jindo-core-linux-el7-aarch64-6.2.0.jar -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/jindo-core-linux-el7-aarch64-6.8.2.jar
lrwxrwxrwx 1 emr-user emr-user 63 Apr 12 11:08 jindo-sdk-6.2.0.jar -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/jindo-sdk-6.8.2.jar
lrwxrwxrwx 1 emr-user emr-user 50 Apr 12 11:08 native -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/native
lrwxrwxrwx 1 emr-user emr-user 57 Apr 12 11:08 site-packages -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/site-packages
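
Instead of reading the listing by eye, the symlink targets can be checked in a loop. A minimal Bash sketch follows; `verify_upgrade_links` is a hypothetical helper, and it assumes that link targets contain `jindosdk-<version>-linux` as in the listing above.

```shell
# Hypothetical helper: succeeds when directory $1 contains at least one symlink
# and every symlink in it points into jindosdk-$2-linux.
verify_upgrade_links() {
  local libdir="$1" version="$2" entry target found=0
  for entry in "$libdir"/*; do
    [ -L "$entry" ] || continue        # skip anything that is not a symlink
    found=1
    target="$(readlink "$entry")"
    case "$target" in
      */jindosdk-"${version}"-linux/*) ;;   # points at the target version
      *) echo "stale link: ${entry} -> ${target}" >&2
         return 1 ;;
    esac
  done
  [ "$found" -eq 1 ]
}
```

For example, `verify_upgrade_links /opt/apps/JINDOSDK/jindosdk-current/lib 6.8.2` fails if any link still points at another version or if the directory holds no symlinks at all.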

Step 7: Restart services

Restart the relevant services for the upgrade to take effect.

Service type | How to restart
Hive, Presto, Impala, Flink, Ranger, Spark, Zeppelin, and similar batch-oriented services | On the service page of the EMR cluster, choose More > Restart.
Spark Streaming and Flink jobs running on YARN | Wait for jobs to stop, then perform a rolling restart on YARN NodeManager.

Scenario 2: Scale out or create a new cluster

Use bootstrap actions to apply a JindoSDK upgrade automatically when new nodes join the cluster or when creating a new cluster. This ensures every node starts with the correct JindoSDK version without manual intervention.

Use the -gen option of bootstrap_jindosdk.sh when scaling out an existing cluster. Use -gen-full when creating a new cluster, because the full package includes all required dependencies.

Step 1: Prepare a bootstrap upgrade package

  1. Download the required files into a jindo-patch directory. The following example targets version 6.8.2:

    mkdir jindo-patch
    cd jindo-patch
    
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/resources/emr-taihao/jindosdk-patches.tar.gz
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/release/6.8.2/jindosdk-6.8.2-linux.tar.gz
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/resources/emr-taihao/bootstrap_jindosdk.sh
    
    ls -l

    The output shows:

    -rw-r----- 1 hadoop hadoop      xxxx May 01 00:00 bootstrap_jindosdk.sh
    -rw-r----- 1 hadoop hadoop xxxxxxxxx May 01 00:00 jindosdk-6.8.2-linux.tar.gz
    -rw-r----- 1 hadoop hadoop      xxxx May 01 00:00 jindosdk-patches.tar.gz
  2. Generate the upgrade package:

    • For scaling out an existing cluster:

      bash bootstrap_jindosdk.sh -gen 6.8.2
    • For creating a new cluster:

      bash bootstrap_jindosdk.sh -gen-full 6.8.2

    When the package is ready, the output shows:

    Generated patch at /home/emr-user/jindo-patch/jindosdk-bootstrap-patches.tar.gz
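
When the generation step is scripted, the choice between the two options can be made explicit. The sketch below is illustrative only: `gen_option` and the mode names `scale-out` and `new-cluster` are hypothetical, while `-gen` and `-gen-full` are the options described above.

```shell
# Hypothetical helper: maps a deployment mode to the generator option.
gen_option() {
  case "$1" in
    scale-out)   echo "-gen" ;;        # adding nodes to an existing cluster
    new-cluster) echo "-gen-full" ;;   # creating a cluster from scratch
    *) echo "unknown mode: $1" >&2
       return 1 ;;
  esac
}
```

A possible invocation is `bash bootstrap_jindosdk.sh "$(gen_option scale-out)" 6.8.2`.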

Step 2: Upload the package to OSS

Upload jindosdk-bootstrap-patches.tar.gz and bootstrap_jindosdk.sh to Object Storage Service (OSS). Use Hadoop commands, ossutil, OSS Browser, or the OSS console.

Example using Hadoop commands:

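# "hadoop dfs" is deprecated; "hadoop fs" works with any supported file system, including OSS.
hadoop fs -mkdir -p oss://<bucket-name>/path/to/patch/

# Change to the directory that contains the generated package.
cd /home/emr-user/jindo-patch/
hadoop fs -put jindosdk-bootstrap-patches.tar.gz oss://<bucket-name>/path/to/patch/
hadoop fs -put bootstrap_jindosdk.sh oss://<bucket-name>/path/to/patch/

hadoop fs -ls oss://<bucket-name>/path/to/patch/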

Expected output:

Found 2 items
-rw-rw-rw-   1       2634 2022-05-13 14:07 oss://<bucket-name>/.../bootstrap_jindosdk.sh
-rw-rw-rw-   1  597342992 2022-05-13 13:41 oss://<bucket-name>/.../jindosdk-bootstrap-patches.tar.gz
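
The presence of both objects can also be checked with an exit status rather than by reading the listing. The following Bash sketch uses the standard `hadoop fs -test -e` command; `patch_uploaded` is a hypothetical helper, and it assumes the `hadoop` client on the node can access the bucket.

```shell
# Hypothetical helper: succeeds when both bootstrap files exist under the
# OSS directory $1 (for example oss://<bucket-name>/path/to/patch/).
patch_uploaded() {
  local base="${1%/}" f    # drop one trailing slash, if present
  for f in bootstrap_jindosdk.sh jindosdk-bootstrap-patches.tar.gz; do
    # hadoop fs -test -e returns 0 only if the path exists.
    if ! hadoop fs -test -e "${base}/${f}"; then
      echo "not found: ${base}/${f}" >&2
      return 1
    fi
  done
}
```

Checking this before adding the bootstrap action helps avoid a bootstrap failure caused by a missing or misplaced file.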

Step 3: Add a bootstrap action

In the EMR console, add a bootstrap action with the following settings. For general guidance, see Manage bootstrap actions.

Parameter | Description | Example
Name | A name for this bootstrap action. | update_jindosdk
Script location | The OSS path to the script file, in oss://*/*.sh format. | oss://<bucket-name>/path/to/patch/bootstrap_jindosdk.sh
Parameter | Arguments passed to the script. | -bootstrap oss://<bucket-name>/path/to/patch/jindosdk-bootstrap-patches.tar.gz
Execution scope | Select Cluster. | Cluster
Execution time | Select After Component Startup. | After Component Startup
Failure policy | Select Proceed With Execution. | Proceed With Execution
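
The Script location and Parameter values both derive from the same OSS directory, so they can be generated together to avoid typos. This is a minimal sketch: `bootstrap_action_values` is a hypothetical helper, and the bucket name and directory are placeholders supplied by the caller.

```shell
# Hypothetical helper: prints the "Script location" and "Parameter" values
# for bucket $1 and directory $2.
bootstrap_action_values() {
  local bucket="$1" dir="${2#/}"   # drop one leading slash, if present
  dir="${dir%/}"                   # drop one trailing slash, if present
  echo "Script location: oss://${bucket}/${dir}/bootstrap_jindosdk.sh"
  echo "Parameter: -bootstrap oss://${bucket}/${dir}/jindosdk-bootstrap-patches.tar.gz"
}
```

For example, `bootstrap_action_values my-bucket path/to/patch` prints the two values to paste into the console form, where `my-bucket` stands for your own bucket name.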

Step 4: Handle special nodes

Gateway nodes created via EMR CLI

Gateway nodes created through the EMR console are already covered by the bootstrap action. If any Gateway nodes were created via EMR CLI, run the upgrade script on those nodes manually. For elastic nodes, run the upgrade script during node initialization so that the upgrade is complete before the nodes are put to use.

Trino, Presto, and Impala (versions below EMR-3.53.0 / EMR-5.19.0)

For clusters running EMR versions earlier than EMR-3.53.0 (V3 series) or earlier than EMR-5.19.0 (V5 series), JindoSDK is not automatically upgraded for Trino, Presto, or Impala. Manually replace the JindoSDK JAR files in the plugins path of each service with the target version, then restart those services.

Step 5: Restart services

After the bootstrap action completes, restart the relevant services:

  • New cluster: Restart Hive, Presto, Impala, Flink, Ranger, Spark, and Zeppelin.

  • Scale-out: Restart the relevant services on the new nodes only.

Scenario 3: Roll back to the default version

If a JindoSDK upgrade causes issues on a cluster running EMR V3.40.0 or later, or EMR V5.6.0 or later, use the rollback script to revert to the default version.

Step 1: Download the rollback script

  1. Log in to the master node. See Log on to a cluster.

  2. Download the patch package and decompress it:

    su - emr-user
    cd /home/emr-user/
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/resources/emr-taihao/jindosdk-patches.tar.gz
    tar zxf jindosdk-patches.tar.gz
    cd jindosdk-patches
    ls -l

    The directory contains:

    -rwxrwxr-x 1 emr-user emr-user      2439 May 01 00:00 apply_all.sh
    -rwxrwxr-x 1 emr-user emr-user      7315 May 01 00:00 apply.sh
    -rw-rw-r-- 1 emr-user emr-user        40 May 01 00:00 hosts
    -rwxrwxr-x 1 emr-user emr-user      1112 May 01 00:00 revert_all.sh
    -rwxrwxr-x 1 emr-user emr-user      2042 May 01 00:00 revert.sh

Step 2: Configure node information

Populate the hosts file with all cluster node hostnames. Choose either method:

Automatic (recommended):

cat /usr/local/taihao-executor-all/data/cache/.cluster_context | jq --raw-output '.nodes[].hostname.alias[]' > hosts

If this command fails to populate the file, use the manual method instead.

Manual:

vim hosts

Add one hostname per line:

master-1-1
core-1-1
core-1-2

Step 3: Run the rollback

./revert_all.sh

The script is complete when ### DONE appears:

>>> updating ...  master-1-1
>>> updating ...  core-1-1
>>> updating ...  core-1-2
### DONE

Step 4: Verify the rollback

ls -l /opt/apps/JINDOSDK/jindosdk-current/lib

After a successful rollback to 6.2.0, the files appear as regular files (not symlinks):

-rw-r--r-- 1 emr-user emr-user  1253740 Apr 24 17:40 jindo-core-6.2.0.jar
-rw-r--r-- 1 emr-user emr-user 13110547 Apr 24 17:40 jindo-core-linux-el7-aarch64-6.2.0.jar
-rw-r--r-- 1 emr-user emr-user  4432227 Apr 24 17:40 jindo-sdk-6.2.0.jar
drwxr-xr-x 2 emr-user emr-user     4096 Apr 24 17:40 native
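
The same verification can be scripted by confirming that no symlinks remain. The sketch below is an illustration under the assumption stated above, namely that a rolled-back lib directory contains regular JAR files; `verify_rollback_files` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when directory $1 contains at least one
# regular .jar file and none of its entries is a symlink.
verify_rollback_files() {
  local libdir="$1" entry jars=0
  for entry in "$libdir"/*; do
    if [ -L "$entry" ]; then
      echo "still a symlink: ${entry}" >&2
      return 1
    fi
    case "$entry" in
      *.jar) if [ -f "$entry" ]; then jars=1; fi ;;
    esac
  done
  [ "$jars" -eq 1 ]
}
```

For example, `verify_rollback_files /opt/apps/JINDOSDK/jindosdk-current/lib` fails if any entry is still a link to an upgraded version.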

Step 5: Handle special nodes

Gateway nodes created via EMR CLI

Gateway nodes created through the EMR console are already covered by the previous steps. If any Gateway nodes were created via EMR CLI, run the rollback script on those nodes manually. For elastic nodes, run the rollback script during node initialization so that the rollback is complete before the nodes are put to use.

Trino, Presto, and Impala (versions below EMR-3.53.0 / EMR-5.19.0)

For clusters running EMR versions earlier than EMR-3.53.0 (V3 series) or earlier than EMR-5.19.0 (V5 series), JindoSDK for Trino, Presto, and Impala is not rolled back automatically. Manually replace the JindoSDK JAR files in the plugins path of each service with the default version, then restart those services.

Step 6: Restart services

Restart the relevant services for the rollback to take effect.

Service type | How to restart
Hive, Presto, Impala, Flink, Ranger, Spark, Zeppelin, and similar batch-oriented services | On the service page of the EMR cluster, choose More > Restart.
Spark Streaming and Flink jobs running on YARN | Wait for jobs to stop, then perform a rolling restart on YARN NodeManager.