E-MapReduce: JindoSDK upgrade and rollback process of EMR cluster (X86)

Last Updated:Jun 24, 2025

This topic describes the detailed steps to upgrade JindoSDK for different scenarios in EMR clusters with X86 architecture.

Prerequisites

You have created an EMR cluster with X86 architecture. For more information, see Create a cluster.

Scenario 1: Upgrade JindoSDK in an existing cluster

If your cluster runs EMR V3.40.0 or a later minor version, or EMR V5.6.0 or a later minor version, and you encounter any of the issues described in Known issues of JindoData versions, or you need new features of JindoSDK, you can upgrade JindoSDK by performing the following operations.

Important

If you upgrade JindoSDK from 4.6.8 or earlier to 4.6.9 or later or to a version of 6.X series, the default temporary job path used by JindoCommitter is changed. To avoid data loss during the upgrade, log on to the Cluster Services page of the cluster before the upgrade and modify one of the following configurations:

  • Add the configuration item fs.jdo.committer.allow.concurrent=false to Hadoop-Common > Configuration > core-site.xml.

  • Add the configuration item spark.hadoop.fs.jdo.committer.allow.concurrent=false to Spark > Configuration > spark-defaults.conf.

After JindoSDK is upgraded on all nodes in your cluster, including Gateway nodes, set the preceding parameter to true.
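
The version condition above can be checked mechanically before an upgrade. The following is a minimal sketch and not part of the JindoSDK tooling: the function names version_le and needs_committer_guard are hypothetical, and the sketch assumes that a sort command with the -V (version sort) option, as in GNU coreutils, is available.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with JindoSDK).

# version_le A B: succeeds when version A is lower than or equal to version B.
# Assumes a sort command that supports -V (version ordering), such as GNU sort.
version_le() {
  [ "$1" = "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" ]
}

# needs_committer_guard OLD NEW: succeeds when the upgrade goes from 4.6.8 or
# earlier to 4.6.9 or later (which includes the 6.X series), that is, when
# fs.jdo.committer.allow.concurrent=false should be set before the upgrade.
needs_committer_guard() {
  version_le "$1" "4.6.8" && version_le "4.6.9" "$2"
}

if needs_committer_guard "4.6.8" "6.8.2"; then
  echo "Set fs.jdo.committer.allow.concurrent=false before upgrading."
fi
```

An upgrade from 6.2.0 to 6.8.2 does not match the condition, so the function fails for that pair and no message is printed.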

Step 1: Prepare the software package and upgrade script

Note

Determine the JindoSDK version that you want to upgrade to:

  • Direct access to OSS/OSS-HDFS:

    If you use JindoSDK only to directly access OSS or OSS-HDFS, check whether the local Hadoop dependency is an unusually old version (for example, earlier than 2.7). If it is, additional compatibility processing may be required.

  • Using semi-managed services:

    If you are using semi-managed services such as JindoCache, JindoAuth, or JindoFSx, we recommend that you contact the Alibaba Cloud EMR technical support team to confirm the compatibility of the JindoSDK version to ensure a smooth upgrade process.

  1. Log on to the master node of the EMR cluster. For more information, see Log on to a cluster.

  2. Download the patch package to the home directory of the user emr-user and decompress the package.

    su - emr-user
    cd /home/emr-user/
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/resources/emr-taihao/jindosdk-patches.tar.gz
    tar zxf jindosdk-patches.tar.gz
  3. Download the software package jindosdk-{VERSION}.tar.gz of JindoSDK to the jindosdk-patches directory that you obtained in the previous step.

    In this example, JindoSDK is upgraded to version 6.8.2.

    cd jindosdk-patches
    
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/release/6.8.2/jindosdk-6.8.2-linux.tar.gz
    
    ls -l

    Sample content of the jindosdk-patches directory:

    -rwxrwxr-x 1 emr-user emr-user      2439 May 01 00:00 apply_all.sh
    -rwxrwxr-x 1 emr-user emr-user      7315 May 01 00:00 apply.sh
    -rw-rw-r-- 1 emr-user emr-user        40 May 01 00:00 hosts
    -rw-r----- 1 emr-user emr-user xxxxxxxxx May 01 00:00 jindosdk-6.8.2-linux.tar.gz
    -rwxrwxr-x 1 emr-user emr-user      1112 May 01 00:00 revert_all.sh
    -rwxrwxr-x 1 emr-user emr-user      2042 May 01 00:00 revert.sh
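
To download a version other than 6.8.2, the URL can be built from the version number. The following is a minimal sketch: the helper name jindosdk_url is hypothetical, and it assumes that other releases follow the same path layout as the 6.8.2 example above, which you should confirm before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL for a given JindoSDK version.
# Assumption: other versions use the same path layout as the 6.8.2 example.
jindosdk_url() {
  version="$1"
  echo "https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/release/${version}/jindosdk-${version}-linux.tar.gz"
}

# Print the URL; pass it to wget from inside the jindosdk-patches directory.
jindosdk_url "6.8.2"
```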

Step 2: Configure node information for the upgrade

  • Manual configuration

    1. Edit the hosts file in the package.

      vim hosts
    2. Add the hostnames, such as master-1-1 and core-1-1, of all nodes in the cluster to the hosts file. Enter one hostname in each line.

      Sample file content:

      master-1-1
      core-1-1
      core-1-2
  • Automatic configuration

    You can also run the following command to write the hostnames of all nodes to the hosts file. If the command fails to generate the hosts file, create the file manually.

    cat  /usr/local/taihao-executor-all/data/cache/.cluster_context | jq --raw-output '.nodes[].hostname.alias[]' > hosts
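
Whichever method you use, it can help to check the hosts file before running the upgrade. The following is a minimal sketch under the assumption that the file should contain exactly one hostname per line; the helper name clean_hosts is hypothetical and is not part of the patch package.

```shell
#!/bin/sh
# Hypothetical helper: normalize a hosts file before running the upgrade.
# Prints one hostname per line with whitespace, blank lines, and duplicates
# removed; fails if no hostname remains.
clean_hosts() {
  hosts_file="$1"
  cleaned="$(sed 's/[[:space:]]//g' "$hosts_file" 2>/dev/null | awk 'NF && !seen[$0]++')"
  if [ -z "$cleaned" ]; then
    echo "no hostnames found in ${hosts_file}" >&2
    return 1
  fi
  printf '%s\n' "$cleaned"
}

# Example with the sample hostnames from this step, plus a blank line and a
# duplicate entry with trailing whitespace.
sample="$(mktemp)"
printf 'master-1-1\ncore-1-1\n\ncore-1-1 \ncore-1-2\n' > "$sample"
clean_hosts "$sample"
rm -f "$sample"
```

The example prints master-1-1, core-1-1, and core-1-2, one per line. The original order is preserved.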

Step 3: Perform the upgrade

Run the apply_all.sh script to upgrade JindoSDK to a specific version.

./apply_all.sh $NEW_JINDOSDK_VERSION  # Run the apply_all.sh script with the specified $NEW_JINDOSDK_VERSION to upgrade to that version of JindoSDK.

For example, to upgrade JindoSDK in the cluster to version 6.8.2:

./apply_all.sh 6.8.2

When ### DONE appears in the returned information, the script execution is complete.

>> updating ...  master-1-1
>>> updating ...  core-1-1
>>> updating ...  core-1-2
### DONE
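
If you run the upgrade from automation, the completion marker can be checked programmatically. The following is a minimal sketch: the wrapper name run_and_expect_done is hypothetical, and it only looks for a line that starts with ### DONE. It does not inspect the exit status of the script, and the output is shown only after the command finishes.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, echo its output, and succeed only if a
# line starting with "### DONE" appears. The command's own exit status is not
# inspected; only the marker is checked.
run_and_expect_done() {
  output="$("$@" 2>&1)"
  printf '%s\n' "$output"
  printf '%s\n' "$output" | grep -q '^### DONE'
}

# On the cluster this would be, for example:
#   run_and_expect_done ./apply_all.sh 6.8.2 || echo "upgrade did not finish" >&2
# Stand-in command so that the sketch runs anywhere:
run_and_expect_done printf '>>> updating ...  master-1-1\n### DONE\n'
```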

Step 4: Modify cluster configuration (for compatibility with older versions of EMR OSS Ranger authentication)

If EMR OSS Ranger authentication is enabled and you upgrade JindoSDK from a version earlier than or equal to EMR-3.51.2/EMR-5.17.2 to a version in the range of [6.5.0, 6.7.2], compatibility issues may occur. We recommend that you upgrade JindoSDK to version 6.7.3 or later and modify the cluster configuration by performing the following operations:

  1. On the Configuration page of the HADOOP-COMMON service, click the core-site.xml tab.

  2. On the core-site.xml tab, search for and modify the following configuration item:

    | Parameter | Description |
    | --- | --- |
    | fs.jdo.plugin.dir | Change the plugin loading directory to the plugin path under the new version of JindoSDK, that is, change /opt/apps/RANGER/jindoauth-current/plugins to /opt/apps/JINDOSDK/jindosdk-current/plugins. |

Step 5: Special node handling

  1. Upgrade Gateway nodes created through EMR CLI.

    • If the Gateway nodes are created through the EMR console, the preceding steps already cover the related content.

    • If the Gateway nodes are created through EMR CLI, they are created independently of the cluster, so you must manually run the upgrade script on them. In addition, run the script when elastic nodes are initialized so that the upgrade is completed before the nodes are put into use.

  2. Replace JindoSDK for services such as Trino, Presto, and Impala.

    For EMR versions earlier than EMR-3.53.0 or earlier than EMR-5.19.0 (exclusive), the preceding steps do not automatically upgrade the JindoSDK used by services such as Trino, Presto, and Impala. Manually replace the JindoSDK JAR packages in the plugins path of these services with the target version, and restart the services for the changes to take effect.

Step 6: Verify the upgrade

Run the following command to list the JindoSDK library directory:

ls -l /opt/apps/JINDOSDK/jindosdk-current/lib

If you successfully upgrade JindoSDK from the default version 6.2.0 to version 6.8.2, the following information is returned:

lrwxrwxrwx 1 emr-user emr-user 64 Apr 12 11:08 jindo-core-6.2.0.jar -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/jindo-core-6.8.2.jar
lrwxrwxrwx 1 emr-user emr-user 82 Apr 12 11:08 jindo-core-linux-el7-aarch64-6.2.0.jar -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/jindo-core-linux-el7-aarch64-6.8.2.jar
lrwxrwxrwx 1 emr-user emr-user 63 Apr 12 11:08 jindo-sdk-6.2.0.jar -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/jindo-sdk-6.8.2.jar
lrwxrwxrwx 1 emr-user emr-user 50 Apr 12 11:08 native -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/native
lrwxrwxrwx 1 emr-user emr-user 57 Apr 12 11:08 site-packages -> /opt/apps/JINDOSDK/jindosdk-6.8.2-linux/lib/site-packages
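
The listing above can also be checked with a script instead of by eye. The following is a minimal sketch: the function name check_jindosdk_links is hypothetical, and it assumes, based on the sample output, that a successful upgrade leaves symbolic links whose targets contain a directory named jindosdk-<version>-linux.

```shell
#!/bin/sh
# Hypothetical check: every symbolic link in LIB_DIR must point into a
# directory named jindosdk-<VERSION>-*. Fails if a link points elsewhere or
# if the directory contains no symbolic links at all.
check_jindosdk_links() {
  lib_dir="$1"
  version="$2"
  found=0
  status=0
  for entry in "$lib_dir"/*; do
    [ -L "$entry" ] || continue
    found=1
    target="$(readlink "$entry")"
    case "$target" in
      */jindosdk-"$version"-*) ;;
      *) echo "unexpected target: ${entry} -> ${target}" >&2; status=1 ;;
    esac
  done
  [ "$found" -eq 1 ] || { echo "no symbolic links in ${lib_dir}" >&2; return 1; }
  return "$status"
}

# Intended use on a cluster node:
#   check_jindosdk_links /opt/apps/JINDOSDK/jindosdk-current/lib 6.8.2
```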

Step 7: Restart services after the upgrade

Note

For jobs that run on YARN, such as Spark Streaming or Flink jobs, perform a rolling restart on YARN NodeManager after the jobs stop.

Restart related services, such as Hive, Presto, Impala, Flink, Ranger, Spark, and Zeppelin, for the upgrade to take effect.

For example, on the Hive service page of the EMR cluster, choose More > Restart in the upper-right corner.

Scenario 2: Scale out an existing cluster or create a new cluster

If you want to upgrade JindoSDK when you create a cluster or scale out an existing cluster, you can add a bootstrap action in the EMR console. The bootstrap action upgrades JindoSDK on the nodes as they are created. Perform the following operations:

Step 1: Prepare a bootstrap upgrade package

  1. Run the following commands to download the jindosdk-patches.tar.gz and jindosdk-{VERSION}-{PLATFORM}.tar.gz packages and the bootstrap_jindosdk.sh script:

    In this example, JindoSDK is upgraded to version 6.8.2.

    mkdir jindo-patch
    
    cd jindo-patch
    
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/resources/emr-taihao/jindosdk-patches.tar.gz
    
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/release/6.8.2/jindosdk-6.8.2-linux.tar.gz
    
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/resources/emr-taihao/bootstrap_jindosdk.sh
    
    ls -l

    The following information is returned:

    -rw-r----- 1 hadoop hadoop      xxxx May 01 00:00 bootstrap_jindosdk.sh
    -rw-r----- 1 hadoop hadoop xxxxxxxxx May 01 00:00 jindosdk-6.8.2-linux.tar.gz
    -rw-r----- 1 hadoop hadoop      xxxx May 01 00:00 jindosdk-patches.tar.gz
  2. Run the following command to prepare an upgrade package:

    bash bootstrap_jindosdk.sh -gen $NEW_JINDOSDK_VERSION  # Generate an upgrade package for the JindoSDK version specified by $NEW_JINDOSDK_VERSION.

    Note

    • For scaling out an existing cluster, use the -gen option to generate a lightweight upgrade package.

    • For creating a new cluster, use the -gen-full option to generate an upgrade package with complete content.

    For example, to upgrade JindoSDK to version 6.8.2:

    bash bootstrap_jindosdk.sh -gen 6.8.2

    After you prepare the upgrade package, the following information is returned:

    Generated patch at /home/emr-user/jindo-patch/jindosdk-bootstrap-patches.tar.gz

    The preparation is complete, and the patch package jindosdk-bootstrap-patches.tar.gz is generated.
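
The choice between the two options in the note above can be made explicit in a script. The following is a minimal sketch: the helper name gen_option and the scenario names scale-out and new-cluster are hypothetical, and it assumes that -gen-full takes the version argument in the same way as -gen.

```shell
#!/bin/sh
# Hypothetical helper: map a scenario name to the bootstrap_jindosdk.sh option
# described in the note above. The scenario names are illustrative.
gen_option() {
  case "$1" in
    scale-out)   echo "-gen" ;;       # lightweight package for an existing cluster
    new-cluster) echo "-gen-full" ;;  # complete package for a new cluster
    *) echo "unknown scenario: $1" >&2; return 1 ;;
  esac
}

# Print the command line for a new cluster and version 6.8.2.
echo "bash bootstrap_jindosdk.sh $(gen_option new-cluster) 6.8.2"
```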

Step 2: Upload the bootstrap upgrade package

Upload the patch package and bootstrap script to Object Storage Service (OSS). You can upload the patch package and the script for EMR clusters by running Hadoop commands, by using OSSUtils or OSS Browser, or in the OSS console.

For example, upload the files to the OSS paths oss://<bucket-name>/path/to/patch/bootstrap_jindosdk.sh and oss://<bucket-name>/path/to/patch/jindosdk-bootstrap-patches.tar.gz.

hadoop fs -mkdir -p oss://<bucket-name>/path/to/patch/

cd /home/emr-user/jindo-patch/
hadoop fs -put jindosdk-bootstrap-patches.tar.gz oss://<bucket-name>/path/to/patch/
hadoop fs -put bootstrap_jindosdk.sh oss://<bucket-name>/path/to/patch/

hadoop fs -ls oss://<bucket-name>/path/to/patch/

The following information is returned:

Found 2 items
-rw-rw-rw-   1       2634 2022-05-13 14:07 oss://<bucket-name>/.../bootstrap_jindosdk.sh
-rw-rw-rw-   1  597342992 2022-05-13 13:41 oss://<bucket-name>/.../jindosdk-bootstrap-patches.tar.gz

Step 3: Add a bootstrap action

Add a bootstrap action in the EMR console. For more information, see Manage bootstrap actions.

The following table describes the parameters that you can configure to add a bootstrap action.

| Parameter | Description | Example |
| --- | --- | --- |
| Name | The name of the bootstrap action that you want to add. | update_jindosdk |
| Script Location | The OSS path where the script file is located. The script path must be in the oss://**/*.sh format. | oss://<bucket-name>/path/to/patch/bootstrap_jindosdk.sh |
| Parameter | The parameter of the bootstrap action script. The parameter specifies the value of the variable that is referenced in the script. | -bootstrap oss://<bucket-name>/path/to/patch/jindosdk-bootstrap-patches.tar.gz |
| Execution Scope | Select Cluster. | Cluster |
| Execution Time | Select After Component Startup. | After Component Startup |
| Failure Policy | Select Proceed With Execution. | Proceed with execution |
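
The Script Location and Parameter values both depend on the bucket and directory used for the upload, so it can be convenient to derive them together. The following is a minimal sketch: the helper name bootstrap_action_values and the bucket name examplebucket are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the Script Location and Parameter values for the
# bootstrap action, given a bucket name and the OSS directory used for upload.
bootstrap_action_values() {
  bucket="$1"
  dir="$2"   # for example: path/to/patch
  if [ -z "$bucket" ] || [ -z "$dir" ]; then
    echo "bucket and directory are required" >&2
    return 1
  fi
  echo "Script Location: oss://${bucket}/${dir}/bootstrap_jindosdk.sh"
  echo "Parameter: -bootstrap oss://${bucket}/${dir}/jindosdk-bootstrap-patches.tar.gz"
}

# examplebucket is a placeholder; replace it with your own bucket name.
bootstrap_action_values "examplebucket" "path/to/patch"
```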

Step 4: Special node handling

  1. Upgrade Gateway nodes created through EMR CLI.

    • If the Gateway nodes are created through the EMR console, the preceding steps already cover the related content.

    • If the Gateway nodes are created through EMR CLI, they are created independently of the cluster, so you must manually run the upgrade script on them. In addition, run the script when elastic nodes are initialized so that the upgrade is completed before the nodes are put into use.

  2. Replace JindoSDK for services such as Trino, Presto, and Impala.

    For EMR versions earlier than EMR-3.53.0 or earlier than EMR-5.19.0 (exclusive), the preceding steps do not automatically upgrade the JindoSDK used by services such as Trino, Presto, and Impala. Manually replace the JindoSDK JAR packages in the plugins path of these services with the latest version, and restart the services for the changes to take effect.

Step 5: Restart services

Restart related services for the upgrade to take effect.

  • After you create a cluster, restart the related services, such as Hive, Presto, Impala, Flink, Ranger, Spark, and Zeppelin.

  • After you scale out an existing cluster, restart the related services for the new nodes, such as Hive, Presto, Impala, Flink, Ranger, Spark, and Zeppelin.

Scenario 3: Roll back JindoSDK to the default version

For clusters of EMR V3.40.0 or a later minor version, or clusters of EMR V5.6.0 or a later minor version, if you encounter issues during the upgrade of JindoSDK, you can perform the following operations to roll back JindoSDK to the default version:

Step 1: Prepare a rollback script

  1. Log on to the master node of the EMR cluster. For more information, see Log on to a cluster.

  2. Download the patch package to the home directory of the user emr-user and decompress the package.

    su - emr-user
    cd /home/emr-user/
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/resources/emr-taihao/jindosdk-patches.tar.gz
    tar zxf jindosdk-patches.tar.gz
    cd jindosdk-patches
    ls -l

    The following information is returned:

    -rwxrwxr-x 1 emr-user emr-user      2439 May 01 00:00 apply_all.sh
    -rwxrwxr-x 1 emr-user emr-user      7315 May 01 00:00 apply.sh
    -rw-rw-r-- 1 emr-user emr-user        40 May 01 00:00 hosts
    -rwxrwxr-x 1 emr-user emr-user      1112 May 01 00:00 revert_all.sh
    -rwxrwxr-x 1 emr-user emr-user      2042 May 01 00:00 revert.sh

Step 2: Configure node information for the rollback

  • Manual configuration

    1. Edit the hosts file in the package.

      vim hosts
    2. Add the hostnames, such as master-1-1 and core-1-1, of all nodes in the cluster to the hosts file. Enter one hostname in each line.

      Sample file content:

      master-1-1
      core-1-1
      core-1-2
  • Automatic configuration

    You can also run the following command to write the hostnames of all nodes to the hosts file. If the command fails to generate the hosts file, create the file manually.

    cat  /usr/local/taihao-executor-all/data/cache/.cluster_context | jq --raw-output '.nodes[].hostname.alias[]' > hosts

Step 3: Perform a rollback

Run the following script to roll back all changes:

./revert_all.sh

When ### DONE appears in the returned information, the script execution is complete.

>> updating ...  master-1-1
>>> updating ...  core-1-1
>>> updating ...  core-1-2
### DONE

Step 4: Verify the rollback

Run the following command to list the JindoSDK library directory:

ls -l /opt/apps/JINDOSDK/jindosdk-current/lib

If you successfully roll JindoSDK back to 6.2.0, the following information is returned:

-rw-r--r-- 1 emr-user emr-user  1253740 Apr 24 17:40 jindo-core-6.2.0.jar
-rw-r--r-- 1 emr-user emr-user 13110547 Apr 24 17:40 jindo-core-linux-el7-aarch64-6.2.0.jar
-rw-r--r-- 1 emr-user emr-user  4432227 Apr 24 17:40 jindo-sdk-6.2.0.jar
drwxr-xr-x 2 emr-user emr-user     4096 Apr 24 17:40 native
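
In the sample outputs in this topic, an upgraded directory contains symbolic links and a rolled-back directory contains regular files. Under the assumption that this holds for your cluster, the rollback can be checked with a script. The following is a minimal sketch, and the function name check_rolled_back is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: after a rollback, LIB_DIR is expected to contain no
# symbolic links (an upgrade leaves links into the upgraded version's
# directory). Fails on the first symbolic link found.
check_rolled_back() {
  lib_dir="$1"
  for entry in "$lib_dir"/*; do
    if [ -L "$entry" ]; then
      echo "still a symbolic link: ${entry} -> $(readlink "$entry")" >&2
      return 1
    fi
  done
  return 0
}

# Intended use on a cluster node:
#   check_rolled_back /opt/apps/JINDOSDK/jindosdk-current/lib
```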

Step 5: Special node handling

  1. Roll back Gateway nodes created through EMR CLI.

    • If the Gateway nodes are created through the EMR console, the preceding steps already cover the related content.

    • If the Gateway nodes are created through EMR CLI, they are created independently of the cluster, so you must manually run the rollback script on them. In addition, run the script when elastic nodes are initialized so that the rollback is completed before the nodes are put into use.

  2. Replace JindoSDK for services such as Trino, Presto, and Impala.

    For EMR versions earlier than EMR-3.53.0 or earlier than EMR-5.19.0 (exclusive), the preceding steps do not automatically roll back the JindoSDK used by services such as Trino, Presto, and Impala. Manually replace the JindoSDK JAR packages in the plugins path of these services with the default version, and restart the services for the changes to take effect.

Step 6: Restart services

Note

For jobs that run on YARN, such as Spark Streaming or Flink jobs, perform a rolling restart on YARN NodeManager after the jobs stop.

Restart related services, such as Hive, Presto, Impala, Flink, Ranger, Spark, and Zeppelin, for the rollback to take effect.

For example, on the Hive service page of the EMR cluster, choose More > Restart in the upper-right corner.