Alibaba Cloud discovered a remote code execution (RCE) vulnerability in Apache Log4j 2 and reported it to the Apache Software Foundation. This topic describes which E-MapReduce (EMR) clusters are affected and how to apply the fix.
If your cluster runs EMR V3.38.3 or later, or EMR V5.4.3 or later, the vulnerability is already fixed. No action is required.
Affected versions and services
Affected EMR versions
| EMR version | Status | Action required |
|---|---|---|
| EMR V3.38.2 and earlier | Affected | Apply the fix |
| EMR V4.X | Affected | Apply the fix |
| EMR V5.4.2 and earlier | Affected | Apply the fix |
| EMR V3.38.3 and later | Fixed | None |
| EMR V5.4.3 and later | Fixed | None |
Affected services
The following services are affected: Hive, Presto, Impala, Druid, Flink, Solr, Ranger, Storm, Oozie, Spark, and Zeppelin. Spark and Zeppelin are affected through their dependency on Hive.
The fix replaces the Apache Log4j 2 JAR file on your cluster with version 2.17.2, and sets the log4j2.formatMsgNoLookups parameter to true for Hive and Spark to disable the Java Naming and Directory Interface (JNDI) lookup feature.
The fix script does not affect running services, but a service restart is required after it completes, so run it during off-peak hours.
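After the fix runs, the patched state can be spot-checked by listing the Log4j 2 core JARs under a service's lib directory. The sketch below is illustrative only: the demo directory, the placeholder JAR files, and the `log4j_status` helper are assumptions for demonstration, not part of the official patch.

```shell
#!/usr/bin/env bash
# Illustrative spot check: report whether each log4j-core jar in a lib
# directory matches the patched version 2.17.2. The demo directory and
# placeholder jars below are assumptions; on a real node you would point
# lib_dir at a service's actual lib directory instead.
log4j_status() {
  # PATCHED if the version string equals 2.17.2, OUTDATED otherwise.
  if [ "$1" = "2.17.2" ]; then echo "PATCHED"; else echo "OUTDATED"; fi
}

lib_dir="/tmp/demo-libs"
mkdir -p "$lib_dir"
touch "$lib_dir/log4j-core-2.14.1.jar" "$lib_dir/log4j-core-2.17.2.jar"  # demo jars

for jar in "$lib_dir"/log4j-core-*.jar; do
  ver="${jar##*log4j-core-}"   # strip everything up to the version number
  ver="${ver%.jar}"
  echo "$jar: $(log4j_status "$ver")"
done
```

In this demo the 2.14.1 jar is reported as OUTDATED and the 2.17.2 jar as PATCHED.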
Fix an existing EMR cluster
1. Download patches-log4j.tar.gz.
2. Log on to the master node of your cluster and place the package in the home directory of the `emr-user` or `hadoop` user.
3. Switch to the appropriate user and decompress the package. For DataLake, Dataflow, OLAP, DataServing, or custom clusters:

   ```shell
   su emr-user
   tar zxf patches-log4j.tar.gz
   ```

   For all other cluster types:

   ```shell
   su hadoop
   tar zxf patches-log4j.tar.gz
   ```

4. Open the `hosts` file in the `patches` directory and add the hostname of every node in the cluster, one per line:

   ```shell
   cd patches
   vim hosts
   ```

   Example:

   ```
   emr-header-1
   emr-worker-1
   emr-worker-2
   ```

   **Important:** For EMR V3.41 or a later minor version, or EMR V5.7.0 or a later minor version, node hostnames use a different format, for example:

   ```
   core-1-1
   core-1-2
   task-1-1
   task-1-2
   ```

5. Run the fix script.

   **Note:** For YARN jobs such as Spark Streaming or Flink jobs, stop the jobs and perform a rolling restart of YARN NodeManager before proceeding.

   ```shell
   ./fix.sh
   ```

   When the script completes successfully, the output ends with:

   ```
   ### NOTICE: YOU CAN RESTORE THIS PATCH BY RUN RESTORE SCRIPT ABOVE
   $> sh ./restore.sh 20211213001755
   ### DONE
   ```

   To roll back the patch, run:

   ```shell
   ./restore.sh 20211213001755
   ```

6. Restart the affected services: Hive, Hadoop Distributed File System (HDFS), Presto, Impala, Druid, Flink, Solr, Ranger, Storm, Oozie, Spark, and Zeppelin. To restart a service, go to the service page in the EMR console and choose More > Restart in the upper-right corner.
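On clusters with many nodes, filling in the `hosts` file by hand is tedious. The following sketch derives it from an `/etc/hosts`-style file instead; the demo input file and the hostname patterns are assumptions based on the two naming schemes shown in the steps above.

```shell
#!/usr/bin/env bash
# Sketch: build the patches/hosts file from an /etc/hosts-style file rather
# than typing node names by hand. The demo file below stands in for the real
# /etc/hosts, and the grep patterns assume the emr-*/core-*/task-* naming
# schemes described in the procedure.
src="/tmp/demo-etc-hosts"
cat > "$src" <<'EOF'
127.0.0.1   localhost
192.168.0.1 emr-header-1
192.168.0.2 emr-worker-1
192.168.0.3 emr-worker-2
EOF

mkdir -p patches
grep -oE 'emr-(header|worker)-[0-9]+|(core|task)-[0-9]+-[0-9]+' "$src" \
  | sort -u > patches/hosts
cat patches/hosts
```

On a real master node you would point `src` at `/etc/hosts` and review the result before running the fix script.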
Fix a gateway cluster
Gateway clusters do not support password-free SSH (Secure Shell) login, so the patch must be applied to each node manually.
For each node:
1. Upload the patch package to the node.
2. Follow the same steps as for a standard EMR cluster.

Two differences apply when working with gateway clusters:

- In the `hosts` file, enter only the hostname of the current node, not all cluster nodes.
- Gateway clusters have no services, so no service restart is needed after applying the patch.
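Even without password-free SSH, the per-node work can be scripted once credentials are arranged. The dry-run sketch below only prints the commands it would run; the node names and the `hadoop` user are placeholder assumptions.

```shell
#!/usr/bin/env bash
# Dry-run sketch for gateway nodes: print, per node, the upload and fix
# commands from the manual procedure. Note that `hostname > hosts` writes
# only the current node's name, matching the gateway-specific difference
# above. gw-node-1/gw-node-2 and the hadoop user are placeholders.
cmds=""
for node in gw-node-1 gw-node-2; do
  cmds="${cmds}scp patches-log4j.tar.gz hadoop@${node}:~/
ssh hadoop@${node} 'tar zxf patches-log4j.tar.gz && cd patches && hostname > hosts && ./fix.sh'
"
done
printf '%s' "$cmds"
```

Replacing the `echo`-style dry run with real `scp`/`ssh` calls would require interactive password entry on each node, since password-free login is unavailable.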
Fix when creating or scaling out a cluster
When creating a new cluster, add a bootstrap action to apply the patch automatically. When scaling out an existing cluster, the system applies the fix automatically; restart only the services on the newly added nodes afterward.
To configure the bootstrap action when creating a cluster:
1. Download patches-log4j.tar.gz and bootstrap_log4j.sh, then upload both to an Object Storage Service (OSS) path, for example `oss://<bucket-name>/path/to/`.
2. In the EMR console, add a bootstrap action. For more information, see Use bootstrap actions to execute scripts. Configure the bootstrap action with the following settings.

   | Parameter | Value |
   |---|---|
   | Name | A descriptive name, for example `fixlog4jvulnerability` |
   | Script Address | The OSS path of the script file: `oss://<bucket-name>/path/to/bootstrap_log4j.sh` |
   | Parameter | The OSS path of the patch package: `oss://<bucket-name>/path/to/patches-log4j.tar.gz` |
   | Execution Scope | Cluster |
   | Execution Time | After Component Startup |
   | Execution Failure Policy | Proceed |

3. After the cluster is created, restart the following services: HDFS, Hive, Presto, Impala, Druid, Flink, Solr, Ranger, Storm, Oozie, Spark, and Zeppelin.
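After the restart, you can confirm that the JNDI-lookup kill switch reached the service JVMs. The sketch below checks a sample command line; the classpath and main class in it are illustrative assumptions, and on a real node you would grep the `ps` output of the Hive or Spark processes instead.

```shell
#!/usr/bin/env bash
# Sketch: check a JVM command line for the log4j2.formatMsgNoLookups=true
# flag set by the fix. The sample cmdline (classpath, main class) is an
# illustrative assumption; on a real node, feed in `ps` output for the
# Hive or Spark JVMs.
check_flag() {
  if echo "$1" | grep -q 'log4j2\.formatMsgNoLookups=true'; then
    echo "disabled"
  else
    echo "MISSING"
  fi
}

cmdline='java -Dlog4j2.formatMsgNoLookups=true -cp /usr/lib/hive/lib/* org.apache.hive.service.server.HiveServer2'
echo "JNDI lookups: $(check_flag "$cmdline")"
```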
FAQ
Does the fix script affect services that are currently running?
No. The script does not impact running services. However, you must restart the affected services after the script completes, so run it during off-peak hours.
How do I roll back the patch?
Run the restore script that was output when you applied the fix:

```shell
./restore.sh 20211213001755
```
Do I need to apply the fix to gateway clusters differently?
Yes. Gateway clusters do not support password-free SSH login, so you must upload the patch package and run the fix on each node individually. You do not need to restart services because gateway clusters have none.