EMR 3.x series release notes

This topic describes the release dates and update details for the EMR 3.x series. For more information about the components that are supported in each version, see Release versions.

EMR 3.55.x

Release date

Version	Date
EMR-3.55.0	October 27, 2025

Update details

Service	Change
Ranger	Jindoauth Server supports custom RAM Roles for client users to access OSS. Fixed a missing dependency in Ranger-yarn-plugin.
Paimon	Upgraded to version 1-ali-16.3.
JindoCache	Upgraded to version 6.10.1.

Release version information

DataLake cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
Hive	2.3.9
Spark2	2.4.8
Spark3	3.4.2
YARN	2.8.5
Trino	422
DeltaLake	3.0.0
Hudi	0.15.0
Iceberg	1.5.0
Flume	1.11.0
Kyuubi	1.9.2
Tez	0.10.2
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Sqoop	1.4.7
DLF-Auth	2.0.2
Presto	0.283
Zookeeper	3.8.4
Knox	1.5.0
Celeborn	0.5.2
JindoCache	6.10.1
Paimon	1-ali-16.3

OLAP cluster

Service	Version
StarRocks2	2.5.22
StarRocks3	3.2.11
Doris	2.1.4
ClickHouse	23.8.2.7
Zookeeper	3.8.4

DataFlow cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
YARN	2.8.5
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Zookeeper	3.8.4
Knox	1.5.0
Flink	1.17.2
Paimon	1-ali-6.2

DataServing cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Zookeeper	3.8.4
Knox	1.5.0
HBase	1.7.1
JindoCache	6.8.2
Phoenix	4.16.1

Custom cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
Hive	2.3.9
Spark2	2.4.8
Spark3	3.4.2
YARN	2.8.5
Trino	422
DeltaLake	3.0.0
Hudi	0.15.0
Iceberg	1.5.0
Flume	1.11.0
Kyuubi	1.9.2
Tez	0.10.2
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Sqoop	1.4.7
DLF-Auth	2.0.2
Presto	0.283
StarRocks2	2.5.22
StarRocks3	3.2.11
Zookeeper	3.8.4
Knox	1.5.0
Celeborn	0.5.2
Flink	1.17.2
HBase	1.7.1
JindoCache	6.10.1
Paimon	1-ali-16.3
Phoenix	4.16.1

EMR 3.54.x

Release date

Version	Date
EMR-3.54.0	July 10, 2025

Update details

Service	Change
Hive	Fixed some known bugs.
Tez	Fixed community bugs to improve performance and stability.

Release version information

DataLake cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
Hive	2.3.9
Spark2	2.4.8
Spark3	3.4.2
YARN	2.8.5
Trino	422
DeltaLake	3.0.0
Hudi	0.15.0
Iceberg	1.5.0
Flume	1.11.0
Kyuubi	1.9.2
Tez	0.10.2
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Sqoop	1.4.7
DLF-Auth	2.0.2
Presto	0.283
Zookeeper	3.8.4
Knox	1.5.0
Celeborn	0.5.2
JindoCache	6.8.2
Paimon	1-ali-6.2

OLAP cluster

Service	Version
StarRocks2	2.5.22
StarRocks3	3.2.11
Doris	2.1.4
ClickHouse	23.8.2.7
Zookeeper	3.8.4

DataFlow cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
YARN	2.8.5
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Zookeeper	3.8.4
Knox	1.5.0
Flink	1.17.2
Paimon	1-ali-6.2

DataServing cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Zookeeper	3.8.4
Knox	1.5.0
HBase	1.7.1
JindoCache	6.8.2
Phoenix	4.16.1

Custom cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
Hive	2.3.9
Spark2	2.4.8
Spark3	3.4.2
YARN	2.8.5
Trino	422
DeltaLake	3.0.0
Hudi	0.15.0
Iceberg	1.5.0
Flume	1.11.0
Kyuubi	1.9.2
Tez	0.10.2
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Sqoop	1.4.7
DLF-Auth	2.0.2
Presto	0.283
StarRocks2	2.5.22
StarRocks3	3.2.11
Zookeeper	3.8.4
Knox	1.5.0
Celeborn	0.5.2
Flink	1.17.2
HBase	1.7.1
JindoCache	6.8.2
Paimon	1-ali-6.2
Phoenix	4.16.1
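
Comparing the version tables of two releases by hand is error-prone. The following is a minimal sketch, not an official tool, that parses two Service/Version tables copied from this page as tab-separated text and reports the rows that differ. The function names `parse_versions` and `diff_versions` are illustrative, and only a few rows of the EMR-3.54.0 and EMR-3.55.0 DataLake cluster tables are reproduced here.

```python
def parse_versions(table_text):
    """Parse tab-separated 'Service<TAB>Version' rows into a dict."""
    versions = {}
    for line in table_text.strip().splitlines():
        service, _, version = line.partition("\t")
        # Skip the header row and any row without a version cell.
        if version and service != "Service":
            versions[service.strip()] = version.strip()
    return versions


def diff_versions(old, new):
    """Return {service: (old_version, new_version)} for rows that differ."""
    changed = {}
    for service in sorted(set(old) | set(new)):
        before, after = old.get(service), new.get(service)
        if before != after:
            changed[service] = (before, after)
    return changed


# Excerpts of the DataLake cluster tables (only a few rows are copied).
emr_3_54 = "Service\tVersion\nHive\t2.3.9\nJindoCache\t6.8.2\nPaimon\t1-ali-6.2\n"
emr_3_55 = "Service\tVersion\nHive\t2.3.9\nJindoCache\t6.10.1\nPaimon\t1-ali-16.3\n"

changes = diff_versions(parse_versions(emr_3_54), parse_versions(emr_3_55))
for service, (before, after) in changes.items():
    print(f"{service}: {before} -> {after}")
```

For these excerpts, the reported differences are the JindoCache and Paimon rows, which matches the update details listed for EMR-3.55.0.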

EMR 3.53.x

Release date

Version	Date
EMR-3.53.0	April 24, 2025

Update details

Service	Change
Trino	Fixed an issue where LDAP was unavailable.
YARN	Fixed open source bugs (YARN-10213, YARN-6207, and YARN-9339).
StarRocks	Supports the creation of clusters with separated storage and compute resources.
JindoCache	Upgraded to version 6.8.2.
EMRHOOK	Enhanced stability.

Release version information

DataLake cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
Hive	2.3.9
Spark2	2.4.8
Spark3	3.4.2
YARN	2.8.5
Trino	422
DeltaLake	3.0.0
Hudi	0.15.0
Iceberg	1.5.0
Flume	1.11.0
Kyuubi	1.9.2
Tez	0.10.2
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Sqoop	1.4.7
DLF-Auth	2.0.2
Presto	0.283
Zookeeper	3.8.4
Knox	1.5.0
Celeborn	0.5.2
JindoCache	6.8.2
Paimon	1-ali-6.2

OLAP cluster

Service	Version
StarRocks2	2.5.22
StarRocks3	3.2.11
Doris	2.1.4
ClickHouse	23.8.2.7
Zookeeper	3.8.4

DataFlow cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
YARN	2.8.5
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Zookeeper	3.8.4
Knox	1.5.0
Flink	1.17.2
Paimon	1-ali-6.2

DataServing cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Zookeeper	3.8.4
Knox	1.5.0
HBase	1.7.1
JindoCache	6.8.2
Phoenix	4.16.1

Custom cluster

Service	Version
Hadoop-Common	2.8.5
HDFS	2.8.5
OSS-HDFS	1.0.0
Hive	2.3.9
Spark2	2.4.8
Spark3	3.4.2
YARN	2.8.5
Trino	422
DeltaLake	3.0.0
Hudi	0.15.0
Iceberg	1.5.0
Flume	1.11.0
Kyuubi	1.9.2
Tez	0.10.2
OpenLDAP	2.4.46
Ranger	2.3.0
Ranger-plugin	1.0.0
Sqoop	1.4.7
DLF-Auth	2.0.2
Presto	0.283
StarRocks2	2.5.22
StarRocks3	3.2.11
Zookeeper	3.8.4
Knox	1.5.0
Celeborn	0.5.2
Flink	1.17.2
HBase	1.7.1
JindoCache	6.8.2
Paimon	1-ali-6.2
Phoenix	4.16.1

EMR 3.52.x

Release date

Version	Date
EMR-3.52.1	December 18, 2024
EMR-3.52.0 (New purchases are not supported)	December 4, 2024

Update details

Service	Change
Spark	Fixed a configuration issue that occurred during scale-out. Fixed an issue where SASL connections occasionally failed in Kerberos clusters.
Hive	Fixed a configuration issue that occurred during scale-out.
Trino, Presto	Resolved an issue where connections failed after LDAP was enabled.
Zookeeper	Supports adding custom configurations.
Ranger	Replaced the existing Spark 3 Ranger plugin with the version provided by the open source Kyuubi project.
Hudi	Upgraded to version 0.15.0.
Celeborn	Upgraded to version 0.5.2.
JindoCache	Upgraded to version 6.5.3.
StarRocks3	Upgraded to version 3.2.11.
Kyuubi	Upgraded to version 1.9.2.
StarRocks2	Upgraded to version 2.5.22.
Impala, Kudu, Kafka, Kafka-Manager	The service is unavailable. You can use the recommended service as an alternative or manually install the corresponding service. You can replace Impala with Presto, Trino, ClickHouse, or StarRocks.

EMR 3.51.x

Release date

Version	Date
EMR-3.51.4	December 18, 2024
EMR-3.51.3 (New purchases are not supported)	November 29, 2024
EMR-3.51.2 (New purchases are not supported)	August 29, 2024
EMR-3.51.1 (New purchases are not supported)	June 21, 2024
EMR-3.51.0 (New purchases are not supported)	April 23, 2024
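
Each row in the release date tables combines a version, an optional "(New purchases are not supported)" marker, and a date. The following is a minimal sketch, assuming the rows are copied from this page as tab-separated text, that parses them and picks the newest version that can still be purchased. The names `parse_release_row` and `latest_purchasable` are illustrative. Rows that give only a month and year, such as those in the 3.38.x table, do not match the date format used here and would need separate handling.

```python
import re
from datetime import datetime

# Version, optional purchase marker, then a tab and the date.
ROW = re.compile(
    r"^(EMR-(\d+)\.(\d+)\.(\d+))"
    r"( \(New purchases are not supported\))?"
    r"\t(.+)$"
)


def parse_release_row(row):
    """Parse one 'Version<TAB>Date' row into a dict."""
    match = ROW.match(row)
    if match is None:
        raise ValueError(f"unrecognized row: {row!r}")
    return {
        "version": match.group(1),
        # Numeric tuple so that versions sort correctly (3.51.10 > 3.51.9).
        "key": tuple(int(match.group(i)) for i in (2, 3, 4)),
        "purchasable": match.group(5) is None,
        "date": datetime.strptime(match.group(6), "%B %d, %Y").date(),
    }


def latest_purchasable(rows):
    """Return the highest-numbered release without the purchase marker."""
    candidates = [r for r in map(parse_release_row, rows) if r["purchasable"]]
    return max(candidates, key=lambda r: r["key"]) if candidates else None


rows = [
    "EMR-3.51.4\tDecember 18, 2024",
    "EMR-3.51.3 (New purchases are not supported)\tNovember 29, 2024",
    "EMR-3.51.0 (New purchases are not supported)\tApril 23, 2024",
]
newest = latest_purchasable(rows)
print(newest["version"], newest["date"].isoformat())
```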

Update details

EMR-3.51.4

Service	Change
JindoCache	Upgraded to version 6.5.3.
StarRocks2	Upgraded to version 2.5.22.
StarRocks3	Upgraded to version 3.2.11.

EMR-3.51.3

Service	Change
JindoSDK	Upgraded JindoSDK to resolve a deadlock issue.

EMR-3.51.2

Service	Change
JindoCache	Upgraded to version 6.5.1. Improved the performance of reading data from and writing data to distributed hash tables.
Spark	Fixed an issue where partition directories could not be deleted. Fixed a Hive package dependency issue so that the connection between Spark and the Metastore client remains uninterrupted.
Trino	Fixed an issue where some modified configurations could be unexpectedly restored to their original values during a scale-out. Supports querying data in the OSS-HDFS service that is deployed in a high-security cluster. Fixed an issue where exceptions occurred on Trino after DLF-Auth was enabled.
Presto	Supports querying data in the OSS-HDFS service that is deployed in a high-security cluster.
HDFS, HBase-HDFS	Fixed an issue where the memory size of NameNodes and DataNodes could not be modified.
YARN	The ResourceManager can send multiple timeline events at a time, which improves processing capability. Fixed a logic issue in how the ResourceManager processes containers and resources.
ZooKeeper	Fixed an issue where the memory configuration of a node group could not be modified. Log configuration files can be reconstructed.
Impala	Fixed an issue where client configurations were unexpectedly modified during an auto scaling activity.
Ranger	Supports the latest version of JindoSDK, which effectively reduces the CPU load.
Knox	Fixed an issue where the Knox URL could not be accessed when a cluster had only one Master-Extend node group.
Kafka	Fixed an issue where an EMR cluster in which Kafka Connect was deployed failed to start.
StarRocks	Fixed an issue where added BE nodes were not displayed after a scale-out.
Doris	Upgraded to version 2.1.4.
Paimon	Upgraded to version 0.9-ali-7.
EMR-HOOK	Supports parsing the lineage information of MaxCompute tables.

EMR-3.51.1

Service	Change
Spark, Hive, Kyuubi	Supports deploying the Master-Extend node group.
Paimon	Replaced the Flink dependency from the VVR version to the community version and added support for DLF Catalog.
Knox	Packaged using JDK 8.
Flink	Restored the DLF configurations and dependencies that were removed in EMR-3.51.0.

EMR-3.51.0

Service	Change
Spark	Upgraded Spark3 to version 3.4.2.
Celeborn	Upgraded to version 0.4.0.
Doris	Upgraded to version 2.1.0.
StarRocks	Upgraded StarRocks2 to version 2.5.18. Upgraded StarRocks3 to version 3.2.4.
DeltaLake	Upgraded to version 3.0.0.
Iceberg	Upgraded to version 1.5.0.
Zookeeper	Upgraded to version 3.8.4.
JindoCache	Upgraded to version 6.2.5.
Flink	Upgraded to version 1.17.2.

EMR 3.50.x

Release date

Version	Date
EMR-3.50.0	February 19, 2024

Update details

Service	Change
Hudi	Upgraded to version 0.14.0.
Flume	Upgraded to version 1.11.0.
Kyuubi	Upgraded to version 1.7.3.
Impala	Upgraded to version 4.3.0.
Celeborn	Upgraded to version 0.3.2.
JindoCache	Upgraded to version 6.2.0.
Paimon	Upgraded to version 0.7-ali-1.
Kafka	Upgraded to version 3.6.1. Fixed a SASL security authentication vulnerability in the Kafka Connect component.
Spark	Fixed the Commons Text vulnerability.
StarRocks	Upgraded StarRocks2 to version 2.5.13. Upgraded StarRocks3 to version 3.1.5.
Ranger	Fixed the Commons Text vulnerability. Fixed the Spring Security path matching permission bypass vulnerability. Fixed the Spring Security forward/include authentication bypass vulnerability. Fixed the Spring Framework identity authentication bypass vulnerability under a special matching pattern. Supports modifying the period for Ranger to synchronize LDAP users.

EMR 3.49.x

Release date

Version	Date
EMR-3.49.1	November 16, 2023
EMR-3.49.0 (New purchases are not supported)	October 27, 2023

Update details

Service	Change
JindoCache	Added the component. The version is 6.1.1.
JindoData	JindoData is unavailable. You can use JindoCache for data caching and DLF-Auth for authentication.
Spark	Removed `jdo`-related configurations from hive-site.xml.
HBase	Added a configuration item. You can select the HBase Thrift Server version, including v1 and v2, as needed.
StarRocks	Upgraded StarRocks2 to version 2.5.10.
Doris	Upgraded Doris to version 1.2.7.
Celeborn	Upgraded Celeborn to version 0.3.1.
Paimon	Upgraded Paimon to version 0.6-ali-2.
ClickHouse	Upgraded ClickHouse to version 23.8.2.7.

EMR 3.48.x

Release date

Version	Date
EMR-3.48.2	August 17, 2023

Update details

Service	Change
Trino	Fixed an issue where the Paimon connector could not successfully query HDFS tables. Fixed an issue where worker monitoring metrics could not be read.
Presto	Upgraded to version 0.283. Fixed an issue where worker monitoring metrics could not be read.
ClickHouse	Granted all permissions to the default user by default.
StarRocks	Renamed the previous StarRocks to StarRocks2. Added StarRocks3, version 3.1.2. By default, it is created as a storage-compute coupled version. Storage-compute separated versions are not supported.
Celeborn	Upgraded to version 0.3.0.

EMR 3.47.x

Release date

Version	Date
EMR-3.47.0	August 3, 2023

Update details

Service	Change
Hudi	Upgraded to version 0.13.1.
Paimon	Upgraded to version 0.5-ali-1.
StarRocks	Upgraded to version 2.5.8.
JindoData	Upgraded to version 4.6.11.
Trino	Upgraded to version 422. The Hudi connector supports querying Merge On Read (MOR) tables. Optimized error messages for dynamic UDF loading.

EMR 3.46.x

Release date

Version	Date
EMR-3.46.1	July 13, 2023
EMR-3.46.0 (New purchases are not supported)	June 1, 2023

Update details

EMR-3.46.1

Service	Description
Spark	By default, OSS-HDFS is used to store data of Spark History Server. OSS or OSS-HDFS is used to store data of Spark3 Native Engine.
Hive	By default, OSS-HDFS is used to store data in Hive warehouse files.
OSS-HDFS	The OSS-HDFS service is added.
YARN	By default, OSS-HDFS is used to store data.
HBase	By default, OSS-HDFS is used to store HBase data in the HFile format. OSS-HDFS is used to store HBase write-ahead logs (WALs).

EMR-3.46.0

Service	Change
Kyuubi	Upgraded to version 1.7.1.
Celeborn	Upgraded to version 0.2.2.
Paimon	Renamed Flink-Table-Store to Paimon. Upgraded to version 0.4-ali-1.
StarRocks	Upgraded to version 2.5.5.
Doris	Upgraded to version 1.2.4.
ClickHouse	Upgraded to version 22.8.17.17.
Trino	Provided a simple Event Listener by default to obtain audit logs.
Phoenix	Supports Hive on Phoenix.

EMR 3.45.x

Release date

Version	Date
EMR-3.45.1	April 3, 2023
EMR-3.45.0 (New purchases are not supported)	February 28, 2023

Update details

EMR-3.45.1

Service	Change
ClickHouse	Upgraded to version 22.8.14.53.
Trino	Added the odps.properties connector, which lets you query MaxCompute data.
JindoData	Upgraded to version 4.6.5.
JindoSDK	Upgraded to version 4.6.5.
Flink-Table-Store	Upgraded to version 0.3-ali-2.
YARN	Supports the Node Labels feature.

EMR-3.45.0

Service	Change
Iceberg	Upgraded to version 1.1.0.
Hudi	Upgraded to version 0.12.2. Supports the CDC feature.
Kudu	Upgraded to version 1.16.0.
ClickHouse	Upgraded to version 22.3.8.39. The ZooKeeper service must be selected when you install the ClickHouse service.
Celeborn	Renamed RSS to Celeborn. The version of Celeborn is 0.2.0.
Presto	Added the service. The kernel is community Facebook PrestoDB 0.278.3. The default HTTP port is 8889, and the HTTPS port is 7779.
DeltaLake	Upgraded to version 2.2.0.
StarRocks	Upgraded to version 2.4.3.
Doris	Upgraded to version 1.2.1.
Kafka-Manager	Upgraded to version 3.0.0.6.
Impala	The service is offline.
OpenLDAP	Upgraded to version 2.4.46.
Kyuubi	Upgraded to version 1.6.1.
Ranger	Upgraded to version 2.3.0.
HBase	Supports ThriftServer2. The default value of the hbase.block.data.cachecompressed parameter is changed to true.
Flink-Table-Store	Added the service, based on community version 0.3.
JindoData	Upgraded to version 4.6.4.
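
The HBase row above changes the default value of hbase.block.data.cachecompressed to true. If a workload needs the earlier behavior, the parameter can be set explicitly. The sketch below renders such an override as a Hadoop-style property element of the kind found in hbase-site.xml. It assumes that the earlier default was false and that the value is overridden in hbase-site.xml; on EMR, parameters are normally changed on the service configuration page in the console rather than by editing the file directly. The helper name `hbase_property` is illustrative.

```python
import xml.etree.ElementTree as ET


def hbase_property(name, value):
    """Render one <property> element in Hadoop-style configuration XML."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(prop, encoding="unicode")


# Assumed override: restore the value that is presumed to be the old default.
override = hbase_property("hbase.block.data.cachecompressed", "false")
print(override)
```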

EMR 3.44.x

Release date

EMR-3.44.0 was released on December 1, 2022.

Update details

Service	Change
Iceberg	Upgraded to version 0.14.1.
Flink	Upgraded to Flink1.15-vvr-6.0.2, which corresponds to the community Flink 1.15 major version.
Kafka	Supports LDAP user logon authentication and authorization. Supports user group authorization.
Trino	EMR Presto was renamed to its official community name, Trino. Supports Ranger and DLF-Auth. Fixed an issue where connections to worker nodes failed after LDAP was enabled with a single click.
JindoSDK	Upgraded to version 4.6.2.
JindoData	Upgraded to version 4.6.2.
HBase	Supports Ranger. Fixed an issue where OSS-HDFS could not be selected as the storage mode when adding a service.
YARN	ACLs are enabled by default in high-security mode.
StarRocks	Upgraded to version 2.3.4.
Doris	Upgraded to version 1.1.5.
Hudi	The console supports configuring hudi-defaults.conf.
Ranger	Supports integration with Trino, YARN, HBase, and Kafka.
DLF-Auth	Upgraded to version 2.0.2. Supports Trino and Impala.
OpenLDAP	Integrated with the Nslcd component.
Kudu	Kudu Tserver can no longer be installed in the Task node group.
Spark	Upgraded to version 3.3.1.
Tez	Upgraded to version 0.10.2.
Kyuubi	Upgraded to version 1.6.0.

EMR 3.43.x

Release date

Version	Date
EMR-3.43.1	November 8, 2022
EMR-3.43.0 (New purchases are not supported)	October 14, 2022

Update details

EMR-3.43.1

Service	Change
Kerberos	Supports connecting to an external KDC on EMR.
Kafka	Supports adding a startup command configuration item to customize service startup parameters.
JindoData	Upgraded to version 4.6.0. Supports rewriting OSS-HDFS access paths.
Flink	Upgraded to version 1.13_vvr_4.0.15.
RSS	Upgraded to version 0.1.4.

EMR-3.43.0

Service	Change
Spark	Upgraded to version 3.3. Supports enabling Kerberos identity authentication.
Hudi	Upgraded to version 0.12.0. Supports Spark 3.3. Supports using a cloud MetaStore to host metadata and enabling the acceleration feature. For more information, see Hudi MetaStore usage guide.
Flink	Supports enabling Kerberos identity authentication. Supports automatic connection with Data Lake Formation (DLF).
Iceberg	Upgraded to version 0.14.0. Supports Spark 3.3. Supports enabling Kerberos identity authentication.
JindoData	Upgraded to version 4.5.1. Supports accessing Alibaba Cloud resources without plaintext AccessKeys.
Hadoop-Common and HDFS	Supports enabling Kerberos identity authentication. Fixed security vulnerability CVE-2022-25168.
Knox	Integrated with Ranger. The Ranger UI can be accessed from the Access Links And Ports tab.
HBase	Upgraded to version 1.7.1. Supports enabling Kerberos identity authentication. Supports group-based configuration.
RSS	Upgraded to version 0.1.2. Supports enabling Kerberos identity authentication.
Doris	Upgraded to version 1.1.2. Supports enabling Kerberos identity authentication.
StarRocks	Upgraded to version 2.2.6. Supports enabling Kerberos identity authentication.
Kafka	Upgraded to version 2.13_3.2.1. Supports enabling Kerberos identity authentication.
DeltaLake	Upgraded to version 2.1.0. Supports Spark 3.3. Supports enabling Kerberos identity authentication.
Kudu	Added the component. The version is 1.14.0.
Impala	Supports creating views in DLF. Supports enabling Kerberos identity authentication.
YARN, Impala, Ranger, Hive, Kyuubi, Tez, Kafka, Zookeeper, DLF-Auth, Phoenix, Sqoop, Presto	Supports enabling Kerberos identity authentication.

EMR 3.42.x

Release date

EMR-3.42.0 was released on August 5, 2022.

Update details

Service	Change
Hive	Supports one-click integration with LDAP.
Presto	Upgraded to community version 389. Uses the standalone Delta Lake and Hudi connectors provided by the community. This version of the Delta Lake connector does not support Time Travel and Z-Order. This version of the Hudi connector does not support querying MOR tables. Supports one-click integration with LDAP.
DeltaLake	Integrated with DLF for automated lake table management. Supports Ranger authorization. Fixed an issue where statistics could not be collected for timestamp fields. The optimize and vacuum commands now support returning metric information.
Hudi	Upgraded to version 0.11.1.
Hadoop-Common	Added the component to resolve the issue of HDFS, YARN, and JindoSDK configurations overwriting each other.
YARN	Enhanced elastic features.
Ranger	Supports both Spark2 and Spark3. Ranger Usersync supports one-click integration with LDAP.
Kafka	CruiseControl automatically creates related topics on startup.
HBase	Added the component. The version is 1.4.9.
Phoenix	Added the component. The version is 4.14.1.
Doris	Upgraded to version 1.1.1.
StarRocks	Upgraded to version 2.2.3.
ClickHouse	Fixed a memory overflow issue when reading large files from OSS.

EMR 3.40.x

Release date

EMR-3.40.0 was released on April 21, 2022.

Update details

Service	Change
JindoData	Added the component. The version is 4.3.0.
JindoSDK	Upgraded to version 4.3.0.
Spark	Upgraded to version 3.2.1.
Hive	Fixed a bug where TEZ repeatedly committed when Speculation was enabled. Fixed a bug where UDFs could only be called after reloading the function.
Presto	Fixed a bug where the Presto service could not be started after it was added when the Hadoop cluster was initialized.
DeltaLake	Fixed a compatibility issue with Streaming SQL.
Hudi	Upgraded to version 0.10.1.
Iceberg	Upgraded to version 0.13.1.
YARN	Added a feature to restrict ApplicationMasters (AMs) to run only on CORE group nodes. Fixed an issue where the mapreduce.map.java.opts configuration was missing taihaodoctor.
Zookeeper	Optimized JVM parameter configurations.
Flink, Impala, Flume, Druid	Adapted to JindoSDK 4.3.0.
Sqoop	Upgraded the PostgreSQL version.
Zeppelin	Resolved a startup failure issue with the JDBC Interpreter.
Ranger	The Ranger 1.2.0 Spark Plugin supports Hudi.
Oozie	Upgraded Log4j to version 2.17.2.
HBase	Fixed an issue where RegionServer could not be started in HBase 1.4.9.
DLF-Auth	Upgraded to version 2.0.0.

EMR 3.39.x

Release date

Version	Date
EMR-3.39.2	March 25, 2022
EMR-3.39.1 (New purchases are not supported)	February 15, 2022

Update details

EMR-3.39.2

Note

Only OLAP clusters and DataFlow clusters in the new EMR console support this version.

Service	Change
Flink	Improved the application performance management (APM) dashboard and added new monitoring metrics, such as sourceIdleTime. Supports CloudMonitor alerts.
Kafka	Supports SSL and SASL configurations. Modified the default values of some parameters.
ClickHouse	Modified the default values of some parameters.

EMR-3.39.1

Service	Change
SmartData	The component is offline.
BIGBOOT	The component is offline.
RSS	Upgraded the ESS service to RSS. For more information, see RSS. Enhanced the features and stability of the service.
JindoSDK	Upgraded the architecture to JindoData. EMR integrates JindoSDK 4.0 for the first time and supports services such as OSS and OSS-HDFS.
Spark	Optimized Hive on Spark. Adapted to JindoSDK.
Tez	Adapted to JindoSDK.
Hive	Adapted to JindoSDK.
Presto	Supports dynamic UDF loading. Delta Lake tables support Time Travel queries with the `for ... as of` syntax. Added a standalone Delta Lake Catalog, provided default Delta connector configurations, and supported ZOrder Dataskip optimization based on the standalone Catalog. Fixed an issue where the Hudi connector could not query Hudi MOR tables. The Hive connector does not support querying Hudi MOR tables. Adapted to JindoSDK.
Delta Lake	Metadata management: Used the built-in Spark Catalog instead of the Hive CLI API to synchronize metadata and partition information. Automatically reports table statistics (dataProfiling) to the MetaStore. SQL: Supports Time Travel syntax. Supports DropPartition SQL syntax. Supports ADD COLUMN operations at specified positions (FIRST and AFTER). Enhanced table management capabilities: Supports and enables dynamic adjustment of filesize based on table size by default. Supports and enables automatic Vacuum by default. Supports concurrent Vacuum. Optimized the logic for automatic compaction, which is disabled by default. Added Zorder syntax and accelerated the Zorder process.
Hudi	Upgraded to version 0.10.0.
HDFS	Adapted to JindoSDK.
YARN	Adapted to JindoSDK.
Flume	Adapted to JindoSDK.
Flink	By default, the Flink lib directory is uploaded to the HDFS cluster, so that you can use it with the yarn.provided.lib.dirs parameter. Adapted to JindoSDK.
Impala	Adapted to JindoSDK.
Ranger	Fixed a startup failure issue with Spark History Server. Adapted to JindoSDK.
HBase	Fixed an issue with default parameters. Fixed a GC log date format issue. Fixed a restart issue when RS used an IP address.
Druid	Adapted to JindoSDK.
ClickHouse	Optimized the handling logic when the ClickHouse component is stopped.
Iceberg	Upgraded to version 0.13.0. Hid default configuration items to improve user experience.
DLF-Auth	Fixed a startup failure issue with Spark History Server.
StarRocks	Added the service to the new console. Version 2.0.1 is published.
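
The Flink row above notes that the Flink lib directory is uploaded to HDFS so that it can be referenced with the yarn.provided.lib.dirs parameter. The following is a minimal sketch of how a job submission command could reference that directory. The HDFS path and the job JAR path are hypothetical placeholders, because this page does not state where the lib directory is uploaded. The command uses the application-mode options of the open source Flink CLI (`run-application -t yarn-application` with a `-D` property), and the helper name `build_submit_command` is illustrative.

```python
import shlex


def build_submit_command(provided_lib_dir, job_jar):
    """Build a Flink application-mode command that reuses a pre-uploaded lib dir."""
    return [
        "flink",
        "run-application",
        "-t", "yarn-application",
        # Point YARN at the lib directory that already exists on HDFS,
        # so the client does not upload the Flink JARs for every job.
        f"-Dyarn.provided.lib.dirs={provided_lib_dir}",
        job_jar,
    ]


# Both paths below are hypothetical placeholders.
command = build_submit_command(
    "hdfs:///flink/lib",
    "/opt/jobs/example-job.jar",
)
print(shlex.join(command))
```

The sketch only builds and prints the command. Running it requires a Flink client on a cluster node and real paths.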

EMR 3.38.x

Release date

Version	Date
EMR-3.38.3	December 2021
EMR-3.38.2 (New purchases are not supported)	December 2021
EMR-3.38.1 (New purchases are not supported)	November 2021
EMR-3.38.0 (New purchases are not supported)	October 2021

Update details

EMR-3.38.3

Fixed the Log4j security vulnerability in all related components. For more information, see Vulnerability announcement | Apache Log4j2 remote code execution vulnerability.

Service	Change
Presto	Fixed an error that occurred when Presto queried Hudi tables in a high availability cluster. Fixed the Log4j vulnerability in the Elasticsearch connector.
DLF Metastore	Changed the default setting for Metastore logs from enabled to disabled. Fixed an error caused by an excessively long URI in Metastore gettablestats.
Delta Lake	Fixed an issue with synchronizing schema changes to the Metastore.
Flink	Upgraded VVR to version 4.0.11. This version supports the following features: Released the commercial Flink CDC feature: Supports Schema Evolution. Supports Flink SQL semantics for full database synchronization. Supports using Gemini Statebackend to store state on OSS. Provided an enterprise edition of the Hudi Connector with built-in DLF for metadata management.
Sqoop	Fixed an issue where precision was lost for the Decimal type when importing HCatalog tables with Sqoop.

EMR-3.38.2

Service	Change
SmartData	Upgraded SmartData to version 3.8.0. For more information, see Introduction to SmartData 3.8.x. Supports authentication and authorization management for OSS based on Kerberos and Ranger.

EMR-3.38.1

Service	Change
SmartData	Upgraded SmartData to version 3.7.3. For more information, see Introduction to SmartData 3.7.x.
Spark	Removed the invalid Log4j MetricsAppender configuration. Fixed a NullPointerException issue during SparkContext startup.
Presto	Fixed an issue in high availability Hadoop clusters where Presto required host configuration to query Hive tables. Fixed a startup failure issue with Presto under default configurations when memory is low. Fixed an issue where modifications to the worker-jvm configuration did not take effect. Supports Ranger.
Impala	Fixed a `no such method error` that occurred when querying DLF metadata tables.
Ranger	Supports Presto. Fixed a permission issue with Ranger Spark when inserting data into ORC and PARQUET tables. Fixed an issue where Ranger Hive role permissions did not take effect after Kerberos was enabled.
DLF-Auth	Upgraded DLF-Auth to version 1.0.1. Supports DLF permissions to control Presto permissions. Fixed an issue with RAM user caching.

EMR-3.38.0

Service	Change
SmartData	Upgraded SmartData to version 3.7.2. For more information, see Introduction to SmartData 3.7.x.
Spark	Upgraded Spark to version 2.4.8. Supports both Spark 2.4.8 and Spark 3.1.2. Note: Spark3 does not support Delta or Remote Shuffle Service. For the Spark 3.x series, SparkSQL performance for Distinct calculations is optimized. The optimization is triggered when an aggregate operator contains multiple `count(distinct case ... when ...)` expressions. Fixed an array-index out of bounds issue in Adaptive Query Execution (AQE) when statistics were missing. Fixed an error that occurred with AQE and Cache in specific scenarios.
Hive	Upgraded Hive to version 2.3.9.
Presto	Released as a standalone Presto cluster. Upgraded Presto to community version 358. Important: This version does not support Ranger. Supports connectors such as Hudi and MySQL by default, and updated the default configurations. Presto clusters support elastic scaling. Supports data lake analytics.
DeltaLake	Unified delta-connectors for Hive 2 and Hive 3. Fixed an error that occurred when querying multi-level partitioned tables with delta-connectors.
Hudi	Upgraded Hudi to version 0.9.0. Fixed a compatibility issue with sql.extension between DeltaLake and Hudi.
HDFS	The default parameter for NameNode reserved capacity now increases automatically. This ensures that NameNode enters safe mode promptly when disk space is low.
Flink	Upgraded Flink to version 1.13-vvr-4.0.10, which corresponds to community Flink 1.13.1. Added commercial Flink Connectors, such as the Hologres connector. Added a corresponding Metric Reporter and integrated it with the APM dashboard for monitoring. For the Kafka Connector, added a Kafka Catalog based on SchemaRegistry. This lets you directly read from and write to existing Kafka topics without using DDL.
Storm	The component is offline.
Zeppelin	Upgraded Zeppelin to community version 0.10.0.
Ranger	When Presto is community version 358, this version of Ranger does not support Presto access control.
Hue	Fixed an issue where the YARN Job Browser could not properly display or terminate jobs in some cases. The YARN Job Browser is enabled in the default configurations. The Presto protocol is supported in the default configurations.
Druid	Fixed a node restart failure caused by residual PID files after a server power loss.
ClickHouse	Updated the default configurations. Supports cluster scale-out. Supports the MetaChecker feature. Supports reading data using the OSS table engine and OSS table function. Supports custom ZooKeeper addresses at the table level.
Iceberg	Added the component. The version is 0.12.0-1.0.1.
Knox	Fixed an issue where the first access to a Spark task failed.
DLF-Auth	Added the component. Supports DLF permissions to control Hive and Spark permissions. The version is 1.0.0.
ESS	Upgraded ESS to version 1.2.0.
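
The Spark row above says that the Distinct optimization is triggered when one aggregate contains several `count(distinct case ... when ...)` expressions. The sketch below shows the query shape in question. It runs the query on an in-memory SQLite database only so that the example is self-contained; SQLite says nothing about how Spark plans or optimizes the query. The table name, column names, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, channel TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "web"), (1, "web"), (2, "web"), (2, "app"), (3, "app"), (4, "app")],
)

# One aggregation with several conditional distinct counts. A CASE without
# ELSE yields NULL for non-matching rows, and COUNT ignores NULL values.
query = """
SELECT
  COUNT(DISTINCT CASE WHEN channel = 'web' THEN user_id END) AS web_users,
  COUNT(DISTINCT CASE WHEN channel = 'app' THEN user_id END) AS app_users
FROM orders
"""
web_users, app_users = conn.execute(query).fetchone()
print(web_users, app_users)
```

With this data, users 1 and 2 are counted for the web channel and users 2, 3, and 4 for the app channel.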

EMR 3.37.x

Release date

Version	Date
EMR-3.37.1	September 2021
EMR-3.37.0 (New purchases are not supported)	August 2021

Update details

EMR-3.37.1

Service	Change
SmartData	Upgraded SmartData to version 3.7.1.
Hue	Fixed an issue where Impala could not be used in high-security clusters.
Kudu	Supports Kerberos.

EMR-3.37.0

Service	Change
SmartData	Upgraded SmartData to version 3.7.0.
Spark	Fixed a compatibility issue with Delta Lake.
Delta Lake	Upgraded Delta-Connectors to support creating and querying tables using StorageHandler syntax. Fixed an issue that occurred when using INSERT OVERWRITE on partitioned tables. Fixed an issue where Optimize wrote virtual fields to files in G-SCD scenarios.
YARN	Added appId, CPU, and memory resource usage information to the node Containers REST API. Fixed an issue where ApplicationMaster (AM) logs could not be viewed on nodes released by Auto Scaling. Added support for cleaning up released nodes after they are decommissioned by Auto Scaling. Improved the graceful decommission logic for Auto Scaling. Nodes are now marked as offline only after the NodeManager (NM) process ends.
ZooKeeper	Upgraded to community version 3.6.3.
Flink	Added the SmartData component. Fixed an issue that prevented password-free access to OSS when submitting jobs to a DataFlow-Flink cluster through Secure Shell (SSH).
Impala	Fixed an issue that caused an infinite loop when listing directories after an OSS partition directory was directly deleted.
Hue	Fixed a display issue in the user interface when Hue is used with Oozie.
Kudu	Upgraded to community version 1.14.0.
ClickHouse	Updated the default configurations.

EMR 3.36.x

Release date

EMR-3.36.1 was released on July 16, 2021.

Update details

Service	Change
SmartData	Upgraded SmartData to version 3.6.1. For more information, see Introduction to SmartData 3.6.x.
Hive	Upgraded Hive to version 2.3.8. Fixed an issue where an incorrect result was returned when the `show create table` command was executed using Data Lake Formation (DLF) metadata. Optimized the default parameters of Hive to improve job performance. Changed the names of configuration items on the hive-env tab of the Hive service Configuration page in the E-MapReduce console to uppercase for ease of use. Optimized the error message that is reported when the file system and Hive metastore are incompatible during a write to a Hive table.
HDFS	Added support for the Zstandard (ZSTD) compression format.
Flink	Upgraded Flink to version 1.12-vvr-3.0.2. Note: Flink is removed from Hadoop clusters.
Hudi	Upgraded Hudi to version 0.8.0. Added support for integration with Spark SQL.
Spark	Optimized the names of configuration items on the spark-defaults tab of the Spark service Configuration page in the E-MapReduce console. Optimized the performance of log output. Added support for the ZSTD compression format.
Impala	Fixed an issue that caused a core dump error when you use Hadoop Distributed File System (HDFS).
Tez	Optimized the default parameters of Tez to improve job performance.
Knox	Added support for the Kudu, Impala, and HBase components.
Phoenix	Fixed an issue where a "Java Database Connectivity (JDBC) Driver not found" error was reported when you use Hive or Spark SQL to access Phoenix tables.
ClickHouse	Enabled application performance management (APM) monitoring and alerting.

EMR-3.35.x

Release date

EMR-3.35.0 was released on April 21, 2021.

Updates

Service	Change
SmartData	Upgraded to version 3.5.0. For version details, see Introduction to SmartData 3.5.x.
Spark	Fixed an issue where Adaptive Execution did not take effect in some scenarios. Fixed an issue where the behavior of statistical aggregate functions was inconsistent with that of Hive. Fixed an issue where data of the char type was read incorrectly from Hive ORC tables.
HDFS	Added support for the SM4 national encryption algorithm.
Hue	Upgraded Hue to version 4.9.0.
Alluxio	Upgraded Alluxio to version 2.5.0.
Druid	Upgraded Druid to version 0.20.1. Enhanced security.
Livy	Upgraded Livy to version 0.7.1.

EMR 3.34.x

Release date

EMR-3.34.0 was released on March 15, 2021.

Changes

Service	Changes
SmartData	Upgraded to version 3.4.0. For more information, see Introduction to SmartData 3.4.x.
Spark	Optimized some default configurations. Performance optimization: Added support for Window TopK pushdown. Enhanced compatibility for reading and writing CSV or JSON tables in Hive. The ANALYZE statement now supports omitting all table column names. Added support for enabling or disabling the Lightweight Directory Access Protocol (LDAP) feature with a single click. Improved the usability of the Spark Beeline tool.
Hive	Optimized some default configurations. Performance optimization: Enhanced the cost-based optimizer (CBO). Added support for enabling or disabling the LDAP feature with a single click. Upgraded Calcite to version 1.12.0. Added the hive.security.authorization.sqlstd.confwhitelist.append parameter.
Presto	Added support for enabling or disabling the LDAP feature with a single click.
YARN	Fixed an important security threat related to unauthorized access to the Hadoop web UI. The threat occurred when accessing the YARN web UI through a Secure Shell (SSH) tunnel, which required user.name=name to be explicitly specified in the URL.
Zookeeper	Upgraded to version 3.6.2.
Flink	Updated the config.sh file during initialization to fix an issue with HADOOP_CLASSPATH.
Impala	Upgraded Impala to version 3.4.0. Upgraded Shiro to version 1.7.0. Added support for Data Lake Formation (DLF) metadata. Added support for querying data in Delta format. Added support for enabling or disabling the LDAP feature with a single click.
Tez	Optimized the default configurations.
HAS	Fixed an issue where the admin.keytab file could not be re-initialized after an error occurred during the HAS installation flow.
Ranger	Fixed an issue caused by filter pushdown in Spark. Fixed an issue that prevented Presto from being re-enabled after it was disabled in Ranger. Added support for enabling or disabling LDAP authentication with a single click.
Knox	Fixed an issue with the Knox link for Druid 0.20.0.
Hue	Added support for enabling or disabling the LDAP feature with a single click.
Hudi	Added support for the SQL on Hudi feature. Fixed an accuracy issue that occurred when querying partial data. Added support for partition pruning when you query Copy On Write tables in Hudi using Spark. Added support for a bucketing index mechanism to improve write performance.
Delta Lake	Fixed an issue where metadata could not be synchronized to Hive Metastore from an existing Delta table. Fixed an issue where the MERGE command could not parse the `*` character. Fixed an error that occurred during the creation of table metadata when transforming data from Parquet format to a Delta table. Fixed an issue where the OPTIMIZE command failed when there were no files to compact. The MERGE syntax now supports using a subquery as the source. Introduced a caching mechanism to improve query efficiency when you use Presto to query Delta tables. Added support for querying Delta tables using Impala.
Superset	Fixed an issue that prevented the admin user from logging on to the web UI. Made datasets compatible with Druid clusters. Removed support for Spark SQL datasets.
Sqoop	Added support for importing files in Parquet format to Object Storage Service (OSS).
Alluxio	Upgraded to version 2.4.1.
Phoenix	Hive on Phoenix now supports backing field settings.
Pig	Removed.
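
The YARN entry in the table above mentions that user.name=name had to be specified explicitly in the URL when the YARN web UI was reached through an SSH tunnel. A minimal Python sketch of building such a URL, assuming the value is passed as an ordinary query parameter; the helper name `with_user_name`, the local port 8088, and the user name `hadoop` are illustrative assumptions, not values taken from these notes.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_user_name(url: str, user: str) -> str:
    """Return url with user.name=<user> appended to its query string."""
    parts = urlsplit(url)
    extra = urlencode({"user.name": user})
    # Keep any query parameters that are already present in the URL.
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# Hypothetical tunnel endpoint: a local port forwarded to the YARN web UI.
print(with_user_name("http://localhost:8088/cluster", "hadoop"))
```

The helper only rewrites the URL; it does not open the tunnel or perform any authentication.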

EMR-3.33.x

Release date

EMR-3.33.0 was released on January 15, 2021.

Updates

Service	Changes
SmartData	Upgraded to version 3.2.0. For more information, see Introduction to SmartData 3.2.x.
Spark	Upgraded to version 2.4.7. Upgraded jQuery to version 3.5.1. Added compatibility with Hive to automatically update table and partition sizes. Added support for outputting Spark metadata and job running information to DataWorks.
Hive	Upgraded to version 2.3.7. HCatalog now supports Data Lake Formation. Added support for outputting Hive metadata and job running information to DataWorks.
Metastore	Added the Hive Statistics feature. HCatalog now supports Data Lake Formation. Optimized the method for obtaining STS tokens.
HDFS	Upgraded jQuery to version 3.5.1.
YARN	Upgraded jQuery to version 3.5.1. Adjusted the Fair Scheduler configuration. Optimized Timeline Server.
Zeppelin	Upgraded to version 0.9.0.
Ranger	Added audit log configuration for Hive. Added audit configuration for Log4j.
OpenLDAP	Added an audit feature. Enabled the SSL port (10636) by default. Added support for one-click startup of Presto.
Knox	Fixed a Spring vulnerability. Fixed an issue with viewing the Executors page in the Spark UI. Fixed an issue with the Oozie job status page.
Hue	Added support for Presto.
Druid	Upgraded to version 0.20.0.
EMRHook	Added a new software service. hive-hook: Supports outputting Hive metadata and job running information to DataWorks. spark-hook: Supports outputting Spark metadata and job running information to DataWorks.

EMR-3.32.x

Release date

EMR-3.32.0 was released on November 23, 2020.

Updates

Service	Changes
SmartData	Upgraded to version 3.1.0. For more information, see Introduction to SmartData 3.1.x.
Alluxio	Supports Alluxio 2.4.0. Default parameter settings scale with cluster node size. Uses HDFS in the EMR cluster as the default UnderFS. This feature is ready to use out of the box. Enhanced the Alluxio OSS UnderFS to support new features such as OSS multi-versioning. Compatible with engines such as Hadoop, Hive, Spark, and Presto.
Hudi	Added support for Hudi 0.6.0.
Spark	JindoTable supports enabling or disabling the data collection feature.
Hive	Fixed a connection pool leak issue in HiveServer. JindoTable supports enabling or disabling the data collection feature. Optimized the performance of `ADD COLUMN`. Fixed an issue where incorrect data was read from Hudi tables. Default parameter settings scale with cluster node size.
HDFS	Supports a larger number of snapshots.
YARN	Default parameter settings scale with cluster node size.
Tez	Default parameter settings scale with cluster node size.
Sqoop	Fixed an issue with importing files in Avro format.

EMR 3.30.x

Release date

EMR-3.30.0 was released on October 26, 2020.

Updates

Service	Updates
SmartData	Upgraded to 3.0.0. For more information, see Introduction to SmartData 3.0.x.
Spark	Added support for Alibaba Cloud Data Lake Formation (DLF) metadata. Upgraded the HAS dependency to 2.0.1. Fixed an issue with backticks in Streaming SQL. Removed the Delta JAR package. Delta is now deployed separately. Modified the log path to write all logs to HDFS.
Hive	Added support for Alibaba Cloud DLF metadata. Resolved an issue where a DUMMY file was written when reading an empty directory in a Delta table. Upgraded the HAS dependency to 2.0.1.
Presto	Added support for Alibaba Cloud DLF metadata. Resolved an issue that limited the reading of Delta tables. Fixed an issue where the JVM configuration was missing in high-security mode. Upgraded the HAS dependency to 2.0.1.
HDFS	Added support for hot-swappable disk mode. Upgraded the HAS dependency to 2.0.1.
YARN	Fixed an issue with YARN RMZKStateStore. Added support for SNAPPY files output by SLS. Modified the directory configuration for MapReduce Local mode to resolve a directory permission check issue. Added support for hot-swappable disk mode. Set the log path to write all logs to HDFS. Upgraded the HAS dependency to 2.0.1.
Zookeeper	Added support for attaching the service port to an internal IP address at startup. Upgraded the HAS dependency to 2.0.1.
Flink-Vvp	Upgraded to version 1.11-2.2.2. Added support for SQL and Autopilot features. Note: Only DataFlow clusters support Flink-Vvp. Hadoop clusters do not support Flink-Vvp at this time.
Flink	Added support for writing to OSS in cache mode. This feature, combined with Flink Checkpoints and a resumable Source, achieves EXACTLY_ONCE semantics. Synchronized with Flink community version 1.11.1 features. SQL now supports multiple outputs (MULTI INSERT). Upgraded the HAS dependency to 2.0.1.
Impala	Added support for custom configurations of catalogd.flgs, impalad.flgs, and statestored.flgs. Upgraded Shiro to version 1.6.0. Upgraded the HAS dependency to 2.0.1.
Tez	Optimized the default memory parameters for the Application Master (AM). Upgraded the HAS dependency to 2.0.1.
HAS	Upgraded the HAS dependency to 2.0.1.
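Storm	Upgraded the HAS dependency to 2.0.1.
Zeppelin	Upgraded the HAS dependency to 2.0.1.
Ranger	Upgraded the HAS dependency to 2.0.1.
OpenLDAP	Upgraded the HAS dependency to 2.0.1.
Oozie	Upgraded the HAS dependency to 2.0.1.
Knox	Upgraded the HAS dependency to 2.0.1.
Kafka	Upgraded the HAS dependency to 2.0.1.
Hue	Upgraded the HAS dependency to 2.0.1.
HBase	Upgraded the HAS dependency to 2.0.1.
Druid	Upgraded the HAS dependency to 2.0.1.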

EMR-3.29.x

Release date

EMR-3.29.0 was released on July 29, 2020.

Updates

Service	Changes
Bigboot	Upgraded to version 2.7.301. Jindo DistCp now supports writing data to OSS with the Archive or Infrequent Access storage class. Enhanced the FUSE feature to support multiple namespaces. Improved the metadata caching feature in Cache mode.
Spark	Upgraded Spark to 2.4.5.2.0. Added support for third-party metastores. Added the datalake metastore-client.
Hive	Upgraded Hive to 2.3.5.6.0. Added support for third-party metastores. Added the datalake metastore-client.
Presto	Upgraded to version 338.
Ranger	Upgraded the software package to 1.2.0-1.5.0. Added support for Presto 338. Added descriptions to configuration files.
HDFS	Enabled adaptive configuration for the reserved space size of DataNodes.
Knox	Added support for Impala, later versions of Flink, and PAI.
Druid	Upgraded to version 0.18.1.
SmartData	Upgraded to version 2.7.301.

EMR 3.28.x

Release date

EMR-3.28.0 was released on June 12, 2020.

New features

Service	Changes
Bigboot	Released the first version of JindoTable, which provides hotspot statistics for tables and partitions. Added support for complete storage policies in Block mode and tiered storage policies, such as Infrequent Access and Archive. Added the Jindo DistCp data migration tool. Improved Jindo Fuse and fixed issues in it. Improved the integration of the JFS scheme with the Hive engine and Jindo JobCommitter in Cache mode. Added a feature to set a read ratio in Block mode for reading data directly from OSS, which reduces the overhead of reading from the local cache. Decoupled JindoFS software modules into Bigboot (control layer), SmartData (distributed service), and the JindoFS SDK. Each module can be independently upgraded and maintained.

Updates

Service	Changes
Flink	Upgraded open source Flink to Ververica Platform Enterprise Edition. The platform is heavily customized based on open source Flink 1.10 and provides value-added features, such as the self-developed Gemini storage engine.
Bigboot	Upgraded to version 2.7.0.
Delta	Upgraded to version 0.6.0. Decoupled the Delta code from the Spark code.
Spark	Upgraded to version 2.4.5. Added support for streaming-sql scripts from DataFactory. Added support for Delta 0.6.0.
Hive	Added support for Delta 0.6.0.
Ranger	Added support for custom deployments of Hadoop Distributed File System (HDFS), Hive, and Spark. Added support for configuring ranger-admin-site and ranger-ugsync-site in the console.
HDFS	DataNode exception information is now printed when an HDFS write fails because no DataNodes are available (HDFS-9023).
Hue	Added support for installing the Hue component on Gateway clusters. Added support for deploying multiple Hue instances on a single node.
DataFactory	Added support for Delta 0.6.0.
Druid	Upgraded to version 0.18.0.
Knox	Upgraded to version 1.1.0-1.0.7. Added support for the HBase UI.

EMR-3.27.x

Release dates

Version	Date
EMR-3.27.0	April 29, 2020
EMR-3.27.1 (New purchases are not supported)	May 8, 2020
EMR-3.27.2 (New purchases are not supported)	May 20, 2020

New features

Feature	Change
Custom component deployment	Added support for custom deployment of components on master nodes. The following components are supported: Hadoop, Spark, Hive, Zookeeper, and Presto.
Graceful shutdown for Auto Scaling	When graceful shutdown is enabled, nodes are not released immediately. They are released after tasks are completed within a specified time period.

Updates

Service	Change
Spark	CUBE now supports date type partition fields. Increased the stack depth of Spark-Submit.
Delta	Enhanced Data Definition Language (DDL) syntax, including commands such as CREATE, SHOW, and DESCRIBE. Delta now supports the Optimize syntax with ZOrder.
Knox	Adapted for the Druid User Interface (UI). Multi-master deployment is supported.
Hive	HCatalog tables now support the magic committer. Removed some outdated default configurations.
Bigboot	Upgraded to version 2.6.3. Multi-master deployment is supported.
SmartData	Upgraded to version 2.6.3. Multi-master deployment is supported.
Ranger	Ranger now supports the Solr component. Ranger now supports PrestoSQL version 311.
Tez	Tez now supports setting scratchdir on OSS.
Presto	Upgraded to version 331.
Druid	Upgraded to version 0.17.1.
Superset	Upgraded to version 0.35.2.
Sqoop	Upgraded the MySQL Java Database Connectivity (JDBC) JAR package to version 5.1.48. The MySQL direct export mode now supports setting a custom encoding using `--mysql-charset`.

EMR-3.26.x

Release dates

Version	Date
EMR-3.26.3 (New purchases are not supported)	April 16, 2020

Updates

Service	Changes
Bigboot	Upgraded to version 2.6.3. Added support for OTS metadata and Namespace HA.
SmartData	Upgraded to version 2.6.3. Added support for OTS metadata and Namespace HA.
Hive	HCatalog tables now support the direct committer.
YARN	Changed the default committer to JindoOssCommitter.
HDFS	Upgraded JindoFS-related configurations.
Spark	Changed the default committer to JindoOssCommitter.

EMR-3.25.x

Release date

EMR-3.25.0 was released on January 13, 2020.

New features

Ranger service: Added support for Ranger Presto operations.

Updates

Service	Changes
Ranger	Initialized the RangerAdmin database for high-availability (HA) clusters. Fixed a security issue in the RangerUserSync startup script.
Spark	Added support for configuring Delta-related parameters, such as `spark.sql.extensions`, in the console. Added support for Hive to read Delta tables without setting the input format. Added support for the ALTER TABLE SET TBLPROPERTIES and UNSET TBLPROPERTIES statements.
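Delta	Added support for configuring Delta-related parameters, such as `spark.sql.extensions`, in the console. Added support for Hive to read Delta tables without setting the input format. Added support for the ALTER TABLE SET TBLPROPERTIES and UNSET TBLPROPERTIES statements.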
Hive	Fixed an issue where MapReduce (MR) task execution failed in automatic local mode.
Presto	Upgraded to version 310. Upgraded the joda-time version to 2.10.5.
Tez	Upgraded to version 0.9.2. Fixed an issue where the application progress was not displayed correctly in the Tez user interface (UI). Fixed an issue where the application history could not be viewed in the Tez UI.
Impala	Fixed an issue where Impala could not access LZO tables.
HDFS	Removed mongo-hadoop related JAR packages.
Zookeeper	Upgraded to version 3.5.6.
YARN	Adapted for the Tez UI. The yarn-site tab now supports adding the configuration item yarn.resourcemanager.system-metrics-publisher.enabled=true.
Bigboot	Upgraded to version 2.2.3. Added support for rename operations in OSS Cache mode.
SmartData	Upgraded to version 2.2.3. Added support for rename operations in OSS Cache mode.
Knox	Upgraded dependency package versions.
Oozie	Upgraded dependency package versions.
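
The YARN entry in the table above adds the configuration item yarn.resourcemanager.system-metrics-publisher.enabled=true on the yarn-site tab. For readers who inspect or generate yarn-site.xml directly, the following Python sketch renders that key/value pair in Hadoop's standard `<property>` format. The helper name `hadoop_property` is hypothetical, and the sketch only produces the XML fragment; it does not modify a cluster.

```python
from xml.sax.saxutils import escape


def hadoop_property(name: str, value: str) -> str:
    """Render one <property> element in Hadoop's *-site.xml format."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


# The configuration item named in the YARN entry for EMR-3.25.0.
print(hadoop_property("yarn.resourcemanager.system-metrics-publisher.enabled", "true"))
```

In the E-MapReduce console the same setting is entered as a key and a value on the yarn-site tab, so the XML form is only relevant when reading the generated file.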

EMR-3.24.x

Release date

EMR-3.24.0 was released on November 18, 2019.

New features

Service	Changes
Delta	Supports SQL syntax, including ALTER, CONVERT, CREATE, CTAS, DELETE, DESC, INSERT, MERGE, OPTIMIZE, UPDATE, and VACUUM. Includes a built-in, optimized OPTIMIZE command. Supports the Hive connector. Supports other existing open-source features.
Grafana	Added as a new component for standalone Flink clusters. Version: 6.4.2.
Prometheus	Added as a new component for standalone Flink clusters. Version: 2.13.0.
AlertManager	Added as a new component for standalone Flink clusters. Version: 0.19.0.
TensorFlow on Spark	Supports running TensorFlow on Spark, which deeply integrates Spark with the deep learning framework. The integration includes optimized task scheduling and data exchange. It provides a complete workflow, from data pre-processing to deep learning training. Supports streaming tasks.

Updates

Service	Changes
SmartData	Optimized JindoFS usage modes. The usage of Block mode is unchanged. Cache mode now supports its original usage and is also compatible with the original OSS file system usage. It supports data and metadata caching. These features can be enabled or disabled separately through configuration and are disabled by default. Optimized read and write performance for Block mode and Cache mode. Optimized disk cleanup. This provides more accurate statistics and more timely cleanup for hot data cached on local disks. It strictly ensures that disk usage does not exceed the quota. Improved support for Gateway clusters. Block mode and Cache mode can now be used on a Gateway. Supports a deployment mode where one storage cluster is separated from multiple compute clusters.
Spark	Added support for Delta-related parameters. Added support for Ranger Spark plugin configuration. Upgraded JindoCube to version 0.3.0.
Hive	Added logic for the SQL compatibility check feature. Released a combination of Hive 2.3.5 and Hadoop 2.8.5. When restarting the component, the content of hiveserver2-site.xml is no longer synchronized to hive-site.xml under spark-conf. Supports using the MSCK command to add incremental folders. Fixed a bug that occurred when Hive reused a Tez container. Supports using the MSCK command to optimize column-based folders.
Bigboot	Upgraded to 2.2.1. Fixed issues with native code support on some machine models.
Ranger	Refactored the deployment method for the Spark plugin. Fixed a bug where header2 in an HA cluster did not obtain the keytab.
Kudu	Fixed the startup logic.
Zookeeper	Added configuration for four-letter words. This is enabled by default.
HDFS	Added compatibility with JindoFS.
YARN	Changed the default value of the yarn.scheduler.capacity.node-locality-delay configuration to -1. Added compatibility with JindoFS.
HAS	Integrated with OpenLDAP as the backend.
OpenLDAP	Added compatibility with HAS.
Presto	Upgraded to version 0.228.
Kafka	Removed D1 bad disks.
Druid	Upgraded to 0.16.0.
Flume	Upgraded to 1.9.0.
Flink	Upgraded to 1.9.1. Supports standalone Flink clusters (released to a whitelist).
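
The Zookeeper entry in the table above adds configuration for four-letter words. These are short plain-text commands that a client sends over the ZooKeeper client port. The following Python sketch shows one way to send such a command; the function name `four_letter_word` is hypothetical, and the host and port used when calling it are assumptions about your environment.

```python
import socket


def four_letter_word(host: str, port: int, command: str, timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter-word command and return the reply text."""
    if len(command) != 4:
        raise ValueError("four-letter-word commands are exactly four characters")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, `four_letter_word("localhost", 2181, "ruok")` asks a server assumed to listen on the default client port whether it is running; a healthy server that permits the command answers `imok`.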

EMR-3.23.x

Release date

EMR-3.23.0 was released on September 18, 2019.

Updates

Service	Changes
Druid	Upgraded to 0.15.1. Added the router component. Upgraded fastjson.
Spark	Updated Spark Thrift Server to fix a class loader issue. Refactored Spark transaction code to improve stability. Fixed an issue with reading and writing files in ORC format after the built-in Hive was upgraded to version 2.3. Added support for the MERGE INTO syntax. Added support for the SCAN and STREAM syntax. The Structured Streaming Kafka sink now supports exactly-once semantics (EOS). Updated Delta Lake to 0.4.0.
Hive	Removed the old version of the Hive hook. Added an optimization to handle data skew for multiple COUNT(DISTINCT) fields. Fixed an issue where data was lost when joining tables with different bucket versions.
Flink	Upgraded to 1.8.2.
Bigboot	Updated the small file tool. Updated the OSS JAR package to fix a non-daemon thread issue.
Kafka	Added support for the Deployment Set awareness feature. Removed the fastjson dependency.
HDFS	Optimized the deployment logic for the SmartData OSS JAR package. Updated the SmartData OSS JAR package.
Flume	Upgraded fastjson.
TensorFlow on Spark	Added this service.
HAS	Upgraded fastjson.
Livy	Upgraded fastjson.

EMR-3.22.x

Release date

EMR-3.22.0 was released on July 28, 2019.

New features

Service	Change
Kudu	Added Kudu as a new component. Kudu fills a gap in the Hadoop ecosystem. It provides fast data inserts and random access similar to HBase, and lets you modify data. It also provides large-scale data analytics and query capabilities similar to Hadoop Distributed File System (HDFS) or Parquet. Provides C++ and Java APIs for custom development. Integrates with Impala, Spark, and Hive Metastore. This version of Kudu is based on Apache Kudu 1.10.0.
OpenLDAP	Added OpenLDAP as a new component to replace ApacheDS, which is no longer provided. Supports high availability (HA).

Updates

Component	Details
JindoFileSystem	Multiple storage modes: Block mode and Cache mode. Block mode: Data is stored as blocks in the backend OSS. The local Namespace service maintains metadata. Block mode provides better metadata and data performance. Block mode supports different storage policies, including WARM (local replicas, OSS replicas), COLD (OSS replicas only), HOT (multiple local replicas, OSS replicas), TEMP (local replicas only), and ALL_HDD (multiple local replicas). The default policy is WARM. You can set different storage policies for folders based on your application scenario. Cache mode: This mode is compatible with existing OSS storage methods. In Cache mode, files are stored as objects in OSS. Data and metadata for each file are cached locally based on access frequency. This improves data and metadata access performance. Cache mode provides different metadata synchronization policies to meet the needs of different scenarios. External client support: The client software development kit (SDK) lets you access the EMR JindoFS file system from outside an EMR cluster. You can use the client to access the Namespace in Block mode. However, external clients cannot use the data cache built by EMR JindoFS within the EMR cluster. This results in lower performance compared to using it within the EMR cluster. Cache mode retains the original OSS storage semantics. It uses JindoFS to accelerate data caching within the EMR cluster. Therefore, you can directly access data from outside the EMR cluster using an OSS client, such as the OSS SDK or EMR OssFileSystem. Ecosystem component support: JindoFS now supports many compute engines on EMR, such as Spark, Flink, Hive, MapReduce, Impala, and Presto. For scenarios that separate computing and storage, you can also store job logs in JindoFS, such as YARN Container logs and Spark Event logs. JindoFS can be used as the HFile backend storage for HBase to expand its storage capacity.
OssFileSystem	Added logic to OssFileSystem to automatically detect bad disks. This fixes an issue where cache writes failed during OSS writes due to bad disks. Completed the related configurations for OssFileSystem.
Bigboot	Upgraded to version 2.0.0. Includes several major updates, such as support for multiple Namespaces, storing local data blocks as large files, multi-mode storage, and external clients. Fixed an issue where the Bigboot monitor status was incorrect during a machine restart. Added a service spec for the Kudu component. Added correctness checks for all service specs.
Hadoop	HDFS: Adapted for HDFS Federation. You can now create HDFS Federation clusters using custom configurations and APIs. This avoids the need for a second format operation when creating a Federation cluster. Optimized the bad disk detection logic. For local disk scenarios, you can trigger bad disk detection when a DataNode block report is triggered by dfsadmin. YARN: Fixed an issue where the MapReduce JobHistory job list did not update when MapReduce job Container logs were stored in JindoFS or OSS.
Spark	Relational Cache: Added support for Relational Cache. Relational Cache uses pre-computation to accelerate user queries. You can create a Relational Cache to pre-compute data. When a user query is executed, the Spark Optimizer automatically finds a suitable cache, rewrites the SQL execution plan, and continues the computation based on the cached data. This improves query speed. This feature is suitable for scenarios such as reports, dashboards, data synchronization, and multidimensional analysis. Use Data Definition Language (DDL) to perform operations such as CACHE, UNCACHE, ALTER, and SHOW. Cached data supports all Spark data sources and data formats. Supports automatic cache data updates and updates using the REFRESH command. Supports incremental updates based on partitions. Supports execution plan optimization based on Relational Cache. Streaming SQL: Standardized the parameter configuration for Stream Query Writer. Optimized the schema compatibility check for Kafka data tables. If a Kafka data table schema does not exist, it is automatically created in SchemaRegistry. Optimized the log information for when a Kafka schema is incompatible. Fixed an issue where column names had to be explicitly specified when writing query results to a Kafka table. Removed the restriction that streaming SQL queries only support Kafka and Loghub data sources. Delta: Added Delta. You can use Spark to create a Delta data source to support scenarios such as streaming data writes, transactional reads and writes, data validation, and data history. For more information, see Delta details. Supports using the DataFrame API to read data from or write data to Delta. Supports using the Structured Streaming API to read from or write to Delta as a source or sink. Supports using the Delta API to perform operations such as update, delete, merge, vacuum, and optimize. Supports using SQL to perform operations such as creating Delta-based tables, importing data to Delta, and reading from Delta tables. Others: Added a constraint feature that supports primary keys and foreign keys. Resolved JAR file conflicts, such as for servlets.
Flink	Enabled log rolling (rotation) for Log4j logs.
Kafka	Enabled log rolling (rotation) for Log4j logs. Upgraded fastjson.
Zeppelin	Upgraded the dependent commons-lang3 package to version 3.7. This fixes an issue where PySpark could not write to OSS. For more information, see Spark 2.4 incompatibility with commons-lang3 in Zeppelin.
Ranger	Added support for SHOW GRANTS.
Analytics-Zoo	Fixed a NumPy installation error.
Impala	Now compatible with Apache Kudu 1.10.0.
Presto	Upgraded to version 0.221.
ZooKeeper	Upgraded to version 3.5.5.
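
The JindoFileSystem entry in the table above lists five Block mode storage policies and where each keeps its replicas. As a reading aid, the following Python sketch restates that list as a lookup table. The names `STORAGE_POLICIES`, `keeps_oss_copy`, and `policy_for`, and the folder paths in the example, are hypothetical; this is not a JindoFS API.

```python
# Replica placement per Block mode storage policy, as worded in the
# JindoFileSystem entry for EMR-3.22.0.
STORAGE_POLICIES = {
    "WARM": ("local replicas", "OSS replicas"),
    "COLD": ("OSS replicas only",),
    "HOT": ("multiple local replicas", "OSS replicas"),
    "TEMP": ("local replicas only",),
    "ALL_HDD": ("multiple local replicas",),
}
DEFAULT_POLICY = "WARM"  # the notes state that WARM is the default


def keeps_oss_copy(policy: str) -> bool:
    """Return True if the policy stores replicas in OSS."""
    return any("OSS" in placement for placement in STORAGE_POLICIES[policy])


def policy_for(folder_policies: dict, folder: str) -> str:
    """Look up a folder's policy, falling back to the default."""
    return folder_policies.get(folder, DEFAULT_POLICY)


# Hypothetical per-folder assignment: scratch data stays local only.
assigned = {"/tmp/scratch": "TEMP"}
for folder in ("/tmp/scratch", "/warehouse/orders"):
    name = policy_for(assigned, folder)
    print(folder, name, STORAGE_POLICIES[name], keeps_oss_copy(name))
```

Policies without an OSS copy (TEMP and ALL_HDD in the list above) keep data only on local disks.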

Versions earlier than EMR-3.22.x

EMR-3.1.1

Upgraded the operating system (OS) to CentOS 7.2.
Upgraded Spark to version 2.1.1.
Upgraded emr-core to version 1.2.6.
Fixed a bug related to AccessKey-free operations for OSS.

EMR-3.0.2

Upgraded emr-core to version 1.2.5.
Extended AccessKey-free support for OSS to more regions.
Adjusted the replacement policy for role-based AccessKeys.
Fixed some bugs in Hive and Hadoop.

EMR-3.0.1

Added support for interactive mode and unified table management. You can now store Hive metadata in an external database. This allows multiple clusters to share the same metadata.
Upgraded emr-core to version 1.2.4, which optimizes the read and write performance of Object Storage Service (OSS).
Upgraded Spark to version 2.0.2.

Note: This version is fully compatible with EMR-3.0.0.

EMR-3.0.0

Initial release.