This topic covers the release dates and component updates for E-MapReduce (EMR) 5.x. For the components supported in each version, see Distributions.
EMR-5.21.x
Release date
| Version | Date |
|---|---|
| EMR-5.21.0 | October 27, 2025 |
Updates
| Service | Change |
|---|---|
| Hive | Adds a profile mechanism that automatically detects the file format of data lake storage (such as ORC) and applies the optimized JindoSDK buffer and pre-read parameters. Introduces an ORC stripe prefetch mechanism that overlaps computation and I/O when you process medium to large ORC files: subsequent stripes are prefetched asynchronously while the current stripe is processed, which improves throughput. Supports ORC vectorized read: reading index data from ORC files or performing predicate pushdown generates many scattered, non-contiguous file ranges, and vectorized read sends them as batch requests to significantly improve throughput. Integrates the JindoSDK batch metadata API to process metadata requests (such as getFileStatus) in batches, which improves metadata request throughput. |
| Spark | Adds a profile mechanism that automatically detects the file format of data lake storage (such as ORC) and applies the optimized JindoSDK buffer and pre-read parameters. Introduces an ORC stripe prefetch mechanism that overlaps computation and I/O when you process medium to large ORC files: subsequent stripes are prefetched asynchronously while the current stripe is processed, which improves throughput. Supports parallel pre-open for small files: automatically detects small file query scenarios and pre-opens a batch of files in parallel, which greatly reduces the I/O latency caused by frequent open operations. Supports ORC vectorized read, which sends batch requests to significantly improve throughput. |
| Tez | Supports parallel pre-open for small files: automatically detects small file query scenarios and pre-opens files in parallel to reduce I/O latency. |
| Ranger | JindoAuth Server supports custom RAM roles for client users to access Object Storage Service (OSS). Fixes a missing dependency issue in the Ranger-yarn-plugin. |
| Paimon | Upgraded to version 1-ali-16.3. |
| JindoCache | Upgraded to version 6.10.1. |
| Delta Lake | Added the component. Version: 3.2.1. |
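The ORC stripe prefetch behavior described for Hive and Spark above can be illustrated with a minimal Python sketch. It is not the JindoSDK implementation: `process_with_prefetch`, `read_stripe`, `process_stripe`, and `depth` are hypothetical names chosen for illustration, and the fake reader stands in for a remote read. The sketch only shows the general pattern of reading upcoming stripes in background threads while the current stripe is being processed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List


def process_with_prefetch(
    stripe_ids: List[int],
    read_stripe: Callable[[int], bytes],
    process_stripe: Callable[[bytes], int],
    depth: int = 2,
) -> Iterator[int]:
    """Yield one result per stripe, in order, while up to `depth`
    later stripes are being read in background threads."""
    with ThreadPoolExecutor(max_workers=depth) as pool:
        pending = []  # futures for reads that are already in flight
        next_idx = 0
        # Prime the pipeline with the first `depth` reads.
        while next_idx < len(stripe_ids) and len(pending) < depth:
            pending.append(pool.submit(read_stripe, stripe_ids[next_idx]))
            next_idx += 1
        while pending:
            data = pending.pop(0).result()  # wait for the current stripe
            # Schedule the next read before computing on the current
            # stripe, so that I/O and computation overlap.
            if next_idx < len(stripe_ids):
                pending.append(pool.submit(read_stripe, stripe_ids[next_idx]))
                next_idx += 1
            yield process_stripe(data)


def fake_read(stripe_id: int) -> bytes:
    # Hypothetical stand-in for fetching a stripe's byte range from storage.
    return bytes([stripe_id]) * 4


def fake_process(data: bytes) -> int:
    # Hypothetical stand-in for decoding and aggregating a stripe.
    return sum(data)


results = list(process_with_prefetch([1, 2, 3, 4], fake_read, fake_process))
print(results)  # [4, 8, 12, 16]
```

A larger `depth` keeps more reads in flight at the cost of more buffered data; the actual prefetch depth and buffer sizes used by EMR are not stated in these release notes.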
Release version information
DataLake cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| Hive | 3.1.3 |
| Spark2 | 2.4.8 |
| Spark3 | 3.5.3 |
| Tez | 0.10.2 |
| Trino | 422 |
| Delta Lake | 3.2.1 |
| Hudi | 0.15.0 |
| Iceberg | 1.5.0 |
| Flume | 1.11.0 |
| Kyuubi | 1.9.2 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| DLF-Auth | 2.0.2 |
| Presto | 0.283 |
| ZooKeeper | 3.8.4 |
| Sqoop | 1.4.7 |
| Knox | 1.5.0 |
| Celeborn | 0.5.2 |
| JindoCache | 6.10.1 |
| Paimon | 1-ali-16.3 |
OLAP clusters
| Service | Version |
|---|---|
| StarRocks2 | 2.5.22 |
| StarRocks3 | 3.2.11 |
| Doris | 2.1.4 |
| ClickHouse | 23.3.13.6 |
| ZooKeeper | 3.8.4 |
DataFlow cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| ZooKeeper | 3.8.4 |
| Knox | 1.5.0 |
| Flink | 1.17.2 |
| Paimon | 1-ali-6.2 |
DataServing cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| ZooKeeper | 3.8.4 |
| Knox | 1.5.0 |
| HBase | 2.6.3 |
| JindoCache | 6.8.2 |
| Phoenix | 5.2.1 |
Custom cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| Hive | 3.1.3 |
| Spark2 | 2.4.8 |
| Spark3 | 3.5.3 |
| Tez | 0.10.2 |
| Trino | 422 |
| Delta Lake | 3.2.1 |
| Hudi | 0.15.0 |
| Iceberg | 1.5.0 |
| Flume | 1.11.0 |
| Kyuubi | 1.9.2 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| DLF-Auth | 2.0.2 |
| Presto | 0.283 |
| StarRocks2 | 2.5.22 |
| StarRocks3 | 3.2.11 |
| ZooKeeper | 3.8.4 |
| Sqoop | 1.4.7 |
| Knox | 1.5.0 |
| Celeborn | 0.5.2 |
| Flink | 1.17.2 |
| HBase | 2.6.3 |
| JindoCache | 6.10.1 |
| Paimon | 1-ali-16.3 |
| Phoenix | 5.2.1 |
EMR-5.20.x
Release date
| Version | Date |
|---|---|
| EMR-5.20.0 | July 10, 2025 |
Updates
| Service | Change |
|---|---|
| Hive | Optimizes the performance of adding fields to partitioned tables. |
| YARN | Optimizes global scheduling performance to prevent certain application behaviors from degrading cluster scheduling performance. |
Release version information
DataLake cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| Hive | 3.1.3 |
| Spark2 | 2.4.8 |
| Spark3 | 3.5.3 |
| Tez | 0.10.2 |
| Trino | 422 |
| Hudi | 0.15.0 |
| Iceberg | 1.5.0 |
| Flume | 1.11.0 |
| Kyuubi | 1.9.2 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| DLF-Auth | 2.0.2 |
| Presto | 0.283 |
| ZooKeeper | 3.8.4 |
| Sqoop | 1.4.7 |
| Knox | 1.5.0 |
| Celeborn | 0.5.2 |
| JindoCache | 6.8.2 |
| Paimon | 1-ali-6.2 |
OLAP clusters
| Service | Version |
|---|---|
| StarRocks2 | 2.5.22 |
| StarRocks3 | 3.2.11 |
| Doris | 2.1.4 |
| ClickHouse | 23.3.13.6 |
| ZooKeeper | 3.8.4 |
DataFlow cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| ZooKeeper | 3.8.4 |
| Knox | 1.5.0 |
| Flink | 1.17.2 |
| Paimon | 1-ali-6.2 |
DataServing cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| ZooKeeper | 3.8.4 |
| Knox | 1.5.0 |
| HBase | 2.6.3 |
| JindoCache | 6.8.2 |
| Phoenix | 5.2.1 |
Custom cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| Hive | 3.1.3 |
| Spark2 | 2.4.8 |
| Spark3 | 3.5.3 |
| Tez | 0.10.2 |
| Trino | 422 |
| Hudi | 0.15.0 |
| Iceberg | 1.5.0 |
| Flume | 1.11.0 |
| Kyuubi | 1.9.2 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| DLF-Auth | 2.0.2 |
| Presto | 0.283 |
| StarRocks2 | 2.5.22 |
| StarRocks3 | 3.2.11 |
| ZooKeeper | 3.8.4 |
| Sqoop | 1.4.7 |
| Knox | 1.5.0 |
| Celeborn | 0.5.2 |
| Flink | 1.17.2 |
| HBase | 2.6.3 |
| JindoCache | 6.8.2 |
| Paimon | 1-ali-6.2 |
| Phoenix | 5.2.1 |
EMR-5.19.x
Release date
| Version | Date |
|---|---|
| EMR-5.19.0 | April 24, 2025 |
Updates
| Service | Change |
|---|---|
| Trino | Fixes an issue where LDAP is unavailable. |
| YARN | Improves resource allocation efficiency through global scheduling optimization. Adds metric monitoring for HTTP services. Fixes open source bug YARN-10213. |
| HBase | Upgraded to version 2.6.3. Changes the default runtime environment to Java 11. Changes the default garbage collector to G1. |
| Phoenix | Upgraded to version 5.2.1. |
| JindoCache | Upgraded to version 6.8.2. |
| StarRocks | Supports creating clusters with decoupled storage and compute. |
| EMRHOOK | Adds support for Spark 3.5. Supports data lineage tracking for Paimon tables. Enhances stability. |
Release version information
DataLake cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| Hive | 3.1.3 |
| Spark2 | 2.4.8 |
| Spark3 | 3.5.3 |
| Tez | 0.10.2 |
| Trino | 422 |
| Hudi | 0.15.0 |
| Iceberg | 1.5.0 |
| Flume | 1.11.0 |
| Kyuubi | 1.9.2 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| DLF-Auth | 2.0.2 |
| Presto | 0.283 |
| ZooKeeper | 3.8.4 |
| Sqoop | 1.4.7 |
| Knox | 1.5.0 |
| Celeborn | 0.5.2 |
| JindoCache | 6.8.2 |
| Paimon | 1-ali-6.2 |
OLAP clusters
| Service | Version |
|---|---|
| StarRocks2 | 2.5.22 |
| StarRocks3 | 3.2.11 |
| Doris | 2.1.4 |
| ClickHouse | 23.3.13.6 |
| ZooKeeper | 3.8.4 |
DataFlow cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| ZooKeeper | 3.8.4 |
| Knox | 1.5.0 |
| Flink | 1.17.2 |
| Paimon | 1-ali-6.2 |
DataServing cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| ZooKeeper | 3.8.4 |
| Knox | 1.5.0 |
| HBase | 2.6.3 |
| JindoCache | 6.8.2 |
| Phoenix | 5.2.1 |
Custom cluster
| Service | Version |
|---|---|
| Hadoop-Common | 3.2.1 |
| HDFS | 3.2.1 |
| OSS-HDFS | 1.0.0 |
| Hive | 3.1.3 |
| Spark2 | 2.4.8 |
| Spark3 | 3.5.3 |
| Tez | 0.10.2 |
| Trino | 422 |
| Hudi | 0.15.0 |
| Iceberg | 1.5.0 |
| Flume | 1.11.0 |
| Kyuubi | 1.9.2 |
| YARN | 3.2.1 |
| OpenLDAP | 2.4.46 |
| Ranger | 2.3.0 |
| Ranger-plugin | 1.0.0 |
| DLF-Auth | 2.0.2 |
| Presto | 0.283 |
| StarRocks2 | 2.5.22 |
| StarRocks3 | 3.2.11 |
| ZooKeeper | 3.8.4 |
| Sqoop | 1.4.7 |
| Knox | 1.5.0 |
| Celeborn | 0.5.2 |
| Flink | 1.17.2 |
| HBase | 2.6.3 |
| JindoCache | 6.8.2 |
| Paimon | 1-ali-6.2 |
| Phoenix | 5.2.1 |
EMR-5.18.x
Release dates
| Version | Date |
|---|---|
| EMR-5.18.1 | December 18, 2024 |
| EMR-5.18.0 (New purchases not supported) | December 4, 2024 |
Updates
| Service | Change |
|---|---|
| Spark3 | Upgraded to version 3.5.3. Fixes a configuration issue that occurs during Spark scale-out. |
| Trino | Fixes an issue where connections fail after LDAP is enabled. |
| ZooKeeper | Supports adding custom configurations. |
| Ranger | Replaces the existing Spark 3 Ranger plugin with the version provided by the open source Kyuubi project. |
| Hudi | Upgraded to version 0.15.0. |
| Celeborn | Upgraded to version 0.5.2. |
| Paimon | Upgraded to version 1.0-ali-1. |
| JindoCache | Upgraded to version 6.5.3. |
| StarRocks3 | Upgraded to version 3.2.11. |
| StarRocks2 | Upgraded to version 2.5.22. |
| Impala | The service is discontinued. Use Presto, Trino, ClickHouse, or StarRocks as an alternative, or install Impala yourself. |
| Kudu | The service is discontinued. |
| Kafka | The service is discontinued. |
| Kafka-Manager | The service is discontinued. |
EMR-5.17.x
Release dates
| Version | Date |
|---|---|
| EMR-5.17.4 | December 18, 2024 |
| EMR-5.17.3 (New purchases not supported) | November 29, 2024 |
| EMR-5.17.2 (New purchases not supported) | August 29, 2024 |
| EMR-5.17.1 (New purchases not supported) | June 21, 2024 |
| EMR-5.17.0 (New purchases not supported) | April 23, 2024 |
Updates
EMR-5.17.4
| Service | Change |
|---|---|
| JindoCache | Upgraded to version 6.5.3. |
| StarRocks2 | Upgraded to version 2.5.22. |
| StarRocks3 | Upgraded to version 3.2.11. |
EMR-5.17.3
| Service | Change |
|---|---|
| JindoSDK | Upgraded to fix a deadlock issue. |
EMR-5.17.2
| Service | Change |
|---|---|
| JindoCache | Upgraded to version 6.5.1. Improves the read and write performance of Distributed Hash Table (DHT). |
| Spark | Fixes an issue where partition folders cannot be deleted. Fixes a Hive package dependency issue to ensure that client operations do not interrupt the connection to metaStoreClient. |
| Trino | Fixes an issue where some modified configurations might be unexpectedly restored during scale-out. Supports querying data on high-security OSS-HDFS. Fixes a service abnormality that occurs after DLF-Auth is enabled. |
| Presto | Supports querying data on high-security OSS-HDFS. |
| HDFS | Fixes an issue where the memory of NameNode and DataNode cannot be modified. |
| YARN | ResourceManager supports sending timeline events in batches to improve processing throughput. Fixes a logic issue in container and resource processing in ResourceManager. |
| ZooKeeper | Fixes an issue where the memory configuration of a node group cannot be modified. Supports refactoring the log configuration file. |
| Impala | Fixes an issue where customer configurations are modified during elastic scaling. |
| Ranger | Supports the new JindoSDK kernel to reduce CPU utilization. |
| Knox | Fixes an issue where component URL access fails when there is only one Master-Extend node. |
| Kafka | Fixes a startup issue with Kafka Connect clusters. |
| StarRocks | Fixes an issue where new BE nodes are not visible after a scale-out. |
| Doris | Upgraded to version 2.1.4. |
| Paimon | Upgraded to version 0.9-ali-7. |
| EMRHOOK | Supports parsing data lineage for MaxCompute tables. |
EMR-5.17.1
| Service | Change |
|---|---|
| Spark | Supports deploying Master-Extend node groups. |
| Paimon | Replaces the Flink dependency from the VVR version with the community version. Supports Data Lake Formation (DLF) Catalog. |
| Knox | Uses JDK 8 for packaging. |
| Flink | Restores the DLF configurations and dependencies that were removed in EMR-5.17.0. |
EMR-5.17.0
| Service | Change |
|---|---|
| Spark | Spark3 upgraded to version 3.4.2. |
| Celeborn | Upgraded to version 0.4.0. |
| Doris | Upgraded to version 2.1.0. |
| StarRocks | StarRocks2 upgraded to version 2.5.18. StarRocks3 upgraded to version 3.2.4. |
| Delta Lake | Upgraded to version 3.0.0. |
| Iceberg | Upgraded to version 1.5.0. |
| ZooKeeper | Upgraded to version 3.8.4. |
| JindoCache | Upgraded to version 6.2.5. |
| Flink | Upgraded to version 1.17.2. |
EMR-5.16.x
Release date
| Version | Date |
|---|---|
| EMR-5.16.0 | February 19, 2024 |
Updates
| Service | Change |
|---|---|
| Hudi | Upgraded to version 0.14.0. |
| Flume | Upgraded to version 1.11.0. |
| Kyuubi | Upgraded to version 1.7.3. |
| Impala | Upgraded to version 4.3.0. |
| Celeborn | Upgraded to version 0.3.2. |
| JindoCache | Upgraded to version 6.2.0. |
| Paimon | Upgraded to version 0.7-ali-1. |
| Kafka | Upgraded to version 3.6.1. |
| StarRocks | StarRocks2 upgraded to version 2.5.13. StarRocks3 upgraded to version 3.1.5. |
| Spark | Fixes the Commons Text vulnerability. |
| Ranger | Fixes vulnerabilities in the Commons Text library. Fixes the path matching permission bypass vulnerability in the Spring Security framework. Fixes the forward/include authentication bypass vulnerability in the Spring Security framework. Fixes the identity authentication bypass vulnerability in a special matching mode in Spring Framework. Supports configuring the interval at which Ranger retrieves and updates user information from the Lightweight Directory Access Protocol (LDAP) server. |
EMR-5.15.x
Release dates
| Version | Date |
|---|---|
| EMR-5.15.1 | November 16, 2023 |
| EMR-5.15.0 (New purchases not supported) | October 27, 2023 |
Updates
| Service | Change |
|---|---|
| JindoCache | Added the service. Version: 6.1.1. |
| JindoData | JindoData can no longer be selected. Use JindoCache for caching and DLF-Auth for authentication. |
| Spark | Removes jdo-related configurations from hive-site.xml. |
| HBase | Adds a configuration item to select the HBase Thrift Server version (v1 or v2). |
| StarRocks | StarRocks2 upgraded to version 2.5.10. |
| Doris | Upgraded to version 1.2.7. |
| Celeborn | Upgraded to version 0.3.1. |
| Paimon | Upgraded to version 0.6-ali-2. |
| ClickHouse | Upgraded to version 23.3.13.6. |
EMR-5.14.x
Release date
| Version | Date |
|---|---|
| EMR-5.14.2 | August 17, 2023 |
Updates
| Service | Change |
|---|---|
| Trino | Fixes an issue where the Paimon connector fails to query Hadoop Distributed File System (HDFS) tables. Fixes an issue where worker monitoring metrics cannot be read. |
| Presto | Upgraded to version 0.283. Fixes an issue where worker monitoring metrics cannot be read. |
| ClickHouse | Grants all permissions to the default user by default. |
| StarRocks | Renames the previous StarRocks version to StarRocks2. Adds StarRocks3 at version 3.1.2. StarRocks3 clusters are created in coupled storage and compute mode by default; decoupled storage and compute mode is not supported. |
| Celeborn | Upgraded to version 0.3.0. |
EMR-5.13.x
Release date
| Version | Date |
|---|---|
| EMR-5.13.0 | August 3, 2023 |
Updates
| Service | Change |
|---|---|
| Hudi | Upgraded to version 0.13.1. |
| Paimon | Upgraded to version 0.5-ali-1. |
| StarRocks | Upgraded to version 2.5.8. |
| JindoData | Upgraded to version 4.6.11. |
| Trino | Upgraded to version 422. The Hudi connector supports querying Merge On Read (MOR) tables. Improves the error message for dynamic UDF loading. |
EMR-5.12.x
Release dates
| Version | Date |
|---|---|
| EMR-5.12.1 | July 13, 2023 |
| EMR-5.12.0 (New purchases not supported) | June 1, 2023 |
Updates
EMR-5.12.1
| Service | Change |
|---|---|
| Spark | Spark History Server supports OSS-HDFS for storage by default. The Spark 3 native engine supports OSS and OSS-HDFS for storage. |
| Hive | Hive warehouse supports OSS-HDFS for storage by default. |
| OSS-HDFS | Added the service. |
| YARN | Supports OSS-HDFS for storage by default. |
| HBase | HBase HFile data supports OSS-HDFS for storage by default. HBase WAL logs support OSS-HDFS for storage. |
EMR-5.12.0
| Service | Change |
|---|---|
| Kyuubi | Upgraded to version 1.7.1. |
| Celeborn | Upgraded to version 0.2.2. |
| Paimon | Flink-Table-Store renamed to Paimon. Upgraded to version 0.4-ali-1. |
| StarRocks | Upgraded to version 2.5.5. |
| Doris | Upgraded to version 1.2.4. |
| ClickHouse | Upgraded to version 23.3.2.37. |
| Trino | Provides a simple event listener by default to obtain audit logs. |
| Phoenix | Supports Hive on Phoenix. |
EMR-5.11.x
Release dates
| Version | Date |
|---|---|
| EMR-5.11.1 | April 3, 2023 |
| EMR-5.11.0 (New purchases not supported) | February 28, 2023 |
Updates
EMR-5.11.1
| Service | Change |
|---|---|
| ClickHouse | Upgraded to version 22.8.14.53. |
| Trino | Adds the odps.properties connector to support queries on MaxCompute. |
| JindoData | Upgraded to version 4.6.5. |
| JindoSDK | Upgraded to version 4.6.5. |
| Flink-Table-Store | Upgraded to version 0.3-ali-2. |
| YARN | Supports Node Labels management. |
EMR-5.11.0
| Service | Change |
|---|---|
| Iceberg | Upgraded to version 1.1.0. |
| Hudi | Upgraded to version 0.12.2. Supports CDC. |
| Delta Lake | Upgraded to version 2.2.0. Supports recording Vacuum operations in the transaction log. |
| Kudu | Upgraded to version 1.16.0. |
| ClickHouse | ZooKeeper must be selected when installing the ClickHouse service. |
| Celeborn | RSS renamed to Celeborn. Version: 0.2.0. |
| Presto | Added the service. Kernel: community Facebook PrestoDB 0.278.3. Default HTTP port: 8889. Default HTTPS port: 7779. |
| StarRocks | Upgraded to version 2.5.1. |
| Doris | Upgraded to version 1.2.1. |
| Kafka-Manager | Upgraded to version 3.0.0.6. |
| Impala | Upgraded to version 4.2.0. |
| OpenLDAP | Upgraded to version 2.4.46. |
| HBase | Supports JDK 11. Supports ThriftServer2. Changes the default value of hbase.block.data.cachecompressed to true. |
| Flink-Table-Store | Added the service. Version based on community version 0.3. |
| JindoData | Upgraded to version 4.6.4. |
EMR-5.10.x
Release date
| Version | Date |
|---|---|
| EMR-5.10.0 | December 1, 2022 |
Updates
| Service | Change |
|---|---|
| Iceberg | Upgraded to version 0.14.1. |
| Flink | Upgraded to Flink 1.15-vvr-6.0.2, corresponding to the community Flink 1.15 major version. |
| Kafka | Supports LDAP user logon authentication and authorization. Supports user group authorization. |
| Trino | EMR Presto renamed to Trino (official community name). Supports Ranger and DLF-Auth. Fixes an issue where connections to worker nodes fail after one-click LDAP enablement. |
| JindoSDK | Upgraded to version 4.6.2. |
| JindoData | Upgraded to version 4.6.2. |
| HBase | Supports Ranger. Fixes an issue where OSS-HDFS cannot be selected as the storage mode when adding the service. |
| YARN | ACLs are enabled by default in high-security mode. |
| StarRocks | Upgraded to version 2.4.1. |
| Doris | Upgraded to version 1.1.5. |
| Hudi | The console supports configuring hudi-defaults.conf. |
| Ranger | Upgraded to version 2.3.0. Supports integration with Trino, YARN, HBase, and Kafka. |
| DLF-Auth | Upgraded to version 2.0.2. Supports Trino and Impala. |
| OpenLDAP | Integrates with the Nslcd component. |
| Kudu | Kudu Tserver can no longer be installed in Task node groups. |
| Spark | Upgraded to version 3.3.1. |
| Tez | Upgraded to version 0.10.2. |
| Kyuubi | Upgraded to version 1.6.0. |
EMR-5.9.x
Release dates
| Version | Date |
|---|---|
| EMR-5.9.1 | November 8, 2022 |
| EMR-5.9.0 (New purchases not supported) | October 14, 2022 |
Updates
EMR-5.9.1
| Service | Change |
|---|---|
| Kerberos | Supports connecting to an external KDC on EMR. |
| Kafka | Adds a configuration item for startup commands, letting you customize startup parameters for the service. |
| JindoData | Upgraded to version 4.6.0. Supports rewriting OSS-HDFS access paths. |
| Flink | Upgraded to version 1.13_vvr_4.0.15. |
| RSS | Upgraded to version 0.1.4. |
EMR-5.9.0
| Service | Change |
|---|---|
| Spark | Upgraded to version 3.3. Supports Kerberos authentication. |
| Hudi | Upgraded to version 0.12.0. Supports Spark 3.3. Supports using a cloud-based MetaStore to host metadata and enabling the acceleration feature. For more information, see Instructions on how to use Hudi MetaStore. |
| Flink | Supports Kerberos authentication. Supports automatic connection with Data Lake Formation (DLF). |
| Iceberg | Upgraded to version 0.14.0. Supports Spark 3.3. Supports Kerberos authentication. |
| JindoData | Upgraded to version 4.5.1. Supports AccessKey-free access to Alibaba Cloud resources. |
| Hadoop-Common and HDFS | Supports Kerberos authentication. Fixes security vulnerability CVE-2022-25168. |
| Knox | Integrates with Ranger. Access the Ranger UI from the Access Links and Ports tab. |
| HBase | Upgraded to version 2.4.9. Supports Kerberos authentication. Supports group configuration. |
| RSS | Upgraded to version 0.1.2. Supports Kerberos authentication. |
| Doris | Upgraded to version 1.1.2. Supports Kerberos authentication. |
| StarRocks | Upgraded to version 2.3.2. Supports Kerberos authentication. |
| Kafka | Upgraded to version 2.13_3.2.1. Supports Kerberos authentication. |
| Delta Lake | Upgraded to version 2.1.0. Supports Spark 3.3. Supports Kerberos authentication. |
| Impala | Supports creating views in DLF. Supports Kerberos authentication. |
| Kudu | Added the component. Version: 1.14.0. |
| YARN, Ranger, Hive, Kyuubi, Tez, ZooKeeper, DLF-Auth, Phoenix, Sqoop, and Presto | Support Kerberos authentication. |
EMR-5.8.x
Release date
| Version | Date |
|---|---|
| EMR-5.8.0 | August 5, 2022 |
Updates
| Service | Change |
|---|---|
| Spark | Supports one-click integration with LDAP. |
| Hive | Supports one-click integration with LDAP. |
| Presto | Upgraded to community version 389, using the standalone Delta Lake and Hudi connectors provided by the community. Note: the Delta Lake connector does not support Time Travel or Z-Order; the Hudi connector does not support querying MOR tables. Supports one-click integration with LDAP. |
| Delta Lake | Integrates with DLF for automated lake table management. Fixes an issue where partition information cannot be automatically synchronized in CTAS scenarios. The optimize and vacuum commands support returning metric information. |
| Hudi | Upgraded to version 0.11.1. |
| Hadoop-Common | Added the component. Resolves the issue where HDFS, YARN, and JindoSDK configurations overwrite each other. |
| YARN | Enhances the elastic scaling feature. |
| Ranger | Supports both Spark 2 and Spark 3. Ranger Usersync supports one-click integration with LDAP. |
| Kafka | Added the component. Version: 2.12-2.4.1. |
| HBase | Added the component. Version: 2.3.4. |
| Phoenix | Added the component. Version: 5.1.2. |
| Doris | Upgraded to version 1.1.1. |
| StarRocks | Upgraded to version 2.3.0. The primary key model supports the complete DELETE WHERE syntax and persistence of the primary key index to reduce memory usage. |
| ClickHouse | Upgraded to version 22.3.8.39. Fixes an out-of-memory issue when reading large files from OSS. |
EMR-5.6.x
Release date
| Version | Date |
|---|---|
| EMR-5.6.0 | April 21, 2022 |
Updates
| Service | Change |
|---|---|
| JindoData | Added the component. Version: 4.3.0. |
| JindoSDK | Upgraded to version 4.3.0. |
| Spark | Upgraded to version 3.2.1. |
| Hive | Fixes a bug where commits are repeated after Speculation is enabled in Tez. |
| Presto | Fixes a bug where the Presto service fails to start after being added to a Hadoop cluster that has already been initialized. |
| Delta Lake | DML supports subqueries. |
| Hudi | Upgraded to version 0.10.1. |
| Iceberg | Upgraded to version 0.13.1. |
| YARN | Adds a feature configuration to restrict ApplicationMasters (AMs) to run only on CORE group nodes. |
| HBase | Fixes a bug in the HBase 2.3.4 kernel. |
| ZooKeeper | Optimizes JVM parameter configurations. |
| Impala | Adapts to JindoSDK 4.3.0. |
| Sqoop | Upgrades the PostgreSQL version. |
| Zeppelin | Fixes an issue where the JDBC Interpreter fails to start. |
| Ranger | The Ranger 1.2.0 Spark plugin supports Delta Lake and Hudi. |
| Flume | Adapts to JindoSDK 4.3.0. |
| Oozie | Upgrades Log4j to version 2.17.2. |
| DLF-Auth | Upgraded to version 2.0.0. |
EMR-5.5.x
Release dates
| Version | Date |
|---|---|
| EMR-5.5.1 | March 25, 2022 |
| EMR-5.5.0 (New purchases not supported) | February 15, 2022 |
Updates
EMR-5.5.1
Only OLAP clusters in the new console support this version.
| Service | Change |
|---|---|
| ClickHouse | Modifies the default values of some parameters. |
| StarRocks | Upgraded to version 2.1.1. |
EMR-5.5.0
| Service | Change |
|---|---|
| SmartData | The component is discontinued. |
| RSS | Upgrades the ESS service to RSS. Enhances features and stability. |
| JindoSDK | Upgrades the architecture to JindoData. EMR integrates with JindoSDK 4.0 for the first time, supporting services such as OSS and OSS-HDFS. |
| Spark | The COUNT DISTINCT function supports IF statements and optimizes the use of CASE WHEN (set spark.sql.optimizer.rewriteConditionalDistinctAggregates to true). Shuffle Hash Join supports fallback to Sort Merge Join (set spark.sql.join.preferSortMergeJoin to false and spark.sql.join.enableShuffledHashJoinFallback to true). Supports automatic merging of small files for non-dynamic partitions (set spark.sql.adaptive.merge.output.small.files.enabled to true). Automatically adjusts concurrency for GroupingSet and Distinct scenarios (set spark.sql.execution.optimizeExpand to true). Optimizes Hive on Spark. Supports Time Travel syntax. Adapts to JindoSDK. |
| Tez | Adapts to JindoSDK. |
| Hive | Optimizes the batch deletion of Hive Jindo. Mitigates the HiveServer2 out-of-memory issue. Optimizes Hive on Spark. Adapts to JindoSDK. |
| Presto | Upgraded to community version 358. Adds MySQL, Iceberg, Hudi, Phoenix, Kudu, and Delta connectors by default and updates the default configurations. Supports data lake analytics. Supports dynamic UDF loading. Adapts to JindoSDK. |
| Delta Lake | Upgraded to version 1.1.0, compatible with Spark 3.2.0. All commercial features migrated to version 1.1.0. Optimizes synchronization of metadata modifications to the metastore. Automatically reports table statistics (dataProfiling) to the metastore. Supports Time Travel syntax. Supports DropPartition SQL syntax. Supports dynamic partition overwrites using SQL. Supports ADD COLUMN operations at specified positions (FIRST and AFTER). Supports and enables dynamic adjustment of file sizes based on table sizes by default. Supports and enables automatic Vacuum by default (concurrent Vacuum also supported). Optimizes the logic for automatic compaction (disabled by default). Adds Z-order syntax and accelerates the Z-order process. |
| Hudi | Upgraded to version 0.10.0. Supports Spark 3.2.0. Supports JindoFS Block mode. |
| HDFS | Adapts to JindoSDK. |
| YARN | Adapts to RSS memory configurations. Adapts to JindoSDK. |
| Flume | Adapts to JindoSDK. |
| Impala | Adapts to JindoSDK. |
| Ranger | Supports Spark 3.2.0. Supports Presto 358. |
| HBase | Fixes issues with default parameters. Fixes an issue with the GC log date format. |
| ClickHouse | Adds HDFS and OSS disk types to support hot and cold data separation (see Use HDFS for hot and cold data separation and Use OSS for hot and cold data separation). In Replicated\*MergeTree scenarios, Zero Copy is supported for OSS, HDFS, and S3 disk types. Optimizes the processing logic when the ClickHouse component is stopped. |
| Iceberg | Upgraded to version 0.13.0. Supports Presto 358. |
| DLF-Auth | Supports Spark 3.2.0. Supports Presto 358. |
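The Spark row above names several configuration switches. As a minimal sketch, the following Python snippet collects those parameter names and values (taken verbatim from the row) and renders them either as `spark-submit --conf` arguments or as `spark-defaults.conf` lines. The helper names `to_submit_args` and `to_defaults_lines` and the script name `job.py` are hypothetical, and whether each switch is available depends on the EMR Spark build in use.

```python
from typing import Dict, List

# Parameter names and values as listed in the EMR-5.5.0 Spark row.
EMR_SPARK_SWITCHES: Dict[str, str] = {
    # COUNT DISTINCT with IF / CASE WHEN optimization
    "spark.sql.optimizer.rewriteConditionalDistinctAggregates": "true",
    # Shuffle Hash Join with fallback to Sort Merge Join
    "spark.sql.join.preferSortMergeJoin": "false",
    "spark.sql.join.enableShuffledHashJoinFallback": "true",
    # Automatic merging of small files for non-dynamic partitions
    "spark.sql.adaptive.merge.output.small.files.enabled": "true",
    # Concurrency adjustment for GroupingSet and Distinct scenarios
    "spark.sql.execution.optimizeExpand": "true",
}


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render settings as repeated `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


def to_defaults_lines(conf: Dict[str, str]) -> List[str]:
    """Render settings as whitespace-separated spark-defaults.conf lines."""
    return [f"{key} {value}" for key, value in conf.items()]


# `job.py` is a placeholder for your own application script.
print(" ".join(["spark-submit", *to_submit_args(EMR_SPARK_SWITCHES), "job.py"]))
print("\n".join(to_defaults_lines(EMR_SPARK_SWITCHES)))
```

Passing the switches per job with `--conf` keeps the change scoped to that job, whereas adding them to `spark-defaults.conf` applies them to every job submitted from that client.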
EMR-5.4.x
Release dates
| Version | Date |
|---|---|
| EMR-5.4.3 | December 2021 |
| EMR-5.4.2 (New purchases not supported) | December 2021 |
| EMR-5.4.1 (New purchases not supported) | November 2021 |
| EMR-5.4.0 (New purchases not supported) | October 2021 |
Updates
EMR-5.4.3
This release fixes the Log4j security vulnerabilities in all related components. For more information, see Vulnerability Announcement — Apache Log4j2 Arbitrary Code Execution Vulnerability.
| Service | Change |
|---|---|
| Presto | Fixes the Log4j vulnerability in the Elasticsearch connector. |
| DLF Metastore | Changes the default setting for Metastore logs from enabled to disabled. Fixes an error caused by an excessively long URI for Metastore gettablestats. |
| Delta Lake | Fixes an issue with synchronizing schema changes to the Metastore. |
| Sqoop | Fixes an issue where precision is lost for the Decimal type when Sqoop imports HCatalog tables. |
EMR-5.4.2
| Service | Change |
|---|---|
| SmartData | Upgraded to version 3.8.0. For more information, see SmartData 3.8.X overview. Supports Kerberos-based and Ranger-based authentication and authorization to manage permissions on data in OSS. |
EMR-5.4.1
| Service | Change |
|---|---|
| SmartData | Upgraded to version 3.7.3. For more information, see Introduction to SmartData 3.7.x. |
| Oozie | Fixes an issue where the Jetty server for Oozie fails to start due to a JAR package conflict in an HA environment. |
| Impala | Fixes a "no such method" error that occurs when querying DLF metadata tables. |
| DLF-Auth | Upgraded to version 1.0.1. |
EMR-5.4.0
| Service | Change |
|---|---|
| SmartData | Upgraded to version 3.7.2. For more information, see Introduction to SmartData 3.7.x. |
| Spark | Upgraded to version 3.1.2. Optimizes Distinct computing performance for Spark SQL when an aggregation operator contains multiple count(distinct case ... when ...) expressions. Fixes an array index out-of-bounds error that occurs when some statistics required by Adaptive Query Execution (AQE) are missing. Fixes errors related to AQE and data caching in specific scenarios. |
| Hive | The batch metadata optimization feature is supported for Hive on JindoFS (Block mode). Disabled by default. |
| Presto | Delta tables support StorageHandler queries. |
| Delta Lake | Upgraded to version 1.0.0. Unifies delta-connectors for Hive 2 and Hive 3. Fixes an error when delta-connectors query multi-level partitioned tables. Supports SQL syntax for DataSkipping, Optimize, and Zorder. Supports synchronizing metadata to the MetaStore. |
| Hudi | Upgraded to version 0.9.0. Fixes the compatibility issue of sql.extension between Delta Lake and Hudi. Supports Spark 3.1.2. |
| HDFS | The default parameter for NameNode reserved capacity is automatically increased to ensure NameNode promptly enters safe mode when disk space is insufficient. |
| Storm | The component is discontinued. |
| Zeppelin | Upgraded to community version 0.10.0. |
| Hue | Fixes an issue where the YARN Job Browser fails to display or stop jobs in some cases. Enables the YARN Job Browser in default configurations. Supports the Presto protocol in default configurations. |
| Druid | Fixes an issue where a node fails to restart after an unexpected server shutdown because a PID file is not deleted. |
| ClickHouse | Updates the default configurations. Supports cluster scale-out. Supports the MetaChecker feature. Supports reading data using the OSS table engine and OSS table function. |
| Iceberg | Upgraded to version 0.12.0-1.0.1. Fixes an error with Hive Runtime dependencies. |
| Knox | Fixes an issue where the first access to the Spark UI fails. |
| DLF-Auth | Added the service at version 1.0.0. Supports configuring Hive or Spark permissions to access DLF. |
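
The Spark entry for EMR-5.4.0 refers to aggregations that contain several count(distinct case ... when ...) expressions. The following is a minimal sketch of that query shape only. It assumes a hypothetical `orders` table with `user_id` and `channel` columns, and it runs on SQLite as a stand-in so that it is self-contained. It does not reproduce or measure the EMR optimization itself.

```python
import sqlite3


def distinct_counts(rows):
    """Count distinct users per channel in a single aggregation.

    `rows` is a list of (user_id, channel) tuples. The table and column
    names are hypothetical and exist only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (user_id INTEGER, channel TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    # Each CASE yields NULL for non-matching rows, and COUNT ignores NULLs,
    # so every expression counts distinct users for one channel only.
    query = """
        SELECT
            COUNT(DISTINCT CASE WHEN channel = 'web' THEN user_id END) AS web_users,
            COUNT(DISTINCT CASE WHEN channel = 'app' THEN user_id END) AS app_users
        FROM orders
    """
    result = conn.execute(query).fetchone()
    conn.close()
    return result
```

A statement of the same shape submitted to Spark SQL is the kind of query that this entry describes.
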
EMR-5.3.x
Release dates
| Version | Date |
|---|---|
| EMR-5.3.1 | September 2021 |
| EMR-5.3.0 (New purchases not supported) | August 2021 |
Updates
EMR-5.3.1
| Service | Change |
|---|---|
| SmartData | Upgraded to version 3.7.1. |
| Hue | Fixed an issue where Impala could not be used in high-security clusters. |
| Kudu | Added support for Kerberos. |
| HBase | Fixed an issue where restarting HBase in high-security clusters took too long. Fixed an issue where the integration of Spark 3.1.1 with HBase failed. Optimized the graceful stop process. |
EMR-5.3.0
| Service | Change |
|---|---|
| SmartData | Upgraded to version 3.7.0. |
| Spark | Fixed a compatibility issue with Delta Lake. |
| Hive | Hive on JindoFS (Block mode) supports the batch metadata optimization feature. Disabled by default. |
| Delta Lake | Added support for the DeltaLake partition feature. Fixed a compatibility issue between the desc detail command and Spark 3.1.1. |
| YARN | Added appId, CPU, and memory resource usage information to the node Containers REST API. Fixed an issue where ApplicationMaster (AM) logs on released Auto Scaling nodes could not be viewed. Fixed an issue where historical data in the State Store caused the cluster to become unavailable. Added support for cleaning up released nodes after they are decommissioned by Auto Scaling. Improved the graceful decommission logic for Auto Scaling — a node is marked as offline only after the NodeManager (NM) process ends. |
| ZooKeeper | Upgraded to community version 3.6.3. |
| Flink | Added the SmartData component. Fixed an issue where password-free access to OSS was not possible when submitting jobs to a DataFlow-Flink cluster via SSH. |
| Impala | Fixed an issue where deleting an OSS partition directory directly caused a directory listing loop. |
| Hue | Fixed a UI display issue that occurred when Hue was integrated with Oozie. |
| Kudu | Upgraded to community version 1.14.0. |
| ClickHouse | The component version is 21.3.13.9. |
| Iceberg | Added the Iceberg component. Version: 0.12.0. |
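
The YARN entry for EMR-5.3.0 adds appId and resource information to the node Containers REST API. The following is a rough sketch of how a client might group such a response by application. It assumes the response keeps the stock Apache Hadoop NodeManager layout (`containers` → `container`), that the new field is exposed under the key `appId`, and it uses the stock `totalMemoryNeededMB` and `totalVCoresNeeded` fields as stand-ins because the release note does not name the new CPU and memory usage fields.

```python
from collections import defaultdict


def summarize_by_app(payload):
    """Aggregate container counts, memory, and vcores per application.

    `payload` is the decoded JSON body of a node Containers REST API
    response. The key "appId" is an assumption based on the release note.
    """
    # The stock API returns {"containers": null} when no containers run.
    containers = (payload.get("containers") or {}).get("container") or []
    totals = defaultdict(lambda: {"containers": 0, "memory_mb": 0, "vcores": 0})
    for container in containers:
        app = container.get("appId", "unknown")
        entry = totals[app]
        entry["containers"] += 1
        entry["memory_mb"] += int(container.get("totalMemoryNeededMB", 0))
        entry["vcores"] += int(container.get("totalVCoresNeeded", 0))
    return dict(totals)
```

In practice the payload would come from an HTTP GET against the NodeManager web address. The sketch leaves that request out so that it stays independent of any cluster.
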
EMR-5.2.x
Release date
| Version | Date |
|---|---|
| EMR-5.2.1 | July 16, 2021 |
Updates
| Service | Change |
|---|---|
| SmartData | Upgraded to version 3.6.1. For more information, see Introduction to SmartData 3.6.x. |
| Hive | Fixed an issue where the show create table command returned incorrect results when DLF metadata was used. Optimized default Hive parameters to improve job performance. Changed the names of configuration items on the hive-env tab to uppercase for ease of use. Fixed a memory leak in HiveServer2 caused by a User-Defined Function (UDF). Improved the error message displayed when writing data to a Hive table if the file system is inconsistent with the MetaStore. |
| HDFS | Added support for the Zstandard (ZSTD) compression format. |
| Delta Lake | Upgraded to version 0.8.0. Added support for Spark 3. |
| Flink | Upgraded to version 1.12-vvr-3.0.2. |
| Hudi | Upgraded to version 0.8.0. Added support for integration with Spark SQL. |
| Spark | Important: Spark 3.1.1 in EMR-5.2.1 is not compatible with Kudu 1.11.1. Supports the Delta Lake and Hudi data lake formats. Supports Remote Shuffle Service. Supports Livy. Optimized the names of configuration items on the spark-defaults tab. Optimized Cost-Based Optimization (CBO), Dynamic Partition Pruning (DPP), and Z-Order; performance is improved by 50% compared with the open source Spark 3 version. Added support for data sources such as Alibaba Cloud Log Service, DataHub, and Message Queue for Apache RocketMQ (ONS). |
| Tez | Optimized default Tez parameters to improve job performance. |
| Ranger | |
| Knox | Added support for the Kudu component. Added support for the HBase component. |
| Kafka | Added support for the Cruise Control component for Kafka cluster balancing. Introduced hot swapping for Kafka disks, so that faulty disks can be replaced without stopping and restarting a broker. Modified the default values of some parameters. |
| Phoenix | Fixed an issue where a "JDBC Driver not found" error was reported when Hive and Spark SQL were used to access Phoenix tables. |
| ESS (EMR Remote Shuffle Service) | Added support for Spark 3. |