E-MapReduce: Release notes for EMR V5.4.X

Last Updated: Apr 27, 2023

This topic provides the release notes for E-MapReduce (EMR) V5.4.X, including release dates, updates, and the versions of the services in each cluster type.

Release dates

Version       Date
EMR V5.4.3    December 2021
EMR V5.4.2    December 2021
EMR V5.4.1    November 2021
EMR V5.4.0    October 2021

Updates

EMR V5.4.3

The Apache Log4j 2 remote code execution (RCE) vulnerability is fixed in all affected components. For more information, see Vulnerability announcement | RCE vulnerability in Apache Log4j 2.

Service

Description

Presto

The Log4j security vulnerability of the Elasticsearch connector is fixed.

DLF Metastore

  • By default, the logging feature for metastores is disabled. In earlier versions, the feature was enabled by default.

  • The error caused by an excessively long getTableStats URI for a metastore is fixed.

Delta Lake

The issue that schema changes fail to be synchronized to a metastore is fixed.

Sqoop

The issue that values of the DECIMAL data type lose precision when you use Sqoop to import data to HCatalog tables is fixed.
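The release notes do not say how the DECIMAL precision loss in Sqoop occurred. As a general illustration only, the following minimal Python sketch shows what such a loss looks like when a high-precision decimal value passes through a 64-bit binary float, and how keeping the value in decimal (or string) form avoids it. The sample value is hypothetical, and the code does not reproduce Sqoop or HCatalog behavior.

```python
from decimal import Decimal

# Hypothetical high-precision value, chosen only for illustration.
original = Decimal("1234567890123456789.123456789012345678")

# Lossy path: route the value through a 64-bit binary float.
via_float = Decimal(float(original))

# Lossless path: keep the value in decimal (string) form end to end.
via_string = Decimal(str(original))

print(original == via_string)  # True: no digits were lost
print(original == via_float)   # False: the float cannot hold all digits
```

After an import, comparing a few source values against the imported values in this way is one simple check that no digits were dropped.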

EMR V5.4.2

Service

Description

SmartData

  • SmartData is updated to 3.8.0. For more information, see SmartData 3.8.X overview.

  • Authentication and authorization based on Kerberos and Ranger can be used to manage permissions on data in OSS.

EMR V5.4.1

Service

Description

SmartData

SmartData is updated to 3.7.3. For more information, see SmartData 3.7.X overview.

Oozie

The issue that the Jetty server of Oozie fails to start in high availability (HA) scenarios because of JAR package conflicts is fixed.

Impala

The issue that a "no such method" error message appears when you query data in DLF metadata tables is fixed.

DLF-Auth

DLF-Auth is updated to 1.0.1.

EMR V5.4.0

Service

Description

SmartData

SmartData is updated to 3.7.2. For more information, see SmartData 3.7.X overview.

Spark

  • Spark is updated to 3.1.2.

  • In Spark 3.x, the performance of DISTINCT computation in Spark SQL is optimized. The optimization is triggered when an aggregation operator contains multiple count(distinct case ... when ...) expressions.

  • The array index out-of-bounds error that is returned when statistics required by Adaptive Query Execution (AQE) are missing is fixed.

  • Errors related to AQE and data caching in specific scenarios are fixed.

Hive

In JindoFS in block storage mode, the metadata of multiple Hive tables can be optimized at the same time. By default, this feature is disabled.

Presto

Storage handlers can be used to query data in Delta tables.

Delta Lake

  • Delta Lake is updated to 1.0.0.

  • The same Delta Lake connectors are used in Hive 2 and Hive 3.

  • The error that is returned when you use Delta Lake connectors to query data from multi-level partitioned tables is fixed.

  • SQL syntax is supported for features such as Data Skipping, Optimize, and Z-ordering.

  • Metadata can be synchronized to a metastore.

Hudi

  • Hudi is updated to 0.9.0.

  • The compatibility issue between Delta Lake and Hudi that is related to sql.extension is fixed.

Note

Spark 3.1.2 is supported.

HDFS

By default, the reserved space of NameNode increases adaptively. This way, NameNode enters safe mode promptly when disk space is insufficient.

Storm

The service is no longer used.

Zeppelin

Zeppelin is updated to 0.10.0.

Hue

  • The issue that YARN Job Browser sometimes cannot present or terminate jobs is fixed.

  • YARN Job Browser is accessible by default.

  • The Presto protocol is supported by default.

Druid

The following issue is fixed: After a server is unexpectedly shut down, the related node fails to restart because a PID file is not deleted.

ClickHouse

  • Some default configurations are updated.

  • Clusters can be scaled out.

  • The MetaChecker feature is supported.

  • Object Storage Service (OSS) table engines and OSS table functions can be used to read data.

Iceberg

  • Iceberg is updated to 0.12.0-1.0.1.

  • Errors related to the Hive runtime dependency are fixed.

Knox

The issue that the first access to the Spark UI fails is fixed.

DLF-Auth

The DLF-Auth service is added. The service version is 1.0.0. You can use it to configure permissions for accessing DLF from Hive or Spark.
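The Spark entry for EMR V5.4.0 says that the Distinct optimization is triggered when one aggregation contains multiple count(distinct case ... when ...) expressions. The following minimal sketch shows that query shape. It uses Python's built-in sqlite3 module only so that the example runs anywhere, so it shows the SQL pattern and not the Spark optimization itself. The events table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical table and data, used only to show the query shape.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, channel TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "web"), (1, "web"), (2, "web"), (2, "app"), (3, "app"), (4, "app")],
)

# One aggregation that contains several COUNT(DISTINCT CASE WHEN ...) expressions.
query = """
SELECT
  COUNT(DISTINCT CASE WHEN channel = 'web' THEN user_id END) AS web_users,
  COUNT(DISTINCT CASE WHEN channel = 'app' THEN user_id END) AS app_users
FROM events
"""
web_users, app_users = conn.execute(query).fetchone()
print(web_users, app_users)
conn.close()
```

On an EMR cluster, the same SELECT statement would be submitted through Spark SQL. Whether the optimization applies to a given query depends on the Spark build that ships with the release.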

Release version information

Hadoop clusters

Service       Version
HDFS          3.2.1
YARN          3.2.1
Hive          3.1.2
Spark         3.1.2
Knox          1.1.0
Tez           0.9.2
Ganglia       3.7.2
Sqoop         1.4.7
SmartData     EMR V5.4.0: 3.7.2
              EMR V5.4.1: 3.7.3
              EMR V5.4.2: 3.8.0
Bigboot
Iceberg       0.12.0
DLF-Auth      1.0.0
Hudi          0.9.0
Delta Lake    1.0.0
OpenLDAP      2.4.44
Hue           4.9.0
HBase         2.3.4
ZooKeeper     3.6.3
Presto        338
Impala        3.4.0
Zeppelin      0.10.0
Flume         1.9.0
Livy          0.7.1
Superset      0.36.0
Ranger        2.1.0
ESS           1.2.0
Alluxio       2.5.0
Kudu          1.14.0
Oozie         5.2.1

Kafka clusters

Service          Version
ZooKeeper        3.6.3
Ganglia          3.7.2
Kafka            2.4.1
Kafka Manager    1.3.3.16
OpenLDAP         2.4.44
Knox             1.1.0
Ranger           2.1.0

ClickHouse clusters

Service       Version
ZooKeeper     3.6.3
Ganglia       3.7.2
ClickHouse    21.3.13.9