E-MapReduce: Release notes for EMR V3.39.X

Last Updated: Apr 27, 2023

This topic provides the release notes for E-MapReduce (EMR) V3.39.X, including the release dates, updates, and the service versions in each cluster type.

Release dates

| Version | Date |
| --- | --- |
| EMR V3.39.2 | March 25, 2022 |
| EMR V3.39.1 | February 15, 2022 |

Updates

EMR V3.39.2

Note

Only OLAP and Dataflow clusters support this version. You can create an OLAP or Dataflow cluster in the new EMR console.

Service

Description

Flink

  • The application performance management (APM) dashboard is optimized. Metrics such as sourceIdleTime are added.

  • Alerting by CloudMonitor is supported.

Kafka

  • SSL and Simple Authentication and Security Layer (SASL) configurations are supported.

  • The default values of some parameters are changed.
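As a rough illustration of what SSL and SASL settings look like on the client side, the following Python sketch renders a set of standard Apache Kafka client properties as the lines of a .properties file. The property keys are standard Kafka client settings. The mechanism (PLAIN), the paths, and the credentials are placeholder assumptions and are not taken from EMR, so check the Kafka service configuration of your cluster for the actual values.

```python
def render_client_properties(props):
    """Render a dict as the key=value lines of a Java-style .properties file."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Hypothetical client-side settings. The keys are standard Apache Kafka client
# properties; every value below is a placeholder, not an EMR default.
sasl_ssl_client = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "ssl.truststore.location": "/path/to/truststore.jks",
    "ssl.truststore.password": "<truststore-password>",
    "sasl.jaas.config": (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        'username="<user>" password="<password>";'
    ),
}

print(render_client_properties(sasl_ssl_client))
```

The rendered text can be saved to a file and passed to a Kafka client or command-line tool that accepts a client configuration file.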

ClickHouse

The default values of some parameters are changed.

EMR V3.39.1

Service

Description

SmartData and Bigboot

The services are no longer used.

RSS

  • EMR Remote Shuffle Service (ESS) is upgraded to Remote Shuffle Service (RSS). For more information, see RSS.

  • The service functionality and stability are enhanced.

JindoSDK

  • The architecture of SmartData is upgraded to JindoData.

  • EMR is integrated with JindoSDK for JindoData 4.0.0 for the first time. JindoData connects to Alibaba Cloud Object Storage Service (OSS) and the Alibaba Cloud OSS-HDFS service.

Spark

  • Hive on Spark is optimized.

  • Spark is adapted to JindoSDK.

Tez

Tez is adapted to JindoSDK.

Hive

Hive is adapted to JindoSDK.

Presto

  • User-defined functions (UDFs) can be dynamically loaded.

  • The for ... as of syntax can be used in the time travel feature to query data in a Delta Lake table.

  • An independent Delta Lake catalog is added. Presto provides default configurations for a Delta connector and supports the Optimize, Z-ordering, and Data Skipping features based on the independent Delta Lake catalog.

  • The issue that prevented data in Merge on Read tables of Hudi from being queried by using a Hudi connector is fixed. Merge on Read tables of Hudi cannot be queried by using a Hive connector.

  • Presto is adapted to JindoSDK.

Delta Lake

  • Metadata management

    • Metadata and partition information are synchronized by using the built-in catalog of Spark instead of an API operation that is called by using the Hive CLI.

    • The statistics on table data are automatically reported to metastores.

  • SQL

    • The syntax of the time travel feature is supported.

    • The DROP PARTITION SQL syntax is supported.

    • The ADD COLUMN statement can be used to add columns to specified locations (FIRST and AFTER).

  • Enhanced table management capabilities

    • The file size can be dynamically adjusted based on the table size. By default, this feature is enabled.

    • The auto-vacuum feature is supported and enabled by default. Concurrent vacuum operations are supported.

    • The logic of automatic compaction is optimized. By default, the automatic compaction feature is disabled.

    • The Z-ordering syntax is added. Z-ordering-based data processing is accelerated.
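Z-ordering, as a general technique, sorts rows by a key that interleaves the bits of several column values, so rows that are close in every column tend to be stored near each other. The sketch below shows that general idea for two non-negative integer columns. It is not the EMR or Delta Lake implementation, and the function name z_order_key and the sample rows are hypothetical.

```python
def z_order_key(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return key


# Hypothetical (x, y) column values for six rows.
rows = [(3, 0), (0, 1), (1, 1), (2, 2), (0, 0), (1, 0)]

# Sorting by the interleaved key clusters rows that are close in both columns.
rows_by_z = sorted(rows, key=lambda row: z_order_key(*row))
print(rows_by_z)
```

Sorting on one column alone would keep only that column's values contiguous, whereas the interleaved key balances locality across both columns.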

Hudi

Hudi is updated to 0.10.0.

HDFS

HDFS is adapted to JindoSDK.

YARN

YARN is adapted to JindoSDK.

Flume

Flume is adapted to JindoSDK.

Flink

  • By default, the lib directory of Flink is uploaded to your HDFS cluster. This way, you can configure the yarn.provided.lib.dirs parameter to use the directory.

  • Flink is adapted to JindoSDK.
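To show how the yarn.provided.lib.dirs parameter is typically passed when a job is submitted, the following Python sketch assembles a Flink command line for YARN application mode. The helper name flink_yarn_submit_args, the JAR path, and the HDFS directory are hypothetical. The source does not state where EMR uploads the lib directory, so replace the placeholder with the actual location in your cluster.

```python
def flink_yarn_submit_args(job_jar, provided_lib_dir):
    """Build the argument list for submitting a Flink job in YARN application mode.

    provided_lib_dir is passed as the yarn.provided.lib.dirs option so that
    YARN can reuse the Flink libraries already stored in that directory
    instead of uploading them again for each submission.
    """
    return [
        "flink", "run-application",
        "-t", "yarn-application",
        f"-Dyarn.provided.lib.dirs={provided_lib_dir}",
        job_jar,
    ]


# Placeholder values: neither path comes from the release notes.
args = flink_yarn_submit_args("./my-job.jar", "hdfs:///path/to/flink/lib")
print(" ".join(args))
```

The list can be handed to subprocess.run on a node where the flink command is available.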

Impala

Impala is adapted to JindoSDK.

Ranger

  • The issue that Spark History Server fails to start is fixed.

  • Ranger is adapted to JindoSDK.

HBase

  • An issue with the default parameter settings is fixed.

  • An issue with the date format of garbage collection (GC) logs is fixed.

  • The restart issue that occurs if an IP address is configured for RegionServer is fixed.

Druid

Druid is adapted to JindoSDK.

ClickHouse

The logic of processing data when the ClickHouse component stops working is optimized.

Iceberg

  • Iceberg is updated to 0.13.0.

  • Default configuration items are hidden to improve user experience.

DLF-Auth

The issue that Spark History Server fails to start is fixed.

StarRocks

  • The service is added in the new EMR console.

  • Version 2.0.1 of the service is released.

Release version information

Note

To view the information about an OLAP cluster, you must log on to the new EMR console.

Hadoop clusters

| Service | Version |
| --- | --- |
| HDFS | 2.8.5 |
| YARN | 2.8.5 |
| Hive | 2.3.9 |
| Spark | 2.4.8 |
| Knox | 1.1.0 |
| Tez | 0.9.2 |
| Ganglia | 3.7.2 |
| Sqoop | 1.4.7 |
| Iceberg | 0.13.0 |
| DLF-Auth | 1.0.4 |
| Hudi | 0.10.0 |
| Delta Lake | 0.6.1 |
| OpenLDAP | 2.4.44 |
| Hue | 4.9.0 |
| JindoSDK | 4.0.0 |
| Spark | 3.2.0 |
| HBase | 1.4.9 |
| ZooKeeper | 3.6.3 |
| Presto | 358 |
| Impala | 3.4.0 |
| Zeppelin | 0.10.2 |
| Flume | 1.9.0 |
| Livy | 0.7.1 |
| Superset | 0.36.0 |
| Ranger | 1.2.0 |
| Phoenix | 4.14.1 |
| RSS | 1.0.0 |
| Alluxio | 2.5.0 |
| Kudu | 1.14.0 |
| Oozie | 5.2.1 |

Druid clusters

| Service | Version |
| --- | --- |
| HDFS | 2.8.5 |
| Druid | 0.20.1 |
| ZooKeeper | 3.6.3 |
| Knox | 1.1.0 |
| Ganglia | 3.7.2 |
| OpenLDAP | 2.4.44 |
| JindoSDK | 4.0.0 |
| YARN | 2.8.5 |
| Superset | 0.36.0 |

Dataflow clusters

| Service | Version |
| --- | --- |
| HDFS | 2.8.5 |
| YARN | 2.8.5 |
| ZooKeeper | 3.6.3 |
| Knox | 1.1.0 |
| Flink | 1.13-vvr-4.0.11 |
| OpenLDAP | 2.4.44 |
| JindoSDK | 4.0.0 |
| Ganglia | 3.7.2 |
| Kafka | 1.1.1 |
| Kafka Manager | 1.3.3.16 |

ClickHouse clusters

| Service | Version |
| --- | --- |
| ZooKeeper | 3.6.3 |
| Ganglia | 3.7.2 |
| ClickHouse | 20.8.12.2 |

Presto clusters

| Service | Version |
| --- | --- |
| Knox | 1.1.0 |
| Presto | 358 |
| Ganglia | 3.7.2 |
| Iceberg | 0.13.0 |
| Hudi | 0.10.0 |
| Delta Lake | 0.6.1 |
| OpenLDAP | 2.4.44 |
| Hue | 4.9.0 |
| JindoSDK | 4.0.0 |
| Alluxio | 2.5.0 |

OLAP clusters

| Service | Version |
| --- | --- |
| ClickHouse | 20.8.12.2.2.17 |
| StarRocks | 2.0.1 |
| ZooKeeper | 3.6.3 |