E-MapReduce: Release notes for EMR V4.8.X

Last Updated: Apr 27, 2023

This topic provides the release notes for E-MapReduce (EMR) V4.8.X: the release date, updates, and release version information.

Release date

March 15, 2021 for EMR V4.8.0

Updates

Service

Description

SmartData

SmartData is updated to 3.4.0.

For more information, see SmartData 3.4.X.

Spark

  • Some default configurations are optimized.

  • Performance is optimized. Window-based top-k queries can be pushed down.

  • The capability of reading data from and writing data to Hive tables in the CSV or JSON format is enhanced.

  • All the column names of a table can be omitted in the ANALYZE statement.

  • LDAP authentication can be enabled or disabled with a click.

  • Spark Beeline is easier to use.
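The note on window-based top-k queries does not show the query shape. Such a query typically ranks rows within each partition, for example with ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...), and keeps only the first k rows per partition. The following is a minimal pure-Python sketch of that result, not of the EMR Spark optimizer itself. The function name, column names, and sample data are hypothetical.

```python
import heapq
from collections import defaultdict


def top_k_per_partition(rows, partition_key, order_key, k):
    """Return the k rows with the largest order_key value in each partition."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)
    # heapq.nlargest keeps at most k candidates per partition instead of
    # fully sorting every partition and discarding the tail afterwards.
    return {
        part: heapq.nlargest(k, members, key=lambda r: r[order_key])
        for part, members in groups.items()
    }


# Hypothetical sample data.
sales = [
    {"region": "east", "item": "a", "amount": 30},
    {"region": "east", "item": "b", "amount": 50},
    {"region": "east", "item": "c", "amount": 10},
    {"region": "west", "item": "d", "amount": 20},
    {"region": "west", "item": "e", "amount": 40},
]

top2 = top_k_per_partition(sales, "region", "amount", k=2)
for region, rows in sorted(top2.items()):
    print(region, [r["item"] for r in rows])
```

Limiting each partition to k rows while the ranking is computed, rather than after it, is the general idea behind pushing a top-k filter down into a window operation.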

Hive

  • Some default configurations are optimized.

  • Performance is optimized. The cost-based optimization (CBO) feature is enhanced.

  • LDAP authentication can be enabled or disabled with a click.

YARN

The security risk of unauthorized access to the YARN web UI from a Hadoop cluster is fixed. If you access the YARN web UI through an SSH tunnel, you no longer need to explicitly specify user.name in the URL.
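To make the URL change concrete, the following sketch builds a YARN web UI address with and without the user.name query parameter. It assumes the SSH tunnel forwards the web UI to localhost on port 8088 (the default ResourceManager web UI port) and that the user is named hadoop. The host, path, user, and helper names are illustrative assumptions, not values taken from these release notes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_user_name(url, user):
    """Return url with a user.name query parameter, replacing any existing one."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "user.name"]
    query.append(("user.name", user))
    return urlunsplit(parts._replace(query=urlencode(query)))


def without_user_name(url):
    """Return url with any user.name query parameter removed."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "user.name"]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Assumed tunnel endpoint; adjust to your own port forwarding.
base = "http://localhost:8088/cluster"

old_style = with_user_name(base, "hadoop")   # parameter spelled out explicitly
new_style = without_user_name(old_style)     # plain URL, no user.name

print(old_style)
print(new_style)
```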

Tez

Some default configurations are optimized.

Ranger

  • The issue caused by filter pushdown in Spark is fixed.

  • The issue that prevents Presto from being enabled after you disable Presto in Ranger is fixed.

  • LDAP authentication can be enabled or disabled with a click.

Hue

LDAP authentication can be enabled or disabled with a click.

Impala

  • Impala is updated to 3.4.0.

  • Shiro is updated to 1.7.0.

  • Metadata stored in Alibaba Cloud Data Lake Formation (DLF) is supported.

  • Data in the Delta format can be queried.

  • LDAP authentication can be enabled or disabled with a click.

  • The exception that occurs when you use the INSERT OVERWRITE statement to overwrite data stored in OSS is fixed.

Hudi

  • SQL statements can be executed to query data in Hudi tables.

  • The issue that causes the query results on some data to be inaccurate is fixed.

  • Partition pruning is supported if you query data in a Copy on Write table of Hudi by using Spark.

  • The bucket-based index mechanism is supported to improve write performance.
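These notes do not describe how the bucket-based index works internally. As a general illustration of the idea behind a hash-bucket index, the sketch below maps each record key deterministically to one of a fixed number of buckets, so a write can locate its target bucket without searching the others. This is not the Hudi or EMR implementation. The class, the function, and the choice of CRC32 as the hash are assumptions made for the example.

```python
import zlib


def bucket_for(record_key, num_buckets):
    """Map a record key to a bucket id with a stable hash (CRC32 here)."""
    return zlib.crc32(record_key.encode("utf-8")) % num_buckets


class BucketIndex:
    """Toy in-memory index: each bucket holds the records whose keys hash to it."""

    def __init__(self, num_buckets):
        self.num_buckets = num_buckets
        self.buckets = [dict() for _ in range(num_buckets)]

    def upsert(self, record_key, value):
        # The target bucket is computed from the key alone; no lookup
        # across other buckets is needed to decide where to write.
        self.buckets[bucket_for(record_key, self.num_buckets)][record_key] = value

    def lookup(self, record_key):
        return self.buckets[bucket_for(record_key, self.num_buckets)].get(record_key)


index = BucketIndex(num_buckets=4)
index.upsert("order-1", {"amount": 10})
index.upsert("order-2", {"amount": 25})
index.upsert("order-1", {"amount": 15})  # update goes to the same bucket as the insert

print(index.lookup("order-1"))
print(index.lookup("order-3"))
```

Because the key-to-bucket mapping is a pure function of the key, an update always lands in the bucket that holds the earlier version of the record.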

Delta Lake

  • The issue that prevents metadata from being synchronized to a Hive metastore based on existing Delta tables is fixed.

  • The issue that prevents the MERGE statement from parsing asterisks (*) in data is fixed.

  • The issue that causes an error to be reported when you convert data in the Parquet format into a Delta table and create table metadata is fixed.

  • The issue that causes the OPTIMIZE command to fail if no files need to be compacted is fixed.

  • A subquery can be used as the source in the MERGE statement.

  • Data can be cached if you use Presto to query data in a Delta table. This improves query efficiency.

  • Impala can be used to query data in Delta tables.

ESS

  • Issues in the shuffle read stage, such as ClosedChannelException, IndexOutOfBoundsException, and excessive off-heap memory usage, are fixed.

  • The issue that causes NullPointerException (NPE) to be reported after metric monitoring is enabled is fixed.

HAS

The issue that prevents the admin.keytab file from being initialized again after a HAS installation error is fixed.

Presto

LDAP authentication can be enabled or disabled with a click.

HBase

  • HBase is updated to 2.2.6.

  • Access control based on Ranger is no longer supported.

Sqoop

Files in the Parquet format can be imported to OSS.

Superset

  • The issue that prevents the admin user from logging on to the web UI is fixed.

  • Datasets are compatible with Druid clusters.

  • Spark SQL datasets are no longer supported.

Knox

  • Access to Presto by using Knox is supported.

  • The issue that causes the Druid web UI to be inaccessible is fixed.

  • The restriction that, in high-security mode, the Ranger web UI can be accessed over HTTP only through Knox is removed.

Release version information

Hadoop clusters

Service         Version
HDFS            3.2.1
YARN            3.2.1
Hive            3.1.2
Spark           2.4.7
Knox            1.1.0
Tez             0.9.2
Ganglia         3.7.2
Sqoop           1.4.7
SmartData       3.4.0
Bigboot         3.4.0
Hudi            0.6.0
OpenLDAP        2.4.44
Hue             4.4.0
HBase           2.3.4
ZooKeeper       3.5.6
Presto          338
Impala          3.4.0
Zeppelin        0.9.0
Flume           1.9.0
Livy            0.6.0
Superset        0.36.0
Ranger          2.1.0
Flink           1.10-vvr-1.0.2
Storm           1.2.2
Alluxio         2.4.1
ESS             1.0.0
Kudu            1.11.1
Oozie           5.1.0

Shuffle Service clusters

Service         Version
ESS             1.0.0

Kafka clusters

Service         Version
ZooKeeper       3.5.6
Ganglia         3.7.2
Kafka           2.4.1
Kafka-Manager   1.3.3.16
OpenLDAP        2.4.44
Knox            1.1.0
Ranger          2.1.0