This topic provides the release notes for E-MapReduce (EMR) V3.24.X, including the release date, new features, and updates.

Release date

November 18, 2019 for EMR V3.24.0

New features

Component Description
Delta
  • The following SQL statements are supported: ALTER, CONVERT, CREATE, CTAS, DELETE, DESC, INSERT, MERGE, OPTIMIZE, UPDATE, and VACUUM.
  • The OPTIMIZE syntax is improved and built into Delta.
  • The Hive connector is supported.
  • The features of open source Delta are supported.
Grafana
  • Grafana 6.4.2 is supported in independent Flink clusters.
Prometheus
  • Prometheus 2.13.0 is supported in independent Flink clusters.
Alertmanager
  • Alertmanager 0.19.0 is supported in independent Flink clusters.
TensorFlow on Spark
  • The TensorFlow framework can be deployed on Spark. The integration of TensorFlow and Spark optimizes task scheduling and data exchange, and supports the entire deep learning process from data preprocessing to model training.
  • Streaming tasks are supported.

Updates

Component Description
SmartData
  • The cache mode of JindoFS is optimized. OssFileSystem access is integrated into the cache mode so that both data and metadata can be cached. Data caching and metadata caching are controlled by separate switches, and both are disabled by default.
  • The read/write performance of the block storage mode and cache mode is optimized.
  • Hot data cached in local disks is measured more accurately and cleared in a timely manner to ensure that disk usage does not exceed the upper limit.
  • Both the block storage mode and cache mode are supported in gateway clusters.
  • A single JindoFS cluster can be deployed separately from multiple computing clusters.
Spark
  • Parameters related to Delta are supported.
  • The Spark plug-in can be configured in Ranger.
  • JindoCube is updated to 0.3.0.
Hive
  • SQL statement compatibility can be checked.
  • Hive 2.3.5 and Hadoop 2.8.5 are released as a combination.
  • When Hive is restarted, the content in hiveserver2-site.xml is not synchronized to hive-site.xml in the spark-conf folder.
  • The MSCK command can be used to add incremental directories.
  • The bug triggered by the reuse of a Tez container in Hive is fixed.
  • The MSCK command can be used to optimize column directories.
Bigboot
  • Bigboot is updated to 2.2.1.
  • Native code can be executed on all server models.
Ranger
  • The deployment mode of the Spark plug-in is modified.
  • The bug that the emr-header-2 node of a high-availability cluster cannot obtain keytab files is fixed.
Kudu
  • The startup logic is fixed.
ZooKeeper
  • Four-letter-word commands can be configured. By default, all four-letter-word commands are enabled.
HDFS
  • HDFS is compatible with JindoFS.
YARN
  • The default value of yarn.scheduler.capacity.node-locality-delay is changed to -1.
  • YARN is compatible with JindoFS.
Has
  • Has is integrated with OpenLDAP.
OpenLDAP
  • OpenLDAP is integrated with Has.
Presto
  • Presto is updated to 0.228.
Kafka
  • Bad disks on instances of the d1 instance family are automatically removed.
Druid
  • Druid is updated to 0.16.0.
Flume
  • Flume is updated to 1.9.0.
Flink
  • Flink is updated to 1.9.1.
  • Independent Flink clusters are supported. An IP address whitelist must be configured for an independent Flink cluster.
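
The ZooKeeper entry above mentions four-letter-word commands. As a minimal sketch of how such a command is issued, the following Python function opens a TCP connection to a ZooKeeper server, sends the four ASCII characters, and reads the reply until the server closes the connection. The function name send_four_letter_word, the host name, and the port are illustrative assumptions and are not taken from these release notes. In open source ZooKeeper, the set of enabled commands is controlled by the 4lw.commands.whitelist property; whether EMR exposes the setting under the same key is an assumption.

```python
import socket


def send_four_letter_word(host: str, port: int, command: str,
                          timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter-word command and return the raw reply.

    Hypothetical helper for illustration; not part of EMR or ZooKeeper.
    """
    if len(command) != 4:
        raise ValueError("a four-letter-word command must be exactly 4 characters")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example call (assumed host name; 2181 is the conventional client port):
#   send_four_letter_word("emr-header-1", 2181, "ruok")
```

A healthy server conventionally answers "ruok" with "imok". If a command has been disabled in the configuration, the server replies with a message saying that the command is not allowed instead of the normal output.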
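
The YARN entry above changes the default value of yarn.scheduler.capacity.node-locality-delay to -1. The following sketch shows one way to check which value a Hadoop-style configuration file carries for that property. The helper name read_hadoop_property and the sample XML are hand-written for illustration and are not copied from an EMR cluster; the property is normally a Capacity Scheduler setting, so capacity-scheduler.xml is the assumed location. In open source Hadoop, a value of -1 is described as turning off locality-delay scheduling; treat that reading as an assumption for this EMR version.

```python
import xml.etree.ElementTree as ET

PROPERTY = "yarn.scheduler.capacity.node-locality-delay"

# Hand-written sample in the standard Hadoop <configuration> layout.
SAMPLE = """<configuration>
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>-1</value>
  </property>
</configuration>"""


def read_hadoop_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


print(read_hadoop_property(SAMPLE, PROPERTY))  # prints -1
```

To inspect a real cluster, the same function can be given the text of the configuration file on a master node; the exact path depends on the deployment and is not stated in these release notes.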