This topic provides the release notes for Data Lake Analytics (DLA) and links to the relevant references.

June 2021

| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Cluster management | Monitoring and alerting | Spark virtual clusters support monitoring and alerting. | View the metrics of the serverless Spark engine |
| Data lake management | Data reading from secondary databases | The DLA lakehouse solution allows you to read data from secondary databases of ApsaraDB RDS and PolarDB for MySQL. | N/A |
| Data lake management | Performance improvement | The DLA lakehouse solution allows you to read data from database tables in parallel during full synchronization. This improves the performance of DLA by 2.5 times. | N/A |
| Data lake management | Time series issue fixing | The DLA lakehouse solution fixes the time series issue that occurs when DLA uses Data Transmission Service (DTS) to perform parallel write operations per second. | N/A |
| DLA Spark | Table data reading across accounts | Spark SQL allows you to use different accounts to read data from tables in the DLA metadata system. | N/A |
| DLA Spark | OSS optimization | OSS optimization is enabled by default. This ensures that the performance of OSS is not affected when you perform a deep copy in OSS. | N/A |
| DLA Spark | Configuration of the maximum number of job failures on Spark executors | DLA allows you to configure the maximum number of job failures on Spark executors. By default, this maximum is twice the number of Spark executors. | N/A |
| DLA Spark | Job retries | Spark jobs support automatic retries to mitigate stability issues caused by the unstable performance of the platform framework. | Configure a Spark job |
| DLA Spark | Monitoring and alerting | Spark jobs support monitoring and alerting. | View the metrics of the serverless Spark engine |
| DLA Presto | Configuration of a path not required for table creation | When you create a table in DLA, you do not need to configure the Location parameter to specify a path. | N/A |
| DLA Presto | Partition projection for better table performance | If you enable partition projection for a table, DLA lists the OSS directories of the table with better performance. | N/A |
| DLA Presto | Fixing of metadata system issues | If an error message is returned when you create a table in the DLA metadata system, the cause can now be identified. | N/A |
| DLA Presto | Fixing of issues on tables for which partition projection is enabled | The following issue is fixed: If partition projection is enabled for a table, no data is found in the table after the INSERT OVERWRITE statement is executed for the table. | N/A |
| DLA Presto | Operator pushdown | Data computations of operators, such as filter, aggregation, and limit operators, can be pushed down to Tablestore. | Use computing pushdown for Tablestore |
| DLA Presto | Parameter control | DLA allows you to control the task_writer_count and task_concurrency parameters. | N/A |
| DLA Presto | Improved read mode | The data read method of AnalyticDB for MySQL 3.0 is changed to streaming. This resolves the issue of high memory usage caused by the non-streaming read mode. | N/A |
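
The partition projection entry above can be illustrated with a minimal Python sketch. It assumes a hypothetical daily `dt=YYYY-MM-DD` directory layout under a made-up OSS prefix. The function name and path template are illustrative only and are not DLA syntax. The sketch shows the general idea: partition locations are computed from a rule instead of being discovered by listing OSS directories.

```python
from datetime import date, timedelta


def projected_partitions(base_path, start, end):
    """Yield (partition_value, location) pairs for each day in [start, end].

    The locations are derived from the date range alone, so no directory
    listing against the object store is needed.
    """
    day = start
    while day <= end:
        value = day.isoformat()
        # Hypothetical layout: <base_path>/dt=<YYYY-MM-DD>/
        yield value, f"{base_path}/dt={value}/"
        day += timedelta(days=1)


# Hypothetical bucket and prefix, used only for illustration.
partitions = list(
    projected_partitions("oss://example-bucket/logs", date(2021, 6, 1), date(2021, 6, 3))
)
for value, location in partitions:
    print(value, location)
```

Because every location is derived from the rule, the cost of resolving partitions depends on the size of the queried range rather than on the number of objects stored under the prefix.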

1.0.0

| Category | Feature | Description |
| --- | --- | --- |
| Data analysis | Analysis of data in OSS files | Data in a single OSS file can be analyzed. In addition, association analysis can be performed for files across different OSS buckets. |
| Data analysis | Writing of analysis results to OSS | The analysis results can be written back to OSS. |
| Data analysis | Analysis of data in Tablestore | Data in Tablestore can be analyzed. |
| Data analysis | Analysis of data in ApsaraDB RDS | Data in ApsaraDB RDS can be analyzed. |
| Data analysis | Analysis of data from multiple data sources | Data from multiple data sources, such as OSS, Tablestore, and ApsaraDB RDS, can be analyzed. |

1.1.0

| Category | Feature | Description |
| --- | --- | --- |
| Core features | PolarDB data source | Alibaba Cloud PolarDB data sources are supported. |
| Core features | Redis Connector | The ApsaraDB for Redis connector is supported. |
| Core features | Data reading from ApsaraDB for MongoDB | Data can be read from ApsaraDB for MongoDB. |
| Core features | Logical view | Logical views are supported. |
| Core features | MySQL 8.0 protocol | The MySQL 8.0 protocol is supported. |
| Core features | OSS data sources | The DDL table creation wizard supports OSS data sources. |
| Core features | Public datasets | Public datasets are supported. |
| Other features | JSON_EXTRACT function | The JSON_EXTRACT function is supported to process data from ApsaraDB for MongoDB. |
| Other features | IP address resolution function | A new IP address resolution function is added. This function translates IP addresses into location information, such as countries, provinces, and cities. |
| Other features | PreparedStatement | PreparedStatement is supported. |
| Other features | OSS API calls | The number of calls to the OSS API is reduced. |
| Other features | Limits on the number of partitions | The number of partitions to which data can be written at a time is limited. |
| Other features | Table and field formats | Table and field names can start with a digit. |
| Other features | ALTER PARTITION | The ALTER PARTITION command is supported. |
| Other features | Logstash | Logstash is supported. |
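
To show what a JSON_EXTRACT-style lookup does with a nested document such as one read from ApsaraDB for MongoDB, here is a minimal Python sketch. It is not DLA's implementation: the helper `json_extract` is a hypothetical stand-in that handles only simple `$.key.subkey` paths over JSON objects, and the sample document is made up.

```python
import json


def json_extract(document, path):
    """Return the value at a simple '$.a.b' path, or None if a key is missing.

    Illustrative only: array indexes, wildcards, and quoted keys are not handled.
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    value = json.loads(document)
    for key in path[1:].split("."):
        if not key:
            # Skip the empty segment produced by the leading '.' after '$'.
            continue
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


# Made-up sample document.
doc = '{"user": {"name": "alice", "address": {"city": "Hangzhou"}}}'
city = json_extract(doc, "$.user.address.city")
missing = json_extract(doc, "$.user.age")
print(city, missing)
```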

1.2.0

| Category | Feature | Description |
| --- | --- | --- |
| Ease of use | Console reconstruction and optimization | The following features in the new version of the DLA console are optimized: overview, account management, and endpoint management. |
| Ease of use | Pop-up window for version release | A pop-up window that describes the updates is displayed each time a new version is released. |
| Ease of use | Optimized process of account management | The account management process is optimized. This helps you manage accounts and passwords and add comments for DLA sub-accounts. |
| Ease of use | New page for SQL interaction | A new page is provided for SQL interaction. This helps you explore data lakes and accelerates SQL interaction. |
| Ease of use | Schema wizard | The schema creation wizard and table creation wizard are developed and optimized. This significantly improves the efficiency of data lake formation, data exploration, and data discovery. |
| Ease of use | GUI-based database and table operations | You can delete tables or databases by using GUI-based operations. |
| Ease of use | Optimized data writing to partitions | The INSERT OVERWRITE SELECT statement can be used to perform extract, transform, load (ETL) operations and write data to the destination partition. This simplifies data cleansing and processing during ETL operations. |
| Deep integration | Integration with data analysis and writing | DLA allows you to analyze data from various data sources and write data to these data sources. The data sources include OSS, Tablestore, AnalyticDB for PostgreSQL, AnalyticDB for MySQL, ApsaraDB RDS for MySQL, ApsaraDB RDS for PostgreSQL, ApsaraDB RDS for SQL Server, self-managed MySQL, PostgreSQL, and SQL Server databases, ApsaraDB for Redis, self-managed Redis databases, ApsaraDB for MongoDB, self-managed MongoDB databases, and PolarDB. OSS data includes more than seven types of structured and semi-structured data and data files in multiple compression formats. |
| Deep integration | Integration with DataWorks | DLA is integrated with DataWorks. This helps you customize data processing procedures in a visualized manner and create big data workflows in the cloud. |
| Deep integration | Integration with Function Compute | DLA is integrated with Function Compute. This helps you create cloud-native serverless workflows based on the serverless Spark and Presto engines. |
| Deep integration | Integration with Message Service (MNS) and Message Queue | DLA is integrated with MNS and Message Queue. This significantly improves the data processing efficiency for DLA and facilitates business integration. |
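
The INSERT OVERWRITE SELECT entry above can be pictured with a small Python simulation. This is a sketch under stated assumptions, not DLA code: it assumes the usual semantics in which the destination partition is replaced by the query result while other partitions stay untouched. The in-memory tables, the `insert_overwrite` and `transform` helpers, and the sample rows are all hypothetical.

```python
def insert_overwrite(table, partition, rows):
    """Replace the contents of one partition; other partitions are untouched."""
    table[partition] = list(rows)


def transform(rows, dt):
    """The 'SELECT' step: keep rows for one day, drop bad amounts, clean names."""
    for row in rows:
        if row["dt"] == dt and row["amount"].isdigit():
            yield {"user": row["user"].strip().lower(), "amount": int(row["amount"])}


# Hypothetical source rows and destination table (partition value -> rows).
raw_events = [
    {"dt": "2021-06-01", "user": " Alice ", "amount": "10"},
    {"dt": "2021-06-01", "user": "bob", "amount": "bad"},
    {"dt": "2021-06-01", "user": "Carol", "amount": "7"},
]
clean_events = {
    "2021-05-31": [{"user": "dave", "amount": 3}],
    "2021-06-01": [{"user": "stale", "amount": 0}],
}

# Cleanse the raw data and overwrite only the 2021-06-01 partition.
insert_overwrite(clean_events, "2021-06-01", transform(raw_events, "2021-06-01"))
print(clean_events)
```

Because the destination partition is replaced rather than appended to, rerunning the same step for the same day does not duplicate rows.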