Data Lake Analytics - Deprecated: What is DLA?

Last Updated: Feb 19, 2024
Important

Data Lake Analytics (DLA) is discontinued. AnalyticDB for MySQL supports the features of DLA and provides additional features and enhanced performance. For more information about how to use AnalyticDB for MySQL, see What is AnalyticDB for MySQL?

DLA is a next-generation big data solution that separates data computing from data storage. DLA can archive messages and database data and build data warehouses in real time. The supported databases include relational databases, PolarDB databases, and NoSQL databases. In addition, DLA provides the serverless Spark and Presto engines to meet the requirements for online interactive search, stream processing, batch processing, and machine learning. DLA can serve as an alternative to traditional Hadoop solutions and eases the transition to cloud-based analytics.

Data sources supported by DLA

For more information about the data sources that are supported by DLA, see Compatibility matrix for data sources and SQL statements.

| Data source | Serverless Presto engine | Serverless Spark engine |
| --- | --- | --- |
| Object Storage Service (OSS) | Supported | Supported |
| ApsaraDB RDS | Supported | Supported |
| PolarDB | Supported | Supported |
| ApsaraDB for HBase | To be supported | Supported |
| ApsaraDB for MongoDB | Supported | To be supported |
| Tablestore | Supported | Supported |
| AnalyticDB for MySQL V2.0 | Supported | Supported |
| AnalyticDB for MySQL V3.0 | Supported | Supported |
| AnalyticDB for PostgreSQL | Supported | Supported |
| MaxCompute | Supported | Supported |
| Elasticsearch | Supported | Supported |
| ApsaraDB for Cassandra | Supported | Supported |
| Kudu | Supported | Supported |
| Self-managed Druid database hosted on an Elastic Compute Service (ECS) instance | Supported | Supported |
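
The support matrix above can also be kept as a small lookup structure, for example to check in a script which engine can read a given data source. The following is a minimal sketch, not part of any DLA SDK: the names `SUPPORT_MATRIX` and `engines_for` are hypothetical, and the values are transcribed from the list above.

```python
# Support status per data source, transcribed from the list above.
# Tuple order: (serverless Presto engine, serverless Spark engine).
# SUPPORT_MATRIX and engines_for are hypothetical names used for illustration only.
SUPPORT_MATRIX = {
    "Object Storage Service (OSS)": ("Supported", "Supported"),
    "ApsaraDB RDS": ("Supported", "Supported"),
    "PolarDB": ("Supported", "Supported"),
    "ApsaraDB for HBase": ("To be supported", "Supported"),
    "ApsaraDB for MongoDB": ("Supported", "To be supported"),
    "Tablestore": ("Supported", "Supported"),
    "AnalyticDB for MySQL V2.0": ("Supported", "Supported"),
    "AnalyticDB for MySQL V3.0": ("Supported", "Supported"),
    "AnalyticDB for PostgreSQL": ("Supported", "Supported"),
    "MaxCompute": ("Supported", "Supported"),
    "Elasticsearch": ("Supported", "Supported"),
    "ApsaraDB for Cassandra": ("Supported", "Supported"),
    "Kudu": ("Supported", "Supported"),
    "Self-managed Druid database hosted on an Elastic Compute Service (ECS) instance": (
        "Supported",
        "Supported",
    ),
}


def engines_for(source):
    """Return the engines whose status for the data source is 'Supported'."""
    presto_status, spark_status = SUPPORT_MATRIX[source]
    engines = []
    if presto_status == "Supported":
        engines.append("Presto")
    if spark_status == "Supported":
        engines.append("Spark")
    return engines


if __name__ == "__main__":
    for name in SUPPORT_MATRIX:
        print(name, "->", ", ".join(engines_for(name)))
```

For example, `engines_for("ApsaraDB for HBase")` returns only the Spark engine, because the Presto engine is listed as "To be supported" for that data source.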

Features

DLA provides an end-to-end cloud-native data lake analytics and computing solution for data that is stored in OSS. DLA offers the following benefits:

  • End-to-end data lake solution: This solution enables efficient data ingestion, extract, transform, and load (ETL) operations, machine learning, and interactive analytics. DLA provides Data Lake Formation (DLF) and serverless Presto and Spark engines.

  • Secure data processing: DLA applies separate security solutions to the tables in databases and to the stored data. This prevents data misuse.

  • Cost-effective data processing: DLA processes data in a serverless, cloud-native manner, which keeps data processing cost-effective.

  • Smooth evolution: DLA ensures a smooth evolution from a Hadoop system to a data lake solution.

Support for serverless Presto and Spark engines

The serverless Presto engine of DLA is built on the open source Presto engine. All computing is performed in memory. The serverless Presto engine delivers a high-performance, interactive analysis experience and returns analysis results in seconds. The serverless Spark engine is compatible with all the API operations provided by Apache Spark.

We recommend that you use the serverless Spark engine of DLA in the following scenarios:

  • You need to write custom code, or SQL statements cannot meet your business requirements.

  • A large amount of data needs to be cleansed. For example, one terabyte to one petabyte of data stored in OSS must be cleansed per day.

  • A wide range of algorithms must be supported. The serverless Spark engine of DLA supports all Spark algorithms.

  • Streaming must be supported.
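
The scenarios above can be read as a simple decision rule. The following is a minimal sketch of that rule, not a DLA API: `Workload`, `recommend_engine`, and the 1 TB threshold are hypothetical and only mirror the bullets above. Falling back to the Presto engine when no Spark scenario applies is an assumption based on its description as the interactive analysis engine.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a job; the fields mirror the scenarios above."""
    needs_custom_code: bool = False       # custom code, or SQL is not sufficient
    daily_cleansing_tb: float = 0.0       # data to cleanse per day, in terabytes
    needs_spark_algorithms: bool = False  # relies on Spark algorithms
    needs_streaming: bool = False         # stream processing is required


# Assumption: "large" starts at 1 TB per day, the lower bound of the
# "one terabyte to one petabyte" example in the text.
LARGE_CLEANSING_TB = 1.0


def recommend_engine(workload):
    """Return (engine, reasons) according to the scenarios listed above."""
    reasons = []
    if workload.needs_custom_code:
        reasons.append("custom code required")
    if workload.daily_cleansing_tb >= LARGE_CLEANSING_TB:
        reasons.append("large daily cleansing volume")
    if workload.needs_spark_algorithms:
        reasons.append("Spark algorithms required")
    if workload.needs_streaming:
        reasons.append("streaming required")
    if reasons:
        return "Spark", reasons
    # Assumption: otherwise use the interactive, in-memory Presto engine.
    return "Presto", ["interactive analysis"]


if __name__ == "__main__":
    print(recommend_engine(Workload(daily_cleansing_tb=5.0, needs_streaming=True)))
    print(recommend_engine(Workload()))
```

A real decision would also weigh factors that this overview does not cover, such as whether the engine supports the data source in question.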