Data Lake Analytics (DLA) is a serverless, high-performance interactive query service designed to draw business insights from multiple data types with zero infrastructure setup and maintenance costs. Whether you explore a single dataset or analyze data across multiple sources simultaneously, DLA delivers fast, interactive responses without requiring you to perform Extract, Transform, and Load (ETL) on any of your data. DLA is fully compatible with SQL syntax and works natively with popular Business Intelligence (BI) tools, so you can easily visualize your data, derive insights, and accelerate decision-making at minimal cost.
Massively parallel processing (MPP) solutions, such as Hadoop or open source Greenplum, have inflexible costs because of their integrated storage and computing frameworks. In contrast, the MPP-based DLA data analysis solution provides flexible storage and computing costs. This allows small and medium-sized enterprises migrating to the cloud to select economical storage and flexible analysis features to suit their business needs. Specifically, key features of DLA include the ability to:
• Quickly query large volumes of key-value and other data stored in Table Store.
• Store task flow data in relational databases including MySQL, SQL Server, and PostgreSQL.
• Enable relational databases to perform complex queries.
• Use union queries to combine multiple data sources from different relational databases, such as MySQL, SQL Server, and PostgreSQL.
• Quickly analyze logs and archived data stored in OSS.
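The union query idea in the list above can be sketched in a few lines. The example below is only an illustration, not DLA itself: it uses Python's built-in sqlite3 module and two attached in-memory databases as stand-ins for two separate relational sources. The schema names source_a and source_b, the orders table, and the sample rows are all hypothetical.

```python
import sqlite3


def build_connection() -> sqlite3.Connection:
    """Create one connection with two attached in-memory databases.

    Each attached database stands in for a separate relational source
    (hypothetical names; this is an analogy, not the DLA API).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS source_a")
    conn.execute("ATTACH DATABASE ':memory:' AS source_b")
    conn.execute("CREATE TABLE source_a.orders (order_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE source_b.orders (order_id INTEGER, amount REAL)")
    # Hypothetical sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO source_a.orders VALUES (?, ?)", [(1, 10.0), (2, 25.5)]
    )
    conn.executemany("INSERT INTO source_b.orders VALUES (?, ?)", [(3, 7.25)])
    return conn


def union_orders(conn: sqlite3.Connection) -> list:
    """Combine the orders tables of both sources with a single SQL query."""
    query = """
        SELECT 'source_a' AS origin, order_id, amount FROM source_a.orders
        UNION ALL
        SELECT 'source_b' AS origin, order_id, amount FROM source_b.orders
        ORDER BY order_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in union_orders(build_connection()):
        print(row)
```

The point of the sketch is that the caller writes one SQL statement and never copies data from one source into the other beforehand, which is the same pattern the feature list describes for MySQL, SQL Server, and PostgreSQL sources.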
DLA is designed for customers who prefer to store all their data in a data lake first, and then analyze it later when it becomes necessary to derive certain business insights. DLA allows customers to query multiple data sources from a simple interface in a timely and cost-efficient manner.
Customers can use DLA to solve multiple business problems. For example, businesses with massive amounts of user activity logs can divide their data into “hot logs” (data with urgent analysis needs) and “cold logs” (data without urgent analysis needs). By separating their data, businesses can save on storage costs while retaining the option to use DLA immediately for ad hoc analysis and gain business insights, such as customer trends, at any given time. Another business problem that DLA helps solve is the issue of multi-source data. With DLA, users can perform joint analysis on multiple data sources such as Object Storage Service (OSS), Table Store, Relational Database Service (RDS), and MongoDB.
Businesses produce many types of data, such as logs, CSV files, and JSON files. This data is then stored in different locations, such as OSS or Table Store, to form a data lake. DLA can unleash analytics capabilities from a data lake without prior preparation or infrastructure setup, and ultimately produce consolidated insights to help businesses improve.
Compared to conventional data analysis solutions, such as Hadoop or open source Greenplum, DLA is considerably more efficient and cost-effective. Traditional solutions require the purchase of node instances, and solve computing and storage bottlenecks through linear expansion of server resources. You pay for all resources, including idle ones, instead of only what you actually use. In contrast, DLA’s serverless architecture and auto-scaling capacity based on Alibaba Cloud Elastic Compute Service (ECS) enable you to pay only for the resources consumed, which significantly reduces data analysis cost. In addition to having no setup or maintenance time and cost constraints, DLA eliminates the cumbersome traditional ETL process required to query data from multiple sources, and can therefore return results within seconds. All this allows small and medium-sized enterprises migrating to the cloud to access economical storage choices and flexible analysis options best suited to their business needs.
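The difference between paying for provisioned nodes and paying only for consumption comes down to simple arithmetic. The following is a minimal sketch under stated assumptions: the function names, the generic "consumption unit", and every number in the usage example are hypothetical and are not taken from DLA's actual pricing, which is described in the pricing overview.

```python
def provisioned_cost(nodes: int, hours: float, price_per_node_hour: float) -> float:
    """Cost of a provisioned cluster: every node is billed for every hour,
    whether it is busy or idle."""
    return nodes * hours * price_per_node_hour


def pay_per_use_cost(units_consumed: float, price_per_unit: float) -> float:
    """Cost of a consumption-based service: only the units actually
    consumed by queries are billed (the unit itself is hypothetical)."""
    return units_consumed * price_per_unit


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate the comparison.
    cluster = provisioned_cost(nodes=4, hours=720, price_per_node_hour=0.5)
    serverless = pay_per_use_cost(units_consumed=200, price_per_unit=2.0)
    print(f"provisioned: {cluster:.2f}, pay-per-use: {serverless:.2f}")
```

In this model the provisioned cost grows with cluster size and elapsed time regardless of workload, while the pay-per-use cost grows only with the work done, so the gap is widest for workloads that leave a cluster idle most of the time.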
For more information on how to get started with DLA, see the quick start guide.
For more information on DLA pricing, see the pricing overview.