Analyses in this article are based on Now Tech: Cloud Data Warehouse, Q1 2018 (Published by Noel Yuhanna, March 13, 2018). The views and opinions expressed herein are those of the author.
On March 13, 2018, Forrester issued the Now Tech: Cloud Data Warehouse Q1 2018 report. In this report, Forrester comprehensively assessed Cloud Data Warehouses (CDWs) in aspects such as main features, regional performance, market segmentation, and customers.
Alibaba Cloud, AWS, Google, and Microsoft were selected as the four global first-tier CDW service providers. Alibaba Cloud DataWorks and MaxCompute are the only products from a Chinese company recognized in the report.
In this report, Forrester highlighted four core CDW features:
Before analyzing DataWorks, we will first take a quick look at its role in the Alibaba Cloud CDW service system and its product architecture.
Among the many Alibaba Cloud products, DataWorks and MaxCompute form the core of its CDW service capabilities. As a storage and computing engine, MaxCompute supports the IaaS layer and provides users with large-scale, reliable big data table storage and SQL execution. However, MaxCompute alone cannot meet all data processing requirements: data development, data integration, and other CDW services are also required before customers can put big data to work. DataWorks provides a relatively complete solution for these needs.
Specifically, DataWorks includes 8 major modules:
The Forrester report explains at length why multiple deployment modes are necessary and compares the CDWs of several service providers. DataWorks is one of the first-tier products that provide multiple deployment modes.
Serving as the core of the Alibaba Group's data middleware system, DataWorks has been used to support business operations in enterprises like Alibaba Group, Ant Financial, and Cainiao since 2009. If you've used data services provided by Taobao, Tmall, Ant Financial, and other companies, you may have indirectly used the computing service provided by DataWorks.
DataWorks also supports private cloud deployment. As a key big data component, it is used in Alibaba Cloud's private cloud solutions, including Apsara Enterprise. Since 2015, DataWorks has supported important enterprise and government projects, including the Alibaba Cloud ET City Brain and "Easy municipal service access".
With flexible deployment modes, DataWorks can meet a wide variety of customer needs. Small enterprises can use public cloud solutions flexibly for services and support, while medium and large enterprises can meet their needs with private cloud or hybrid cloud solutions.
Efficient data integration can significantly ease the migration of enterprise data to the cloud. During the initial migration stage, enterprises need to move their data assets to the cloud quickly and securely; during continuous business operations, they need to load various kinds of data into CDWs and then deliver processed data from CDWs to individual business units.
The Data Integration feature of DataWorks can read from and write to multiple kinds of data sources, including relational databases, NoSQL databases, big data stores, and text storage (FTP). It can also uniformly check data resources across data sources, and synchronize and integrate heterogeneous data sources in complex network environments. For scheduling import tasks, DataWorks supports batch, full, and incremental synchronization of offline data. Users can specify a custom synchronization schedule by minute, hour, day, week, or month.
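The difference between full and incremental synchronization can be illustrated with a minimal Python sketch. It is not the DataWorks API: the function `rows_to_sync`, the `updated_at` column, and the sample `orders` table are all hypothetical, and the sketch assumes each source row carries a modification timestamp that can serve as a watermark.

```python
from datetime import datetime

def rows_to_sync(source_rows, mode, last_sync=None):
    """Return the rows that one synchronization run would copy.

    mode="full" copies every row; mode="incremental" copies only rows
    modified after the previous run's watermark (last_sync).
    """
    if mode == "full":
        return list(source_rows)
    if mode == "incremental":
        if last_sync is None:
            raise ValueError("incremental sync needs the previous watermark")
        return [r for r in source_rows if r["updated_at"] > last_sync]
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical source table with a modification timestamp per row.
orders = [
    {"id": 1, "updated_at": datetime(2018, 3, 1)},
    {"id": 2, "updated_at": datetime(2018, 3, 12)},
    {"id": 3, "updated_at": datetime(2018, 3, 13)},
]

full = rows_to_sync(orders, "full")
delta = rows_to_sync(orders, "incremental", last_sync=datetime(2018, 3, 10))
print(len(full), [r["id"] for r in delta])  # 3 [2, 3]
```

A full run copies all three rows, while the incremental run copies only the two rows changed after the watermark, which is why incremental synchronization is usually preferred for recurring schedules.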
In addition, Data Integration provides data stream control over dirty data handling, data transfer rate, and the number of concurrent threads, which helps users reduce costs and manage synchronization tasks more precisely.
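As a rough illustration of two of these controls, the following Python sketch caps the number of concurrent worker threads and aborts a run when too many dirty records appear. It is a generic sketch under stated assumptions, not DataWorks code: `run_sync`, `parse_record`, `DirtyDataLimitExceeded`, and the record layout are hypothetical names, and transfer-rate limiting is left out for brevity.

```python
from concurrent.futures import ThreadPoolExecutor

class DirtyDataLimitExceeded(Exception):
    """Raised when a run sees more dirty records than allowed."""

def parse_record(raw):
    """Convert one raw record; return None if it is dirty (unparseable)."""
    try:
        return {"id": int(raw["id"]), "amount": float(raw["amount"])}
    except (KeyError, TypeError, ValueError):
        return None

def run_sync(raw_records, max_dirty, concurrency):
    """Parse records on a bounded thread pool and enforce a dirty-data cap."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        parsed = list(pool.map(parse_record, raw_records))
    clean = [p for p in parsed if p is not None]
    dirty = len(parsed) - len(clean)
    if dirty > max_dirty:
        raise DirtyDataLimitExceeded(f"{dirty} dirty records, limit {max_dirty}")
    return clean, dirty

raw = [
    {"id": "1", "amount": "9.5"},
    {"id": "x", "amount": "3.0"},  # dirty: id is not an integer
    {"id": "3", "amount": "12"},
]
clean, dirty = run_sync(raw, max_dirty=1, concurrency=2)
print(len(clean), dirty)  # 2 1
```

With `max_dirty=0` the same input would raise `DirtyDataLimitExceeded`, so a task owner can choose between tolerating a few bad rows and failing fast.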
DataWorks provides powerful data development IDEs and supports visual editing of SQL code, integration tasks, and business flow DAGs. Multi-user online collaboration and task script version management meet the practical needs of enterprise-level data development. In addition to offline task processing, DataWorks provides the lightweight "Analytics Workbench" tool, which uses the computing capacity of MaxCompute to meet users' needs for instant data analysis.
The drag-and-drop business flow editing feature in DataWorks has reportedly been updated recently to further improve the user experience of the data development IDE.
Protecting sensitive data requires strict compliance with industry standards and with data privacy laws and regulations. Security is the top priority of DataWorks. DataWorks provides data security modules and implements all-round data security using the following protection measures:
DataWorks has received a third-level information security certificate issued by the Ministry of Public Security.
As "Internet Plus" is applied more widely across industries, enterprises increasingly need to manage, process, and use their data assets. Internet companies can readily apply their big data processing capability to meet other enterprises' needs. That also explains why these four cloud service providers, instead of long-established data warehouse companies like Oracle and IBM, are listed in the Forrester report as first-tier CDW providers.
Thanks to years of putting data to work at Alibaba Cloud, DataWorks can fully meet enterprise-level requirements in deployment modes, data integration, data analysis, and data security.
DataWorks will reportedly continue to introduce more advanced data management capabilities, including real-time data integration and data asset analysis. It combines cloud computing with data warehouse management methodology to keep innovating and to create "platforms most suitable for big data warehouse development". That is another reason why DataWorks is listed in Forrester's CDW report.
To learn more about the Big Data capabilities of Alibaba Cloud, read the Forrester report on MaxCompute.