Course at a Glance
- Course Status: Live
- Course Code: DEA-C01
- Type: Classroom Training
- Course Duration: 2 Days
- Hands-on Labs: Yes
- Available Language: English
Introduction
A Data Engineer is responsible for designing, building, and maintaining scalable data pipelines on Alibaba Cloud. This role involves developing ETL pipelines to ingest, transform, and process structured and unstructured data using tools such as MaxCompute, DataWorks, and Hologres. The Data Engineer supports data warehousing, real-time analytics, and machine learning workflows while adhering to enterprise standards for security, governance, and cost optimization.
Learning Outcomes
- Big Data Fundamentals
Explain core big data concepts including the 4Vs, distributed storage (e.g., Apsara File System), and differences between data warehouses and data lakes.
- Data Pipeline Development
Develop batch pipelines using MaxCompute and DataWorks, and real-time streaming pipelines with Realtime Compute for Apache Flink and Hologres using SQL.
- Distributed Storage & Processing
Process large-scale batch data with MaxCompute and perform unified real-time analytics using Hologres (storage-compute separation architecture).
- Data Services & Development Tools
Deliver data via APIs using Alibaba Cloud API Gateway and leverage support tools like Data Map and Intelligent Monitoring for data lineage, quality, and observability.
Course Outline
Module 1: Overview of Big Data
- Big Data & Use Cases
- Big Data Architecture
- Big Data on Alibaba Cloud
Module 2: Data Collections on Alibaba Cloud
- Data Sources and Collection Techniques
- Data Collection on Alibaba Cloud
- Demo
Module 3: Distributed Data Storage Systems
- Introduction to Distributed Data Storage Systems
- Distributed Data Storage on Alibaba Cloud
Module 4: Batch Processing on Alibaba Cloud
- Introduction to Data Modeling
- Building a Data Warehouse for Batch Processing
- Best Practices in Batch Processing on Alibaba Cloud
- Demo
Module 5: Real-time Processing on Alibaba Cloud
- Introduction to Real-time Computing
- Apache Flink and Realtime Compute for Apache Flink
- Building a Data Warehouse for Real-time Processing
- Best Practices in Real-time Processing on Alibaba Cloud
- Demo
Module 6: Data Services & Data Development Support Tools
- Data Services
- Data Development Support Tools
- Demo