Alibaba Cloud Certified Associate: Data Engineer
Course at a Glance
  • Course Status
    Live
  • Course Code
    DEA-C01
  • Type
    Classroom Training
  • Course Duration
    2 Days
  • Hands-on Labs
    Yes
  • Available Language
    English
Introduction
A Data Engineer is responsible for designing, building, and maintaining scalable data pipelines on Alibaba Cloud. This role involves developing ETL pipelines to ingest, transform, and process structured and unstructured data using tools such as MaxCompute, DataWorks, and Hologres. The Data Engineer supports data warehousing, real-time analytics, and machine learning workflows while adhering to enterprise standards for security, governance, and cost optimization.
Learning Outcomes
  • Big Data Fundamentals
    Explain core big data concepts including the 4Vs, distributed storage (e.g., Apsara File System), and differences between data warehouses and data lakes.
  • Data Pipeline Development
    Develop batch pipelines using MaxCompute and DataWorks, and real-time streaming pipelines with Realtime Compute for Apache Flink and Hologres using SQL.
  • Distributed Storage & Processing
    Process large-scale batch data with MaxCompute and perform unified real-time analytics using Hologres, which is built on a storage-compute separation architecture.
  • Data Services & Development Tools
    Deliver data via APIs using Alibaba Cloud API Gateway and leverage support tools like Data Map and Intelligent Monitoring for data lineage, quality, and observability.
Course Outline
  • Module 1: Overview of Big Data
    • Big Data & Use Cases
    • Big Data Architecture
    • Big Data on Alibaba Cloud
  • Module 2: Data Collections on Alibaba Cloud
    • Data Sources and Collection Techniques
    • Data Collection on Alibaba Cloud
    • Demo
  • Module 3: Distributed Data Storage Systems
    • Introduction to Distributed Data Storage Systems
    • Distributed Data Storage on Alibaba Cloud
  • Module 4: Batch Processing on Alibaba Cloud
    • Introduction to Data Modeling
    • Building a Data Warehouse for Batch Processing
    • Best Practices in Batch Processing on Alibaba Cloud
    • Demo
  • Module 5: Real-time Processing on Alibaba Cloud
    • Introduction to Real-time Computing
    • Apache Flink and Realtime Compute for Apache Flink
    • Building a Data Warehouse for Real-time Processing
    • Best Practices in Real-time Processing on Alibaba Cloud
    • Demo
  • Module 6: Data Services & Data Development Support Tools
    • Data Services
    • Data Development Support Tools
    • Demo