Alibaba Cloud: The Fundamentals of Big Data

Leverage your skills to the maximum level with the Alibaba Cloud Digital Talent program.

Course Overview

The Digital Talent Scholarship program was initiated by Indonesia's Ministry of Communications and Informatics and is sponsored by Alibaba Cloud Academy. We provide Membership and lab coupons free of charge, to Indonesian participants only, so that you get free access to all the Apsara Clouder courses, certifications, and labs below.

To deepen your understanding of cloud computing in each Stage, we have scheduled one flip virtual class and one hands-on class each week at the following times:
(1) 7PM-9PM JKT on September 20th and 21st,
(2) 7PM-9PM JKT on September 26th and 27th,
(3) 7PM-9PM JKT on October 4th and 5th,
(4) 7PM-9PM JKT on October 11th and 12th.

To ensure course quality, you must finish all the self-training materials in each Stage before the classes.

How To Start Your Big Data Journey?

Step 1

Search "Digital Talent Scholarship Big Data Program - Membership & LabEx" in your email and redeem the Membership and LabEx codes following the instructions.

Step 2

Claim the two courses below by clicking "Learn for Free (Membership)".
1. ACA Big Data Exam Preparation Course
2. Big Data Analysis Specialty

Step 3

Start your learning below by clicking "Start to Learn".
If you encounter any problems in the first two steps, please contact our support at intl_training@list.alibaba-inc.com.

Stage I: Big Data Fundamentals

Learning Module

Course Name

Objective

Course Video

Quiz

Week 1 Onboarding

Onboarding Meeting

In this meeting, you will gain a better understanding of how the DTS program works. You will also learn how to use this self-training learning path and how to obtain Alibaba Cloud Associate (ACA) and Professional (ACP) Big Data Training and Certification for free.

Play Back

Week 1 Big Data Basics

Big Data Basics

Learn the basics of Big Data, including the major differences between traditional databases and common Big Data tools. See how Big Data systems are used to collect, store, and process data, and learn the differences between batch and streaming data processing tools.

Start to learn

Week 1 Big Data Basics

Big Data Basic Concepts

Learn the basics of data analysis on Alibaba Cloud. Familiarize yourself with the steps in the data analysis process, including data accumulation and processing, data analysis and data presentation, as well as visualization, reporting, and value extraction.

Start to learn

Week 1 Big Data Basics

Alibaba Cloud Big Data Products Overview

In this video, we introduce the core components of Alibaba Cloud’s Big Data ecosystem. Learn how tools such as DataWorks, MaxCompute, QuickBI, and DataV fit together, and see how each tool can be used to manage and process large datasets more effectively.

Start to learn

Week 1 Big Data Basics

Scenario in Which Products Would be Used

In this video, we examine some real data processing scenarios and look at how Alibaba Cloud services can be tied together to address a wide range of storage and analysis requirements.

Start to learn

Week 1 DataWorks

DataWorks Overview

This short video introduces the topics which will be covered in the following several video lectures, such as Data Acquisition, Processing, and Quality Monitoring with DataWorks.

Start to learn

Week 1 DataWorks

DataWorks Introduction

In this video lecture, you'll learn what DataWorks is, and how it fits into the Alibaba Cloud Big Data ecosystem. Develop a fuller understanding of the capabilities of DataWorks with analyses of common use cases.

Start to learn

Week 1 DataWorks

Demo of DataWorks - Data Acquisition

Learn how to create a new DataWorks Workspace, and import log data into MaxCompute using DataWorks Data Integration.

Start to learn

Week 1 DataWorks

Demo of DataWorks - Data Processing

Learn how DataWorks can be used to create "Business Flows" (Workflows) for automatic, scheduled data processing. Create a multi-stage data processing workflow and run it from the DataWorks console. See how User Defined Functions (UDFs) can be used to add custom features to MaxCompute SQL.

Start to learn

Week 1 DataWorks

Demo of DataWorks - Data Quality Monitoring

See how DataWorks Data Quality monitoring can be used to ensure consistent data quality each time data import and processing tasks are run. See how Data Quality metrics are selected and applied to MaxCompute tables.

Start to learn

Week 1 DataWorks

DataWorks Introduction Summary and Review

In this video, we quickly summarize the DataWorks concepts covered in the previous few sections.

Start to learn

Week 1 DataWorks

More Details Of DataWorks

Learn more about DataWorks by visiting the DataWorks documentation. Here, you can learn about the specific features offered by different DataWorks editions, as well as watch useful video walkthroughs, access the DataWorks FAQ page, and see specific step-by-step guides to using core features like Operation Center or DataService Studio.

Start to learn Take Exam

Week 1 Big Data Blog

Setting up a MySQL data source

Learn how to create, configure, and load sample e-commerce data into an RDS MySQL database. This database is used in the next section as a data source for a DataWorks project.

Start to learn

Week 1 Big Data Blog

Importing and Processing MySQL data using DataWorks

Learn how to import data from MySQL into MaxCompute using DataWorks, how to process that data by creating a "Business Flow" (Workflow), and how to export your processed data from MaxCompute back into your MySQL database.

Start to learn

Flip Virtual Class

7PM-9PM JKT on September 20th

Learn how DataWorks and MaxCompute can be used to import and process structured data from a traditional relational database.

Play Back

Hands-On Lab Class

7PM-9PM JKT on September 21st

In this hands-on lab, learn how to synchronize data from a MySQL database to MaxCompute using DataWorks Data Integration.

Play Back

Stage II: Data Warehousing and Data Processing

Learning Module

Course Name

Objective

Course Video

Quiz

Week 2 Python Structured Data Processing

Introduction to Python Pandas and business scenarios

Learn how to install the Anaconda Python distribution and open a new Jupyter notebook, which can be used for processing data with Pandas.

Start to learn

Week 2 Python Structured Data Processing

Loading data from different data types

In this video lecture, you'll learn how to import CSV or JSON data into Pandas. The video also demonstrates how to access all or part of the data once it is loaded into Pandas.

Start to learn

Week 2 Python Structured Data Processing

Problems of the raw data

Learn how to identify common issues with raw datasets such as missing (null) values.

Start to learn

Week 2 Python Structured Data Processing

Data scrubbing

Learn how Pandas can be used to clean your raw data by replacing NaNs and NULL values, and by standardizing data types.

Start to learn

Week 2 Python Structured Data Processing

Data analysis and visualization

Learn how to filter, sort, and group data in Pandas, as well as generate plots that help you gain insights by visualizing data distribution, trends, and correlations.

Start to learn

Week 2 Python Structured Data Processing

Deal with more than one data set

In this video lecture, you'll learn how to work with multiple datasets in Pandas at the same time, as well as how to establish connections between datasets (merge and join) and plot data relationships.

Start to learn

Week 2 SQL for Beginners

SQL For Beginners Course Objectives

This video briefly introduces the SQL content discussed in the following sections.

Start to learn

Week 2 SQL for Beginners

Select Statement Basic

Learn how to query datasets with the SQL SELECT statement. See how SELECT can be used to filter, group, and sort data. This section also discusses more advanced usage, such as using the DISTINCT keyword on tables which contain NULL values.

Start to learn Take Exam

Week 2 SQL for Beginners

SELECT statement with WHERE

Learn advanced filtering techniques with SQL SELECT using the WHERE keyword. Learn how WHERE can be used with logical keywords (AND, OR, XOR, NOT) as well as arithmetic operators and comparison operators such as IN, IS NULL, and LIKE.

Start to learn

Week 2 SQL for Beginners

SELECT with ORDER BY and Tips

Learn how to control the order of SQL SELECT results using the ORDER BY keyword. See how ORDER BY can be used to sort results in ascending or descending order.

Start to learn

Week 2 SQL for Beginners

Table Join

Learn how SQL JOIN statements work. See how inner, outer, left, and right joins are used to combine multiple tables in SQL.

Start to learn

Week 2 SQL for Beginners

Troubleshooting

In this section, learn some of the basic skills needed to read and understand SQL errors and warnings.

Start to learn

Week 2 SQL for Beginners

SQL Modes for Syntax Checking

In this video lecture, you'll learn how MySQL databases let you change SQL modes, so that you can take advantage of different SQL features for different use cases.

Start to learn

Week 2 MaxCompute Basic

MaxCompute Course Content Briefing

In this video, we provide a brief outline of the following sections, which focus on Alibaba Cloud's data warehousing tool, MaxCompute.

Start to learn

Week 2 MaxCompute Basic

Introduction of MaxCompute

In this video lecture, you'll learn about MaxCompute, Alibaba Cloud's distributed data storage and processing tool. Learn how MaxCompute works, what it can do, and how it is used at Alibaba Group to store and process petabytes of data.

Start to learn

Week 2 MaxCompute Basic

MaxCompute Architecture

In this video, we provide a brief outline of the following sections, which focus on Alibaba Cloud's data warehousing tool, MaxCompute.

Start to learn

Week 2 MaxCompute Basic

Basic Concepts of MaxCompute

In this video, we explore key concepts in MaxCompute, including Projects, Tables, Partitions, and Resources.

Start to learn

Week 2 MaxCompute Basic

How to Use MaxCompute

In this section, we review the interfaces that allow you to interact with MaxCompute, including the MaxCompute CLI, the MaxCompute Studio IDE plugin, and DataWorks.

Start to learn

Week 2 MaxCompute Basic

Demo: Quick Start Guide of MaxCompute

Learn how to import data from OSS into MaxCompute using DataWorks, and see how the data can be manipulated and modified using the DataWorks console.

Start to learn

Week 2 MaxCompute Security

MaxCompute Users And Roles

Learn how MaxCompute users and roles can be used to control data access privileges within MaxCompute projects.

Start to learn Take Exam

Flip Virtual Class

7PM-9PM JKT on September 26th

Learn how MaxCompute SQL can be used to create and manipulate tables and perform basic operations like table joins.

Play Back

Hands-On Lab Class

7PM-9PM JKT on September 27th

In this blog post, we explore how a data-based API can be quickly and easily published from within DataWorks, without the need to write any code. See how DataWorks DataService Studio enables code-free creation of HTTP APIs.

Play Back
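As a small illustration of the SQL topics listed in Stage II (WHERE with IS NULL, ORDER BY, and table joins), here is a minimal sketch using Python's built-in sqlite3 module. The tables, column names, and sample rows are made up for this example, and SQLite's dialect is only a stand-in: MySQL and MaxCompute SQL differ in some details, so treat this as a way to practice the concepts rather than as course material.

```python
import sqlite3

# In-memory database with two small tables (hypothetical sample data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES
        (1, 'Ayu', 'Jakarta'), (2, 'Budi', NULL), (3, 'Citra', 'Bandung');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 3, 15.5);
""")

# WHERE with IS NULL: find customers whose city is missing.
missing_city = conn.execute(
    "SELECT name FROM customers WHERE city IS NULL"
).fetchall()

# ORDER BY ... DESC: sort orders from the largest amount to the smallest.
sorted_orders = conn.execute(
    "SELECT id, amount FROM orders ORDER BY amount DESC"
).fetchall()

# INNER JOIN keeps only customers that have at least one order.
inner_rows = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# LEFT JOIN keeps every customer; COUNT(o.id) is 0 for those without orders.
left_rows = conn.execute("""
    SELECT c.name, COUNT(o.id)
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(missing_city)
print(sorted_orders)
print(inner_rows)
print(left_rows)
```

Comparing the last two results shows the practical difference between the join types: the customer with no orders disappears from the inner join but is kept, with a count of 0, in the left join.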

Stage III: Advanced Data Processing Tools and Techniques

Learning Module

Course Name

Objective

Course Video

Quiz

Week 3 MaxCompute SQL Development

MaxCompute SQL Overview

Gain a basic understanding of MaxCompute's SQL dialect, and how it differs from common SQL dialects such as MySQL's SQL.

Start to learn

Week 3 MaxCompute SQL Development

Data Definition Language (DDL)

Develop an understanding of the basic MaxCompute SQL DDL operations (creating, deleting, and modifying tables).

Start to learn

Week 3 MaxCompute SQL Development

Data Manipulation Language (DML)

Develop an understanding of MaxCompute's SQL DML operations (selecting and inserting records).

Start to learn

Week 3 MaxCompute SQL Development

Built-In Function (Part 1)

In this video, we review some of MaxCompute SQL's built-in functions, including mathematical functions like MAX, ABS, and RAND.

Start to learn

Week 3 MaxCompute SQL Development

Built-In Function (Part 2)

In this video, we review some of MaxCompute SQL's built-in functions, including time manipulation functions such as GETDATE, DATEPART, and WEEKDAY.

Start to learn

Week 3 MaxCompute SQL Development

Built-In Function (Part 3)

In this video, you will learn how SQL functions are used in general, along with common use-cases for common SQL functions in most SQL dialects.

Start to learn

Week 3 MaxCompute SQL Development

MaxCompute SQL Development Summary And Review

In this video lecture, we review the MaxCompute concepts discussed in the previous sections, with a focus on MaxCompute SQL development.

Start to learn Take Exam

Week 3 MaxCompute User Define Function

Introduction Of UDF

Learn the basics of MaxCompute's User Defined Functions (UDFs), which allow you to add your own new SQL functionality to MaxCompute's SQL language. Learn the differences between the major UDF types: UDF, UDAF, and UDTF.

Start to learn

Week 3 MaxCompute User Define Function

The Implement Logical Of UDF

Learn how to create your own MaxCompute UDFs in Java.

Start to learn

Week 3 MaxCompute User Define Function

UDF Development Process

In this video lecture, we explain the UDF development process in detail, including the steps needed to create and compile a Java UDF in a local IDE.

Start to learn

Week 3 MaxCompute User Define Function

UDF Summary And Review

In this section, we briefly review User Defined Functions, as covered in the previous three sections.

Start to learn Take Exam

Flip Virtual Class

7PM-9PM JKT on October 4th

Learn how MaxCompute SQL can be extended with user-written UDFs for advanced data processing.

Play Back

Hands-On Lab Class

7PM-9PM JKT on October 5th

In this hands-on lab, learn how to develop and deploy your own UDF in MaxCompute using DataWorks, then see how you can develop a new MaxCompute SQL task which uses your UDF to process data in a MaxCompute table.

Play Back
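The three UDF types named in Stage III (UDF, UDAF, and UDTF) are commonly distinguished by how many rows go in and how many come out. The sketch below shows that distinction in plain Python. It does not use the MaxCompute UDF API, which the course covers in Java; every function name and the sample rows are hypothetical and serve only to illustrate the idea.

```python
from typing import Iterable, Iterator, Tuple


def to_upper(value: str) -> str:
    """Scalar UDF style: one input row produces exactly one output value."""
    return value.upper()


def average(values: Iterable[float]) -> float:
    """UDAF style: many input rows are aggregated into a single output value."""
    collected = list(values)
    return sum(collected) / len(collected)


def split_tags(row_id: int, tags: str) -> Iterator[Tuple[int, str]]:
    """UDTF style: one input row can produce several output rows."""
    for tag in tags.split(","):
        yield row_id, tag.strip()


# Hypothetical rows of (id, comma-separated tags, price).
rows = [(1, "shoes, sale", 20.0), (2, "hats", 10.0)]

upper_tags = [to_upper(tags) for _, tags, _ in rows]
mean_price = average(price for _, _, price in rows)
exploded = [pair for row_id, tags, _ in rows for pair in split_tags(row_id, tags)]

print(upper_tags)
print(mean_price)
print(exploded)
```

In SQL terms, the first function behaves like a per-row expression in a SELECT list, the second like an aggregate used with GROUP BY, and the third like a function that expands one row into several.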

Stage IV: Visualization, Machine Learning, and AI

Learning Module

Course Name

Objective

Course Video

Quiz

Week 4 QuickBI

QuickBI Course Content Briefing

This video outlines the topics covered in the following several video lectures, including basic concepts in Alibaba Cloud QuickBI, proper chart selection, and construction of QuickBI dashboards.

Start to learn

Week 4 QuickBI

Background Introduction Of BI

Learn how to choose the right type of charts to display your data, and how different types of charts are best suited to different roles (comparison, relationship, distribution, composition).

Start to learn

Week 4 QuickBI

Features Highlights Of QuickBI

This section introduces QuickBI and explains how QuickBI fits into the Alibaba Cloud Big Data ecosystem.

Start to learn

Week 4 QuickBI

Commonly Used Charts Introduction Part One

In this video lecture, we give an in-depth explanation of different chart types, and examine scenarios in which each chart type is a good (or bad) choice.

Start to learn

Week 4 QuickBI

Demo Of Commonly Used Charts Part One

See a hands-on demo of the QuickBI console, and learn how to construct a basic data dashboard.

Start to learn

Week 4 QuickBI

Commonly Used Charts Introduction Part Two

Learn about additional useful chart types including the funnel chart, tree chart, tree map, and conversion chart.

Start to learn

Week 4 QuickBI

Demo Of Commonly Used Charts Part Two

See a hands-on demo of more advanced charts in QuickBI, such as the conversion chart.

Start to learn

Week 4 QuickBI

Commonly Used Charts Introduction Part Three

In this section, we examine some additional dashboard elements in QuickBI such as the card, word cloud, and map (geo-chart) elements.

Start to learn

Week 4 QuickBI

Demo Of Commonly Used Charts Part Three

See a hands-on demo of some of QuickBI's additional dashboard elements such as the card and word cloud.

Start to learn

Week 4 QuickBI

Summary Of All Charts In Categories

In this video lecture, we summarize all the chart types discussed previously, tying each chart back to its best use-cases.

Start to learn

Week 4 QuickBI

End To End Demo Introduction

Here we introduce the scenario for the end-to-end QuickBI dashboard construction demo carried out in the next video.

Start to learn Take Exam

Week 4 Machine Learning Platform for AI

Introduction Of PAI

Gain a basic understanding of Platform for AI (PAI), Alibaba Cloud's Machine Learning platform. See how a simple PAI environment can be quickly and easily created using the Alibaba Cloud web console.

Start to learn

Week 4 Machine Learning Platform for AI

Quick Start And Architecture Overview

See a demonstration of PAI's low-code Machine Learning tool, PAI Studio. Learn how this interface can be used to prepare data, train and test machine learning models, and evaluate model accuracy.

Start to learn

Week 4 Machine Learning Platform for AI

Demo: Best Practice Of PAI

Learn how PAI's low-code Studio tool can be used to train a machine learning model to detect financial fraud. This end-to-end demo explains how fraud rings are identified and shows the steps needed to train a working machine learning model on the provided sample data.

Start to learn

Week 4 Machine Learning Platform for AI

User Cases

In this section, we take a look at some of PAI's current users, and discuss the ways in which they are utilizing PAI to improve a variety of different business processes.

Start to learn Take Exam

Flip Virtual Class

7PM-9PM JKT on October 11th

See how PAI-DSW can be used to interactively process data and train machine learning models using common open source frameworks like scikit-learn.

Play Back

Hands-On Lab Class

7PM-9PM JKT on October 13th

In this hands-on lab, learn how to predict heart disease using Alibaba Cloud's PAI Studio low-code Machine Learning tool. Learn to clean a dataset, split it into test and training sets, and train a basic machine learning model, all without writing any code.

Play Back

Stage V: Alibaba Cloud Associate Big Data Certification Exam

Learning Module

Exam Name

Certification Introduction

Sign Up

Week 5 ACA Big Data Exam

Alibaba Cloud's Big Data Certification Associate is designed for engineers who can use Alibaba Cloud Big Data products. It covers basic distributed system theory and Alibaba Cloud's core products like MaxCompute, DataWorks, E-MapReduce and ecosystem tools.

COMING SOON

FAQ

1. Why do I have to create a personal Alibaba Cloud account?

Because RAM accounts and team accounts will not work for ANY Alibaba Cloud courses or certification exams.

2. How to create a personal Alibaba Cloud account?

If you do not have your own account, please create one at https://www.alibabacloud.com before registering for the exam. If you run into any issues during the process, please check this document: https://www.alibabacloud.com/help/doc-detail/50482.htm.

3. When will the ACA Big Data Certification Exam be hosted? Will it be provided to everyone for free?

The ACA Big Data Certification Exam will be hosted in the last week of October. Please stay tuned for more schedule details near the last training session. The exam will be free for everyone who passed the pretest.

4. How will I be qualified for the free ACP Big Data Training and Certification Exam?

You will qualify for ACP if you:
1. Score over 75 in the pretest;
2. Complete at least 60% of the course materials in the learning path;
3. Pass all the self-test quizzes in the learning path above;
4. Pass the ACA Big Data exam.

5. When will the ACP Big Data Training and Certification Exam be hosted?

The dates and times for the ACP Big Data Training and Certification Exam are still to be determined, and you will be informed near the end of the ACA Big Data Training.

6. If I don't pass an exam, can I retake it for free?

There will be only one opportunity each for the ACA and ACP Big Data Certification Exams.