Migration to MaxCompute

MaxCompute provides a comprehensive set of data migration solutions and a variety of classical distributed computing models. It lets you store and process large volumes of data while reducing the costs of your enterprise.

Why Migrate to MaxCompute?

MaxCompute is available five minutes after being activated. MaxCompute is deeply integrated with major streaming services on the cloud, making it easy to access streaming data from various data sources.

  • Combined Capabilities of Data Lakes and Data Warehouses

    ▪ MaxCompute features high flexibility and a comprehensive ecosystem like a data lake and provides enterprise-level services like a data warehouse.

    ▪ MaxCompute centralizes storage and unifies metadata using a smart data warehouse.

    ▪ MaxCompute provides a unified experience of data development, management, and governance.

  • Reduced TCO

    ▪ You are charged only for the jobs and storage resources that you use.

    ▪ The total cost of ownership (TCO) of MaxCompute is 90% lower than that of Hadoop Hive.

    ▪ The TCO of MaxCompute is at least 30% lower than that of other cloud data warehouses.

    ▪ MaxCompute does not require platform O&M, which minimizes O&M investments.

  • Modernized Data Warehouse

    ▪ Traditional data warehouses cannot meet enterprise requirements for infrastructure resources.

    ▪ MaxCompute provides ultimate flexibility and improved performance based on the cloud computing architecture.

    ▪ MaxCompute provides enterprise-level capabilities, such as global data development and data governance.

  • Convenient Migration Process

    ▪ The MaxCompute Team provides comprehensive evaluation solutions for data and application migration.

    ▪ The MaxCompute Team provides easy-to-use tools and solutions for data, workflow, and application migration.

    ▪ A migration expert team is available.

Success Stories

A customer of Alibaba Cloud from the financial industry said, "Alibaba Cloud helped us migrate petabytes of data and tens of thousands of tables to the cloud in two weeks using the MaxCompute Migration Assistant and DataWorks Migration Assistant. Alibaba Cloud also helped us reconstruct thousands of core jobs in five business days. After the migration, the expected completion time of tasks advanced by 3 hours, and the performance was improved by 30%."

A customer of Alibaba Cloud from the gaming industry said, "Our self-managed clusters provide extremely restricted outbound access. Alibaba Cloud used MaxCompute Migration Assistant together with Data Transport to replace our network-based transmission. Alibaba Cloud helped us complete the seemingly impossible task of data migration in three business days, making us highly recognized by our business team."

A customer of Alibaba Cloud that manages a maternal and infant community platform said, "We had more than 1 PB of data that needed to be migrated. Among the data, some tables had a size of more than 80 TB. In such extreme scenarios, the performance of MaxCompute Migration Assistant was three times higher than other tools, which dispelled our initial concerns. Alibaba Cloud helped us efficiently, stably, and accurately migrate data to the cloud, allowing us to conduct our business on the cloud."

Overall Process

1. Information Collection

    ▪ Basic Information

    ▪ Migration Cycle and Workload Expectations

2. Solution Evaluation

    ▪ Architecture Mapping

    ▪ Data and Application Migration Evaluation

3. Cost Estimation

    ▪ Specifications and Cloud Resource Quantity

4. Migration

    ▪ Data Migration and Synchronization Verification

    ▪ Application Migration: Tasks and Scheduling

    ▪ Permission Migration: Permission Mapping

5. Verification and Cutover

    ▪ Verification of Destination Warehouse Availability

    ▪ Automatic Routine Optimization

    ▪ Smooth Cutover and Operations

Evaluation and Architecture

General Migration Architecture

Migrate Data From Hadoop

  • Status Quo Survey

    Solution support engineers send customers a questionnaire about the current state of their Hadoop environment. The questionnaire covers the following items:

    1. Cluster scale: storage resources, computing resources, and YARN

    2. Network environment: outbound bandwidth of the internal network of the data center (IDC) and the leased lines that connect the IDC to Alibaba Cloud

    3. Common components, machine specifications, and the architecture of existing data

    4. Tables and jobs in the Hadoop cluster

    5. Post-migration expectations: migration cycle and costs

  • Solution Evaluation

    1. Mapping between the existing architecture and the service architecture of Alibaba Cloud

    2. Mapping between the source data flow diagram and the service architecture of Alibaba Cloud

    3. Data verification solution

    4. Migration evaluation solutions for other jobs, such as UDF, MapReduce, external table, and Spark jobs

    5. Migration process and plan evaluation

  • Cost Estimation

    Services, service description, specifications, and quantity
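
The network environment and migration cycle items in the survey above are linked by simple arithmetic: the data volume divided by the usable bandwidth gives a lower bound on the transfer time. The following is a minimal sketch of that calculation. The function name and the default 70% link utilization are illustrative assumptions, not values from MaxCompute or its migration tools.

```python
def estimate_transfer_days(data_tb: float,
                           bandwidth_gbps: float,
                           utilization: float = 0.7) -> float:
    """Rough number of days needed to move `data_tb` terabytes over a link.

    ASSUMPTIONS: decimal units (1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s) and a
    constant usable share of the link (`utilization`). Real migrations also
    spend time on scanning, retries, and verification.
    """
    if data_tb < 0 or bandwidth_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid data volume, bandwidth, or utilization")
    total_bits = data_tb * 1e12 * 8                    # terabytes -> bits
    usable_bits_per_second = bandwidth_gbps * 1e9 * utilization
    seconds = total_bits / usable_bits_per_second
    return seconds / 86_400                            # seconds -> days


if __name__ == "__main__":
    # 1 PB (1000 TB) over a 10 Gbps leased line at the assumed 70% utilization
    print(f"{estimate_transfer_days(1000, 10):.1f} days")
```

If the estimate exceeds the expected migration cycle, that is a signal to evaluate more bandwidth or an offline transfer option during solution evaluation.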

General Migration Architecture

Migrate Data From Other Clouds

  • Status Quo Survey

    1. Information about storage and computing resources

    2. Common components, machine specifications, and the architecture of existing data

    3. Tables and jobs

    4. Post-migration expectations

  • Solution Evaluation

    1. Mapping between the existing architecture and the service architecture of Alibaba Cloud

    2. Mapping between the source data flow diagram and the service architecture of Alibaba Cloud

    3. Data verification solution

    4. Migration process and plan evaluation

  • Cost Estimation

    Services, service description, specifications, and quantity
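
The data verification solution listed under Solution Evaluation is not described in detail here. One common approach, sketched below under that assumption, compares a row count and an order-independent checksum of each table on the source and the destination. The function names are hypothetical, and the rows are plain Python sequences. In practice the same aggregates would usually be computed by queries on each system.

```python
import hashlib
from typing import Iterable, Sequence, Tuple


def table_fingerprint(rows: Iterable[Sequence[object]]) -> Tuple[int, str]:
    """Return (row_count, checksum) where the checksum ignores row order."""
    count = 0
    combined = 0
    for row in rows:
        # repr() keeps types distinct ("1" vs 1); the unit separator cannot
        # appear inside a repr() result, so field boundaries stay unambiguous.
        encoded = "\x1f".join(repr(value) for value in row).encode("utf-8")
        digest = hashlib.sha256(encoded).digest()
        # Summing digests modulo 2**256 is order-independent and still
        # changes when a row is duplicated or dropped.
        combined = (combined + int.from_bytes(digest, "big")) % (1 << 256)
        count += 1
    return count, f"{combined:064x}"


def tables_match(source_rows: Iterable[Sequence[object]],
                 dest_rows: Iterable[Sequence[object]]) -> bool:
    """True when both sides have the same row count and checksum."""
    return table_fingerprint(source_rows) == table_fingerprint(dest_rows)
```

Because the comparison uses `repr()`, type differences introduced by the migration (for example, an integer column read back as a float) count as mismatches. Whether that is desirable depends on the type mapping agreed on during evaluation.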

General Migration Architecture

Migrate Data From Traditional Data Warehouses

  • Status Quo Survey

    1. Information about storage and computing resources

    2. Common components, machine specifications, and the architecture of existing data

    3. Tables, jobs, and stored procedures

    4. Post-migration expectations

  • Solution Evaluation

    1. Mapping between the existing architecture and the service architecture of Alibaba Cloud

    2. Mapping between the source data flow diagram and the service architecture of Alibaba Cloud

    3. Data verification solution

    4. Migration process and plan evaluation

  • Cost Estimation

    Services, service description, specifications, and quantity
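
A cost estimate built from services, service descriptions, specifications, and quantities can be kept as a simple list of line items. The sketch below shows one way to structure and total such a list. The `unit_price` field and all values passed to it are hypothetical placeholders that are not part of the fields named above, and they are not real Alibaba Cloud prices.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LineItem:
    service: str
    description: str
    specification: str
    quantity: float
    unit_price: float  # ASSUMPTION: placeholder price per unit, not a real price


def estimate_total(items: List[LineItem]) -> float:
    """Sum quantity * unit_price over all line items."""
    return sum(item.quantity * item.unit_price for item in items)


def render_estimate(items: List[LineItem]) -> str:
    """Format the line items and the total as a fixed-width text table."""
    lines = [f"{'Service':<16}{'Specification':<24}{'Qty':>8}{'Subtotal':>12}"]
    for item in items:
        subtotal = item.quantity * item.unit_price
        lines.append(
            f"{item.service:<16}{item.specification:<24}"
            f"{item.quantity:>8g}{subtotal:>12.2f}"
        )
    lines.append(f"{'Total':<48}{estimate_total(items):>12.2f}")
    return "\n".join(lines)
```

Keeping the estimate as data makes it easy to rerun the total when the specifications or quantities change during solution evaluation.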

Tools

MaxCompute Migration Assistant

MaxCompute Migration Assistant provides solutions to migrate data from different data sources to MaxCompute. It is most commonly used to migrate data from Hive to MaxCompute.

DataWorks Data Integration

Data Integration is a stable, efficient, and scalable data synchronization platform. It provides fast and stable data movement and synchronization across a wide range of heterogeneous data sources in complex network environments.

DataX

DataX is a data synchronization tool that is widely used within Alibaba Group. It synchronizes data efficiently between heterogeneous data sources, such as MySQL, Oracle, PostgreSQL, and HDFS.
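
A DataX synchronization task is described by a JSON job file that pairs a reader plugin with a writer plugin. The sketch below builds such a job for a MySQL source and a MaxCompute destination. The helper name and all angle-bracket values are hypothetical placeholders. The parameter keys follow the `mysqlreader` and `odpswriter` plugin conventions, but the required set can differ between DataX versions, so check them against the plugin documentation before use.

```python
import json
from typing import Dict, List


def build_datax_job(jdbc_url: str, source_table: str, columns: List[str],
                    project: str, dest_table: str, channels: int = 3) -> Dict:
    """Build a DataX job description: MySQL reader -> MaxCompute writer."""
    return {
        "job": {
            # Number of concurrent channels used for the transfer
            "setting": {"speed": {"channel": channels}},
            "content": [{
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "<mysql-user>",        # placeholder
                        "password": "<mysql-password>",    # placeholder
                        "column": list(columns),
                        "connection": [{
                            "table": [source_table],
                            "jdbcUrl": [jdbc_url],
                        }],
                    },
                },
                "writer": {
                    "name": "odpswriter",
                    "parameter": {
                        "project": project,
                        "table": dest_table,
                        "column": list(columns),
                        "accessId": "<access-id>",              # placeholder
                        "accessKey": "<access-key>",            # placeholder
                        "odpsServer": "<maxcompute-endpoint>",  # placeholder
                        "truncate": True,  # overwrite existing data in the target
                    },
                },
            }],
        }
    }


if __name__ == "__main__":
    job = build_datax_job("jdbc:mysql://<host>:3306/<db>", "orders",
                          ["id", "amount"], "<project>", "orders")
    print(json.dumps(job, indent=2))
```

The printed JSON can be saved to a file and passed to the DataX launcher script. Credentials are left as placeholders here and should come from a secure source rather than being written into the job file.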

DataWorks Migration Assistant

Migration Assistant is a job migration tool in DataWorks. It can quickly migrate jobs from open source scheduling engines, such as Oozie and Azkaban, to the cloud, and it provides detailed migration reports.
