MaxCompute provides a variety of data upload and download tools. This topic describes the typical scenarios in which you can use the tools to migrate data to MaxCompute.

Migrate Hadoop data

You can use MaxCompute Migration Assist (MMA), Sqoop, or DataWorks to migrate Hadoop data.

  • If you use DataWorks, DataX is required.
  • If you use Sqoop, it runs a MapReduce job on the source Hadoop cluster to transfer data to MaxCompute in a distributed manner. For more information, see Apache Sqoop.

Synchronize data from a database

To synchronize data from a database to MaxCompute, select a tool based on the database type and the synchronization policy (offline or real-time).

  • Use DataWorks to migrate data offline. DataWorks supports a variety of database types, such as MySQL, SQL Server, and PostgreSQL. For more information, see Create a batch sync node. For instance-related operations, see Create a sync node.
  • Use the Oracle GoldenGate (OGG) plug-in to synchronize data from an Oracle database in real time.
  • Use Data Transmission Service (DTS) to synchronize data from an ApsaraDB RDS database in real time.
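
The selection rule in the list above can be sketched as a small lookup. This is only an illustration of the three options named in this topic, not an official API: the function name choose_sync_tool, the BATCH_DATABASES set, and the returned labels are hypothetical, and the set covers only the database types given as examples (DataWorks supports more).

```python
from typing import Optional

# Hypothetical helper: encodes only the options listed in this topic.
# DataWorks supports more database types than the three examples below.
BATCH_DATABASES = {"mysql", "sql server", "postgresql"}


def choose_sync_tool(database: str, realtime: bool) -> Optional[str]:
    """Return the tool suggested in this topic, or None if the
    combination is not covered by the list above."""
    db = database.strip().lower()
    if realtime:
        if db == "oracle":
            return "OGG plug-in"
        if db == "apsaradb rds":
            return "Data Transmission Service (DTS)"
        return None
    if db in BATCH_DATABASES:
        return "DataWorks"
    return None


if __name__ == "__main__":
    print(choose_sync_tool("MySQL", realtime=False))
    print(choose_sync_tool("Oracle", realtime=True))
```

A result of None means only that this topic does not name a tool for that combination; check the documentation of each tool for its full list of supported sources.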

Collect logs

You can use tools such as Flume, Fluentd, and Logstash to collect logs.