MaxCompute provides a variety of tools that you can use to upload and download data. This topic describes the typical scenarios in which you can use the tools to migrate data to MaxCompute.

Migrate Hadoop data

You can migrate Hadoop data by using MaxCompute Migration Assist (MMA), Sqoop, or DataWorks.

  • If you use DataWorks, DataX is required.
  • If you use Sqoop, it runs a MapReduce (MR) job on the source Hadoop cluster to transfer data to MaxCompute in a distributed manner. For more information, see Apache Sqoop.

Synchronize data from a database

To synchronize data from a database to MaxCompute, you must select a tool based on the database type and synchronization policy.

  • Use DataWorks to synchronize data offline in batches. DataWorks supports a variety of database types, such as MySQL, SQL Server, and PostgreSQL. For more information, see Create a batch sync node. For operations on instances, see Create a sync node.
  • Use the Oracle GoldenGate (OGG) plug-in to synchronize data from an Oracle database in real time.
  • Use Data Integration in DataWorks to synchronize data from an ApsaraDB RDS database in real time. For more information, see Configure data sources for data synchronization from MySQL.
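
The selection logic in the list above can be pictured as a lookup keyed by database type and synchronization policy. The following Python snippet is only an illustrative sketch: the names SyncMode and suggest_tool are hypothetical helpers, not part of any MaxCompute or DataWorks SDK, and the table contains only the combinations that this topic names.

```python
from enum import Enum
from typing import Optional


class SyncMode(Enum):
    """Synchronization policy: offline (batch) or real time."""
    OFFLINE = "offline"
    REAL_TIME = "real_time"


# Hypothetical lookup table built only from the options listed in this topic.
# Keys are (lowercase database type, sync mode); values are the tool to use.
_TOOL_BY_SOURCE = {
    ("mysql", SyncMode.OFFLINE): "DataWorks batch sync node",
    ("sql server", SyncMode.OFFLINE): "DataWorks batch sync node",
    ("postgresql", SyncMode.OFFLINE): "DataWorks batch sync node",
    ("oracle", SyncMode.REAL_TIME): "OGG plug-in",
    ("apsaradb rds", SyncMode.REAL_TIME): "Data Integration in DataWorks",
}


def suggest_tool(database_type: str, mode: SyncMode) -> Optional[str]:
    """Return the tool this topic names for the combination, or None."""
    key = (database_type.strip().lower(), mode)
    return _TOOL_BY_SOURCE.get(key)


if __name__ == "__main__":
    print(suggest_tool("Oracle", SyncMode.REAL_TIME))
    print(suggest_tool("MySQL", SyncMode.OFFLINE))
```

A result of None means only that the combination is not listed in this topic. It does not mean that the combination is unsupported.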

Collect logs

To collect logs, you can use tools such as Flume, Fluentd, and Logstash.