The data upload and download tools of the MaxCompute platform support a wide range of cloud data migration scenarios. This document introduces some typical ones.
Hadoop data migration
For Hadoop data migration, use either Sqoop or DataWorks.
- Sqoop runs a MapReduce (MR) job on the source Hadoop cluster, so data is transferred to MaxCompute in a distributed and efficient manner. For more information, see the Sqoop tool introduction.
- DataWorks can be combined with DataX for Hadoop data migration.
Database data synchronization
To synchronize data from a database to MaxCompute, select an appropriate tool based on the database type and the synchronization mode (offline batch or real time).
- For offline batch data synchronization, use DataWorks. It supports a wide range of database types, including MySQL, SQL Server, and PostgreSQL. For more information, see Data synchronization introduction. For step-by-step instructions, see Create a synchronization task.
- For real-time Oracle data synchronization, use the Oracle GoldenGate (OGG) plug-in.
- For real-time RDS data synchronization, use Data Transmission Service (DTS).
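The selection rules in the list above can be sketched as a small lookup. This is only an illustration of the decision logic, not part of any MaxCompute SDK: the function name pick_sync_tool, its parameters, and the source labels "oracle" and "rds" are hypothetical, and the function returns None for combinations this document does not cover.

```python
from typing import Optional

# Real-time tools named in this document, keyed by a hypothetical source label.
REALTIME_TOOLS = {
    "oracle": "OGG plug-in",
    "rds": "DTS",
}


def pick_sync_tool(source: str, realtime: bool) -> Optional[str]:
    """Return the tool this document recommends for a database source.

    source   -- hypothetical label such as "mysql", "oracle", or "rds"
    realtime -- True for real-time synchronization, False for offline batch
    """
    key = source.strip().lower()
    if not realtime:
        # Offline batch synchronization is handled by DataWorks.
        return "DataWorks"
    # Real-time synchronization: only Oracle and RDS are covered here.
    return REALTIME_TOOLS.get(key)


if __name__ == "__main__":
    for src, rt in [("MySQL", False), ("Oracle", True), ("RDS", True)]:
        print(src, "realtime" if rt else "offline", "->", pick_sync_tool(src, rt))
```

A None result means only that this document names no tool for that case; it does not imply that no tool exists.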
Log collection
For log collection, use a tool such as Flume, Fluentd, or Logstash.