
Object Storage Service: Overview

Last Updated: Aug 01, 2023

This topic describes how to migrate data to Object Storage Service (OSS) or OSS-HDFS.

Migrate data to OSS

You can migrate data from local devices, third-party storage devices, or a source OSS bucket to a destination OSS bucket. The following list describes the methods that you can use to migrate data to OSS.

Data Online Migration

Migrate data from third-party storage devices to OSS, or between OSS buckets across accounts, across regions, or in the same region. You do not need to set up an environment for migration tasks. You can submit a migration job online and monitor the migration progress.

ossimport

Migrate historical data to OSS in batches from various sources, including local storage devices, Qiniu Cloud Object Storage (KODO), Baidu Object Storage (BOS), Amazon Simple Storage Service (Amazon S3), Azure Blob Storage, UPYUN Storage Service (USS), Tencent Cloud Object Storage (COS), Kingsoft Standard Storage Service (KS3), HTTP sources, and OSS. The supported sources can be extended based on your business requirements.

References: Use ossimport to migrate data

ossutil

Migrate large amounts of historical data from various sources to OSS in batches.

References: ossutil
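
For example, you can drive ossutil from a script to upload a directory in batches. The following sketch calls ossutil from Python and assumes that ossutil 1.x is installed and already configured with your credentials; the local path, bucket name, and concurrency values are placeholders.

```python
# A hedged sketch: invoke ossutil from Python for a recursive, incremental upload.
# Assumes ossutil 1.x is installed and configured (for example, with `ossutil config`).
import subprocess

def upload_dir_with_ossutil(local_dir: str, oss_uri: str) -> None:
    """Recursively copy a local directory to OSS, skipping objects that are up to date."""
    cmd = [
        "ossutil", "cp", "-r",      # recursive copy
        "-u",                       # copy only files that are newer than the destination objects
        "--jobs", "10",             # number of concurrent copy tasks
        "--parallel", "5",          # parallelism for the multipart upload of each large file
        local_dir, oss_uri,
    ]
    subprocess.run(cmd, check=True)  # raise an error if ossutil exits with a non-zero code

if __name__ == "__main__":
    upload_dir_with_ossutil("/data/history/", "oss://examplebucket/history/")
```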

Mirroring-based back-to-origin

Seamlessly migrate data from an origin to OSS. You can migrate your business from a self-managed origin or from another cloud service to OSS without service interruption. For example, after ossimport migrates the historical data and your business starts to run on OSS, any request for data that has not been migrated triggers mirroring-based back-to-origin, which retrieves the data from the origin and stores it in OSS. This ensures business continuity during the migration.

References: Overview
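
A mirroring-based back-to-origin rule can also be created programmatically. The following sketch uses the Python SDK (oss2) to add a rule that is triggered by HTTP 404 responses; the endpoint, bucket name, and origin URL are placeholders, and the exact constructor parameters may differ across oss2 versions, so verify them against the SDK version that you use.

```python
# A hedged sketch: configure a mirroring-based back-to-origin rule with the Python SDK (oss2).
# The endpoint, bucket name, and origin URL are placeholders.
import oss2
from oss2.models import (BucketWebsite, Condition, Redirect, RoutingRule,
                         REDIRECT_TYPE_MIRROR)

auth = oss2.Auth("<ACCESS_KEY_ID>", "<ACCESS_KEY_SECRET>")
bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "examplebucket")

# Trigger back-to-origin when a requested object does not exist (HTTP 404).
condition = Condition(http_err_code_return_equals=404)
# Fetch the missing object from the origin and store a copy in the bucket.
redirect = Redirect(redirect_type=REDIRECT_TYPE_MIRROR,
                    mirror_url="https://origin.example.com/")
rule = RoutingRule(rule_num=1, condition=condition, redirect=redirect)

# Back-to-origin rules are attached through the static website configuration of the bucket.
bucket.put_bucket_website(BucketWebsite("index.html", "error.html", [rule]))
```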

Cross-region replication (CRR)

Replicate objects from one OSS bucket to another OSS bucket in a different region.

Note
  • You can specify a prefix to filter the objects that you want to replicate. After you specify the prefix, only objects whose names start with the prefix are replicated to the destination bucket.

  • Cold Archive or Deep Cold Archive objects in the source bucket cannot be replicated to the destination bucket.

References: Overview
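
A cross-region replication rule can also be created programmatically. The following sketch uses the Python SDK (oss2) and assumes oss2 2.14 or later; the bucket names, regions, and prefix are placeholders.

```python
# A hedged sketch: enable cross-region replication with the Python SDK (oss2).
# Assumes oss2 2.14 or later; bucket names, regions, and the prefix are placeholders.
import oss2
from oss2.models import ReplicationRule

auth = oss2.Auth("<ACCESS_KEY_ID>", "<ACCESS_KEY_SECRET>")
# Source bucket in the China (Hangzhou) region.
bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "srcexamplebucket")

# Replicate only objects whose names start with "log/" to a bucket in another region.
rule = ReplicationRule(
    prefix_list=["log/"],                     # optional prefix filter
    action_list=[ReplicationRule.ALL],        # replicate all supported operations
    target_bucket_name="destexamplebucket",
    target_bucket_location="oss-cn-beijing",  # region of the destination bucket
)
bucket.put_bucket_replication(rule)
```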

Data Transport

Migrate terabytes to petabytes of data from a local data center to OSS.

References: What is Data Transport?

OSS API or OSS SDK

Use the OSS API or an OSS SDK to programmatically migrate data to OSS. This method is especially suitable for developers.
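
For example, the following sketch uses the Python SDK (oss2) to upload a small object and to perform a resumable upload of a large object; the endpoint, bucket name, object keys, and local file paths are placeholders.

```python
# A minimal sketch of programmatic migration with the Python SDK (oss2).
# The endpoint, bucket name, object keys, and local paths are placeholders.
import oss2

auth = oss2.Auth("<ACCESS_KEY_ID>", "<ACCESS_KEY_SECRET>")
bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "examplebucket")

# Simple upload of a single local file.
bucket.put_object_from_file("backup/2023/data.csv", "/local/path/data.csv")

# Resumable multipart upload for a large file; the upload can resume after an interruption.
oss2.resumable_upload(
    bucket,
    "backup/2023/large.bin",
    "/local/path/large.bin",
    multipart_threshold=100 * 1024 * 1024,  # switch to multipart upload above 100 MB
    part_size=10 * 1024 * 1024,             # 10 MB parts
    num_threads=4,                          # concurrent part uploads
)
```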

OSS external tables (gpossext)

Use the OSS external table (gpossext) feature of AnalyticDB for PostgreSQL to import data from OSS into AnalyticDB for PostgreSQL tables or export data from those tables to OSS.

Jindo DistCp

Copy files within or between large-scale clusters. Jindo DistCp uses MapReduce for file distribution, error handling, and recovery. The lists of files and directories are used as the input of the MapReduce tasks, and each task copies a portion of the files and directories in the input list.

References: Migrate data from HDFS to OSS
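
Jindo DistCp is typically submitted as a Hadoop job. The following sketch launches such a job from Python; the JAR path and option names depend on the installed JindoData version, so treat the JAR location, options, and addresses as assumptions to verify against your environment.

```python
# A hedged sketch: submit a Jindo DistCp job from Python.
# The JAR path and the option names are assumptions that depend on the JindoData version.
import subprocess

def distcp_to_oss(src_dir: str, dest_uri: str, parallelism: int = 10) -> None:
    """Copy a directory tree, for example from HDFS to OSS, with Jindo DistCp."""
    cmd = [
        "hadoop", "jar", "/opt/jindo/jindo-distcp-tools.jar",  # placeholder JAR path
        "--src", src_dir,                   # source directory, for example an HDFS path
        "--dest", dest_uri,                 # destination, for example an OSS path
        "--parallelism", str(parallelism),  # number of concurrent copy tasks
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    distcp_to_oss("hdfs://namenode:8020/data/", "oss://examplebucket/data/")
```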

Migrate data to OSS-HDFS

OSS-HDFS (JindoFS) is a cloud-native data lake storage service. OSS-HDFS provides centralized metadata management capabilities and is fully compatible with the Hadoop Distributed File System (HDFS) API. OSS-HDFS also supports Portable Operating System Interface (POSIX). You can use OSS-HDFS to manage data in data lake-based computing scenarios in the big data and AI fields. You can migrate data to OSS-HDFS or between buckets for which OSS-HDFS is enabled. The following list describes the methods that you can use to migrate data to OSS-HDFS.

Jindo DistCp

Copy files within or between large-scale clusters. Jindo DistCp uses MapReduce for file distribution, error handling, and recovery. The lists of files and directories are used as the input of the MapReduce tasks, and each task copies a portion of the files and directories in the input list.

JindoDistJob

Migrate full or incremental metadata of files from a semi-hosted JindoFS cluster to OSS-HDFS without the need to copy data blocks.

References: Migrate data from a semi-hosted JindoFS cluster to OSS-HDFS

MoveTo command of JindoTable

Copy the underlying data and then automatically update the metadata so that data in a table or in specific partitions is fully migrated to the destination path. If you want to migrate a large number of partitions at a time, you can specify filter conditions for the MoveTo command. JindoTable also provides protective measures to ensure data integrity and security when the MoveTo command is used to migrate data.

References: Use the JindoTable MoveTo command to migrate Hive tables and partitions to OSS-HDFS