Cloud Backup: Add a data source

Last Updated: Mar 04, 2024

Cloud Backup provides a data synchronization feature for unstructured file systems. You can synchronize data from a source data source, such as a Network Attached Storage (NAS) file system, a Hadoop Distributed File System (HDFS) file system, an S3-Compatible Storage bucket, an Object Storage Service (OSS) bucket, a Cloud Parallel File Storage (CPFS) file system, or an OSS-Compatible Storage bucket, to a destination data source, including one on Alibaba Cloud. Before you synchronize data for the first time, you must add both the source and the destination data sources. This topic describes how to add a data source in the Cloud Backup console.

Prerequisites

Cloud Backup is activated. You are not charged for activating Cloud Backup. The data synchronization feature is in public preview and is provided free of charge.
Cloud Backup is authorized, and a Cloud Backup client is installed. For more information, see Before you begin.

Procedure

Log on to the Cloud Backup console.
In the left-side navigation pane, choose Synchronization > Data Synchronization.
In the top navigation bar, select a region.
On the Data Source List tab, click Create Data Source.

In the Create Data Source panel, configure the parameters and click OK.

Source Type: Network Attached Storage (NAS)

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select Network Attached Storage (NAS).
Data Source Name	The name of the data source.
NAS Network Address	The network address of the NAS file system whose data is to be synchronized.
NAS Share Path	The directory to synchronize, relative to the root directory (/). For example, if you enter /myshare, Cloud Backup synchronizes data from the /myshare directory. For more information about how to query the shared directories of a NAS file system, see How do I query the shared directories of a NAS file system? The directory can contain letters, digits, and the following special characters: `,-_=/.:\`. Note: After an Apsara File Storage NAS file system is mounted, its root directory depends on the protocol type. In most cases, the root directory of a Network File System (NFS) file system is `/`, whereas the root directory of a Server Message Block (SMB) file system is `/myshare`. The mounted directory may differ in your environment, so specify the path based on your actual mount.
Protocol Type	The protocol type of the NAS file system. Valid values: NFS, SMB, and GlusterFS.
Important: If you want to synchronize data from an Apsara File Storage NAS file system, configure the vers parameter in the Advanced Settings section. The NFS, SMB, or GlusterFS client must be installed on the server where the Cloud Backup client is installed. You can run the following commands to install the NFS, SMB, or GlusterFS client:
NFS
- CentOS: `sudo yum install nfs-utils`
- Ubuntu: `sudo apt-get install nfs-common`
SMB
- CentOS: `sudo yum install cifs-utils`
- Ubuntu: `sudo apt-get install cifs-utils`
- openSUSE: `sudo zypper install cifs-utils`
GlusterFS
- CentOS: `sudo yum install glusterfs-client`
- Ubuntu: `sudo apt-get install glusterfs-client`
- Reference: https://docs.gluster.org/en/latest/Install-Guide/Overview/
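The character rule for NAS Share Path can be checked locally before you enter the value in the console. The following is a minimal sketch and not part of Cloud Backup: the function name `is_valid_share_path` is hypothetical, and it assumes that "letters" means ASCII letters and that an empty value is invalid.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the path uses only the characters
# that the NAS Share Path description allows, and 1 otherwise.
# Allowed: ASCII letters, digits, and , - _ = / . : \
is_valid_share_path() {
  case $1 in
    '') return 1 ;;                            # assumption: empty is invalid
    *[!A-Za-z0-9,_=/.:\\-]*) return 1 ;;       # any other character is rejected
    *) return 0 ;;
  esac
}

for candidate in '/myshare' '/my share' 'share\sub'; do
  if is_valid_share_path "$candidate"; then
    echo "ok:      $candidate"
  else
    echo "invalid: $candidate"
  fi
done
```

The check covers only the character set. It does not confirm that the directory exists on the NAS file system.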

Optional. Click Advanced Settings and then click +Set Mount Parameters.

The following table describes the mount parameters that you can configure.

Parameter	Description
vers	The protocol version of the file system. Valid values: vers=3 (uses NFSv3 to mount the file system), vers=4 (uses NFSv4 to mount the file system), and vers=4.0 (uses NFSv4.0 to mount the file system).
nolock	Specifies whether to enable file locking.
proto	The protocol that you want to use to mount the file system.
rsize	The size of each data block that a client can read from the file system. Recommended value: 1048576. Unit: bytes.
wsize	The size of each data block that a client can write to the file system. Recommended value: 1048576. Unit: bytes.
hard	Specifies a hard mount. When the file system is unavailable, applications that access it wait, and they resume access when the file system becomes available again. We recommend that you enable this parameter.
timeo	The period of time that the NFS client waits before it retries a request. Unit: deciseconds (tenths of a second). Recommended value: 600 (60 seconds).
retrans	The number of retries after the NFS client fails to send a request. Recommended value: 2.
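To see how the parameters in the preceding table fit together, you can combine them into the option string that the Linux `mount -o` flag accepts. The following is a minimal sketch under stated assumptions: the function name `build_mount_opts`, the host `nas.example.com`, and the mount point `/mnt/nas` are hypothetical, `proto=tcp` is an assumed value because the table does not name one, and `nolock` is added only for NFSv3 because NFSv4 handles locking inside the protocol. The script only prints the command and does not mount anything.

```shell
#!/bin/sh
# Hypothetical helper: joins the recommended mount parameters into one
# comma-separated option string. The argument is the vers value (default 3).
build_mount_opts() {
  vers=${1:-3}
  opts="vers=${vers}"
  # Assumption: nolock is only meaningful for NFSv3.
  if [ "$vers" = "3" ]; then
    opts="${opts},nolock"
  fi
  # proto=tcp is an assumed value; the other values are the recommended ones.
  opts="${opts},proto=tcp,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2"
  printf '%s\n' "$opts"
}

# Print (do not run) a mount command for a manual check on the server.
echo "sudo mount -t nfs -o $(build_mount_opts 3) nas.example.com:/myshare /mnt/nas"
```

In the console, you set each parameter separately under +Set Mount Parameters. The combined string is useful only for testing the same options by hand.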

Source Type: Hadoop Distributed File System (HDFS)

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select Hadoop Distributed File System (HDFS).
Data Source Name	The name of the HDFS data source. You can specify a name based on your business requirements. Example: back-end-hdfs.
NameNode Network Address	The network address of an HDFS primary server. NameNode serves as a primary server that is used to manage the namespaces of HDFS file systems and control access from clients to files in the file systems. For example, if the network address is `47.100.XX.XX` and the port number is `9000`, the data source address is `47.100.XX.XX:9000`.
NameNode Port	The port number of the HDFS primary server. Example: `9000`.
Secondary NameNode Network Address	The network address of a secondary HDFS node. A secondary HDFS node helps the primary server perform management tasks.
Secondary NameNode Port	The port number of the secondary HDFS node.
HDFS Username	The name of the HDFS user. We recommend that you set this parameter to hadoop or hdfs. Note: Make sure that the specified HDFS user has sufficient permissions. Otherwise, you may be unable to read files when you synchronize data.
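The NameNode address and port in the preceding table combine into a single `host:port` address. The following is a minimal sketch of that composition with a basic port check. The function name `hdfs_address` is hypothetical and the valid port range 1 to 65535 is an assumption based on TCP, not a rule stated by Cloud Backup.

```shell
#!/bin/sh
# Hypothetical helper: prints "host:port" or fails if the input is unusable.
hdfs_address() {
  host=$1
  port=$2
  case $port in
    ''|*[!0-9]*|??????*)
      # Empty, non-numeric, or longer than five digits.
      echo "port must be a number from 1 to 65535: '$port'" >&2
      return 1 ;;
  esac
  if [ -z "$host" ] || [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "invalid NameNode address: '$host:$port'" >&2
    return 1
  fi
  printf '%s:%s\n' "$host" "$port"
}

# Example with the masked address and port from the table.
hdfs_address "47.100.XX.XX" 9000
```

If a Hadoop client is available on the server, a command such as `HADOOP_USER_NAME=hdfs hdfs dfs -ls hdfs://<host>:<port>/` is one way to confirm that the chosen user can read the file system. This applies only to clusters that use simple authentication.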

Source Type: Alibaba Cloud Object Storage Service (OSS)

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select Alibaba Cloud Object Storage Service (OSS).
Data Source Name	The name of the OSS data source.
Use HTTPS	Specifies whether to transmit data over HTTPS. HTTPS provides higher transmission security than HTTP.
OSS Bucket	The name of the OSS bucket whose data you want to synchronize. Select a name from the drop-down list. Cloud Backup automatically obtains all OSS buckets in the region within your account.
OSS Endpoint	The endpoint of the OSS bucket. Select an endpoint from the drop-down list. For more information about the endpoints of OSS in each region, see Regions and endpoints. If you synchronize data over the Internet, select the public endpoint of the bucket. For example, select `oss-cn-hangzhou.aliyuncs.com` for the China (Hangzhou) region. If you use a virtual private cloud (VPC) to synchronize data, select the internal endpoint of the bucket. For example, select `oss-cn-hangzhou-internal.aliyuncs.com` for the China (Hangzhou) region.
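The two endpoint examples for the China (Hangzhou) region follow a naming pattern that can be sketched as a small helper. This is an illustration only: the function name `oss_endpoint` and the `internet` and `vpc` keywords are hypothetical, and the sketch assumes that other regions follow the same pattern as the two examples. Check Regions and endpoints for the authoritative value.

```shell
#!/bin/sh
# Hypothetical helper: derives an OSS endpoint from a region ID.
#   $1 = region ID, for example cn-hangzhou
#   $2 = internet (public endpoint, default) or vpc (internal endpoint)
oss_endpoint() {
  region=$1
  network=${2:-internet}
  case $network in
    internet) printf 'oss-%s.aliyuncs.com\n' "$region" ;;
    vpc)      printf 'oss-%s-internal.aliyuncs.com\n' "$region" ;;
    *)        echo "unknown network type: $network" >&2; return 1 ;;
  esac
}

oss_endpoint cn-hangzhou internet
oss_endpoint cn-hangzhou vpc
```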

Source Type: Cloud Parallel File Storage (CPFS)

Configure the parameters described in the following table.

Parameter	Description
Data Source Name	The name of the CPFS data source. The name helps you quickly identify the data source. You can specify a name based on your business requirements. Example: cpfs.
CPFS Mount Path	The mount path of the CPFS file system. Example: `/cpfs/00d0****1b-000001`. If you have not added a Portable Operating System Interface for UNIX (POSIX) mount target or installed the CPFS-POSIX client for your CPFS file system, add a mount target and install the client. You can run the following commands on the CPFS cluster management node to query the CPFS cluster status and the mount path.
Query the CPFS cluster status
Command: `mmgetstate -a`
Sample output:
    Node number  Node name                    GPFS state
    ---------------------------------------------------------------
    1            cpfs-00d0**1b-000001-qr-001  active
    2            cpfs-00d0**1b-000001-qr-002  active
    3            cpfs-00d0**1b-000001-qr-003  active
    4            iZbp**haqrZ                  active
Query the CPFS mount path
Command: `df -h`
Sample output:
    Filesystem       Size  Used Avail Use% Mounted on
    devtmpfs         3.8G     0  3.8G   0% /dev
    tmpfs            3.8G   16K  3.8G   1% /dev/shm
    tmpfs            3.8G  528K  3.8G   1% /run
    tmpfs            3.8G     0  3.8G   0% /sys/fs/cgroup
    /dev/vda1         40G  7.3G   33G  19% /
    tmpfs            763M     0  763M   0% /run/user/0
    00d0**1b-000001  3.6T  564M  3.6T   1% /cpfs/00d0**1b-000001
In the preceding output, `/cpfs/00d0****1b-000001` is the CPFS mount path.
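On a node with many mounts, the CPFS mount path can be picked out of the `df -h` output with a filter. The following is a minimal sketch. The function name `cpfs_mount_paths` is hypothetical, and the sketch assumes that the file system is mounted under `/cpfs/`, as in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: reads df output on stdin and prints every mount
# point (the last column) that starts with /cpfs/.
cpfs_mount_paths() {
  awk '$NF ~ /^\/cpfs\// { print $NF }'
}

# Usage on the CPFS cluster management node:
#   df -h | cpfs_mount_paths
```

If the command prints nothing, either the file system is not mounted on this node or it is mounted outside `/cpfs/`.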

Source Type: OSS Compatible Storage

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select OSS Compatible Storage.
Data Source Name	The name of the data source for OSS-Compatible Storage. You can specify a name based on your business requirements. Example: oss-bucket.
Use HTTPS	Specifies whether to transmit data over HTTPS. HTTPS provides higher transmission security than HTTP.
OSS Bucket	The name of the OSS bucket for OSS-Compatible Storage. The name is provided by a third-party storage service provider.
OSS Endpoint	The VPC endpoint provided by the third-party storage service provider. Obtain the VPC endpoint from the administrator for OSS-Compatible Storage.
AccessKey ID, AccessKey Secret	The AccessKey ID and AccessKey secret that the third-party storage service provider issues for access to the VPC. Obtain the AccessKey pair from the administrator for OSS-Compatible Storage. The AccessKey pair must have full permissions to read data from OSS-Compatible Storage.

Source Type: S3 Compatible Storage

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select S3 Compatible Storage.
Data Source Name	The name of the data source for S3 Compatible Storage. You can specify a name based on your business requirements. Example: awss3.
Use HTTPS	Specifies whether to transmit data over HTTPS. HTTPS provides higher transmission security than HTTP.
S3 Bucket	The name of the S3-Compatible Storage bucket.
S3 Endpoint	The endpoint that is used to access the bucket and perform operations on S3 objects. Example: `s3.us-east-1.amazonaws.com`. Obtain the endpoint from the administrator for S3-Compatible Storage.
Access Key, Secret Key	The security credentials (AccessKey pair) for accessing S3-Compatible Storage as an Identity and Access Management (IAM) user. Obtain the AccessKey pair from the administrator for S3-Compatible Storage. The AccessKey pair must have full permissions to read data from S3-Compatible Storage.
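Before you add an S3-Compatible Storage data source, it can help to confirm that the endpoint and key pair allow read access. The following is a minimal sketch. The function name `endpoint_url` and the bucket name `my-bucket` are hypothetical, and the sketch assumes that the Use HTTPS setting maps to the `https` or `http` URL scheme. The verification command in the comments uses the AWS CLI, which is not part of Cloud Backup and may not be installed on your server.

```shell
#!/bin/sh
# Hypothetical helper: builds the endpoint URL from the Use HTTPS choice
# ("true" or anything else) and the S3 Endpoint value.
endpoint_url() {
  use_https=$1
  endpoint=$2
  if [ "$use_https" = "true" ]; then
    scheme=https
  else
    scheme=http
  fi
  printf '%s://%s\n' "$scheme" "$endpoint"
}

endpoint_url true s3.us-east-1.amazonaws.com

# Optional read check with the AWS CLI (replace the placeholders):
#   AWS_ACCESS_KEY_ID=<access-key> AWS_SECRET_ACCESS_KEY=<secret-key> \
#     aws s3 ls "s3://my-bucket" \
#     --endpoint-url "$(endpoint_url true s3.us-east-1.amazonaws.com)"
```

A successful listing shows only that the key can list the bucket. Synchronization also requires permission to read the objects themselves.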

After you add a data source, the data source is displayed on the Data Source List tab.

Related operations

After a data source is added, you can click More in the Actions column and then select the required operation. The following table describes the available operations.

Operation	Description
Edit Data Source	Modifies the parameters of the data source.
Unregister Data Source	If you no longer need to synchronize data, you can unregister the data source. After a data source is unregistered, data is no longer synchronized. To unregister a data source, perform the following steps:
On the Sync Plan tab, delete all the synchronization plans for the data source.
On the Data Source List tab, find the data source that you want to unregister and choose More > Unregister Data Source in the Actions column.
In the View Client Groups panel, delete the Cloud Backup client for data synchronization.
On the server where the Cloud Backup client for data synchronization is installed, uninstall the client. For more information, see How do I uninstall a Cloud Backup client?

What to do next

Create a synchronization plan