Archive data sources

When you archive data for the first time, you must add a data source. Supported data sources include on-premises NAS, HDFS, and S3-compatible storage. This topic describes how to use Cloud Backup to add a data source.

Prerequisites

Cloud Backup is authorized, and a Cloud Backup client is installed. For more information, see Install a client.

Procedure

Log on to the Cloud Backup console.
In the left-side navigation pane, click Archive.
In the top navigation bar, select a region.
On the Analyze and Archive tab, click Create Data Source.

In the Create Data Source panel, configure the parameters and click Next.

Source Type: Local NAS

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of data source. Select Local NAS.
Data Source Name	Enter a name for the data source.
NAS Type	The NAS type. Select Other NAS.
NAS Network Address	The network address of the NAS file system.
NAS Share Path	The NAS shared directory. The name can contain Chinese characters, uppercase and lowercase letters, digits, and the following special characters: `,-_=/.:\`. If you set the NAS Type parameter to Other NAS, the directory is relative to the / root directory. For example, if you enter /myshare, Cloud Backup archives data from the /myshare directory.
Protocol Type	The protocol type of the NAS file system. Valid values: NFS (Network File System) and SMB (Server Message Block). Important: If you mount an Alibaba Cloud File Storage NAS file system, you must set the vers parameter in Advanced Settings.
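The character rule for NAS Share Path can be checked before you enter a value in the console. The following is a minimal sketch, not part of Cloud Backup. The function name is_valid_share_path is hypothetical, "Chinese characters" is approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), and the backticks around the special-character list are assumed to be formatting rather than allowed characters.

```python
import re

# Assumption: "Chinese characters" is approximated by U+4E00 to U+9FFF.
# The console may accept a wider or narrower range.
# Allowed special characters taken from the table: , - _ = / . : \
_ALLOWED_PATH = re.compile(r"[\u4e00-\u9fffA-Za-z0-9,\-_=/.:\\]+")


def is_valid_share_path(path: str) -> bool:
    """Return True if path is non-empty and uses only the listed characters."""
    return bool(_ALLOWED_PATH.fullmatch(path))


if __name__ == "__main__":
    for candidate in ["/myshare", "/my share", "/data/archive_2024"]:
        print(candidate, is_valid_share_path(candidate))
```

A path such as /myshare passes the check, while a path that contains a space or an asterisk does not.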

Optional. Click Advanced Settings and then click +Set Mount Parameters.

The following table describes the mount parameters that you can configure.

Parameter	Description
vers	The protocol version used to mount the file system. Valid values: vers=3 (NFSv3), vers=4 (NFSv4), and vers=4.0 (NFSv4.0). Note: The protocol versions of File Storage NAS differ in features, security, and namespace. For more information, see Differences between NFSv3 and NFSv4.0.
nolock	Specifies whether to use file locking. When nolock is set, file locking is disabled.
proto	The transport protocol that you want to use to mount the file system, such as tcp.
rsize	The size of each data block that a client can read from the file system. Recommended value: 1048576. Unit: bytes.
wsize	The size of each data block that a client can write to the file system. Recommended value: 1048576. Unit: bytes.
hard	Specifies a hard mount. When the file system is unavailable, applications that access the file system pause and resume access after the file system becomes available again. We recommend that you enable this parameter.
timeo	The period of time that the NFS client waits before it retries a request. Unit: deciseconds (tenths of a second). Recommended value: 600 (60 seconds).
retrans	The number of retries after the NFS client fails to send a request. Recommended value: 2.
noresvport	Uses a new TCP port upon network reconnection to prevent connection interruptions when the network recovers from a failure. We recommend that you enable this parameter.
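The recommended values in the preceding table can be collected into a single NFS-style option string for review before you enter them in the console. The following is a minimal sketch under stated assumptions: the helper names build_mount_options and timeo_to_seconds are hypothetical, proto=tcp and vers=3 are example values that the table does not prescribe, and the console may expect each parameter to be entered separately rather than as one string.

```python
def build_mount_options(vers: str = "3", proto: str = "tcp") -> str:
    """Join the recommended mount parameters into a comma-separated string.

    Assumptions: vers and proto defaults are illustrative only; the other
    values are the recommendations from the table.
    """
    options = [
        f"vers={vers}",
        "nolock",
        f"proto={proto}",
        "rsize=1048576",   # read block size in bytes
        "wsize=1048576",   # write block size in bytes
        "hard",
        "timeo=600",       # deciseconds
        "retrans=2",
        "noresvport",
    ]
    return ",".join(options)


def timeo_to_seconds(timeo: int) -> float:
    """Convert a timeo value in deciseconds to seconds."""
    return timeo / 10


if __name__ == "__main__":
    print(build_mount_options())
    print(timeo_to_seconds(600))
```

The conversion helper makes the unit explicit: a timeo value of 600 deciseconds corresponds to 60 seconds.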

Source Type: HDFS

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of data source. Select HDFS.
Data Source Name	The name of the HDFS data source. You can specify a name based on your business requirements. Example: back-end-hdfs.
NameNode Network Address	The network address of the HDFS NameNode. The NameNode is the primary server that manages the namespace of the HDFS file system and controls client access to files in the file system.
NameNode Port	The port number of the HDFS primary server.
Secondary NameNode Network Address	The network address of a secondary HDFS node. A secondary HDFS node helps the primary server perform management tasks.
Secondary NameNode Port	The port number of the secondary HDFS node.
HDFS Username	The name of the HDFS user. We recommend that you set this parameter to hadoop or hdfs. Note: Make sure that the specified HDFS user has sufficient permissions. Otherwise, you may be unable to read files when you archive data, write files when you retrieve data, or restore the information of user groups.
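Before you enter the HDFS parameters, it can help to gather them in one place and run basic checks. The following is a minimal sketch, not a Cloud Backup API. The class HdfsSource, its field names, the sample host name, and port 8020 are all hypothetical, and treating the secondary NameNode fields as optional is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HdfsSource:
    """Hypothetical container for the HDFS data source parameters."""
    name: str
    namenode_host: str
    namenode_port: int
    username: str
    secondary_host: Optional[str] = None   # assumed optional in this sketch
    secondary_port: Optional[int] = None   # assumed optional in this sketch

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means no issue was found."""
        problems = []
        if not self.name:
            problems.append("Data Source Name is empty")
        if not self.username:
            problems.append("HDFS Username is empty")
        ports = (("NameNode Port", self.namenode_port),
                 ("Secondary NameNode Port", self.secondary_port))
        for label, port in ports:
            if port is not None and not 1 <= port <= 65535:
                problems.append(f"{label} {port} is outside 1-65535")
        return problems

    def namenode_uri(self) -> str:
        """Format the primary NameNode address as an hdfs:// URI."""
        return f"hdfs://{self.namenode_host}:{self.namenode_port}"


if __name__ == "__main__":
    # Sample values are illustrative only.
    source = HdfsSource("back-end-hdfs", "namenode.example.internal", 8020, "hdfs")
    print(source.validate(), source.namenode_uri())
```

The checks cover only format problems. They do not confirm that the NameNode is reachable or that the HDFS user has sufficient permissions.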

Source Type: S3

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select S3.
Data Source Name	The name of the S3 data source. You can specify a name based on your business requirements. Example: awss3.
Use HTTPS	Specifies whether to encrypt transmitted data by using HTTPS. Valid values: No and Yes.
S3 Bucket	The name of the S3-compatible storage bucket.
S3 Endpoint	The endpoint of the bucket that can be used to perform operations on S3 objects. Examples: s3.us-east-1.amazonaws.com and 11.238.XXX.XXX:9000.
Access Key	The access key ID that is used to access the S3-compatible storage bucket.
Secret Key	The secret access key that is used to access the S3-compatible storage bucket.
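The Use HTTPS and S3 Endpoint parameters together determine the URL that a client would connect to. The following is a minimal sketch of that relationship. The function name endpoint_url is hypothetical, and how Cloud Backup combines the two values internally is an assumption.

```python
def endpoint_url(endpoint: str, use_https: bool) -> str:
    """Combine an S3 endpoint and the Use HTTPS choice into a URL.

    Assumption: the endpoint is entered as host or host:port, as in the
    examples in the table. A pasted scheme is stripped and replaced.
    """
    scheme = "https" if use_https else "http"
    host = endpoint.split("://", 1)[-1].rstrip("/")
    return f"{scheme}://{host}"


if __name__ == "__main__":
    print(endpoint_url("s3.us-east-1.amazonaws.com", use_https=True))
    # The masked address is kept exactly as it appears in the table.
    print(endpoint_url("11.238.XXX.XXX:9000", use_https=False))
```

Keep the access key ID and secret access key out of scripts like this one. Enter them only in the console or store them in a secrets manager.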

Associate a client group, and then click Next.

You can add multiple clients to a client group to concurrently run an archive job. You can also select an existing client group. In this example, set the Client Group From parameter to Create Backup Client Group. Then, enter a name for Client Group Name and select the clients that you want to add to the client group.

Configure a data analysis plan, and then click OK.

In the Configure Analysis Plan step, configure the parameters. The following table describes the parameters.

Parameter	Description
Enable Data Source Analysis	Specifies whether to enable the data analysis feature. If you turn on Enable Data Source Analysis, Cloud Backup analyzes data after the data source is added. You can use Cloud Backup to scan, analyze, or search a data source only if the data analysis feature is enabled for the data source. Note: If you turn off Enable Data Source Analysis, Cloud Backup directly archives data from the data source.
Meta Index Start Time	The time when Cloud Backup starts to perform an index operation on metadata.
Meta Index Interval	The interval at which Cloud Backup performs index operations on metadata. Unit: days or weeks.
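To see how Meta Index Start Time and Meta Index Interval interact, you can project the first few index times. The following is a minimal sketch. The function name next_index_times is hypothetical, and it assumes that the first index operation runs at the start time and that later operations follow at fixed multiples of the interval. The actual scheduling behavior of Cloud Backup may differ.

```python
from datetime import datetime, timedelta
from typing import List


def next_index_times(start: datetime, interval: int, unit: str,
                     count: int = 3) -> List[datetime]:
    """Project index times from a start time and an interval in days or weeks."""
    if unit not in ("days", "weeks"):
        raise ValueError("unit must be 'days' or 'weeks'")
    if interval < 1:
        raise ValueError("interval must be at least 1")
    step = timedelta(**{unit: interval})
    return [start + i * step for i in range(count)]


if __name__ == "__main__":
    # Sample start time is illustrative only.
    for run in next_index_times(datetime(2024, 1, 1, 2, 0), 1, "weeks"):
        print(run.isoformat())
```

With a start time of 02:00 on January 1, 2024 and an interval of one week, the projected times fall on January 1, January 8, and January 15.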

After you add a data source, the data source is displayed on the Analyze and Archive tab.

Related operations

To perform the following operations on a data source, click More in the Actions column:

Operation	Description
Configure Analysis Plan	Configures a data analysis plan for the data source.
Run Meta Indexing	Creates an index for the data source. This way, you can efficiently analyze and search for data.
View Data Source	Views the details of the data source. The details include the type, NAS network address, NAS share path, and backup client group.
Edit Data Source	Modifies the parameters of the data source.
Unregister Data Source	Unregisters a data source. If you no longer require an archive plan, you can perform this operation.
Edit Backup Client Group	Changes the name of a client or a client group.

What to do next

Analyze a data source