Before moving data between a Cloud Parallel File Storage (CPFS) fileset and an Object Storage Service (OSS) bucket, you must create a dataflow. This topic describes how to create and manage dataflows in the NAS console.
Prerequisites
A CPFS fileset is created and is in the Normal state. For more information, see Create a fileset.
A tag is added to the source OSS bucket. The tag key is cpfs-dataflow and the tag value is true. Do not delete or modify this tag while the dataflow is in use. Otherwise, the CPFS dataflow cannot access the data in the bucket. For more information, see Set tags for an OSS bucket.
If multiple dataflows from one or more CPFS file systems use the same OSS bucket as the source, you must enable versioning for the bucket. This prevents data conflicts when multiple dataflows export data to the same bucket. For more information, see Introduction to versioning. A sketch that applies both bucket settings follows this list.
To configure automatic metadata updates, ensure that EventBridge is activated. For more information, see Activate EventBridge and grant permissions.
Note: The automatic metadata update feature is currently supported only in the China (Hangzhou), China (Chengdu), China (Shanghai), China (Shenzhen), China (Zhangjiakou), and China (Beijing) regions.
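The two bucket settings above can also be applied programmatically. The following Python sketch uses the OSS SDK for Python (oss2); the endpoint, credentials, and bucket name are placeholders, so verify the calls against the OSS SDK reference before use.

```python
# Sketch: prepare an OSS bucket for CPFS dataflows with the oss2 SDK.
# The endpoint, credentials, and bucket name below are placeholders.
import oss2
from oss2.models import BucketVersioningConfig, Tagging, TaggingRule

auth = oss2.Auth('<access_key_id>', '<access_key_secret>')
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')

# Add the tag that CPFS dataflows require: key cpfs-dataflow, value true.
rule = TaggingRule()
rule.add('cpfs-dataflow', 'true')
bucket.put_bucket_tagging(Tagging(rule))

# Enable versioning if multiple dataflows use this bucket as the source.
config = BucketVersioningConfig()
config.status = oss2.BUCKET_VERSIONING_ENABLE
bucket.put_bucket_versioning(config)
```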
Usage notes
Billing
After you create a dataflow, you are charged for the dataflow bandwidth. For more information, see Billing items.
When you configure automatic updates, CPFS uses EventBridge to collect modification events for source objects stored in OSS. This incurs event fees. For more information, see EventBridge billing.
Permissions
When you create a dataflow, CPFS assumes the AliyunServiceRoleForNasOssDataflow and AliyunServiceRoleForNasEventNotification service-linked roles. For more information, see CPFS service-linked roles.
Create a dataflow
Log on to the NAS console.
In the left-side navigation pane, choose File System > File System List.
In the top navigation bar, select a region.
On the File System List page, click the name of the file system.
On the file system details page, click Dataflow.
On the Dataflow tab, click Create Dataflow.
In the Create Dataflow dialog box, configure the following parameters.
| Parameter | Description |
| --- | --- |
| Fileset ID/Name | Select a fileset. Important: Creating a dataflow clears all data in the fileset. Make sure that the fileset contains no data. |
| OSS Bucket Path | Select the source OSS bucket to associate with the CPFS fileset. |
| OSS Bucket SSL | Specifies whether to enable transport encryption (HTTPS). |
| Automatic Metadata Update | Specifies whether to enable the automatic metadata update feature. |
| Metadata Refresh Interval (minutes) | If you enable automatic metadata updates, you must specify the interval, in minutes, at which metadata is refreshed. |
| OSS Prefix | If you enable automatic metadata updates, you must specify a directory prefix for automatic updates. The prefix must be 2 to 1,024 characters in length, must start and end with a forward slash (/), and must be an existing prefix in the OSS bucket. |
| EventBridge Status | The activation status of the EventBridge service. Automatic updates depend on EventBridge, so make sure that EventBridge is activated. |
| Bandwidth (MB/s) | The maximum transmission bandwidth of the dataflow, in MB/s. Valid values: 600, 1200, and 1500. The transmission bandwidth of a dataflow cannot exceed the I/O bandwidth of the file system. |
| SLR Authorization | You must grant CPFS the service-linked roles that allow it to access OSS and EventBridge resources. For more information, see CPFS service-linked roles. |
Click OK.
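If you automate dataflow creation instead of using the console, the operation corresponds to the CreateDataFlow action of the NAS API. The following Python sketch sends that action with the generic CommonRequest client from aliyun-python-sdk-core; the endpoint, resource IDs, and parameter names are assumptions to verify against the NAS API reference.

```python
# Sketch: create a CPFS dataflow through the NAS OpenAPI (CreateDataFlow).
# The endpoint, region, resource IDs, and parameter names are placeholders
# to check against the NAS API reference.
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

client = AcsClient('<access_key_id>', '<access_key_secret>', 'cn-hangzhou')

request = CommonRequest()
request.set_domain('nas.cn-hangzhou.aliyuncs.com')
request.set_version('2017-06-26')
request.set_action_name('CreateDataFlow')
request.add_query_param('FileSystemId', 'cpfs-1234567890')  # hypothetical ID
request.add_query_param('FsetId', 'fset-1234567890')        # hypothetical ID
request.add_query_param('SourceStorage', 'oss://examplebucket')
request.add_query_param('Throughput', '600')                # 600, 1200, or 1500

response = client.do_action_with_exception(request)
print(response)
```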
Related operations
You can view, modify, stop, enable, or delete dataflows in the console.
| Operation | Description |
| --- | --- |
| View dataflows | On the Dataflow tab, view the configurations of existing dataflows and create dataflow tasks. |
| Modify a dataflow | You can modify only the metadata refresh interval, the OSS prefixes for automatic updates, the bandwidth, and the description of a dataflow. |
| Stop a dataflow | After you stop a dataflow, billing for the dataflow stops in the next billing cycle. Important: After a dataflow is stopped, you cannot import or export data, and running tasks are canceled. |
| Enable a dataflow | You can enable a dataflow that is in the Stopped state. |
| Delete a dataflow | After you delete a dataflow, all tasks in the dataflow are cleared, and data can no longer be synchronized. |
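These operations also map to NAS API actions such as StopDataFlow, StartDataFlow, ModifyDataFlow, and DeleteDataFlow. The following Python sketch stops and then deletes a dataflow under the same assumptions as the creation sketch above; the action and parameter names should be verified against the NAS API reference.

```python
# Sketch: stop and then delete a dataflow through the NAS OpenAPI.
# Action and parameter names are assumptions to verify against the
# NAS API reference; IDs are placeholders.
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

client = AcsClient('<access_key_id>', '<access_key_secret>', 'cn-hangzhou')

def call_nas(action, **params):
    """Send a single NAS API request and return the raw response."""
    request = CommonRequest()
    request.set_domain('nas.cn-hangzhou.aliyuncs.com')
    request.set_version('2017-06-26')
    request.set_action_name(action)
    for key, value in params.items():
        request.add_query_param(key, value)
    return client.do_action_with_exception(request)

# Stop the dataflow first; any running tasks are canceled.
call_nas('StopDataFlow', FileSystemId='cpfs-1234567890', DataFlowId='df-1234567890')

# Delete the stopped dataflow; all of its tasks are cleared.
call_nas('DeleteDataFlow', FileSystemId='cpfs-1234567890', DataFlowId='df-1234567890')
```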