Alibaba Cloud Object Storage Service (OSS) is a secure and reliable storage service that allows you to store large volumes of data.
For more information about OSS, see What is OSS?
For more information about OSS SDK for Java, see Aliyun OSS Java SDK.
- Go to the Data Source page.
  - Log on to the DataWorks console.
  - In the left-side navigation pane, click Workspaces.
  - Select the region where the required workspace resides, find the workspace, and click Data Integration in the Actions column.
  - In the left-side navigation pane of the Data Integration page, go to the Data Source page.
- On the Data Source page, click Add data source in the upper-right corner.
- In the Add data source dialog box, click OSS in the Semi-structured storage section.
- In the Add OSS data source dialog box, configure the parameters.
| Parameter | Description |
| --- | --- |
| Data Source Name | The name of the data source. The name can contain letters, digits, and underscores (_) and must start with a letter. |
| Data source description | The description of the data source. The description can be a maximum of 80 characters in length. |
| Environment | The environment in which the data source is used. Valid values: Development and Production. Note: This parameter is displayed only when the workspace is in standard mode. |
| Endpoint | The endpoint of OSS. Specify this parameter in a format similar to http://oss.aliyuncs.com. The endpoint of OSS varies based on the region. Note: If you add a bucket name before the endpoint of OSS, the data source can pass the connectivity test but data synchronization will fail. Example of adding a bucket name before the endpoint of OSS: |
| Bucket | The name of the OSS bucket. A bucket is a storage space that serves as a container for storing objects. You can create one or more buckets and add one or more objects to each bucket. During data synchronization, DataWorks can search for objects only in the bucket that is specified by this parameter. |
| AccessKey ID | The AccessKey ID of the account that you can use to connect to the OSS bucket. You can view the AccessKey ID on the Security Management page. |
| AccessKey Secret | The AccessKey secret of the account that you can use to connect to the OSS bucket. |

Notice: If data in OSS is stored as CSV files, the data must comply with the standard CSV format. For example, if the data in a column of a CSV file is enclosed in a pair of single quotation marks ('), you must replace this pair of single quotation marks with a pair of double quotation marks ("). Otherwise, the data in the CSV file may be incorrectly parsed.
- Set Resource Group connectivity to Data Integration.
- Find the desired resource group in the resource group list in the lower part of the dialog box and click Test connectivity in the Actions column.
  A synchronization node can use only one type of resource group. To ensure that your synchronization nodes run as expected, you must test the connectivity of every resource group for Data Integration on which your synchronization nodes will run. To test the connectivity of multiple resource groups for Data Integration at a time, select the resource groups and click Batch test connectivity. For more information, see Select a network connectivity solution.
  Note
- By default, the resource group list displays only exclusive resource groups for Data Integration. To ensure the stability and performance of data synchronization, we recommend that you use exclusive resource groups for Data Integration.
- If you want to test the network connectivity between the shared resource group or a custom resource group and the data source, click Advanced below the resource group list. In the Warning message, click Confirm. Then, all available shared and custom resource groups appear in the resource group list.
- After the data source passes the connectivity test, click Complete.
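
Some of the parameter rules described above can be checked before you open the dialog box, which matters most for the endpoint, because a bucket-prefixed endpoint passes the connectivity test and only fails later during synchronization. The following is a minimal sketch of such a check, not part of DataWorks or any OSS SDK. The function name `check_oss_data_source`, the bucket name `my-bucket`, and the assumption that "letters" means ASCII letters are all illustrative.

```python
import re
from urllib.parse import urlparse

# Hypothetical pre-flight check; the rules are taken from the parameter
# descriptions in this topic. "Letters" is assumed to mean ASCII letters.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_DESCRIPTION_LENGTH = 80


def check_oss_data_source(name, description, endpoint, bucket):
    """Return a list of problems found in the given parameter values."""
    problems = []

    # Data Source Name: letters, digits, underscores; must start with a letter.
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must start with a letter and contain only "
                        "letters, digits, and underscores")

    # Data source description: at most 80 characters.
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description exceeds 80 characters")

    # Endpoint: must not have the bucket name added before it.
    # If the endpoint has no scheme, hostname is None and this check is skipped.
    host = urlparse(endpoint).hostname or ""
    if bucket and host.startswith(bucket.lower() + "."):
        problems.append("endpoint must not be prefixed with the bucket name")

    return problems


if __name__ == "__main__":
    # "my-bucket" is a made-up bucket name used only for illustration.
    print(check_oss_data_source(
        "oss_source_1", "Demo source", "http://my-bucket.oss.aliyuncs.com", "my-bucket"))
```

An empty list means that none of these three rules is violated. It does not replace the connectivity test, which is the only way to confirm that the resource group can reach the data source.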