Alibaba Cloud Object Storage Service (OSS) is a secure and reliable service that allows you to store large numbers of objects.

Note
  • Workspaces in standard mode support the connection isolation feature. You can add separate connections for the development and production environments so that the two environments are isolated and your data stays secure.
  • For more information about OSS, see OSS product introduction.
  • For more information about OSS Java SDK, see OSS Java SDK.

Procedure

  1. Log on to the DataWorks console as a workspace administrator, find the target workspace, and then click Data Integration in the Actions column.
  2. In the left-side navigation pane, click Connection to go to the Workspace Manage > Data Source page.
  3. On the Data Source page that appears, click Add Connection in the upper-right corner.
  4. In the Add Connection dialog box that appears, click OSS in the Semi-structured storage section.
  5. In the Add OSS Connection dialog box that appears, set the parameters.

    Parameter                          Description
    Connection Name                    The name of the connection. The name can contain letters, digits, and underscores (_) and must start with a letter.
    Description                        The description of the connection. The description cannot exceed 80 characters in length.
    Applicable Environment             The environment in which the connection is used. Valid values: Development and Production.
                                       Note This parameter is available only when the workspace is in standard mode.
    Endpoint                           The OSS endpoint, in the format of http://oss.aliyuncs.com. The OSS endpoint varies with the region.
                                       Note If you add the bucket name before the domain name, for example, http://xxx.oss.aliyuncs.com, the connection can pass the connectivity test but data synchronization will fail.
    Bucket                             The name of the OSS bucket. A bucket is a storage space that serves as a container for storing objects.
                                       You can create one or more buckets and add one or more objects to each bucket.
                                       During data synchronization, DataWorks can search for objects only in the bucket specified here.
    AccessKey ID and AccessKey Secret  The AccessKey ID and AccessKey secret used as logon credentials.

  6. Click Test Connection.
  7. After the connectivity test is passed, click Complete.
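
The parameter rules in step 5 can be checked before you fill in the dialog box. The following Python sketch is only an illustration and is not part of DataWorks: the function name check_connection_params and its messages are hypothetical. It applies the three rules stated above: the naming rule, the 80-character description limit, and the rule that the endpoint must not start with the bucket name. The bucket check is a simple prefix heuristic.

```python
import re
from urllib.parse import urlparse

# Naming rule from the parameter table: letters, digits, and underscores,
# starting with a letter.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_DESCRIPTION_LENGTH = 80


def check_connection_params(name, description, endpoint, bucket):
    """Hypothetical helper: return a list of problems (empty if none are found)."""
    problems = []

    if not NAME_PATTERN.fullmatch(name):
        problems.append(
            "Connection Name must start with a letter and contain only "
            "letters, digits, and underscores."
        )

    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("Description cannot exceed 80 characters.")

    # urlparse lowercases the host name; it is None when the scheme is missing.
    host = urlparse(endpoint).hostname or ""
    if not host:
        problems.append("Endpoint must be a full URL such as http://oss.aliyuncs.com.")
    elif bucket and host.startswith(bucket.lower() + "."):
        # Heuristic: the bucket name appears before the domain name.
        problems.append("Endpoint must not include the bucket name before the domain name.")

    return problems


if __name__ == "__main__":
    # Endpoint in the documented format: no problems expected.
    print(check_connection_params("oss_dev", "Demo connection", "http://oss.aliyuncs.com", "xxx"))
    # Bucket name placed before the domain name: one problem expected.
    print(check_connection_params("oss_dev", "Demo connection", "http://xxx.oss.aliyuncs.com", "xxx"))
```

A check like this is useful mainly for the endpoint rule, because a bucket-prefixed endpoint passes the connectivity test and fails only later, during data synchronization.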
Note When data in OSS is stored as Comma-Separated Values (CSV) files, the files must comply with the standard CSV format. For example, if the data in a column of a CSV file contains a double quotation mark ("), you must replace the double quotation mark with a pair of double quotation marks (""). Otherwise, the data in the CSV file may be parsed incorrectly.
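
As an illustration of the quoting rule, the following sketch uses the csv module from the Python standard library to write a value that contains double quotation marks. The function name to_csv_text and the sample data are hypothetical. How you produce the files before you upload them to OSS is up to you; this only shows one way to obtain correctly escaped output.

```python
import csv
import io


def to_csv_text(rows):
    """Serialize rows to CSV text, doubling any embedded double quotation marks."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        doublequote=True,           # write " inside a field as ""
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    text = to_csv_text([["id", "comment"], [1, 'She said "hi"']])
    print(text)
    # Reading the text back recovers the original value with single quotation marks.
    print(list(csv.reader(io.StringIO(text))))
```

The second data row is written as 1,"She said ""hi""", which is the form the note above requires.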

Note on connectivity testing

  • If the data store is a user-created one deployed on Elastic Compute Service (ECS) instances that reside on a classic network, we recommend that you use a custom resource group to run sync nodes that use the connection. The default resource group is not guaranteed to reach the data store over the network.
  • If the data store is deployed in a Virtual Private Cloud (VPC), the connectivity test is not supported. You can click Complete without testing the connectivity.

What to do next

Now you have learned how to configure an OSS connection. In the next tutorial, you will learn how to configure OSS Reader and OSS Writer. For more information, see Configure OSS Reader and Configure OSS Writer.