A Hadoop Distributed File System (HDFS) connection allows you to read data from and write data to HDFS by using HDFS Reader and Writer. You can configure sync nodes for HDFS by using the code editor.

Workspaces in standard mode support connection isolation. You can add separate connections for the development and production environments and keep them isolated from each other to protect your data.


  1. Log on to the DataWorks console as a workspace administrator, find the target workspace, and then click Data Integration in the Actions column.
  2. In the left-side navigation pane, click Connection to go to the Workspace Manage > Data Source page.
  3. On the Data Source page that appears, click Add Connection in the upper-right corner.
  4. In the Add Connection dialog box that appears, click HDFS in the Semi-structured storage section.
  5. In the Add HDFS Connection dialog box that appears, set the parameters.
    Parameter | Description
    Connection Name | The name of the connection. The name can contain letters, digits, and underscores (_) and must start with a letter.
    Description | The description of the connection. The description cannot exceed 80 characters in length.
    Applicable Environment | The environment in which the connection is used. Valid values: Development and Production. Note: This parameter is available only when the workspace is in standard mode.
    DefaultFS | The address of the NameNode of HDFS, in the format of hdfs://ServerIP:Port.
  6. Click Test Connection. The connectivity test checks whether the entered information is correct.
  7. After the connection passes the connectivity test, click Complete.
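The parameter rules in step 5 can be checked before you fill in the dialog box. The following Python snippet is a minimal sketch, not part of DataWorks: the function name validate_hdfs_connection is hypothetical, "letters" is assumed to mean ASCII letters, and the sample address and port in the usage lines are placeholders.

```python
import re
from urllib.parse import urlparse

# Assumption: "letters" means ASCII letters. The name must start with a letter
# and may contain only letters, digits, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_DESCRIPTION_LENGTH = 80


def validate_hdfs_connection(name, description, default_fs):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []

    if not NAME_PATTERN.fullmatch(name):
        problems.append(
            "Connection Name must start with a letter and contain only "
            "letters, digits, and underscores."
        )

    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("Description cannot exceed 80 characters.")

    # DefaultFS is expected in the format hdfs://ServerIP:Port.
    try:
        parsed = urlparse(default_fs)
        valid_fs = (
            parsed.scheme == "hdfs"
            and bool(parsed.hostname)
            and parsed.port is not None
        )
    except ValueError:
        # Raised for malformed addresses, such as a non-numeric port.
        valid_fs = False
    if not valid_fs:
        problems.append("DefaultFS must use the format hdfs://ServerIP:Port.")

    return problems


# Placeholder values for illustration only.
print(validate_hdfs_connection("hdfs_prod", "Production NameNode", "hdfs://192.168.0.10:8020"))
print(validate_hdfs_connection("1bad", "ok", "192.168.0.10:8020"))
```

The check covers only the format of the values. It does not confirm that the NameNode exists or is reachable.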

Note on connectivity testing

  • If you built the data store yourself on Elastic Compute Service (ECS) instances that reside in a classic network, we recommend that you use a custom resource group to run sync nodes that use the connection. The default resource group is not guaranteed to be able to connect to the data store over the network.
  • If the data store is deployed in a Virtual Private Cloud (VPC), the connectivity test is not supported. You can click Complete without testing the connectivity.
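When the built-in connectivity test is unavailable, a basic TCP check run from a machine in the same network as the data store can at least show whether the NameNode port accepts connections. The following Python snippet is a rough sketch under that assumption. It is not the DataWorks connectivity test, the helper name can_reach is hypothetical, and a successful result says nothing about whether a DataWorks resource group can reach the same address.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_reach("ServerIP", Port) with the host and port from the DefaultFS value returns False if the port is closed or the host cannot be reached from where the script runs.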

What to do next

You have now configured an HDFS connection. In the next tutorial, you will learn how to configure HDFS Reader and HDFS Writer. For more information, see Configure HDFS Reader and Configure HDFS Writer.