HDFS data source config

Last Updated: Apr 03, 2018

HDFS is a distributed file system. The HDFS data source supports both reading and writing data, and its synchronization tasks can be configured in script mode.

Procedure

  1. Log on to the DataWorks console as a developer, and click Enter Project, as shown in the following figure.

    EnterProject

  2. Click Data Integration from the upper menu and navigate to the Data Sources page.

    DataSources

  3. Click New Source.

  4. In the dialog box, select HDFS as the data source type.

  5. Complete the configuration items for the data source.

    • Name: A combination of letters, numbers, and underscores (_). It must begin with a letter or an underscore (_), and its length must not exceed 60 characters.

    • Description: A brief description of the data source, which must not exceed 80 characters.

    • defaultFS: The address of the HDFS NameNode, in the format hdfs://ServerIP:Port.

  6. Click Test Connectivity.

  7. After the connectivity test passes, click Complete.
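The field rules in step 5 can be checked before the values are typed into the dialog box. The following is a minimal sketch, not part of DataWorks: the function name validate_hdfs_source is hypothetical, and the address and port used in the usage note are example values only.

```python
import re
from typing import List
from urllib.parse import urlparse

# Letters, numbers, and underscores; must begin with a letter or an underscore.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_hdfs_source(name: str, description: str, default_fs: str) -> List[str]:
    """Hypothetical helper: return a list of problems; an empty list means the values look valid."""
    problems = []

    if not NAME_PATTERN.fullmatch(name):
        problems.append("Name must use letters, numbers, and underscores, "
                        "and begin with a letter or an underscore.")
    if len(name) > 60:
        problems.append("Name must not exceed 60 characters.")
    if len(description) > 80:
        problems.append("Description must not exceed 80 characters.")

    # defaultFS is expected in the form hdfs://ServerIP:Port.
    parsed = urlparse(default_fs)
    try:
        port = parsed.port  # raises ValueError if the port is not a valid number
    except ValueError:
        port = None
    if parsed.scheme != "hdfs" or not parsed.hostname or port is None:
        problems.append("defaultFS must be in the format hdfs://ServerIP:Port.")

    return problems
```

For example, validate_hdfs_source("hdfs_src_1", "test cluster", "hdfs://192.168.0.1:8020") returns an empty list, while a name such as "1bad" is reported because it begins with a number.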

Connectivity test description

  • In a classic network, the connectivity test verifies whether the connection information entered for the data source (for HDFS, the defaultFS address) is correct.

  • Currently, the connectivity test for data sources is unavailable in VPC environments.
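Where the console test is unavailable, a basic reachability check can be run from a machine that has network access to the NameNode. The following is a rough sketch under stated assumptions: the helper names parse_default_fs and namenode_reachable are hypothetical, the check only confirms that a TCP connection to ServerIP:Port can be opened, and it is not the test that DataWorks itself performs.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def parse_default_fs(default_fs: str) -> Tuple[str, int]:
    """Hypothetical helper: split hdfs://ServerIP:Port into (host, port)."""
    parsed = urlparse(default_fs)
    # parsed.port itself raises ValueError when the port is not a valid number.
    if parsed.scheme != "hdfs" or not parsed.hostname or parsed.port is None:
        raise ValueError("defaultFS must be in the format hdfs://ServerIP:Port")
    return parsed.hostname, parsed.port


def namenode_reachable(default_fs: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the NameNode address can be opened."""
    host, port = parse_default_fs(default_fs)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A successful result only shows that the port accepts connections; it does not confirm that the service behind it is an HDFS NameNode or that a synchronization task will have access to the data.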

What to do next

You have now learned how to configure an HDFS data source. The next step is to configure the HDFS Writer plug-in. For more information, see Configure HDFS Writer.
