Object Storage Service: Access OSS-HDFS by using RootPolicy

Last Updated: Jan 04, 2024

OSS-HDFS supports RootPolicy, which lets you configure a custom prefix for OSS-HDFS. Jobs can then run on OSS-HDFS without modifying their original hdfs:// access prefix.

Prerequisites

Procedure

  1. Configure environment variables.

    1. Connect to an ECS instance. For more information, see Connect to an ECS instance.

    2. Go to the bin directory of the installed JindoSDK JAR package.

      cd jindosdk-x.x.x/bin/

      Note: x.x.x indicates the version number of the JindoSDK JAR package.

    3. Grant the file owner read, write, and execute permissions on the jindo-util file in the bin directory.

      chmod 700 jindo-util
    4. Rename the jindo-util file to jindo.

      mv jindo-util jindo
    5. Create a configuration file named jindosdk.cfg, and then add the following parameters to the configuration file.

      [common] Retain the following default configurations. 
      logger.dir = /tmp/jindo-util/
      logger.sync = false
      logger.consolelogger = false
      logger.level = 0
      logger.verbose = 0
      logger.cleaner.enable = true
      hadoopConf.enable = false
      
      [jindosdk] Specify the following parameters. 
      <!-- In this example, the China (Hangzhou) region is used. Specify your actual region. -->
      fs.oss.endpoint = cn-hangzhou.oss-dls.aliyuncs.com
      <!-- Configure the AccessKey ID and AccessKey secret that are used to access OSS-HDFS. -->
      fs.oss.accessKeyId = LTAI5tJCTj5SxJepqxQ2****
      fs.oss.accessKeySecret = i0uLwyd0mHxXetZo7b4j4CXP16****
    6. Configure environment variables.

      export JINDOSDK_CONF_DIR=<JINDOSDK_CONF_DIR>

      Set <JINDOSDK_CONF_DIR> to the absolute path of the directory that contains the jindosdk.cfg configuration file.
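
The configuration steps above can be scripted. The following is a minimal sketch, not an official tool: the function name write_jindosdk_cfg and the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are hypothetical names chosen for illustration, and the sketch assumes that JINDOSDK_CONF_DIR points to the directory that holds jindosdk.cfg, as the variable name suggests. Reading the credentials from the environment keeps them out of the script itself.

```shell
#!/bin/sh
# Sketch: write jindosdk.cfg into a directory of your choice.
# Assumptions (not from the original text): the function name and the
# OSS_ACCESS_KEY_ID / OSS_ACCESS_KEY_SECRET environment variable names.

write_jindosdk_cfg() {
    conf_dir=${1:-}
    endpoint=${2:-}
    if [ -z "$conf_dir" ] || [ -z "$endpoint" ]; then
        echo "usage: write_jindosdk_cfg <conf_dir> <dls_endpoint>" >&2
        return 1
    fi
    if [ -z "${OSS_ACCESS_KEY_ID:-}" ] || [ -z "${OSS_ACCESS_KEY_SECRET:-}" ]; then
        echo "set OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET first" >&2
        return 1
    fi
    mkdir -p "$conf_dir" || return 1
    (
        # The file holds credentials, so create it readable by the owner only.
        umask 077
        cat > "$conf_dir/jindosdk.cfg" <<EOF
[common]
logger.dir = /tmp/jindo-util/
logger.sync = false
logger.consolelogger = false
logger.level = 0
logger.verbose = 0
logger.cleaner.enable = true
hadoopConf.enable = false

[jindosdk]
fs.oss.endpoint = $endpoint
fs.oss.accessKeyId = $OSS_ACCESS_KEY_ID
fs.oss.accessKeySecret = $OSS_ACCESS_KEY_SECRET
EOF
    )
}

# Example usage (paths are placeholders):
#   write_jindosdk_cfg "$HOME/jindo-conf" cn-hangzhou.oss-dls.aliyuncs.com
#   export JINDOSDK_CONF_DIR="$HOME/jindo-conf"
```

The [common] values are copied unchanged from the defaults listed in the step above; only the [jindosdk] values vary per environment.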

  2. Configure RootPolicy.

    Run the following setRootPolicy command to register an address that contains a custom prefix for a bucket:

    jindo admin -setRootPolicy oss://<bucket_name>.<dls_endpoint>/ hdfs://<your_ns_name>/

    The following table describes the parameters in the setRootPolicy command.

    Parameter

    Description

    bucket_name

    The name of the bucket for which OSS-HDFS is enabled.

    dls_endpoint

    The endpoint of the region in which the OSS-HDFS-enabled bucket is located. Example: cn-hangzhou.oss-dls.aliyuncs.com.

    If you do not want to specify <dls_endpoint> each time you run a RootPolicy command, you can use one of the following methods to add the endpoint to the core-site.xml file of Hadoop:

    • Method 1:

      <configuration>
          <property>
              <name>fs.oss.endpoint</name>
              <value><dls_endpoint></value>
          </property>
      </configuration>
    • Method 2:

      <configuration>
          <property>
              <name>fs.oss.bucket.<bucket_name>.endpoint</name>
              <value><dls_endpoint></value>
          </property>
      </configuration>

    your_ns_name

    The custom nsname that is used to access OSS-HDFS. Any non-empty string is supported, for example, test. The current version supports only the root directory.
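
To see how the three parameters fit into the command, the following sketch assembles the setRootPolicy command line and prints it for review instead of running it. The helper name build_set_root_policy_cmd and the bucket name examplebucket are hypothetical; only the command format itself comes from the step above.

```shell
#!/bin/sh
# Sketch: dry-run helper that prints the setRootPolicy command for review.
# The function name is an illustration, not part of the Jindo CLI.

build_set_root_policy_cmd() {
    bucket_name=${1:-}
    dls_endpoint=${2:-}
    ns_name=${3:-}
    # All three parameters are required; your_ns_name must be non-empty.
    if [ -z "$bucket_name" ] || [ -z "$dls_endpoint" ] || [ -z "$ns_name" ]; then
        echo "usage: build_set_root_policy_cmd <bucket_name> <dls_endpoint> <your_ns_name>" >&2
        return 1
    fi
    printf 'jindo admin -setRootPolicy oss://%s.%s/ hdfs://%s/\n' \
        "$bucket_name" "$dls_endpoint" "$ns_name"
}

# Example usage:
#   build_set_root_policy_cmd examplebucket cn-hangzhou.oss-dls.aliyuncs.com test
```

Printing the command first makes it easy to check the bucket address and the custom prefix before anything is registered.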

  3. Configure the Access Policy discovery address and the Scheme implementation classes.

    You must configure the following parameters in the core-site.xml file of Hadoop:

    <configuration>
        <property>
            <name>fs.accessPolicies.discovery</name>
            <value>oss://<bucket_name>.<dls_endpoint>/</value>
        </property>
        <property>
            <name>fs.AbstractFileSystem.hdfs.impl</name>
            <value>com.aliyun.jindodata.hdfs.HDFS</value>
        </property>
        <property>
            <name>fs.hdfs.impl</name>
            <value>com.aliyun.jindodata.hdfs.JindoHdfsFileSystem</value>
        </property>
    </configuration>

    If you want to configure Access Policy discovery addresses and Scheme implementation classes for multiple buckets, separate the buckets with commas (,).
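
For the multi-bucket case, the following sketch builds a comma-separated value from a list of bucket names. It rests on an assumption that the text above does not spell out: that the comma-separated list goes into the value of fs.accessPolicies.discovery and that each entry uses the same oss://<bucket_name>.<dls_endpoint>/ form. The function name discovery_value and the bucket names are hypothetical; verify the resulting value against your JindoSDK version before relying on it.

```shell
#!/bin/sh
# Sketch: join several bucket addresses with commas.
# Assumption: the joined string is used as the fs.accessPolicies.discovery value.

discovery_value() {
    if [ "$#" -lt 2 ]; then
        echo "usage: discovery_value <dls_endpoint> <bucket_name>..." >&2
        return 1
    fi
    dls_endpoint=$1
    shift
    value=""
    for bucket_name in "$@"; do
        # Prepend a comma only when value already holds an entry.
        value="${value:+$value,}oss://${bucket_name}.${dls_endpoint}/"
    done
    printf '%s\n' "$value"
}

# Example usage:
#   discovery_value cn-hangzhou.oss-dls.aliyuncs.com bucket-a bucket-b
```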

  4. Run the following command to check whether RootPolicy is successfully configured:

    hadoop fs -ls hdfs://<your_ns_name>/

    If output similar to the following is returned, RootPolicy is successfully configured:

    drwxr-x--x   - hdfs  hadoop          0 2023-01-05 12:27 hdfs://<your_ns_name>/apps
    drwxrwxrwx   - spark hadoop          0 2023-01-05 12:27 hdfs://<your_ns_name>/spark-history
    drwxrwxrwx   - hdfs  hadoop          0 2023-01-05 12:27 hdfs://<your_ns_name>/tmp
    drwxrwxrwx   - hdfs  hadoop          0 2023-01-05 12:27 hdfs://<your_ns_name>/user
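
The same check can be wrapped in a small function for use in deployment scripts. This is a sketch under the assumption that a successful listing is a sufficient signal; the function name check_root_policy is hypothetical. It relies only on the exit status of hadoop fs -ls, which is 0 on success and non-zero on error.

```shell
#!/bin/sh
# Sketch: report whether the custom prefix can be listed.
# The function name is an illustration; hadoop must be on PATH.

check_root_policy() {
    ns_name=${1:-}
    if [ -z "$ns_name" ]; then
        echo "usage: check_root_policy <your_ns_name>" >&2
        return 2
    fi
    if hadoop fs -ls "hdfs://${ns_name}/" >/dev/null 2>&1; then
        echo "Listing hdfs://${ns_name}/ succeeded"
    else
        echo "Could not list hdfs://${ns_name}/; review the RootPolicy and core-site.xml settings" >&2
        return 1
    fi
}

# Example usage:
#   check_root_policy test
```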
  5. Use a custom prefix to access OSS-HDFS.

    After you restart services such as Hive and Spark, you can access OSS-HDFS by using a custom prefix.

  6. Optional. Use RootPolicy for other purposes.

    • List all registered addresses that contain a custom prefix specified for a bucket

      Run the following listAccessPolicies command to list all registered addresses that contain a custom prefix specified for a bucket:

      jindo admin -listAccessPolicies oss://<bucket_name>.<dls_endpoint>/
    • Delete all registered addresses that contain a custom prefix specified for a bucket

      Run the following unsetRootPolicy command to delete all registered addresses that contain a custom prefix specified for a bucket:

      jindo admin -unsetRootPolicy oss://<bucket_name>.<dls_endpoint>/ hdfs://<your_ns_name>/

For more information about the RootPolicy commands, see the Jindo CLI user guide.