OSS-HDFS supports RootPolicy. You can use RootPolicy to configure a custom prefix for OSS-HDFS. This way, jobs can run on OSS-HDFS without modifying the original hdfs:// access prefix.
Prerequisites
JindoSDK 4.6.0 or later is used. To download the package, visit GitHub.
Procedure
Configure RootPolicy.
Run the following SetRootPolicy command to specify an address that contains a custom prefix for a bucket:
jindo admin -setRootPolicy oss://<bucket_name>.<dls_endpoint>/ hdfs://<your_ns_name>/
The following items describe the parameters of the SetRootPolicy command:
<bucket_name>: Specify the name of the bucket for which OSS-HDFS is enabled.
<dls_endpoint>: Specify the endpoint of the region in which OSS-HDFS is enabled. Example: cn-hangzhou.oss-dls.aliyuncs.com.
If you do not want to add the <dls_endpoint> parameter to the command each time you run a RootPolicy command, you can use one of the following methods to add configuration items to the core-site.xml file of Hadoop:
Method 1: Configure a default endpoint for all buckets.
<configuration>
    <property>
        <name>fs.oss.endpoint</name>
        <value><dls_endpoint></value>
    </property>
</configuration>
Method 2: Configure an endpoint for a specific bucket.
<configuration>
    <property>
        <name>fs.oss.bucket.<bucket_name>.endpoint</name>
        <value><dls_endpoint></value>
    </property>
</configuration>
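For example, with a hypothetical bucket named examplebucket in the China (Hangzhou) region, Method 2 would look like this:
<configuration>
    <property>
        <!-- examplebucket is a hypothetical bucket name; replace it with your own -->
        <name>fs.oss.bucket.examplebucket.endpoint</name>
        <value>cn-hangzhou.oss-dls.aliyuncs.com</value>
    </property>
</configuration>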
After the configuration is complete, the SetRootPolicy command can be simplified into the following format:
jindo admin -setRootPolicy oss://<bucket_name>/ hdfs://<your_ns_name>/
<your_ns_name>: Specify the custom nsname that is used to access OSS-HDFS. A non-empty string is supported, such as test. The current version supports only the root directory.
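For example, assuming a hypothetical bucket named examplebucket in the China (Hangzhou) region and the custom nsname test, the complete command would be:
jindo admin -setRootPolicy oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/ hdfs://test/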
Configure the Access Policy discovery address and the Scheme implementation class.
You must configure the following parameters in the core-site.xml file of Hadoop:
<configuration>
    <property>
        <name>fs.accessPolicies.discovery</name>
        <value>oss://<bucket_name>.<dls_endpoint>/</value>
    </property>
    <property>
        <name>fs.AbstractFileSystem.hdfs.impl</name>
        <value>com.aliyun.jindodata.hdfs.HDFS</value>
    </property>
    <property>
        <name>fs.hdfs.impl</name>
        <value>com.aliyun.jindodata.hdfs.JindoHdfsFileSystem</value>
    </property>
</configuration>
If you want to configure Access Policy discovery addresses and Scheme implementation classes for multiple buckets, separate the bucket addresses with commas (,).
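For example, a fs.accessPolicies.discovery value that covers two hypothetical buckets, examplebucket1 and examplebucket2, in the China (Hangzhou) region would look like this:
<property>
    <name>fs.accessPolicies.discovery</name>
    <!-- examplebucket1 and examplebucket2 are hypothetical bucket names -->
    <value>oss://examplebucket1.cn-hangzhou.oss-dls.aliyuncs.com/,oss://examplebucket2.cn-hangzhou.oss-dls.aliyuncs.com/</value>
</property>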
Run the following command to check whether RootPolicy is successfully configured:
hadoop fs -ls hdfs://<your_ns_name>/
If the following results are returned, RootPolicy is successfully configured:
drwxr-x--x - hdfs  hadoop 0 2023-01-05 12:27 hdfs://<your_ns_name>/apps
drwxrwxrwx - spark hadoop 0 2023-01-05 12:27 hdfs://<your_ns_name>/spark-history
drwxrwxrwx - hdfs  hadoop 0 2023-01-05 12:27 hdfs://<your_ns_name>/tmp
drwxrwxrwx - hdfs  hadoop 0 2023-01-05 12:27 hdfs://<your_ns_name>/user
Use a custom prefix to access OSS-HDFS.
After you restart services such as Hive and Spark, you can access OSS-HDFS by using a custom prefix.
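For example, assuming the custom nsname test that was registered earlier, the following commands write a hypothetical local file to OSS-HDFS and list it through the custom prefix:
hadoop fs -put /tmp/examplefile.txt hdfs://test/tmp/
hadoop fs -ls hdfs://test/tmp/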
(Optional) Use RootPolicy for other purposes.
List all addresses that contain a custom prefix specified for a bucket
Run the following listAccessPolicies command to list all registered prefix addresses of a specific bucket:
jindo admin -listAccessPolicies oss://<bucket_name>.<dls_endpoint>/
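For example, the following command lists the registered prefix addresses of a hypothetical bucket named examplebucket in the China (Hangzhou) region:
jindo admin -listAccessPolicies oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/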
Delete a registered prefix address of a specific bucket
Run the following unsetRootPolicy command to delete a specified registered prefix address of a bucket:
jindo admin -unsetRootPolicy oss://<bucket_name>.<dls_endpoint>/ hdfs://<your_ns_name>/
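For example, the following command unregisters the custom prefix hdfs://test/ from the hypothetical bucket examplebucket:
jindo admin -unsetRootPolicy oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/ hdfs://test/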