
Object Storage Service:Use OSS-HDFS as the underlying storage for HBase

Last Updated:Aug 14, 2025

HBase is a real-time database in the Hadoop ecosystem that provides high write performance. The OSS-HDFS service from Alibaba Cloud offers a bucket type that is fully compatible with Hadoop Distributed File System (HDFS) interfaces. With JindoSDK, HBase can use the OSS-HDFS service as its underlying storage and can also store Write-Ahead Log (WAL) files there, which decouples storage from compute. Compared with local HDFS storage, the OSS-HDFS service provides greater flexibility and reduces O&M costs.

Prerequisites

Procedure

  1. Connect to an ECS instance. For more information, see Connect to an instance.

  2. Configure JindoSDK.

    1. Download the latest version of the JindoSDK JAR package. For the download link, see GitHub.

    2. Decompress the downloaded package.

      The following example shows how to decompress jindosdk-x.x.x-linux.tar.gz. If you use a different version of JindoSDK, replace the package name with the actual one.

      tar -zxvf jindosdk-x.x.x-linux.tar.gz -C /usr/lib
      Note

      In this example, x.x.x represents the version number of the JindoSDK JAR package.

    3. Configure JINDOSDK_HOME.

      export JINDOSDK_HOME=/usr/lib/jindosdk-x.x.x-linux
      export PATH=$JINDOSDK_HOME/bin:$PATH
    4. Configure HADOOP_CLASSPATH.

      export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:${JINDOSDK_HOME}/lib/*
      Important

      Deploy the installation folder and environment variables to all required nodes.

    5. Copy the JindoSDK JAR packages to the Hadoop classpath.

      cp jindosdk-x.x.x-linux/lib/jindo-core-x.x.x.jar <HADOOP_HOME>/share/hadoop/hdfs/lib/
      cp jindosdk-x.x.x-linux/lib/jindo-sdk-x.x.x.jar <HADOOP_HOME>/share/hadoop/hdfs/lib/
  3. Configure the OSS-HDFS service implementation class and AccessKey.

    1. Configure the OSS-HDFS service implementation class in the HBase core-site.xml file.

      <configuration>
          <property>
              <name>fs.AbstractFileSystem.oss.impl</name>
              <value>com.aliyun.jindodata.oss.JindoOSS</value>
          </property>
      
          <property>
              <name>fs.oss.impl</name>
              <value>com.aliyun.jindodata.oss.JindoOssFileSystem</value>
          </property>
      </configuration>
    2. Add the AccessKey ID and AccessKey secret for your OSS-HDFS bucket to the HBase core-site.xml file.

      <configuration>
          <property>
              <name>fs.oss.accessKeyId</name>
              <value>LTAI********</value>
          </property>
      
          <property>
              <name>fs.oss.accessKeySecret</name>
              <value>KZo1********</value>
          </property>
      </configuration>
  4. Configure the OSS-HDFS service endpoint.

    To access an OSS bucket through the OSS-HDFS service, you must configure an endpoint. The recommended path format is oss://{yourBucketName}.{yourBucketEndpoint}/{path}. For example: oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampleobject.txt. After the configuration is complete, JindoSDK uses the endpoint in the access path to call the OSS-HDFS service interface.

    You can also configure the OSS-HDFS service endpoint in other ways. These configuration methods have an order of precedence. For more information, see Appendix 1: Other ways to configure an endpoint.
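    As one hedged sketch of an alternative, a default endpoint can be set globally in the core-site.xml file so that access paths only need the bucket name. The property name fs.oss.endpoint and the example region shown here are assumptions based on common JindoSDK conventions; confirm the exact property name and precedence in Appendix 1 for your SDK version.

      <configuration>
          <property>
              <!-- Assumed property name; oss-dls endpoint for the bucket's region -->
              <name>fs.oss.endpoint</name>
              <value>cn-shanghai.oss-dls.aliyuncs.com</value>
          </property>
      </configuration>

    With a global endpoint configured this way, a path such as oss://examplebucket/exampleobject.txt can resolve without embedding the endpoint in the path itself.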

  5. Specify the storage path for HBase.

    To specify the storage path for HBase and its WAL files, change the value of the hbase.rootdir parameter in the hbase-site.xml configuration file to an OSS path in the format oss://bucket.endpoint/hbase-root-dir.

    Important

    Before you release the cluster, disable your HBase tables to ensure that all data in the WAL files is flushed to HFiles.
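
    The step above can be sketched as the following hbase-site.xml fragment. The bucket name examplebucket, the region cn-shanghai, and the directory hbase-root-dir are placeholders; replace them with your actual bucket, endpoint, and root directory.

      <configuration>
          <property>
              <!-- Placeholder bucket, endpoint, and root directory -->
              <name>hbase.rootdir</name>
              <value>oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/hbase-root-dir</value>
          </property>
      </configuration>

    After restarting HBase with this configuration, newly written HFiles and WAL files are stored in the specified OSS-HDFS path instead of local HDFS.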