This topic describes how to deploy JindoSDK in an environment other than E-MapReduce (EMR).

Prerequisites

An Elastic Compute Service (ECS) instance is connected. For more information, see Connect to an instance.

Deploy JindoSDK

  1. Run the following command to download the JindoSDK 4.6.2 installation package:
    wget https://jindodata-binary.oss-cn-shanghai.aliyuncs.com/release/4.6.2/jindosdk-4.6.2.tar.gz
  2. Run the following command to decompress the installation package:
    tar zxvf jindosdk-4.6.2.tar.gz
  3. Configure environment variables. For example, if the installation package is decompressed to the /usr/lib/jindosdk-4.6.2 directory, run the following commands:
    export JINDOSDK_HOME=/usr/lib/jindosdk-4.6.2
    export JINDOSDK_CONF_DIR=${JINDOSDK_HOME}/conf
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${JINDOSDK_HOME}/lib/native
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:${JINDOSDK_HOME}/lib/*
Important: You must deploy the installation package and configure the environment variables on all required nodes.
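
To confirm that a node is set up as described in the preceding steps, a quick check such as the following sketch may help. The function name check_jindosdk_env is hypothetical and is not part of JindoSDK. It assumes a POSIX shell and only inspects the JINDOSDK_HOME and HADOOP_CLASSPATH variables exported in step 3.

```shell
# Hypothetical helper (not shipped with JindoSDK): checks that the variables
# from step 3 are set and that the lib directory exists.
check_jindosdk_env() {
    _jindo_home="${JINDOSDK_HOME:-}"
    if [ -z "$_jindo_home" ]; then
        echo "JINDOSDK_HOME is not set"
        return 1
    fi
    if [ ! -d "$_jindo_home/lib" ]; then
        echo "missing directory: $_jindo_home/lib"
        return 1
    fi
    # The classpath entry is the literal string "<home>/lib/*", so the
    # pattern is quoted to match the asterisk literally.
    case ":${HADOOP_CLASSPATH:-}:" in
        *":$_jindo_home/lib/*:"*) ;;
        *)
            echo "HADOOP_CLASSPATH does not include $_jindo_home/lib/*"
            return 1
            ;;
    esac
    echo "ok"
}
```

Running check_jindosdk_env on each node after the variables are exported prints ok when the checks pass, or a one-line reason when a check fails.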

Deploy JindoSDK by using the configuration file of Hadoop

Perform the following steps to configure the Object Storage Service (OSS) or OSS-HDFS implementation class and the AccessKey pair.

  1. Run the following command to open the core-site.xml configuration file of Hadoop:
    vim /usr/local/hadoop/etc/hadoop/core-site.xml
  2. Configure the OSS or OSS-HDFS implementation class in the core-site.xml configuration file of Hadoop.
    <configuration>
        <property>
            <name>fs.AbstractFileSystem.oss.impl</name>
            <value>com.aliyun.jindodata.oss.JindoOSS</value>
        </property>
    
        <property>
            <name>fs.oss.impl</name>
            <value>com.aliyun.jindodata.oss.JindoOssFileSystem</value>
        </property>
    </configuration>
  3. In the core-site.xml configuration file of Hadoop, specify the AccessKey ID and AccessKey secret that you want to use to access the desired bucket of OSS or OSS-HDFS.
    <configuration>
        <property>
            <name>fs.oss.accessKeyId</name>
            <value>xxx</value>
        </property>
    
        <property>
            <name>fs.oss.accessKeySecret</name>
            <value>xxx</value>
        </property>
    </configuration>
  4. Configure the endpoint of OSS or OSS-HDFS.

    To access a bucket of OSS or OSS-HDFS, JindoSDK must know the endpoint of OSS or OSS-HDFS. We recommend that you include the endpoint in the access path by using the oss://<Bucket>.<Endpoint>/<Object> format. Example: oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/exampleobject.txt. JindoSDK then accesses OSS or OSS-HDFS by using the endpoint that is specified in the path. Alternatively, you can configure a default endpoint and use the simplified oss://<Bucket>/<Object> format. Example: oss://examplebucket/exampleobject.txt. To configure a default endpoint, add the following property to the core-site.xml configuration file:

    <configuration>
        <property>
            <name>fs.oss.endpoint</name>
            <value>xxx</value>
        </property>
    </configuration>
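
The two path formats described in step 4 can be illustrated with a small shell sketch. The function name oss_uri is hypothetical and exists only for illustration. Passing an empty endpoint produces the simplified format, which relies on the default endpoint configured in fs.oss.endpoint.

```shell
# Hypothetical helper: builds an OSS access path from its parts.
# Usage: oss_uri <bucket> <endpoint-or-empty> <object>
oss_uri() {
    _bucket="$1"
    _endpoint="$2"
    _object="$3"
    if [ -n "$_endpoint" ]; then
        # Recommended format: the endpoint is part of the path.
        printf 'oss://%s.%s/%s\n' "$_bucket" "$_endpoint" "$_object"
    else
        # Simplified format: depends on the default fs.oss.endpoint.
        printf 'oss://%s/%s\n' "$_bucket" "$_object"
    fi
}

# Example with the Hadoop shell (requires a working Hadoop installation):
# hadoop fs -ls "$(oss_uri examplebucket cn-hangzhou.oss-dls.aliyuncs.com exampleobject.txt)"
```

Including the endpoint in the path keeps each command independent of the contents of core-site.xml, which is one reason the explicit format is the recommended one.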

Deploy JindoSDK by using a configuration file other than the configuration file of Hadoop

When you use non-Hadoop components, such as JindoFuse or Jindo CLI, JindoSDK reads configuration files from the directory that is specified by the JINDOSDK_CONF_DIR environment variable.

Configuration file

Use a configuration file in the .ini format. In this example, the configuration file is named jindosdk.cfg. The following code shows the configuration items in the configuration file:

[common]
logger.dir = /tmp/jindosdk-log

[jindosdk]
# The endpoint of the created OSS bucket. For example, if the OSS bucket is created in the China (Hangzhou) region, the endpoint is oss-cn-hangzhou.aliyuncs.com. 
# The endpoint of the created OSS-HDFS bucket. For example, if the OSS-HDFS bucket is created in the China (Hangzhou) region, the endpoint is cn-hangzhou.oss-dls.aliyuncs.com. 
fs.oss.endpoint = <your_endpoint>
# The AccessKey ID and AccessKey secret that you want to use to access OSS. An Alibaba Cloud account has the permissions to call all API operations. If the AccessKey pair of your Alibaba Cloud account is leaked, your data may be exposed to high security risks. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the Resource Access Management (RAM) console. 
fs.oss.accessKeyId = <your_key_id>
fs.oss.accessKeySecret = <your_key_secret> 
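
One way to create this file from the command line is sketched below. The function name write_jindosdk_cfg and its argument order are hypothetical. The sketch assumes a POSIX shell and writes the same keys as the preceding example into the directory that you pass in, which would typically be the directory that JINDOSDK_CONF_DIR points to.

```shell
# Hypothetical helper: writes jindosdk.cfg into the given directory.
# Usage: write_jindosdk_cfg <conf_dir> <endpoint> <key_id> <key_secret>
write_jindosdk_cfg() {
    _conf_dir="$1"
    _endpoint="$2"
    _key_id="$3"
    _key_secret="$4"
    mkdir -p "$_conf_dir" || return 1
    # The subshell limits the umask change to this write. With umask 077,
    # a newly created file is readable and writable only by its owner,
    # which matters because the file contains the AccessKey secret.
    (
        umask 077
        cat > "$_conf_dir/jindosdk.cfg" <<EOF
[common]
logger.dir = /tmp/jindosdk-log

[jindosdk]
fs.oss.endpoint = $_endpoint
fs.oss.accessKeyId = $_key_id
fs.oss.accessKeySecret = $_key_secret
EOF
    )
}
```

For example, write_jindosdk_cfg "$JINDOSDK_CONF_DIR" oss-cn-hangzhou.aliyuncs.com "<your_key_id>" "<your_key_secret>" would produce a file equivalent to the preceding example for a bucket in the China (Hangzhou) region.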

Access OSS or OSS-HDFS without a password

Before you access OSS or OSS-HDFS without a password, make sure that you use an ECS instance to which the required RAM role is assigned. For more information, see Attach an instance RAM role to an ECS instance.

Sample configuration:

[common]
logger.dir = /tmp/jindosdk-log

[jindosdk]
# The endpoint of the created OSS bucket. For example, if the OSS bucket is created in the China (Hangzhou) region, the endpoint is oss-cn-hangzhou.aliyuncs.com. 
# The endpoint of the created OSS-HDFS bucket. For example, if the OSS-HDFS bucket is created in the China (Hangzhou) region, the endpoint is cn-hangzhou.oss-dls.aliyuncs.com. 
fs.oss.endpoint = <your_endpoint>
fs.oss.provider.endpoint = ECS_ROLE
fs.oss.provider.format = JSON