
E-MapReduce: JindoFS external client

Last Updated: Mar 26, 2026

When you need to access JindoFS data from outside an E-MapReduce (EMR) cluster, for example from a local development machine or a standalone Spark or Hadoop application server, the JindoFS external client lets you read and write that data without joining the cluster. Because the external client is compatible with the Hadoop Distributed File System (HDFS) interface, any application that can talk to HDFS can use it immediately.

Important

The external client works only when JindoFS runs in block storage mode. If JindoFS is in cache mode, use the standard Object Storage Service (OSS) client instead, because cache mode is fully compatible with OSS semantics.

Note

The external client connects to JindoFS through Namespace Service. Because it cannot access locally cached data in the EMR cluster, read and write performance is lower than when you access JindoFS from inside the cluster.

Prerequisites

Before you begin, ensure that you have:

  • JindoFS configured in block storage mode (see Use the block storage mode)

  • Network connectivity from your external host to the EMR cluster (required to reach Namespace Service)

Set up the external client

Step 1: Get the Bigboot package

In the EMR cluster, locate the Bigboot package at /usr/lib/bigboot-current and copy it to your external host.
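One way to move the package is to archive it on a cluster node and unpack it on the external host. The following is a minimal sketch, not an official procedure: the helper names pack_bigboot and unpack_bigboot, the host name emr-header-1, and the /opt target directory are all hypothetical. The -h flag is used on the assumption that bigboot-current may be a symlink to a versioned directory.

```shell
# Pack a Bigboot directory into a gzip tarball.
# -h follows symlinks, in case bigboot-current points to a versioned directory.
pack_bigboot() {
  local src_dir="$1" out_tar="$2"
  tar -C "$(dirname "$src_dir")" -chzf "$out_tar" "$(basename "$src_dir")"
}

# Unpack the tarball under the given parent directory on the external host.
unpack_bigboot() {
  local tarball="$1" dest_parent="$2"
  mkdir -p "$dest_parent"
  tar -C "$dest_parent" -xzf "$tarball"
}

# Example usage (host name and /opt are placeholders):
#   on an EMR node:        pack_bigboot /usr/lib/bigboot-current /tmp/bigboot.tar.gz
#   on the external host:  scp root@emr-header-1:/tmp/bigboot.tar.gz /tmp/
#                          unpack_bigboot /tmp/bigboot.tar.gz /opt
```

After unpacking, the package is available at the chosen parent directory plus bigboot-current, for example /opt/bigboot-current.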

Note

Bigboot is built on native code and may be incompatible with some operating systems. Verify compatibility with your OS before proceeding.

Step 2: Configure the environment

On your external host, set the BIGBOOT_HOME environment variable to the Bigboot installation directory, then add the ext and lib subdirectories to the classpath of your big data processing component (such as Hadoop or Spark).
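For a Hadoop client, the environment setup might look like the sketch below. It assumes the package was unpacked to /opt/bigboot-current (a placeholder path) and that the ext and lib subdirectories contain JAR files, so a trailing /* wildcard is used for each classpath entry. HADOOP_CLASSPATH is the standard Hadoop variable for extra classpath entries.

```shell
# Installation directory of the copied Bigboot package (placeholder default).
export BIGBOOT_HOME="${BIGBOOT_HOME:-/opt/bigboot-current}"

# Prepend the ext and lib directories to the Hadoop classpath,
# keeping any entries that were already set.
export HADOOP_CLASSPATH="${BIGBOOT_HOME}/ext/*:${BIGBOOT_HOME}/lib/*${HADOOP_CLASSPATH:+:${HADOOP_CLASSPATH}}"

# Show the result so it can be checked by eye.
echo "BIGBOOT_HOME=${BIGBOOT_HOME}"
echo "HADOOP_CLASSPATH=${HADOOP_CLASSPATH}"
```

Adding these lines to a shell profile makes them persistent. For Spark, the same two directories would typically be passed through the spark.driver.extraClassPath and spark.executor.extraClassPath properties instead.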

Step 3: Copy the configuration file

Copy bigboot.cfg.external from /usr/lib/bigboot-current/conf/ on the EMR cluster to the conf/ directory under BIGBOOT_HOME on your external host.
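The copy can be sketched as a small helper. The function name install_external_cfg is hypothetical, and the chmod step is a suggested precaution rather than a documented requirement: the file will later hold an AccessKey secret, so restricting it to the owner is a reasonable default.

```shell
# Copy a bigboot.cfg.external file into $BIGBOOT_HOME/conf and restrict
# its permissions, because it will later contain an AccessKey secret.
install_external_cfg() {
  local src="$1"
  local home="${2:-${BIGBOOT_HOME:?set BIGBOOT_HOME first}}"
  mkdir -p "$home/conf"
  cp "$src" "$home/conf/bigboot.cfg.external"
  chmod 600 "$home/conf/bigboot.cfg.external"
}

# Example usage (host name is a placeholder):
#   scp root@emr-header-1:/usr/lib/bigboot-current/conf/bigboot.cfg.external /tmp/
#   install_external_cfg /tmp/bigboot.cfg.external
```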

Step 4: Configure Namespace Service

Open bigboot.cfg.external and set the following parameters to point to the Namespace Service in your EMR cluster. These values are already set in the Bigboot configuration file of the EMR cluster, so check the cluster configuration to get them.

Parameter                     Example value  Description
client.namespace.rpc.port     8101           Port that Namespace Service listens on
client.namespace.rpc.address  {RPC_Address}  Hostname or IP address of Namespace Service
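To look up these two values on a cluster node, a simple text search is usually enough. The sketch below assumes the cluster-side file is a plain key = value text file whose keys contain namespace.rpc.port and namespace.rpc.address. The helper name show_namespace_rpc is hypothetical, and the path of the cluster configuration file is not given here, so it is passed as an argument.

```shell
# Print the lines of a Bigboot configuration file that define the
# Namespace Service RPC port and address.
show_namespace_rpc() {
  local cfg="$1"
  grep -E 'namespace\.rpc\.(port|address)' "$cfg"
}

# Example usage on an EMR node (replace the argument with the actual
# path of the cluster's Bigboot configuration file):
#   show_namespace_rpc /path/to/cluster/bigboot/config
```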

Step 5: Configure OSS data access

In the same bigboot.cfg.external file, add the following parameters for each namespace you want to access. Replace {YourNamespace} with the actual namespace name (this example uses test).

Parameter                                              Example value             Description
client.namespaces.{YourNamespace}.oss.access.bucket    {YourOssBucket}           Name of the OSS bucket backing this namespace
client.namespaces.{YourNamespace}.oss.access.endpoint  {YourOssEndpoint}         OSS endpoint for the bucket's region
client.namespaces.{YourNamespace}.oss.access.key       {YourOssAccessKeyID}      AccessKey ID with read/write access to the bucket
client.namespaces.{YourNamespace}.oss.access.secret    {YourOssAccessKeySecret}  AccessKey secret for the AccessKey ID

A complete configuration for the test namespace looks like this:

client.namespace.rpc.port = 8101
client.namespace.rpc.address = {RPC_Address}
client.namespaces.test.oss.access.bucket = {YourOssBucket}
client.namespaces.test.oss.access.endpoint = {YourOssEndpoint}
client.namespaces.test.oss.access.key = {YourOssAccessKeyID}
client.namespaces.test.oss.access.secret = {YourOssAccessKeySecret}
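Before testing the connection, it can help to confirm that every required key is present and that no {placeholder} value was left in the file. The check below is a sketch under the assumption that the file uses the key = value layout shown above. The function name check_external_cfg is hypothetical and is not part of Bigboot.

```shell
# Verify that a bigboot.cfg.external file defines all six parameters for
# one namespace and that none of them still holds a {placeholder} value.
# Returns 0 when the file looks complete, 1 otherwise.
check_external_cfg() {
  local cfg="$1" ns="$2" key value bad=0
  for key in \
    client.namespace.rpc.port \
    client.namespace.rpc.address \
    "client.namespaces.${ns}.oss.access.bucket" \
    "client.namespaces.${ns}.oss.access.endpoint" \
    "client.namespaces.${ns}.oss.access.key" \
    "client.namespaces.${ns}.oss.access.secret"
  do
    # Extract the text after the first "=" with surrounding blanks trimmed.
    value=$(awk -F= -v k="$key" '
      { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }
      name == k { sub(/^[^=]*=/, ""); gsub(/^[ \t]+|[ \t]+$/, ""); print; exit }
    ' "$cfg")
    case "$value" in
      "")      echo "missing: $key" >&2; bad=1 ;;
      "{"*"}") echo "placeholder not replaced: $key" >&2; bad=1 ;;
    esac
  done
  return "$bad"
}

# Example usage:
#   check_external_cfg "$BIGBOOT_HOME/conf/bigboot.cfg.external" test && echo "config looks complete"
```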

Verify the configuration

Run the following commands to confirm that the external client is set up correctly.

Check that the namespace is accessible:

hdfs dfs -ls jfs://test/

A successful response lists the contents of the namespace root:

Found 0 items

The output Found 0 items confirms that the connection succeeded, even though the namespace has no files yet. If you see a connection error instead, verify that client.namespace.rpc.address and client.namespace.rpc.port match the values in the EMR cluster configuration.
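When the listing fails with a connection error, a plain TCP probe can separate network problems from configuration problems. The sketch below uses the /dev/tcp feature of bash and the coreutils timeout command. The function name check_rpc_reachable is hypothetical. A successful probe only shows that the port accepts connections, not that Namespace Service itself is healthy.

```shell
# Try to open a TCP connection to host:port within 3 seconds.
# Returns 0 if the port accepts the connection, non-zero otherwise.
check_rpc_reachable() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage (replace the address with your client.namespace.rpc.address):
#   if check_rpc_reachable 192.168.0.10 8101; then
#     echo "Namespace Service port is reachable"
#   else
#     echo "cannot reach Namespace Service: check routing, security groups, and firewalls"
#   fi
```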

Test upload and download:

hdfs dfs -put /etc/hosts jfs://test/
hdfs dfs -get jfs://test/hosts

A successful upload produces no output. A successful download saves the file to your current directory. If either command fails, verify that the OSS access parameters (bucket, endpoint, AccessKey ID, and AccessKey secret) are correct.
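The upload and download test can be extended to confirm that the downloaded bytes match the original. The following sketch wraps the same hdfs dfs -put and -get commands and compares the two files with cmp. The function name jfs_round_trip is hypothetical, and the -f flag (overwrite an existing destination) is added so the check can be rerun.

```shell
# Upload a local file to a JindoFS namespace, download it again into a
# temporary directory, and compare the two copies byte for byte.
# Returns 0 when the round trip preserves the file, 1 otherwise.
jfs_round_trip() {
  local src="$1" ns_uri="${2%/}" name tmp_dir rc=0
  name=$(basename "$src")
  tmp_dir=$(mktemp -d)
  hdfs dfs -put -f "$src" "${ns_uri}/" \
    && hdfs dfs -get "${ns_uri}/${name}" "${tmp_dir}/${name}" \
    && cmp -s "$src" "${tmp_dir}/${name}" \
    || rc=1
  rm -rf "$tmp_dir"
  return "$rc"
}

# Example usage:
#   jfs_round_trip /etc/hosts jfs://test/ && echo "round trip OK"
```

The uploaded file is left in the namespace, as in the manual test above, so remove it afterwards if it is not needed.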
