When your Hive external tables are stored in JindoFileSystem (JindoFS), Impala and Presto on E-MapReduce can query them without any change to your query logic. To move a table from Hadoop Distributed File System (HDFS) to JindoFS, change only the LOCATION value in your CREATE TABLE statement from an hdfs:// path to a jfs:// path.
Prerequisites
Before you begin, ensure that you have:
- An E-MapReduce cluster running V3.22.0 or later
- A JindoFS namespace already configured in your cluster
JindoFS configuration
The examples in this topic use a namespace named emr-jfs with the following configuration:
| Parameter | Value |
|---|---|
| jfs.namespaces | emr-jfs |
| jfs.namespaces.emr-jfs.oss.uri | oss://oss-bucket/oss-dir |
| jfs.namespaces.emr-jfs.mode | block |
Store table data in JindoFS
The only change required to move a Hive external table from HDFS to JindoFS is the LOCATION value. Replace the hdfs:// URI with a jfs:// URI that references your JindoFS namespace.
HDFS location:

```sql
CREATE EXTERNAL TABLE lineitem (
  L_ORDERKEY INT,
  L_PARTKEY INT,
  L_SUPPKEY INT,
  L_LINENUMBER INT,
  L_QUANTITY DOUBLE,
  L_EXTENDEDPRICE DOUBLE,
  L_DISCOUNT DOUBLE,
  L_TAX DOUBLE,
  L_RETURNFLAG STRING,
  L_LINESTATUS STRING,
  L_SHIPDATE STRING,
  L_COMMITDATE STRING,
  L_RECEIPTDATE STRING,
  L_SHIPINSTRUCT STRING,
  L_SHIPMODE STRING,
  L_COMMENT STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'hdfs:///tpch_impala/lineitem';
```
JindoFS location:

```sql
CREATE EXTERNAL TABLE lineitem (
  L_ORDERKEY INT,
  L_PARTKEY INT,
  L_SUPPKEY INT,
  L_LINENUMBER INT,
  L_QUANTITY DOUBLE,
  L_EXTENDEDPRICE DOUBLE,
  L_DISCOUNT DOUBLE,
  L_TAX DOUBLE,
  L_RETURNFLAG STRING,
  L_LINESTATUS STRING,
  L_SHIPDATE STRING,
  L_COMMITDATE STRING,
  L_RECEIPTDATE STRING,
  L_SHIPINSTRUCT STRING,
  L_SHIPMODE STRING,
  L_COMMENT STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'jfs://emr-jfs/tpch_impala/lineitem';
```
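Because the table's storage path is resolved through the metastore, queries reference only table and column names and run unchanged after the LOCATION switch. As an illustration (a TPC-H-style aggregation sketched here for this example, not part of the original setup), the same statement can be submitted to either Impala or Presto:

```sql
-- Runs unchanged whether lineitem is backed by hdfs:// or jfs://;
-- the storage location never appears in the query itself.
SELECT L_RETURNFLAG,
       L_LINESTATUS,
       SUM(L_QUANTITY)      AS sum_qty,
       SUM(L_EXTENDEDPRICE) AS sum_base_price
FROM lineitem
WHERE L_SHIPDATE <= '1998-09-02'
GROUP BY L_RETURNFLAG, L_LINESTATUS;
```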