
E-MapReduce: Use Hive to access Phoenix data in EMR

Last Updated: Feb 26, 2024

In Alibaba Cloud E-MapReduce (EMR), you can use a Hive external table to access and process data that is stored in Phoenix. This topic describes how to use Hive in your EMR cluster to access Phoenix data.

Prerequisites

  • A custom cluster that contains the Hive, HBase, ZooKeeper, and Phoenix services has been created. For more information, see Create a cluster.

    Note: EMR V4.X and EMR V5.X do not support Phoenix. Therefore, this topic applies only to EMR V3.X.
  • You have logged on to the cluster. For more information, see Log on to a cluster.

Limits

This topic applies only to EMR V3.X clusters.

Procedure

To use Hive to access the existing Phoenix table phoenix_hive_create_internal, create an external table in Hive and map it to the Phoenix table. You can then query the Phoenix data through the Hive external table.
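The procedure assumes that phoenix_hive_create_internal already exists, but this topic does not show how it was defined. Judging by its name and by the columns mapped in the procedure, one possibility is that it was created from Hive as a managed (internal) table through the same storage handler, in which case the storage handler also creates the corresponding Phoenix table. The following is a minimal sketch under that assumption, to be run in the Hive CLI. The column list and row keys are inferred from the external table definition, not confirmed by the source, and the sample row is made up.

```sql
-- ASSUMPTION: a Hive-managed table backed by Phoenix; the actual definition
-- of phoenix_hive_create_internal is not given in this topic.
create table phoenix_hive_create_internal(
  s1 string,
  i1 int,
  f1 float,
  d1 double
)
stored by 'org.apache.phoenix.hive.PhoenixStorageHandler'
tblproperties(
  "phoenix.table.name" = "phoenix_hive_create_internal",
  "phoenix.rowkeys" = "s1, i1"
);

-- Hypothetical sample row so that a later query has something to return.
insert into table phoenix_hive_create_internal values ('a', 1, 1.0, 2.0);
```

If the Phoenix table was instead created directly in Phoenix, skip this sketch and keep the existing definition.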

  1. Run the following command to open the Hive CLI:

    hive
  2. Execute the following statement to create an external table named ext_table in Hive and map it to the Phoenix table:
    create external table ext_table(
      s1 string,
      i1 int,
      f1 float,
      d1 double
    )
    stored by 'org.apache.phoenix.hive.PhoenixStorageHandler'
    tblproperties(
      "phoenix.table.name" = "phoenix_hive_create_internal",
      "phoenix.rowkeys" = "s1, i1",
      "phoenix.column.mapping" = "s1:s1, i1:i1, f1:f1, d1:d1"
    );
  3. Execute the following statement to query the data of the Phoenix table from Hive:

    select * from ext_table;

    If the query returns the data of the Phoenix table, Hive can access the Phoenix data.
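As a follow-up to the procedure above, the following sketch inspects the mapping and runs a filtered query and an aggregation against the external table. It uses only the table and column names defined in step 2. The filter values and the aggregation are illustrative assumptions and are not part of the original procedure.

```sql
-- Show the storage handler and table properties that Hive recorded for the table.
describe formatted ext_table;

-- Illustrative filter on the row key columns; 'a' and 0 are made-up values.
select s1, i1, f1, d1
from ext_table
where s1 = 'a' and i1 > 0;

-- Illustrative aggregation over the mapped columns.
select s1, count(*) as row_count, avg(d1) as avg_d1
from ext_table
group by s1;
```

Because ext_table is an external table, dropping it in Hive removes only the Hive metadata and leaves the Phoenix table and its data in place.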
