In Alibaba Cloud E-MapReduce (EMR), you can use a Hive external table to access and process data that is stored in Phoenix. This topic describes how to use Hive in your EMR cluster to access EMR Phoenix data.
Prerequisites
A custom cluster that contains the Hive, HBase, ZooKeeper, and Phoenix services is created. For more information, see Create a cluster.
Note: EMR V4.X and EMR V5.X do not support Phoenix. Therefore, this topic applies only to EMR V3.X.
You have logged on to the cluster. For more information, see Log on to a cluster.
Limits
This topic applies only to EMR V3.X clusters.
Procedure
To access an existing Phoenix table from Hive, create an external table in Hive and map it to the Phoenix table. The following steps use the existing Phoenix table phoenix_hive_create_internal as an example.
Run the following command to open the Hive CLI:
hive
Run the following command to create an external table named ext_table in Hive and map it to the Phoenix table:
create external table ext_table(
  s1 string,
  i1 int,
  f1 float,
  d1 double
)
stored by 'org.apache.phoenix.hive.PhoenixStorageHandler'
tblproperties(
  "phoenix.table.name" = "phoenix_hive_create_internal",
  "phoenix.rowkeys" = "s1, i1",
  "phoenix.column.mapping" = "s1:s1, i1:i1, f1:f1, d1:d1"
);
Run the following statement to query the data of the Phoenix table from Hive:
select * from ext_table;
If the query returns the data of the Phoenix table, Hive can access the Phoenix data.
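The table definition in the procedure does not set any ZooKeeper connection properties, so the Phoenix storage handler falls back to its defaults (localhost, port 2181, and the /hbase znode). If those defaults do not match your cluster, the connection can be specified explicitly. The following is a minimal sketch under that assumption, not a required step: the table name ext_table_zk and the host name zk-host-1 are hypothetical placeholders that do not come from this topic, and the port and znode values shown are simply the handler defaults.

```sql
-- Sketch only. ext_table_zk and zk-host-1 are hypothetical placeholders;
-- replace the three phoenix.zookeeper.* values with those of your own cluster.
create external table ext_table_zk(
  s1 string,
  i1 int,
  f1 float,
  d1 double
)
stored by 'org.apache.phoenix.hive.PhoenixStorageHandler'
tblproperties(
  "phoenix.table.name" = "phoenix_hive_create_internal",
  "phoenix.zookeeper.quorum" = "zk-host-1",
  "phoenix.zookeeper.client.port" = "2181",
  "phoenix.zookeeper.znode.parent" = "/hbase",
  "phoenix.rowkeys" = "s1, i1",
  "phoenix.column.mapping" = "s1:s1, i1:i1, f1:f1, d1:d1"
);

-- Read a few rows and only some of the mapped columns as a quick check.
select s1, i1, d1 from ext_table_zk limit 10;
```

Because the table is external, dropping it in Hive removes only the Hive metadata. The Phoenix table and its data are not deleted.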
References
For more information about Phoenix, see Phoenix.
For more information about how to connect Phoenix to Hive, see Phoenix Storage Handler for Apache Hive.