After you integrate Impala with Kudu, you can use Impala to access data tables in Kudu. This topic describes how to integrate Impala with Kudu.
Prerequisites
An E-MapReduce (EMR) cluster is created, and Impala and Kudu are selected from the optional services when you create the cluster. For more information, see Create a cluster.
Procedure
Use the EMR console
On the Configure tab of the Impala service page, add configuration items. For more information, see Manage configuration items.
On the Configure tab of the Impala service page, click impalad.flgs.
On the impalad.flgs tab, click Add Configuration Item to add a configuration item whose name is kudu_master_hosts and value is master-1-1:7051.
Note: kudu_master_hosts specifies the hostname and port number of the master node of the Kudu cluster that Impala connects to. If the Kudu cluster contains multiple master nodes, list each master node as hostname:port and separate the entries with commas (,). Example: master-1-1:7051,master-1-2:7051,master-1-3:7051.
Click the catalogd.flgs tab. On the catalogd.flgs tab, click Add Configuration Item to add a configuration item whose name is kudu_master_hosts and value is master-1-1:7051.
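The value format of kudu_master_hosts can be sketched as a small Python helper. This is only an illustration: the function name build_kudu_master_hosts is hypothetical, and the default port 7051 is taken from the examples in this topic. The helper does not talk to EMR or Kudu; it only produces the string that you enter in the console.

```python
def build_kudu_master_hosts(hosts, port=7051):
    """Join Kudu master hostnames into a host:port,host:port string.

    Hypothetical helper: it only formats the value used for the
    kudu_master_hosts configuration item in this topic.
    """
    if not hosts:
        raise ValueError("at least one Kudu master host is required")
    entries = []
    for host in hosts:
        host = host.strip()
        if not host:
            raise ValueError("empty host name")
        # Keep an explicit port if the caller already supplied one.
        entries.append(host if ":" in host else f"{host}:{port}")
    return ",".join(entries)


print(build_kudu_master_hosts(["master-1-1", "master-1-2", "master-1-3"]))
# prints master-1-1:7051,master-1-2:7051,master-1-3:7051
```

The same string is used for both the impalad.flgs and catalogd.flgs configuration items.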
Optional. Log on to the cluster to check whether Impala is integrated with Kudu.
Connect to Impala. For more information, see Use the Impala shell tool.
Run the following command to create a table:
create table my_first_table (
  id bigint,
  name string,
  primary key(id)
)
partition by hash partitions 16
stored as kudu
tblproperties(
  'kudu.num_tablet_replicas' = '1'
);
If the output contains "Table has been created.", the table is created. This indicates that Impala is integrated with Kudu.
Use a CLI
Connect to Impala. For more information, see Use the Impala shell tool.
Run the following command to create a table. The kudu.master_addresses property in the statement specifies the Kudu cluster. Example:
create table my_first_table (
  id bigint,
  name string,
  primary key(id)
)
partition by hash partitions 16
stored as kudu
tblproperties(
  'kudu.master_addresses' = 'master-1-1:7051',
  'kudu.num_tablet_replicas' = '1'
);
Note: Parameters in the sample code:
my_first_table: The name of the table. You can specify a custom name.
kudu.master_addresses: The hostname and port number of the Kudu master node. If your cluster contains multiple master nodes, list each master node as hostname:port and separate the entries with commas (,). Example: master-1-1:7051,master-1-2:7051,master-1-3:7051. If your cluster is a Hadoop cluster, change master-1-1 to emr-header-1.
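If you create tables against different Kudu clusters, the sample statement can be generated rather than edited by hand. The following is a minimal sketch that assumes the same columns and properties as the sample above. The function name kudu_create_table_sql and its parameters are hypothetical. It only renders text and does not connect to Impala.

```python
def kudu_create_table_sql(table, master_addresses, replicas=1, partitions=16):
    """Render the sample CREATE TABLE statement for a given Kudu cluster.

    Hypothetical helper: master_addresses is a list of host:port strings
    that is joined with commas for the kudu.master_addresses property.
    """
    # Allow only letters, digits, and underscores in the table name.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    masters = ",".join(addr.strip() for addr in master_addresses)
    if not masters:
        raise ValueError("at least one Kudu master address is required")
    return (
        f"create table {table} (\n"
        "  id bigint,\n"
        "  name string,\n"
        "  primary key(id)\n"
        ")\n"
        f"partition by hash partitions {partitions}\n"
        "stored as kudu\n"
        "tblproperties(\n"
        f"  'kudu.master_addresses' = '{masters}',\n"
        f"  'kudu.num_tablet_replicas' = '{replicas}'\n"
        ");"
    )


print(kudu_create_table_sql(
    "my_first_table",
    ["master-1-1:7051", "master-1-2:7051", "master-1-3:7051"],
))
```

You can paste the printed statement into the Impala shell in the same way as the sample command.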
If the output contains "Table has been created.", the table is created. This indicates that Impala is integrated with Kudu.
Optional. Run the following command to insert data into the table:
insert into my_first_table values(1,"ss");
Optional. Run the following command to query data in the table:
select * from my_first_table;
The following output is returned:
+----+------+
| id | name |
+----+------+
| 1  | ss   |
+----+------+
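If you want to check this result in a script instead of reading it by eye, the grid can be parsed with a few lines of Python. This is a rough sketch that assumes the output layout shown above: border lines start with a plus sign (+), and cells are separated by vertical bars (|). The function name parse_impala_table is hypothetical, and values that themselves contain a vertical bar are not handled.

```python
def parse_impala_table(text):
    """Parse a pretty-printed result grid into a list of dicts.

    Hypothetical helper: the first data line is treated as the header,
    and every cell value is returned as a string.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Border lines such as +----+------+ carry no data.
        if not line.startswith("|"):
            continue
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


sample = """\
+----+------+
| id | name |
+----+------+
| 1  | ss   |
+----+------+
"""
print(parse_impala_table(sample))
# prints [{'id': '1', 'name': 'ss'}]
```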
Note: If you want to drop the table, run the following command:
drop table my_first_table;