After you integrate Impala with Kudu, you can use Impala to access data tables in Kudu. This topic describes how to integrate Impala with Kudu.

Prerequisites

An E-MapReduce (EMR) cluster is created with Impala and Kudu selected from the optional services. For more information, see Create a cluster.

Integrate Impala with Kudu

You can integrate Impala with Kudu by using one of the following methods:
  • Method 1: Use a command-line interface (CLI)
    1. Connect to Impala. For more information, see Use the Impala shell tool.
    2. Run the following statement to create a table.
      The kudu.master_addresses property in the statement specifies the master nodes of the Kudu cluster. Example:
      create table my_first_table
          (
            id bigint,
            name string,
            primary key(id)
          )
          partition by hash partitions 16
          stored as kudu
          tblproperties(
            'kudu.master_addresses' = 'emr-header-1:7051,emr-header-2:7051,emr-header-3:7051',
            'kudu.num_tablet_replicas' = '1');
      Note In this example, a table named my_first_table is created. You can use a different table name.
      If the statement completes without an error, the table is created.
    3. Optional: Run the following statement to insert data into the table:
      insert into my_first_table values(1,"ss");
    4. Optional: Run the following statement to query data in the table:
      select * from my_first_table;
      The rows in the table are returned.
      Note If you want to drop the table, run the drop table my_first_table; command.
  • Method 2: Use the EMR console
    1. Add a configuration item in the EMR console.
      1. Go to the Impala service page and click the Configure tab. In the Service Configuration section, click the impalad.flgs tab.
      2. Click Custom Configuration in the upper-right corner.
      3. In the Add Configuration Item dialog box, add the kudu_master_hosts parameter and set the value to emr-header-1:7051,emr-header-2:7051,emr-header-3:7051 to specify a Kudu cluster.
      4. Click OK.
      5. Repeat substeps 1 to 4 on the catalogd.flgs tab: add the kudu_master_hosts parameter and set the value to emr-header-1:7051,emr-header-2:7051,emr-header-3:7051.
    2. Save the configurations.
      1. Click Save in the upper-right corner.
      2. In the Confirm Changes dialog box, specify Description and click OK.
    3. Restart the Impala service.
      1. In the upper-right corner of the Impala service page, choose Actions > Restart All Components.
      2. In the Cluster Activities dialog box, specify Description and click OK.
      3. In the Confirm message, click OK.
    4. Optional: Log on to the cluster to check whether Impala is integrated with Kudu.
      1. Connect to Impala. For more information, see Use the Impala shell tool.
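      2. Run the following statement to create a table.
        Because kudu_master_hosts is configured for Impala, the statement does not need the kudu.master_addresses property. Example: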
        create table my_first_table
            (
              id bigint,
              name string,
              primary key(id)
            )
            partition by hash partitions 16
            stored as kudu
            tblproperties(
              'kudu.num_tablet_replicas' = '1');
        If the statement completes without an error, the table is created. This indicates that Impala is integrated with Kudu.
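
Whichever method you use, you can check the result from the Impala shell. The following is a minimal sketch, not a required procedure. It assumes that my_first_table from the examples above already exists and that you are connected to Impala. The value 'tt' is a placeholder chosen only for illustration.

```sql
-- Print the DDL that Impala recorded for the table. For a Kudu-backed table,
-- the output is expected to contain a STORED AS KUDU clause.
show create table my_first_table;

-- UPSERT is supported for Kudu tables: it inserts the row, or updates the
-- existing row if a row with the same primary key (id = 1) already exists.
upsert into my_first_table values (1, 'tt');

-- Read the data back through Impala.
select * from my_first_table;
```

UPSERT is used here instead of INSERT so that the sketch can be run again, or run after step 3 of Method 1, without a duplicate primary key conflict.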