This topic describes how to use DLA CU Edition to read data from and write data to a self-managed Kudu database by using standard SQL statements.

Prerequisites

  • DLA CU Edition is activated. For more information, see Quick start for the DLA Presto-compatible SQL CU edition. Only DLA CU Edition can access a Kudu database.
  • The network of the virtual cluster (VC) must be in the same VPC as the Kudu cluster. For more information about the network of the VC, see Data source network.
  • DLA CU Edition cannot access the Kudu clusters that have enabled Kerberos authentication. To access these clusters, submit a ticket or contact DLA technical support through DingTalk.

Preparations

Before you read data from and write data to a Kudu database, you must create a test table in the Kudu database. For more information about the design of tables in a Kudu database, visit the following link: https://kudu.apache.org/docs/schema_design.html

Note DLA does not allow you to create tables in the Kudu database by using SQL statements. It only allows you to associate an existing table in the Kudu database with that in DLA by using SQL statements. Specifically, you must create a table in the Kudu database, and then use the table creation statement of DLA to associate the created table with a table in DLA.
The following code snippet demonstrates how to create a table in Java:
        String KUDU_MASTERS = "master-1:7051,master-2:7051,master-3:7051";
        String tableName = "users";
    
    KuduClient client = new KuduClient.KuduClientBuilder(KUDU_MASTERS).build();

    // Set up a simple schema.
    List<ColumnSchema> columns = new ArrayList<>(3);
    columns.add(new ColumnSchema.ColumnSchemaBuilder("user_id", Type.INT32)
            .key(true)
            .build());
    columns.add(new ColumnSchema.ColumnSchemaBuilder("first_name", Type.STRING).nullable(true)
            .build());
    columns.add(new ColumnSchema.ColumnSchemaBuilder("last_name", Type.STRING).nullable(true)
            .build());
    Schema schema = new Schema(columns);

    // Set up the partition schema, which distributes rows to different tablets by hash.
    // Kudu also supports partitioning by key range. Hash and range partitioning can be combined.
    // For more information, see http://kudu.apache.org/docs/schema_design.html.
    CreateTableOptions cto = new CreateTableOptions();
    List<String> hashKeys = new ArrayList<>(1);
    hashKeys.add("user_id");
    int numBuckets = 2;
    cto.addHashPartitions(hashKeys, numBuckets);
    cto.setNumReplicas(1);

    // Create the table.
    client.createTable(tableName, schema, cto);
    System.out.println("Created table " + tableName);

Procedure

  1. Connect to DLA.
  2. Create a Kudu database.
    CREATE DATABASE `kudu_test`
    WITH DBPROPERTIES (
        catalog = 'kudu',
        location = 'master-1:7051,master-2:7051,master-3:7051'
    );
    Parameters:
    • catalog: the schema you created. The value kudu indicates that a Kudu schema is created.
    • location: the addresses of the Kudu master node, which are separated by commas (,).
  3. Create a table.
    CREATE EXTERNAL TABLE users (
    user_id int primary key,
      first_name varchar,
      last_name varchar);
    Notice The table name and sequence and data type of fields must be the same as those of the table in the Kudu database.
  4. Access data.

    Add hint: /*+cluster=your-vc-name*/ to the SQL statements that are used to access tables in the Kudu database. This step is required because only CUs of DLA CU Edition can access the Kudu database. After this hint is added, the SQL statements can be properly executed on DLA CU Edition.

    Example:
    mysql> /*+ cluster=vc-test */ insert into kudu_it_db_vc.users values(1, 'Donald', 'Duck');
    +------+
    | rows |
    +------+
    |    1 |
    +------+
    1 row in set (0.46 sec)
    
    mysql> /*+ cluster=vc-test */ select user_id,first_name,last_name from kudu_it_db_vc.users where user_id = 1;
    +---------+------------+-----------+
    | user_id | first_name | last_name |
    +---------+------------+-----------+
    |       1 | Donald     | Duck      |
    +---------+------------+-----------+
    1 row in set (0.43 sec)

For more information about SQL statements, see SQL Reference.