E-MapReduce: Getting started

Last Updated: May 29, 2023

This topic describes how to use StarRocks clusters in Alibaba Cloud E-MapReduce (EMR) to create and query tables and manage internal and external data.

Prerequisites

A StarRocks cluster is created. For more information, see Create a StarRocks cluster.

Create and query a table

  1. Log on to the StarRocks cluster by using Secure Shell (SSH). For more information, see Log on to a cluster.

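  2. Run the following command to connect to the StarRocks cluster by using the MySQL client. Port 9030 is the default query port of the StarRocks frontend (FE):

    mysql -h 127.0.0.1 -P 9030 -u root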
  3. Run the following statements to create a database named load_test if it does not already exist, and then switch to it:

    CREATE DATABASE IF NOT EXISTS load_test;
    USE load_test;
  4. Run the following command to create a table:

    CREATE TABLE insert_wiki_edit
    (
        event_time DATETIME,
        channel VARCHAR(32) DEFAULT '',
        user VARCHAR(128) DEFAULT '',
        is_anonymous TINYINT DEFAULT '0',
        is_minor TINYINT DEFAULT '0',
        is_new TINYINT DEFAULT '0',
        is_robot TINYINT DEFAULT '0',
        is_unpatrolled TINYINT DEFAULT '0',
        delta INT SUM DEFAULT '0',
        added INT SUM DEFAULT '0',
        deleted INT SUM DEFAULT '0'
    )
    AGGREGATE KEY(event_time, channel, user, is_anonymous, is_minor, is_new, is_robot, is_unpatrolled)
    PARTITION BY RANGE(event_time)
    (
        PARTITION p06 VALUES LESS THAN ('2015-09-12 06:00:00'),
        PARTITION p12 VALUES LESS THAN ('2015-09-12 12:00:00'),
        PARTITION p18 VALUES LESS THAN ('2015-09-12 18:00:00'),
        PARTITION p24 VALUES LESS THAN ('2015-09-13 00:00:00')
    )
    DISTRIBUTED BY HASH(user) BUCKETS 10
    PROPERTIES("replication_num" = "1");
  5. Run the following command to import test data:

    INSERT INTO insert_wiki_edit VALUES
        ('2015-09-12 00:00:00', '#en.wikipedia', 'GELongstreet', 0, 0, 0, 0, 0, 36, 36, 0),
        ('2015-09-12 00:00:00', '#ca.wikipedia', 'PereBot', 0, 1, 0, 1, 0, 17, 17, 0);
  6. Run the following command to query the test data:

    SELECT * FROM insert_wiki_edit;

    The following information is returned:

    +---------------------+---------------+--------------+--------------+----------+--------+----------+----------------+-------+-------+---------+
    | event_time          | channel       | user         | is_anonymous | is_minor | is_new | is_robot | is_unpatrolled | delta | added | deleted |
    +---------------------+---------------+--------------+--------------+----------+--------+----------+----------------+-------+-------+---------+
    | 2015-09-12 00:00:00 | #en.wikipedia | GELongstreet |            0 |        0 |      0 |        0 |              0 |    36 |    36 |       0 |
    | 2015-09-12 00:00:00 | #ca.wikipedia | PereBot      |            0 |        1 |      0 |        1 |              0 |    17 |    17 |       0 |
    +---------------------+---------------+--------------+--------------+----------+--------+----------+----------------+-------+-------+---------+
    2 rows in set (0.16 sec)
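
Because insert_wiki_edit is an AGGREGATE KEY table, rows that share the same key columns are merged, and the delta, added, and deleted columns are summed. The following statements are a minimal sketch of how to observe this behavior, assuming the two test rows above are already loaded. The values in the extra row are made up for illustration.

    -- Hypothetical extra row with the same key columns as the existing GELongstreet row.
    INSERT INTO insert_wiki_edit VALUES
        ('2015-09-12 00:00:00', '#en.wikipedia', 'GELongstreet', 0, 0, 0, 0, 0, 10, 10, 0);

    -- The table should still contain two rows. For GELongstreet, delta and added
    -- should now be 46 (36 + 10) because both columns use the SUM aggregate function.
    SELECT user, delta, added, deleted FROM insert_wiki_edit ORDER BY user;

    -- List the partitions defined in the CREATE TABLE statement (p06, p12, p18, and p24).
    SHOW PARTITIONS FROM insert_wiki_edit;

Note that the table defines partitions only for 2015-09-12, so rows whose event_time falls outside that range are expected to be rejected unless you add a matching partition first.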

Use catalogs to manage internal and external data

StarRocks clusters in EMR V5.8.0 and later allow you to use catalogs to manage internal and external data. StarRocks V2.3 and later provide the following two types of catalogs:

  • Internal catalogs: These catalogs manage the internal databases and tables of a StarRocks cluster. Databases and tables that you create by executing the CREATE DATABASE and CREATE TABLE statements are stored in an internal catalog. Each StarRocks cluster has a default internal catalog named default_catalog. You cannot rename the default internal catalog or create additional internal catalogs in a StarRocks cluster.

  • External catalogs: These catalogs manage data from external data sources. When you create an external catalog, you must specify the information that is required to access the external data source. After the external catalog is created, you can query data from the external data source without creating an external table.
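
As a rough illustration of the two catalog types, the following statements list the catalogs in a cluster, query the table created earlier through default_catalog, and create an external catalog. The catalog name hive_catalog_demo, the metastore address, and the names hive_db and hive_table are placeholders. The property names follow the StarRocks Hive catalog syntax and are an assumption about your environment, so check the properties that your StarRocks version supports before you use them.

    -- List all catalogs in the cluster. default_catalog is always present.
    SHOW CATALOGS;

    -- Query an internal table by its fully qualified name: catalog.database.table.
    SELECT * FROM default_catalog.load_test.insert_wiki_edit;

    -- Assumption: a Hive metastore is reachable at the placeholder address below.
    CREATE EXTERNAL CATALOG hive_catalog_demo
    PROPERTIES
    (
        "type" = "hive",
        "hive.metastore.uris" = "thrift://metastore-host.example:9083"
    );

    -- Query external data directly, without creating an external table.
    -- hive_db and hive_table stand for an existing Hive database and table.
    SELECT * FROM hive_catalog_demo.hive_db.hive_table LIMIT 10;

Using the three-part name makes it explicit which catalog a query targets, which helps when internal and external databases share the same name.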

For more information, see Overview of catalogs.