All Products
Search
Document Center

ApsaraDB for ClickHouse:Data dictionaries

Last Updated:Oct 16, 2024

In data warehousing development and data analysis, when you need to store constant data in business fact tables and business data in business dimension tables, you can use data dictionaries instead of dimension tables to store data. This way, you do not need to execute a JOIN statement. This makes queries more efficient. This topic describes how to create, modify, and use an external dictionary of ApsaraDB for ClickHouse.

Prerequisites

The data source and the destination cluster use the same virtual private cloud (VPC) and reside in the same region. The IP address of the data source is added to the whitelist of the destination cluster, and the IP address of the destination cluster is added to the whitelist of the data source. Otherwise, resolve the network issue first. For more information, see the What do I do if a connection fails to be established between the destination cluster and the data source? section of the FAQ topic.

Note

You can execute the SELECT * FROM system.clusters; statement to query the IP address of an ApsaraDB for ClickHouse cluster.

For more information about how to configure a whitelist for an ApsaraDB for ClickHouse cluster, see Configure a whitelist.

Note

You cannot manage the data dictionaries of ApsaraDB for ClickHouse clusters of V21.8 or later in the console. You can use SQL statements to view and create a data dictionary. For more information, see CREATE DICTIONARY.

Create and modify an external dictionary

  1. Log on to the ApsaraDB for ClickHouse console.

  2. In the top navigation bar, select the region where the cluster that you want to manage is deployed.

  3. On the Clusters page, click the Clusters of Community-compatible Edition tab, find the cluster that you want to manage, and then click the ID of the cluster.

  4. In the left-side navigation pane, click Dictionary Management to go to the Dictionary Management page.

  5. In the upper-right corner, click Add Dictionary Configuration.

  6. In the Add Dictionary Configuration dialog box that appears, edit the template and specify the dictionary content.

    The following table describes only the key parameters for dictionary configuration. For more information about the parameters, see Dictionaries.

    Parameter

    Description

    <name>

    The name of the external dictionary that you want to create. The name must be globally unique.

    <source>

    The data source of the dictionary. ApsaraDB for ClickHouse supports the following data sources:

    • MySQL

    • ClickHouse

    <lifetime>

    The interval at which data is updated in the dictionary. Unit: seconds.

    <layout>

    The format of data stored in memory. ApsaraDB for ClickHouse supports the following data formats for an external dictionary:

    • Numeric key

      • flat

      • hashed

      • range_hashed

      • cache

    • Composite key

      • complex_key_hashed

      • complex_key_cache

      • ip_trie

    <structure>

    The data structure of the dictionary.

    Note

    You can add only one node at a time.

  7. Click OK. The data dictionary is created.

    You can click a link in the Actions column of the created dictionary to view, modify, or delete the dictionary.

Execute a DDL statement to create an external dictionary table

If the version of your ApsaraDB for ClickHouse cluster is later than V20.8, you can execute a DDL statement to create an external dictionary table. You can view the version number of an ApsaraDB for ClickHouse cluster in the ApsaraDB for ClickHouse console. Click the cluster ID to go to the Cluster Information page. In the Cluster Properties section, you can view the version number.

CREATE DICTIONARY [IF NOT EXISTS] [db.]dictionary_name [ON CLUSTER cluster]
(
    key1 type1  [DEFAULT|EXPRESSION expr1] [HIERARCHICAL|INJECTIVE|IS_OBJECT_ID],
    key2 type2  [DEFAULT|EXPRESSION expr2] [HIERARCHICAL|INJECTIVE|IS_OBJECT_ID],
    attr1 type2 [DEFAULT|EXPRESSION expr3],
    attr2 type2 [DEFAULT|EXPRESSION expr4]
)
PRIMARY KEY key1, key2
SOURCE(SOURCE_NAME([param1 value1 ... paramN valueN]))
LAYOUT(LAYOUT_NAME([param_name param_value]))
LIFETIME([MIN val1] MAX val2)

Use an external dictionary

Query an external dictionary table

SELECT
    name,
    type,
    key,
    attribute.names,
    attribute.types,
    bytes_allocated,
    element_count,
    source
FROM system.dictionaries

Query data in an external dictionary

Use the dictGet function to query data in a dictionary. For more information about the dictGet function, see Functions for Working with Dictionaries.

dictGet(<dict_name>, <attr_name>, <id_expr>)
dictGetOrDefault(<dict_name>, <attr_name>, <id_expr>, <default_value_expr>)