This topic describes how to use Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition).

Prerequisites

Limits

Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) has the following limits:
  • Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) does not allow you to create internal tables. You can create only foreign tables to accelerate queries on data in MaxCompute tables.
  • Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instances share resources. The specifications of an instance cannot be upgraded or downgraded.
  • Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) can scan a maximum of 1,024 partitions from a partitioned table during a single query.
  • Data types such as MAP, LIST, and STRUCT are not supported. For more information about data type mappings between MaxCompute and Hologres, see Data types.
  • Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) can scan a maximum of 200 GB data from a table during a single query.
  • The size of a query statement that you commit cannot exceed 100 KB.
  • By default, the query timeout period is 30 minutes. You can change the query timeout period by modifying the statement_timeout parameter.
  • The total number of the JOIN and GROUP BY keywords in a single query cannot be greater than 20.
  • You cannot use the COPY statement to import or export data.
  • You can create only the hive_compatible extension. You cannot create other extensions. For more information, see GET_JSON_OBJECT.
  • You cannot execute the ANALYZE statement on tables.
  • Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) does not support DML statements and supports only the following DDL statements:
    • DATABASE
      • CREATE DATABASE
      • ALTER DATABASE
      • DROP DATABASE
    • SCHEMA
      • CREATE SCHEMA
      • ALTER SCHEMA
      • DROP SCHEMA
    • FOREIGN TABLE
      • CREATE FOREIGN TABLE
      • DROP FOREIGN TABLE
      • IMPORT FOREIGN SCHEMA

Use Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition)

  1. Purchase a Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.
    1. Log on to the Alibaba Cloud official website by using your Alibaba Cloud account.
    2. Go to the Hologres product page.
    3. Click Buy Now to go to the buy page.
    4. Select Shared Cluster (BI-based MaxCompute Acceleration Edition) as the service type, select a region, and then enter an instance name. Then, click Buy Now.
  2. Create a database in the Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.
    After the instance is purchased, go to the Hologres console to view the status of the instance.
    After the instance is created, the system automatically creates a default database named postgres. The default database is used to monitor and manage the instance that you have created. If you need to create other databases to process business data, perform the following steps:
    1. On the Instances page, click the name of the instance to go to the instance details page.
      You can also click Manage in the Actions column of the instance to go to the instance details page.
    2. In the left-side navigation pane of the instance details page, click Database Management.
    3. On the Database Authorization page, click Create Database in the upper-right corner.
    4. In the Create Database dialog box, enter a database name in the Database Name field and specify whether to enable the simple permission model (SPM) by setting the SPM parameter based on your business requirements. Create a database

      Hologres allows you to use the standard PostgreSQL authorization model and SPM to manage user permissions.

      The standard PostgreSQL authorization model is exactly the same as that in PostgreSQL. This model is also called the expert mode for short. For more information, see Standard PostgreSQL authorization model.

      Backed by its understanding of customers' business and its practical experience, Alibaba Cloud introduced the SPM to Hologres to simplify the management of user permissions. For more information, see Overview.

      If you want simple permission management, we recommend that you select SPM when you create a database.

    5. Click OK.
      You can view the created database on the Database Authorization page.
  3. Connect a development tool to the Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.
    Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) is compatible with PostgreSQL and provides the Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) drivers. After you create a database, you can connect a development tool to the Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance to analyze data. You can select an appropriate development tool based on your business requirements. For more information, see Overview.
    In this example, HoloWeb is used. HoloWeb is a tool that is developed by the Hologres team. To connect HoloWeb to the instance, perform the following steps:
    1. In the Hologres console, click Go to HoloWeb in the left-side navigation pane to go to the HoloWeb console.
    2. Choose Metadata Management > Add Instance. Add Instance
    3. In the Add Instance dialog box, set the parameters. Add Instance dialog box
      Parameter Description
      Name The custom name of the instance.
      Description The description of the instance.
      Network type Select the network type and region of the instance. Valid values of the network type:
      • public network
      • VPC
      Instance name Select the instance that you have created in the selected region based on your business requirements.

      After you select the instance, the Host and Port parameters are automatically set.

      Host The endpoint of the Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.

      You can go to the instance details page in the Hologres console and view the endpoint of the instance.

      Port The port number of the Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.

      You can obtain the port number on the instance details page in the Hologres console.

      Username The system automatically displays the AccessKey ID of your Alibaba Cloud account.

      You can also obtain the AccessKey ID from the Security Management page.

      Password The system automatically displays the AccessKey secret of your Alibaba Cloud account.

      You can also obtain the AccessKey secret from the Security Management page.

      Test Connectivity Check whether the data connection is successful.
      • If the connection is successful, the Test passed message appears.
      • If the connection fails, the Test failed message appears.
    4. Click OK.
  4. Accelerate queries on data in MaxCompute tables.
    After the instance is connected to HoloWeb, you can create foreign tables to accelerate queries on data in MaxCompute tables.
    In this example, a foreign table is created in HoloWeb to query table data of the MaxCompute project public_data in the public dataset. To achieve this, perform the following steps:
    1. Create a foreign table.
      1. Choose Metadata Management > MaxCompute Acceleration > Create Foreign Table to create a foreign table in a visualized manner.
      2. On the New foreign table tab, set the parameters. TableThe following table describes the parameters.
        Parameter Description
        Table name The name of the Hologres foreign table.
        Description The description of the Hologres foreign table.
        Schema The name of the schema.

        You can select the default schema public or a newly created schema.

        Type The service type of the source table.

        The default value is MaxCompute, which cannot be changed.

        Server The foreign server that is used to query data in a MaxCompute table. You can use the odps_server server that is created at the underlying layer of Hologres. For more information, see postgres_fdw.
        Table The name of the source MaxCompute table to be mapped, including the name of the MaxCompute project.

        Format: project.table_name.

        Note
        • You cannot query data in a source table that is stored in a region different from that of the selected Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.
        • After you enter the name of a source table, the fields in the table are displayed. You can synchronize all the fields or part of the fields in the source table to the Hologres foreign table as needed.
      3. After you enter the name of the MaxCompute table to be queried, the fields in the table are displayed. Select fields to be synchronized based on your business requirements and click Submit.
      You can also create an SQL query window and create multiple foreign tables at a time by executing SQL statements in the SQL query window. For example, you can execute the following statement. For more information about how to create an SQL query window, see Manage an SQL query task.
      IMPORT FOREIGN SCHEMA public_data LIMIT to(
        customer,
        customer_address,
        customer_demographics,
        inventory,item,
        date_dim,
        warehouse) 
        FROM server odps_server INTO PUBLIC options(if_table_exist 'update');
    2. Preview the foreign table.
      Double-click the created foreign table. On the table configuration tab, click the Data preview tab to view data in the MaxCompute table that is mapped to the foreign table. Data preview tab
      You can also create an SQL query window and create multiple foreign tables at a time by executing SQL statements in the SQL query window. For more information about how to create an SQL query window, see Manage an SQL query task. For example, you can execute the following statement:
      IMPORT FOREIGN SCHEMA public_data LIMIT to(
        customer,
        customer_address,
        customer_demographics,
        inventory,item,
        date_dim,
        warehouse) 
        FROM server odps_server INTO PUBLIC options(if_table_exist 'update');
    3. Accelerate queries on data in source tables.
      To accelerate queries on data in source tables, use the following sample SQL statements:
      # SQL 1: Query the number of customers with the preferred flag specified, and sort the query results in descending order of the number of customers. 
      SELECT c_preferred_cust_flag,
             count(*) AS cnt
      FROM customer
      WHERE c_preferred_cust_flag IS NOT NULL
      GROUP BY c_preferred_cust_flag
      ORDER BY cnt DESC LIMIT 10;
      
      # SQL 2: Query the number of customers who were born in each year, and display the years in which more than 1,000 customers were born in descending order of the number of customers. 
      SELECT c_birth_year,
             count(*) AS cnt
      FROM customer
      WHERE c_birth_year IS NOT NULL
      GROUP BY c_birth_year HAVING count(*) > 1000
      ORDER BY cnt DESC LIMIT 10;
      
      # SQL 3: Query the number of customers in each city, and display the cities where more than 10 customers reside in descending order of the number of customers. 
      SELECT ca_city,
             count(*) AS cnt
      FROM customer ,
           customer_address
      WHERE c_current_addr_sk = ca_address_sk
        AND ca_city IS NOT NULL
      GROUP BY ca_city HAVING count(*) > 10
      ORDER BY cnt DESC LIMIT 10;
      
      # SQL 4: Query the number of customers who were born between 1980 and 1989 in each city, and display the cities where more than 10 customers born between 1980 and 1989 reside in descending order of the number of customers. 
      SELECT ca_city,
             count(*) AS cnt
      FROM customer ,
           customer_address
      WHERE c_current_addr_sk = ca_address_sk
        AND c_birth_year >= 1980
        AND c_birth_year < 1990
        AND c_preferred_cust_flag = 'Y'
        AND ca_city IS NOT NULL
      GROUP BY ca_city HAVING count(*) > 10
      ORDER BY cnt DESC LIMIT 10;
  5. Connect a business intelligence (BI) tool to the instance to perform visualized data analysis.
    Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) is compatible with PostgreSQL and can be directly connected to BI tools. After you create a foreign table to accelerate queries on data in a MaxCompute table, you can select an appropriate BI tool based on your business requirements and perform visualized data analysis. For more information about the BI tools that you can connect to Hologres, see Overview.
    In this example, Quick BI is used to connect to Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition). To achieve this, perform the following steps:
    1. Log on to the Quick BI console.
    2. Create a data source.
      1. In the Quick BI console, click Workstation in the top navigation bar.
      2. On the page that appears, click Data Sources in the left-side navigation pane.
      3. On the Data Sources page, click Create Data Source in the upper-right corner.
      4. Choose Cloud Databases > PostgreSQL.
      5. In the AddPostgreSQLData Sources dialog box, set the parameters. Add a data sourceThe following table describes the parameters.
        Parameter Description
        Name The custom name of the data source.
        Database Address The public endpoint of the Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.
        Port Number The public port number of the Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.
        Database The name of the database to be connected in the Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) instance.
        Schema Default value: public.
        Username The AccessKey ID of your Alibaba Cloud account.

        You can obtain the AccessKey ID from the Security Management page.

        Password The AccessKey secret of your Alibaba Cloud account.

        You can obtain the AccessKey secret from the Security Management page.

        VPC Data Source Clear this check box.
        SSL Clear this check box.
        After you set the parameters, you can click Test Connection to test the connection to the data source. If the test is successful, the connection between Hologres Shared Cluster (BI-based MaxCompute Acceleration Edition) and Quick BI is established.
      6. Click OK.
    3. Display data analysis results in a visualized manner.
      After the data source is connected, click the created connection in the My Data Sources section of the Data Sources page. Then, all tables in the data source are displayed. To analyze data in a visualized manner by executing SQL statements for ad hoc queries, perform the following steps:
      1. On the Data Sources page, click Ad Hoc Query in the upper-right corner to go to the Ad Hoc Query page.
      2. Enter an SQL statement to query data based on your business requirements and click Run.
      3. After you obtain the data that you want to query, click Create Dataset.
      4. In the Save Custom SQL dialog box, set the Name, Save To, and SQL parameters.
      5. Click OK.

      After you create a dataset, you can import data to create reports on the Dataset page and analyze the data in a visualized manner. For more information, see Use ad hoc queries for data modeling.

Data type mappings

For more information about data type mappings between Hologres and MaxCompute, see Data types.