Background information

  • The following table compares Hologres with MaxCompute.
    | Item | MaxCompute | Hologres |
    | --- | --- | --- |
    | Scenario | Data extract, transform, load (ETL), developed for the data warehouse detail (DWD) and data warehouse service (DWS) layers. | Interactive queries and online data services, developed for the AnalyticDB service. |
    | Usage | Asynchronous MaxCompute jobs. | Synchronous queries. |
    | Cluster | Large shared software as a service (SaaS)-based clusters. | Exclusive platform as a service (PaaS)-based clusters. |
    | Compute engine | Uses the Job Execution model to convert MaxCompute jobs to stages and applies for computing resources for each stage on demand. File-based persistence is implemented during job execution. | Uses the massively parallel processing (MPP) model, manages memory resources in a fine-grained manner, runs the execution engine in resident memory, and schedules fine-grained SQL operators in user mode. Computing results are not stored on disks. |
    | Scheduling mode | Supports the scheduling of processes. Resources are applied for and allocated on demand. | Supports lightweight threads. Resources are reserved upon startup. |
    | Scalability | Almost unrestricted. | Avoids data shuffles across nodes in complex queries. |
    | Storage mode | Column-oriented data storage. | Row-oriented, column-oriented, or hybrid row-column data storage. |
    | Storage cost | Provides data storage based on Apsara Distributed File System, which is cost-effective. | Provides data storage based on Apsara Distributed File System and cache-based acceleration based on SSDs, which brings relatively high costs. |
    | Supported interface | Interfaces compatible with MaxCompute SQL, which is equivalent to Hive SQL. | Interfaces compatible with PostgreSQL. |
  • Methods of query acceleration
    • Create a foreign table to accelerate queries

      Foreign tables do not store data. Instead, data is stored in MaxCompute source tables. Indexes cannot be created for foreign tables, so data queries in the foreign tables consume a large amount of CPU resources. Therefore, we recommend that you use this query acceleration method to query a small amount of data with low queries per second (QPS).
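
      For reference, a foreign table is typically created with the IMPORT FOREIGN SCHEMA statement. The following is a minimal sketch; my_project, my_table, and the ds partition column are placeholders, and odps_server is assumed to be the built-in Hologres server for MaxCompute:

      ```sql
      -- Map a MaxCompute table into Hologres as a foreign table.
      IMPORT FOREIGN SCHEMA my_project LIMIT TO (my_table)
      FROM SERVER odps_server INTO public
      OPTIONS (if_table_exist 'update');

      -- Queries on the foreign table read data directly from MaxCompute.
      SELECT count(*) FROM my_table WHERE ds = '20230601';
      ```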

    • Import data to a Hologres internal table to accelerate queries

      Data in Hologres internal tables is stored in Hologres. To maximize the advantages of Hologres, we recommend that you use this query acceleration method if the data to be queried has been updated, the queries are complex, or the QPS is high.
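
      A minimal sketch of this method follows; the table and column names are placeholders:

      ```sql
      -- Create a Hologres internal table whose columns match the MaxCompute table.
      CREATE TABLE holo_table (
          id   BIGINT,
          name TEXT,
          ds   TEXT
      );

      -- Import one partition from the foreign table into the internal table.
      INSERT INTO holo_table
      SELECT id, name, ds FROM mc_foreign_table WHERE ds = '20230601';
      ```

      You can also configure indexes on the internal table when you create it to further speed up queries.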

Common errors and troubleshooting methods

  • The specified partitions count in MaxCompute table: exceeds the limitation of 512, please add stricter partition filter or set axf_MaxCompute_partition_limit error or the Build desc failed: Exceeds the partition limitation of 512, current match xxx partitions error is reported.
    • Cause:

      The number of partitions queried in Hologres exceeds the upper limit of 512.

    • Solution:
      • Specify filter conditions to filter out unnecessary partitions before the query. This ensures that the number of partitions in a query does not exceed the upper limit.
      • If you import data to a Hologres internal table to accelerate data queries, no upper limit is set on the number of partitions. For more information, see Import data from MaxCompute to Hologres by executing SQL statements.
      • You can execute the following statements to set the upper limit. By default, the upper limit is 512. You can set it to an integer up to 1024. To ensure the query performance, we recommend that you do not set the upper limit to an integer that is too large.
        -- Hologres V1.1 and later
        set hg_foreign_table_max_partition_limit = 128;
        -- Hologres V0.10
        set hg_experimental_foreign_table_max_partition_limit = xxx;
        Note: If you set multiple levels of partitions for MaxCompute source tables, Hologres counts the number of queried partitions based on the finest-grained level.
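
        For example, a partition filter that keeps the number of matched partitions under the limit might look as follows; mc_foreign_table and the ds partition column are placeholder names:

        ```sql
        -- Query one week of partitions instead of the whole table.
        SELECT * FROM mc_foreign_table
        WHERE ds >= '20230601' AND ds <= '20230607';
        ```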
  • The Build desc failed: Exceeds the scan limitation of 200 GB, current scan xxx GB error is reported.
    • Cause:

      By default, Hologres allows a maximum of 200 GB of underlying data to be scanned in a query. This limit applies to the amount of data scanned in the MaxCompute partitions that a query hits, not to the amount of data stored in those partitions. The error is reported if the amount of scanned data exceeds the upper limit.

    • Solution:
      • Specify filter conditions to filter out unnecessary partitions before the query. This ensures that the amount of underlying data to be scanned in a query does not exceed the upper limit.
      • Import data from the MaxCompute table to a Hologres internal table before the query. For more information, see Import data from MaxCompute to Hologres by executing SQL statements.
      • Not recommended. Execute the set hg_experimental_foreign_table_max_scan_size = xxx; statement to raise the upper limit on the amount of underlying data that can be scanned. Replace xxx with a positive integer, such as 400. Unit: GB. However, if the value you set is too large, query performance may deteriorate or an out of memory (OOM) exception may occur.
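
        For example, to raise the session-level scan limit to 400 GB (again, large values are not recommended):

        ```sql
        -- Applies only to the current session; assumes Hologres accepts this GUC.
        set hg_experimental_foreign_table_max_scan_size = 400;
        ```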
  • The query next from foreign table executor failed, Not implemented error is reported.
    • Cause: You have run the tunnel.createStreamUploadSession command to write data to MaxCompute tables by using a streaming tunnel. However, Hologres does not support data writes to MaxCompute tables by using a streaming tunnel.
    • Solution:
      • Method 1: Recreate the MaxCompute table and run the tunnel.createUploadSession command to write data.
      • Method 2: Run the following command to merge the small files in the MaxCompute table, and then run the tunnel.createUploadSession command to write data:
        ALTER TABLE tablename [PARTITION] MERGE SMALLFILES;
  • The Build desc failed: failed to check permission: Currently not supported table type "view" error is reported.

    Cause: Hologres does not support queries on data in MaxCompute views.

  • The Build desc failed: failed to get foregin table split:MaxCompute-0010000: System internal error - get input pangu dir meta fai error is reported.
    • Cause:

      The parameters that you set to read MaxCompute data are not updated.

    • Solution:

      Wait a few minutes and try again. If the error still occurs, contact the technical support staff.

  • The ERROR: Query:[xxx] Build desc failed: failed to get foregin table split:ERPC_ERROR_CONNECTION_CLOSED error is reported.
    • Cause:

      Because the MaxCompute table contains a large number of small files, the metadata of the request exceeds the 1 GB upper limit imposed by the Remote Procedure Call (RPC) protocol.

    • Solution:
      • Execute the following statements to merge small files:
        set MaxCompute.merge.task.mode=sql;
        set MaxCompute.merge.lock.expire.time=0;
        ALTER TABLE <tablename> [PARTITION] MERGE SMALLFILES;
      • This feature is optimized in Hologres V0.10.21 and later. Update the Hologres instance. For more information, see Instance updates.
      • If the amount of data to be queried is small, contact the technical support staff of MaxCompute. They can help you write the required data to Hologres.
  • The ERROR: status { code: SERVER_INTERNAL_ERROR message: "hos_exception: IO error: Failed to execute pangu open normal file , err: PanguParameterInvalidException" } error is reported.
    • Cause:

      In Hologres V1.1, the execution engines of foreign tables cannot read MaxCompute data that is encrypted and stored in Apsara Distributed File System.

    • Solution:

      Execute the set hg_experimental_enable_access_MaxCompute_orc_via_holo = off; statement or update the Hologres instance to the latest version.

  • The failed to import foregin schema:Failed to get MaxCompute table:Not enable schema evolution error is reported.
    • Cause:

      The metadata of the MaxCompute table is modified.

    • Solution:
      • After the schema of the MaxCompute source table is updated, for example, after a column is added or deleted, you must execute the IMPORT FOREIGN SCHEMA statement to recreate the foreign table.
      • In some cases, the error persists after you execute the IMPORT FOREIGN SCHEMA statement. This may happen because the MaxCompute table is in the schema evolution state after its schema is modified. Tables in this state cannot be read by Hologres. To fix this error, recreate the MaxCompute table and then recreate the corresponding foreign table.
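
        A sketch of recreating the foreign table after a schema change; my_project and my_table are placeholders, and odps_server is assumed to be the built-in Hologres server for MaxCompute:

        ```sql
        -- Re-import the table definition so that it matches the new MaxCompute schema.
        IMPORT FOREIGN SCHEMA my_project LIMIT TO (my_table)
        FROM SERVER odps_server INTO public
        OPTIONS (if_table_exist 'update');
        ```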
  • The Open ORC file failed for schema mismatch. Reader schema: error is reported.
    • Cause:

      If the MaxCompute table is stored in the Optimized Row Columnar (ORC) format and the storage mode of its DECIMAL data is modified, Hologres cannot read the DECIMAL data. This usually happens when fields of the DECIMAL type are added to the table or the configurations of the table are changed in canary release mode.

    • Solution:
      • Execute the set MaxCompute.storage.orc.enable.binary.decimal=false; statement and then re-import the data into the MaxCompute table.
      • Convert the DECIMAL data in the MaxCompute table to the DOUBLE type, and then update the data.
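
        One way to perform the conversion is to write the data into a new MaxCompute table that uses DOUBLE. A sketch with placeholder table and column names, to be run in MaxCompute:

        ```sql
        -- my_table_new uses DOUBLE where my_table used DECIMAL.
        CREATE TABLE my_table_new (id BIGINT, amount DOUBLE);
        INSERT OVERWRITE TABLE my_table_new
        SELECT id, CAST(amount AS DOUBLE) FROM my_table;
        ```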
  • The failed to import foregin schema:Failed to get MaxCompute table:Not enable acid table error is reported.
    • Cause:

      The MaxCompute table is a transactional table.

    • Solution:

      Hologres does not allow you to read data from a MaxCompute transactional table. Change the table to a regular MaxCompute table.

  • The Request denied, may caused by server busy error is reported.
    • Cause:

      The resources allocated for the foreign table are fully occupied and the CPU utilization severely exceeds the expected value.

    • Solution:
      • Optimize SQL statements to reduce the amount of resources to use. For more information, see Optimize the performance of querying MaxCompute tables in Hologres.
      • Reduce the concurrency of data queries.
        1. Execute the show hg_foreign_table_executor_max_dop; statement to view the current concurrency configurations.
        2. Execute the following statement to reduce the concurrency by 50%.
          -- Syntax
          set hg_foreign_table_executor_max_dop = <Concurrency>; 
          -- Sample statement 
          set hg_foreign_table_executor_max_dop = 18;
          Concurrency: the number of queries performed by an execution node. Valid values: 0 to 1024. Default value: 256. If the concurrency is set too high, an OOM exception may occur, data imports or queries may fail, and the Hologres instance may even restart, which makes services unavailable. If the concurrency is set too low, the performance of foreign table queries or MaxCompute data imports is poor.
      • Import data from the MaxCompute table to a Hologres internal table. You can create indexes in the Hologres internal table, which improves data query performance. For more information, see Import data from MaxCompute to Hologres by executing SQL statements.
  • The Query executor exceeded total memory limitation xxxxx: yyyy bytes used error is reported.
    • Cause:

      The upper limit on the memory size is exceeded because the amount of imported data is too large or the logic of data import is too complex. A Hologres instance can contain multiple nodes. For each node, the upper limit on the memory size is 64 GB. The memory resources of each node are evenly allocated for data computing, data caching, and metadata storage. In this example, the memory resources for data computing are exhausted.

    • Solution:
      1. View the execution plan.

        You can execute the explain analyze sql; statement, in which sql is the statement to be executed, to view the number of rows involved in the execution plan. In some cases, the INSERT statement contains a SELECT statement, but the ANALYZE statement has not been executed for some of the involved tables. In other cases, the ANALYZE statement has been executed, but the data has been updated since, which makes the statistics inaccurate. In all these cases, the query optimizer makes a suboptimal join order decision, which causes excessive memory overheads.

        To prevent this, you can execute the analyze tablename; statement on all involved internal and foreign tables to update the statistical metadata of the tables. This helps the query optimizer generate better execution plans.
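
        A sketch of this step, with placeholder table names:

        ```sql
        -- Refresh optimizer statistics on every table involved in the import.
        analyze holo_table;
        analyze mc_foreign_table;

        -- Then inspect the plan and the actual row counts.
        explain analyze insert into holo_table select * from mc_foreign_table;
        ```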

      2. Limit the number of data entries imported at a time.
        If the MaxCompute table has too many columns, the amount of data in each row is large, and so is the amount of data imported at a time. This causes excessive memory overheads. To prevent this, you can add a statement for setting the hg_experimental_query_batch_size parameter in front of the INSERT statement.
        set hg_experimental_query_batch_size=1024; -- Default value: 8192.
        insert into holo_table select * from mc_table;
      3. Reduce the concurrency of data import.
        You can also reduce memory overheads during data import by reducing the data import concurrency. The concurrency is specified by the hg_foreign_table_executor_max_dop parameter. The default value of this parameter equals the number of CPU cores of the Hologres instance. You can set a smaller value to reduce memory overheads during data import.
        set hg_foreign_table_executor_max_dop = 8;
        insert into holo_table select * from mc_table;
      4. Check whether the foreign table has too much duplicate data.

        If you execute the insert on conflict statement to import data and the error persists after you try the preceding methods, check whether the foreign table has too much duplicate data. If so, data import performance is affected. We recommend that you remove the duplicate data from the MaxCompute table before you import data from the table. For more information, see Merge multiple rows of data into one row.
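
        One common way to remove duplicate rows in MaxCompute before the import is a window-function rewrite. A sketch that keeps only the latest row for each key; the table, the id key column, and the ds ordering column are placeholders:

        ```sql
        INSERT OVERWRITE TABLE my_table
        SELECT id, name, ds
        FROM (
            SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY ds DESC) AS rn
            FROM my_table
        ) t
        WHERE rn = 1;
        ```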

      5. Update the Hologres instance to dynamically adjust the memory size.

        Hologres V1.1.24 and later can dynamically adjust the memory size allocated to a task. It detects the overall memory usage in real time. When Hologres detects idle memory resources, it allocates the resources to the data import task. To use this feature, update the Hologres instance. For more information, see Instance updates.

      6. Scale up Hologres instances.

        If the error persists after you try the preceding methods, scale up the Hologres instance. For more information, see the Upgrade section of the Instances topic.

  • The Timestamp overflow detected while converting timestampfrom orc VectorBatch to arrow error is reported.
    • Cause:

      You have written data of the TIMESTAMP type to MaxCompute tables by using a streaming tunnel. Data written this way is stored with nanosecond precision, but Hologres does not support TIMESTAMP data in nanoseconds.

    • Solution:

      Convert data of the TIMESTAMP type in the MaxCompute table to data of the DATETIME type.
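
      Because the column type must change, one approach is to write the data into a new MaxCompute table that uses DATETIME. A sketch with placeholder table and column names:

      ```sql
      CREATE TABLE my_table_new (id BIGINT, event_time DATETIME);
      INSERT OVERWRITE TABLE my_table_new
      SELECT id, CAST(event_time AS DATETIME) FROM my_table;
      ```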

  • The query next from foreign table executor failed, userinfao fail error is reported.
    • Cause:

      Hologres V1.1 and earlier do not allow you to read encrypted data from MaxCompute tables.

    • Solution:

      Update the Hologres instance to a version later than V1.1. For more information, see Instance updates.

  • The You have NO privilege 'MaxCompute:Select' on xxx error is reported.
    • Cause:

      The current user is not granted the SELECT permission on the data in specific MaxCompute tables.

    • Solution:

      Contact the MaxCompute administrator to grant the current user the SELECT permission on the data in the MaxCompute tables. For more information, see Permissions.
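
      For reference, the grant is issued on the MaxCompute side. A sketch, with a placeholder table name and account:

      ```sql
      -- Run in the MaxCompute project by a user with administrative permissions.
      GRANT Select ON TABLE my_table TO USER ALIYUN$someone@example.com;
      ```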

  • The The sensitive label of column 'xxx' is 2, but your effective label is 0 error is reported.
    • Cause:

      The current user is granted permissions to query only some of the fields in the MaxCompute tables.

    • Solution:
      • Check whether the current account has the permissions to query the specified fields. If not, apply for the permissions on those fields in the MaxCompute tables, or query only the fields that you are permitted to access.
      • If the current account does have the permissions to query the specified fields, the error may be caused by a bug in Hologres instances of earlier versions. We recommend that you add the following statements before the query statements to fix the issue:
        set hg_experimental_enable_MaxCompute_executor=on;  
        set hg_experimental_enable_query_master=on;
        For more information about errors associated with MaxCompute permissions, see Permissions on MaxCompute.

What can I do if the query of data by using a foreign table is slow?

We recommend that you optimize SQL statements. For more information, see Optimize the performance of querying MaxCompute tables in Hologres.