
Hologres: Release notes

Last Updated: Oct 22, 2025

This topic describes the release information for Hologres features.

2025

Hologres V4.0 (September 2025)

Core feature enhancements

Description

References

Behavior changes

Enhanced AI and retrieval capabilities

  • (Beta) Work with unstructured data: Use LLM-powered AI functions to search and analyze unstructured data, including text and images. All models are fully hosted in Hologres AI nodes to ensure data security, performance, scalability, and compliance. AI functions enable the following use cases:

    • With embedding models and Object Tables, you can perform vector and full-text search on unstructured data.

    • Derive insights from text and images.

    • Filter and classify text using natural language, and translate and localize multilingual content.

    • Perform sentiment analysis and dimension-based analysis to enhance customer service.

    • Parse documents to support data analysis and retrieval-augmented generation (RAG) processes.

  • (Beta) Enhance vector search with HGraph: This technology improves performance by over 10 times. It supports hybrid search for scalar and vector data, ideal for applications such as image and video search, behavior-based recommendations, and security and fraud detection. It supports hybrid in-memory and on-disk indexes, cutting memory usage by 80%, with a mere 5% trade-off in QPS on the standard VectorDBBench. HGraph enables a cost-effective solution for use cases requiring efficient retrieval on massive vector data, such as autonomous driving.

  • (Beta) Full-text search support: Introduces inverted indexes and built-in tokenizers to enable full-text search. Use cases:

    • Keyword, phrase, and natural language search.

    • Supports the BM25 scoring algorithm for text similarity search.

    • Combines full-text and vector search.

    • Combines full-text and scalar data.

  • (Beta) Global secondary index: Supports efficient point queries on non-primary key columns. Ideal for feature stores and e-commerce platforms.

  • High-performance query channel: Hologres V4.0+ instances now default to using Common Table to query MaxCompute external tables. This new method provides a significant performance boost and supports MaxCompute Delta Table and Append 2.0 Table. See Access MaxCompute using the Common Table link.

  • Improved data freshness: Starting from Hologres V4.0, when you directly access Hologres from MaxCompute, the default data freshness is 1 minute.

  • Serverless compaction: The hg_serverless_computing_run_compaction_before_commit_bulk_load parameter is enabled by default to run compaction synchronously during serverless bulk loads, reducing its impact on instance resources. See Use Serverless Computing for concurrent compaction.

  • Flexible DLF table format: When you use CREATE EXTERNAL TABLE to create a table in Data Lake Formation (DLF), the default table format (ORC previously) is now dynamically determined by the Paimon SDK. See CREATE EXTERNAL TABLE.

  • PQE memory protection: Implements single-node memory protection for executing SQL statements using PQE. This mechanism throws OOM errors when queries may affect instance stability.

  • Adaptive execution for Dynamic Tables: In Hologres V4.0+, Dynamic Tables in full-refresh mode now use adaptive execution by default. This reduces peak resource consumption for large workloads and improves workload stability. It estimates resource consumption more accurately to save costs. Adaptive execution optimizes suboptimal query plans mid-execution, preventing failures caused by inaccurate statistics.
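The BM25 scoring mentioned in the full-text search item above ranks a document higher when it contains rare query terms often, while discounting long documents. The following Python sketch shows the textbook Okapi BM25 formula only. The whitespace tokenizer, the function name bm25_scores, and the k1 and b defaults are assumptions for illustration and say nothing about how Hologres tokenizes text or tunes its scoring.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook Okapi BM25.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Non-negative IDF variant: rare terms get a larger weight.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer documents need more hits to score the same.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A document that matches no query term scores 0, and a document that matches a rarer term scores higher than one that matches only a common term.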

Engine enhancements

  • Supports the TopN runtime filter to accelerate data queries in TopN scenarios.

  • (Beta) Supports Time Travel for internal tables. This feature lets you query historical data at any point within a defined time period.

  • (Beta) Supports History-Based Optimization (HBO). Hologres collects execution details from slow queries, automatically analyzes the query plan for tuning opportunities, and intelligently adjusts the query plan.

N/A

Dynamic Table

(Beta) Supports writing processed data back to Paimon in full or incremental mode. Dynamic Tables now support near real-time data processing for warehouse-to-warehouse, lake-to-warehouse, warehouse-to-lake, and lake-to-lake use cases. When combined with serverless instances, they enable ultra-low-cost data processing on data lakes.

Syntax

Supports the QUALIFY clause to filter the results of window functions.

QUALIFY (Beta)
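As a rough illustration of what filtering on a window function result means, the Python sketch below keeps the largest order per customer. It runs against SQLite (version 3.25 or later, for window functions), which has no QUALIFY clause, so it uses the equivalent subquery form. The QUALIFY statement in the comment follows the general shape used by engines that support the clause and is an assumption here; see the QUALIFY reference for the exact Hologres syntax. The table, columns, and function name are hypothetical.

```python
import sqlite3


def largest_order_per_customer(orders):
    """Return one (customer, amount) row per customer: the largest order."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
        # With a QUALIFY clause, the filter can sit next to the window function:
        #   SELECT customer, amount FROM orders
        #   QUALIFY ROW_NUMBER() OVER (PARTITION BY customer ORDER BY amount DESC) = 1;
        # Without QUALIFY, the window result must be computed in a subquery first.
        return conn.execute(
            """
            SELECT customer, amount
            FROM (
                SELECT customer, amount,
                       ROW_NUMBER() OVER (
                           PARTITION BY customer ORDER BY amount DESC
                       ) AS rn
                FROM orders
            ) AS ranked
            WHERE rn = 1
            ORDER BY customer
            """
        ).fetchall()
    finally:
        conn.close()
```

The point of QUALIFY is that the outer SELECT and the helper column rn become unnecessary.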

Function and ecosystem

Enhanced ClickHouse compatibility: Supports multiple time trunc functions, such as toDayOfMonth, toDayOfYear, and toHour. These functions can improve performance by 50% compared to the extract(field from timestamp) function.

Date and time functions

Serverless and elasticity

  • (Beta) Virtual warehouse instances deliver ultimate write isolation. Previously, batch writes relied on the leader virtual warehouse. Now you can use any virtual warehouse for this task type, without loading a table group.

  • (Beta) Virtual warehouse instances now feature seamless upgrades, allowing them to be updated with no impact on running SQL queries and providing automatic client reconnection.

Data lake analytics

(Beta) MaxCompute data mirroring: Mirrors data from MaxCompute to Hologres with zero ETL, greatly improving query efficiency. The performance is similar to querying Hologres internal tables.

N/A

Hologres V3.2 (July 2025)

Core feature enhancements

Core feature description

References

Behavior changes

Engine enhancements

  • Supports adaptive CTE expression reuse or inlining.

  • Optimizes BETWEEN expression calculations.

CTE materialization strategy

  • Starting from Hologres V3.2, adaptive optimization for CTEs is supported. The CTE materialization strategy is determined by two GUC parameters: optimizer_cte_inlining and hg_cte_strategy. These parameters are independent of each other and can be set separately, regardless of the setting order. The behaviors are as follows:

    • When optimizer_cte_inlining = off, CTEs are reused, regardless of the hg_cte_strategy setting.

    • When optimizer_cte_inlining = on, CTEs are no longer forcibly inlined. The CTE policy is determined by hg_cte_strategy:

      • If hg_cte_strategy = auto (default), the query optimizer (QO) automatically selects whether to inline or reuse the CTE based on factors such as its complexity.

      • When hg_cte_strategy = inlining, CTEs are forcibly inlined.

      • When hg_cte_strategy = reuse, CTEs are forcibly reused.

  • Tag and Branch become reserved words.

    Databases, tables, and columns do not support the use of reserved words. To use these words as column names in SQL query expressions, you must enclose them in double quotation marks ("").
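The interaction between optimizer_cte_inlining and hg_cte_strategy described above can be summarized as a small decision function. This is an illustrative Python sketch only: the function name resolve_cte_behavior and the returned labels are hypothetical, and the actual decision is made inside the Hologres query optimizer.

```python
def resolve_cte_behavior(optimizer_cte_inlining: str, hg_cte_strategy: str = "auto") -> str:
    """Map the two GUC settings to the CTE handling described for V3.2.

    Returns "reuse", "inline", or "optimizer decides" (hypothetical labels).
    """
    if optimizer_cte_inlining not in ("on", "off"):
        raise ValueError("optimizer_cte_inlining must be 'on' or 'off'")
    if hg_cte_strategy not in ("auto", "inlining", "reuse"):
        raise ValueError("hg_cte_strategy must be 'auto', 'inlining', or 'reuse'")

    # off: CTEs are reused, regardless of hg_cte_strategy.
    if optimizer_cte_inlining == "off":
        return "reuse"
    # on: hg_cte_strategy decides; auto leaves the choice to the optimizer.
    if hg_cte_strategy == "auto":
        return "optimizer decides"
    return "inline" if hg_cte_strategy == "inlining" else "reuse"
```

Because the function depends only on the two values, the order in which the parameters are set does not matter, which matches the description above.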

Dynamic Table

  • The incremental refresh mode now supports functions such as ARRAY_AGG and STRING_AGG.

  • DataWorks Data Map now supports lineage analysis on Dynamic Tables.

Enhanced serving capabilities

The Fixed Plan now supports simple expressions, which enables high-QPS point queries and writes in more scenarios.

Accelerate the execution of SQL statements by using fixed plans

Function and ecosystem extensions

Supports Lambda expressions and higher-order array functions that use them.

LAMBDA expressions and related functions

Serverless capabilities

  • Serverless Computing now supports reading data from and writing data to encrypted tables, including internal tables and MaxCompute external/foreign tables.

  • Enhanced Query Queue capabilities: Automatically routes SQL requests for certain tables to Serverless Computing.

Data lake analytics capabilities

  • Supports Paimon lake table mirroring to accelerate data lake queries.

  • Integrates with DLF 2.5 for metadata management, which supports access to Apache Paimon catalogs in DLF using DLF REST APIs.

  • Supports Apache Paimon catalogs mirroring, which replicates data lake data to mirrored internal tables through zero ETL and greatly improves query efficiency.

  • Provides a Time Travel query feature for Paimon, which lets you read historical data by specifying a timestamp or tag.

  • Supports reading data from a specific or fallback branch of Paimon tables.

  • Supports disabling full table scans on partitioned tables to prevent resource overconsumption.

  • Enhanced performance: The TPC-H benchmark test shows that querying Paimon tables on a 1 TB dataset achieves a 2× faster execution speed.

Enhanced ecosystem capabilities

Supports trimming and compressing binary logs, which reduces I/O usage during binary log consumption.

N/A

Hologres serverless instances launched (July 2025)

Core feature enhancements

Core feature description

References

Launch of Hologres serverless instances (Beta)

Hologres serverless instances (beta) are available for invitational preview free of charge. Hologres serverless instances are a new instance type developed by Hologres based on the cloud-native serverless architecture. You can enjoy flexible, scalable, and easy-to-use Hologres computing and storage services without the need to purchase exclusive computing resources or bear idle holding costs.

You can use your Alibaba Cloud account to fill out the form and request a trial.

Serverless instance (beta)

Hologres V3.1 (April 2025)

Core feature enhancements

Core feature description

References

Behavior changes

Dynamic Table

  • Supports dynamic partitioning of logical partitioned tables, which significantly simplifies the usage of partitioned tables.

  • Added the auto-refresh mode. In this mode, you only need to specify data freshness, and the engine automatically optimizes the refresh strategy. This enhances the flexibility of data refresh.

  • Incremental refresh supports joins on two data streams, which improves the flexibility of real-time data processing.

  • Incremental refresh supports RoaringBitmap functions, which are used for incremental calculations in complex scenarios such as unique visitor (UV) and page view (PV).

  • Full refresh mode supports Adaptive Execution (Beta), which enables the engine to self-adapt internally. While maintaining low latency, it significantly improves execution stability by reducing the probability of OOM errors and enhances usability with features such as dynamic estimation of computing resources and execution plan adjustments.

  • Query Queue has completed the Beta phase and is now generally available. For more information, see Query queue.

  • The TRUNCATE statement has been changed from a DDL to a DML statement, which reduces the load on FE nodes from TRUNCATE operations. For more information, see TRUNCATE.

  • When you use the COPY command to import data into a table that has a primary key, the hg_experimental_copy_enable_on_conflict GUC parameter is enabled by default, which lets you set a data update policy. For more information, see COPY.

  • When you use the hg_insert_overwrite feature, the number of columns and the data types specified in the standard SELECT statement must strictly match the columns of the target_table (a Hologres internal table). Otherwise, an error is reported, such as error: table "hg_alias" has x columns available but x columns specified or error: column xx is of type xxx but expression is of type xxx. For more information, see INSERT OVERWRITE.

  • The hg_table_storage_status system table is updated. Starting from V3.1, it no longer calculates the data volume of an in-memory table (Mem Table) and only calculates the actual storage of the table. For more information about the principles of memory tables, see INSERT.

Serverless capabilities

  • Optimized Serverless Computing to support complex DML scenarios such as INSERT OVERWRITE, resharding, and CREATE TABLE AS, along with stored procedures, Rebuild, and encrypted tables.

  • Achieves lossless scaling of virtual warehouses (Beta), which ensures business continuity during virtual warehouse scaling.

  • Supports automatic throttling (Beta). This feature dynamically limits the concurrency of query queues based on workload, which significantly improves cluster stability.

  • Added the adaptive routing capability. Large queries are automatically executed using serverless resources.

  • Supports configuring an upper limit for the daily usage of serverless computing resources.

  • Optimized the cache reuse capability of Serverless Computing in high-concurrency scenarios, which enhances query performance.

  • Supports using RAM roles for scaling virtual warehouses.

Performance optimization and enhanced query capabilities

  • Restructured the query engine, introduced QEv2, and added support for computation on light-weight encoding. The TPC-H 1 TB benchmark test demonstrates a 33% performance improvement.

  • Supports engine adaptive optimization and automatic pushdown of aggregation plans based on a cost model. This reduces the data involved in JOIN operations and significantly lowers latency and computation overhead.

  • The engine automatically deduces the NOT NULL attribute of JOIN fields and pushes down NOT NULL conditions to filter out NULL values in advance. It also automatically eliminates constant fields in GROUP BY clauses within aggregation operations.

  • The query cache feature is added to accelerate specific query results through caching.

  • Enhanced the diagnostic capabilities of the hg_stats_missing view for statistics information by adding new fields, such as autovacuum_enabled (whether AUTO ANALYZE is enabled) and reason (cause of missing statistics information), making it easier to diagnose and correct statistics information.

  • AUTO ANALYZE optimization: Enhanced statistics collection by automatically retrieving table row counts for missing statistics, improving query plan quality. Statistics now offer better persistence and interference resistance, reducing unnecessary clears from schema changes (e.g., RENAME, cold storage transition). This lowers system load and enhances execution plan quality.

Data management and write optimization

  • Storage and indexing optimization

    • Supports logical partitioned tables (Beta), which enhances the flexibility of partitioned table usage, while simplifying metadata and data management.

    • Supports stored generated columns (Beta), which simplifies data processing and accelerates queries through pre-computation.

    • Introduced the Rebuild tool (Beta), supporting lightweight indexes (distribution key, clustering key, and segment key) and other table structure modifications.

  • Enhanced write capabilities

    • Primary key tables support partial column updates in COPY operations, which reduces the need for FIXED COPY scenarios. If a fixed frontend (FE) node is used, it does not consume the original FE connection count.

    • Native support for the INSERT OVERWRITE syntax provides more flexibility for performing INSERT OVERWRITE operations on regular tables and logical partitioned tables.

Function and ecosystem extensions

  • New built-in functions

    • Added a property association funnel function and a dimension grouping funnel function.

    • Extended functions compatible with Spark and Presto to improve cross-engine development efficiency.

    • Roaring bitmap functions partially support 64-bit, which extends the scope of user persona analysis scenarios.

  • Remote function support

    • Supports calling remote user-defined functions (UDFs) using Function Compute (FC), enabling flexible expansion of extract, transform, and load (ETL) capabilities.

Enterprise-level feature upgrade

  • Enhanced enterprise-grade permission management. You can specify security tokens in the connection options of the PostgreSQL protocol, and RAM role logon using JDBC or PSQL is supported.

  • Supports a table recycle bin, which enables the recovery of accidentally deleted tables and their data from the recycle bin.

  • Optimized data masking to mask computation results and non-TEXT field types. This significantly enhances the protection of sensitive data and prevents brute-force attacks on sensitive information.

Data lake analytics capabilities

  • Support for external data sources

    • Enhanced external database integration, which enables seamless access from mainstream BI tools such as Quick BI, Tableau, and Superset.

    • Supports specifying metadata refresh intervals for external databases.

    • Supports ANALYZE and AUTO ANALYZE for external databases.

    • Supports using INSERT INTO to write data to Paimon primary key tables, which facilitates flexible lakehouse data flow.

    • Supports using INSERT INTO to write data to Iceberg tables, which ensures compatibility with more open data lake formats.

  • MaxCompute transparent acceleration

    • Upgraded remote querying of MaxCompute data to 2.0 (Beta). The underlying mechanism is rebuilt using MaxCompute C++ Native SDK, which further enhances the performance and experience of Hologres accessing MaxCompute data sources.

    • Supports direct reads of MaxCompute Delta Tables (Beta).

    • Supports direct reads of dynamically masked MaxCompute data, offering an integrated, desensitized data masking experience (Beta).

    • Supports direct reads of MaxCompute tables that have undergone schema changes. The supported operations include adding columns, deleting columns, modifying column types, and adjusting the column order.

    • Supports one-click mapping of MaxCompute projects, schemas, and tables to Hologres with DataWorks Data Development (new version).

    • Supports one-click import of MaxCompute table data into Hologres with DataWorks Data Development (new version).

2024

Hologres V3.0 (September 2024)

Core feature enhancements

Core feature description

References

Behavior changes

Engine enhancements

  • Introduces Dynamic Table, which supports full and incremental refresh modes. Data is automatically streamed and refreshed to meet demanding requirements for real-time data warehouse layering and unified batch-stream processing, accommodating various data analysis timeliness needs.

  • The Serverless Computing feature is optimized to support the SELECT and COPY operations. This provides a cloud-native resource usage solution for ad hoc large-scale queries.

  • Virtual warehouses support scheduled scaling (Beta). This feature provides elastic computing resources as scheduled to meet varying resource requirements across different time periods. This helps prevent mutual interference and increase resource utilization.

  • Query queues are supported.

    • Create query queues based on business requirements and configure the concurrency and length for query queues. This helps improve instance stability.

    • The capability of governing large-scale queries is improved. You can configure a queuing timeout period for large-scale queries to reduce negative impacts on instances. Serverless computing resources can re-run large-scale queries.

    • The Serverless Computing feature is supported to execute all queries in a query queue.

  • The performance of data writes, data updates, and point queries based on fixed plans is improved by approximately 10% compared with Hologres V2.2.

  • The INSERT OVERWRITE statement can be executed on partitioned parent tables.

  • The stored procedure feature (Beta) is supported to define common SQL statements and simplify business complexity.

  • The schema evolution capability is enhanced to allow you to modify the data types of columns.

  • The COPY capability is enhanced. A full-row update policy can be configured to update data records, instead of reporting an error, in the event of a primary key conflict when importing data into tables with primary keys.

  • The query engines are enhanced to support cross joins. This helps improve the performance of queries based on non-equivalent joins. Partial aggregation is supported. If a GROUP BY operation is performed based on multiple fields, you can use partial aggregation to limit the memory usage and reduce the probability of OOM errors.

  • The storage engine is enhanced to improve the update performance of column-oriented tables when columns in which data is unordered are configured as segment keys.

  • Function capabilities are enhanced in the following aspects:

    • The TRY_CAST function is enhanced to support data conversions to the DATE, TIMESTAMP, and TIMESTAMPTZ types.

    • The ARRAY_AGG and STRING_AGG functions that contain the DISTINCT and ORDER BY clauses are supported by Hologres Query Engine (HQE). This helps improve query performance.

  • The public preview of the SQL Hint feature is complete, and the feature can be used in production environments. By default, this feature is enabled.

  • In metadata warehouses, two records are generated for a COPY operation instead of one record. For more information, see COPY.

  • In Hologres V3.0.10 and later, the maximum number of compute units (CUs) for each virtual warehouse in a virtual warehouse instance increases from 512 to 1024.

O&M and stability improvements

  • The SQL Audit feature is provided by Hologres based on Simple Log Service. This feature is used to monitor, record, and analyze database operations to ensure data security and compliance with relevant policies.

  • The scale-out capability of virtual warehouses is enhanced. During scale-out, data reads and writes are not interrupted.

  • Statistics about DML and DQL statements that consume less than 100 ms are aggregated in the Query Log system table to improve SQL statement observation and analysis capabilities.

Data lakehouse

  • The external database feature is added to support catalog-level metadata mapping for DLF and MaxCompute tables. This helps improve the metadata and data management capabilities of data lakes.

  • The Hive Metastore Service (HMS) can be integrated with Hologres to support metadata mapping. This feature helps accelerate data queries in EMR clusters.

  • The INSERT INTO statement can be executed to write data to Apache Paimon append-only tables.

  • Data can be read from Iceberg-based data lakes. This helps further expand the data lake ecosystem.

  • Security capabilities are enhanced. By default, the service-linked role is used to access DLF2.0. You can also use a RAM role to access DLF2.0.

  • Table capabilities are enhanced.

    • Delta Lake readers are reconstructed to significantly improve the read performance.

    • Paimon deletion vectors can be optimized to improve the query performance when a large amount of data is deleted but the compaction is not performed promptly.

  • Access to MaxCompute Delta tables from Hologres is supported in Hologres V3.0.22 and later.

Serverless Computing feature is commercially available (July 2024)

Core feature enhancements

Core feature description

References

Behavior changes

The Beta phase of the Hologres Serverless Computing feature is complete

The Beta phase of the Hologres Serverless Computing feature is complete. The feature is available for production use and is backed by an SLA. It was officially commercialized at 00:00 on July 1, 2024 (UTC+8).

The Beta phase of the Hologres Serverless Computing feature is complete and the feature is available for production use.

Hologres V2.2 (April 2024)

Core feature enhancements

Core feature description

References

Behavior changes

Engine enhancements

  • The underlying capabilities of the engine, including Hologres Query Engine (HQE) and Query Optimizer (QO), are continuously optimized, and the overall performance is improved by about 15% compared with the previous version.

    • The capabilities of HQE are optimized to improve performance from the following aspects:

      • Capabilities of runtime filters are enhanced to support shuffle joins. This helps increase the query efficiency by about 30% in scenarios in which runtime filters are used.

      • The remote procedure call (RPC) mechanism of HQE is optimized. Data is merged among workers and then distributed to other workers. This significantly reduces network overheads and improves query performance by 8% in scenarios in which data is shuffled.

    • The performance of QO is optimized to increase the efficiency of processing SQL statements in the plan stage by 40% from the following aspects:

      • The memory allocation mechanism and join algorithm are optimized to improve query performance in multi-join scenarios.

      • The DATE_PART function is optimized to improve the efficiency of querying time-related fields, such as the year field.

      • The comparison between fields of the DATE and TIMESTAMP types is optimized to improve the efficiency of querying time-related fields.

      • Computing of complex functions with filter clauses is optimized. After the optimization, the order of filter operations is adjusted to reduce the amount of data to be processed and improve query efficiency.

  • The Serverless Computing feature is provided. This feature lets you run specified data import or ETL tasks in a shared serverless resource pool. This prevents resource contention and mutual interference among tasks in an instance and improves instance stability. This feature is supported in specific regions.

  • The dynamic partitioning feature is optimized to allow you to customize the time for creating or deleting partitions and the time for cold data migration. This makes the dynamic partitioning feature easier to use.

  • The SQL Hint syntax is supported. Hints can be used to change the execution mode of SQL statements. This lets you optimize the execution of SQL statements in a fine-grained manner.

  • The hg_stat_activity view is optimized to provide more accurate metrics about CPU and memory resources. You can also query this view to obtain the progress of data imports from MaxCompute to Hologres. This improves the observability of active queries.

  • Path analysis functions are added to allow you to analyze the traffic and time consumption of each event in a path. This helps you analyze product operation strategies and optimize product design ideas.

  • Function capabilities are enhanced in the following aspects:

    • The try_cast function is supported. For abnormal data, this function returns NULL rather than an error message. This reduces costs for processing abnormal data.

    • The date and time functions dateadd, datediff, and last_day are supported.

    • Multiple general-purpose aggregate functions can be run on HQE to improve query performance.

  • In Hologres V2.2 and later, the value of Engine Type for processes that use fixed plans is changed from SDK to FixedQE in slow query logs. This ensures name consistency in slow query logs and metrics.

  • In Hologres V2.2 and later, the number of connections to an FE node is increased from 128 to 256. The total number of connections is doubled. For more information, see Instance Management.

  • INSERT OVERWRITE and BSI functions are now generally available.

  • In Hologres V2.2 and later, the SELECT hg_dump_script() statement returns table creation properties with the WITH syntax instead of the CALL syntax. This change improves the convenience and readability of table creation. For more information, see View a table schema.
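The behavior of the try_cast function listed above, which returns NULL for abnormal data rather than an error, can be pictured with a small Python analogue. This is a sketch of the semantics only: the helper name try_cast_date is hypothetical, Python's None stands in for SQL NULL, and the accepted input format (ISO dates) is an assumption rather than a statement about which formats Hologres accepts.

```python
from datetime import date
from typing import Optional


def try_cast_date(value: Optional[str]) -> Optional[date]:
    """Convert an ISO-formatted string to a date, or return None on bad input."""
    try:
        return date.fromisoformat(value)
    except (TypeError, ValueError):
        # A strict cast would raise here; the "try" variant yields NULL instead.
        return None
```

With this shape, one malformed value in a batch produces a NULL for that row instead of failing the whole statement.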

O&M and stability improvements

  • SQL fingerprints can be collected and recorded in slow query logs. You can perform clustering analysis on SQL fingerprints to improve problem locating and exception monitoring capabilities.

  • Metrics related to QE, FixedQE, and binary logs are exposed to improve the observability and O&M capabilities of your business.

  • The Query Insight feature of HoloWeb is supported to allow you to obtain query execution information, table metadata information, and lock troubleshooting information with a few clicks. This improves the troubleshooting efficiency.

  • Cross-AZ disaster recovery is supported to improve the disaster recovery capabilities of instances. This feature is supported in specific regions.

  • Engine error codes and error messages are optimized to improve the efficiency of analyzing slow query logs.

    • The logic for calculating the duration of DDL statements is optimized to improve the collection accuracy of DDL execution durations.

    • Results of the EXPLAIN ANALYZE statement are recorded in slow query logs for you to view the runtime data of each operator.

  • The underlying mechanism of version upgrades is optimized. Physical restoration is used to significantly shorten the upgrade duration when a large amount of metadata exists and reduce the negative impact of upgrades on your business.

  • Table locks for FE nodes are upgraded to short locks to resolve issues such as DDL statement execution failures and metadata inconsistency of FE nodes. This improves the stability and consistency of metadata on FE nodes.

  • The OpenAPI capabilities are upgraded. API operations for data lake acceleration and resource groups are added to enhance instance operations and management capabilities.
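To show why SQL fingerprints are useful for the clustering analysis mentioned above, the Python sketch below groups statements that differ only in literal values. The normalization rules and the use of SHA-256 are assumptions made for illustration. The source does not describe how Hologres computes its fingerprints, and the function name sql_fingerprint is hypothetical.

```python
import hashlib
import re


def sql_fingerprint(sql: str) -> str:
    """Return a hash that is identical for statements differing only in literals."""
    # Replace single-quoted string literals (with '' escapes) by a placeholder.
    normalized = re.sub(r"'(?:[^']|'')*'", "?", sql)
    # Replace standalone numeric literals; digits inside identifiers such as t1 are kept.
    normalized = re.sub(r"\b\d+(?:\.\d+)?\b", "?", normalized)
    # Collapse whitespace and ignore letter case.
    normalized = re.sub(r"\s+", " ", normalized).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Slow query log entries that share a fingerprint can then be counted or aggregated as one query pattern.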

Ecosystem extension

  • The Auto Load feature for foreign tables supports the MaxCompute three-layer model, and you can use the hg_experimental_auto_load_foreign_schema_mapping parameter to specify schema mappings. This feature also supports Schema Evolution for MaxCompute foreign tables, such as adding columns, deleting columns, modifying column names, and changing the column order.

  • The Auto Load feature is optimized to support automatic creation of foreign tables based on Data Lake Formation (DLF) metadata. This helps accelerate queries from tables in Object Storage Service (OSS).

  • The data lake architecture is upgraded. Foreign tables in the ORC and Parquet formats support multi-level caching using built-in high-speed disks and memory, and predicate pushdown filtering. This greatly improves the read performance.

  • The service-linked role of Hologres can be used to access MaxCompute foreign tables. This helps better configure permissions on Alibaba Cloud services and prevent risks caused by misoperations. You can create the service-linked role and grant permissions to the role in the Hologres console with a few clicks.

  • Data in OSS buckets and tables in specified schemas of MaxCompute three-layer models can be accessed in the HoloWeb console.

2023

Hologres V2.1 (October 2023)

Core feature enhancements

Core feature description

References

Behavior changes

Engine enhancements

  • The performance of executing one or more COUNT DISTINCT functions is automatically optimized at runtime, which significantly improves query efficiency.

  • The Row Group Filter mechanism is added to the query optimizer. For column-oriented tables, rows in a column form a row group, and the maximum and minimum values in each row group are recorded. When you query data in a column, the system uses these recorded values to filter out row groups that cannot contain matching data, without reading the data in those row groups. This significantly decreases the query overhead and improves the query efficiency.

  • The Runtime Filter capability is optimized to support multi-column joins. This significantly improves join efficiency.

  • Full compaction can be manually triggered to merge small files. This improves the query efficiency.

  • The range-based funnel analysis functions are added to allow you to analyze and compare user activity conversions.

  • The Bit-Sliced Index (BSI) extension library is added to optimize the performance and usability of queries in high-cardinality tag scenarios and join queries that involve both user attribute tags and behavior tags.

  • Data can be sorted in descending order based on clustering keys. This improves query performance in sorting scenarios.

  • The caching mechanism for Infrequent Access storage is optimized to improve query performance.

  • The CREATE TABLE WITH and ALTER TABLE SET statements are added to replace the original set_table_property syntax. This simplifies the process of configuring table properties.

  • The capability of writing data to tables without primary keys is optimized. Batch writes to these tables acquire row locks instead of table locks and can be performed concurrently with Fixed Plan.

  • The Proxima-based vector processing feature is optimized. It lets you create a table, import vector data into the table, and then create vector indexes. This helps shorten the index creation time and simplify vector processing.

  • Function capabilities are enhanced in the following aspects:

    • Some array functions can be run on HQE to improve function performance.

    • The KeyValue function is added to split strings.

    • The IF function is added to simplify type detection scenarios and reduce MySQL migration costs.
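
The Row Group Filter item in this list can be illustrated with a minimal sketch. It assumes fixed-size row groups and an equality filter; the names `RowGroup`, `build_row_groups`, and `scan_equal` are hypothetical and do not correspond to any Hologres API. The sketch only shows the general min/max pruning idea, not how the Hologres engine implements it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """A block of values from one column plus its recorded min/max statistics."""
    values: List[int]
    min_value: int
    max_value: int


def build_row_groups(column: List[int], group_size: int) -> List[RowGroup]:
    """Split a column into fixed-size row groups and record min/max for each."""
    groups = []
    for start in range(0, len(column), group_size):
        chunk = column[start:start + group_size]
        groups.append(RowGroup(chunk, min(chunk), max(chunk)))
    return groups


def scan_equal(groups: List[RowGroup], target: int) -> Tuple[List[int], int]:
    """Return matching values and the number of row groups actually read."""
    matches, groups_read = [], 0
    for group in groups:
        # Skip the group when its statistics prove that no row can match.
        if target < group.min_value or target > group.max_value:
            continue
        groups_read += 1
        matches.extend(v for v in group.values if v == target)
    return matches, groups_read


column = list(range(1, 13))  # values 1..12, already sorted
groups = build_row_groups(column, group_size=4)
matches, groups_read = scan_equal(groups, target=6)
print(matches, groups_read)  # [6] 1
```

Pruning of this kind is most effective when the filtered column is sorted or clustered, because the value ranges of the row groups then overlap very little.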

O&M and stability improvements

  • The slow query capability is enhanced to improve the efficiency of analyzing slow queries.

    • Results of the EXPLAIN ANALYZE statement are recorded in slow query logs for you to view the execution data of each operator.

    • The fixed plan-based diagnostics capability is enhanced. Data for affected_rows in data write scenarios and for result_rows and result_bytes in query scenarios is reported to the metadata warehouse.

  • The hg_relation_size function is added to query the storage size details of a table.

  • Load balancing is supported in a way that is compatible with native PostgreSQL behavior. When primary and secondary instances are configured, load balancing and automatic instance failover improve service availability.

  • The OpenAPI capabilities are upgraded. API operations for creating, renewing, upgrading or downgrading, and releasing instances are added to improve instance operations and management capabilities.

Ecosystem extension

Data lake acceleration supports data stored in Paimon format.

OSS Data Lake Acceleration

Hologres V2.0 (April 2023)

Core feature enhancements

Core feature description

References

Behavior changes

Engine enhancements

  • The Runtime Filter feature is added to optimize the filter operation in join processes. This reduces the amount of data to be scanned, lowers I/O overhead, and improves query performance by more than 20% in typical multi-table join scenarios.

  • The Lazy Create Fragment Instance mechanism is added to the Hologres query engine. This mechanism reduces query overhead and significantly improves query performance when a query reads a large table and limits the number of returned rows. Data preview is a typical example.

  • The display format of execution plans for EXPLAIN and EXPLAIN ANALYZE is fully optimized. This improves readability and simplifies SQL performance optimization.

  • The distributed transaction capability is optimized. Multiple DML statements can be executed in one transaction.

  • Supports dropping columns.

  • The CREATE TABLE AS syntax is supported to simplify the iterative optimization of table schemas.

  • Streaming COPY is supported. You do not need to batch data, which results in higher write throughput.

  • You can create a bitmap index for columns that contain JSONB-formatted data in column-oriented storage mode to accelerate point queries.

  • A column of the DATE data type can be configured as the primary key and the partition key of a partitioned table. Partition pruning is optimized to support partition key columns in an IN Array clause even when the number of values exceeds the threshold (default 100).

  • More internal engine optimizations:

    • Storage engine optimization for the Tablet Lazy Open mechanism (supported by both primary and secondary instances): Memory held by tables that have not been accessed for more than 24 hours is automatically released. When the number of open tables exceeds a threshold, the system dynamically selects tablets to close based on the least recently used (LRU) policy. This reduces the resident memory overhead in scenarios with numerous open tables.

    • Storage engine optimization for the schema storage management mechanism: A meta tablet is used for storage management. This helps reduce the schema resident memory overhead and resource overhead in scenarios in which multiple tables and shards exist.

    • Storage engine optimization for the quick recovery capability: Quick recovery in repair mode can be enabled if data in specific tables cannot be recovered using routine data recovery methods. By default, metadata management supports logical recovery, which significantly shortens the recovery time when data in many partitions needs to be recovered. In scenarios that involve tens of thousands of partitions, logical recovery is more than five times faster.

  • Function capabilities are enhanced in the following aspects:

    • More functions are supported on HQE to improve function performance.

      • The Table Function support framework is refactored to enable HQE to support generate_series (INT, BIGINT, NUMERIC).

      • The PQE function support framework is refactored to enable HQE to support functions such as left, right, text::timestamp, and timestamp::text.

    • New array functions are added, including array_max, array_min, array_contains, array_except, array_distinct, and array_union.

    • The max_by and min_by aggregate functions are added to simplify window sorting operations.
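
To illustrate what the max_by and min_by aggregates compute, the following sketch models them in plain Python. It assumes the common semantics of such aggregates: return the x value from the row whose y value is the largest or smallest. The helper functions and the sample data are hypothetical, and the sketch does not describe NULL handling or tie-breaking in Hologres.

```python
from typing import Iterable, Optional, Tuple, TypeVar

X = TypeVar("X")


def max_by(rows: Iterable[Tuple[X, float]]) -> Optional[X]:
    """Return the x paired with the largest y, or None for empty input."""
    best = max(rows, key=lambda row: row[1], default=None)
    return None if best is None else best[0]


def min_by(rows: Iterable[Tuple[X, float]]) -> Optional[X]:
    """Return the x paired with the smallest y, or None for empty input."""
    best = min(rows, key=lambda row: row[1], default=None)
    return None if best is None else best[0]


# Hypothetical (customer, order_amount) rows.
orders = [("alice", 30.0), ("bob", 75.5), ("carol", 12.0)]
print(max_by(orders), min_by(orders))  # bob carol
```

Without such aggregates, the same result typically requires ranking the rows with a window function and keeping only the first row, which is the window sorting pattern that the release note refers to.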

  • The column store no longer supports the Segment storage format. Therefore, instances that use the Segment format cannot be upgraded to V2.0 or later. You can use the hg_convert_segment_orc tool function to perform batch format conversion. For more information, see Change the data storage format of column-oriented tables.

  • To prevent resource waste from the misuse of Table Groups, a limit is imposed on the maximum total shard count for a single Table Group and at the instance level starting from V2.0. For more information, see Table Group and Shard Count Operation Guide.

  • DataHub writes no longer support the SDK (legacy) mode and have fully switched to the JDBC mode. The new mode is more stable and supports more data types.

  • By default, the binary log extension is configured. When you consume binary log data in JDBC mode, you do not need to manually create the binary log extension. The default quota for WAL Senders is increased tenfold, from 200 to 2,000 slots per 32C. Binary log consumption in JDBC mode has completed its Beta phase and is available for production use. For more information, see Consume Hologres binary logs using JDBC.

  • Backup and recovery (Local backup and recovery) and tiered storage (Data tiered storage) have completed the Beta phase and are now production-ready.

  • For more information, see Default behavior changes.

O&M and stability improvements

  • Based on pg_stat_activity, hg_stat_activity is introduced. It is compatible with the original usage and provides more detailed runtime diagnostic information, including the execution stage, execution engine type, resource usage, and runtime locks.

  • The shard-level replica capability is improved to support high availability, load balancing, and high throughput for a single instance, addressing issues like machine faults and unbalanced hot spots.

  • The Auto Analyze capability is refactored to enable distributed Auto Analyze. Foreign tables, tables of lakehouse acceleration clusters, and incremental data of partitioned tables can now be analyzed automatically. This resolves issues such as analysis failures on ultra-large tables or ultra-wide columns. The number of tables that lack statistics is significantly reduced, so execution plans are more stable, fewer resources are consumed, and performance is more consistent.

  • The storage encryption configuration is optimized to support flexible single-table encryption configurations.

  • The data lineage mechanism is optimized to allow you to use DataWorks to perform cross-engine lineage analysis on data in MaxCompute and Hologres. You can use expressions such as CTEs to parse data lineage.

Ecosystem extension

  • The query acceleration engine for MaxCompute foreign tables is upgraded to improve compatibility and stability.

  • In lakehouse acceleration scenarios, with integrated DLF metadata management, you can use DLF data catalogs (Multi-Catalog) to isolate metadata, for example between test environments, development environments, and cross-department clusters.

  • In lakehouse acceleration scenarios, you can accelerate access to data stored in OSS-HDFS (also known as JindoFS). This helps better meet the requirements for data lake computing in domains such as the big data Hadoop ecosystem and AI.

  • ClickHouse-compatible functions are added to simplify data and job migration scenarios.

2022

Hologres V1.3 (July 2022)

Core feature enhancements

Core feature description

References

Behavior changes

Engine enhancements

  • Supports real-time materialized views to improve query efficiency in real-time aggregation scenarios (Beta).

  • JSONB storage optimization: Column-oriented storage of JSONB data significantly improves the efficiency of statistical queries and data compression.

  • Supports dynamic partition management for partitioned tables, including automatic creation and deletion of partition sub-tables.

  • Added the UNIQ function for exact deduplication. It significantly improves deduplication efficiency, optimizes queries that contain multiple COUNT DISTINCT functions, and reduces memory consumption.

  • Engine optimization.

    • Supports writing directly to partitioned parent tables using INSERT statements that qualify for Fixed Plan.

    • Supports filtering with aggregation expressions, such as string_agg() and array_agg().

    • Supports RowType, along with functions such as row() and row_to_json().

    • Supports modifying the schema of a table.

    • Supports the CTE Reuse operator to improve the performance of WITH expressions.

  • Supports reading MaxCompute three-layer models (project.schema.table).

  • Supports reading and writing MaxCompute Transactional tables, reading MaxCompute Schema Evolution tables (tables on which operations like deleting columns, modifying column order, or modifying column types have been performed in MaxCompute), and writing back Array and Date types.

Default behavior change notes

O&M and stability improvements

  • Supports self-service configuration of shared storage secondary instances, which optimizes elasticity and high availability.

  • The table_info table is added to the metadata warehouse. This improves data governance capabilities.

  • Continuous memory optimization to reduce metadata memory footprint.

  • Supports automatic periodic and manual backups to restore historical data in scenarios such as data misoperation.

Ecosystem extension

  • Production-grade support for PostGIS extensions is provided.

  • The Oracle extension package is supported, which adds many compatible functions.

  • Supports reading Hudi and Delta format foreign tables via DLF, and writing CSV, Parquet, SequenceFile, and ORC format data to OSS foreign tables via DLF.

  • Enhanced BI compatibility, achieving a 99%+ pass rate in the Tableau compatibility test (TDVT).

2021

Hologres V1.1 (October 2021)

Core feature enhancements

Core feature description

References

Behavior changes

O&M improvements

  • The resource group isolation feature (Beta) is now available. You can create multiple resource groups to implement thread-level workload isolation for the computing resources of different users within an instance. This provides better support for multi-user and multi-scenario usage.

  • You can perform online hot upgrades for Hologres instances. Data reads and queries are not affected during an upgrade. To request a hot upgrade, you can join the Hologres DingTalk group.

  • The Auto Analyze capability is enabled by default in Hologres V1.1.

  • The new Hologres engine for directly reading MaxCompute data is enabled by default in V1.1.

  • The Resharding function has completed its Beta phase, and the related function names have been updated.

For more information, see Changes in default behavior.

Engine enhancements

  • You can create tables with a row-column hybrid storage structure. This allows a single copy of data to support multiple query scenarios, such as point queries and OLAP.

  • You can consume Hologres binary logging data in real time using JDBC (Beta).

  • You can enable Hologres Binlog on demand and dynamically modify its configurations.

  • Hologres now supports renaming columns.

  • The JSONB index feature (Beta) is now available to accelerate queries and the retrieval of data in JSON format.

  • The management mechanism for metadata in memory is optimized. You can cache and compress metadata to manage memory more efficiently.

Foreign table optimization

  • Hologres now supports reading OSS data in CSV, Parquet, SequenceFile, and ORC formats using DLF.

  • Hologres now supports cross-database queries and federated queries across multiple Hologres instances.

Security enhancements

  • Hologres now supports storage encryption for data in Hologres internal tables (Beta) to enhance data access security.

  • Hologres now supports reading encrypted MaxCompute data (Beta). This feature improves the compatibility of Hologres with the MaxCompute ecosystem.

Hologres V0.10 (May 2021)

Core feature enhancements

Core feature description

References

Engine enhancements

  • Supports automatic collection of table statistics: Table statistics are automatically sampled during data writes and updates to generate better query plans, which eliminates the need to manually execute Analyze Table.

  • Supports high reliability with millisecond-level switchover for point query (Key/Value) scenarios (Beta): Shard-level multi-replica configuration, millisecond-level primary-replica switching, and query retries significantly improve reliability in serving scenarios.

  • The RoaringBitmap extension is added, which provides native support for the Bitmap data type and related functions.

  • The bit_construct and bit_match functions are added: Optimized for scenarios like user targeting and attribution, they support more efficient aggregation filtering based on userid.

  • The range_retention_count and range_retention_sum functions are added: Optimized for retention scenarios with multi-day range queries.

  • The Resharding tool is added: With the built-in Resharding function, you can modify the shard count without recreating the table, which simplifies the tuning process.

  • The default compression format for column store is optimized to AliORC, which increases the storage compression ratio by 30% to 50%.
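
The bitmap-based user targeting mentioned in this list can be sketched in a few lines. The example below uses a plain Python integer as an uncompressed bitset, which is an assumption made for brevity: it is not the Roaring compression scheme and not the Hologres extension API. The function names and the sample audiences are hypothetical, and user IDs are assumed to be small non-negative integers.

```python
from typing import Iterable, List


def build_bitmap(user_ids: Iterable[int]) -> int:
    """Set one bit per user ID in an integer that serves as the bitmap."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap


def cardinality(bitmap: int) -> int:
    """Count the users in the bitmap (number of set bits)."""
    return bin(bitmap).count("1")


def to_ids(bitmap: int) -> List[int]:
    """Expand the bitmap back into a sorted list of user IDs."""
    ids, position = [], 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids


# Hypothetical audiences, each stored as one bitmap per tag.
lives_in_city = build_bitmap([1, 3, 5, 8])
bought_last_week = build_bitmap([3, 4, 8, 9])

both = lives_in_city & bought_last_week    # users who match both tags
either = lives_in_city | bought_last_week  # users who match at least one tag
print(to_ids(both), cardinality(either))   # [3, 8] 6
```

Because set intersection and union reduce to bitwise AND and OR, combining tags does not require joining the underlying user tables, which is the main appeal of bitmap types for targeting workloads.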

Foreign table query features

  • MaxCompute foreign table query performance is improved (Beta): The new foreign table acceleration engine improves query performance by about 30% to 100% compared to previous versions.

  • DLF integration is added (Beta): Read OSS data via DLF.

Performance optimization

  • Point query performance is improved: Total throughput for row store is increased by 100%, and for column store by 30%.

  • Update operation optimization: Update/Delete performance is improved by 30%.

  • Query Plan cache: The Query Plan Cache is optimized to reduce optimizer latency.

Enterprise-level O&M and security optimization

  • Slow queries are exposed, with a built-in query status history. You can query the status of all queries within the last month to quickly locate slow and failed queries.

Viewing and Analyzing Slow Query Logs

Hologres V0.9 (January 2021)

Core feature enhancements

Core feature description

References

Engine enhancements

  • Data types are enriched.

    • JSON and JSONB types.

    • Time types: interval, timetz, time

    • Network type: inet

    • Currency type: money

    • PG system types: name, uuid, oid

    • Other: bytea, bit, varbit

  • Function types are enriched, including PG-compatible functions and Hologres extension functions.

    • Array functions: array_length and array_positions are added.

    • Functions to view table and DB storage size: pg_relation_size and pg_database_size.

  • Supports exporting Hologres data to MaxCompute using Hologres SQL statements for data archiving.

  • Supports Hologres Binlog subscription (Beta).

  • Supports dynamic modification of table bitmap indexes and dictionary encoding, and automatic creation of dictionary encoding based on data characteristics.

  • The Hologres Client Library is released. It is suitable for large-batch offline and real-time data synchronization to Hologres and for high-QPS point query scenarios. It improves throughput by automatically batching data.

  • The JDBC write pipeline and query optimizer are optimized, which significantly improves engine write efficiency.

  • BI ecosystem connectivity is improved. More BI tools, such as Tableau Server and Superset, are supported to meet various business analysis needs.

Security enhancements

  • Supports logging on to Hologres with an STS account by assuming a role. This provides more secure and flexible logon options in addition to the cloud account.

RAM role authorization mode

2020

Hologres V0.8 (October 2020)

Core feature enhancements

Core feature description

References

Engine enhancements

  • You can create views using the CREATE VIEW statement. You can create views based on one or more tables (including internal and foreign tables) or other views.

  • The SERIAL, DATE, TIMESTAMP, VARCHAR(n), and CHAR(n) data types are added. Also, Array type mapping is supported for MaxCompute foreign table data.

  • You can use the INSERT ON CONFLICT feature to update or skip duplicate data based on the primary key configuration when you insert data.

  • Supports the TRUNCATE feature.

  • The built-in Proxima vector search engine supports vector search on massive datasets. This feature is currently in Beta.
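
The update-or-skip behavior of INSERT ON CONFLICT described in this list can be tried locally. The sketch below uses the in-memory SQLite database from the Python standard library as a stand-in, because SQLite 3.24 and later accept the same PostgreSQL-style clause. The `users` table and its data are hypothetical, and the exact syntax and restrictions in Hologres should be checked against the Hologres documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT)")
conn.execute("INSERT INTO users (id, city) VALUES (1, 'Hangzhou')")

# Update the existing row when the primary key already exists.
conn.execute(
    "INSERT INTO users (id, city) VALUES (1, 'Beijing') "
    "ON CONFLICT (id) DO UPDATE SET city = excluded.city"
)
after_update = conn.execute("SELECT city FROM users WHERE id = 1").fetchone()[0]

# Skip the incoming row when the primary key already exists.
conn.execute(
    "INSERT INTO users (id, city) VALUES (1, 'Shanghai') "
    "ON CONFLICT (id) DO NOTHING"
)
after_skip = conn.execute("SELECT city FROM users WHERE id = 1").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

print(after_update, after_skip, row_count)  # Beijing Beijing 1
```

In both cases the table still holds a single row for the key: DO UPDATE overwrites the stored values with the incoming ones, and DO NOTHING keeps the stored row unchanged.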

Security enhancements

  • The data masking feature is added. You can configure multiple masking policies to mask sensitive information such as phone numbers, addresses, or ID card numbers.

  • Integration with CloudMonitor is supported, which allows for custom metric monitoring and one-click alerts.

MaxCompute foreign table query constraints and limits

  • A query on a MaxCompute partitioned table can scan a maximum of 512 partitions (50 in versions earlier than V0.8).

  • Each query can scan a maximum of 200 GB of underlying data, regardless of the number of foreign tables and fields (100 GB in versions earlier than V0.8).

Constraints and limitations