This topic provides an overview of Hologres and describes its features.

Advances in data collection now allow enterprises to gather and manage terabytes, petabytes, or even exabytes of data, which accelerates their digital transformation. This steep growth in data volume has been accompanied by rapid development of the data mid-end. Data applications increasingly focus on core business areas, such as data support, user profiling, real-time user filtering, and targeted advertising. Highly reliable, low-latency data services are therefore critical to the digital transformation of enterprises.

Hologres is an all-in-one real-time data warehouse service developed by Alibaba Cloud that allows you to write, update, and analyze large amounts of data in real time. It is compatible with PostgreSQL and therefore supports standard SQL syntax. Hologres supports online analytical processing (OLAP) and ad hoc analysis for petabytes of data, and provides high-concurrency, low-latency online data services. Seamlessly integrated with MaxCompute, Flink, and DataWorks, Hologres empowers enterprises with full-stack online and offline data warehouse solutions.

Built on a scalable, high-performance, and cost-effective compute engine, Hologres provides real-time data warehouse solutions and interactive queries that return results in less than a second, which helps you process large amounts of data.
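
Because Hologres is compatible with PostgreSQL, a standard PostgreSQL client library can in principle be used to submit SQL statements. The following is a minimal sketch in Python, assuming that the third-party psycopg2 driver is installed. The helper names are hypothetical, and the endpoint, port, database name, and credentials are placeholders rather than real values.

```python
from typing import Any, Dict, List, Tuple


def connection_params(host: str, port: int, dbname: str,
                      user: str, password: str) -> Dict[str, Any]:
    """Collect the keyword arguments that a PostgreSQL driver expects."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return {"host": host, "port": port, "dbname": dbname,
            "user": user, "password": password}


def run_query(params: Dict[str, Any], sql: str) -> List[Tuple]:
    """Run one SQL statement and return all rows.

    psycopg2 is imported lazily so that the rest of this sketch loads
    without the driver. This function needs a reachable instance.
    """
    import psycopg2  # third-party PostgreSQL driver (assumed to be installed)

    conn = psycopg2.connect(**params)
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        conn.close()


# Placeholder values only; replace them with the details of your own instance.
params = connection_params(
    host="<your-instance-endpoint>",
    port=80,  # placeholder; use the port shown for your instance
    dbname="<your-database>",
    user="<your-user>",
    password="<your-password>",
)
# rows = run_query(params, "SELECT 1;")  # uncomment once the values are real
```

Keeping the parameters in a separate helper makes it possible to validate them before any network call is attempted.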

Features

  • Queries and analysis in multiple scenarios
    Hologres supports multiple index types and storage models, such as row-oriented storage and column-oriented storage. It also supports a range of query and analytics workloads, such as simple queries, complex queries, and ad hoc queries. Based on a massively parallel processing (MPP) architecture, Hologres processes SQL statements in distributed mode. This improves resource utilization and accelerates the analysis of large amounts of data.
    • Sub-second interactive analysis

      Hologres performs parallel computing based on a scalable MPP architecture and uses vectorized operators to maximize the computing power of CPUs. It stores column-oriented data in the ORC format with optimized indexes and uses SSD storage to improve I/O throughput. This way, Hologres supports interactive analysis of petabytes of data with sub-second response times.

    • High-performance point queries by using primary keys

      Thanks to primary key indexes on row-oriented tables and a query engine that takes an optimized short execution path for such lookups, Hologres supports hundreds of thousands of point queries per second as well as high-throughput data updates. Compared with open source systems, Hologres improves performance by more than 10 times. This makes Hologres suitable for scenarios such as ID mapping and dimension table joins in real-time data processing.

    • Federated queries and accelerated queries by using foreign tables

      Seamlessly integrated with MaxCompute, Hologres allows you to use foreign tables to accelerate queries on MaxCompute data. Compared with direct queries on MaxCompute data, the accelerated queries can be 5 to 10 times faster. Hologres supports joint analysis of hot and cold data. To simplify data import into data lakes or warehouses, Hologres can synchronize millions of rows per second from MaxCompute tables to Hologres tables. In addition, Hologres allows you to read data from and write data to Object Storage Service (OSS) foreign tables.

  • Native real-time data warehouse
    Real-time data warehouses are characterized by frequent data updates, simple data models, and a need for quick data analysis. To meet these demands, Hologres supports real-time, high-concurrency data writes and updates, as well as isolation and atomicity among transactions. This ensures that data can be queried as soon as it is written.
    • Real-time high-throughput data writes and updates

      Hologres is integrated with computing frameworks such as Flink and Spark, and provides built-in connectors that you can use to write and update large amounts of data in real time. You can use Hologres tables as source tables, result tables, and dimension tables, and perform complex operations such as merging multiple data streams.

    • A development environment in which what you see is what you get

      Hologres allows you to query data as soon as it is written. You can query data in a specific table or in all tables in a schema or database. Hologres also allows you to create, delete, or update a view on one or more tables. In addition to data update and delete operations, you can join tables, run nested queries, and use window functions to query data in Hologres. Hologres also supports semi-structured JSON data.

    • Event-driven from end to end

      Hologres allows you to parse the binary logs of table update events. You can use Flink to consume Hologres binary logs to implement end-to-end real-time development across data warehouse layers. This way, you can reduce the end-to-end latency of data processing while meeting the requirements for tiered data governance.

  • Enterprise-level O&M capabilities
    Hologres supports fine-grained management of computing workloads and access permissions. It provides a wide range of monitoring and alerting metrics, and supports elastic scaling of computing resources as well as hot upgrades of the system. These secure and reliable capabilities can meet enterprise-level O&M requirements.
    • Data security

      Hologres provides fine-grained access control policies and data security features, including Bring Your Own Key (BYOK) encryption, data de-identification, Data Security Guard, and IP address whitelists. It also supports multiple authentication systems, such as Resource Access Management (RAM), Security Token Service (STS), and independent account systems. Hologres has passed the Payment Card Industry Data Security Standard (PCI DSS) assessment.

    • Load isolation

      Hologres supports load isolation based on resource groups. This allows you to isolate resources by business requirement, by query type, and between data reads and writes. This helps keep the system stable and sustainable.

    • High reliability

      Hologres allows you to use multiple compute instances to build a high-reliability deployment. In this mode, Hologres supports storage sharing among compute instances, fault isolation, high availability of online services, and rapid automatic recovery of failed nodes. Data is stored in triplicate in Apsara Distributed File System, which provides highly reliable redundant storage. Therefore, you do not need to use local disks.

  • Ecosystem and scalability
    Hologres is compatible with the PostgreSQL ecosystem and seamlessly integrated with DataWorks, the big data development and governance platform of Alibaba Cloud. If you are already familiar with PostgreSQL, you can get started with Hologres with little additional learning.
    • Compatibility with PostgreSQL

      Compatible with PostgreSQL, Hologres provides Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) interfaces for connecting to third-party extract, transform, load (ETL) and business intelligence (BI) tools, such as Quick BI, DataV, Tableau, and FanRuan. It also supports spatial data analysis based on geographic information systems (GIS).

    • DataWorks development and integration

      Hologres is seamlessly integrated with DataWorks. Together with DataWorks, Hologres provides visualized, intelligent, and all-in-one tools for data warehouse construction and interactive analysis. This way, Hologres provides enterprise-level solutions for data asset management, data lineage management, real-time data synchronization, and data services.

    • Vector search engine: Proxima

      Hologres is also integrated with Alibaba Cloud Machine Learning Platform for AI (PAI) and has a built-in vector search engine named Proxima. Proxima supports online real-time feature storage, real-time retrieval, and vector search.
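
To illustrate what a vector search over stored features computes, the following minimal Python sketch ranks stored vectors by cosine similarity to a query vector using a brute-force scan. It is not the Proxima API or algorithm. The function names, keys, and sample vectors are hypothetical, and production engines typically rely on approximate indexes rather than an exhaustive scan.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          features: Dict[str, Sequence[float]],
          k: int) -> List[Tuple[str, float]]:
    """Return the k stored keys most similar to the query, best match first."""
    scored = ((cosine_similarity(query, vec), key)
              for key, vec in features.items())
    best = heapq.nlargest(k, scored)  # exhaustive scan, keeps only k results
    return [(key, score) for score, key in best]


# Hypothetical feature store: key -> feature vector.
features = {
    "item_a": [1.0, 0.0],
    "item_b": [0.0, 1.0],
    "item_c": [1.0, 1.0],
}

result = top_k([1.0, 0.1], features, k=2)
for key, score in result:
    print(f"{key}: {score:.3f}")
```

With these sample vectors, the query points almost along the first axis, so item_a ranks first and item_c second.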