This topic provides an overview of Hologres and describes its features.
Hologres is an all-in-one real-time data warehouse service developed by Alibaba Cloud that allows you to write, update, and analyze large amounts of data in real time. Hologres supports standard SQL syntax. It is compatible with PostgreSQL and supports most PostgreSQL functions. Hologres supports online analytical processing (OLAP) and ad hoc analysis for petabytes of data, and provides high-concurrency, low-latency online data services. Seamlessly integrated with MaxCompute, Flink, and DataWorks, Hologres empowers enterprises with full-stack online and offline data warehouse solutions.
Hologres is designed to provide a real-time compute engine that features high performance, high reliability, cost efficiency, and high scalability. It provides real-time data warehouse solutions for managing large amounts of data, and interactive query services that respond with sub-second latency. Hologres is widely used in scenarios such as the construction of real-time data mid-ends, fine-grained analysis, self-service analysis, marketing profile setup, audience grouping, and real-time risk control.
Functions and features
- Queries and analysis in multiple scenarios
Hologres supports multiple index types and storage models such as row-oriented storage and column-oriented storage. It also supports diversified queries and analytics, such as simple queries, complex queries, and ad hoc queries. By using a massively parallel processing (MPP) architecture, Hologres processes SQL statements in distributed mode. This improves resource utilization and accelerates the analysis of large amounts of data.
- Interactive analysis in sub-seconds
Hologres performs parallel computing based on a scalable MPP architecture and uses vectorized operators to maximize the computing power of CPUs. It provides column-oriented storage in the ORC format to optimize indexes, and SSD storage to improve I/O throughput. This way, Hologres supports interactive analysis with sub-second response times for petabytes of data.
- High-performance point queries by using primary keys
With primary key indexes on row-oriented tables and an optimized short execution path for queries, Hologres supports hundreds of thousands of high-performance point queries per second and data updates with high throughput. Compared with open source systems, Hologres can improve performance by more than 10 times. This makes Hologres suitable for scenarios such as ID mapping and dimension table associations for real-time data processing.
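The difference between the two storage models mentioned above can be illustrated with a small Python sketch. It is a conceptual model only, not a description of how Hologres stores data internally; the sample schema (`user_id`, `city`, `amount`) and the function names are hypothetical.

```python
from typing import Dict, List

# Sample records; the schema (user_id, city, amount) is hypothetical.
records = [
    {"user_id": 101, "city": "Hangzhou", "amount": 30.0},
    {"user_id": 102, "city": "Beijing", "amount": 12.5},
    {"user_id": 103, "city": "Hangzhou", "amount": 7.5},
]

# Row-oriented layout: each row is kept whole and indexed by primary key,
# so a point query touches exactly one entry.
row_store: Dict[int, dict] = {r["user_id"]: r for r in records}

# Column-oriented layout: each column is kept as its own list, so an
# aggregate reads only the columns it needs.
column_store: Dict[str, List] = {
    name: [r[name] for r in records] for name in records[0]
}


def point_query(user_id: int) -> dict:
    """Look up one row by primary key (row-oriented access pattern)."""
    return row_store[user_id]


def total_amount_for_city(city: str) -> float:
    """Scan two columns and aggregate (column-oriented access pattern)."""
    cities = column_store["city"]
    amounts = column_store["amount"]
    return sum(a for c, a in zip(cities, amounts) if c == city)
```

In this model, `point_query(102)` reads a single indexed row, which is the access pattern that suits ID mapping and dimension table lookups, whereas `total_amount_for_city("Hangzhou")` scans only the `city` and `amount` columns, which is the pattern that suits analytical queries.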
- Federated queries and accelerated queries by using foreign tables
Seamlessly integrated with MaxCompute, Hologres allows you to use foreign tables to accelerate queries on MaxCompute data. Compared with direct queries on MaxCompute data, the accelerated queries can be 5 to 10 times faster. Hologres supports the association analysis of hot and cold data. To simplify data import into data lakes or warehouses, Hologres allows millions of rows to be synchronized per second from MaxCompute tables to Hologres tables. In addition, Hologres allows you to read data from and write data to Object Storage Service (OSS) external tables.
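As a rough sketch of the foreign table workflow described above, the following Python helpers build the two SQL statements involved: one that maps a MaxCompute table as a foreign table, and one that copies its data into a Hologres table. `IMPORT FOREIGN SCHEMA` is standard PostgreSQL syntax. The server name `odps_server`, the project and table names, and the helper function names are assumptions for illustration; check the Hologres documentation for the options that your instance supports.

```python
import re

# Accept only plain identifiers, because the names are interpolated into SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _checked(name: str) -> str:
    """Return the name unchanged, or raise if it is not a plain identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain SQL identifier: {name!r}")
    return name


def import_foreign_table_sql(project: str, table: str,
                             local_schema: str = "public",
                             server: str = "odps_server") -> str:
    """Build a statement that maps one MaxCompute table as a foreign table.

    The default server name is an assumption, not taken from this topic.
    """
    return (
        f"IMPORT FOREIGN SCHEMA {_checked(project)} "
        f"LIMIT TO ({_checked(table)}) "
        f"FROM SERVER {_checked(server)} INTO {_checked(local_schema)};"
    )


def sync_to_local_sql(local_table: str, foreign_table: str) -> str:
    """Build a statement that copies foreign table rows into a Hologres table."""
    return (
        f"INSERT INTO {_checked(local_table)} "
        f"SELECT * FROM {_checked(foreign_table)};"
    )
```

Querying the foreign table directly corresponds to the accelerated query path, and running the `INSERT INTO ... SELECT` statement corresponds to synchronizing MaxCompute data into a Hologres table.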
- Native real-time data warehouse
Real-time data warehouses typically involve frequent data updates, simple data models, and a need for quick data analysis. To meet these requirements, Hologres supports real-time high-concurrency data writes and updates, as well as isolation and atomicity among transactions. This ensures that data can be queried as soon as it is written.
- Real-time high-throughput data writes and updates
Hologres is integrated with computing frameworks such as Flink and Spark, and provides built-in connectors that you can use to write and update large amounts of data in real time. You can use Hologres tables as source tables, result tables, and dimension tables, and perform complex operations such as merging multiple data streams.
- A development environment in which what you see is what you get
Hologres allows you to query data as soon as it is written. You can query data in a specific table, in all tables in a schema, or in a database. Hologres also allows you to create, delete, or update a view for one or more tables. In addition to data update and delete operations, you can join tables, perform nested queries, and use window functions to query data in Hologres. Hologres provides native support for the analysis of semi-structured JSON data and allows you to synchronize databases from sources such as MySQL to Hologres in real time with a few clicks.
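The query shapes listed above (a join, a nested query, and a window function) can be combined in a single statement. The sketch below runs such a statement against SQLite from the Python standard library, used here purely as a stand-in so that the example is runnable without a Hologres instance. The SQL itself is standard and should carry over to a PostgreSQL-compatible engine, but the tables and data are hypothetical, and window functions require SQLite 3.25 or later.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a PostgreSQL-compatible engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
INSERT INTO users VALUES (1, 'Hangzhou'), (2, 'Beijing');
INSERT INTO orders VALUES (10, 1, 30.0), (11, 1, 20.0), (12, 2, 50.0);
""")

# The inner query joins orders to users and ranks orders within each city.
# The outer query keeps only the largest order per city.
query = """
SELECT city, order_id, amount, rank_in_city
FROM (
    SELECT u.city, o.order_id, o.amount,
           RANK() OVER (PARTITION BY u.city ORDER BY o.amount DESC) AS rank_in_city
    FROM orders AS o
    JOIN users AS u ON u.user_id = o.user_id
) AS ranked
WHERE rank_in_city = 1
ORDER BY city;
"""
top_orders = conn.execute(query).fetchall()
```

Each row of `top_orders` holds a city, the ID and amount of its largest order, and the rank value 1.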
- Event-driven from end to end
Hologres allows you to parse the binary logs of table update events. You can use Flink to consume Hologres binary logs in order to realize end-to-end real-time development across warehouse layers. This way, you can reduce the end-to-end latency of data processing while meeting the requirements for tiered data governance.
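To make the event-driven idea more concrete, the following Python sketch replays a stream of change events from one table to keep an aggregate in a downstream layer up to date. It is a generic change-log model, not the Hologres binary log format or the Flink connector API; the event type names and the tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each event is (event_type, city, amount). The event type names are
# hypothetical and do not reflect the actual binary log encoding.
ChangeEvent = Tuple[str, str, float]


def apply_changes(events: Iterable[ChangeEvent]) -> Dict[str, float]:
    """Maintain per-city totals by replaying change events in order."""
    totals: Dict[str, float] = defaultdict(float)
    for event_type, city, amount in events:
        if event_type in ("INSERT", "AFTER_UPDATE"):
            # A new row version adds its amount to the aggregate.
            totals[city] += amount
        elif event_type in ("DELETE", "BEFORE_UPDATE"):
            # A retracted row version removes its amount from the aggregate.
            totals[city] -= amount
        else:
            raise ValueError(f"unknown event type: {event_type!r}")
    return dict(totals)


events = [
    ("INSERT", "Hangzhou", 30.0),
    ("INSERT", "Beijing", 12.5),
    ("BEFORE_UPDATE", "Hangzhou", 30.0),  # old version of an updated row
    ("AFTER_UPDATE", "Hangzhou", 45.0),   # new version of the same row
    ("DELETE", "Beijing", 12.5),
]
layer_totals = apply_changes(events)
```

Because each update arrives as a retraction followed by a new version, the downstream aggregate can be adjusted incrementally instead of being recomputed from the full table, which is what keeps end-to-end latency low.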
- Enterprise-level O&M capabilities
Hologres supports fine-grained management of computing loads and access permissions. It provides a wide range of monitoring and alerting metrics, and supports elastic scaling of computing resources as well as hot system updates. These secure and reliable capabilities help meet enterprise-level O&M requirements.
- Data security
Hologres provides fine-grained access control policies and data security features, including Bring Your Own Key (BYOK) encryption, data masking, Data Security Guard, and IP address whitelists. It also supports multiple authentication systems such as Resource Access Management (RAM), Security Token Service (STS), and independent account systems. Hologres has passed Payment Card Industry Data Security Standard (PCI DSS) assessment.
- Load isolation
Hologres supports the isolation of loads based on resource groups. This allows you to isolate resources for different business requirements, various query types, and data reads and writes. This helps ensure the sustainability and stability of the system.
- High reliability
Hologres allows you to use multiple compute instances to build a high-reliability mode. In this mode, Hologres supports storage sharing among compute instances, fault isolation, high availability of online services, and rapid automatic recovery of failed nodes. Hologres stores data in Apsara Distributed File System with highly reliable triplicate redundancy, so you do not need to use local disks.
- Ecosystem and scalability
Hologres is compatible with the PostgreSQL ecosystem and seamlessly integrated with DataWorks, the big data development platform of Alibaba Cloud. If you are familiar with these tools, you can get started with Hologres with little additional learning.
- Compatibility with PostgreSQL
Because it is compatible with PostgreSQL, Hologres provides Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) interfaces that you can use to connect third-party extract, transform, load (ETL) and business intelligence (BI) tools, such as Quick BI, DataV, Tableau, and FanRuan. It also supports spatial data analysis based on geographic information systems (GIS).
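Because the interface is PostgreSQL-compatible, clients are typically pointed at an instance with an ordinary PostgreSQL connection string. The sketch below builds both a PostgreSQL JDBC URL and a libpq-style connection URI in Python. The host name, port, database name, and credentials are placeholders, and the helper names are hypothetical; use the endpoint shown for your own instance.

```python
from urllib.parse import quote


def jdbc_url(host: str, port: int, database: str) -> str:
    """Build a URL in the standard PostgreSQL JDBC driver format."""
    return f"jdbc:postgresql://{host}:{port}/{quote(database, safe='')}"


def postgres_uri(host: str, port: int, database: str,
                 user: str, password: str) -> str:
    """Build a libpq-style connection URI with percent-encoded credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Placeholder values for illustration only.
example_jdbc = jdbc_url("hologres.example.com", 80, "demo_db")
example_uri = postgres_uri("hologres.example.com", 80, "demo_db",
                           "access_id", "p@ss/word")
```

Percent-encoding the user name and password keeps characters such as `@` and `/` from being misread as URI delimiters. A JDBC-based BI tool would take the first form, and a libpq-based client would take the second.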
- DataWorks development and integration
Hologres is seamlessly integrated with DataWorks. Together with DataWorks, Hologres provides visualized, intelligent, and all-in-one tools for data warehouse construction and interactive analysis. This way, Hologres provides enterprise-level solutions for data asset management, data lineage management, real-time data synchronization, and data services.
- Vector search engine: Proxima
Hologres is also integrated with Alibaba Cloud Machine Learning Platform for AI (PAI) and has a built-in vector search engine named Proxima. Proxima supports online real-time feature storage, real-time retrievals, and vector searches.