HybridDB for PostgreSQL

An online MPP warehousing service based on the Greenplum Database open source program

Overview

ApsaraDB HybridDB for PostgreSQL is an online MPP (Massively Parallel Processing) data warehousing service based on the open source Greenplum Database.

ApsaraDB HybridDB provides online expansion and performance monitoring services that free your team from complicated MPP cluster operations and maintenance (O&M). This lets database administrators, developers and data analysts focus on improving enterprise productivity through SQL development.


Benefits

Superior Performance

  • ApsaraDB HybridDB enables mixed use of row and column stores. For OLAP analytics, column stores can be up to 100 times faster than row stores.

  • Supports high-performance parallel data imports from OSS, eliminating the bottleneck of single-channel imports.

Hybrid Analytics

  • Supports real-time analytics on GIS data in SQL syntax to assist in LBS statistics for IoT devices and other Internet-based statistics.

  • Supports real-time analytics on JSON, XML and fuzzy string data in SQL syntax, helping financial institutions, government organizations and enterprises with message data processing and fuzzy text matching.

Flexible Scalability

  • Scale computing units (CPU, memory, and storage space) to increase OLAP performance and handle hundreds of terabytes of data.

  • Supports transparent OSS data operations. Cold data that is not needed for online analytics can be stored in OSS, with storage capacity scaled as needed. Combined with data compression on External Tables, this greatly reduces costs.

Stable and Reliable

  • Supports distributed ACID transactions, with all data synchronized as two copies on two nodes.

  • Features distributed deployment with three-level protection at the segment, server and cabinet levels, safeguarding important data infrastructure.

Easy-to-Use

  • Supports rich OLAP SQL syntax and functions, as well as numerous Oracle functions. Popular BI software can connect to ApsaraDB HybridDB directly.

  • Link ApsaraDB HybridDB with ApsaraDB for RDS to build hybrid transactional and analytical processing (HTAP) solutions that combine OLTP and OLAP.

Product Details

ApsaraDB HybridDB offers flexible hybrid analytics capabilities. The product supports traditional SQL data types as well as XML, JSON and GIS data, provides a hybrid row store/column store mode to enhance analytics performance, and supports data compression to cut storage costs.

ApsaraDB HybridDB for PostgreSQL is based on the open source Greenplum Database. Alibaba Cloud has added in-depth extensions including OSS storage, the JSON data type, and HyperLogLog approximate analysis. HybridDB supports the SQL 2008 standard query syntax and OLAP aggregate functions.


Features

Distributed

  • Distributed database with ACID transactions.

  • Based on distributed MPP (Massively Parallel Processing) architecture.

  • Storage and computing capacity scale linearly as segments are added.

  • Realizes full potential of OLAP computing efficiency.

  • Distributed SQL OLAP statistics and window functions.

  • PL/pgSQL and PL/Java stored procedures.

Machine Learning and Analysis

  • SQL-based machine learning with MADlib.

  • Hybrid analysis of geographic data that conforms to the international OpenGIS standards.

  • JSON data type analysis.

  • HyperLogLog algorithm analysis.

Data Integration

  • Supported by popular ETL tools through the PostgreSQL/Greenplum JDBC driver.

  • MySQL users can synchronize data incrementally via 'rds_dbsync'.

  • Data in OSS can be queried in standard SQL syntax through OSS External Tables.

  • OSS External Table supports data compression to reduce production costs.

Security

  • A maximum of 1,000 server IP addresses are allowed in IP whitelist configuration.

  • Real-time monitoring at the network entry point to detect and defend against DDoS attacks.

Pricing

HybridDB offers the following billing methods:

  • Pay-As-You-Go

Asia Pacific SE 1 (Singapore)

High Performance

Node Type | Cores | Memory | Storage | Pay-As-You-Go (USD per hour)
gpdb.group.segsdx1 | 1 | 8GB | 80GB SSD | 0.133
gpdb.group.segsdx2 | 2 | 16GB | 160GB SSD | 0.265
gpdb.group.segsdx16 | 16 | 128GB | 1.28TB SSD | 2.494

High Capacity

Node Type | Cores | Memory | Storage | Pay-As-You-Go (USD per hour)
gpdb.group.seghdx4 | 4 | 32GB | 2TB HDD | 0.955
gpdb.group.seghdx36 | 36 | 288GB | 18TB HDD | 8.246

China North 1, China East 1, China East 2, China South 1

High Performance

Node Type | Cores | Memory | Storage | Pay-As-You-Go (USD per hour)
gpdb.group.segsdx1 | 1 | 8GB | 80GB SSD | 0.129
gpdb.group.segsdx2 | 2 | 16GB | 160GB SSD | 0.258
gpdb.group.segsdx16 | 16 | 128GB | 1.28TB SSD | 2.064

High Capacity

Node Type | Cores | Memory | Storage | Pay-As-You-Go (USD per hour)
gpdb.group.seghdx4 | 4 | 32GB | 2TB HDD | 0.672
gpdb.group.seghdx36 | 36 | 288GB | 18TB HDD | 6.048

Unit Price of Node Type

Calculate the charge of database usage using the formula below. Charges are calculated and collected on an hourly basis:

  • Bill amount = Unit price of the node type [USD per node per hour] * Number of compute nodes [nodes] * Usage time [hours]

  • If you scale the database within an hour, the “Number of compute nodes” for that hour is the maximum number of nodes reached during the hour.

  • Every instance generates an hourly charge.
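The billing rules above can be sketched in a few lines of Python. This is only an illustration of the formula, not an official calculator: the function name `bill_amount` and the input layout (one list of observed node counts per billed hour) are assumptions made for this example, and the unit price is taken from the pay-as-you-go tables above.

```python
def bill_amount(unit_price: float, nodes_seen_per_hour: list[list[int]]) -> float:
    """Total charge for one instance (hypothetical helper).

    unit_price: price of the node type per node per hour.
    nodes_seen_per_hour: one inner list per billed hour, holding every node
        count the instance had during that hour (several values if scaled).
    """
    total = 0.0
    for nodes_seen in nodes_seen_per_hour:
        # Scaling within an hour is billed at the maximum node count.
        total += unit_price * max(nodes_seen)  # usage time is 1 hour per entry
    return round(total, 3)


# Three hours of gpdb.group.segsdx2 in Singapore (0.265 per node per hour):
# 2 nodes in hour 1, scaled from 2 to 4 nodes in hour 2, 4 nodes in hour 3.
usage = [[2], [2, 4], [4]]
print(bill_amount(0.265, usage))
```

Because charges are collected hourly, the hour in which the instance was scaled is billed for 4 nodes, not 2.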

Other Charges

Network data traffic

  • Network data traffic is free of charge during promotion periods.

SQL auditing fees

  • No charges will be incurred if you have disabled SQL auditing.

  • SQL auditing service is free of charge before June 1, 2017.

Performance monitoring fees

  • Free monitoring service is offered at a frequency of once per 300 seconds.

Other Instructions

  • The prices listed above do not include discounts. Please watch for further notices of any discounts or special offers.

  • This product keeps multiple replicas online at the same time for service reliability. You must therefore buy an even number of nodes.

  • Network data traffic is currently free of charge and the starting time for charges is subject to further notice.

  • This product does not support capacity downgrading operations.

  • Prices were updated on May 12, 2017.


Scenarios

One-time Development

ISVs (Independent Software Vendors) can run the same MPP applications in both on-premise and cloud environments. Businesses with on-premise infrastructure can use Greenplum Database directly, while businesses on the cloud can adopt HybridDB directly.

Developers only need to write the application once for it to run on both traditional and cloud platforms. Both on-premise and cloud deployments can be reached through generic PostgreSQL drivers, which makes it easier to connect the business with other platforms of the same architecture. You can build an integrated “hybrid cloud” data warehouse development platform without worrying about the differences between on-premise and cloud platforms.
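As a rough illustration of the "write once" idea, the sketch below builds a standard `postgresql://` connection URI from a deployment profile, so the same application code can target either an on-premise Greenplum cluster or a HybridDB instance. The host names, ports, database name and credentials are hypothetical placeholders, not real endpoints.

```python
from urllib.parse import quote

# Hypothetical deployment profiles; replace with your own endpoints.
PROFILES = {
    "on_premise": {"host": "greenplum.example.internal", "port": 5432},
    "cloud": {"host": "hybriddb.example.com", "port": 5432},
}


def connection_uri(profile: str, dbname: str, user: str, password: str) -> str:
    """Return a postgresql:// URI for the chosen deployment profile."""
    target = PROFILES[profile]
    # Percent-encode the parts so characters such as '@' or ':' stay valid.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{target['host']}:{target['port']}/{quote(dbname, safe='')}"
    )


# Only the profile name changes between environments.
for name in PROFILES:
    print(name, connection_uri(name, "warehouse", "analyst", "placeholder"))
```

PostgreSQL drivers built on libpq generally accept a URI of this form, so the rest of the application does not need to know which platform it is talking to.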

IoT Analytics (JSON+GIS)

ApsaraDB HybridDB and PostgreSQL both embed PostGIS, the OpenGIS-compliant spatial database engine, for real-time positioning and route planning. PostGIS is supported by ArcGIS, Intergraph and QGIS. You can combine simple SQL statements with GIS functions in your application to handle complicated geospatial data models (2D and 3D processing are supported).

Thanks to HybridDB’s comprehensive OLAP capability, massive data analytics based on geographic information can be performed to support decision making for IoT, mobile Internet, logistics delivery, smart cities, LBS, O2O business systems, and more.

Internet Approximating (HyperLogLog)

Cardinality estimation, that is, counting the distinct values in a data set, is one of the most common tasks in Big Data scenarios. Memory demand, merging of partial results and data processing are the major problems that arise during cardinality estimation. Page view (PV) and unique visitor (UV) calculations both fall into this category.

In SQL, this calculation is usually done with COUNT(DISTINCT ...), but its performance is low on large data sets. HyperLogLog improves the query performance of cardinality estimation by 20 to 100 times, with an error rate of approximately 2%. It can be adopted in business scenarios that do not demand exact results, which greatly reduces server computation load and costs.
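To make the trade-off concrete, here is a minimal, self-contained sketch of the HyperLogLog idea in Python. It is an illustration of the general algorithm, not HybridDB's implementation. The class name, the SHA-1 based 64-bit hash and the choice of 4,096 registers are assumptions for this example. With m registers the standard error of the estimate is about 1.04/√m, roughly 1.6% here, which is in the same range as the error rate quoted above.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch with 2**p registers."""

    def __init__(self, p: int = 12):
        self.p = p
        self.m = 1 << p                      # number of registers
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, m >= 128

    def add(self, item: str) -> None:
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        idx = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other: "HyperLogLog") -> None:
        """Union with another sketch of the same size (register-wise max)."""
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self) -> float:
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


# 20,000 page-view events coming from 10,000 distinct visitors.
sketch = HyperLogLog()
for i in range(20_000):
    sketch.add(f"visitor-{i % 10_000}")
print(round(sketch.count()))
```

The sketch keeps only 4,096 small counters regardless of how many events it sees, and two sketches built on different data partitions can be combined with `merge`, which addresses the memory and merging problems mentioned above.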

OLTP & OLAP

A wide range of options is available for migrating your Greenplum-based data warehouses to ApsaraDB HybridDB, and you do not need to worry about the complicated O&M of the MPP cluster. Alibaba Cloud also provides a complete set of scaling and availability solutions, so that database administrators, developers and data analysts can focus on improving enterprise productivity through SQL and on creating core value.

With Alibaba Cloud ApsaraDB for RDS, you can achieve high performance for OLTP applications. RDS supports MySQL, SQL Server and PostgreSQL. Combined with ApsaraDB HybridDB, it lets you integrate OLTP and OLAP databases on the cloud into one database architecture that covers both high-concurrency production transactions and decision-making analysis.


Getting Started

Easily set up, provision and manage ApsaraDB HybridDB for massive data processing using the Management Console, CLI and APIs.

Managing ApsaraDB HybridDB for PostgreSQL through Management Console

Use the Alibaba Cloud Console to launch and configure ApsaraDB HybridDB per your processing requirements.

HybridDB for PostgreSQL Console

ApsaraDB HybridDB for PostgreSQL API Reference

Access ApsaraDB HybridDB APIs for efficient provisioning and management (Coming soon).

Resources

Below are links to the documentation, SDKs and other related resources for ApsaraDB HybridDB.

Developer Resources

FAQs

1. How should I choose among RDS, HybridDB for PostgreSQL and E-MapReduce?

HybridDB

  • Based on Greenplum Database.

  • OLAP (On-line Analytical Processing) data warehouse.

  • Scale-out storage as needed. With the MPP distributed architecture, analytics and storage performance scale linearly. Complicated SQL queries can complete within seconds or even milliseconds, with concurrency kept within 500.

RDS

  • MySQL / PostgreSQL / SQL Server.

  • OLTP (On-line Transaction Processing) database.

  • Supports different database engines and targets transaction-based, real-time business processing: CRUD (create, retrieve, update, and delete). Online data of less than 2TB is supported.

E-MapReduce

  • Hadoop, Apache Spark, HBase, Presto, and Storm.

  • Big Data processing solution to quickly process a huge amount of data.

  • Allows you to launch Hadoop clusters within minutes for massive data processing, which simplifies complex big data processing by performing data-intensive tasks for the applications involved.

2. Which ETL tools support ApsaraDB HybridDB for PostgreSQL?

HybridDB is based on the open-source Greenplum Database program and adopts universal JDBC and ODBC interfaces. Therefore, almost all ETL tools that support Greenplum and PostgreSQL also support HybridDB.

3. What is the difference between HybridDB for PostgreSQL and Greenplum Database on which HybridDB is based?

  • HybridDB extensions support JSON, HyperLogLog and oss_ext external tables, while the open-source Greenplum Database doesn’t.

  • HybridDB is a cloud computing service and configuration is made easy with the click of a mouse via the Alibaba Cloud Console. Users don’t have to worry about the management of data warehouse deployment and expansion, among other complicated configurations.

  • HybridDB is structured on the unified management platform of Alibaba Cloud ApsaraDB and imposes limits on the superuser permissions.

4. Is the instance space purchased for HybridDB fully available for use?

Yes, the instance space you purchase is fully usable. HybridDB reserves additional temporary file space that does not occupy the resources you have purchased.

5. What is the relationship between the HybridDB node type and the Greenplum Database segment?

  • A node is composed of one or more segments. The cores, memory and disk space listed for a node type are the resources actually available to you. For example, a node of type 4 Cores/32GB Mem/2TB HDD contains 4 segments with 1 Core/8GB Mem/0.5TB HDD each.

  • For the same node type of 4 Cores/32GB Mem/2TB HDD, the corresponding native Greenplum Database deployment contains primary segments that total 4 Cores/32GB Mem/2TB HDD and mirror segments that total another 4 Cores/32GB Mem/2TB HDD. In other words, to build a Greenplum Database cluster of the same configuration yourself, you should prepare physical resources of 8 Cores/64GB Mem/4TB HDD or more (plus the additional temporary file space for the cluster).

  • All the segments in a node are allocated on the same server. A high-configuration node helps reduce network switching and improve performance, so we recommend choosing one if you require more computing resources. To purchase a super-high configuration node, please contact your customer manager or submit a ticket.
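The arithmetic in this answer can be checked with a short Python sketch. The `Resources` type and the two helper functions are hypothetical names introduced for this example. The factor of two for a self-built cluster simply reflects the primary plus mirror segments described above and does not include the extra temporary file space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cores: float
    memory_gb: float
    disk_tb: float


def per_segment(node: Resources, segments: int) -> Resources:
    """Split a node's usable resources evenly across its segments."""
    return Resources(node.cores / segments,
                     node.memory_gb / segments,
                     node.disk_tb / segments)


def self_built_equivalent(node: Resources) -> Resources:
    """Physical resources for an equivalent native Greenplum cluster:
    one set for the primary segments and one for the mirror segments."""
    return Resources(node.cores * 2, node.memory_gb * 2, node.disk_tb * 2)


# The 4 Cores/32GB Mem/2TB HDD node type used in the example above.
node = Resources(cores=4, memory_gb=32, disk_tb=2)
print(per_segment(node, segments=4))
print(self_built_equivalent(node))
```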

6. How far can the storage capacity of HybridDB be scaled?

We offer computing and storage configurations of 2048 Cores/16TB Mem/1024TB HDD or higher, according to your requirements. To purchase a super-high configuration node, please contact your customer manager or submit a ticket via the Console.