
What Is PolarDB for PostgreSQL?

This article explains PolarDB for PostgreSQL, its architecture, and the different ways to use PolarDB databases.

PolarDB for PostgreSQL (PolarDB) is a cloud-native database product independently developed by Alibaba Cloud. It is 100% compatible with PostgreSQL. It uses a shared-storage-based storage-compute separation architecture and features extreme elasticity, millisecond latency, and HTAP.


1.  Extreme Elasticity: Storage and computing capabilities can be scaled out independently.

  • If the computing capacity is insufficient, you can expand the computing cluster separately without data replication.
  • If the storage capacity or IO is insufficient, you can expand the storage cluster without service interruption.

2.  Millisecond Latency:

  • WAL logs are kept on the shared storage, so only WAL metadata is replicated from the read-write (RW) node to the read-only (RO) nodes.
  • PolarDB's original LogIndex technology implements lazy replay and parallel replay, which in principle minimizes the latency between the RW and RO nodes.

3.  HTAP Capability: A distributed parallel execution framework built on shared storage accelerates OLAP queries in OLTP scenarios. A single copy of OLTP data serves two execution engines.

  • Standalone Execution Engine: Handles highly concurrent TP workloads.
  • Distributed Execution Engine: Handles AP workloads with large queries.

PolarDB also supports multi-model features (such as spatio-temporal, GIS, image, vector, search, and graph) to meet the changing data processing needs of enterprises.

In addition to the Shared-Storage cloud-native mode described above, PolarDB can be deployed in Shared-Nothing mode. Please see the Readme in the distribute branch for more information.

Branch Description

The default branch of PolarDB is the main branch, which supports the storage-compute separation architecture. The distributed (Shared-Nothing) form is maintained in the distribute branch, which corresponds to the previous master branch.

Product Architecture and Version Planning

PolarDB for PostgreSQL uses a shared-storage architecture in which compute is decoupled from storage. The database moves from the traditional Shared-Nothing architecture to a Shared-Storage one: instead of N compute nodes each with its own copy of the data (N compute + N storage), there are N compute nodes over a single copy of the data (N compute + 1 storage). Although the shared storage holds only one copy of the data, the in-memory state of each node differs, so data consistency must be maintained by synchronizing the memory state across nodes. In addition, the primary node must coordinate with the read-only nodes when it flushes dirty pages, so that a read-only node reads neither a "future" page (one newer than the WAL it has replayed) nor an outdated "past" page that has not yet been correctly replayed in memory. To solve this problem, PolarDB designed the LogIndex data structure, which maintains the replay history of each page and is used to synchronize data from the primary node to the read-only nodes.
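The idea of a per-page replay history can be illustrated with a toy model. The sketch below is not PolarDB code: the function names (logindex_add, pending_lsns), the flat "page LSN" text layout, and the integer LSNs are all assumptions made for illustration. It only shows how an index from page to WAL positions lets a node decide, when a page is read, which records still have to be applied to that page.

```shell
# Toy LogIndex model (illustrative only, not PolarDB's implementation).
# LOGINDEX holds one "page_id lsn" line per WAL record that modified a page.
LOGINDEX=""

# logindex_add PAGE LSN: remember that the WAL record at LSN touched PAGE.
# Only this metadata is stored; the record body stays in the WAL itself.
logindex_add() {
  LOGINDEX="${LOGINDEX}$1 $2
"
}

# pending_lsns PAGE APPLIED_LSN TARGET_LSN: list the LSNs that must still be
# replayed on PAGE, given that its in-memory copy already reflects APPLIED_LSN
# and the node wants to read it as of TARGET_LSN (lazy replay at read time).
pending_lsns() {
  printf '%s' "$LOGINDEX" | awk -v p="$1" -v lo="$2" -v hi="$3" '
    $1 == p && $2 > lo && $2 <= hi { printf "%s%s", sep, $2; sep = " " }
    END { print "" }'
}

# Example history: page 7 is modified three times, page 9 once.
logindex_add 7 100
logindex_add 7 120
logindex_add 9 110
logindex_add 7 150
```

With this history, `pending_lsns 7 0 130` lists the two records at or below LSN 130 for page 7, and pages that are never read incur no replay work at all, which is the point of replaying lazily.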

After storage and compute are separated, the latency of a single I/O increases, and so does the total I/O throughput available. When a single read-only node processes an analytical query, the CPU, memory, and I/O of the other read-only nodes and the large I/O bandwidth of the storage layer cannot be fully utilized. To solve this problem, PolarDB developed a parallel execution engine based on Shared-Storage, which can elastically use any number of CPUs at the SQL level to accelerate analytical queries and supports HTAP mixed workloads. Please see Product Architecture and Version Planning for more information.
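Because every node reads the same shared storage, a large scan can in principle be divided among workers on several nodes without moving any data. The sketch below only illustrates that idea: the function name split_blocks and the static, even split are assumptions made for the example and do not describe how PolarDB actually schedules work.

```shell
# split_blocks TOTAL_BLOCKS WORKERS
# Print "worker_id first_block last_block" for each worker, dividing a scan of
# TOTAL_BLOCKS blocks (numbered from 0) as evenly as possible.
# Illustrative only: a real engine may assign block ranges dynamically.
split_blocks() {
  total=$1
  workers=$2
  base=$((total / workers))    # blocks every worker gets
  extra=$((total % workers))   # the first $extra workers get one more block
  start=0
  w=0
  while [ "$w" -lt "$workers" ]; do
    size=$base
    if [ "$w" -lt "$extra" ]; then
      size=$((size + 1))
    fi
    if [ "$size" -gt 0 ]; then
      echo "$w $start $((start + size - 1))"
    fi
    start=$((start + size))
    w=$((w + 1))
  done
}
```

For example, `split_blocks 10 4` assigns blocks 0-2, 3-5, 6-7, and 8-9 to workers 0 through 3. Adding workers shortens each range, which is how additional CPUs could reduce the time of a single query.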

Documents

Quick Start

We provide three ways to use PolarDB databases: Alibaba Cloud services, building an instance on local storage, and building an instance on PFS shared storage.

Alibaba Cloud Services

Visit the Alibaba Cloud PolarDB official website here

Build Local Storage Instance

We provide a one-click deployment script that compiles the PolarDB kernel and builds a local instance. This part describes how to use the script to quickly build a PolarDB instance backed by local disk storage.

Operating System Requirements: CentOS 7.5 or later (the following steps have been tested on CentOS 7.5).

Note: Use the same user to perform the following steps. Do not use the root user to build the instance.
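A small guard can enforce the non-root rule before any build command runs. This is a minimal sketch; the function name check_build_user is hypothetical and is not part of the PolarDB scripts.

```shell
# check_build_user UID: succeed only when UID is not root (0).
# Hypothetical helper, not shipped with PolarDB.
check_build_user() {
  if [ "$1" -eq 0 ]; then
    echo "Do not build PolarDB as root; switch to a regular user." >&2
    return 1
  fi
  echo "Building as UID $1"
}

# Usage (run from the source directory):
#   check_build_user "$(id -u)" && ./polardb_build.sh
```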

1.  Download the PolarDB source code here

2.  Install related dependencies:

sudo yum install readline-devel zlib-devel perl-CPAN bison flex
sudo cpan -fi Test::More IPC::Run

3.  Choose the command that matches your build scenario:

  • Compile the source code of the database without creating a local instance:
./polardb_build.sh --noinit
  • Compile and create a local single-node instance. The node is the primary node (port 5432):
./polardb_build.sh
  • Compile and create a local multi-node instance. The nodes include:
  1. One primary node (port 5432)
  2. One read-only node (port 5433)
./polardb_build.sh --withrep --repnum=1
  • Compile and create a local multi-node instance. The nodes include:
  1. One primary node (port 5432)
  2. One read-only node (port 5433)
  3. One standby node (port 5434)
./polardb_build.sh --withrep --repnum=1 --withstandby
  • Compile and create a local multi-node instance. The nodes include:
  1. One primary node (port 5432)
  2. Two read-only nodes (ports 5433 and 5434 respectively)
  3. One standby node (port 5435)
./polardb_build.sh --withrep --repnum=2 --withstandby

4.  After the deployment is complete, check and test the instance to verify that it was deployed correctly.

Instance check:

$HOME/tmp_basedir_polardb_pg_1100_bld/bin/psql -p 5432 -c 'select version();'
$HOME/tmp_basedir_polardb_pg_1100_bld/bin/psql -p 5432 -c 'select * from pg_replication_slots;'

Perform a full regression test with one click:

./polardb_build.sh --withrep --repnum=1 --withstandby -r-check-all -e -r-contrib -r-pl -r-external -r-installcheck-all
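The instance check can be repeated on every node of a multi-node deployment. The sketch below assumes that ports follow the pattern in the examples above (primary on 5432, read-only nodes on the next ports, standby last) and that this pattern holds for other values of --repnum. The helper names expected_ports and check_instance are hypothetical; only the psql path and query come from the commands above.

```shell
# psql binary used by the instance check commands above.
PSQL="$HOME/tmp_basedir_polardb_pg_1100_bld/bin/psql"

# expected_ports REPNUM [yes|no]
# Print "role port" for each node. Assumes ports are assigned consecutively
# from 5432, as in the examples above (hypothetical helper).
expected_ports() {
  repnum=${1:-0}
  standby=${2:-no}
  port=5432
  echo "primary $port"
  i=1
  while [ "$i" -le "$repnum" ]; do
    port=$((port + 1))
    echo "read-only $port"
    i=$((i + 1))
  done
  if [ "$standby" = "yes" ]; then
    port=$((port + 1))
    echo "standby $port"
  fi
}

# check_instance REPNUM [yes|no]
# Run the version query against every expected node and report failures.
check_instance() {
  expected_ports "$@" | while read -r role port; do
    echo "== $role node on port $port =="
    "$PSQL" -p "$port" -c 'select version();' ||
      echo "$role node on port $port is not reachable" >&2
  done
}

# Usage, for a build made with --withrep --repnum=1 --withstandby:
#   check_instance 1 yes
```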

Software License Description

The released PolarDB code is licensed under the PostgreSQL license and Apache License 2.0. Please see License and NOTICE for the related license terms.

Acknowledgments

Some code and design ideas are based on other open-source projects, such as Postgres-XC and Postgres-XL (pgxc_ctl), TBase (part of timestamp-based vacuum and MVCC), Greenplum, and Citus (pg_cron). We thank the contributors to these open-source projects.

Contact Us
