This topic provides an overview of the specifications supported by AnalyticDB for PostgreSQL so that you can determine the specifications of your instance.
AnalyticDB for PostgreSQL offers two storage types:
- High-performance SSD: provides good I/O capabilities and high analysis performance.
- High-capacity HDD: provides large storage capacity at a low cost.
|Storage type|Number of cores per node|Memory|Available storage space|Description|
|---|---|---|---|---|
|High-performance SSD|1|8 GB|80 GB|Recommended for low-concurrency scenarios that require fewer than 5 concurrent queries and fewer than 32 nodes. Available for 2 to 128 nodes.|
|High-performance SSD|4|32 GB|320 GB|Recommended specifications for high-performance SSD storage. Available for 8 to 4,096 nodes.|
|High-capacity HDD|2|16 GB|1 TB|Recommended for low-concurrency scenarios that require fewer than 5 concurrent queries and fewer than 8 nodes. Available for 4 to 32 nodes.|
|High-capacity HDD|4|32 GB|2 TB|Recommended specifications for high-capacity HDD storage. Available for 8 to 4,096 nodes.|
An instance consists of multiple nodes. A single instance can have up to 4,096 nodes. In the Massively Parallel Processing (MPP) architecture, each node is a partition used to store and process a portion of data on the instance. You can add nodes to increase the storage capacity and maintain a stable query response time.
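The specification table can be captured as a small lookup structure, which makes it easy to check a planned node count against the allowed range and to compute the total storage of an instance. The following Python sketch is illustrative only: `NodeSpec`, `SPECS`, and `total_storage_gb` are hypothetical names, not part of any AnalyticDB for PostgreSQL API, and 1 TB is assumed to equal 1,024 GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    storage_type: str  # "SSD" or "HDD"
    cores: int         # cores per node
    memory_gb: int     # memory per node
    storage_gb: int    # available storage space per node
    min_nodes: int     # smallest node count offered for this specification
    max_nodes: int     # largest node count offered for this specification


# Values copied from the specification table.
# Assumption: 1 TB is treated as 1024 GB.
SPECS = {
    ("SSD", 1): NodeSpec("SSD", 1, 8, 80, 2, 128),
    ("SSD", 4): NodeSpec("SSD", 4, 32, 320, 8, 4096),
    ("HDD", 2): NodeSpec("HDD", 2, 16, 1024, 4, 32),
    ("HDD", 4): NodeSpec("HDD", 4, 32, 2048, 8, 4096),
}


def total_storage_gb(storage_type: str, cores: int, nodes: int) -> int:
    """Return the total available storage of an instance in GB."""
    spec = SPECS[(storage_type, cores)]
    if not spec.min_nodes <= nodes <= spec.max_nodes:
        raise ValueError(
            f"{storage_type} with {cores} core(s) per node is available "
            f"for {spec.min_nodes} to {spec.max_nodes} nodes, got {nodes}"
        )
    return spec.storage_gb * nodes
```

For instance, `total_storage_gb("SSD", 4, 32)` multiplies 320 GB by 32 nodes, while a request for 200 one-core SSD nodes raises `ValueError` because that specification stops at 128 nodes.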
Principles of selecting instance specifications
When you create an AnalyticDB for PostgreSQL instance or upgrade its specifications, you must configure Storage Type, Node Cores, and Node Num. AnalyticDB for PostgreSQL can also store data in Object Storage Service (OSS) through external tables. To reduce storage costs, you can use gzip to compress data that is not needed for real-time computing and then upload it to OSS buckets.
- Storage type
- If high performance is your primary concern, we recommend that you choose the SSD storage type.
- If large storage capacity is your primary concern, we recommend that you choose the HDD storage type.
- The number of cores per node
Each node stores and processes data from a partition of each user table. We recommend that you configure four cores for each node. The SSD configuration that supports one core per node is only suitable if the instance has at most 32 nodes and only processes a few concurrent queries. The HDD configuration that supports two cores per node is only suitable if the instance has at most eight nodes and only processes a few concurrent queries.
- The number of nodes
AnalyticDB for PostgreSQL uses the MPP architecture, in which the data processing capability of an instance increases linearly with its number of nodes. By adding nodes as the data volume grows, you can keep the query response time stable. Determine the number of nodes that the instance needs based on your business scenario and the volume of raw data.
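The guidance on cores per node can be read as a simple decision rule. The sketch below is one possible interpretation, not an official sizing tool: `smallest_suitable_cores` is a hypothetical helper, "a few concurrent queries" is taken to mean fewer than 5 (the threshold used in the specification table), and the node limits follow the "at most 32" and "at most eight" wording. It does not check the node ranges for which each specification is actually offered.

```python
def smallest_suitable_cores(storage_type: str, nodes: int,
                            concurrent_queries: int) -> int:
    """Return the smallest cores-per-node option that the guidance allows.

    Four cores per node is the general recommendation; the smaller
    options are only suitable for small, low-concurrency instances.
    """
    # Assumption: "a few concurrent queries" means fewer than 5.
    low_concurrency = concurrent_queries < 5

    if storage_type == "SSD" and low_concurrency and nodes <= 32:
        return 1  # one-core SSD nodes
    if storage_type == "HDD" and low_concurrency and nodes <= 8:
        return 2  # two-core HDD nodes
    return 4      # recommended default for all other cases
```

Even when the smaller option is suitable, four cores per node remains the recommended configuration, so the return value is a lower bound rather than a recommendation.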
Row-oriented storage and column-oriented storage
AnalyticDB for PostgreSQL supports two storage models: row-oriented storage and column-oriented storage. You can specify a storage model when you create a table.
- If you want to write data in real time or update data frequently by executing INSERT, UPDATE, and DELETE statements, we recommend that you choose row-oriented storage.
If you choose row-oriented storage, 1 TB of raw data requires approximately 1 TB of storage space. However, the indexes, logs, and temporary files generated during computing also occupy storage space. Therefore, we recommend that you reserve 2 TB of storage space for every 1 TB of raw data. To improve query performance, you can add nodes to increase available CPU and memory resources.
- In batch extract, transform, and load (ETL) scenarios, we recommend that you choose column-oriented storage because data is rarely updated by executing UPDATE and DELETE statements and most queries require aggregations and joins of table data based on a few columns.
Column-oriented storage supports compression ratios from 1:2 to 1:5. For example, 1 TB of raw data is reduced to 0.5 TB or less after compression, which means that you only need to reserve 1 TB of storage space for every 1 TB of raw data.
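The two rules of thumb above reduce to simple arithmetic. The following Python sketch shows one way to express them; `reserved_storage_tb` and `compressed_size_range_tb` are hypothetical helper names, and the multipliers are taken directly from the guidance (2 TB reserved per 1 TB of raw data for row-oriented storage, 1 TB per 1 TB for column-oriented storage).

```python
def reserved_storage_tb(raw_tb: float, orientation: str) -> float:
    """Return the storage space to reserve, in TB, for a volume of raw data."""
    if orientation == "row":
        # About 1 TB on disk per 1 TB of raw data, doubled to leave room
        # for indexes, logs, and temporary files.
        return raw_tb * 2.0
    if orientation == "column":
        # Compressed data takes 0.5 TB or less per 1 TB of raw data,
        # so reserving 1 TB per 1 TB of raw data is enough.
        return raw_tb * 1.0
    raise ValueError(f"unknown orientation: {orientation!r}")


def compressed_size_range_tb(raw_tb: float) -> tuple[float, float]:
    """Return the (smallest, largest) compressed size for ratios 1:5 to 1:2."""
    return (raw_tb / 5, raw_tb / 2)
```

Actual compression depends on the data and the compression settings, so these figures are planning estimates rather than guarantees.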
If you want to process 5 TB of raw data with high performance and serve more than 100 concurrent queries, we recommend that you choose the SSD storage type with 4 cores per node and 32 nodes per instance. This configuration provides a total of 10 TB of storage space for user data.
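The 10 TB figure follows from the per-node storage of the four-core SSD specification: 32 nodes at 320 GB each. It also equals twice the 5 TB of raw data, which is consistent with the row-oriented reservation guideline. A minimal Python sketch of the calculation, assuming 1 TB equals 1,024 GB and using the hypothetical helper `nodes_for_storage`:

```python
import math

NODE_STORAGE_GB = 320  # four-core SSD node, from the specification table
GB_PER_TB = 1024       # assumption: binary terabytes


def nodes_for_storage(required_tb: float) -> int:
    """Return the number of four-core SSD nodes needed for the given storage."""
    return math.ceil(required_tb * GB_PER_TB / NODE_STORAGE_GB)


# Node count needed to provide 10 TB, and the capacity those nodes offer.
nodes = nodes_for_storage(10)
capacity_tb = nodes * NODE_STORAGE_GB / GB_PER_TB
```

Here `nodes` evaluates to 32 and `capacity_tb` to 10.0, matching the configuration in the example.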