After numerous iterations and updates, the elastic storage mode is now the major resource type for AnalyticDB for PostgreSQL instances. When you purchase instances in elastic storage mode, you must select node specifications and specify a storage capacity. When your storage capacity is insufficient, you can add nodes or increase the storage capacity of each node to raise the total storage capacity.
AnalyticDB for PostgreSQL allows you to create instances in elastic storage mode with ease by specifying the following parameters.
- Engine version
AnalyticDB for PostgreSQL provides the following engine version:
- 6.0 Standard Edition: delivers a standard engine that is developed based on the open source Greenplum Database and applicable to general business scenarios.
- Node specifications
AnalyticDB for PostgreSQL uses a massively parallel processing (MPP) architecture. Each AnalyticDB for PostgreSQL instance consists of one or more coordinator nodes and multiple compute nodes. Coordinator nodes receive SQL requests, distribute routes, and process result sets. Compute nodes execute SQL queries and store data. Each compute node processes and stores data in a table partition. Each table partition is a data partition in the MPP architecture.
When you create or upgrade an instance, you must select the coordinator nodes, compute node specifications, compute nodes, and storage type.
- Coordinator nodes: By default, one coordinator node is selected. You can also select more coordinator nodes to receive more requests. If you create an instance that has multiple coordinator nodes, you must use an instance endpoint to connect to the instance. Up to 16 coordinator nodes can be selected.
- Compute node specifications:
- 2 cores and 16 GB memory per node, applicable to low-concurrency scenarios.
- 4 cores and 32 GB memory per node, applicable to medium-concurrency scenarios. We recommend that you select this option.
- 8 cores and 64 GB memory per node, applicable to high-concurrency scenarios.
- 16 cores and 128 GB memory per node, applicable to extremely high-concurrency scenarios.
- Compute nodes: The MPP architecture enables the data processing capability of an instance to increase linearly with its number of compute nodes. As a result, the query response time can remain the same even as the data volume increases, provided that compute nodes are added accordingly. You can determine the number of compute nodes based on your business scenario and the volume of raw data.
- Storage type:
- Standard SSD or enhanced SSD (ESSD): delivers better I/O capabilities and higher analysis performance in performance-first scenarios. Standard SSDs with 1 core per node or ESSDs with 2 cores per node are applicable only to instances that have up to 32 compute nodes in low-concurrency scenarios.

| Storage type | Number of cores per node | Memory | Storage | Recommended scenario |
| --- | --- | --- | --- | --- |
| ESSD | 2 | 16 GB | 50 GB to 1 TB | Low-concurrency scenarios, where an instance has 4 to 128 compute nodes. |
| ESSD | 4 | 32 GB | 50 GB to 2 TB | High-concurrency scenarios, where an instance has 4 to 128 compute nodes. |

- HDD or ultra disk: delivers larger and more cost-effective storage to meet the higher storage requirements in storage-first scenarios. HDDs or ultra disks with 2 cores per node are applicable only to instances that have up to 8 compute nodes in low-concurrency scenarios.

| Storage type | Number of cores per node | Memory | Storage | Recommended scenario |
| --- | --- | --- | --- | --- |
| High-capacity ultra disk | 2 | 16 GB | 50 GB to 3 TB | Low-concurrency scenarios, where an instance has 4 to 128 compute nodes. |
| High-capacity ultra disk | 4 | 32 GB | 50 GB to 4 TB | Medium-concurrency scenarios, where an instance has 4 to 128 compute nodes. |

- We recommend that you select 8 GB of memory and 80 GB of storage for each core. To ensure better instance performance and a higher cache hit ratio, we recommend that you select a compute node storage capacity that is at most 20 times the memory capacity.
- Data encryption:
- In scenarios that require data security compliance, you must select the encryption type. By default, disks are not encrypted. Disk encryption can be enabled only when instances are being created. After instances are created, disk encryption cannot be enabled or disabled. To enable the disk encryption feature, you must set the storage type to ESSD or ultra disk when you create an instance. After disk encryption is enabled on an instance, snapshots created on the instance and instances created from these snapshots also inherit the disk encryption feature.
- The disk encryption feature reduces the I/O performance of storage disks, and further increases the response time and decreases the processing capabilities of the entire instance.
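The per-core guidance given under the storage type (8 GB of memory and 80 GB of storage for each core, with node storage at most 20 times the memory) can be expressed as a small calculation. The following is a minimal sketch only: the constants are the values quoted above, while the function and constant names are hypothetical and not part of any AnalyticDB for PostgreSQL API.

```python
# Guidance values quoted from this topic. All names are illustrative assumptions.
MEMORY_GB_PER_CORE = 8             # recommended memory per core
STORAGE_GB_PER_CORE = 80           # recommended storage per core
MAX_STORAGE_TO_MEMORY_RATIO = 20   # node storage should be at most 20x memory


def recommended_node_sizing(cores: int) -> dict:
    """Return the recommended memory and storage for a compute node with `cores` cores."""
    memory_gb = cores * MEMORY_GB_PER_CORE
    return {
        "memory_gb": memory_gb,
        "recommended_storage_gb": cores * STORAGE_GB_PER_CORE,
        "max_storage_gb": memory_gb * MAX_STORAGE_TO_MEMORY_RATIO,
    }


def storage_within_guidance(memory_gb: int, storage_gb: int) -> bool:
    """Check whether node storage stays within 20 times the node memory."""
    return storage_gb <= memory_gb * MAX_STORAGE_TO_MEMORY_RATIO


if __name__ == "__main__":
    # Print the guidance for each compute node specification listed in this topic.
    for cores in (2, 4, 8, 16):
        print(cores, recommended_node_sizing(cores))
```

For a node with 4 cores and 32 GB of memory, this sketch suggests 320 GB of storage and flags any storage capacity above 640 GB as exceeding the recommended ratio.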
Instance specification selection
In high-performance analysis scenarios, if more than 100 concurrent queries are performed on 5 TB of raw data, we recommend that you set the storage type to standard SSD and the compute node specifications to 4 cores and 32 GB memory. To keep the storage usage less than 80%, you can create an instance that has 32 compute nodes with 200 GB storage per node. AnalyticDB for PostgreSQL uses the primary/secondary high-availability architecture to provide dual replicas.
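The arithmetic behind this example can be checked with a short calculation. This is a sketch under stated assumptions: 1 TB is treated as 1,000 GB, the stored volume is assumed to equal the raw data volume, and the function names are hypothetical helpers rather than product features.

```python
def storage_usage(raw_data_gb: int, compute_nodes: int, storage_per_node_gb: int) -> float:
    """Return the fraction of total storage that the raw data would occupy."""
    return raw_data_gb / (compute_nodes * storage_per_node_gb)


def min_compute_nodes(raw_data_gb: int, storage_per_node_gb: int,
                      max_usage_percent: int = 80) -> int:
    """Return the smallest node count that keeps usage strictly below the threshold.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return (raw_data_gb * 100) // (storage_per_node_gb * max_usage_percent) + 1


if __name__ == "__main__":
    raw_gb = 5000  # 5 TB of raw data, assuming 1 TB = 1,000 GB
    nodes = min_compute_nodes(raw_gb, storage_per_node_gb=200)
    print(nodes)                              # number of compute nodes needed
    print(storage_usage(raw_gb, nodes, 200))  # resulting storage usage as a fraction
```

With 200 GB of storage per node, 32 compute nodes provide 6,400 GB in total, so 5,000 GB of raw data occupies about 78% of the storage, which is below the 80% target.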
Row store and column store
AnalyticDB for PostgreSQL supports two storage models: row store and column store. Each storage model has its own advantages and disadvantages in terms of performance and storage in different scenarios. When you create a table, you can specify one of these storage models.
- In online transaction processing (OLTP) scenarios where large numbers of INSERT, UPDATE, or DELETE operations must be performed, you can use row store. If you need to perform both OLTP and online analytical processing (OLAP) operations, you can use partitioned tables. For example, you can partition tables by time and keep the number of partitions at or below 200 to maintain SQL performance. If you need to query large amounts of detailed data, you can also use row store. If you choose row store, 1 TB of raw data requires about 1 TB of storage. However, the indexes, logs, and temporary files generated during computing also occupy storage. Therefore, we recommend that you reserve 1 TB of extra storage for every 1 TB of raw data. To improve query performance, you can add nodes to increase available CPU and memory resources.
- In OLAP scenarios where frequent data statistics must be performed, you can use column store. For example, in batch processing extract, transform, load (ETL) scenarios where data is stored in batches and only a small amount of data is updated or deleted, you may need to perform aggregate queries on several columns of a table. If you need a higher compression ratio, you can also use column store. Column store provides a compression ratio in the range of 1:5 to 1:2. For example, if 1 TB of raw data is reduced to 0.5 TB or less after compression in column store mode, you need to select only 1 TB of storage when you create an instance.
- AnalyticDB for PostgreSQL supports data storage to Object Storage Service (OSS) foreign tables. You can use the gzip utility to compress data that is not needed for real-time computing and then upload the data to OSS buckets to reduce storage costs.
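The storage estimates for row store and column store described above can be sketched as follows. The row store rule (reserve 1 TB of extra storage for every 1 TB of raw data) comes from this topic. Applying the same one-to-one reserve to compressed column store data is an assumption inferred from the 0.5 TB to 1 TB example, and the function name and the `compressed_fraction` parameter are hypothetical.

```python
RESERVE_FACTOR = 2  # stored data plus an equal amount reserved for indexes, logs, and temporary files


def estimate_storage_tb(raw_tb: float, storage_model: str,
                        compressed_fraction: float = 0.5) -> float:
    """Estimate the storage to select, in TB, for a given volume of raw data.

    storage_model: "row" or "column".
    compressed_fraction: size after column store compression relative to the raw
        size, for example 0.5 for a 1:2 ratio or 0.2 for a 1:5 ratio.
    """
    if storage_model == "row":
        stored_tb = raw_tb  # row store: 1 TB of raw data needs about 1 TB
    elif storage_model == "column":
        if not 0 < compressed_fraction <= 1:
            raise ValueError("compressed_fraction must be in the range (0, 1]")
        stored_tb = raw_tb * compressed_fraction
    else:
        raise ValueError("storage_model must be 'row' or 'column'")
    return stored_tb * RESERVE_FACTOR


if __name__ == "__main__":
    print(estimate_storage_tb(1, "row"))     # row store estimate for 1 TB of raw data
    print(estimate_storage_tb(1, "column"))  # column store estimate at a 1:2 ratio
```

Actual compression depends on the data and the compression settings, so treat the result as a starting point for capacity planning rather than a guaranteed figure.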