AnalyticDB for PostgreSQL: Instance specification selection and planning

Last Updated: Apr 22, 2024

After numerous iterations and updates, the elastic storage mode and the Serverless mode have become the major resource types of AnalyticDB for PostgreSQL instances. This topic describes how to select the specifications and storage capacity to ensure a better service experience when you purchase an instance.

To simplify specification selection, you only need to configure the following parameters when you create an AnalyticDB for PostgreSQL instance in elastic storage mode or Serverless mode.

Instance resources

  • Engine version

    The following engine versions are available in elastic storage mode:

    • 6.0 Standard Edition: provides a standard engine that is developed based on the open source Greenplum Database and is suitable for general business scenarios.

    • 7.0 Standard Edition: provides a standard engine that is built based on PostgreSQL 12 and is more competitive in terms of functionality, performance, enterprise-class capabilities, and security.

  • Node specifications

    AnalyticDB for PostgreSQL uses a massively parallel processing (MPP) architecture. Each AnalyticDB for PostgreSQL instance consists of one or more coordinator nodes and multiple compute nodes. Coordinator nodes receive SQL requests, route them to compute nodes, and process result sets. Compute nodes execute SQL queries and store data. Each compute node stores and processes its own table partition, which serves as a data partition in the MPP architecture.

    When you create or upgrade an instance, you must configure the Coordinator Node Resources, Compute Node Specifications, Compute Nodes, and Disk Storage Type parameters.

    • Coordinator Node Resources: We recommend that you set the number of compute units (CUs) for coordinator node resources equal to the number of CPU cores per compute node.

      If you require more coordinator node resources, you can add resources after you create the instance. For more information, see Manage coordinator node resources.

    • Compute Node Specifications:

      • 2 cores and 16 GB memory per node, which is suitable for low-concurrency scenarios.

      • 4 cores and 32 GB memory per node, which is suitable for medium-concurrency scenarios. We recommend that you select this option.

      • 8 cores and 64 GB memory per node, which is suitable for high-concurrency scenarios.

      • 16 cores and 128 GB memory per node, which is suitable for extremely high-concurrency scenarios.

    • Compute Nodes: In the MPP architecture, the data processing capability of an instance increases linearly with the number of compute nodes. As a result, if you add compute nodes as the data volume grows, the query response time can stay the same. Determine the number of compute nodes based on your business scenario and the volume of raw data.

    • Disk Storage Type:

      • Enhanced SSD (ESSD): delivers better I/O capabilities and higher analysis performance in performance-first scenarios.

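        Storage type | Number of cores per node | Memory | Storage       | Recommended scenario
        -------------+--------------------------+--------+---------------+----------------------------------------------------------------------------
        ESSD         | 2                        | 16 GB  | 50 GB to 1 TB | Low-concurrency scenarios, in which an instance has 4 to 128 compute nodes.
        ESSD         | 4                        | 32 GB  | 50 GB to 2 TB | High-concurrency scenarios, in which an instance has 4 to 128 compute nodes.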

      • The recommended ratio of memory to disk capacity is 8:80 per CPU core, that is, 8 GB of memory for every 80 GB of disk capacity. To achieve a high cache hit ratio and ensure the overall performance of the instance, we recommend that you keep the ratio of data volume to memory on each compute node at 20 or less.

    • Encryption Type:

      • In scenarios that require data security compliance, select an encryption type. By default, disks are not encrypted. Disk encryption can be enabled only when you create an instance. After the instance is created, disk encryption cannot be enabled or disabled. To enable disk encryption, set the storage type to ESSD at a specific performance level when you create the instance. After disk encryption is enabled on an instance, snapshots of the instance and instances created from these snapshots inherit the encryption setting.

      • The disk encryption feature reduces the I/O performance of storage disks, which in turn can affect the response time and processing capabilities of the instance.
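The per-node sizing guidelines above can be checked with a few lines of arithmetic. The following is a minimal sketch, not an official tool: the function and constant names are hypothetical, and it assumes that the 8:80 ratio means 8 GB of memory to 80 GB of disk per CPU core.

```python
# Illustrative sizing check. All names are hypothetical and are not part of
# any AnalyticDB for PostgreSQL API.

MEMORY_GB_PER_CORE = 8          # matches the 2-core/16 GB and 4-core/32 GB specifications
DISK_GB_PER_CORE = 80           # assumption: "8:80" means 8 GB memory to 80 GB disk per core
MAX_DATA_TO_MEMORY_RATIO = 20   # recommended upper bound for each compute node


def recommended_disk_gb(cores_per_node: int) -> int:
    """Disk capacity per node that follows the 8:80 memory-to-disk guideline."""
    return cores_per_node * DISK_GB_PER_CORE


def data_to_memory_ratio(data_gb_per_node: float, cores_per_node: int) -> float:
    """Ratio of the data stored on a node to the memory of that node."""
    memory_gb = cores_per_node * MEMORY_GB_PER_CORE
    return data_gb_per_node / memory_gb


def within_cache_guideline(data_gb_per_node: float, cores_per_node: int) -> bool:
    """True if the node stays at or below the recommended data-to-memory ratio."""
    return data_to_memory_ratio(data_gb_per_node, cores_per_node) <= MAX_DATA_TO_MEMORY_RATIO


if __name__ == "__main__":
    cores = 4
    print("Recommended disk per node (GB):", recommended_disk_gb(cores))
    print("Data-to-memory ratio at 160 GB per node:", data_to_memory_ratio(160, cores))
    print("Within guideline at 800 GB per node:", within_cache_guideline(800, cores))
```

Under these assumptions, a node with 4 cores and 32 GB of memory can hold up to 640 GB of data before the ratio exceeds 20.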

Instance specification selection

In high-performance analysis scenarios, if more than 100 concurrent queries are performed on 5 TB of raw data, we recommend that you set the storage type to ESSD and the compute node specifications to 4 cores and 32 GB memory. To keep the storage usage below 80%, you can create an instance that has 32 compute nodes and 200 GB storage per node. This provides 6.4 TB of total storage, of which 5 TB of raw data occupies approximately 78%. AnalyticDB for PostgreSQL uses the primary/secondary high-availability architecture to provide dual replicas.
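The storage arithmetic behind this recommendation can be sketched as follows. This is an illustrative calculation only: the helper names are hypothetical, 1 TB is treated as 1,000 GB, and the raw data is assumed to be stored without compression or additional overhead.

```python
import math

# Illustrative calculation. Helper names are hypothetical.
# Assumptions: 1 TB = 1,000 GB; raw data is stored uncompressed with no overhead.


def storage_usage(raw_data_gb: float, nodes: int, storage_gb_per_node: float) -> float:
    """Fraction of the total purchased storage that the raw data occupies."""
    return raw_data_gb / (nodes * storage_gb_per_node)


def min_storage_per_node_gb(raw_data_gb: float, nodes: int, max_usage: float = 0.8) -> int:
    """Smallest whole-GB storage size per node that keeps usage at or below max_usage."""
    return math.ceil(raw_data_gb / (nodes * max_usage))


if __name__ == "__main__":
    usage = storage_usage(raw_data_gb=5000, nodes=32, storage_gb_per_node=200)
    print(f"Storage usage with 32 nodes x 200 GB: {usage:.1%}")
    print("Minimum storage per node (GB):", min_storage_per_node_gb(5000, 32))
```

Rounding the per-node minimum up to 200 GB leaves a small margin below the 80% threshold.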

Row-oriented storage and column-oriented storage

AnalyticDB for PostgreSQL supports two storage models: row-oriented storage and column-oriented storage. These storage models each have advantages and disadvantages in terms of performance and storage in different scenarios. When you create a table, you can specify one of the storage models.

  • In online transaction processing (OLTP) scenarios that involve large numbers of INSERT, UPDATE, or DELETE operations, use row-oriented storage. Row-oriented storage is also suitable if you want to query large amounts of detailed data. If you need to perform both OLTP and online analytical processing (OLAP) operations, you can use partitioned tables. For example, you can partition tables by time and keep the number of partitions at 200 or fewer to preserve SQL performance. With row-oriented storage, 1 TB of raw data requires approximately 1 TB of storage. Indexes, logs, and temporary files generated during computing also occupy storage, so we recommend that you reserve 1 TB of extra storage for every 1 TB of raw data. To improve query performance, you can add nodes to increase available CPU and memory resources.

  • In OLAP scenarios in which statistics are frequently computed over the data, use column-oriented storage. A typical example is batch extract-transform-load (ETL) processing, in which data is stored in batches, only a small amount of data is updated or deleted, and aggregate queries are performed on several columns of a table. Column-oriented storage is also suitable if you require a higher compression ratio. It provides a compression ratio in the range of 1:5 to 1:2, which means that compressed data occupies one-fifth to one-half of the raw data size. For example, if 1 TB of raw data is reduced to 0.5 TB or less after compression, you need to select only 1 TB of storage when you create an instance.

  • AnalyticDB for PostgreSQL supports Object Storage Service (OSS) foreign tables for storing data in OSS. To reduce storage costs, you can use the gzip utility to compress data that is not required for real-time computing and then upload the data to OSS buckets.
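The storage estimates for the two storage models can be compared with a short calculation. This is a rough sketch under stated assumptions: the function names are hypothetical, and the column-oriented estimate assumes the same one-to-one reserve for indexes, logs, and temporary files that is recommended for row-oriented storage, which matches the 1 TB example above but is not stated as a general rule.

```python
# Rough storage estimates per storage model. Function names are hypothetical.


def row_storage_tb(raw_tb: float) -> float:
    """Row-oriented: about 1 TB per 1 TB of raw data, plus an equal reserve."""
    return raw_tb * 2


def column_storage_tb(raw_tb: float, compressed_fraction: float = 0.5) -> float:
    """Column-oriented: compressed size plus an equal reserve.

    compressed_fraction is the compressed size divided by the raw size,
    for example 0.2 for a 1:5 ratio and 0.5 for a 1:2 ratio.
    The equal reserve is an assumption carried over from the row-oriented guidance.
    """
    if not 0 < compressed_fraction <= 1:
        raise ValueError("compressed_fraction must be in the range (0, 1]")
    return raw_tb * compressed_fraction * 2


if __name__ == "__main__":
    print("Row-oriented, 1 TB raw:", row_storage_tb(1), "TB")
    print("Column-oriented, 1 TB raw at 1:2:", column_storage_tb(1, 0.5), "TB")
    print("Column-oriented, 1 TB raw at 1:5:", column_storage_tb(1, 0.2), "TB")
```

Actual compression depends on the data and the compression settings, so treat these figures as a starting point for capacity planning.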
