This topic describes some development and O&M recommendations that can help you increase the security compliance, stability, and performance of your ApsaraDB RDS for PostgreSQL instance.

Design

Table design
  • We recommend that the data types defined for the fields of the schema in your RDS instance be consistent with those defined in your application. We also recommend that the same rules be used to check fields for all tables. This way, you can prevent errors and the inability to use indexes.
  • If a partitioned table has a primary key column, we recommend that a unique sequence be created on the primary key column of the partitioned table.
  • If you need to delete historical data on a regular basis, we recommend that you partition tables by time. We also recommend that you execute the DROP or TRUNCATE statement on tables to delete data. We recommend that you do not execute the DELETE statement on tables to delete data.
  • If you need to frequently update a table, we recommend that you set the FILLFACTOR parameter of the table to 85 and reserve 15% of the available storage space when you create the table. The reserved storage space is used for updates of hot data in the table.
    CREATE TABLE test123(id int, info text) WITH(FILLFACTOR=85);  
  • We recommend that the names of temporary tables start with tmp_. We also recommend that the names of child tables end with the rule based on which the parent table of the child tables is partitioned. For example, if the name of a parent table that is partitioned by year starts with tbl, the names of the child tables of the parent table also start with tbl. Examples: tbl_2016 and tbl_2017.
Index design
  • A B-tree index contains fields of up to 2,000 bytes. If an index contains more than 2,000 bytes, a new index is required. We recommend that you create a function index, such as a hash index. Otherwise, we recommend that you use an analyzer to analyze the data before an index is created on the data.
  • Some data, such as streaming data, time fields, and auto-increment fields, may be stored in a linear order. In most cases, range queries are run to query this data. We recommend that you create BRIN indexes to reduce the size per index and speed up data insertion.
    CREATE INDEX idx ON tbl using BRIN(id);
  • We recommend that you do not run full table scans, except when you need to scan and analyze a large amount of data. ApsaraDB RDS for PostgreSQL supports indexes of almost all data types.

    The following types of indexes are supported: B-tree, Hash, GIN, GiST, SP-GiST, BRIN, RUM, Bloom, and PASE. Among these types of indexes, RUN, Bloom, and PASE are extended indexes.

  • We recommend that the names of primary key indexes start with pk_, the names of unique indexes start with uk_, and the names of regular indexes start with idx_.
Data type design and character set design
  • We recommend that you choose an appropriate data type for the data you want to write. If you want to write numeric data or the data that you want to write can be stored in tree structures, we recommend that you do not choose the string data type.

    An appropriate data type increases query efficiency.

    ApsaraDB RDS for PostgreSQL supports the following data types: Numeric, Floating-Point, Monetary, String, Character, Binary, Date/Time, Boolean, Enumerated, Geometric, Network Address, Bit String, Text Search, UUID, XML, JSON, Array, Composite, Range, Object identifier, row number, large object, ltree structure, Data Cube, geography, H-Store, pg_trgm module, and PostGIS. PostGIS includes data types such as point, line segment, surface, path, latitude, longitude, raster, and topology.

  • We recommend that all characters be stored and represented in the UTF-8 format.
  • We recommend that you do not use Chinese characters in COMMENT. If you use Chinese characters in COMMENT, input encoding is inconsistent with output encoding and the readability of data deteriorates.
  • When you use the pg_dump plug-in to create a logical backup for your RDS instance, we recommend that you use the same encoding that is used to encode COMMENT. This prevents garbled characters.

Data query

  • If you use the AS keyword in an alias in SELECT statements, we recommend that the alias contain lowercase letters, underscores (_), and digits.
  • We recommend that you do not use COUNT (column_name) or COUNT (constants). We recommend that you use COUNT(*). COUNT(*) is a standard function that is defined in SQL-92 to count the number of rows. COUNT(*) counts in NULL values, but COUNT (column_name) does not count in NULL values.
  • If COUNT (DISTINCT) is used, the names of the multiple columns that you want to specify must be enclosed in a pair of parenthesis (()). Example: COUNT( (col1,col2,col3) ). COUNT (DISTINCT) counts in all NULL values. Therefore, COUNT (DISTINCT) produces the same result as COUNT(*).
  • We recommend that you do not use SELECT * FROM t. Replace the wildcard (*) with an array of fields that you need. This way, ApsaraDB RDS returns only the fields that you specify and does not return fields that you do not need.
  • We recommend that you prevent ApsaraDB RDS from returning large amounts of data to the database client, except for extract, transform, and load (ETL) operations. If the amount of data that is returned for a query is abnormally large, consider whether the execution plan of the query is optimal.
  • If you need to run range queries, we recommend that you use the Range data type and GiST indexes to improve query performance.
  • If your application frequently initiates queries for which a large number of results are returned, we recommend that you aggregate all results for such a query into a result set. For example, if the number of results returned for a query reaches 100, we recommend that you aggregate the 100 results for the query into a result set. In addition, if your application frequently accesses the results in the result set by ID, we recommend that you aggregate the results by ID on a regular basis. A small number of results returned indicates shorter response time.

Instance management

  • Before you commit the transactions that are run to delete or modify data, we recommend that you execute the SELECT statement to confirm the transactions. This helps you prevent accidental operations.
  • If you need to perform DDL operations or other similar operations that may acquire locks on specific objects, we recommend that you configure a lock wait mechanism such as VACUUM FULL and CREATE INDEX. The lock wait mechanism is used to prevent these operations from blocking queries on the locked objects.
    begin;  
    SET local lock_timeout = '10s';  
    -- DDL query;  
    end;
  • You can execute the EXPLAIN ANALYZE statement to view the execution plan of a query. However, if the execution plan of a query involves data changes, you must execute the EXPLAIN ANALYZE statement in the transaction, and roll the transaction back after the statement is executed.
    begin;  
    EXPLAIN ANALYZE query;  
    rollback;
  • We recommend that you associate different database accounts with different applications. We recommend that you do not share the same database account with different applications.
  • We recommend that the name of each database consist of the department name and the feature name or be consistent with the application name. This makes it easy to identify each database.
  • We recommend that you allocate a schema for each of your applications. The name of a schema must be consistent with the name of the user.
  • If you need to delete data or update a large amount of data, we recommend that you break the data into batches and delete or update each batch of data in an independent transaction. We recommend that you do not delete or update all data in one transaction. If you delete or update all data in one transaction, a large amount of junk data is generated. If you need to perform a large number of operations on a table, we recommend that you check the bloat rate of the table before the operations and use the Use pg_repack to clear tablespaces plug-in to reorganize the table after the operations.

Performance and stability

  • When you write the logic of paged query into the code, we recommend that you configure ApsaraDB RDS to return the result of a paged query if the value that is returned by the COUNT () function is 0. This way, ApsaraDB RDS does not need to execute the subsequent statements for the paged query.
  • If you no longer need the cursor feature, we recommend that you disable this feature.
  • If you want to delete data from a table, we recommend that you execute the TRUNCATE statement instead of the DELETE statement on the table. This can improve the performance of your RDS instance.
  • If your application initiates highly concurrent queries, we recommend that you use the PreparedStatement variable in these queries to prevent hard parsing from consuming excessive CPU resources. We recommend that you enable the connection pool feature of your application to prevent the performance of your RDS instance from deteriorating. If your application does not support the connection pool feature, we recommend that you configure a connection pool between your application and your RDS instance. For example, PgBouncer or Pgpool-II can be used as a connection pool.
  • If your data does not need to be persistently stored, we recommend that you use the UNLOGGED table statement to improve the performance of data writes and modifications.
  • We recommend that you do not nest the COUNT(*) function in functions or programs to check whether a data record exists. We recommend that you use the LIMIT 1 clause in statements.
  • Do not frequently create or delete temporary tables. This way, you can reduce the consumption of system table resources.
  • PostgreSQL supports DDL transactions and supports the rollback of DDL statements. We recommend that you encapsulate DDL statements in transactions. You can roll the DDL statements back if necessary. However, you plan an appropriate length for these transactions. If these transactions are long, the read operations on the objects that are being accessed by these transactions may be blocked for a long period of time.
  • We recommend that you prevent deadlocks in your application. For example, you can process the data from a user in one thread. We recommend that you do not process the data from a user across threads or across database sessions.
  • For scenarios with complex networks and high Reaction Time (RT) requirements, if the business logic is lengthy, we recommend that you reduce the number of interactions between your application and your RDS instance. In addition, use the PL/pgSQL language or built-in functions. PL/pgSQL is a procedural programming language that is supported by PostgreSQL and is used to process complex business logic.

    PostgreSQL supports the following common functions and complex functions: analytic functions, aggregate functions, windows functions, mathematical functions, and geometric functions.

  • If a large amount of data needs to be written to your RDS instance, we recommend that you run the copy command or execute the INSERT INTO table VALUES (),(),...(); statement to improve the writing speed.

Monitoring and diagnostics

  • ApsaraDB RDS for PostgreSQL supports the alerting feature. You can configure an alert rule for a metric that you are concerned about. If the rule of the metric is triggered, ApsaraDB RDS notifies all the contacts in the alert contact group by emails and text messages. This feature helps you detect potential issues in a timely manner and improves system stability. For more information, see Configure an alert rule on an ApsaraDB RDS for PostgreSQL instance.

References

For more information about PostgreSQL, see PostgreSQL database development specifications.