Transparent Data Encryption (TDE) transparently encrypts data at the database layer. This prevents unauthorized users from bypassing the database to read sensitive information directly from the storage layer.
Applicability
This feature is supported on PolarDB for PostgreSQL clusters that meet the following requirements:
Engine version:
PostgreSQL 11 (minor engine version 2.0.11.2.1.0 or later)
PostgreSQL 14 (minor engine version 2.0.14.5.1.1 or later)
PostgreSQL 16 (minor engine version 2.0.16.9.6.0 or later)
PostgreSQL 17 (minor engine version 2.0.17.6.4.0 or later)
PostgreSQL 18 (minor engine version 2.0.18.0.1.0 or later)
PolarDB for PostgreSQL Distributed Edition clusters are not supported.
You can view the minor engine version in the console or by running the SHOW polardb_version; statement. If the minor engine version does not meet the requirements, upgrade the minor engine version.
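The version requirement above can be checked with a short script. This is a minimal sketch: it assumes the minor engine version is available as a dotted string such as 2.0.14.5.1.1 (the exact output format of SHOW polardb_version; is not shown in this topic), and the helper names parse_version and meets_tde_minimum are hypothetical.

```python
# Minimum minor engine versions for TDE, copied from the list above.
TDE_MINIMUM = {
    11: "2.0.11.2.1.0",
    14: "2.0.14.5.1.1",
    16: "2.0.16.9.6.0",
    17: "2.0.17.6.4.0",
    18: "2.0.18.0.1.0",
}


def parse_version(text):
    """Turn '2.0.14.5.1.1' into (2, 0, 14, 5, 1, 1) for numeric comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_tde_minimum(major, minor_engine_version):
    """Return True if the version is at or above the TDE minimum for its major version."""
    minimum = TDE_MINIMUM.get(major)
    if minimum is None:
        return False  # major version is not in the supported list
    return parse_version(minor_engine_version) >= parse_version(minimum)
```

Comparing integer tuples rather than raw strings keeps a component such as 10 from sorting below 5.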
Background information
In China, to ensure information security on the Internet, service developers must comply with data security standards, such as:
Cryptography Law of the People's Republic of China (effective January 1, 2020)
Classified Protection of Cybersecurity (GB/T 22239-2019)
Internationally, some industries also have data security standards, such as:
Payment Card Industry Data Security Standard (PCI DSS)
Health Insurance Portability and Accountability Act (HIPAA)
General Data Protection Regulation (GDPR)
California Consumer Privacy Act (CCPA)
Sarbanes-Oxley Act (SOX)
To help you meet these data security requirements, PolarDB provides the TDE feature. Authenticated users can transparently access data without changing application code or configurations. TDE prevents OS users from reading sensitive data in tablespace files and prevents malicious users from reading plaintext data from disks or backups.
Glossary
| Term | Description |
| --- | --- |
| Key Encryption Key (KEK) | A key that encrypts another key. |
| Memory Data Encryption Key (MDEK) | A data encryption key that is stored in memory. It is randomly generated during database initialization using the random number generator in OpenSSL. |
| Table Data Encryption Key (TDEK) | A table data encryption key. It is generated from an MDEK using the HKDF algorithm, is stored in memory, and is used to encrypt table data. |
| WAL Data Encryption Key (WDEK) | A WAL data encryption key. It is generated from an MDEK using the HKDF algorithm, is stored in memory, and is used to encrypt WAL data. |
| Hash-based Message Authentication Code Key (HMACK) | The key used by the hash-based message authentication code (HMAC) algorithm. A KEK and an HMACK are generated after a passphrase is processed using the SHA-512 algorithm. |
| Hash-based Message Authentication Code of Key Encryption Key (KEK_HMAC) | A digest used to verify a key encryption key. A KEK_HMAC is generated from an ENCMDEK and an HMACK using the HMAC algorithm. It is used as verification information when a key is recovered. |
| Encode Memory Data Encryption Key (ENCMDEK) | An encrypted data encryption key that is saved on shared storage. An ENCMDEK is generated by encrypting an MDEK with a KEK. |
How it works
Key management module
Key structure
TDE uses a two-layer key structure that consists of a key encryption key (KEK) and a data encryption key. The data encryption key encrypts the database data, and the KEK encrypts the data encryption key.
Key encryption key (KEK) and its verification value (HMACK): The system runs the command specified by the polar_cluster_passphrase_command parameter and calculates the SHA-512 hash of the output to obtain 64 bytes of data. The first 32 bytes are the KEK, and the last 32 bytes are the HMACK.
Table data encryption key (TDEK) and WAL data encryption key (WDEK): These keys are generated by a secure random number generator and are the actual keys used for data and WAL log encryption. Verification values are generated using the HMAC algorithm. These values are used for KEK verification and are saved on shared storage.
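The split of the SHA-512 digest into a KEK and an HMACK can be sketched as follows. The function name derive_kek_and_hmack is hypothetical, and the sample input assumes that the raw command output is hashed as is, trailing newline included. This topic does not state whether the output is trimmed first.

```python
import hashlib


def derive_kek_and_hmack(passphrase_output: bytes):
    """Split SHA-512(passphrase command output) into a KEK and an HMACK."""
    digest = hashlib.sha512(passphrase_output).digest()  # always 64 bytes
    kek = digest[:32]    # first 32 bytes: key encryption key
    hmack = digest[32:]  # last 32 bytes: HMAC key
    return kek, hmack


# Assumption: these bytes are what `echo passphrase` writes to stdout.
kek, hmack = derive_kek_and_hmack(b"passphrase\n")
```

Because the derivation is a pure hash of the command output, the same passphrase always yields the same KEK and HMACK, which is why neither value needs to be stored.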
The KEK and HMACK are obtained from an external source each time. For example, you can retrieve them from KMS. For testing purposes, you can run echo passphrase to obtain them. The ENCMDEK and KEK_HMAC must be saved on shared storage. This ensures that the primary and read-only nodes can read the file and obtain the actual data encryption key at the next startup. The data structure is as follows:

```c
typedef struct KmgrFileData
{
    /* version for kmgr file */
    uint32      kmgr_version_no;

    /* Are data pages encrypted? Zero if encryption is disabled */
    uint32      data_encryption_cipher;

    /*
     * Wrapped Key information for data encryption.
     */
    WrappedEncKeyWithHmac tde_rdek;
    WrappedEncKeyWithHmac tde_wdek;

    /* CRC of all above ... MUST BE LAST! */
    pg_crc32c   crc;
} KmgrFileData;
```

This file is generated during database initialization (initdb). This ensures that a standby node can obtain the file using pg_basebackup.
When a cluster is running, TDE-related control information is stored in the memory of a process. The structure is as follows:

```c
static keydata_t keyEncKey[TDE_KEK_SIZE];      /* key encryption key (KEK) */
static keydata_t relEncKey[TDE_MAX_DEK_SIZE];  /* key for table (relation) data */
static keydata_t walEncKey[TDE_MAX_DEK_SIZE];  /* key for WAL data */
char *polar_cluster_passphrase_command = NULL; /* command that returns the passphrase */
extern int data_encryption_cipher;             /* zero if encryption is disabled */
```

Key encryption
Keys are generated during database initialization. The process is shown in the following figure:

The system runs the command specified by polar_cluster_passphrase_command to obtain the 32-byte KEK and the 32-byte HMACK.
The system calls the random number generation algorithm in OpenSSL to generate an MDEK.
The system uses the MDEK to call the HKDF algorithm in OpenSSL to generate a TDEK.
The system uses the MDEK to call the HKDF algorithm in OpenSSL to generate a WDEK.
The system uses the KEK to encrypt the MDEK and generate an ENCMDEK.
The system generates a KEK_HMAC from the ENCMDEK and HMACK using the HMAC algorithm. The KEK_HMAC is used as verification information when the key is recovered.
The system writes the ENCMDEK, KEK_HMAC, and other information from the KmgrFileData structure to the global/kmgr file.
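The MDEK generation and the two HKDF derivations in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: the engine calls OpenSSL, whereas this sketch implements HKDF (RFC 5869) directly with HMAC-SHA256. The hash choice, the all-zero salt, and the info labels table-data and wal-data are assumptions introduced for the example.

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(key_material: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256 and an all-zero salt."""
    salt = b"\x00" * hashlib.sha256().digest_size
    # Extract: condense the input key material into a pseudorandom key.
    prk = hmac.new(salt, key_material, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output bytes are available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


mdek = secrets.token_bytes(32)           # random master data encryption key
tdek = hkdf_sha256(mdek, b"table-data")  # hypothetical info label
wdek = hkdf_sha256(mdek, b"wal-data")    # hypothetical info label
```

Using a different info value for each derivation gives two independent keys from a single MDEK, so only the MDEK has to be encrypted and persisted.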
Key decryption
When a database crashes or restarts, the data encryption keys must be recovered from the stored ciphertext. The process is as follows:

The system reads the global/kmgr file to obtain the ENCMDEK and KEK_HMAC.
The system runs the command specified by polar_cluster_passphrase_command to obtain the KEK and HMACK.
The system generates a new KEK_HMAC (KEK_HMAC') from the ENCMDEK and HMACK using the HMAC algorithm. It then compares KEK_HMAC' with the stored KEK_HMAC. If they match, the process continues. If they do not match, an error is returned.
The system uses the KEK to decrypt the ENCMDEK and generate an MDEK.
The system uses the MDEK to call the HKDF algorithm in OpenSSL to generate the TDEK. Because the input information is deterministic, the same TDEK is generated.
The system uses the MDEK to call the HKDF algorithm in OpenSSL to generate the WDEK. Because the input information is deterministic, the same WDEK is generated.
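The verification step in this process, recomputing KEK_HMAC' and comparing it with the stored KEK_HMAC, can be sketched as follows. The function names are hypothetical, HMAC-SHA256 is an assumed hash choice, and the ENCMDEK is represented by placeholder bytes because the real value is AES ciphertext.

```python
import hashlib
import hmac


def split_passphrase(output: bytes):
    """Derive (KEK, HMACK) from the passphrase command output via SHA-512."""
    digest = hashlib.sha512(output).digest()
    return digest[:32], digest[32:]


def compute_kek_hmac(hmack: bytes, encmdek: bytes) -> bytes:
    """KEK_HMAC = HMAC(HMACK, ENCMDEK). SHA-256 is an assumption."""
    return hmac.new(hmack, encmdek, hashlib.sha256).digest()


def verify_passphrase(hmack: bytes, encmdek: bytes, stored_kek_hmac: bytes) -> bool:
    """Recompute KEK_HMAC' and compare it with the stored value in constant time."""
    return hmac.compare_digest(compute_kek_hmac(hmack, encmdek), stored_kek_hmac)


encmdek = bytes(range(40))  # placeholder for the wrapped MDEK read from global/kmgr
_, hmack = split_passphrase(b"passphrase\n")
stored_kek_hmac = compute_kek_hmac(hmack, encmdek)  # value written at initialization
_, wrong_hmack = split_passphrase(b"wrong passphrase\n")
```

With the correct passphrase, verify_passphrase(hmack, encmdek, stored_kek_hmac) returns True. With wrong_hmack it returns False, which corresponds to the error case in the steps above and is detected before any decryption is attempted.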
Key rotation
Key rotation is the process of re-encrypting the MDEK with a new KEK and generating a new kmgr file. This involves decrypting the MDEK with the old KEK and then re-encrypting it with the new KEK. The process is shown in the following figure:

The system reads the global/kmgr file to obtain the ENCMDEK and KEK_HMAC.
The system runs the command specified by polar_cluster_passphrase_command to obtain 64 bytes of key material: the current KEK and HMACK.
The system generates a new KEK_HMAC (KEK_HMAC') from the ENCMDEK and HMACK using the HMAC algorithm. It then compares KEK_HMAC' with the stored KEK_HMAC. If they match, the process continues. If they do not match, an error is returned.
The system uses the KEK to decrypt the ENCMDEK and generate an MDEK.
The system runs the command specified by polar_cluster_passphrase_command again to obtain a new KEK (new_KEK) and a new HMACK (new_HMACK).
The system uses the new_KEK to encrypt the MDEK and generate a new ENCMDEK (new_ENCMDEK).
The system generates a new KEK_HMAC (new_KEK_HMAC) from the new_ENCMDEK and new_HMACK using the HMAC algorithm. The new_KEK_HMAC is used as verification information for future key recovery.
The system writes the new_ENCMDEK, new_KEK_HMAC, and other information from the KmgrFileData structure to the global/kmgr file.
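The rotation flow can be sketched end to end as follows. This is only an outline of the control flow. The Python standard library has no AES, so toy_wrap is an XOR stand-in for the real KEK encryption and is not secure. HMAC-SHA256 and all function names are assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def split_secret(output: bytes):
    """Derive (KEK, HMACK) from the passphrase command output via SHA-512."""
    digest = hashlib.sha512(output).digest()
    return digest[:32], digest[32:]


def toy_wrap(kek: bytes, key: bytes) -> bytes:
    """Stand-in for encrypting a key with the KEK: plain XOR. NOT secure."""
    return bytes(a ^ b for a, b in zip(key, kek))


toy_unwrap = toy_wrap  # XOR is its own inverse


def tag(hmack: bytes, encmdek: bytes) -> bytes:
    """KEK_HMAC over the wrapped key. SHA-256 is an assumption."""
    return hmac.new(hmack, encmdek, hashlib.sha256).digest()


def rotate(encmdek, kek_hmac, old_output, new_output):
    """Re-wrap the MDEK under a new KEK. The MDEK itself does not change."""
    kek, hmack = split_secret(old_output)
    if not hmac.compare_digest(tag(hmack, encmdek), kek_hmac):
        raise ValueError("passphrase does not match the stored KEK_HMAC")
    mdek = toy_unwrap(kek, encmdek)                # decrypt with the old KEK
    new_kek, new_hmack = split_secret(new_output)
    new_encmdek = toy_wrap(new_kek, mdek)          # encrypt with the new KEK
    return new_encmdek, tag(new_hmack, new_encmdek)


# Usage: set up an initial state, then rotate to a new passphrase.
mdek = secrets.token_bytes(32)
old_kek, old_hmack = split_secret(b"old passphrase\n")
encmdek = toy_wrap(old_kek, mdek)
kek_hmac = tag(old_hmack, encmdek)
new_encmdek, new_kek_hmac = rotate(encmdek, kek_hmac, b"old passphrase\n", b"new passphrase\n")
```

Only the wrapped key and its verification value change. Because the MDEK is unchanged, the TDEK and WDEK derived from it stay the same, and no data pages need to be re-encrypted.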
Encryption module
All user data is encrypted at the page level using the AES-128 or AES-256 encryption algorithm. By default, AES-256 is used. The (page LSN, page number) pair is used as the initialization vector (IV) for the encryption of each data page. An IV ensures that encrypting the same plaintext multiple times produces different ciphertext.
The header data structure of each page is as follows:

```c
typedef struct PageHeaderData
{
    /* XXX LSN is member of *any* block, not only page-organized ones */
    PageXLogRecPtr pd_lsn;      /* LSN: next byte after last byte of xlog
                                 * record for last change to this page */
    uint16      pd_checksum;    /* checksum */
    uint16      pd_flags;       /* flag bits, see below */
    LocationIndex pd_lower;     /* offset to start of free space */
    LocationIndex pd_upper;     /* offset to end of free space */
    LocationIndex pd_special;   /* offset to start of special space */
    uint16      pd_pagesize_version;
    TransactionId pd_prune_xid; /* oldest prunable XID, or zero if none */
    ItemIdData  pd_linp[FLEXIBLE_ARRAY_MEMBER]; /* line pointer array */
} PageHeaderData;
```

Note:
pd_lsn is not encrypted because it forms part of the IV, which must be readable before the page can be decrypted.
The pd_flags field includes the 0x8000 flag to indicate whether the page is encrypted. This flag is not encrypted. This provides backward compatibility for reading plaintext pages and lets you enable TDE for existing clusters.
pd_checksum is not encrypted. This allows the page checksum to be verified against the ciphertext.
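The following sketch shows how the unencrypted header fields could be read from a raw page, and how a 16-byte IV could be assembled from the (page LSN, page number) pair. The field offsets follow the PageHeaderData layout on a little-endian build, where pd_lsn is stored as two 32-bit halves with the high half first. The IV byte layout is not specified in this topic, so build_iv is a hypothetical layout, and the names PD_ENCRYPTED, read_header, and is_encrypted are also introduced for the example.

```python
import struct

PD_ENCRYPTED = 0x8000  # pd_flags bit that marks an encrypted page


def read_header(page: bytes):
    """Read pd_lsn, pd_checksum, and pd_flags from the first 12 bytes of a page."""
    xlogid, xrecoff, checksum, flags = struct.unpack_from("<IIHH", page, 0)
    lsn = (xlogid << 32) | xrecoff  # combine the high and low halves of the LSN
    return lsn, checksum, flags


def is_encrypted(page: bytes) -> bool:
    """Return True if the page carries the encryption flag."""
    return bool(read_header(page)[2] & PD_ENCRYPTED)


def build_iv(lsn: int, page_number: int) -> bytes:
    """Hypothetical IV layout: 8-byte LSN, 4-byte page number, 4 zero bytes."""
    return struct.pack("<QI4x", lsn, page_number)
```

Because the LSN advances whenever a page is modified, two versions of the same page, or two different pages with the same LSN, are encrypted with different IVs.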
Encrypted files
Files that contain user data are encrypted. For example, files in the following subdirectories of the data directory are encrypted:
base/
global/
pg_tblspc/
pg_replslot/
pg_stat/
pg_stat_tmp/
When to encrypt
Data is organized into pages and encrypted at the page level. Before a page is written to disk, its checksum is calculated. Even if checksums are disabled, checksum-related functions such as PageSetChecksumCopy or PageSetChecksumInplace are still called. Therefore, the encryption process occurs just before the checksum is calculated to ensure that all user data on the storage medium is encrypted.
Decryption module
When a page is read from storage into memory, its checksum is verified. Even if checksums are disabled, the PageIsVerified function is still called. Therefore, the decryption process occurs immediately after the checksum is verified to ensure that the data in memory is in plaintext.
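The ordering on the write path (encrypt, then checksum) and on the read path (verify checksum, then decrypt) can be sketched as follows. Everything in this sketch is a simplified stand-in: the header holds only pd_lsn, pd_checksum, and pd_flags, the cipher is a SHA-256 keystream XOR instead of AES, and the checksum is a truncated CRC-32 instead of the PostgreSQL page checksum. The function names are hypothetical.

```python
import hashlib
import struct
import zlib

PD_ENCRYPTED = 0x8000
HEADER = struct.Struct("<QHH")  # simplified header: pd_lsn, pd_checksum, pd_flags


def keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR with a SHA-256 counter keystream. NOT AES."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + iv + i.to_bytes(8, "little")).digest()
        out += bytes(a ^ b for a, b in zip(data[i:i + 32], pad))
    return bytes(out)


def checksum16(lsn: int, flags: int, body: bytes) -> int:
    """Stand-in 16-bit checksum over the page with pd_checksum zeroed."""
    return zlib.crc32(HEADER.pack(lsn, 0, flags) + body) & 0xFFFF


def write_page(key, lsn, page_no, flags, body):
    iv = struct.pack("<QQ", lsn, page_no)
    body = keystream_xor(key, iv, body)      # 1. encrypt the page body
    flags |= PD_ENCRYPTED                    #    mark the page as encrypted
    checksum = checksum16(lsn, flags, body)  # 2. checksum the ciphertext
    return HEADER.pack(lsn, checksum, flags) + body


def read_page(key, page_no, raw):
    lsn, checksum, flags = HEADER.unpack_from(raw)
    body = raw[HEADER.size:]
    if checksum16(lsn, flags, body) != checksum:  # 1. verify the ciphertext
        raise ValueError("page checksum mismatch")
    if flags & PD_ENCRYPTED:                      # 2. decrypt only flagged pages
        body = keystream_xor(key, struct.pack("<QQ", lsn, page_no), body)
    return body
```

Because the checksum covers the ciphertext, a corrupted page is rejected before any decryption is attempted, and a page without the 0x8000 flag is returned as is, which matches the backward compatibility described for plaintext pages.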