In multi-tenant applications, performance degradation caused by broadcast queries and wasted resources due to uneven data distribution are common issues. PolarDB for PostgreSQL (Distributed Edition) provides native multi-tenant management and supports two modes: row-based sharding and schema-based sharding. This feature routes queries for a specific tenant to a single data node (DN), avoiding broadcast queries and enabling efficient reads and writes. It also balances tenant data distribution to prevent data skew and improve resource utilization.
Comparison and selection
Choose the appropriate sharding strategy based on your business scenario and data characteristics. The core goal of both strategies is to confine a single tenant’s requests and data to one DN to achieve optimal performance.
Mode | Description | Recommended scenarios |
Row-based sharding | Uses a tenant ID field (the distribution column) in the same table to distinguish data. It distributes the data rows of different tenants to different DNs. |
|
Schema-based sharding | Creates a separate schema for each tenant. It directs the data of the entire schema to a specific DN. |
|
Row-based sharding
Use a tenant ID column as the distribution column to isolate tenant data by row.
Step 1: Create a distributed table
Create a table that includes a tenant ID column, such as tenant_id. Then, call the create_distributed_table function to convert it into a distributed table that uses this column as the distribution key.
-- 1. Create a table. The tenant_id column is used to distinguish tenants.
CREATE TABLE test_table (tenant_id text primary key, data text);
-- 2. Set the test_table table as a distributed table and specify tenant_id as the distribution column.
SELECT create_distributed_table('test_table', 'tenant_id');Step 2: Verify query routing
Run a query for a specific tenant and check its query plan to verify that the request is routed to a single DN. In the returned query plan, if Task Count is 1, the query was successfully pushed down to a single DN. In this example, the node is 10.0.0.1:5432. This avoids a broadcast query.
-- View the query plan for the 'company1' tenant.
EXPLAIN SELECT * FROM test_table WHERE tenant_id = 'company1';
-- Sample result:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Custom Scan (PolarCluster Adaptive) (cost=0.00..0.00 rows=0 width=0)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=10.0.0.1 port=5432 dbname=testdb
-> Index Scan using test_table_pkey_102185 on test_table_102185 test_table (cost=0.15..2.37 rows=1 width=64)
Index Cond: (tenant_id = 'company1'::text)
(7 rows)Schema-based sharding
This method isolates tenant data by creating a separate distributed schema for each tenant.
Step 1: Create and distribute a schema
Create a dedicated schema for a single tenant, such as company_test_1, and set it as a distributed schema. This operation directs all data and operations under this schema to a single DN.
-- 1. Create a schema for the company_test_1 tenant.
CREATE SCHEMA company_test_1;
-- 2. Set the schema as a distributed schema.
SELECT polar_cluster_schema_distribute('company_test_1');Step 2: Create tables in the tenant schema
Switch the search path (search_path) to the target tenant's schema. Then, create business tables in that schema.
-- 1. Switch to the schema of the company_test_1 tenant.
SET search_path TO company_test_1;
-- 2. Create business tables in this schema.
CREATE TABLE users (id SERIAL, name TEXT);
CREATE TABLE orders (id SERIAL, user_id INT);Step 3: Verify query routing
Set the search_path to the tenant's schema. Then, run a query and check its query plan to verify that the request is routed to a single DN. If the returned query plan shows a Task Count of 1, all operations for the tenant are isolated on a single node. In this example, the node is 10.0.0.1:5432.
-- 1. Make sure the search_path is set to the target tenant.
SET search_path TO company_test_1;
-- 2. View the query plan.
EXPLAIN SELECT * FROM users WHERE id = 1;
-- Sample result:
QUERY PLAN
--------------------------------------------------------------------------------
Custom Scan (PolarCluster Adaptive) (cost=0.00..0.00 rows=0 width=0)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=10.0.0.1 port=5432 dbname=testdb
-> Seq Scan on users_102191 users (cost=0.00..25.88 rows=6 width=36)
Filter: (id = 1)
(7 rows)