Auto Partitioning - ApsaraDB for SelectDB - Alibaba Cloud Documentation Center

Auto Partitioning creates partitions on demand during data import based on the actual values in your partition key column. You no longer need to define partitions in advance or manually maintain hundreds of partitions when key values are scattered or hard to predict.

This makes Auto Partitioning the right choice for historical data with many discrete values, or any dataset where the data range is difficult to estimate at table creation time.

Important

Auto Partitioning requires version 4.0.0 or later.

When to use auto partitioning

Approach	Works well for	Doesn't work well for
Manual partitioning	Small, fixed number of partitions	Large or unpredictable partition ranges
Dynamic partitioning	Real-time data tied to current system time (for example, daily logs)	Historical data where the partition key is unrelated to current time
Auto Partitioning	Historical data, discrete or unpredictable partition key values	—

How it works

In a CREATE TABLE statement, replace the standard partition definition with AUTO PARTITION BY. When a row arrives during data import, the system checks whether a matching partition exists and creates one automatically if it doesn't.

For example, importing a row with TRADE_DATE = 2022-05-10 into a year-granularity table automatically creates the partition p20220101000000.

Two partitioning modes are supported: RANGE (for date/time columns) and LIST (for enumerated values).

Create a table with auto partitioning

AUTO RANGE PARTITION

Use RANGE partitioning for DATE or DATETIME columns. The date_trunc function defines the time granularity of each partition.

Syntax

[AUTO] PARTITION BY RANGE(date_trunc(<partition_column>, '<interval>'))
()

Parameter	Required	Description
`partition_column`	Yes	The partition key column. Must be `DATE` or `DATETIME`. Nullable columns are not supported.
`interval`	Yes	Time granularity for each partition. Valid values: `year`, `month`, `day`, `hour`.

Note

The AUTO keyword is optional in RANGE partitioning.

Example

The following table partitions data by month. Partitions are created automatically as data is imported.

CREATE TABLE `date_table` (
    `TIME_STAMP` datev2 NOT NULL
)
ENGINE=OLAP
DUPLICATE KEY(`TIME_STAMP`)
AUTO PARTITION BY RANGE (date_trunc(`TIME_STAMP`, 'month'))
()
DISTRIBUTED BY HASH(`TIME_STAMP`) BUCKETS 10
PROPERTIES ( "replication_allocation" = "tag.location.default: 1" );

To confirm that partitions are created automatically, insert a row and query the partition list:

-- Insert a row
INSERT INTO date_table VALUES ('2024-03-15');

-- View the resulting partitions
SELECT PartitionName, PartitionKey, Range
FROM partitions("catalog"="internal", "database"="your_db", "table"="date_table");

The output includes a partition covering March 2024, created automatically on insert.

AUTO LIST PARTITION

Use LIST partitioning for enumerated values. The system creates a separate partition for each unique value (or combination of values) in the partition key column.

Syntax

AUTO PARTITION BY LIST(`partition_col1` [, `partition_col2`, ...])
()

Parameter	Required	Description
`partition_col1, ...`	Yes	One or more partition key columns. Supported types: `BOOLEAN`, `TINYINT`, `SMALLINT`, `INT`, `BIGINT`, `LARGEINT`, `DATE`, `DATETIME`, `CHAR`, `VARCHAR`.

Example

CREATE TABLE `city_table` (
    `user_id` INT,
    `city` VARCHAR(20) NOT NULL
)
ENGINE=OLAP
DUPLICATE KEY(`user_id`)
AUTO PARTITION BY LIST (`city`)
()
DISTRIBUTED BY HASH(`user_id`) BUCKETS 10
PROPERTIES ( "replication_allocation" = "tag.location.default: 1" );

Limits and rules

AUTO RANGE PARTITION limits

Only date_trunc is supported as the partition function.
The partition key column must be DATE or DATETIME.
Nullable columns are not supported as partition key columns, even when the session variable allow_partition_column_nullable is set to true.

AUTO LIST PARTITION limits

Function calls on the partition key column are not supported.
Auto-generated partition names have a maximum length of 50 characters. Keep the combined length of partition key values within this limit.
When allow_partition_column_nullable is set to true, null values are allowed. The system creates a dedicated partition to store all NULL values.

General rules

The maximum number of partitions per auto-partitioned table is controlled by the Frontend (FE) configuration item max_auto_partition_num. The default is 1,024.
An auto-partitioned table behaves like a standard partitioned table — the only difference is that partitions are created automatically.
If a data import task that created new partitions fails or is canceled, the resulting empty partitions are not deleted.
If data import creates new partitions while a schema change is running on the same table, the import task may fail.

Advanced features

Lifecycle management

Set the partition.retention_count property to automatically remove outdated historical partitions. This property specifies how many historical partitions to keep. Older partitions beyond that count are deleted automatically.

Historical partitions: partitions whose upper bound (the end time of the RANGE) is less than or equal to the current time.
Current and future partitions: partitions whose upper bound is greater than the current time. These are not affected by partition.retention_count.

Note

Lifecycle management applies only to AUTO RANGE PARTITION. Do not combine it with dynamic partitioning.

Example

The following table retains the three most recent historical partitions:

CREATE TABLE auto_recycle(
    k0 DATETIME NOT NULL
)
AUTO PARTITION BY RANGE (date_trunc(k0, 'day')) ()
DISTRIBUTED BY HASH(`k0`) BUCKETS 1
PROPERTIES(
    "partition.retention_count" = "3"
);

Given a current date of 2025-10-21 and data imported for 2025-10-16 through 2025-10-23, the cleanup process retains:

p20251018, p20251019, p20251020 — the three most recent historical partitions
p20251021, p20251022, p20251023 — current and future partitions (always retained)

And deletes p20251016, p20251017.

Combined use with auto bucketing

AUTO RANGE PARTITION can be combined with auto bucketing, but only when data is imported incrementally and in chronological order, with each import operation affecting only the latest partition.

Warning

Writing to multiple historical partitions simultaneously causes auto bucketing to configure an inappropriate bucket count, which severely degrades query performance. Use this combination only when your import pattern meets the chronological requirement.

Manage partitions

Because partition names are auto-generated, use the auto_partition_name function together with the partitions table function to query and manage them.

Query partition information

SELECT *
FROM partitions("catalog"="internal", "database"="db_name", "table"="DAILY_TRADE_VALUE_AUTO")
WHERE PartitionName = auto_partition_name('range', 'year', '2008-02-03');

Usage notes

Data import polling interval: When importing into an auto-partitioned table, the internal polling interval is controlled by the Backend (BE) configuration item olap_table_sink_send_interval_auto_partition_factor. This setting has no effect when enable_memtable_on_sink_node is set to true.
INSERT OVERWRITE: For details on using INSERT OVERWRITE with auto-partitioned tables, see the feature documentation.