ApsaraDB for SelectDB: Colocation join

Last Updated: Mar 28, 2026

Standard distributed join strategies—shuffle join and broadcast join—both move data between backend (BE) nodes before joining, adding latency proportional to data volume. Colocation join eliminates that network transfer by guaranteeing that tables in the same Colocation Group (CG) always store matching buckets on the same BE node, so joins on bucket columns run entirely locally.

Important

The colocation property of a table cannot be synchronized by cross-cluster replication (CCR). If a table has is_being_synced = true, its colocation property is removed.

How it works

ApsaraDB for SelectDB groups co-located tables into a Colocation Group (CG). All tables in a CG must share the same Colocation Group Schema (CGS). When these conditions are met, a bucket sequence maps each bucket index identically to the same BE node across all CG tables—rows with the same bucket column values always land on the same BE node, making local joins possible.

Example: 8 buckets mapped to 4 BE nodes (A, B, C, D)

+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+
| 0 | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 |
+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+
| A | | B | | C | | D | | A | | B | | C | | D |
+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+

CGS requirements — what must match across all tables in a CG:

  • Bucket column types: each bucket column must have the same data type.

  • Number of bucket columns: the DISTRIBUTED BY HASH(...) clause must list the same number of columns.

  • Number of buckets: the BUCKETS value must be identical.

What does not need to match:

  • Number of partitions

  • Partition scope or range

  • Partition column types

For a single-partition table, each bucket contains one tablet. For a multi-partition table, each bucket contains multiple tablets—one per partition.
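To make these rules concrete, here is a hedged sketch (table and column names are illustrative) of two tables that can join colocated even though their partitioning differs, because their bucket column type, bucket column count, and bucket count all match:

```sql
-- Single-partition table: 8 buckets, one tablet per bucket.
CREATE TABLE tbl1 (k1 date, k2 int, v int sum)
DISTRIBUTED BY HASH(k2)
BUCKETS 8
PROPERTIES(
    "colocate_with" = "group1"
);

-- Range-partitioned table: 8 buckets per partition, so each bucket
-- holds one tablet per partition. Partition count, range, and
-- partition column type do not need to match tbl1; only the CGS
-- (one bucket column of type int, BUCKETS 8) must be identical.
CREATE TABLE tbl2 (k1 date, k2 int, v int sum)
PARTITION BY RANGE(k1) (
    PARTITION p1 VALUES LESS THAN ('2025-07-01'),
    PARTITION p2 VALUES LESS THAN ('2026-01-01')
)
DISTRIBUTED BY HASH(k2)
BUCKETS 8
PROPERTIES(
    "colocate_with" = "group1"
);
```

A join such as SELECT * FROM tbl1 JOIN tbl2 ON tbl1.k2 = tbl2.k2 can then run without moving data between BE nodes.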

Key concepts

  • Colocation Group (CG): A named group of one or more tables that share the same CGS and bucket distribution. A CG belongs to a single database; its name is unique within that database. Internally, SelectDB stores the CG as dbId_groupName, but you interact with it using only the group name.

  • Colocation Group Schema (CGS): The schema constraints shared by all tables in a CG: bucket column types, number of bucket columns, and number of buckets.

  • Stable/Unstable: A CG is Stable when all its tablets are fully replicated and no migration is in progress. When a CG is Unstable (for example, during replica repair or rebalancing), colocation join is temporarily degraded to a regular join, which may noticeably reduce query performance.

Create a colocation table

Add "colocate_with" = "group_name" to the PROPERTIES clause when creating a table:

CREATE TABLE tbl (k1 int, v1 int sum)
DISTRIBUTED BY HASH(k1)
BUCKETS 8
PROPERTIES(
    "colocate_with" = "group1"
);
  • If the CG does not exist, SelectDB creates it automatically with this table as the first member.

  • If the CG already exists, SelectDB checks whether the table satisfies the CGS. If it does, the table is added and its tablets are created following the existing bucket distribution.
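As a hedged illustration (assuming group1 was created with BUCKETS 8 as in the example above), a table that violates the CGS is rejected rather than added to the group:

```sql
-- Rejected: BUCKETS 16 does not match group1's CGS (BUCKETS 8).
CREATE TABLE tbl_bad (k1 int, v1 int sum)
DISTRIBUTED BY HASH(k1)
BUCKETS 16
PROPERTIES(
    "colocate_with" = "group1"
);
```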

Cross-database colocation groups

To co-locate tables across databases, prefix the group name with __global__:

CREATE TABLE tbl (k1 int, v1 int sum)
DISTRIBUTED BY HASH(k1)
BUCKETS 8
PROPERTIES(
    "colocate_with" = "__global__group1"
);

A global CG is not owned by any database, and its name is globally unique across the cluster.
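For example (database and table names are hypothetical), two tables in different databases can share one global group:

```sql
-- Both tables reference the same global group, so a cross-database
-- join on user_id can run as a colocation join.
CREATE TABLE db1.orders (user_id int, amount int sum)
DISTRIBUTED BY HASH(user_id)
BUCKETS 8
PROPERTIES("colocate_with" = "__global__group1");

CREATE TABLE db2.users (user_id int, cnt int sum)
DISTRIBUTED BY HASH(user_id)
BUCKETS 8
PROPERTIES("colocate_with" = "__global__group1");
```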

Verify that colocation join is active

Query colocation tables the same way you query any other table. SelectDB automatically generates a query plan that uses colocation join when the conditions are met.

To verify, run DESC (or EXPLAIN) on a join query and inspect the output:

-- Create tbl1 and tbl2 in the same CG, then run:
DESC SELECT * FROM tbl1 INNER JOIN tbl2 ON (tbl1.k2 = tbl2.k2);

Colocation join active — the HASH JOIN node shows colocate: true:

| 2:HASH JOIN                                      |
| |  join op: INNER JOIN                           |
| |  colocate: true                                |
| |    `tbl1`.`k2` = `tbl2`.`k2`                   |

Colocation join inactive — the HASH JOIN node shows colocate: false with a reason, and an EXCHANGE node appears:

| 2:HASH JOIN                                       |
| |  join op: INNER JOIN (BROADCAST)                |
| |  colocate: false, reason: group is not stable   |
| |    `tbl1`.`k2` = `tbl2`.`k2`                    |
...
| 3:EXCHANGE                                        |

The most common reason for colocate: false is that the CG is Unstable—replica repair or rebalancing is in progress.

Manage colocation groups

View CG information

Run the following command to list all CGs in the cluster (requires the admin role):

SHOW PROC '/colocation_group';

Example output:

+-------------+--------------+--------------+------------+----------------+----------+----------+
| GroupId     | GroupName    | TableIds     | BucketsNum | ReplicationNum | DistCols | IsStable |
+-------------+--------------+--------------+------------+----------------+----------+----------+
| 10005.10008 | 10005_group1 | 10007, 10040 | 10         | 3              | int(11)  | true     |
+-------------+--------------+--------------+------------+----------------+----------+----------+

  • GroupId: cluster-wide unique identifier in dbId.grpId format.

  • GroupName: full name of the CG.

  • TableIds: IDs of the tables in the CG.

  • BucketsNum: number of buckets.

  • ReplicationNum: number of replicas.

  • DistCols: bucket column types.

  • IsStable: true if the CG is Stable and colocation join is available.

To inspect the bucket-to-BE mapping for a specific CG:

SHOW PROC '/colocation_group/10005.10008';

Example output:

+-------------+------------+
| BucketIndex | BackendIds |
+-------------+------------+
| 0           | 10004      |
| 1           | 10003      |
| 2           | 10002      |
| 3           | 10003      |
| 4           | 10002      |
| 5           | 10003      |
| 6           | 10003      |
| 7           | 10003      |
| 8           | 10003      |
| 9           | 10002      |
+-------------+------------+

The admin role is required to run SHOW PROC commands. Regular users cannot run these commands.

Modify the colocation property of a table

To move a table to a different CG:

ALTER TABLE tbl SET ("colocate_with" = "group2");
  • If the table is not yet in any CG, SelectDB checks the schema and adds it to the specified CG (creating the CG if needed).

  • If the table is already in a CG, SelectDB removes it from the current CG and adds it to the specified CG (creating the CG if needed).

To remove the colocation property from a table:

ALTER TABLE tbl SET ("colocate_with" = "");

Delete a colocation table

When you run DROP TABLE, the table moves to the recycle bin and stays there for one day before being permanently deleted. After the last table in a CG is permanently deleted, the CG is automatically removed.

Constraints on schema changes

When you add partitions (ADD PARTITION) or change the replica count of a colocation table, SelectDB validates the change against the CGS. If the change would violate CGS constraints, SelectDB rejects it.
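For example (a sketch assuming a range-partitioned colocation table tbl2 with a date partition column, as in the earlier create example), adding a partition is allowed because the new partition's tablets follow the group's existing bucket distribution:

```sql
-- Validated against the CGS: the new partition keeps the same
-- bucket columns and bucket count, so its buckets are placed
-- according to the group's existing bucket-to-BE mapping.
ALTER TABLE tbl2 ADD PARTITION p3 VALUES LESS THAN ('2026-07-01');
```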

Advanced configuration

FE configuration items

The following frontend (FE) configuration items control colocation join behavior. You can dynamically modify disable_colocate_relocate and disable_colocate_balance using ADMIN SET CONFIG. To view current values or learn about configuration syntax, run HELP ADMIN SHOW CONFIG; and HELP ADMIN SET CONFIG;.

  • disable_colocate_relocate (default: false): disables automatic colocation replica repair. Affects colocation tables only.

  • disable_colocate_balance (default: false): disables automatic colocation replica rebalancing. Affects colocation tables only.

  • disable_colocate_join (default: false): disables the colocation join feature entirely.

  • use_new_tablet_scheduler (default: true): enables the new replica scheduling logic.
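For example, to toggle the two dynamic items before manual maintenance (standard ADMIN SET CONFIG syntax; run HELP ADMIN SET CONFIG; to confirm the exact form for your version):

```sql
-- Pause colocation replica repair and rebalancing.
ADMIN SET FRONTEND CONFIG ("disable_colocate_relocate" = "true");
ADMIN SET FRONTEND CONFIG ("disable_colocate_balance" = "true");

-- Check the current values.
ADMIN SHOW FRONTEND CONFIG LIKE '%colocate%';
```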

HTTP RESTful API

SelectDB exposes HTTP RESTful API endpoints for viewing and managing CGs. All endpoints are served by FE nodes at fe_host:fe_http_port and require the admin role.

View all colocation information:

GET /api/colocate

Returns colocation information in JSON format:

{
    "msg": "success",
    "code": 0,
    "data": {
        "infos": [
            ["10003.12002", "10003_group1", "10037, 10043", "1", "1", "int(11)", "true"]
        ],
        "unstableGroupIds": [],
        "allGroupIds": [{"dbId": 10003, "grpId": 12002}]
    },
    "count": 0
}

Mark a CG as Stable:

POST /api/colocate/group_stable?db_id=10005&group_id=10008

Return value: 200

Mark a CG as Unstable:

DELETE /api/colocate/group_stable?db_id=10005&group_id=10008

Return value: 200

Manually configure bucket distribution:

POST /api/colocate/bucketseq?db_id=10005&group_id=10008

Body:
[[10004],[10003],[10002],[10003],[10002],[10003],[10003],[10003],[10003],[10002]]

Return value: 200

The request body is a nested array where each element lists the BE node IDs for that bucket index.

Important

Before calling the bucketseq endpoint, set both disable_colocate_relocate and disable_colocate_balance to true. Otherwise, the system's automatic repair and rebalancing will override your manual configuration.

Replica balancing and repair

How replica repair works

Colocation replicas are pinned to specific BE nodes. If a BE node goes down or is decommissioned, SelectDB selects the least-loaded available BE node as a replacement and begins repairing all affected tablets. During this migration, the CG is marked Unstable.

How replica rebalancing works

For regular tables, SelectDB balances replicas individually—finding a suitable BE for each replica independently. For colocation tables, replicas are balanced at the bucket level: all replicas in a bucket are migrated together to preserve the co-location guarantee.

The balancing algorithm distributes the bucket sequence across BE nodes based on replica count, not actual data size.

The current balancing algorithm may not distribute load evenly in heterogeneous deployments, where BE nodes have different disk capacities, disk counts, or disk types (SSD vs. HDD). In such environments, small-capacity and large-capacity nodes may end up holding the same number of replicas.

If a CG becomes Unstable and you want to prevent automatic rebalancing from repeatedly disrupting queries, set disable_colocate_balance to true. Set it back to false when you are ready to allow rebalancing again.