All Products
Search
Document Center

MaxCompute:StarRocks connector

Last Updated:Mar 26, 2026

The StarRocks connector lets you query MaxCompute data directly from a StarRocks cluster using an external catalog—no data import or external table creation required.

Prerequisites

Before you begin, ensure that you have:

Limitations

  • Supported table types: standard tables, partitioned tables, clustered tables, Delta tables, and materialized views

  • Not supported: foreign tables, logical views, or JSON data

  • MaxCompute projects with the schema feature enabled are not supported. See Schema-related operations.

Create a MaxCompute catalog

Log on to your StarRocks cluster and run CREATE EXTERNAL CATALOG to create a MaxCompute catalog.

Syntax

CREATE EXTERNAL CATALOG <catalog_name>
[COMMENT <comment>]
PROPERTIES
(
    "type" = "odps",
    CatalogParams,
    ScanParams,
    CachingMetaParams
)

Parameters

catalog_name

The name of the MaxCompute catalog. The name must meet the following requirements:

  • Contains only letters (a–z, A–Z), digits (0–9), or underscores (_), and must start with a letter

  • Does not exceed 1,023 characters

  • Is case-sensitive

type

Set to odps. This identifies MaxCompute as the data source type.

comment

(Optional) A description of the MaxCompute catalog.

CatalogParams

Parameters for StarRocks to connect to MaxCompute. All four connection parameters are required.

Parameter Description
odps.endpoint The endpoint of the MaxCompute service. Configure this based on the region and network mode of your MaxCompute project. See Endpoints. Only Alibaba Cloud Virtual Private Cloud (VPC) networks are supported.
odps.project The name of the MaxCompute project. Find it in the MaxCompute console under Manage Configurations > Projects. If your workspace is in standard mode, use the development environment name (with _dev suffix) for dev and the production name for prod.
odps.access.id The AccessKey ID of your Alibaba Cloud account or RAM user. Retrieve it from the AccessKey Management page.
odps.access.key The AccessKey secret that corresponds to the AccessKey ID.
odps.tunnel.quota The quota name for accessing MaxCompute. Two billing options are supported: Subscription (exclusive resource groups for Data Transmission Service (subscription)) and Pay-as-you-go (open storage). Specify only one quota name. See the table below for how to find the quota name for each option.

How to find your quota name

Option Steps
Subscription In the MaxCompute console, switch to the target region, then go to Manage Configurations > Quotas. See Computing resources - Quota management.
Pay-as-you-go In the MaxCompute console, switch to the target region, then go to Manage Configurations > Tenant Property and turn on Storage API Switch.

ScanParams

(Optional) Parameters that control how StarRocks scans MaxCompute data.

Parameter Required Default Description
odps.split.policy No size The data sharding policy. size: shards by data size (default shard size: 256 MB). row_offset: shards by row count. The sharding policy affects scan concurrency—adjust it if the default does not meet your throughput requirements.
odps.split.row.count No 4,194,304 The maximum number of rows per shard. Applies only when odps.split.policy is row_offset. Value range: any integer greater than 0.

CachingMetaParams

(Optional) Parameters that control how StarRocks caches MaxCompute metadata.

Parameter Required Default Description
odps.cache.table.enable No true Whether to cache table metadata. Enable this for environments with frequent DDL operations, multiple concurrent users, or complex queries. Disable it when table metadata changes frequently or caching memory is constrained.
odps.cache.table.expire No 86400 How long (in seconds) to retain cached table metadata before deletion. Default is 86,400 seconds (24 hours). Range: any integer >= 0.
odps.cache.table.size No 1000 The maximum number of tables whose metadata is cached.
odps.cache.partition.enable No true Whether to cache partition metadata for the target table. Has no effect on non-partitioned tables.
odps.cache.partition.expire No 86400 How long (in seconds) to retain cached partition metadata before deletion. Default is 86,400 seconds (24 hours). Range: any integer >= 0.
odps.cache.partition.size No 1000 The maximum number of partitioned tables whose partition metadata is cached.
odps.cache.table-name.enable No false Whether to cache the names of all tables in the MaxCompute project.
odps.cache.table-name.expire No 86400 How long (in seconds) to retain cached table names before deletion. Default is 86,400 seconds (24 hours). Range: any integer >= 0.

Example

The following example creates a catalog named odps_catalog for the MaxCompute project mf_mc_bj.

CREATE EXTERNAL CATALOG odps_catalog
PROPERTIES (
    "type"              = "odps",
    "odps.access.id"    = "<yourAccessKeyId>",
    "odps.access.key"   = "<yourAccessKeySecret>",
    "odps.endpoint"     = "http://service.cn-beijing.maxcompute.aliyun.com/api",
    "odps.tunnel.quota" = "pay-as-you-go",
    "odps.project"      = "mf_mc_bj"
);

Replace the placeholder values:

Placeholder Description
<yourAccessKeyId> Your Alibaba Cloud AccessKey ID
<yourAccessKeySecret> Your Alibaba Cloud AccessKey secret

Query MaxCompute data

  1. Switch to the catalog and select the project. Use either of the following methods:

    • Method 1: Switch in two steps.

      SET CATALOG odps_catalog;
      USE mf_mc_bj;
    • Method 2: Switch in one step.

      USE odps_catalog.mf_mc_bj;
  2. Query a table.

    SELECT * FROM src LIMIT 10;

    Expected output:

    +------+-------+
    | key  | value |
    +------+-------+
    | 1    | 1     |
    | 3    | 3     |
    | 2    | 2     |
    | 4    | 100   |
    | 5    | 200   |
    | 6    | 300   |
    | 3    | 400   |
    +------+-------+

Manage catalogs and metadata

View catalogs

List all catalogs in the current StarRocks cluster:

SHOW CATALOGS;

Expected output:

+-----------------+----------+------------------------------------------------------------------+
| Catalog         | Type     | Comment                                                          |
+-----------------+----------+------------------------------------------------------------------+
| default_catalog | Internal | An internal catalog contains this cluster's self-managed tables. |
| odps_catalog    | Odps     | NULL                                                             |
+-----------------+----------+------------------------------------------------------------------+

View the creation statement for a specific catalog:

SHOW CREATE CATALOG odps_catalog;

View the schema of a MaxCompute table

View the column definitions of a table:

-- Syntax
DESC[RIBE] <catalog_name>.<database_name>.<table_name>;

-- Example
DESC odps_catalog.mf_mc_bj.src;

Expected output:

+-------+------+------+-------+---------+-------+
| Field | Type | Null | Key   | Default | Extra |
+-------+------+------+-------+---------+-------+
| id    | INT  | Yes  | false | NULL    |       |
| a     | INT  | Yes  | false | NULL    |       |
+-------+------+------+-------+---------+-------+

View the schema and creation statement of a table:

-- Syntax
SHOW CREATE TABLE <catalog_name>.<database_name>.<table_name>;

-- Example
SHOW CREATE TABLE odps_catalog.mf_mc_bj.src;

Delete a catalog

DROP CATALOG odps_catalog;

Collect statistics for the cost-based optimizer (CBO)

StarRocks does not automatically collect statistics for MaxCompute tables. Run the following command to trigger manual collection:

-- Replace <table_name> with the name of the MaxCompute table.
ANALYZE TABLE <table_name>;

The cost-based optimizer (CBO) uses statistical information to estimate execution costs and select the most efficient query plan. For details, see Gather statistics for CBO.

Refresh cached metadata

By default, StarRocks caches MaxCompute metadata to improve query performance. If you change a table schema or update data, run the following command to force a metadata refresh:

-- Replace <table_name> with the name of the MaxCompute table.
REFRESH EXTERNAL TABLE <table_name>;

Troubleshooting

"Your slot quota is exceeded" error

This error occurs when the number of concurrent read slots exceeds the quota of your Data Transmission Service resource group. When StarRocks reads data, it splits the table into shards and distributes them to backend (BE) nodes, which read data in parallel. The number of occupied slots is approximately Max(Number of shards, Number of BEs × Concurrency of the BE thread pool).

To resolve this, try one or more of the following:

  • Purchase an exclusive resource group for Data Transmission Service that supports higher concurrency. See Purchase and use an exclusive resource group for Data Transmission Service.

  • Reduce the number of shards by adjusting ScanParams. If you use the row_offset sharding policy, increase the value of odps.split.row.count to merge smaller shards.

  • Modify the number of threads in the StarRocks BE thread pool by changing scanner_thread_pool_thread_num or scanner_thread_pool_queue_size. See StarRocks configuration.

Appendix: Data type mappings

MaxCompute type StarRocks type
BOOLEAN BOOLEAN
TINYINT TINYINT
SMALLINT SMALLINT
INT INT
BIGINT BIGINT
FLOAT FLOAT
DOUBLE DOUBLE
DECIMAL(p, s) DECIMAL(p, s)
STRING VARCHAR(1073741824)
VARCHAR(n) VARCHAR(n)
CHAR(n) CHAR(n)
JSON VARCHAR(1073741824)
BINARY VARBINARY
DATE DATE
DATETIME DATETIME
TIMESTAMP DATETIME (precision loss: StarRocks does not support TIMESTAMP and converts it to DATETIME)
ARRAY ARRAY
MAP MAP
STRUCT STRUCT