All Products
Search
Document Center

MaxCompute:StarRocks connector

Last Updated:Nov 24, 2025

A StarRocks connector seamlessly accesses a MaxCompute data source and performs complex SQL queries using an external catalog, without the need to import data or create an external table. This improves data analytics efficiency and reduces operational difficulties and costs. This topic describes how to use a StarRocks connector to access MaxCompute.

Background

StarRocks is a next-generation, high-speed Massively Parallel Processing (MPP) database. StarRocks meets various enterprise-level analysis requirements, such as multi-dimensional analysis, custom report creation, real-time data analytics, and ad hoc data analytics. For more information about StarRocks, see What is StarRocks?. StarRocks supports the data catalog feature. The data catalog feature lets you manage internal and external data within the same system. This lets you easily access and query data from various external data sources. For more information, see Catalog overview.

Prerequisites

Limits

  • You can read data from standard tables, partitioned tables, clustered tables, Delta tables, and materialized views. You cannot read data from foreign tables or logical views.

  • You cannot read JSON data.

  • Only MaxCompute projects that do not have the schema feature enabled are supported. For more information about the schema feature, see Schema-related operations.

Create a MaxCompute catalog

Log on to the StarRocks cluster and create a MaxCompute catalog. For information about external catalogs, see CREATE EXTERNAL CATALOG.

Syntax

CREATE EXTERNAL CATALOG <catalog_name>
[COMMENT <comment>]
PROPERTIES
(
    "type" = "odps",
    CatalogParams,
    ScanParams,
    CachingMetaParams
)

Parameters

  • Common parameters

    Parameter

    Required

    Description

    catalog_name

    Yes

    The name of the MaxCompute catalog. The name must meet the following requirements:

    • The name can contain only letters (a to z or A to Z), digits (0 to 9), or underscores (_) and must start with a letter.

    • The name cannot exceed 1,023 characters in length.

    • The catalog name is case-sensitive.

    type

    Yes

    The data source type. Set this parameter to odps.

    comment

    No

    The description of the MaxCompute catalog.

    CatalogParams

    Yes

    The parameters that are used for StarRocks to connect to MaxCompute.

    ScanParams

    No

    The parameters that are used for StarRocks to connect to MaxCompute file storage.

    CachingMetaParams

    No

    The parameters that define the metadata cache policy.

  • Parameters configurable in CatalogParams

    Parameter

    Required

    Description

    odps.endpoint

    Yes

    The endpoint of the MaxCompute service. You must configure this parameter based on the region and network connectivity mode that you selected when you create the MaxCompute project. For information about the endpoints that correspond to different regions and network connectivity modes, see Endpoints.

    Important

    Only Alibaba Cloud VPC networks are supported.

    odps.project

    Yes

    The name of the MaxCompute project. You can log on to the MaxCompute console and choose Manage Configurations > Projects in the navigation pane on the left to obtain the MaxCompute project name.

    Note

    If your workspace is in standard mode, you must distinguish the project name in the development environment (_dev) from the project name in the production environment when you configure this parameter.

    odps.access.id

    Yes

    The AccessKey ID of your Alibaba Cloud account or RAM user. You can go to the AccessKey Management page to obtain the AccessKey ID.

    odps.access.key

    Yes

    The AccessKey secret that corresponds to the AccessKey ID.

    odps.tunnel.quota

    Yes

    The name of the quota used to access MaxCompute. Two types of resources are supported for accessing MaxCompute: exclusive resource groups for Data Transmission Service (subscription) and open storage (pay-as-you-go). You can specify only one quota name.

    • Subscription: Log on to the MaxCompute console, switch the region in the upper-left corner, and then choose Manage Configurations > Quotas in the navigation pane on the left to view the list of available quotas. For more information, see Computing resources - Quota management.

    • Pay-as-you-go: Log on to the MaxCompute console, switch the region in the upper-left corner, and then choose Manage Configurations > Tenant Property in the navigation pane on the left. Turn on the Storage API Switch switch to grant the required permissions.

  • Parameters configurable in ScanParams

    Parameter

    Required

    Description

    odps.split.policy

    No

    The data sharding policy that is used during data scanning. Valid values:

    • size: Data is sharded by data size. This is the default value. The default size for sharding is 256 MB.

    • row_offset: Data is sharded by the number of rows.

    Note

    The sharding policy significantly affect the data scan concurrency of the compute engine. If the default policy does not meet your business requirements, you can use the row_offset policy together with the odps.split.row.count parameter to make adjustments as needed.

    odps.split.row.count

    No

    The maximum number of rows of data contained in each shard. Default value: 4194304. Value range: values greater than 0.

    Note

    This parameter is required only when you set the odps.split.policy parameter to row_offset.

  • Parameters configurable in CachingMetaParams

    Parameter

    Required

    Description

    odps.cache.table.enable

    No

    Specifies whether to cache table metadata. Valid values:

    • true: caches table metadata. This is the default value. Table metadata caching helps improve metadata retrieval and update speed. Table metadata caching is suitable for scenarios in which you need to frequently perform DDL operations, a large environment that involves multiple users is used, or you want to optimize complex queries.

    • false: does not cache table metadata. The caching resources are limited, and excessive dependence on caching may exhaust memory resources. You can disable table metadata caching in scenarios in which you do not need to frequently query data or table metadata frequently changes.

    odps.cache.table.expire

    No

    The duration after which the cached table metadata is automatically deleted. Unit: seconds (s). Default value: 86400, which is equal to 24 hours. Value range: values greater than or equal to 0.

    odps.cache.table.size

    No

    The number of tables whose metadata needs to be cached. Default value: 1000.

    odps.cache.partition.enable

    No

    Specifies whether to cache the metadata of all partitions in the desired table. Valid values:

    • true: caches the metadata of all partitions in the desired table. This is the default value.

    • false: does not cache the metadata of any partitions in the desired table.

    Note

    The odps.cache.partition.enable parameter does not take effect for non-partitioned tables.

    odps.cache.partition.expire

    No

    The duration after which the cached metadata of all partitions is automatically deleted. Unit: seconds (s). Default value: 86400, which is equal to 24 hours. Value range: values greater than or equal to 0.

    odps.cache.partition.size

    No

    The number of partitioned tables that can be cached. Default value: 1000.

    odps.cache.table-name.enable

    No

    Specifies whether to cache the names of all tables in the MaxCompute project. Valid values:

    • true: caches the names of all tables in the MaxCompute project.

    • false: does not cache the names of all tables in the MaxCompute project. This is the default value.

    odps.cache.table-name.expire

    No

    The duration after which the cached names of tables in the MaxCompute project are automatically deleted. Unit: seconds (s). Default value: 86400, which is equal to 24 hours. Value range: values greater than or equal to 0.

Example

This section provides an example of how to create a MaxCompute catalog named odps_catalog. This example uses the MaxCompute project mf_mc_bj.

-- Create a catalog.
CREATE EXTERNAL CATALOG odps_catalog PROPERTIES(
    "type"="odps",
    "odps.access.id"="<yourAccessKeyId>",
    "odps.access.key"="<yourAccessKeySecret>",
    "odps.endpoint"="http://service.cn-beijing.maxcompute.aliyun.com/api",
    "odps.tunnel.quota"="pay-as-you-go",
    "odps.project"="mf_mc_bj"
);

Access MaxCompute

  1. Switch to the MaxCompute catalog and project that you created. This example uses the MaxCompute catalog odps_catalog and the MaxCompute project mf_mc_bj.

    • Syntax

      • Method 1

        -- Specify a catalog.
        SET CATALOG <catalog_name>;
        -- Specify a MaxCompute project.
        USE <project_name>;
      • Method 2

        USE <catalog_name>.<project_name>;
    • Example

      SET CATALOG odps_catalog;
      USE mf_mc_bj;
  2. Query data. This example uses the src table.

SELECT * FROM src LIMIT 10;

The following result is returned.

+------+-------+
| key  | value |
+------+-------+
| 1    | 1     |
| 3    | 3     |
| 2    | 2     |
| 4    | 100   |
| 5    | 200   |
| 6    | 300   |
| 3    | 400   |
+------+-------+

Other operations

View MaxCompute catalogs

  • Query all catalogs in the current StarRocks cluster.

    SHOW CATALOGS;

    The following result is returned.

    +-----------------+----------+------------------------------------------------------------------+
    | Catalog         | Type     | Comment                                                          |
    +-----------------+----------+------------------------------------------------------------------+
    | default_catalog | Internal | An internal catalog contains this cluster‘s self-managed tables. |
    | odps_catalog    | Odps     | NULL                                                             |
    +-----------------+----------+------------------------------------------------------------------+
  • Query the creation statement for a specific catalog.

    SHOW CREATE CATALOG odps_catalog;

    The following result is returned.

    +--------------+--------------------------------------------------------------------------+
    | Catalog      | Create Catalog                                                           |
    +--------------+--------------------------------------------------------------------------+
    | odps_catalog | CREATE EXTERNAL CATALOG `odps_catalog`
                     PROPERTIES ("odps.endpoint"  =  "http://service.cn-beijing.maxcompute.aliyun.com/api",
                     "odps.access.id"  =  "<yourAccessKeyId>",
                     "odps.access.key"  =  "<yourAccessKeySecret>",
                     "odps.project"  =  "odps_project",
                     "type"  =  "odps"
                    )|
    +-----------------------------------------------------------------------------------------+

View the table schema of a MaxCompute table

  • Query the schema of a MaxCompute table.

    • Syntax

      DESC[RIBE] <catalog_name>.<database_name>.<table_name>;
    • Parameters

      • catalog_name: Required. The name of the catalog.

      • database_name: Required. The name of the MaxCompute project in the catalog.

      • table_name: Required. The name of the table in the MaxCompute project.

    • Example

      DESC odps_catalog.mf_mc_bj.src;

      The following result is returned.

      +-------+------+------+-------+---------+-------+
      | Field | Type | Null | Key   | Default | Extra |
      +-------+------+------+-------+---------+-------+
      | id    | INT  | Yes  | false | NULL    |       |
      | a     | INT  | Yes  | false | NULL    |       |
      +-------+------+------+-------+---------+-------+
  • Query the schema of a MaxCompute table and its creation statement.

    • Syntax

      SHOW CREATE TABLE <catalog_name>.<database_name>.<table_name>;
    • Parameters

      • catalog_name: Required. The name of the catalog.

      • database_name: Required. The name of the MaxCompute project in the catalog.

      • table_name: Required. The name of the table in the MaxCompute project.

    • Example

      SHOW CREATE TABLE odps_catalog.mf_mc_bj.src;

      The following result is returned.

      +-------+--------------------------------------------------------------------------------+
      | Table | Create Table                                                                   |
      +-------+--------------------------------------------------------------------------------+
      | src   | CREATE TABLE `src` (
        `id` int(11) DEFAULT NULL,
        `a` int(11) DEFAULT NULL
      ) |
      +-------+--------------------------------------------------------------------------------+

Delete a MaxCompute catalog

You can run the following command to delete a MaxCompute catalog.

DROP CATALOG odps_catalog;

Collect statistical information for Cost-Based Optimizer (CBO)

StarRocks does not support automatic collection of statistical information for a MaxCompute table. You can run the following command to manually create a task to collect the statistical information.

-- Replace <table_name> with the name of the MaxCompute table.
ANALYZE TABLE <table_name>;
Note

The StarRocks Cost-Based Optimizer (CBO) estimates the execution cost of SQL queries based on various types of statistical information and selects the execution plan with the lowest cost from tens of thousands of potential execution plans. This improves the efficiency and performance of complex queries. For more information, see Gather statistics for CBO.

Manually update the cached metadata

By default, StarRocks caches MaxCompute metadata to improve query performance. If you change a table schema or update data in a table, you can manually update the table's metadata. This ensures that StarRocks retrieves the latest metadata cache as soon as possible.

-- Replace <table_name> with the name of the MaxCompute table.
REFRESH EXTERNAL TABLE <table_name>;

FAQ

Error message "Your slot quota is exceeded" is reported during data reads.

  • Problem analysis

    This error occurs when the data read operation exceeds the quota of the Data Transmission Service. When StarRocks reads data, it splits the table into multiple shards and distributes them to backend (BE) nodes. The BE nodes then read the data concurrently using a thread pool. Therefore, the number of slots occupied during the read operation is typically calculated as follows: Max(Number of shards, Number of BEs × Concurrency of the BE thread pool).

  • Solutions

    • Purchase an exclusive resource group for Data Transmission Service that supports high concurrency. For more information, see Purchase and use an exclusive resource group for Data Transmission Service.

    • Configure the ScanParams parameter to adjust the number of shards. If you use the row_offset sharding policy, you must also increase the value of the odps.split.row.count parameter.

    • Modify the number of threads in the StarRocks BE thread pool by changing the scanner_thread_pool_queue_size and scanner_thread_pool_thread_num parameters. For more information about how to modify these parameters, see What is StarRocks?.

Appendix: Data type mappings

Field data type in MaxCompute

Field data type in StarRocks

BOOLEAN

BOOLEAN

TINYINT

TINYINT

SMALLINT

SMALLINT

INT

INT

BIGINT

BIGINT

FLOAT

FLOAT

DOUBLE

DOUBLE

DECIMAL(p, s)

DECIMAL(p, s)

STRING

VARCHAR(1073741824)

VARCHAR(n)

VARCHAR(n)

CHAR(n)

CHAR(n)

JSON

VARCHAR(1073741824)

BINARY

VARBINARY

DATE

DATE

DATETIME

DATETIME

TIMESTAMP

DATETIME

Important

StarRocks does not support the TIMESTAMP data type. When StarRocks reads data of the TIMESTAMP type, StarRocks converts the data type of the data into DATETIME, which causes a loss in data precision.

ARRAY

ARRAY

MAP

MAP

STRUCT

STRUCT