MaxCompute: Overview

Last Updated: Mar 26, 2026

Proxima CE is an offline vector search engine built on the Proxima 2.x kernel, developed by Alibaba DAMO Academy. It runs as MapReduce or Graph jobs inside MaxCompute, reading vector data from MaxCompute tables and writing search results back to MaxCompute tables. Use Proxima CE when you need large-scale batch vector search — including top K retrieval from millions of records, multi-category search, and cluster-sharded index queries — without managing a separate search infrastructure.

What Proxima CE supports

Data types

Data type | Notes
INT8 | -
FLOAT | -
BINARY | Can be converted to INT32 using the binary_to_int parameter. See Optional parameters.

Search methods

Method | Full name | Default
HNSW | Hierarchical Navigable Small World | Yes
SSG | Satellite System Graph | -
HC | Hierarchical Clustering | -
GC | Graph Clustering | -
QC | Quantized Clustering | -
Linear search | - | -

Distance calculation

Three distance methods are available via the distance_method parameter:

  • Squared Euclidean distance

  • Inner product

  • Hamming distance

For details, see Optional parameters.
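As a rough illustration of what the three distance methods compute, here is a minimal Python sketch. The function names are hypothetical and are not part of Proxima CE; the Hamming example assumes binary vectors are packed into non-negative integer words, which may differ from how Proxima CE stores them.

```python
from typing import Sequence


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared per-dimension differences (no square root taken)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of per-dimension products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of differing bits between two vectors of non-negative integer words.

    Assumption: each element packs several binary dimensions into one integer.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

With squared Euclidean and Hamming distance a smaller value means a closer match, while with inner product a larger value means a closer match. Keep that direction in mind when interpreting scores.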

Similarity threshold

Set a similarity threshold using the threshold_score parameter. If the score of a vector exceeds the specified threshold, the system filters that vector out of the results. For details, see Optional parameters.
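The filtering rule can be pictured with a small Python sketch. The function name and the (doc_id, score) row layout are hypothetical, and the sketch assumes that a score exactly equal to the threshold is kept; check Optional parameters for the actual boundary behavior.

```python
from typing import List, Tuple


def filter_by_threshold(
    results: List[Tuple[str, float]], threshold_score: float
) -> List[Tuple[str, float]]:
    """Keep only (doc_id, score) rows whose score does not exceed the threshold.

    Assumption: "exceeds" is a strict comparison, so score == threshold is kept.
    """
    return [(doc_id, score) for doc_id, score in results if score <= threshold_score]
```

For example, with a threshold of 0.5, rows scored 0.2 and 0.5 are kept and a row scored 0.9 is dropped.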

How it works

MaxCompute table (source data)
        │
        ▼
Proxima CE — creates index, runs batch queries
(via MapReduce or Graph jobs)
        │
        ▼
MaxCompute table (search results)

Proxima CE provides built-in executable JAR files to run in MaxCompute. Index files are stored in MaxCompute Volume storage (backed by an OSS external volume) and are reused across query tasks.
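To make the data flow concrete, the following Python sketch mimics the input and output shape of a batch search using exhaustive (linear) search with squared Euclidean distance. It is only a conceptual model under stated assumptions: in-memory dictionaries stand in for MaxCompute tables, the function name is hypothetical, and no index is built, whereas Proxima CE builds an index and runs as distributed jobs.

```python
import heapq
from typing import Dict, Hashable, List, Sequence, Tuple


def batch_top_k(
    docs: Dict[Hashable, Sequence[float]],
    queries: Dict[Hashable, Sequence[float]],
    k: int,
) -> List[Tuple[Hashable, Hashable, float]]:
    """Return (query_id, doc_id, score) rows, k nearest documents per query.

    Scores are squared Euclidean distances, so smaller means closer.
    """
    rows = []
    for query_id, query_vec in queries.items():
        # Score every document against this query (linear scan).
        scored = (
            (sum((x - y) ** 2 for x, y in zip(query_vec, doc_vec)), doc_id)
            for doc_id, doc_vec in docs.items()
        )
        # Keep the k smallest distances, closest first.
        for score, doc_id in heapq.nsmallest(k, scored):
            rows.append((query_id, doc_id, score))
    return rows
```

Each output row pairs one query with one of its nearest documents, which mirrors the idea of writing search results back to a result table.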

Prerequisites

Before you begin, make sure you have:

Required

Recommended

  • Create the external volume before you start. If you skip this step, you must provide role_arn as a required startup parameter, which introduces security risks.

Usage notes

The external volume must be configured with an OSS internal endpoint, for example, oss-cn-beijing-internal.aliyuncs.com. For OSS internal endpoints by region, see Regions and endpoints.
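Before creating the external volume, a quick sanity check on the endpoint string can catch a public endpoint being used by mistake. The sketch below is a hypothetical helper, not part of any Alibaba Cloud SDK, and it assumes internal endpoints follow the oss-<region>-internal.aliyuncs.com pattern seen in the example above.

```python
import re

# Assumed pattern, inferred from the example endpoint:
# oss-<region>-internal.aliyuncs.com
_INTERNAL_ENDPOINT = re.compile(r"^oss-[a-z0-9-]+-internal\.aliyuncs\.com$")


def is_internal_oss_endpoint(endpoint: str) -> bool:
    """Return True if the endpoint looks like an OSS internal endpoint."""
    return _INTERNAL_ENDPOINT.match(endpoint.strip()) is not None
```

A check like this only validates the shape of the string. The authoritative list of endpoints per region is in Regions and endpoints.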

Supported tools

Tool | Supported platforms | Notes
odpscmd | Linux only | JAR files are compiled for Linux. Windows and macOS are not supported.
DataWorks | All platforms | Create ODPS MR nodes and run them with ODPS SQL scripts.

Get started

  1. Install the Proxima CE package — Set up the environment and configure Proxima CE. See Install the Proxima CE package.

  2. Run a vector search — Choose a search scenario from the table below.

Scenario | Key capability | Reference
Basic vector search | Top K retrieval from millions of records | Basic vector search
Multi-category search | Supports different-category query/doc tables and single-query-multiple-category scenarios | Multi-category search
Cluster sharding | Index by cluster shard to reduce compute and accelerate queries | Cluster sharding
Inner product and cosine distance | Inner-product and cosine distance search | Inner product and cosine distance
Converters | Improve performance and reduce index size (retrieval loss varies) | Converters
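One general reason inner product and cosine appear together in the scenario list is that the cosine similarity of two vectors equals the inner product of their L2-normalized forms. The Python sketch below shows that relationship. It is a mathematical illustration with hypothetical function names, not a description of how Proxima CE implements cosine search; see Inner product and cosine distance for the supported procedure.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity computed as the inner product of unit-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))
```

Vectors pointing in the same direction score close to 1 regardless of their length, and orthogonal vectors score 0.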

References

Parameters and kernel modules

Test reports

Feature testing:

Performance testing:

FAQ and troubleshooting