DataHub: Terms

Last Updated: Nov 13, 2025

Terms

Term

Description

Project

  • A project is the basic organizational unit for data in DataHub and contains multiple topics.

  • DataHub projects are independent of MaxCompute projects.

Topic

A topic is the minimum unit for data subscription and publishing in DataHub. You can use a topic to represent a class or type of streaming data.

time-to-live (TTL) period of a topic

The time-to-live (TTL) period of a topic specifies the maximum retention period for data written to the topic. Unit: days. Valid values: 1 to 7.
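As an illustration of the TTL bounds, the following minimal sketch validates a TTL value and computes how long a record would be retained. The function names are hypothetical, and the assumption that retention is counted from the write time is ours; the source does not describe how DataHub expires data internally.

```python
from datetime import datetime, timedelta

MIN_TTL_DAYS = 1  # lower bound stated in the TTL definition
MAX_TTL_DAYS = 7  # upper bound stated in the TTL definition


def validate_ttl_days(ttl_days: int) -> int:
    """Return ttl_days unchanged if it is a whole number of days in [1, 7]."""
    if isinstance(ttl_days, bool) or not isinstance(ttl_days, int):
        raise TypeError("TTL must be an integer number of days")
    if not MIN_TTL_DAYS <= ttl_days <= MAX_TTL_DAYS:
        raise ValueError(
            f"TTL must be between {MIN_TTL_DAYS} and {MAX_TTL_DAYS} days"
        )
    return ttl_days


def retained_until(written_at: datetime, ttl_days: int) -> datetime:
    """Assumed model: a record is retained for ttl_days after it is written."""
    return written_at + timedelta(days=validate_ttl_days(ttl_days))
```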

Shard

A shard is a concurrent channel for data transmission in a topic. Each shard has a unique ID.

An enabled shard consumes server resources. Create shards as needed.

For more information about the different states of a shard, see Shard states.

shard hash key range

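The shard hash key range is a property of a shard that specifies the range of hash key values assigned to the shard. The range is left-closed and right-open: it includes its start value and excludes its end value.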

Data with the same key is written to the same shard.
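The routing rule can be sketched as follows. This is a minimal illustration, not the DataHub implementation: the `Shard` class, the `route` function, the use of MD5 to derive a hash key, and the 128-bit key space are all assumptions made for the example. Only the left-closed, right-open range check and the "same key, same shard" property come from the definitions above.

```python
import hashlib
from dataclasses import dataclass
from typing import Sequence

KEY_SPACE = 1 << 128  # assumed size of the hash key space (128-bit digest)


@dataclass(frozen=True)
class Shard:
    shard_id: str
    begin: int  # inclusive lower bound of the hash key range
    end: int    # exclusive upper bound of the hash key range

    def owns(self, hash_key: int) -> bool:
        # Left-closed, right-open: begin <= hash_key < end.
        return self.begin <= hash_key < self.end


def hash_key_of(partition_key: str) -> int:
    """Assumed hashing scheme: MD5 digest read as a big-endian integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def route(partition_key: str, shards: Sequence[Shard]) -> Shard:
    """Return the shard whose range contains the hash of partition_key."""
    hash_key = hash_key_of(partition_key)
    for shard in shards:
        if shard.owns(hash_key):
            return shard
    raise LookupError("no shard covers this hash key")
```

Because the hash is deterministic and the ranges do not overlap, every call with the same key returns the same shard.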

shard merge

A shard merge is an operation that merges shards with contiguous key ranges into a single shard.

For more information, see Shard operations.

shard split

A shard split is an operation that splits a shard into two shards with contiguous shard key ranges.
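The effect of merge and split operations on hash key ranges can be sketched with plain integer ranges. The helper names are hypothetical, the example merges exactly two ranges, and the split key is assumed to be supplied by the caller; the source does not say how DataHub selects it.

```python
from typing import Tuple

KeyRange = Tuple[int, int]  # (begin, end): begin inclusive, end exclusive


def split_range(key_range: KeyRange, split_key: int) -> Tuple[KeyRange, KeyRange]:
    """Split one range into two contiguous ranges at split_key."""
    begin, end = key_range
    if not begin < split_key < end:
        raise ValueError("split key must fall strictly inside the range")
    return (begin, split_key), (split_key, end)


def merge_ranges(left: KeyRange, right: KeyRange) -> KeyRange:
    """Merge two ranges; they must be contiguous (left ends where right begins)."""
    if left[1] != right[0]:
        raise ValueError("ranges are not contiguous")
    return (left[0], right[1])
```

Splitting a range and then merging the two results returns the original range, which is why the two operations are described in terms of contiguous key ranges.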

Record

A record is the basic unit of data exchanged between a user and DataHub.

record type

The data type of a topic. DataHub supports the Tuple and BLOB types.

  • A topic of the Tuple type supports database-like data records. Each record contains multiple columns.

  • A topic of the BLOB type supports writing only a block of binary data.

For more information about supported data types, see Data types.

Data types

  • Tuple type: Supports writing data of the following data types:

    The TINYINT, SMALLINT, INTEGER, and FLOAT data types in DataHub are supported as of Java SDK V2.16.1-public.

    | Type | Description | Range |
    | --- | --- | --- |
    | BIGINT | 8-byte signed integer | -9223372036854775807 to 9223372036854775807 |
    | DOUBLE | 8-byte double-precision floating-point number | -1.0 × 10^308 to 1.0 × 10^308 |
    | BOOLEAN | Boolean type | True/true/1 or False/false/0 |
    | TIMESTAMP | Timestamp type | A timestamp that is accurate to the microsecond. |
    | STRING | String. Only UTF-8 encoding is supported. | A single STRING column can be up to 2 MB. |
    | TINYINT | Single-byte integer | -128 to 127 |
    | SMALLINT | Double-byte integer | -32768 to 32767 |
    | INTEGER | 4-byte integer | -2147483648 to 2147483647 |
    | FLOAT | 4-byte single-precision floating-point number | -3.40282347 × 10^38 to 3.40282347 × 10^38 |
    | DECIMAL | Numeric type | -10^38 + 1 to 10^38 - 1 |

  • BLOB type: Supports writing a block of binary data as a record. The data is transmitted using Base64 encoding.
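The integer ranges and the Base64 transmission of BLOB data lend themselves to simple client-side checks. The sketch below is illustrative only: `check_field`, `encode_blob`, and `decode_blob` are hypothetical helpers, not part of any DataHub SDK, and the interpretation of "2 MB" as 2 × 1024 × 1024 bytes of UTF-8 data is an assumption.

```python
import base64

# Integer bounds copied from the Range column of the data type table.
INTEGER_RANGES = {
    "TINYINT": (-128, 127),
    "SMALLINT": (-32768, 32767),
    "INTEGER": (-2147483648, 2147483647),
    "BIGINT": (-9223372036854775807, 9223372036854775807),
}
MAX_STRING_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" means 2 MiB of UTF-8


def check_field(type_name: str, value) -> None:
    """Raise ValueError if value does not fit the named column type."""
    if type_name in INTEGER_RANGES:
        low, high = INTEGER_RANGES[type_name]
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{type_name} expects an integer")
        if not low <= value <= high:
            raise ValueError(f"{value} is outside the {type_name} range")
    elif type_name == "STRING":
        if not isinstance(value, str):
            raise ValueError("STRING expects a str")
        if len(value.encode("utf-8")) > MAX_STRING_BYTES:
            raise ValueError("STRING value exceeds 2 MB")
    else:
        raise ValueError(f"type {type_name} is not covered by this sketch")


def encode_blob(payload: bytes) -> str:
    """Base64-encode a binary payload for transmission as text."""
    return base64.b64encode(payload).decode("ascii")


def decode_blob(text: str) -> bytes:
    """Recover the original binary payload from its Base64 form."""
    return base64.b64decode(text)
```

Validating values before writing surfaces range problems locally instead of as a rejected request.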

Shard states

| State | Description | Read/write support |
| --- | --- | --- |
| Opening | When a topic is created, all its shards are in the Opening state while they are being initialized. | Read and write operations are not supported. |
| Active | After a shard channel is opened, the shard enters the Active state. | Normal read and write operations are supported. |
| Closing | A shard is in the Closing state when it undergoes a split or merge operation. | Read and write operations are not supported. |
| Closed | After a split or merge operation is complete, the shard enters the Closed state. | Read-only. |
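The read/write rules for the four shard states can be captured in a small lookup. This is a sketch for client-side reasoning; the `ShardState` enum and helper functions are hypothetical names, and only the state names and their read/write support are taken from the list above.

```python
from enum import Enum


class ShardState(Enum):
    OPENING = "Opening"
    ACTIVE = "Active"
    CLOSING = "Closing"
    CLOSED = "Closed"


# (readable, writable) for each state, as listed under Shard states.
_ACCESS = {
    ShardState.OPENING: (False, False),
    ShardState.ACTIVE: (True, True),
    ShardState.CLOSING: (False, False),
    ShardState.CLOSED: (True, False),
}


def can_read(state: ShardState) -> bool:
    """True if records can be read from a shard in this state."""
    return _ACCESS[state][0]


def can_write(state: ShardState) -> bool:
    """True if records can be written to a shard in this state."""
    return _ACCESS[state][1]
```

A writer could check `can_write` before sending data to avoid attempting writes to a shard that has been closed by a split or merge.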

Error descriptions

| ErrorCode | HttpCode | Description |
| --- | --- | --- |
| InvalidUriSpec | 400 | The requested URI is invalid. |
| InvalidParameter | 400 | Invalid parameter. For more information, see the returned error message. |
| Unauthorized | 401 | Signature error. |
| NoPermission | 403 | Insufficient account permissions. |
| InvalidSchema | 400 | Invalid schema format. |
| InvalidCursor | 400 | The cursor is invalid or has expired. |
| NoSuchProject | 404 | The requested project does not exist. |
| NoSuchTopic | 404 | The requested topic does not exist. |
| NoSuchShard | 404 | The requested shard ID does not exist. |
| ProjectAlreadyExist | 400 | The project already exists. |
| TopicAlreadyExist | 400 | The topic already exists. |
| InvalidShardOperation | 405 | Invalid shard operation. For example, writing data to a shard after it is closed. |
| LimitExceeded | 400 | The request parameters exceed the limit. For example, the total number of shards exceeds 512. |
| InternalServerError | 500 | An unknown error or internal service exception occurred, or a system upgrade is in progress. |
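A client can use the error codes above to decide how to react to a failed request. The sketch below copies the ErrorCode-to-HttpCode mapping from this section; the retry policy (retry only server-side 5xx failures) and the function names are assumptions for illustration, not documented DataHub behavior.

```python
from typing import Optional

# ErrorCode -> HttpCode, copied from the error descriptions above.
HTTP_CODE_BY_ERROR = {
    "InvalidUriSpec": 400,
    "InvalidParameter": 400,
    "Unauthorized": 401,
    "NoPermission": 403,
    "InvalidSchema": 400,
    "InvalidCursor": 400,
    "NoSuchProject": 404,
    "NoSuchTopic": 404,
    "NoSuchShard": 404,
    "ProjectAlreadyExist": 400,
    "TopicAlreadyExist": 400,
    "InvalidShardOperation": 405,
    "LimitExceeded": 400,
    "InternalServerError": 500,
}


def http_code_of(error_code: str) -> Optional[int]:
    """Return the HTTP status for a known error code, or None if unknown."""
    return HTTP_CODE_BY_ERROR.get(error_code)


def is_retryable(error_code: str) -> bool:
    """Assumed policy: retry only server-side (5xx) failures."""
    http_code = http_code_of(error_code)
    return http_code is not None and http_code >= 500


def is_missing_resource(error_code: str) -> bool:
    """True for errors that report a nonexistent project, topic, or shard."""
    return http_code_of(error_code) == 404
```

Under this policy, client-side errors such as `InvalidParameter` or `NoPermission` are surfaced immediately, because repeating the same request would fail again.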