All Products
Search
Document Center

ApsaraDB for SelectDB:Data types

Last Updated:Mar 28, 2026

ApsaraDB for SelectDB supports data types across six categories. Choosing the right type for each column improves storage efficiency, query performance, and data integrity. Run SHOW DATA TYPES; to view all supported types in your instance.

Type categories

CategoryTypes
NumericBOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, FLOAT, DOUBLE, DECIMAL
DateDATE, DATETIME
StringCHAR, VARCHAR, STRING
Semi-structuredARRAY, MAP, STRUCT, JSON, VARIANT
AggregationHLL, BITMAP, QUANTILE_STATE, AGG_STATE
IPIPv4, IPv6

Numeric types

TypeBytesDescription
BOOLEAN1Boolean value. 0 = false, 1 = true.
TINYINT1Signed integer. Range: [-128, 127].
SMALLINT2Signed integer. Range: [-32768, 32767].
INT4Signed integer. Range: [-2147483648, 2147483647].
BIGINT8Signed integer. Range: [-9223372036854775808, 9223372036854775807].
LARGEINT16Signed integer. Range: [-2^127 + 1, 2^127 - 1].
FLOAT4Floating-point number. Range: [-3.4x10^38, 3.4x10^38].
DOUBLE8Floating-point number. Range: [-1.79x10^308, 1.79x10^308].
DECIMAL4/8/16High-precision fixed-point number. See DECIMAL for details.

DECIMAL

Definition format: DECIMAL(M[,D])

ParameterDescriptionRange
MTotal number of significant digits (precision)[1, 38]
DNumber of digits after the decimal point (scale)[0, M]

Storage size is determined by the precision value:

Precision (M)Storage
0 <= M <= 94 bytes
9 < M <= 188 bytes
16 < M <= 3816 bytes

Date types

TypeBytesDescription
DATE16Date only. Range: 0000-01-01 to 9999-12-31. Output format: yyyy-MM-dd.
DATETIME16Date and time. Supports up to microsecond precision. See DATETIME for details.

DATETIME

Definition format: DATETIME([P])

P controls the sub-second precision (number of decimal places):

  • Default: 0 (second-level precision)

  • Range: [0, 6], where 6 gives microsecond precision

Value range: 0000-01-01 00:00:00[.000000] to 9999-12-31 23:59:59[.999999]

Output format: yyyy-MM-dd HH:mm:ss.SSSSSS

String types

TypeBytesDescription
CHARMFixed-length string. Format: CHAR(M). Storage: 1–255 bytes. M is the byte length.
VARCHARVariableVariable-length string. Format: VARCHAR(M). Storage: 1–65533 bytes. Stored in UTF-8 encoding (1 byte per English character, 3 bytes per Chinese character).
STRINGVariableVariable-length string with large capacity. Default minimum: 1,048,576 bytes (1 MB). Maximum: 2,147,483,643 bytes (2 GB).

STRING usage notes:

  • Adjust the default storage size with the string_type_length_soft_limit_bytes parameter.

  • STRING can only be used for Value columns. It cannot be used as a Key column, or as a partitioning or bucketing column.

Semi-structured types

SelectDB provides five semi-structured types. Use the table below to select the right type for your data.

TypeUse when
ARRAYYour data is an ordered list of elements with the same type (for example, a list of scores or tags).
MAPYour data is a set of key-value pairs with consistent key and value types.
STRUCTYour data has a fixed set of named fields with known types (for example, an address or a product record).
JSONYour JSON data has a predictable, stable structure and you need to query specific fields using JSON functions.
VARIANTYour JSON data has a dynamic or evolving structure with varying keys or schemas. VARIANT auto-splits JSON fields into sub-columns for better storage efficiency and query performance.
TypeBytesDescription
ARRAYVariableAn ordered collection of elements of a single type T. Format: ARRAY<T>.
MAPVariableA collection of key-value pairs. Format: MAP<K,V>.
STRUCTVariableA record of named fields, each with its own type. Format: STRUCT<field_name:field_type, ...>.
JSONVariableStored in binary JSON format. Default storage: 1,048,576 bytes (1 MB). Maximum: 2,147,483,643 bytes (2 GB). Access fields using JSON functions.
VARIANTVariableA dynamic type for semi-structured JSON data. Automatically splits JSON fields into sub-columns for improved storage and query performance. Requires kernel 4.0.0+.

ARRAY

Supported element types for T:

BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, FLOAT, DOUBLE, DECIMAL, DATE,
DATEV2, DATETIME, DATETIMEV2, CHAR, VARCHAR, STRING

Limitations:

  • Cannot be used as a Key column.

  • Only supported in tables with the Duplicate or Unique model.

MAP

Supported types for K and V:

BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, FLOAT, DOUBLE, DECIMAL, DECIMALV3, DATE,
DATEV2, DATETIME, DATETIMEV2, CHAR, VARCHAR, STRING

Limitations:

  • Cannot be used as a Key column.

  • Only supported in tables with the Duplicate or Unique model.

STRUCT

  • field_name: Field identifier. Must be unique within the struct.

  • field_type: Data type of the field.

Limitations:

  • Cannot be used as a Key column.

  • Only supported in tables with the Duplicate model.

JSON

  • Default storage size: 1,048,576 bytes (1 MB). Maximum: 2,147,483,643 bytes (2 GB).

  • Adjust the storage size with the jsonb_type_length_soft_limit_bytes parameter.

VARIANT

  • Accepts any valid JSON input and automatically splits JSON fields into sub-columns for improved storage efficiency and query performance.

  • Useful for complex nested structures that may change over time.

  • For details, see VARIANT.

Important

VARIANT is only supported in SelectDB kernel version 4.0.0 and later. If your instance runs an earlier version, upgrade the kernel version.

Aggregation types

Aggregation types pre-compute and store aggregate results during data ingestion. They require dedicated functions for queries — they cannot be queried directly with standard SQL expressions.

TypeDescription
HLLApproximate deduplication using the HyperLogLog (HLL) algorithm. Typical error rate: ~1%, up to 2%.
BITMAPExact deduplication. Suited for UV counting and audience segmentation.
QUANTILE_STATEApproximate quantile computation. Uses explicit storage for small datasets, and the TDigest algorithm for large datasets.
AGG_STATEStores intermediate aggregation state to accelerate aggregation computations.

HLL

HLL provides approximate deduplication with a typical error rate of ~1% (up to 2%).

Limitations and requirements:

  • Cannot be defined as a Key column. The aggregation type must be HLL_UNION.

  • No length or default value is required — storage is managed internally based on data aggregation.

  • Query HLL columns using functions: hll_union_agg, hll_raw_agg, hll_cardinality, hll_hash.

BITMAP

BITMAP provides exact deduplication and is commonly used for UV counting and audience segmentation.

Limitations and requirements:

  • Supported in Aggregate, Unique, and Duplicate tables, but only as non-Key columns. In Aggregate tables, use with the BITMAP_UNION aggregation type.

  • No length or default value is required — storage is managed internally based on data aggregation.

  • Query BITMAP columns using functions: bitmap_union_count, bitmap_union, bitmap_hash, bitmap_hash64.

QUANTILE_STATE

QUANTILE_STATE computes approximate quantiles. During ingestion, values with the same Key are pre-aggregated:

  • If the number of values <= 2048: all values are stored explicitly.

  • If the number of values > 2048: the TDigest algorithm clusters data and stores centroid points.

Limitations and requirements:

  • Cannot be defined as a Key column. The table's aggregation type must be QUANTILE_UNION.

  • No length or default value is required — storage is managed internally based on data aggregation.

  • Query QUANTILE_STATE columns using functions: QUANTILE_PERCENT, QUANTILE_UNION, TO_QUANTILE_STATE.

AGG_STATE

AGG_STATE stores intermediate aggregation state to accelerate aggregation computations. Use it with the state, merge, and union function combinators.

Limitations and requirements:

  • Cannot be used as a Key column.

  • When creating a table, declare the signature of the aggregate function alongside the AGG_STATE column.

  • No length or default value is required — actual storage size depends on the function implementation.

IP types

Important

IPv4 and IPv6 are only supported in SelectDB kernel version 4.0.0 and later. If your instance runs an earlier version, upgrade the kernel version.

TypeBytesDescription
IPv44Stores IPv4 addresses as 4-byte binary values. Use with ipv4_* family functions.
IPv616Stores IPv6 addresses as 16-byte binary values. Use with ipv6_* family functions.