The MaxCompute V2.0 data type edition expands the set of supported SQL types and changes how certain DML statements and functions behave. Use it for projects that do not contain data generated before April 2020 and that depend on components compatible with this edition.
Enable the edition
Run the following commands at the project level to activate the MaxCompute V2.0 data type edition:
setproject odps.sql.type.system.odps2=true; -- Enable the MaxCompute V2.0 data type edition.
setproject odps.sql.decimal.odps2=true; -- Enable the DECIMAL data type in MaxCompute V2.0.
setproject odps.sql.hive.compatible=false; -- Disable the Hive-compatible data type edition.
Basic data types
All basic data types support NULL values. Valid partition key column types are: STRING, VARCHAR, CHAR, TINYINT, SMALLINT, INT, and BIGINT. Built-in functions that require the MaxCompute V2.0 data type edition can be run. Data types defined in UDFs are parsed and overloaded based on the MaxCompute V2.0 data type edition.
Numeric types
Integer types
| Type | Constant example | Range |
|---|---|---|
| TINYINT | 1Y, -127Y |
8-bit signed integer; –128 to 127 |
| SMALLINT | 32767S, -100S |
16-bit signed integer; –32768 to 32767 |
| INT | 1000, -15645787 |
32-bit signed integer; –2^31 to 2^31 – 1 |
| BIGINT | 100000000000L, -1L |
64-bit signed integer; –2^63 + 1 to 2^63 – 1 |
Usage notes for integer types:
-
The
INTkeyword in SQL always refers to the 32-bit signed integer type. For example,CAST(a AS INT)convertsato a 32-bit integer. -
Integer constants default to INT. If a constant exceeds the INT range but fits within BIGINT, it is treated as BIGINT. If it exceeds BIGINT, it is treated as DOUBLE.
Floating point types
| Type | Constant example | Description |
|---|---|---|
| FLOAT | 3.14F, cast(3.14159261E+7 as float) |
32-bit binary floating point |
| DOUBLE | 3.14D, 3.14159261E+7 |
64-bit binary floating point |
Both FLOAT and DOUBLE are subject to precision loss due to how floating point values are stored and computed. For calculations requiring exact results, use DECIMAL instead.
DECIMAL type
DECIMAL stores exact numeric values based on the decimal system. It takes two parameters:
-
precision: total number of digits. Valid values: 1 to 38.
-
scale: number of digits to the right of the decimal point. Valid values: 0 to 18.
The default expression is DECIMAL(38, 18).
Constants must use the BD suffix. Example:
INSERT INTO test_tb(a) VALUES (3.5BD);
To extend the scale range to 0–38, run:
set odps.sql.decimal2.extended.scale.enable=true;
DECIMAL values from the old and new editions cannot coexist in the same table.
When the Hive-compatible data type edition is enabled (setproject odps.sql.hive.compatible=true;), the following behaviors apply toDecimal(precision,scale)values during Tunnel-based uploads or SQL operations: if the number of digits to the right of the decimal point exceeds the value ofscale, the value is rounded; if the integer part exceeds the limit, no error is reported, but the input data becomes NULL values.
Whenodps.sql.decimal.tostring.trimzerois set totrue(default), trailing zeros after the decimal point are removed when reading from a table. Set it tofalseto retain trailing zeros. This parameter does not affect static values.
String types
| Type | Constant example | Description |
|---|---|---|
| STRING | "abc", 'bcd', "alibaba", 'inc' |
String type. Maximum length: 8 MB. |
| VARCHAR(n) | No default value | Variable-length string. Valid values for n: 1 to 65535. |
| CHAR(n) | No default value | Fixed-length string. Maximum value for n: 255. Strings shorter than n are padded with spaces, but trailing spaces are not included in comparisons. |
Usage notes for string types:
-
STRING constants support concatenation: adjacent string literals are automatically merged. For example,
select 'abc' 'efg' 'ddt';returnsabcefgddt. -
VARCHAR constants can be implicitly converted to STRING.
Date and time types
Choosing between TIMESTAMP and TIMESTAMP_NTZ
| Type | Time zone behavior | Valid range | Accuracy | Constant example |
|---|---|---|---|---|
| TIMESTAMP | Time zone independent. Stores a UTC offset from Epoch (UTC 1970-01-01 00:00:00). Use built-in functions for time zone conversion. | 0001-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999 | Nanosecond | TIMESTAMP'2017-11-11 00:00:00.123456789' |
| TIMESTAMP_NTZ | TIMESTAMP data type independent from time zones. | 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999 | Nanosecond | TIMESTAMP_NTZ '2017-11-11 00:00:00.123456789' |
Use TIMESTAMP when you need to record an absolute point in time and may need to convert between time zones later. Use TIMESTAMP_NTZ when you need to store a local date and time without any time zone context.
Other date and time types
| Type | Constant example | Valid range | Accuracy |
|---|---|---|---|
| DATE | DATE'2017-11-11' |
0001-01-01 to 9999-12-31 | Day (yyyy-mm-dd) |
| DATETIME | DATETIME'2017-11-11 00:00:00' |
0001-01-01 00:00:00.000 to 9999-12-31 23:59:59.999 | Millisecond |
DATETIME values do not include the millisecond component. To upload data with millisecond precision using Tunnel, add -dfp 'yyyy-MM-dd HH:mm:ss.SSS' to the Tunnel command. For details, see Tunnel commands.
INTERVAL type
INTERVAL represents a time period between two dates or times. It includes two subtypes:For more details, see INTERVAL data type.
| Subtype | Constant examples |
|---|---|
| INTERVAL_YEAR_MONTH | INTERVAL '2021' YEAR, INTERVAL '2000-1' YEAR TO MONTH |
| INTERVAL_DAY_TIME | INTERVAL '1' DAY, INTERVAL '-1 23:59:59.999' DAY TO SECOND |
Binary type
BINARY stores raw binary data. Maximum length: 8 MB.
| Constant form | Example | Notes |
|---|---|---|
unhex('<hex_string>') |
unhex('FA34E10293CB42848573A4E39937F479') |
Converts a hexadecimal string to binary |
X'<hex_digits>' |
X'616263' |
Hexadecimal literal (digits: 0–9, A–F) |
X'616263'is semantically equivalent tounhex('616263'). It representsabcbecausea= 0x61,b= 0x62,c= 0x63 in ASCII.
If the hexadecimal string has an odd number of digits,0is prepended. For example,X'616'is equivalent toX'0616'.
Use single quotes', not double quotes".X"616263"is not interpreted as a BINARY constant.
Boolean type
| Type | Valid values |
|---|---|
| BOOLEAN | True, False |
Complex data types
| Type | Type definition example | Construction example |
|---|---|---|
| ARRAY | array<int>, array<struct<a:int, b:string>> |
array(1, 2, 3), array(struct(1, 2), struct(3, 4)) |
| MAP | map<string, string>, map<smallint, array<string>> |
map("k1", "v1", "k2", "v2"), map(1S, array("a", "b"), 2S, array('z', 'y')) |
| STRUCT | struct<x:int, y:int>, struct<field1:bigint, field2:array<int>, field3:map<int, int>> |
named_struct('x', 1, 'y', 2), named_struct('field1', 100L, 'field2', array(1, 2), 'field3', map(1, 100, 2, 200)) |
| JSON | JSON |
JSON '123' |
Usage notes:
Implicit conversion rules
Some implicit conversions are disabled in the MaxCompute V2.0 data type edition because they can reduce precision or cause errors. Use the CAST function for these conversions.
| Source type | Target type | Status | Workaround |
|---|---|---|---|
| STRING | BIGINT | Disabled | CAST(a AS BIGINT) |
| STRING | DATETIME | Disabled | CAST(a AS DATETIME) |
| DOUBLE | BIGINT | Disabled | CAST(a AS BIGINT) |
| DECIMAL | DOUBLE | Disabled | CAST(a AS DOUBLE) |
| DECIMAL | BIGINT | Disabled | CAST(a AS BIGINT) |
| VARCHAR | STRING | Allowed | — |
Differences from other data type editions
DML execution rules
LIMIT in SET operations
The scope of a LIMIT clause differs across editions. Given SELECT * FROM t1 UNION ALL SELECT * FROM t2 LIMIT 10;:
| Edition | Behavior |
|---|---|
| MaxCompute V1.0 | LIMIT applies to t2 only: SELECT * FROM t1 UNION ALL SELECT * FROM (SELECT * FROM t2 LIMIT 10) t2; |
| MaxCompute V2.0 | LIMIT applies to the entire result: SELECT * FROM (SELECT * FROM t1 UNION ALL SELECT * FROM t2) t LIMIT 10; |
This difference also applies to ORDER BY, DISTRIBUTE BY, SORT BY, and CLUSTER BY clauses.
IN expressions
For a IN (1, 2, 3):
| Edition | Behavior |
|---|---|
| MaxCompute V1.0 | All values in the list must be the same type. |
| MaxCompute V2.0 | Values can be of different types if they are implicitly convertible to a common type. |
INSERT conversion rules
| Edition | Behavior |
|---|---|
| Hive-compatible | If a source type can be explicitly converted, MaxCompute automatically inserts the conversion function. |
| MaxCompute V1.0 and V2.0 | The source type must be implicitly convertible to the target type. Otherwise, an error is returned. |
The following operations succeed in Hive-compatible mode but fail in V1.0 and V2.0 mode:
CREATE TABLE t (a BIGINT);
INSERT INTO TABLE t SELECT 1.5;
Function behavior
| Function or operator | Hive-compatible | MaxCompute V1.0 and V2.0 |
|---|---|---|
+, -, *, /, POW — value exceeds type range |
Returns the initial value | Returns an error (in other modes, returns NULL) |
>, >=, =, <, <= — comparing DOUBLE values |
Compares values directly | Treats values as equal if the first 15 digits to the right of the decimal point match |
Bitwise operators: &, |, ^ — return type |
Same type as input | BIGINT |
| LENGTH, LENGTHB, FIND_IN_SET, INSTR, SIZE, HASH, SIGN — return type | INT | BIGINT |
| FLOOR, CEIL — DECIMAL input, return type | DECIMAL | BIGINT |
| FROM_UNIXTIME — return type | STRING | DATETIME |
| CONCAT_WS — NULL input string | NULL is ignored | Returns NULL |
| FIND_IN_SET — empty string input | Treated as a match at the end of the string | Returns 0 (unmatched) |
| REGEXP_EXTRACT, REGEXP_REPLACE — regex schema | Java regular expression specs | MaxCompute specs |
SUBSTR — start_position = 0 |
Returns the same result as start_position = 1 |
Returns NULL |