This topic describes the data types supported by MaxCompute V2.0, basic data types and complex data types.
Basic data types
MaxCompute supports various data types. MaxCompute V2.0 supports more basic data types.
setproject odps.sql.type.system.odps2=true;. Otherwise, the error message
xxxx type is not enabled in current modemay be displayed. In addition, you need to pay attention to impacts on the original tasks. For more information, see the precautions for using new data types described in this section.
odps.sql.type.system.odps2for setting new data types supports only lowercase letters.
|Type||New in MaxCompute V2.0||Constant||Description|
|TINYINT||Yes||1Y, -127Y||The 8-bit signed integer type.
Value range: -128 to 127.
|SMALLINT||Yes||32767S, -100S||The 16-bit signed integer type.
Value range: -32768 to 32767.
|INT||Yes||1000, -15645787||The 32-bit signed integer type.
Value range: (-2) 31 to 2 31 -1. See note 1 and note 2.
|BIGINT||No||100000000000L, -1L||The 64-bit signed integer type.
Value range: (-2) 63 +1 to 2 63 -1.
|FLOAT||Yes||N/A||The 32-bit binary float type.|
|DOUBLE||No||3.1415926 1E+7||The 64-bit binary float type.|
|DECIMAL||No||3.5BD, 99999999999.9999999BD||The decimal precision number type. Integer range: (-10) 36 +1 to 10 36 -1. Decimal place: accurate to 10 -18 . See note 5.|
|VARCHAR(n)||Yes||N/A||The variable-length character type. n specifies a length ranging from 1 to 65535. See note 3.|
|STRING||No||"abc", 'bcd', "alibaba", 'inc'||The string type. Maximum length: 8 MB. See note 4.|
|BINARY||Yes||N/A||The binary data type. Maximum length: 8 MB.|
|DATETIME||No||DATETIME '2017-11-11 00:00:00'||The datetime type.
Value range: 0000-01-01 to 9999-12-31, accurate to milliseconds. See note 6.
|TIMESTAMP||Yes||TIMESTAMP '2017-11-11 00:00:00.123456789'||The timestamp type is independent of time zones. Value range: 0000-01-01 to 9999-12-31
23.59:59.999999999, accurate to nanoseconds.
Note For some time zone-related functions, such as
|BOOLEAN||No||True, False||The Boolean type. Valid values: True, False.|
|DATE||Yes||DATE'2017-11-11'||The date type, in the format of yyyy-mm-dd.
Value range: 0000-01-01 to 9999-12-31.
|CHAR[(n)]||Yes||N/A||The fixed-length character type. n specifies the length.
Maximum value: 255. If the length does not reach the specified value, all extra spaces are filled in, but are not involved in the comparison.
- New data types are supported by:
- MaxCompute SQL
- MapReduce of the new version
- SDK 0.27.2-public and later versions, client 0.27.0 and later versions
- Enabling new data types
By default, new data types cannot be used due to historical reasons. To use the new data types, run a flag command (the flag command supports only lowercase letters):
- Session level: When a new data type, such as tinyint, smallint, int, float, varchar,
timestamp, or binary, is used in an SQL statement or a MapReduce task, you must insert
set odps.sql.type.system.odps2=true;before the SQL statement or the MapReduce task, and commit and run it along with the SQL statement or the MapReduce task. When you use MaxCompute Studio to commit SQL statements, the system automatically inserts this set statement. For more information about examples, see Create and view a table.
- Project level: You can enable new data types at the project level. The following section
describes major impacts caused by enabling new data types. Read the description and
proceed with caution. The project owner can run the following command to configure
For more information about setproject, see Other operations.
- Session level: When a new data type, such as tinyint, smallint, int, float, varchar, timestamp, or binary, is used in an SQL statement or a MapReduce task, you must insert
- Major impacts caused by enabling new data types:
- Note 1: changes in the semantics of the INT keyword. After you enable the new data
types, the INT keyword in the SQL statement refers to the 32-bit integer type, for
cast(a as INT)is to convert the a type to the 32-bit integer type. If new data types are not enabled, it is converted to the BIGINT type, which is a 64-bit value. For example, in an SQL statement,
cast(a as INT)is equivalent to
cast(a as BIGINT), or
create table t(a INT)is equivalent to
create table a (BIGINT). To be compatible with the original data types of MaxCompute, if you do not set
odps.sql.type.system.odps2=true;in new data types, the preceding conversion is still performed but an error is prompted, indicating that INT is processed as BIGINT. To prevent any confusion that may result from this conversion, we recommend that you change all INT to BIGINT.
- Changes to the semantics of an integer constant
Use the integer constant 1 in
SELECT 1 + a;as an example.
- If new data types are not enabled, the integer is processed as the BIGINT type. (If the length of the constant exceeds the value range of BIGINT, for example, 1000000000000000000000000, it is processed as the DOUBLE type.)
- If new data types are enabled, the integer constant is first interpreted as 1 of a 32-bit integer of the INT type. If the constant value exceeds the value range of the INT type but does not exceed the value range of the BIGINT type, it is interpreted as the BIGINT type. If the value of the constant exceeds the value range of the BIGINT type, it is interpreted as the DOUBLE type.
- Possible compatibility issues: The INT type may cause inconsistencies in function prototypes during subsequent operations. For example, the actions of peripheral tools and subsequent operations might be altered due to the new-type tables generated after data is written to a disk.
- Changes to the implicit type conversion rules
- If new data types are enabled, some implicit type conversions may be disabled. For example, errors may occur or precision may be reduced when the data type is converted from STRING to BIGINT, from STRING to DATETIME, from DOUBLE to BIGINT, from DECIMAL to DOUBLE, or from DECIMAL to BIGINT. In this case, you can use the CAST function to convert the data type.
- Implicit type conversion has major impacts on function invocations and INSERT statements. For example, the SQL statements can be executed when new data types are not enabled, but an error is reported when new data types are enabled.
- Changes to the supported operations, built-in functions, and UDFs
When new data types are not enabled, some operations and built-in functions that use new data types as input parameters and return values are ignored. These operations and functions can be normally executed only after you enable the new data types. The major impact is as follows:
- Some built-in functions can only be used after new data types are enabled. Such functions include most functions that use the INT type as parameters and subsequently bear BIGINT overload, such as YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND, NANOSECOND, DAYOFMONTH, and WEEKOFYEAR. These functions can be implemented by using the DATEPART built-in function.
- Changes to UDF resolution Assume that a UDF contains two overloads: BIGINT and INT. The BIGINT overload is performed based on the original data types because the INT type cannot be identified. However, the INT overload may be parsed based on the new data types.
- Changes to the resolution of the BIGINT keyword
A single integer constant, such as 123, is classified as the BIGINT type based on the original data types, whereas it is classified as the INT type based on the new data types. Possible compatibility issues: The INT type may cause inconsistencies in function prototypes during subsequent operations. For example, the actions of peripheral tools and subsequent operations might be altered due to new-type tables generated after data is written to a disk.
- Changes to partition column types
- If new data types are not enabled, the partition column type can only be STRING.
- If new data types are enabled, the partition column type can be STRING, VARCHAR, CHAR, TINYINT, SMALLINT, INT, or BIGINT.
- Special processing is used if new data types are not enabled: The partition fields
in INSERT operations are processed as the STRING type. For example, enter:
insert overwrite table t partition (pt = 045) select ... ;. Note that 045 is not enclosed in quotation marks. Therefore, it should be parsed as the integer type, that is, 045 is actually 45. However, if new data types are not enabled, 045 is still processed as 045. If new data types are enabled, the special processing is not performed, that is, 045 is 45.
- Changes to the behavior of the LIMIT statement
SELECT * FROM t1 UNION ALL SELECT * FROM t2 limit 10;statement as an example.
- If new data types are not enabled, this statement is
SELECT * FROM t1 UNION ALL SELECT * FROM ( SELECT * FROM t2 limit 10) t2;.
- If new data types are enabled, this statement is
SELECT * FROM (SELECT * FROM t1 UNION ALL SELECT * FROM t2 ) t limit 10;.
Actions of the UNION, INTERSECT, and EXCEPT, INTERSECT, EXCEPT, LIMIT, ORDER BY, DISTRIBUTE BY, SORT BY, and CLUSTER BY statements also change if new data types are enabled.
- If new data types are not enabled, this statement is
- Changes to the resolution of the IN expression
IN expressions, such as
a in (1, 2, 3):
- If new data types are not enabled, all the values in the parentheses () must be of the same type.
- If new data type are enabled, all the values in the parentheses () are implicitly converted to the same type.
- Note 1: changes in the semantics of the INT keyword. After you enable the new data types, the INT keyword in the SQL statement refers to the 32-bit integer type, for example,
- Note 2: If an INT constant exceeds the INT value range, it is converted into the BIGINT
type. If an INT constant exceeds the BIGINT value range, it is converted into the
DOUBLE type. In earlier MaxCompute versions, all INT types in SQL statements are converted
to BIGINT, as shown in the following example:
create table a_bigint_table(a int); -- The INT in this statement is actually processed as BIGINT. select cast(id as int) from mytable; -- The INT in this statement is actually processed as BIGINT.
- Note 3: VARCHAR constants can be implicitly converted to STRING constants.
- Note 4: STRING constants can be connected. For example,
xyzare parsed as
abcxyz, and different parts can be written on different lines.
- Note 5: When you insert a constant into the DECIMAL field, make sure that it is written
in the same format as that in the constant definition, for example, 3.5BD in the following
insert into test_tb(a) values (3.5BD)
- Note 6: Time values in milliseconds are not displayed. Tunnel commands use -dfp to specify the time format, which can be accurate to milliseconds, for example,
tunnel upload -dfp 'yyyy-MM-dd HH:mm:ss.SSS'. For more information about Tunnel commands, see Tunnel commands.
Complex data types
The following table lists the complex data types supported by MaxCompute V2.0.
|Type||Definition method||Construction method|