Dataphin currently utilizes MaxCompute 2.0 as its compute engine. This topic outlines the data types that MaxCompute 2.0 supports within Dataphin.
Scenarios
The MaxCompute V2.0 data type edition is suitable for scenarios in which your project does not contain data generated before April 2020 and depends on components that support the MaxCompute V2.0 data type edition.
Basic data types
Type | Constant example | Description |
TINYINT | 1Y and -127Y | An 8-bit signed integer type. Valid values range from -128 to 127. |
SMALLINT | 32767S and -100S | A 16-bit signed integer type. Valid values range from -32768 to 32767. |
INT | 1000 and -15645787 | A 32-bit signed integer type. Valid values range from -231 to 231-1. |
BIGINT | 100000000000L and -1L | A 64-bit signed integer type. Valid values range from -263+1 to 263 -1. |
BINARY |
| A binary number with a maximum length of 8 MB. Note
|
FLOAT | 3.14F and cast(3.14159261E+7 as float) | A 32-bit binary floating-point type. Note FLOAT data type calculations may result in precision loss due to computer storage and internal computing logic. For high precision requirements, consider converting FLOAT data to DECIMAL data. |
DOUBLE | 3.14D and 3.14159261E+7 | A 64-bit binary floating-point type. Note DOUBLE data type calculations may result in precision loss due to computer storage and internal computing logic. For high precision requirements, consider converting DOUBLE data to DECIMAL data. |
DECIMAL(precision,scale) | 3.5BD and 99999999999.9999999BD | A precise numeric type based on the decimal system.
The default is Note
|
VARCHAR(n) | None | A variable-length character type where n specifies the length. Valid values: 1 to 65535. |
CHAR(n) | None | The fixed-length character type is defined by n, which specifies its length, with a maximum of 255 characters. When the length is insufficient, spaces are added for padding; however, these spaces are not considered during comparisons. |
STRING | "abc", 'bcd', "alibaba", and 'inc' | The string type with a maximum length of 8 MB. |
DATE | DATE'2017-11-11' | The date type formatted as Valid values span from 0001-01-01 to 9999-12-31. |
DATETIME | DATETIME'2017-11-11 00:00:00' | The DATETIME type. Valid values range from 0001-01-01 00:00:00.000 to 9999-12-31 23:59:59.999, accurate to the millisecond. |
TIMESTAMP | TIMESTAMP'2017-11-11 00:00:00.123456789' | The TIMESTAMP data type. Valid values span from 0001-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999, accurate to the nanosecond. Note The timestamp is time zone agnostic. It stores a date offset from Epoch (UTC 1970-01-01 00:00:00) that is consistent across all time zones. TIMESTAMP data can be manipulated for time zone calculations using built-in functions, such as converting TIMESTAMP to STRING data based on the current time zone with |
TIMESTAMP_NTZ | TIMESTAMP_NTZ '2017-11-11 00:00:00.123456789' | A TIMESTAMP data type that does not consider time zones. Valid values range from 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999. For more details on the TIMESTAMP_NTZ data type, see the MaxCompute TIMESTAMP_NTZ data type documentation. |
BOOLEAN | True and False | The BOOLEAN type. Valid values: True and False. |
Please consider the following points:
All the preceding data types support NULL values.
The INT keyword in an SQL statement refers to the 32-bit integer type.
-- Convert a into a 32-bit integer. CAST(a AS INT)By default, an integer constant is processed as the INT type. For example, integer constant 1 in
SELECT 1 + a;is processed as the INT type. If a constant exceeds the value range of the INT type but does not exceed the value range of the BIGINT type, the constant is processed as the BIGINT type. If the constant exceeds the value range of the BIGINT type, the constant is processed as the DOUBLE type.Implicit conversions
Some implicit conversions are disabled. If the data type is converted from STRING to BIGINT, from STRING to DATETIME, from DOUBLE to BIGINT, from DECIMAL to DOUBLE, or from DECIMAL to BIGINT, precision may be reduced, or errors may occur. You can use the CAST function to force the data type conversions.
Constants of the VARCHAR type can be implicitly converted into the STRING type.
Tables, built-in functions, and user-defined functions (UDFs)
Built-in functions that require the MaxCompute V2.0 data type edition can be run.
The data types defined in UDFs are parsed and overloaded based on the MaxCompute V2.0 data type edition.
The data type of a partition key column can be STRING, VARCHAR, CHAR, TINYINT, SMALLINT, INT, or BIGINT.
Constants of the STRING type support concatenation. Two or more strings are automatically concatenated into one string. For example, if you execute the
select 'abc' 'efg' 'ddt';, the return value isabcefgddt.If a constant is inserted into a field of the DECIMAL type, the expression of the constant must conform to the format in the constant definition. Example:
3.5BDin the following sample code:INSERT INTO test_tb(a) VALUES (3.5BD)DATETIME values do not include milliseconds. To specify the time format with millisecond precision, use the
-dfpparameter in the Tunnel command, as intunnel upload -dfp 'yyyy-MM-dd HH:mm:ss.SSS'. For more information, see Tunnel command documentation.
Complex data types
| Data type | Definition | Constructor |
| ARRAY |
|
|
| MAP |
|
|
| STRUCT |
|
|