All Products
Search
Document Center

Basic elements

Last Updated: Jan 07, 2020

Data types

ApsaraDB for OceanBase supports the following data types, which are consistent with the Oracle data types:

  • Numeric

  • Date and time

  • Text

Numeric

Numeric types can be divided into three categories:

  • Integer types: SMALLINT, INT, and INTEGER.

  • Fixed-point types: DECIMAL, NUMERIC, and NUMBER.

  • Floating-point types: FLOAT, BINARY_FLOAT, and BINARY_DOUBLE.

The floating-point types are not currently supported.

Integer type

You do not need to specify the precision and scale for integer types. The internal implementation is equivalent to NUMBER (38). Therefore, the integer type can only be precise up to integers of 38 digits.

The integer type is of fixed length and exact value. The value range of an integer type depends on the type length. The following table provides the details.

Type Length (bytes) Value range
SMALLINT/INT/INTEGER 1 to 21 [-10^39 + 1 , 10^39 - 1]

Fixed-point type

The fixed-point type has a variable length and an exact value. The range and degree of precision depend on the precision and scale parameters. The precision parameter represents the number of significant digits and the scale parameter represents the maximum number of decimal places. The maximum number of significant digits in the integer part is equal to precision minus scale. The valid value for the precision is [1, 38]. The default value is 38. The valid value for the scale is [-84, 127]. The default value is 0.

For example, NUMBER(5, 2) has three significant digits for the integer part and two significant digits for the fractional part. So the value range of the number is [-999.99, 999.99].

DECIMAL, NUMERIC or other ANSI data types can be represented by using NUMBER.

Date and time

The following table lists the date and time types.

Type Format Lower bound Upper bound Description
DATE YYYY-MM-DD HH24:MI:SS 0001-01-01 00:00:00 9999-12-31 23:59:59 A time that is accurate to seconds (ignoring time zones)
TIMESTAMP [(scale)] YYYY-MM-DD HH24:MI:SS.FF 0001-01-01 00:00:00.000000000 9999-12-31 23:59:59.999999999 A timestamp that is accurate to nanoseconds (ignoring time zones)
TIMESTAMP [(scale)] WITH TIME ZONE YYYY-MM-DD HH24:MI:SS.FF TZR TZD 0001-01-01 00:00:00.000000000 xx 9999-12-31 23:59:59.999999999 xx A timestamp with time zone information
TIMESTAMP [(scale)] WITH LOCAL TIME ZONE YYYY-MM-DD HH24:MI:SS.FF 0001-01-01 00:00:00.000000000 9999-12-31 23:59:59.999999999 A timestamp that uses the time zone specified in SESSION TIMEZONE

The value range and precision of the TIMESTAMP [(scale)], TIMESTAMP [(scale)] WITH TIME ZONE, and TIMESTAMP [(scale)] WITH LOCAL TIME ZONE types are based on the scale parameter. The scale parameter represents the maximum number of significant digits in the fractional part. The maximum value is 9, the minimum value is 0, and the default value is 6.

The format of the time type is controlled by the system variables such as NLS_DATE_FORMAT, NLS_TIMESTAMP_FORMAT, and NLS_TIMESTAMP_TZ_FORMAT.

Text

The following table lists the text types:

Type Length Maximum length (characters) Character set
CHAR (n [char | byte]) Fixed length 2,000 UTF8MB4
VARCHAR2 (n [char | byte]) Variable length 4,000 UTF8MB4
RAW Variable length 2,000 BINARY

You need to specify the length for the CHAR and VARCHAR2 types. The default length is determined by the NLS_LENGTH_SEMANTICS system variable.

The following table provides information about large object types:

Type Length Maximum length (bytes) Character set
CLOB Variable length 48M UTF8MB4
BLOB Variable length 48M BINARY

Expressions

Expressions are generalized notions. Each expression takes several input parameters and returns an output result. An input parameter may be a constant or a single row of data, or multiple rows of data. Expressions can be nested. The input of one expression can be the output of another expression.

Expressions are divided into the following categories:

  • Column references
  • Constants
  • Operators
  • Functions

Example:

  1. SELECT ABS(a + 1)
  2. FROM t1
  3. WHERE a > 0;
  • “a” is the column reference.
  • 0 and 1 are constants.
  • > and + are operators that take in the expressions 0, 1, and a as input.

    Note: > and other comparison operators are conditional expressions in the Oracle mode.

  • ABS is a function that takes in the + expression as input.

Type conversion

ApsaraDB for OceanBase supports explicit type conversion and implicit type conversion.

You can perform explicit type conversion by using the CAST function.

Implicit type conversion occurs in the following scenario: An operation requires a parameter of a specified type, but the actual parameter value does not match the specified type. When this happens, ApsaraDB for OceanBase converts the actual parameter value to the specified type before performing subsequent operations.

Character sets

The following character sets are supported:

  • UTF8MB4: variable-length encoding. The maximum character length is 4 bytes.
  • BINARY: fixed-length encoding. The fixed length is 1 byte.

UTF8 and UTF8MB3 are subsets of UTF8MB4. They are variable-length encodings with a maximum character length of 3 bytes. To support seamless migration, ApsaraDB for OceanBase treats UTF8 syntactically the same as UTF8MB4.

Collations

ApsaraDB for OceanBase supports follows collation rules:

  • UTF8MB4_GENERAL_CI of UTF8MB4

  • UTF8MB4_BIN of UTF8MB4

  • BINARY of BINARY

Data comparison rules

ApsaraDB for OceanBase supports comparing any two or more types of data. The comparison result may be:

  • A non-zero value or TRUE
  • 0 or FALSE
  • NULL

If the data types involved in a comparison are different, ApsaraDB for OceanBase automatically determines a data type for comparison based on internal rules. The data are converted to this data type before the comparison.

If the comparison type is text, you also need to determine a collation for comparison.

Literals

Text literals

A text literal is a string that is enclosed by single quotation marks (') or double quotation marks ("). If you enable the ANSI_QUOTES mode, only single quotation marks (') can be used to enclose strings.

Numeric Literals

Decimal numeric literals are divided into exact values (integer and fixed-point values) and floating-point values. Values can include a decimal point (.) as a decimal separator and prefix a hyphen (-) to represent negative values.

Hexadecimal numeric literals only support integer values that prefixed with 0X, and allow letters from A to F. All letters are not case-sensitive.

Date and time literals

Date and time literals can be text or numeric.

  • You can use either full separators ('2015-07-21 12:34:56.789') or no separator at all ('20150721') in the text format.
  • You can also use TIMESTAMP or DATE to specify the target type, for example, TIMESTAMP'2015-07-21 12: 34: 56.789' in the text format.
  • The numeric format only allows decimal points (.) as the separator for the seconds and microseconds, for example, 20150721123456.789.
  • A period (.) must be used between seconds and microseconds. Other than that, if you require other separators, we recommend that you only use common separators such as hyphens (-), forward slashes (/), and colons (:).

Escape characters

An escape character is a string prefixed with a backslash (\) that invokes an alternative interpretation on the string. Escape characters are case-sensitive. For example, \b represents a backspace, but\B represents B.

The following table lists all escape characters.

Escape character Description
\b A backspace
\f A form feed
\n A line feed
\r A carriage return
\t A tab.
\\ A backslash ().
\' A single quotation mark (‘).
\" A double quotation mark (“)
\_ An underscore (_)
\% A percent sign (%)
\0 NULL
\Z ASCII 26 (Ctrl+Z)

Comments

SQL statements

In normal SQL statements, ApsaraDB for OceanBase supports the following three annotation methods:

  • Begin a comment with --.
  • Begin a comment with /*and end the comment with*/.

Database objects

In a DDL statement, the COMMENT clause can be used to specify comments for a database object.

For example:

COMMENT ON COLUMN t.c1 IS ‘comment on this column’;

COMMENT ON TABLE t IS ‘comment on this table’;