All Products
Search
Document Center

Data Lake Analytics - Deprecated:String functions and operators

Last Updated:May 22, 2020

Notes:

  • The following string functions only processes valid UTF-8 encoded characters, but does not check the validity of the encoded characters. The functions may return incorrect results for invalid UTF-8 characters or strings.

  • The lower() and upper() functions converts lowercase and uppercase letters, and returns incorrect results for non-Latin language inputs, such as Chinese.

chr(n) → varchar

Converts the Unicode code point n into a single character string.

concat(string1, …, stringN) → varchar

Returns the concatenation of string1, string2, …, stringN. This function has the same functionality as the SQL-standard concatenation operator (||).

Left/Right(str,len)

Returns the leftmost/rightmost len characters from str.

length(string) → bigint

Returns the length of string in characters.

lower(string) → varchar

Converts string to lowercase format.

lpad(string, size, padstring) → varchar

Pads the left-side of a string to the length indicated by size with padstring. If size is shorter than the length of string, the result is truncated to size characters. The size needs to be non-negative and padstring must be non-empty.

ltrim(string) → varchar

Removes leading white space from string.

replace(string, search) → varchar

Removes all incidents of search from string.

replace(string, search, replace) → varchar

Replaces all incidents of search with replace in string.

reverse(string) → varchar

Returns string with characters in the reverse order.

rpad(string, size, padstring) → varchar

Pads the right-side of a string to the length designated by size with padstring. If size is shorter than the length of string, the result is truncated to size characters. size must not be negative and padstring must be non-empty.

rtrim(string) → varchar

Removes white space behind the string.

split(string, delimiter) → array<varchar>

Splits string by the delimiter and returns an array.

split(string, delimiter, limit) → array<varchar>

Splits string by the delimiter and returns an array of the size specified by the limit. The last element in the array contains everything remaining in the string. The limit must be a positive number.

split_part(string, delimiter, index) → varchar

Splits string by the delimiter and returns a field index. Field indexes start from 1. If the index is larger than the number of fields, this function returns null.

split_to_map(string, entryDelimiter, keyValueDelimiter) → map<varchar, varchar>

Splits string by entryDelimiter and keyValueDelimiter and returns a map. entryDelimiter splits string into key-value pairs. keyValueDelimiter splits each pair into key and value.

strpos(string, substring) → bigint

Returns the starting position of the first substring in a string. Positions start at 1. If no position is found, 0 is returned.

substr(string, start) → varchar

Returns the remaining string from the starting position start. Positions start at 1. If start is a negative number, the starting position is counted from the end of the string.

substr(string, start, length) → varchar

Returns a substring from string based on the length specified from the starting position start. Positions start at 1. If start is a negative number, the starting position is counted from the end of the string.

trim(string) → varchar

Removes whitespaces in front of or behind the string.

upper(string) → varchar

Converts string to uppercase format.

uuid()

Returns a string that is unique in the current cluster. The algorithm implementation is similar to the MongoDB ObjectID.

Unicode functions

normalize(string) → varchar

Performs Unicode normalization on string by NFC.

Note:

  • This SQL-standard function contains special syntax in which the form must be specified as a keyword.

to_utf8(string) → varbinary

Encodes string into a UTF-8 varbinary representation.

from_utf8(binary) → varchar

Decodes a UTF-8 encoded string from binary. Note invalid UTF-8 characters are replaced with the Unicode replacement character U+FFFD.

from_utf8(binary, replace) → varchar

Decodes a UTF-8 encoded string from binary. Invalid UTF-8 sequences are replaced with replace, which must either be a single character or empty (invalid characters are removed).

substring(string, start) → varchar

The alias for substr() function.

substring(string, start, length) → varchar

The alias for substr().

MySQL functions

These functions provide compatibility with MySQL.

ASCII

Returns the ASCII code of the first character of the str string.

Syntax:

ASCII(str)

Example:

  1. SELECT ASCII('2');
  2. SELECT ASCII(2);
  3. SELECT ASCII('dx');

BIN

Returns the string representation of the binary value of `N .

Syntax:

BIN(N)

Example:

  1. SELECT BIN(12);

CHAR

Returns the character value of the input integer value based on the ASCII table.

Syntax:

CHAR(N,...)

Example:

  1. SELECT CHAR(77,121,83,81,76);

CHAR_LENGTH

Returns the length of the string str in characters. Multibyte characters are counted as one character.

Syntax:

CHAR_LENGTH(str)

Example:

  1. SELECT CHAR_LENGTH('111');

CHARACTER_LENGTH

Returns the string length of str. A multibyte character is counted as a single character.

Syntax:

CHARACTER_LENGTH(str)

Example:

  1. SELECT CHARACTER_LENGTH('111');

EXPORT_SET

Returns a string for every bit set in the value bits. An on string is returned for each bit set in the field bits, but an off string is returned for each bit not set in the field bits.Syntax:

EXPORT_SET(bits,on,off,separator,number_of_bits)

Example:

  1. SELECT EXPORT_SET(5,'Y','N',',',4);
  2. SELECT EXPORT_SET(6,'1','0',',',10);

FIND_IN_SET

A value ranging from 1 to N is returned if the specified string str is in the table strlist comprising N substrings.

Syntax:

FIND_IN_SET(str,strlist)

Example:

  1. SELECT FIND_IN_SET('b','a,b,c,d');

FORMAT

Formats X into format ‘#,###,###.##’, rounded to the number of decimal places specified by D, and returns the result as a string.

Syntax:

FORMAT(X,D[,locale])

Example:

  1. SELECT FORMAT(12332.123456, 4);
  2. SELECT FORMAT(12332.1, 4);
  3. SELECT FORMAT(12332.2,0);
  4. SELECT FORMAT(12332.2,2,'de_DE');

FROM_BASE64

Returns a Base64-encoded string str as a binary string.

Syntax:

FROM_BASE64(str)

Example:

  1. select from_base64(TO_BASE64('abc'));
  2. +-------------------------------------+
  3. | from_base64(TO_BASE64('abc')) |
  4. +-------------------------------------+
  5. | 61 62 63 |

HEX

If the specified N is a numeric value, the returned hexadecimal string value N is a longlong (BIGINT) number. If the specified str is a string, the str value is returned as a hexadecimal string where each character in str is converted into two hexadecimal digits.

Syntax:

HEX(str) , HEX(N)

Example:

  1. select hex(16);
  2. +---------+
  3. | hex(16) |
  4. +---------+
  5. | 10 |
  1. select hex('16');
  2. +-----------+
  3. | hex('16') |
  4. +-----------+
  5. | 3136 |

PRESTO_INSERT

Replaces the specified start position and substring length in the target string with the specified string.

Syntax:

PRESTO_INSERT(str,pos,len,newstr)

Example:

  1. SELECT PRESTO_INSERT('Quadratic', 3, 4, 'What');
  2. SELECT PRESTO_INSERT('Quadratic', -1, 4, 'What');
  3. SELECT PRESTO_INSERT('Quadratic', 3, 100, 'What');

INSTR

Returns the position in which substring substr appears in string str for the first time.

Syntax:

INSTR(str,substr)

Example:

SELECT INSTR('foobarbar', 'bar'); SELECT INSTR('xbar', 'foobar');

LCASE

Converts all characters of str into lowercase and returns as a string.

Syntax:

LCASE(str)

Example:

  1. SELECT LCASE('FoOBAr');
  2. SELECT Lower('FoOBAr');

Ucase

Converts all characters of str into uppercase and returns as a string. Ucase () is a synonym for Upper.

UPPER

Converts all characters of str into uppercase and returns as a string.

Syntax:

UPPER(str)

Example:

  1. SELECT UPPER('FoOBAr');
  2. SELECT UCASE('FoOBAr');

LEFT

Returns the characters from the string str from the left len. Returns NULL if any parameter is NULL.

Syntax:

LEFT(str,len)

Example:

  1. SELECT LEFT('foobarbar', 5);

LENGTH

Returns the length of the str string in bytes.

Syntax:

LENGTH(str)

Example:

  1. SELECT LENGTH('text');

LOCATE

Returns the position for which the substring substr appears in string str for the first time.

Syntax:

LOCATE(substr,str)

Example:

  1. SELECT LOCATE('bar', 'foobarbar');
  2. SELECT LOCATE('xbar', 'foobar');
  3. SELECT LOCATE('xbar', 'foobar');

LPAD

Returns the string str that is left-padded with the string padstr to the specified length of len characters. If str is longer than len, the return value is reduced to len characters.

Syntax:

LPAD(str,len,padstr)

Example:

  1. SELECT LPAD('hi',4,'??');
  2. SELECT LPAD('hi',1,'??');

LTRIM

Returns the string str without leading space characters.

Syntax:

LTRIM(str)

Example:

  1. SELECT LTRIM(' foOBarBaR');

SUBSTRING

If a len argument exists, returns a substring with the length ‘len’ from string str starting from the position pos. In the absence of a len argument, returns a substring str starting from the position pos.

Syntax:

SUBSTRING(str,pos) , SUBSTRING(str,pos,len) , SUBSTRING(str FROM pos FOR len)

Example:

  1. SELECT SUBSTRING('Quadratically',5);
  2. SELECT SUBSTRING('Quadratically',5,6);
  3. SELECT SUBSTRING('Sakila', -3);
  4. SELECT SUBSTRING('Sakila', -5, 3);
  5. SELECT SUBSTRING('Sakila' FROM -4 FOR 2);
  6. SELECT SUBSTR('Quadratically',5);
  7. SELECT SUBSTR('Quadratically',5,6);
  8. SELECT SUBSTR('Sakila', -3);
  9. SELECT SUBSTR('Sakila', -5, 3)

MID

This function is a synonym for SUBSTRING (str, pos, len).

Syntax:

MID(str,pos,len)

Example:

  1. SELECT mid('Quadratically',5,6);

OCT

Returns the octal value of N in a string. N is a longlong (BIGINT) number.

Syntax:

OCT(N)

Example:

  1. SELECT OCT(12);

ORD

If the first character on the left side of str is a multibyte character, calculates, and returns the code value for this character.

Syntax:

ORD(str)

Example:

  1. SELECT ORD('2');

REPEAT

Returns a string comprising the string str repeated count times.

Syntax:

REPEAT(str,count)

Example:

  1. SELECT REPEAT('MySQL', 3);

REPLACE

Returns the string str where all existing from_str are replaced with with to_str.

Syntax:

REPLACE(str,from_str,to_str)

Example:

  1. SELECT REPLACE('www.mysql.com', 'w', 'Ww');

REVERSE

Returns all characters of str in reverse order.

Syntax:

REVERSE(str)

Example:

  1. SELECT REVERSE('abc');

RIGHT

Returns the last character on the right of len from the string str. This function returns NULL if a parameter value is NULL.

Syntax:

RIGHT(str,len)

Example:

  1. SELECT RIGHT('foobarbar', 4);

RPAD

Returns the string str, right-padded with the string padstr to a length specified by len characters. If str is longer than len, the return value is reduced to len characters.

Syntax:

RPAD(str,len,padstr)

Example:

  1. SELECT RPAD('hi',5,'?');
  2. SELECT RPAD('hi',1,'?');

RTRIM

Returns the string str without trailing space characters.

Syntax:

RTRIM(str)

Example:

  1. SELECT RTRIM('barbar ');

SOUNDEX

Returns the soundex string from str.

Syntax:

SOUNDEX(str)

Example:

  1. SELECT SOUNDEX('Hello');
  2. SELECT SOUNDEX('Quadratically');

SPACE

Returns a string containing N space characters.

Syntax:

SPACE(N)

Example:

  1. SELECT SPACE(6);

SUBSTRING_INDEX

Returns the substring from string str before the number of count incidents of the delimiter delim. If count is positive, everything left of the last delimiter (counting from the left) is returned. If count is negative, everything right of the last delimiter (counting from the right) is returned. This function is case-sensitive when searching for delimiter results delim.

Syntax:

SUBSTRING_INDEX(str,delim,count)

Example:

  1. SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
  2. SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);

TRIM

Returns the string str without remstr prefixes or suffixes. Assumes BOTH when none of the specifiers BOTH, LEADING, or TRAILING are given. remstr is optional and removes spaces unless specified.

Syntax:

TRIM(str)

Example:

  1. SELECT TRIM(' bar ');

ip2region

Returns the region of the IP address parameter ip_address. level can be defined as COUNTRY, AREA, PROVINCE, CITY, or ISP.

Syntax:

ip2region(ip_address, level)

Example:

  1. SELECT ip2region('101.105.35.57', 'COUNTRY');
  2. SELECT ip2region('101.105.35.57', 'Province');
  3. SELECT ip2region('101.105.35.57', 'CITY');

Oracle string functions

INITCAP

Returns a string where the first letter of each word is in uppercase while all other letters are in lowercase.

Syntax:

INITCAP(char)

Example:

  1. SELECT INITCAP('the soap')
  2. SELECT INITCAP('the,soap')

INSTR

Searches for the specified character in the first string and returns the position of the specified character as an integer. This function searches the string for substring. position is a nonzero integer that designates the start of the search, while occurrence is an integer that designates the incident in the string that Oracle should search for. Both position and occurrence have a default value of 1.

Syntax:

INSTR(string, substring) | INSTR(string, substring, position) | INSTR(string, substring, position, occurence)

Example:

  1. SELECT INSTR('CORPORATE FLOOR','OR');
  2. SELECT INSTR('CORPORATE FLOOR','OR', 3);
  3. SELECT INSTR('CORPORATE FLOOR','OR', 3, 2);

LPAD

Pads the left-side of a string with pad-exp to the specified character length length. Duplicate the string if the pad-exp length is shorter than the character length.

Syntax:

LPAD (text-exp , length) | LPAD (text-exp , length, pad-exp)

Example:

  1. SELECT LPAD('HELLO', 5);
  2. SELECT LPAD('HELLO', 5, 'A');

LTRIM

Deletes the character char in a set as counted from the left. By default, char is a single blank.

Syntax:

LTRIM(char) | LTRIM(char, set)

Example:

  1. SELECT LTRIM(' WWW.TTTT');
  2. SELECT LTRIM('WWW.TTTT','W');

REGEXP_COUNT

Searches for the frequency of pattern in the source string source_char and returns the number of incidents. The returned value is an integer. If no results are found, 0 is returned.

Syntax:

REGEXP_COUNT(source_char, pattern)

Example:

  1. SELECT REGEXP_COUNT('rat cat\nbat dog', '.at');

REGEXP_SUBSTR

Expands the SUBSTR function by allowing you to search the string source_char for the regular expression pattern. position is a positive integer that designates the start position of the search, and has a default value of 1.

Syntax:

REGEXP_SUBSTR(source_char, pattern) | REGEXP_SUBSTR(source_char, pattern, position)

Example:

  1. REGEXP_SUBSTR('Hello world bye', '\\b[a-z]([a-z]*)');
  2. REGEXP_SUBSTR('Hello world bye', '\\b[a-z]([a-z]*)', 1);

REPLACE

Returns char with every appearance of search_string replaced withreplacement_string.Replaces all substrings search_string in char with replacement_string. If replacement_string is not specified, then all search_string in char are deleted. REPLACE is a subset of the functions provided by TRANSLATE.

Syntax:

REPLACE(char, search_string, replacement_string) | REPLACE(char, search_string)

Example:

  1. SELECT REPLACE('JACK and JUE','J','BL');
  2. SELECT REPLACE('JACK and JUE','J');

RPAD

Returns expr1 right-padded to specified length n characters with expr2, replicated to the number of times required. If the length of expr2 is shorter than that of the character n, you must duplicate the string.

Syntax:

RPAD (expr1, n, expr2) | RPAD (expr1, n)

Example:

  1. SELECT RPAD('HELLO', 5);
  2. SELECT RPAD('HELLO', 5, 'A');

RTRIM

Deletes all characters char that are in the set counting from the right. By default, char is a single space.

Syntax:

RTRIM(char) | RTRIM(char, set)

Example:

  1. SELECT RTRIM('WWW.TTTT ');
  2. SELECT RTRIM('WWW.TTTT', 'T');

TRANSLATE

Returns expr where all incidents of each character in from_string are replaced by the corresponding character in to_string.

Syntax:

TRANSLATE(expr, from_string, to_string)

Example:

  1. SELECT TRANSLATE('acbd','ab','AB');
  2. SELECT TRANSLATE('acbd','abc','A');
  3. SELECT TRANSLATE('acbd','abc','');