All Products
Search
Document Center

IoT Platform:String functions

Last Updated:Dec 01, 2023

This topic describes the syntax, parameters, and examples of string functions supported by the data analysis feature.

Function

Description

CHR

Converts an ASCII code into a character.

CONCAT

Concatenates all specified strings and returns the final string.

GET_JSON_OBJECT

Extracts a specified substring from a JSON string.

INSTR

Returns the position of a specified substring in a string.

BASE64_DECODE

Decodes Base64-encoded data.

LENGTH

Returns the length of a string.

LENGTHB

Returns the length of a string in bytes.

MD5

Returns the MD5 value of a string.

REGEXP_REPLACE

Replaces a substring in a specified position of a string with a specified substring.

SPLIT_PART

Splits a string on a specified delimiter and returns a substring that starts from a specified position to another specified position.

SUBSTR

Returns a substring that has a specified length from a specified position of a string.

TOLOWER

Replaces every uppercase letter in a string with a lowercase letter.

TOUPPER

Replaces every lowercase letter in a string with an uppercase letter.

TO_CHAR

Converts data of the BOOLEAN, BIGINT, DECIMAL, or DOUBLE type to the STRING type.

TRIM

Removes the spaces from the left and right sides of a string.

LTRIM

Removes the spaces from the left side of a string.

RTRIM

Removes the spaces from the right side of a string.

REVERSE

Returns the characters of a string in reverse order.

ASCII

Returns the ASCII code of the first character in a string.

LPAD

Pads the left side of a string specified by a with a string specified by b until the number of characters in the new padded string reaches a value specified by len.

RPAD

Pads the right side of a string specified by a with a string specified by b until the number of characters in the new padded string reaches a value specified by len.

REPLACE

Replaces a specified substring in a string with a new substring.

URL_ENCODE

Encodes the input string into the application/x-www-form-urlencoded MIME format.

URL_DECODE

Converts an input string from the application/x-www-form-urlencoded MIME format into a normal string. This is the inverse function of url_encoding.

ENDSWITH

Checks whether String 1 ends with String 2.

TO_BASE64

Converts binary data to a Base64-encoded string.

CHR

  • Syntax

    STRING CHR(BIGINT ascii)
  • Description

    Converts a specified ASCII code to the corresponding character.

  • Parameters

    ascii: a value of the BIGINT type, which indicates the ASCII code. Value range: 0 to 255. An exception is returned if the input value is out of this range. The input value is implicitly converted to a value of the BIGINT type before computing if it is of the STRING, DOUBLE, or DECIMAL type. If the input value is of another data type, an exception is returned.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

CONCAT

  • Syntax

    STRING CONCAT(STRING a, STRING b...)
  • Description

    Concatenates all specified strings and returns the final string.

  • Parameters

    The values of all parameters are of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

  • Return value type

    A value of the STRING type is returned. NULL is returned if no input parameters are present or an input parameter is NULL.

  • Examples

    CONCAT('ab','c') = 'abc'
    CONCAT() = NULL
    CONCAT('a', null, 'b') = NULL

GET_JSON_OBJECT

  • Syntax

    STRING GET_JSON_OBJECT(STRING json,STRING path)
  • Description

    Extracts a specified string from a standard JSON string based on path. The original data is read each time this function is called. Therefore, repeated calls may waste system resources and increase your cost. To prevent repeated calls, you can call the GET_JSON_OBJECT function together with user-defined table-valued functions (UDTFs) to convert JSON log data.

  • Parameters

    • json: the string in the standard JSON format.

    • path: the string that starts with the dollar sign ($), which describes the path of a JSON object in the JSON string. For more information about path, see LanguageManual UDF.

      • $: indicates the root node.

      • .: indicates a child node.

      • []: [number] indicates the array subscript. The array is represented in the format of key[sub1][sub2][sub3]…….

      • *: indicates the wildcard for []. If this character is used in the path parameter, an entire array is returned. An asterisk (*) cannot be escaped.

  • Return value

    • Returns NULL if json is empty or invalid.

    • Returns the corresponding string if json is valid and the specified path exists.

    • Duplicate keys cannot appear in an object. For example, {a:1, a:0} is not supported. Otherwise, the string may fail to be parsed.

    • Emoji expressions are not supported.

  • Examples

    • Example 1:

      -- JSON data. 
      {"store":
      {"fruit":[{"weight":8,"type":"apple"},{"weight":9,"type":"pear"}],
      "bicycle":{"price":19.95,"color":"red"}
      },
      "email":"amy@only_for_json_udf_test.net",
      "owner":"amy"
      }

      You can execute the following statements to extract data from the preceding JSON string.

      -- The return value is amy. 
          GET_JSON_OBJECT(json, '$.owner');
      
      -- The return value is {"weight":8,"type":"apple"}. 
          GET_JSON_OBJECT(json, '$.store.fruit\[0]');
      
      -- The return value is null. 
          GET_JSON_OBJECT(json, '$.non_exist_key');
    • Example 2:

      GET_JSON_OBJECT('{"array":[["aaaa",1111],["bbbb",2222],["cccc",3333]]}','$.array[1][1]')= "2222"
      
      GET_JSON_OBJECT('{"aaa":"bbb","ccc":{"ddd":"eee","fff":"ggg","hhh":["h0","h1","h2"]},"iii":"jjj"}','$.ccc.hhh[*]') = "["h0","h1","h2"]"
      
      GET_JSON_OBJECT('{"aaa":"bbb","ccc":{"ddd":"eee","fff":"ggg","hhh":["h0","h1","h2"]},"iii":"jjj"}','$.ccc.hhh[1]') = "h1"

INSTR

  • Syntax

    BIGINT INSTR(STRING str1, STRING str2[, BIGINT start_position[, BIGINT nth_appearance]])
  • Description

    Returns the position of substring str2 in string str1.

  • Parameters

    • : a value of the STRING type, which indicates the string in which you search for the substring. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

    • : a value of the STRING type, which indicates the substring you want to search for. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

    • : a value of the BIGINT type. If it is of another data type, an exception is returned. It indicates which character in str1 the search will start with. The default start position is the first character, marked as 1.

    • : a value of the BIGINT type, which must be greater than 0. It indicates the nth time that the substring appears in the string, where n is specified by nth_appearance. An exception is returned if the value of nth_appearance is of another data type or is no greater than 0.

  • Return value

    • A value of the BIGINT type is returned.

    • If str1 is not found in str2, 0 is returned.

    • If an input parameter is set to NULL, NULL is returned.

    • The matching always succeeds if str2 is empty. For example, 1 is returned for instr(‘abc’, ").

  • Examples

    INSTR('Tech on the net', 'e') = 2
    INSTR('Tech on the net', 'e', 1, 1) = 2
    INSTR('Tech on the net', 'e', 1, 2) = 11
    INSTR('Tech on the net', 'e', 1, 3) = 14

BASE64_DECODE

  • Syntax

    base64_decode(STRING str)
  • Description

    Decodes a Base64-encoded string.

  • Parameters

    str: a value of the STRING type, which indicates a string to be split. If the input value is of the BIGINT, DOUBLE, DECIMAL, TIMESTAMP, or DATE type, the value is implicitly converted into the STRING type before calculation. If the input value is not a Base64-encoded string, the original string is returned.

  • Return values

    A value of the STRING type is returned.

LENGTH

  • Syntax

    BIGINT LENGTH(STRING str)
  • Description

    Returns the length of the string str.

  • Parameters

    • : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

    • A value of the BIGINT type is returned. NULL is returned if the input parameter is NULL. -1 is returned if the input parameter is not encoded in UTF-8.

  • Example

    LENGTH('hi! China') = 6

LENGTHB

  • Syntax

    BIGINT LENGTHB(STRING str)
  • Description

    Returns the length of the string str in bytes.

  • Parameters

    • : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

    • A value of the BIGINT type is returned. If the input parameter is set to NULL, NULL is returned.

  • Example

    LENGTHB('hi! China') = 10

MD5

  • Syntax

    STRING MD5(STRING value)
  • Description

    Returns the MD5 value of the input string value.

  • Parameters

    : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

  • Return values

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

REGEXP_REPLACE

  • Syntax

    STRING REGEXP_REPLACE(STRING source, STRING pattern, STRING replace_string[, BIGINT occurrence])
  • Description

    Uses a string specified by replace_STRING to replace the substring that matches a specified pattern at the nth occurrence specified by occurrence in the source string and returns a result.

  • Parameters

    • source: the string to be replaced.

    • pattern: a constant of the STRING type, which indicates the pattern to be matched. An exception is returned if pattern is an empty string.

    • replace_string: a value of the STRING type, which is a string that is substituted for the substring matching the given pattern.

    • occurrence: a constant of the BIGINT type, which must be no less than 0. It indicates that replace_string is substituted for the matching substring at the nth occurrence. The value 0 indicates that all matching substrings are replaced. An exception is returned if its value is of another type or is less than 0. Default value: 0.

  • Return value

    • A value of the STRING type is returned. If the referenced group does not exist, no substring is replaced.

    • If an input parameter is set to NULL, NULL is returned.

    • If replace_string is set to null and a substring matches the given pattern, NULL is returned.

    • If replace_string is set to null but no substring matches the given pattern, the original string is returned.

  • Examples

    regexp_replace("123.456.7890", "([[:digit:]]{3})\\.([[:digit:]]{3})\\.([[:digit:]]{4})",
    "(\\1)\\2-\\3", 0) = "(123)456-7890"
    regexp_replace("abcd", "(.)", "\\1 ", 0) = "a b c d "
    regexp_replace("abcd", "(.)", "\\1 ", 1) = "a bcd"
    regexp_replace("abcd", "(.)", "\\2", 1) = "abcd"
    -- Only one group is defined in the pattern and the group that you want to reference does not exist. 
    -- We recommend that you do not use this function in this manner. The result of referencing a nonexistent group is not defined. 
    regexp_replace("abcd", "(.*)(.)$", "\\2", 0) = "d"
    regexp_replace("abcd", "a", "\\1", 0) = "bcd"
    -- No character class is defined in pattern. \1 references a non-existent character class. 
    -- We recommend that you do not use this function in this manner. The result of referencing a nonexistent group is not defined.

SPLIT_PART

  • Syntax

    STRING SPLIT_PART(STRING str, STRING separator, BIGINT start[, BIGINT end])
  • Description

    Splits a string specified by str on a delimiter specified by separator and returns a substring that starts from a position specified by start to another position specified by end.

  • Parameters

    • : a value of the STRING type, which indicates a string to be split. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

    • separator: a constant of the STRING type, which is a delimiter used for splitting. It can be a character or string. If it is of another data type, an exception is returned.

    • : a constant of the BIGINT type, which must be greater than 0. An exception is returned if its value is not a constant or is of another data type. This parameter specifies the start position of the substring to be returned. The position starts from 1. If end is not configured, the substring in the position specified by start is returned.

    • end: a constant of the BIGINT type, which must be no less than the value of start. This parameter specifies the end position of the segment to be returned. If the value is not a constant or is of another data type, an exception is returned. If end is not configured, the last substring is returned.

  • Return value

    • A value of the STRING type is returned.

    • If you set start to a value that is greater than the number of segments, an empty string is returned. For example, if a string has six segments but you set start to a value that is greater than 6, an empty string is returned.

    • If the value of separator does not exist in a string specified by str and you set start to 1, the string specified by str is returned. If the value of str is an empty string, an empty string is returned.

    • If the value of separator is an empty string, a string specified by str is returned.

    • If you set end to a value that is greater than the number of segments, all segments are returned.

    • If an input parameter is set to NULL, NULL is returned.

  • Examples

    SPLIT_PART('a,b,c,d', ',', 1) = 'a'
    SPLIT_PART('a,b,c,d', ',', 1, 2) = 'a,b'
    SPLIT_PART('a,b,c,d', ',', 10) = ''

SUBSTR

  • Syntax

    STRING SUBSTR(STRING str, BIGINT start_position[, BIGINT length])
  • Description

    Returns a substring that starts from a position specified by start_position in str and has a length specified by length.

  • Parameters

    • str: a value of the STRING type. The input value is implicitly converted to a value of the STRING type before calculation if it is of the BIGINT, DECIMAL, DOUBLE, or DATETIME type. If the input value is of another data type, an exception is returned.

    • start_position: a value of the BIGINT type. Default value: 1. If you set start_position to 0, an empty string is returned. If you set start_position to a negative value, the start position is counted backwards from the last character of the string. For example, -1 indicates the last character, -2 indicates the second-to-last character, and -3 indicates the third-from-last character. If the value is of another data type, an exception is returned.

    • : a value of the BIGINT type, which indicates the length of the substring. Its value is greater than 0. An exception is returned if its value of another data type or is no greater than 0.

  • Return value

    A value of the STRING type is returned. If an input parameter is set to NULL, NULL is returned.

    Note

    If the length is not configured, the substring from a position specified by start_position to the end of a string specified by str is returned.

  • Examples

    SUBSTR("abc", 2) = "bc"
    SUBSTR("abc", 2, 1) = "b"
    SUBSTR("abc",-2,2) = "bc"
    SUBSTR("abc",-3) = "abc"

TOLOWER

  • Syntax

    STRING TOLOWER(STRING source)
  • Description

    Converts the string source to a lowercase string.

  • Parameters

    : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Examples

    tolower("aBcd") = "abcd"
    tolower("HAHACd") = "hahacd"

TOUPPER

  • Syntax

    STRING TOUPPER(STRING source)
  • Description

    Converts a string specified by source to an uppercase string.

  • Parameters

    : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is returned.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Examples

    toupper("aBcd") = "ABCD"
    toupper("HahaCd") = "HAHACD"

TO_CHAR

  • Syntax

    STRING TO_CHAR(BOOLEAN value)
    STRING TO_CHAR(BIGINT value)
    STRING TO_CHAR(DOUBLE value)
    STRING TO_CHAR(DECIMAL value)
  • Description

    Converts data of the BOOLEAN, BIGINT, DECIMAL, or DOUBLE type to the STRING type.

  • Parameters

    : The input value can be of the BOOLEAN, BIGINT, DECIMAL, or DOUBLE type. An exception is returned if its value is of another data type.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Examples

    TO_CHAR(123) = '123'
    TO_CHAR(true) = 'TRUE'
    TO_CHAR(1.23) = '1.23'
    TO_CHAR(null) = NULL

TRIM

  • Syntax

    STRING TRIM(STRING str)
  • Description

    Removes the spaces from the left and right sides of a string specified by str.

  • Parameters

    : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is thrown.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Example

    LTRIM(' abc ')="abc"; 

LTRIM

  • Syntax

    STRING LTRIM(STRING str)
  • Description

    Removes the spaces from the left side of a string specified by str.

  • Parameters

    : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is thrown.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Example

    LTRIM(' abc ')="abc "; 

RTRIM

  • Syntax

    STRING RTRIM(STRING str)
  • Description

    Removes the spaces from the right side of a string specified by str.

  • Parameters

    : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is thrown.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Example

    RTRIM('a abc ')="a abc"; 

REVERSE

  • Syntax

    STRING REVERSE(STRING str)
  • Description

    Returns a string in reverse order.

  • Parameters

    : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is thrown.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Example

    REVERSE('abcedfg')="gfdecba"

ASCII

  • Syntax

    BIGINT ASCII(STRING str)
  • Description

    Returns the ASCII code of the first character in a string specified by str.

  • Parameters

    : a value of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is thrown.

  • Return value

    A value of the BIGINT type is returned.

  • Example

    -- The return value is 97. 
    ASCII('abcde')

LPAD

  • Syntax

    STRING LPAD(STRING a, INT len, STRING b)
  • Description

    Pads the left side of a string specified by a with a string specified by b until the number of characters in the new padded string reachs a value specified by len.

  • Parameters

    • : a value of the INT type.

    • a and b: values of the STRING type.

  • Return value

    A value of the STRING type is returned. If the value of len is less than the number of characters in a string specified by a, a substring of a length specified by len is truncated from the left of the string and is returned. If you set len to 0, NULL is returned.

  • Examples

    lpad('abcdefgh',10,'12')='12abcdefgh'
    lpad('abcdefgh',5,'12')='abcde'
    lpad('abcdefgh',0,'12') -- Return NULL.

RPAD

  • Syntax

    STRING RPAD(STRING a, INT len, STRING b)
  • Description

    Pads the right side of string a with string b until the number of characters in the new padded string reaches a value specified by len.

  • Parameters

    • : a value of the INT type.

    • Parameters such as a and b: values of the STRING type.

  • Return values

    A value of the STRING type is returned. If the value of len is less than the number of characters in a string specified by a, a substring of a length specified by len is truncated from the left of the string and is returned.len. If the value of len is 0, NULL is returned.

  • Examples

    rpad('abcdefgh',10,'12')='abcdefgh12'
    rpad('abcdefgh',5,'12')='abcde'
    rpad('abcdefgh',0,'12') -- Return NULL.

REPLACE

  • Syntax

    STRING REPLACE(STRING a, STRING OLD, STRING NEW)
  • Description

    Purpose: It is used to replace the part of string a that is exactly the same as string OLD with string NEW, and return string a.

  • Parameters

    The values of all parameters are of the STRING type.

  • Return value

    A value of the STRING type is returned. If an input parameter is set to NULL, NULL is returned.

  • Examples

    REPLACE('ababab','abab','12')='12ab'
    REPLACE('ababab','cdf','123')='ababab'
    REPLACE('123abab456ab',null,'abab')=null

URL_ENCODE

  • Syntax

    STRING URL_ENCODE(STRING input[, STRING encoding])
  • Description

    Encodes the input string in the application/x-www-form-urlencoded MIME format and returns the encoded string.

    • All letters remain unchanged.

    • Periods (.), hyphens (-), asterisks (*), and underscores (_) remain unchanged.

    • Spaces are converted into plus signs (+).

    • Other characters are converted into byte values based on the format specified by encoding. Each byte value is then represented in the %xy format, where xy is the hexadecimal representation of the character value.

  • Parameters

    • : the string that you want to decode.

    • : the encoding format. Valid values include GBK and UTF-8. If you do not configure this parameter, the default value UTF-8 is used.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Examples

    URL_ENCODE('Example for URL_ENCODE:// (fdsf)') = "%E7%A4%BA%E4%BE%8Bfor+URL_ENCODE%3A%2F%2F+%28fdsf%29"
    URL_ENCODE('Example for URL_ENCODE:// dsf(fasfs)', 'GBK') = "Example+for+URL_ENCODE+%3A%2F%2F+dsf%28fasfs%29"

URL_DECODE

  • Syntax

    STRING URL_DECODE(STRING input[, STRING encoding])
  • Description

    Converts an input string from the application/x-www-form-urlencoded MIME format into a normal string. This is the inverse function of URL_ENCODE.

    • All letters remain unchanged.

    • Periods (.), hyphens (-), asterisks (*), and underscores (_) remain unchanged.

    • Plus signs (+) are converted into spaces.

    • The %xy formatted sequence is converted into byte values. Consecutive byte values are decoded to the related strings based on the value of encoding.

    • Other characters remain unchanged.

    • The return value of the function is a string encoded in UTF-8.

  • Parameters

    • : the string that you want to decode.

    • : the encoding format. Valid values include GBK and UTF-8. If you do not specify this parameter, the default value UTF-8 is used.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to NULL, NULL is returned.

  • Examples

    URL_DECODE('%E7%A4%BA%E4%BE%8Bfor+URL_DECODE%3A%2F%2F+%28fdsf%29')= "Example for URL_DECODE:// (fdsf)"
    URL_DECODE('Example+for+URL_DECODE+%3A%2F%2F+dsf%28fasfs%29', 'GBK') = "Example for URL_DECODE:// dsf(fasfs)"

ENDSWITH

  • Syntax

    BOOLEAN ENDSWITH(STRING input, STRING suffix)
  • Description

    Checks whether the input string ends with a string specified by suffix.

  • Parameters

    The values of all parameters are of the STRING type. The input value is implicitly converted to a value of the STRING type before computing if it is of the BIGINT, DOUBLE, DECIMAL, or DATETIME type. If the input value is of another data type, an exception is thrown.

  • Return value

    A value of the BOOLEAN type is returned.

TO_BASE64

  • Syntax

    TO_BASE64(BINARY bin)
  • Description

    Converts binary data into a Base64-encoded string.

  • Parameters

    The value of the input parameter is of the BINARY type.

  • Return value

    A Base64-encoded string.