This topic describes the syntax rules of resource functions, including parameter descriptions and function examples.

Functions

Type Function Description
Local res_local Obtains the values of the advanced parameters of the current data transformation task.
ApsaraDB for RDS res_rds_mysql Obtains data from a specified database table in ApsaraDB RDS for MySQL. The data can be refreshed at regular intervals.
Log Service res_log_logstore_pull Pulls data from another Logstore while a Logstore is being transformed. The data can be pulled continuously and maintained as a table.
OSS res_oss_file Obtains the content of an object in a specified OSS bucket.

res_local

  • Function syntax
    res_local(param, default=None, type="auto")
  • Function parameters
    Parameter Type Required Description
    param String Yes The name of the advanced parameter of the data transformation task.
    default Arbitrary type No The value that is returned if the specified advanced parameter does not exist. Default value: None.
    type String No The format of the output data. Valid values:
    • auto: the default value. The raw string is parsed as JSON. If the parsing fails, the raw string is returned.
    • JSON: The raw string is parsed as JSON. If the parsing fails, the value of the default parameter is returned.
    • raw: The raw string is returned as-is.
  • Response

    A JSON value or a raw string is returned, depending on the type parameter.

  • Conversions to JSON strings
    • The following table shows successful examples of conversions.
      Raw string Return value Return value type
      1 1 Integer
      1.2 1.2 Float
      true True Boolean
      false False Boolean
      "123" 123 String
      null None None
      ["v1", "v2", "v3"] ["v1", "v2", "v3"] List
      ["v1", 3, 4.0] ["v1", 3, 4.0] List
      {"v1": 100, "v2": "good"} {"v1": 100, "v2": "good"} Dictionary
      {"v1": {"v11": 100, "v2": 200}, "v3": "good"} {"v1": {"v11": 100, "v2": 200}, "v3": "good"} Dictionary
    • The following table shows unsuccessful examples of conversions. The raw strings are returned after they fail to be converted into JSON strings.
      Raw string Return value Description
      (1,2,3) "(1,2,3)" Tuples are not supported. Use the list data type instead.
      True "True" Boolean values must be lowercase true or false.
      {1: 2, 3: 4} "{1: 2, 3: 4}" Dictionary keys must be strings.
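The conversion behavior in the two tables above can be sketched with Python's standard json module (an illustrative approximation, not the actual Log Service implementation; parse_auto is a hypothetical helper name that mirrors the res_local signature):

```python
import json

def parse_auto(param_value, default=None, type="auto"):
    """Mimic res_local's type handling: parse the raw string as JSON,
    falling back per the auto/JSON/raw rules described above."""
    if type == "raw":
        return param_value
    try:
        return json.loads(param_value)   # "1" -> 1, "true" -> True, "null" -> None
    except (ValueError, TypeError):
        # auto falls back to the raw string; JSON returns the default value
        return param_value if type == "auto" else default

print(parse_auto("1"))                     # 1 (integer)
print(parse_auto("true"))                  # True (Boolean)
print(parse_auto("(1,2,3)"))               # "(1,2,3)" (tuple syntax is not JSON)
print(parse_auto("True", type="JSON"))     # None (parsing fails, default returned)
```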
  • Function example: The following example assigns the value of an advanced parameter to the "local" field.
    Advanced parameter configuration in the console: the endpoint parameter is set to hangzhou.
    Raw log:
    content: 1
    Transformation rule:
    e_set("local", res_local('endpoint'))
    Transformation result:
    content: 1
    local: hangzhou

res_rds_mysql

  • Function syntax
    res_rds_mysql(address, username, password, database, table=None, sql=None, fields=None, fetch_include_data=None, fetch_exclude_data=None, refresh_interval=0, base_retry_back_off=1, max_retry_back_off=60, use_ssl=False)
  • Function parameters
    Parameter Type Required Description
    address String Yes The domain name or IP address. If the port number is not 3306, set this parameter in the format of IP address:port. Only public IP addresses are supported.
    username String Yes The name of the user who accesses ApsaraDB RDS for MySQL. Plaintext usernames are not supported. For more information, see AccessKey.
    password String Yes The password of the user. Plaintext passwords are not supported. For more information, see AccessKey.
    database String Yes The name of the database to be connected.
    table String Yes The name of the table to be accessed. This parameter is not required if the sql parameter is specified.
    sql String Yes The SQL statement that is used to query data. This parameter is not required if the table parameter is specified.
    fields String list or string alias list No The string list or string alias list. If this parameter is not specified, all columns returned by the table or sql parameter are used. For example, if you want to change the name parameter in the ["user_id", "province", "city", "name", "age"] list to user_name, you can set the fields parameter to ["user_id", "province", "city", ("name", "user_name"), ("nickname", "nick_name"), "age"].
    fetch_include_data Search string No The string whitelist. Strings that match the fetch_include_data parameter are retained. Strings that do not match this parameter are abandoned.
    fetch_exclude_data Search string No The string blacklist. Strings that match the fetch_exclude_data parameter are abandoned. Strings that do not match this parameter are retained.
    refresh_interval Numeric string or number No The interval at which data is pulled. Unit: seconds. The default value is 0, which means to load all data that matches the search conditions only once. In this case, the data will not be updated.
    base_retry_back_off Number No The interval at which data is re-pulled after a pull failure. Default value: 1. Unit: seconds.
    max_retry_back_off Int No The maximum interval between retries after a pull failure. Default value: 60. Unit: seconds. We recommend that you use the default value.
    use_ssl Bool No Specifies whether to use SSL to connect to ApsaraDB RDS for MySQL. Default value: false.
    Note If SSL is enabled on the ApsaraDB RDS for MySQL server and this parameter is set to true, the connection uses SSL, but the CA certificate of the server is not verified. Connections that verify the server CA certificate are not supported.
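The column selection and renaming described for the fields parameter can be sketched as a small mapping step (apply_fields is a hypothetical helper; the sample row data is made up):

```python
def apply_fields(row, fields):
    """Select and rename columns the way the fields parameter describes:
    a plain string keeps the column as-is, a (source, alias) tuple renames it."""
    out = {}
    for f in fields:
        src, dst = f if isinstance(f, tuple) else (f, f)
        out[dst] = row.get(src)  # a missing column yields None
    return out

row = {"user_id": 1, "province": "zj", "city": "hz", "name": "u1", "age": 20}
print(apply_fields(row, ["user_id", ("name", "user_name"), "age"]))
# {'user_id': 1, 'user_name': 'u1', 'age': 20}
```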
  • Response

    A table that contains multiple columns is returned. The columns are defined by the fields parameter.

  • Error handling

    If an error occurs during the data pull, an error message is reported, but the data transformation task continues to run. Retries are performed with exponential backoff based on the base_retry_back_off and max_retry_back_off parameters. The first retry interval is the value of base_retry_back_off (1 second by default). If a retry fails, the next interval is twice the previous one, and so on, until the interval reaches the value of max_retry_back_off. Subsequent retries keep that interval. After a retry succeeds, the interval is reset to the initial value.
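The retry schedule described above (doubling intervals capped at max_retry_back_off) can be sketched as:

```python
def retry_interval(attempt, base_retry_back_off=1, max_retry_back_off=60):
    """Interval in seconds before retry number `attempt` (0-based):
    doubles each time, capped at max_retry_back_off."""
    return min(base_retry_back_off * (2 ** attempt), max_retry_back_off)

print([retry_interval(i) for i in range(8)])  # [1, 2, 4, 8, 16, 32, 60, 60]
```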

  • Function examples
    • The following example pulls data from the test_table table in the test_db database every 300 seconds.
      res_rds_mysql(address="rds mysql database host IP address",username="xxx",password="xxx",database="test_db",table="test_table",refresh_interval=300)
    • The following example pulls data from the test_table table, excluding the data whose status value is delete.
      res_rds_mysql(address="rds mysql database host IP address",username="xxx",password="xxx",database="test_db",table="test_table",refresh_interval=300,fetch_exclude_data="'status':'delete'")
    • The following example pulls the data whose status value is exit from the test_table table.
      res_rds_mysql(address="rds mysql database host IP address",username="xxx",password="xxx",database="test_db",table="test_table",refresh_interval=300,fetch_include_data="'status':'exit'")
    • The following example pulls the data whose status value is exit from the test_table table, excluding the data whose name value is aliyun.
      res_rds_mysql(address="rds mysql database host IP address",username="xxx",password="xxx",database="test_db",table="test_table",refresh_interval=300,fetch_include_data="'status':'exit'",fetch_exclude_data="'name':'aliyun'")

res_log_logstore_pull

  • Function syntax
    res_log_logstore_pull(endpoint, ak_id, ak_secret, project, logstore, fields, from_time="begin", to_time="end", fetch_include_data=None, fetch_exclude_data=None, primary_keys=None, upsert_data=None, delete_data=None, max_retry_back_off=60, fetch_interval=2, base_retry_back_off=1)
  • Function parameters
    Parameter Type Required Description
    endpoint String Yes The service endpoint. For more information about Log Service endpoints, see Service endpoint. The default value is an HTTPS address. This parameter can also be an HTTP address. Port 80 or 443 can be used. For example, this parameter can be set to http://endpoint:port.
    ak_id String Yes The AccessKey ID of the account. Plaintext is not supported.
    ak_secret String Yes The AccessKey secret of the account. Plaintext is not supported.
    project String Yes The name of the project from which you want to pull data.
    logstore String Yes The name of the Logstore in the project from which you want to pull data.
    fields String list or string alias list Yes The string list or string alias list. If the Logstore does not contain a specified column, the value of this column is null. For example, if you want to change the name parameter in the ["user_id", "province", "city", "name", "age"] list to user_name, you can set the fields parameter to ["user_id", "province", "city", ("name", "user_name"), ("nickname", "nick_name"), "age"].
    from_time String No The server time at which the data pull from the Logstore starts. The following time formats are supported:
    • Unix timestamp.
    • Time string.
    • Specific string, for example, begin or end.
    • Expression: the time returned by a dt_ function. For example, the expression dt_totimestamp(dt_truncate(dt_today(tz="Asia/Shanghai"), day=op_neg(-1))) indicates the start of the previous day. If the current time is 2019-5-5 10:10:10 +8:00, the expression evaluates to 2019-5-4 0:0:0 +8:00.
    The default value is begin, which means that the data pull starts from the first log entry.
    to_time String No The server time at which the data pull from the Logstore ends. The following time formats are supported:
    • Unix timestamp.
    • Time string.
    • Specific string, for example, begin or end.
    • Expression: The time returned by the dt_ function.
    The default value is end, which means that the data pull ends at the last log.

    If this parameter is not specified or is set to None, it means to pull data from the latest log entry in a continuous manner.

    Note If this parameter is set to a future time, only existing data in the Logstore will be pulled.
    fetch_include_data Search string No The string whitelist. Strings that match the fetch_include_data parameter are retained. Strings that do not match this parameter are abandoned.
    fetch_exclude_data Search string No The string blacklist. Strings that match the fetch_exclude_data parameter are abandoned. Strings that do not match this parameter are retained.
    primary_keys String list No The list of primary key columns used to maintain the table. If you modify the name of a primary key column in the fields parameter, this parameter must be set to the modified column as the primary key.
    Note The value of the primary_keys parameter must be a column name specified in the fields parameter. The parameter is valid when only one shard exists in the target Logstore from which data is pulled.
    fetch_interval Int No The interval between continuous data pull requests. The default value is 2 seconds. The value must be greater than or equal to 1 second.
    delete_data Search string No The condition that is used to delete data from the maintained table. Log entries can be deleted only if the primary_keys parameter is specified.
    base_retry_back_off Number No The interval at which data is re-pulled after a pull failure. Default value: 1. Unit: seconds.
    max_retry_back_off Int No The maximum interval between retries after a pull failure. Default value: 60. Unit: seconds. We recommend that you use the default value.
  • Response

    A table that contains multiple columns is returned.

  • Error handling

    If an error occurs during the data pull, an error message is reported, but the data transformation task continues to run. Retries are performed with exponential backoff based on the base_retry_back_off and max_retry_back_off parameters. The first retry interval is the value of base_retry_back_off (1 second by default). If a retry fails, the next interval is twice the previous one, and so on, until the interval reaches the value of max_retry_back_off. Subsequent retries keep that interval. After a retry succeeds, the interval is reset to the initial value.

  • Function example
    # Sample data in the Logstore
    
    time:1234567
    __topic__:None
    key1:value1
    key2:value2
    
    ...
    • The following example pulls data of column key1 and column key2 from the test_logstore Logstore of the test_project project. The data pull starts from the time when log data is written to the Logstore and ends at the time when the data write is finished. The data pull is performed only once.
      res_log_logstore_pull("endpoint", "ak_id", "ak_secret", "test_project", "test_logstore", ["key1","key2"], from_time="begin", to_time="end")
    • This example pulls data from the Logstore at an interval of 30 seconds.
      res_log_logstore_pull("endpoint", "ak_id", "ak_secret", "test_project", "test_logstore", ["key1","key2"], from_time="begin", to_time=None,fetch_interval=30)
    • The following example pulls data from the Logstore and specifies the blacklist to exclude data that contains the key1:value1 column.
      res_log_logstore_pull("endpoint", "ak_id", "ak_secret", "test_project", "test_logstore", ["key1","key2"], from_time="begin", to_time=None,fetch_interval=30,fetch_exclude_data="key1:value1")
    • The following example pulls data that contains the key1:value1 column from the Logstore by specifying the whitelist.
      res_log_logstore_pull("endpoint", "ak_id", "ak_secret", "test_project", "test_logstore", ["key1","key2"], from_time="begin", to_time=None,fetch_interval=30,fetch_include_data="key1:value1")
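The fetch_include_data/fetch_exclude_data filtering used in the examples above can be sketched as follows (a simplification that assumes plain substring matching; the actual behavior follows Log Service search-string semantics):

```python
def keep_row(row_text, fetch_include_data=None, fetch_exclude_data=None):
    """Apply the whitelist first, then the blacklist: a row must match the
    include string (if any) and must not match the exclude string (if any)."""
    if fetch_include_data is not None and fetch_include_data not in row_text:
        return False
    if fetch_exclude_data is not None and fetch_exclude_data in row_text:
        return False
    return True

rows = ["key1:value1 key2:a", "key1:value2 key2:b"]
print([r for r in rows if keep_row(r, fetch_exclude_data="key1:value1")])
# ['key1:value2 key2:b']
```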

res_oss_file

  • Function syntax
    
    res_oss_file(endpoint, ak_id, ak_key, bucket, file, format='text', change_detect_interval=0, base_retry_back_off=1, max_retry_back_off=60, encoding='utf8', error='ignore')
  • Function parameters
    Parameter Type Required Description
    endpoint String Yes The service endpoint. For more information about OSS service endpoints, see Regions and endpoints. The default value is an HTTPS address. This parameter can also be an HTTP address. Port 80 or 443 can be used. For example, this parameter can be set to http://endpoint:port.
    Note We recommend that the OSS bucket reside in the same region as the data transformation Logstore so that the connection uses the stable, fast internal network. Otherwise, the connection uses the public network, which consumes more bandwidth and is less stable.
    ak_id String Yes The AccessKey ID of the account. Plaintext is not supported.
    ak_key String Yes The AccessKey secret of the account. Plaintext is not supported.
    bucket String Yes The name of the bucket from which you want to extract object data.
    file String Yes The path of the object to be extracted from the bucket. The path cannot start with a forward slash (/), for example, test/data.txt.
    format String Yes The format in which the object content is returned. The default value is text, which means that the content is returned as text. If this parameter is set to binary, the content is returned as bytes.
    change_detect_interval Int No The interval at which the object is checked for updates, based on the ETag (change tag) of the object. If the object has changed, it is refreshed. The default value is 0, which means that the object is not refreshed.
    base_retry_back_off Number No The interval at which data is re-pulled after a pull failure. Default value: 1. Unit: seconds.
    max_retry_back_off Int No The maximum interval between retries after a pull failure. Default value: 60. Unit: seconds. We recommend that you use the default value.
    encoding String No The encoding format. If the format parameter is set to text, the default value of this parameter is utf8.
    error String No The method that is used to handle decoding errors. The default value is ignore. Other valid values are strict, replace, and xmlcharrefreplace. This parameter takes effect only when a UnicodeError is raised.
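The effect of the error values can be illustrated with Python's bytes.decode, which uses the same error-handler names (the sample bytes below are made up for the illustration):

```python
data = b"caf\xe9 bytes"  # \xe9 alone is not valid UTF-8

print(data.decode("utf8", errors="ignore"))   # bad byte is dropped: 'caf bytes'
print(data.decode("utf8", errors="replace"))  # bad byte becomes U+FFFD
# errors="strict" would raise UnicodeDecodeError instead
```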
  • Response
    • File data is returned in the byte stream format.
    • File data is returned in the text format.
  • Error handling

    If an error occurs during the data pull, an error message is reported, but the data transformation task continues to run. Retries are performed with exponential backoff based on the base_retry_back_off and max_retry_back_off parameters. The first retry interval is the value of base_retry_back_off (1 second by default). If a retry fails, the next interval is twice the previous one, and so on, until the interval reaches the value of max_retry_back_off. Subsequent retries keep that interval. After a retry succeeds, the interval is reset to the initial value.

  • Example 1: This example extracts a JSON data file from OSS.
    • JSON file content in OSS:
      {
        "users": [
          {
              "name": "user1",
              "login_historys": [
                {
                  "date": "2019-10-10 0:0:0",
                  "login_ip": "1.1.1.1"
                },
                {
                  "date": "2019-10-10 1:0:0",
                  "login_ip": "1.1.1.1"
                }
              ]
          },
          {
              "name": "user2",
              "login_historys": [
                {
                  "date": "2019-10-11 0:0:0",
                  "login_ip": "1.1.1.2"
                },
                {
                  "date": "2019-10-11 1:0:0",
                  "login_ip": "1.1.1.3"
                },
                {
                  "date": "2019-10-11 1:1:0",
                  "login_ip": "1.1.1.5"
                }
              ]
          }
        ]
      }
    • Raw log:
      content: 123
    • Transformation rule:
      e_set("json_parse", json_parse(res_oss_file(endpoint='http://oss-cn-hangzhou.aliyuncs.com', ak_id='your ak_id', ak_key='your ak_key', bucket='your bucket', file='testjson.json')))
    • Transformation result:
      content: 123
      json_parse:
      '{
        "users": [
          {
              "name": "user1",
              "login_historys": [
                {
                  "date": "2019-10-10 0:0:0",
                  "login_ip": "1.1.1.1"
                },
                {
                  "date": "2019-10-10 1:0:0",
                  "login_ip": "1.1.1.1"
                }
              ]
          },
          {
              "name": "user2",
              "login_historys": [
                {
                  "date": "2019-10-11 0:0:0",
                  "login_ip": "1.1.1.2"
                },
                {
                  "date": "2019-10-11 1:0:0",
                  "login_ip": "1.1.1.3"
                },
                {
                  "date": "2019-10-11 1:1:0",
                  "login_ip": "1.1.1.5"
                }
              ]
          }
        ]
      }'
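The json_parse step in the rule above behaves like standard JSON parsing; it can be reproduced locally with Python's json module (a literal string stands in for the res_oss_file call here):

```python
import json

# Stand-in for the object content fetched by res_oss_file(...)
fetched = '{"users": [{"name": "user1"}, {"name": "user2"}]}'
parsed = json.loads(fetched)
print([u["name"] for u in parsed["users"]])  # ['user1', 'user2']
```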
  • Example 2: This example extracts the content of the test.txt file from OSS.
    • Content of the test.txt file in OSS:
      Test bytes
    • Raw log:
      content: 123
    • Transformation rule:
      e_set("test_txt", res_oss_file(endpoint='http://oss-cn-hangzhou.aliyuncs.com', ak_id='your ak_id', ak_key='your ak_key', bucket='your bucket', file='test.txt'))
    • Transformation result:
      content: 123
      test_txt: Test bytes