MaxCompute: Example: Obtain values of strings that do not have delimiters

Last Updated: Mar 26, 2026

This example shows how to write a user-defined function (UDF) named UDF_EXTRACT_KEY_VALUE that extracts the value of a given key from a string of key-value pairs separated by custom delimiters, such as name:zhangsan;age:21;.

This UDF is designed for strings without fixed delimiters. To parse strings that already have standard delimiters, see Example: Obtain the values of strings that have delimiters.

How it works

UDF_EXTRACT_KEY_VALUE splits a string in two passes and then looks up the requested key:

  1. Use split1 to split the string into key-value pairs.

  2. Use split2 to split each pair into a key and a value.

  3. Return the value for the key specified by keyname.
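The three steps above can be sketched locally in plain Python, outside MaxCompute. This is only an illustration: the function name extract_key_value and the choice to return the intermediate results are assumptions made for this sketch, not part of the UDF itself.

```python
def extract_key_value(s, split1, split2, keyname):
    """Hypothetical local helper that mirrors the three steps above."""
    if not s:
        return [], {}, None
    # Step 1: split the source string into key-value pairs, dropping empty pieces.
    pairs = [p for p in s.split(split1) if p]
    # Step 2: split each pair into a key and a value.
    mapping = {}
    for pair in pairs:
        parts = pair.split(split2)
        if len(parts) == 2:  # skip pairs that are not exactly key + value
            mapping[parts[0]] = parts[1]
    # Step 3: look up the requested key (None if it is absent).
    return pairs, mapping, mapping.get(keyname)


pairs, mapping, value = extract_key_value("name:zhangsan;age:21;", ";", ":", "name")
print(pairs)    # ['name:zhangsan', 'age:21']
print(mapping)  # {'name': 'zhangsan', 'age': '21'}
print(value)    # zhangsan
```

Returning the intermediate list and dictionary makes each pass visible. The UDF returns only the final value.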

Syntax

STRING UDF_EXTRACT_KEY_VALUE(STRING <s>, STRING <split1>, STRING <split2>, STRING <keyname>)

Parameters

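+-----------+--------+----------+----------------------------------------------------------------+
| Parameter | Type   | Required | Description                                                    |
+-----------+--------+----------+----------------------------------------------------------------+
| s         | STRING | Yes      | The source string                                              |
| split1    | STRING | Yes      | The delimiter that separates key-value pairs                   |
| split2    | STRING | Yes      | The delimiter that separates keys from values within each pair |
| keyname   | STRING | Yes      | The key whose value you want to retrieve                       |
+-----------+--------+----------+----------------------------------------------------------------+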

Return value: STRING — the value of the specified key, or NULL if the key is not found or the input is empty.

Prerequisites

Before you begin, ensure that you have:

  • A MaxCompute project with UDF creation permissions

  • The MaxCompute client or DataWorks DataStudio for uploading resources and registering the UDF

Write the UDF

Choose a language and implement the UDF logic.

Java

package com.aliyun.rewrite; // Specify a package name.
import com.aliyun.odps.udf.UDF;

import java.util.HashMap;
import java.util.Map;

public class ExtractKeyValue extends UDF {
    private static final int KEY_VALUE_LENGTH = 2;

    /**
     * Extracts the value for a given key from a delimited string.
     * @param str     The source string.
     * @param split1  The delimiter that separates key-value pairs.
     * @param split2  The delimiter that separates keys from values.
     * @param keyname The key whose value you want to retrieve.
     * @return The value of the specified key, or null if not found.
     */
    public String evaluate(String str, String split1, String split2, String keyname) {
        try {
            // Return null for empty or null input.
            if (str == null || "".equals(str)) {
                return null;
            }
            Map<String, String> keyValueCache = new HashMap<>(8);
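            // String.split takes a regular expression, so quote the delimiter to match it literally.
            String[] extractedKeyValues = str.split(java.util.regex.Pattern.quote(split1));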

            // Split each key-value pair and store the result.
            for (String keyValue : extractedKeyValues) {
                storeKeyValue(keyValueCache, keyValue, split2);
            }

            // Return the value for the specified key.
            return keyValueCache.get(keyname);
        } catch (Exception e) {
            return null;
        }
    }

    /**
     * Splits a key-value pair and stores it in the cache.
     * @param keyValueCache The map to store the parsed key-value pairs.
     * @param keyValue      A single key-value pair string.
     * @param split         The delimiter between the key and value.
     */
    private void storeKeyValue(Map<String, String> keyValueCache, String keyValue, String split) {
        if (keyValue == null || "".equals(keyValue)) {
            return;
        }
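        // Quote the delimiter so that characters such as "|" or "." are matched literally.
        String[] keyValueArr = keyValue.split(java.util.regex.Pattern.quote(split));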
        if (keyValueArr.length == KEY_VALUE_LENGTH) {
            keyValueCache.put(keyValueArr[0], keyValueArr[1]);
        }
    }
}

The evaluate method defines four STRING input parameters and returns a STRING value. This signature becomes the UDF's SQL signature. For Java UDF specifications, see Java UDFs.

Python 3

from odps.udf import annotate


@annotate("string,string,string,string->string")
class ExtractKeyValue(object):
    def evaluate(self, s, split1, split2, keyname):
        if not s:
            return None
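        # Split the string into key-value pairs, then split each pair into a key and value.
        # Pairs that do not split into exactly one key and one value are skipped.
        key_value_cache = {}
        for kv in s.split(split1):
            parts = kv.split(split2)
            if len(parts) == 2:
                key_value_cache[parts[0]] = parts[1]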
        # Return the value for the specified key.
        return key_value_cache.get(keyname)

MaxCompute projects run Python 2 by default. To use this Python 3 UDF, run the following command at the session level before calling the UDF:

set odps.sql.python.version=cp37;

For Python 3 UDF specifications, see Python 3 UDFs.

Python 2

#coding:utf-8
from odps.udf import annotate


@annotate("string,string,string,string->string")
class ExtractKeyValue(object):
    def evaluate(self, s, split1, split2, keyname):
        if not s:
            return None
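        # Split the string into key-value pairs, then split each pair into a key and value.
        # Pairs that do not split into exactly one key and one value are skipped.
        key_value_cache = {}
        for kv in s.split(split1):
            parts = kv.split(split2)
            if len(parts) == 2:
                key_value_cache[parts[0]] = parts[1]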
        # Return the value for the specified key.
        return key_value_cache.get(keyname)

If your UDF code contains non-ASCII characters, such as Chinese characters, add an encoding declaration (#coding:utf-8 or # -*- coding: utf-8 -*-) at the top of the file. Without it, Python 2 fails to parse the file. For Python 2 UDF specifications, see Python 2 UDFs.

Upload resources and register the UDF

After developing and testing your UDF code, upload it to MaxCompute and register the UDF as UDF_EXTRACT_KEY_VALUE.
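One way to test the Python logic before uploading is to run the class locally with a stand-in for the annotate decorator, because the odps.udf module is normally available only inside MaxCompute. The sketch below is an assumption-laden illustration: the local annotate function and its signature attribute are hypothetical placeholders, and the class body is a copy of the UDF logic written to skip pairs that do not split into exactly one key and one value.

```python
# Local stand-in for odps.udf.annotate (assumption: used only so the class
# can run outside MaxCompute; it is not the real decorator).
def annotate(signature):
    def wrap(cls):
        cls.signature = signature  # hypothetical attribute, for illustration only
        return cls
    return wrap


@annotate("string,string,string,string->string")
class ExtractKeyValue(object):
    def evaluate(self, s, split1, split2, keyname):
        if not s:
            return None
        key_value_cache = {}
        for kv in s.split(split1):
            parts = kv.split(split2)
            if len(parts) == 2:  # ignore empty or malformed pairs
                key_value_cache[parts[0]] = parts[1]
        return key_value_cache.get(keyname)


udf = ExtractKeyValue()
found = udf.evaluate("name:zhangsan;age:21;", ";", ":", "name")
missing = udf.evaluate("name:zhangsan;age:21;", ";", ":", "city")
empty = udf.evaluate("", ";", ":", "name")
malformed = udf.evaluate("name;age:21", ";", ":", "age")
print(found, missing, empty, malformed)  # zhangsan None None 21
```

The four calls cover the documented return behavior: a key that exists, a key that is missing, empty input, and a string that contains a pair without a key-value delimiter. When the file is uploaded, the local annotate definition would be replaced by the real import from odps.udf.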

Run the UDF

After registering the UDF, run the following SQL to extract the value of the name key from name:zhangsan;age:21;:

-- To use a Python 3 UDF, run this line first.
set odps.sql.python.version=cp37;

SELECT UDF_EXTRACT_KEY_VALUE('name:zhangsan;age:21;', ';', ':', 'name');

Expected output:

+----------+
| _c0      |
+----------+
| zhangsan |
+----------+
