This topic describes how to use a Java UDF to replace strings by using regular expressions.

UDF used to replace strings by using regular expressions

String UDFRegxpReplace(String s, String regex, String replacement)
  • Description: replaces every substring of s that matches the regular expression with the replacement string. Unlike the built-in REGEXP_REPLACE function of MaxCompute, this UDF allows the regular expression to be passed as a variable. If any argument is NULL, the UDF returns NULL.
  • Parameters:
    • s: the source string in which matches are replaced, of the STRING type
    • regex: the regular expression, of the STRING type
    • replacement: the string that replaces each match, of the STRING type
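The semantics of the three parameters can be approximated in plain Java with java.util.regex. The following is a minimal sketch rather than the UDF itself: the class name RegexReplaceSketch and the method name replace are hypothetical, and the null handling is assumed to match the evaluate method shown in the UDF code example of this topic.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mirrors the documented semantics; not part of MaxCompute.
class RegexReplaceSketch {

    // Replaces every match of regex in s with replacement.
    // Returns null if any argument is null.
    static String replace(String s, String regex, String replacement) {
        if (s == null || regex == null || replacement == null) {
            return null;
        }
        return Pattern.compile(regex).matcher(s).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Both "oo" and "ar" match and are removed: prints fb
        System.out.println(replace("foobar", "oo|ar", ""));
        // "$1" and "$2" in the replacement refer to capturing groups: prints 01/2024
        System.out.println(replace("2024-01", "(\\d+)-(\\d+)", "$2/$1"));
        // Matcher.quoteReplacement makes "$" literal: prints $5
        System.out.println(replace("price", "price", Matcher.quoteReplacement("$5")));
    }
}
```

Because java.util.regex treats $ and \ in the replacement string as special characters, a replacement that should be inserted literally may need to be escaped first, as in the last call.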

UDF example

  • Function registration
    After UDFRegxpReplace.java passes the test, register it as a function.
    Note: To publish a UDF to a server for production use, you must package it, upload the JAR package, and register the function. The one-click publish feature completes these steps for you: MaxCompute Studio runs the mvn clean package command, uploads the JAR package, and registers the UDF in sequence. For more information, see Package, Upload and Register.
  • Example
    After the UDF is registered, execute the following statement:
    select UDFRegxpReplace("foobar", "oo|ar", "") from dual;
    The result is as follows:
    +-----+
    | _c0 |
    +-----+
    | fb  |
    +-----+

UDF code example

// The package name, which can be defined as needed.
package com.aliyun.odps.examples.udf;

import com.aliyun.odps.udf.UDF;
import com.aliyun.odps.udf.annotation.UdfProperty;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

@UdfProperty(isDeterministic=true)
public class UDFRegxpReplace extends UDF {

    // The most recently compiled regular expression and its compiled form.
    private String lastRegex = "";
    private Pattern p = null;

    // The most recently used replacement string.
    private String lastReplacement = "";
    private String replacementString = "";

    public String evaluate(String s, String regex, String replacement) {
        if (s == null || regex == null || replacement == null) {
            return null;
        }
        // Recompile the pattern only if the regular expression differs from the previous call.
        if (p == null || !regex.equals(lastRegex)) {
            lastRegex = regex;
            p = Pattern.compile(regex);
        }
        Matcher m = p.matcher(s);
        // Update the cached replacement string only if it differs from the previous call.
        if (!replacement.equals(lastReplacement)) {
            lastReplacement = replacement;
            replacementString = replacement;
        }

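        // Copy the input to the output, substituting each match with the replacement.
        // Note that appendReplacement treats $ and \ in the replacement as special characters.
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacementString);
        }
        // Append the text that follows the last match.
        m.appendTail(sb);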

        return sb.toString();
    }
}
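
To try the caching logic without a MaxCompute project or the MaxCompute SDK on the classpath, the evaluate method can be mirrored in a plain class. The following is a minimal sketch under that assumption: the class name CachedRegexReplacer and the compileCount field are hypothetical and exist only to make the effect of caching observable.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the UDF: same caching idea, no MaxCompute SDK dependency.
class CachedRegexReplacer {
    private String lastRegex = null;
    private Pattern pattern = null;
    // Counts compilations so that the caching effect can be observed; for illustration only.
    int compileCount = 0;

    String evaluate(String s, String regex, String replacement) {
        if (s == null || regex == null || replacement == null) {
            return null;
        }
        // Recompile only if the regular expression differs from the previous call.
        if (pattern == null || !regex.equals(lastRegex)) {
            lastRegex = regex;
            pattern = Pattern.compile(regex);
            compileCount++;
        }
        Matcher m = pattern.matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        CachedRegexReplacer r = new CachedRegexReplacer();
        System.out.println(r.evaluate("foobar", "oo|ar", ""));  // prints fb
        System.out.println(r.evaluate("barfoo", "oo|ar", "_")); // prints b_f_ (pattern reused)
        System.out.println(r.evaluate("foobar", "o+", "0"));    // prints f0bar (pattern recompiled)
        System.out.println(r.compileCount);                     // prints 2
    }
}
```

The first two calls share one compiled pattern because the regular expression is unchanged. This is the case the cache is meant for: a query that applies the same expression to many rows compiles it once.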