This topic describes how to use a Java UDF to replace strings by using regular expressions.
UDF used to replace strings by using regular expressions
String UDFRegxpReplace(String s, String regex, String replacement)
- Description: replaces every substring of s that matches the regular expression with the replacement string. Unlike the built-in REGEXP_REPLACE function of MaxCompute, the regular expression passed to this UDF supports variables.
- Parameters:
  - s: the source string in which matches are replaced. The value is of the STRING type.
  - regex: the regular expression to match. The value is of the STRING type.
  - replacement: the string that replaces each matched substring. The value is of the STRING type.
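The following is a minimal sketch of the behavior described above, written in plain Java without the MaxCompute SDK. It reads "supports variables" as meaning that the regular expression may differ from row to row, which is an assumption about the intended meaning. The class name PerRowRegexSketch and the sample rows are hypothetical, and Matcher.replaceAll stands in for the find-and-append loop of the UDF.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone class; it does not depend on the MaxCompute SDK.
final class PerRowRegexSketch {
    private String lastRegex = null;
    private Pattern pattern = null;

    // Replaces every match of regex in s. The pattern is recompiled only
    // when the regex differs from the one used in the previous call.
    String replace(String s, String regex, String replacement) {
        if (s == null || regex == null || replacement == null) {
            return null;
        }
        if (!regex.equals(lastRegex)) {
            pattern = Pattern.compile(regex);
            lastRegex = regex;
        }
        return pattern.matcher(s).replaceAll(replacement);
    }

    public static void main(String[] args) {
        PerRowRegexSketch f = new PerRowRegexSketch();
        // Each "row" carries its own regular expression (assumed sample data).
        String[][] rows = {
            {"foobar", "oo|ar", ""},
            {"2024-01-15", "-", "/"},
            {"a1b22c333", "[0-9]+", "#"},
        };
        for (String[] r : rows) {
            System.out.println(f.replace(r[0], r[1], r[2]));
        }
        // prints: fb, 2024/01/15, a#b#c# (one per line)
    }
}
```

Caching the last compiled pattern keeps the common case, where consecutive rows share one regular expression, from paying the compilation cost on every call.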
UDF example
- Function registration
After UDFRegxpReplace.java passes the test, register it as a function.
Note: To publish a UDF to a server for production use, you must package the UDF, upload the JAR package, and register the function. You can use the one-click publish feature to complete these steps: MaxCompute Studio runs the mvn clean package command, uploads the JAR package, and registers the UDF in sequence. For more information, see Package, upload, and register a Java program.
- Example
After the UDF is registered, execute the following statement:
select UDFRegxpReplace("foobar", "oo|ar", "") from dual;
The result is as follows:
+-----+
| _c0 |
+-----+
| fb  |
+-----+
UDF code example
// The package name, which can be defined as needed.
package com.aliyun.odps.examples.udf;

import com.aliyun.odps.udf.UDF;
import com.aliyun.odps.udf.annotation.UdfProperty;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

@UdfProperty(isDeterministic=true)
public class UDFRegxpReplace extends UDF {
    // The most recently compiled regular expression and its pattern.
    private String lastRegex = "";
    private Pattern p = null;

    public String evaluate(String s, String regex, String replacement) {
        if (s == null || regex == null || replacement == null) {
            return null;
        }
        // If the regular expression is changed, compile it again.
        if (!regex.equals(lastRegex) || p == null) {
            lastRegex = regex;
            p = Pattern.compile(regex);
        }
        Matcher m = p.matcher(s);
        // Replace each match and copy the unmatched text between matches.
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacement);
        }
        // Copy the text that follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }
}
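
Because evaluate passes the replacement to Matcher.appendReplacement, the standard java.util.regex rules for replacement strings apply: $n refers to capture group n, and a backslash escapes the next character. The sketch below illustrates this with the same loop in a stand-alone class. The class name ReplacementSyntaxSketch and the sample inputs are hypothetical and are not part of the UDF.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone class; it does not depend on the MaxCompute SDK.
final class ReplacementSyntaxSketch {
    // Same find / appendReplacement / appendTail loop as the UDF body.
    static String replace(String s, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // $1 and $2 refer to the capture groups, so the two parts are swapped.
        System.out.println(replace("user@host", "(\\w+)@(\\w+)", "$2@$1"));
        // prints: host@user

        // quoteReplacement escapes the dollar sign so that it is inserted literally.
        String literal = Matcher.quoteReplacement("$5");
        System.out.println(replace("price: 5", "[0-9]+", literal));
        // prints: price: $5
    }
}
```

If a replacement such as "$5" is passed unquoted and the pattern has no group 5, appendReplacement throws an IndexOutOfBoundsException. If replacement values may contain literal dollar signs or backslashes, one option is to escape them before the UDF is called.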