Realtime Compute for Apache Flink: AI_MASK

Last Updated: Dec 10, 2025

This topic describes how to use AI_MASK to perform data masking.

Limitations

  • This feature requires Ververica Runtime (VVR) 11.4 or later.

  • The throughput of AI_MASK operators is capped by the rate limits of Alibaba Cloud Model Studio. When the rate limit for a model is reached, the Flink job experiences backpressure, with the AI_MASK operators as the bottleneck. In some cases, this triggers timeout errors and job restarts.

Syntax

AI_MASK(
  MODEL => MODEL <MODEL NAME>, 
  INPUT => <INPUT COLUMN NAME>,
  MASK_ENTITIES => <MASK ENTITIES>
)

Input parameters

Parameter | Data type | Description
MODEL <MODEL NAME> | MODEL | The name of the registered model. For information about how to register a model service, see Model settings. Note: The output type of this model must be VARIANT.
<INPUT COLUMN NAME> | STRING | The original text for the model to analyze.
<MASK ENTITIES> | ARRAY<STRING> | The entities to be masked. Note: This input parameter must be a constant.
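As a minimal sketch of the constant requirement, the query below passes a literal array with more than one entry. It assumes the `general_model` model and the `infos` view defined in the example in this topic. The entity label 'phone' is hypothetical and does not appear in this topic, so check which entity types your model actually recognizes.

```sql
-- MASK_ENTITIES is given as an array literal, which is a constant.
-- 'phone' is an assumed entity label used only for illustration.
SELECT id, masked_text, detected_entities
FROM infos,
LATERAL TABLE(
  AI_MASK(
    MODEL => MODEL general_model,
    INPUT => content,
    MASK_ENTITIES => ARRAY['name', 'phone']
  ));

-- By contrast, passing a column reference as MASK_ENTITIES would not
-- satisfy the requirement, because a column value is not a constant.
```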

Return values

Parameter | Data type | Description
masked_text | STRING | The masked text.
detected_entities | ARRAY<STRING> | The detected entities.
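Both return values can be used like ordinary columns in the enclosing query. The sketch below assumes the `general_model` model and the `infos` view from the example in this topic, and keeps only the rows in which at least one entity was detected. CARDINALITY is the built-in Flink SQL function that returns the number of elements in an array. This topic does not state whether a row without matches yields an empty array or NULL, so treat the filter as an assumption to verify against your own data.

```sql
-- Count the detected entities per row and drop rows with none.
SELECT id, masked_text, CARDINALITY(detected_entities) AS entity_count
FROM infos,
LATERAL TABLE(
  AI_MASK(
    MODEL => MODEL general_model,
    INPUT => content,
    MASK_ENTITIES => ARRAY['name']
  ))
WHERE CARDINALITY(detected_entities) > 0;
```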

Example

Test data

id | content
1 | Timmo loves reading and always does so in his spare time.

Test statements

The following sample SQL statements use the Qwen-Plus model and AI_MASK to perform data masking.

CREATE TEMPORARY MODEL general_model
INPUT (`input` STRING)
OUTPUT (`content` VARIANT)
WITH (
    'provider' = 'openai-compat',
    'endpoint' = '<YOUR ENDPOINT>',
    'apiKey' = '<YOUR KEY>',
    'model' = 'qwen-plus'
);

CREATE TEMPORARY VIEW infos(id, content)
AS VALUES (1, 'Timmo loves reading and always does so in his spare time.');

-- Call AI_MASK with positional arguments
SELECT id, masked_text, detected_entities
FROM infos,
LATERAL TABLE(
  AI_MASK(
    MODEL general_model,
    content,
    ARRAY['name']
  ));

-- Call AI_MASK with named arguments
SELECT id, masked_text, detected_entities
FROM infos,
LATERAL TABLE(
  AI_MASK(
    MODEL => MODEL general_model,
    INPUT => content,
    MASK_ENTITIES => ARRAY['name']
  ));

Result

id | masked_text | detected_entities
1 | [NAME] loves reading and always does so in his spare time. | [{"entity":"Timmo","type":"name"}]
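In a deployed job, the masked rows are usually written to a sink instead of being returned to the client. The sketch below shows one way this might look, reusing the `general_model` model and the `infos` view from the test statements. The table name `masked_sink` is hypothetical, and the `print` connector is used only as a stand-in for a real sink. The column types follow the return value table in this topic.

```sql
-- Hypothetical sink table; 'print' writes each row to the TaskManager logs.
CREATE TEMPORARY TABLE masked_sink (
  id INT,
  masked_text STRING,
  detected_entities ARRAY<STRING>
) WITH (
  'connector' = 'print'
);

-- Write the masked text and the detected entities to the sink.
INSERT INTO masked_sink
SELECT id, masked_text, detected_entities
FROM infos,
LATERAL TABLE(
  AI_MASK(
    MODEL => MODEL general_model,
    INPUT => content,
    MASK_ENTITIES => ARRAY['name']
  ));
```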