This topic provides the DDL syntax that is used to create a Faker source table, describes the limits on the use of the Faker connector and the parameters in the WITH clause, and provides sample code.
What is Faker?
Faker is a built-in connector of fully managed Flink. The connector generates test data based on Java Faker expressions that are provided for each field in a table. If you want to use test data to verify business logic during job development or testing, we recommend that you use the Faker connector.
Limits
- Only Flink that uses Ververica Runtime (VVR) 4.0.12 or later supports the Faker connector.
- The Faker connector supports the following data types: CHAR(n), VARCHAR(n), STRING, TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, BOOLEAN, TIMESTAMP, ARRAY, MAP, MULTISET, and ROW.
DDL syntax
CREATE TABLE faker_source (
  `name` STRING,
  `age` INT
) WITH (
  'connector' = 'faker',
  'fields.name.expression' = '#{superhero.name}',
  'fields.age.expression' = '#{number.numberBetween ''0'',''1000''}'
);
Parameters in the WITH clause
Parameter | Description | Data type | Required | Remarks |
---|---|---|---|---|
connector | The type of the source table. | STRING | Yes | Set the value to faker. |
number-of-rows | The number of rows of data that is generated. | INTEGER | No | If you configure this parameter, the source table is bounded. If you do not configure this parameter, the source table is unbounded. |
rows-per-second | The rate at which random data is generated. | INTEGER | No | Default value: 10000. Unit: records per second. |
fields.<field>.expression | The Java Faker expression that is used to generate the value of the field. | STRING | Yes | For more information, see Field expression. |
fields.<field>.null-rate | The fraction of generated rows in which the value of the field is null. | FLOAT | No | Default value: 0.0. |
fields.<field>.length | The number of elements that are generated for a field of the ARRAY, MAP, or MULTISET data type. | INTEGER | No | Default value: 1. |
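As an illustration of how these parameters combine, the following sketch defines a bounded source that emits 1,000 rows at about 100 rows per second. The table name faker_bounded_source, the tags field, and all parameter values are hypothetical and chosen only for this example. The superhero.name and superhero.power expressions are taken from the sample code in this topic.

```sql
-- Hypothetical table: names and values are for illustration only.
CREATE TEMPORARY TABLE faker_bounded_source (
  `name` STRING,
  `tags` ARRAY<STRING>
) WITH (
  'connector' = 'faker',
  -- Stop after 1,000 rows, which makes the source bounded.
  'number-of-rows' = '1000',
  -- Generate about 100 rows per second instead of the default 10000.
  'rows-per-second' = '100',
  'fields.name.expression' = '#{superhero.name}',
  -- About 10% of the generated name values are expected to be null.
  'fields.name.null-rate' = '0.1',
  -- The expression generates each element; every row gets 3 elements.
  'fields.tags.expression' = '#{superhero.power}',
  'fields.tags.length' = '3'
);
```

Because number-of-rows is configured, a job that reads only from this table is expected to finish after all rows are emitted, which can be convenient for short test runs.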
Field expression
- Operation
When you use the Faker connector, you must define an expression in the WITH clause for each field in the DDL statement. The expression is in the 'fields.<field>.expression' = '#{className.methodName ''parameter'', ...}' format. The following table describes the parameters in the expression.
Parameter | Description |
---|---|
field | The name of a field in the DDL statement. |
className | The name of a Faker class. Java Faker provides about 80 Faker classes to generate field expressions. You can select the classes based on your business requirements. Note: Faker class names are not case-sensitive. |
methodName | The name of a method. Note: Method names are not case-sensitive. |
parameter | The input parameters of a method. Note: Each input parameter of a method must be enclosed in single quotation marks ('). Because the whole expression is an SQL string literal, each of these quotation marks is written as two single quotation marks in the statement, as in ''0''. Separate multiple input parameters with commas (,). |
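The expression format can be sketched with methods that take no parameters and a method that takes two. The table and field names below are hypothetical, and the name.fullName and internet.emailAddress methods are assumed to be available in the Java Faker version that your VVR release bundles. Check the Java Faker API documentation before you rely on them.

```sql
-- Hypothetical table that shows the expression format only.
CREATE TEMPORARY TABLE faker_expression_demo (
  `full_name` STRING,
  `email` STRING,
  `age` INT
) WITH (
  'connector' = 'faker',
  -- className = name, methodName = fullName, no parameters.
  'fields.full_name.expression' = '#{name.fullName}',
  -- className = internet, methodName = emailAddress, no parameters.
  'fields.email.expression' = '#{internet.emailAddress}',
  -- Two parameters, each in doubled single quotation marks, separated by a comma.
  'fields.age.expression' = '#{number.numberBetween ''18'',''65''}'
);
```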
- Example
This example shows how to derive the expression for a field in a DDL statement from the Java Faker API documentation. It uses the 'fields.age.expression' = '#{number.numberBetween ''0'',''1000''}' expression that is defined for the age field in the DDL syntax section.
- Find the Number class in the Java Faker API documentation.
- Find the numberBetween method in the Number class and view the method description.
  The parameters of the numberBetween method specify the value range of the return value.
- Obtain the expression 'fields.age.expression' = '#{number.numberBetween ''0'',''1000''}' for the age field based on the Number class, the numberBetween method, and the values 0 and 1000 that are passed to the method.
  This expression indicates that the generated value of the age field is in the range of 0 to 1000.
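The same steps can be applied to a method that takes three parameters. The sketch below assumes that the Number class in your Java Faker version provides a randomDouble method whose parameters are, in order, the maximum number of decimal places, the minimum value, and the maximum value. The table name and the score field are hypothetical.

```sql
-- Hypothetical table: derive an expression from Number.randomDouble.
CREATE TEMPORARY TABLE faker_score_source (
  `score` DOUBLE
) WITH (
  'connector' = 'faker',
  -- Class: number. Method: randomDouble.
  -- Parameters in order: maximum decimal places, minimum, maximum.
  'fields.score.expression' = '#{number.randomDouble ''2'',''0'',''100''}'
);
```

With these values, score is expected to fall between 0 and 100 with at most two decimal places.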
Sample code
-- Source table: generates random superhero records.
CREATE TEMPORARY TABLE heros_source (
  `name` STRING,
  `power` STRING,
  `age` INT
) WITH (
  'connector' = 'faker',
  'fields.name.expression' = '#{superhero.name}',
  'fields.power.expression' = '#{superhero.power}',
  -- About 5% of the generated power values are null.
  'fields.power.null-rate' = '0.05',
  'fields.age.expression' = '#{number.numberBetween ''0'',''1000''}'
);

-- Sink table: discards every row that it receives.
CREATE TEMPORARY TABLE blackhole_sink (
  `name` STRING,
  `power` STRING,
  `age` INT
) WITH (
  'connector' = 'blackhole'
);

-- Write the generated data to the sink.
INSERT INTO blackhole_sink SELECT * FROM heros_source;