Dataphin supports multiple methods for developing Flink SQL tasks. These methods include native Data Definition Language (DDL) and Data Manipulation Language (DML) development, Catalog-based development, development with Dataphin meta tables, development with Dataphin data source tables, and development with physical tables from a computing source. Tables created using different methods, such as image tables, can be used together. Each method has unique procedures, applicable scenarios, advantages, and disadvantages. This topic describes these development methods to help you effectively develop Flink SQL tasks.
Dataphin computing source physical table development method
The Dataphin computing source physical table development method lets you directly access physical tables in a computing source when you develop a Flink SQL task. To do this, specify the table in the format project_name.table_name. This method also supports cross-project access to physical tables that are attached to other projects.
Currently, this method supports access to physical table data from Hologres, Hadoop, and StarRocks computing sources.
The project that contains the physical table must be attached to a supported computing source.
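Cross-project access uses the same project_name.table_name format. The following is a minimal sketch, assuming two hypothetical projects project_a and project_b that are both attached to a supported computing source:

```sql
-- Read from a physical table attached to another project (project_a)
-- and write to a physical table in the current project (project_b).
-- The project and table names here are placeholders.
insert into project_b.test_demo
select id, name from project_a.test;
```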
Example
To insert data from the test physical table in the computing source of the example project into the test_demo physical table, refer to the following sample code:
insert into test_demo select id,name from example.test;

Dataphin data source table development method
The Dataphin data source table development method lets you directly access tables from a data source created in Dataphin when you develop a Flink SQL task. To use this method, you must first configure the data source encoding for the data source. For more information, see Data Source Management.
After you configure the data source encoding, you can reference tables from the data source in a Flink SQL task using the format data_source_encoding.table or data_source_encoding.schema.table. To automatically access the data source that corresponds to the current environment, use the format ${data_source_encoding}.table or ${data_source_encoding}.schema.table.
Currently, only MySQL, Hologres, MaxCompute, Hive, Oracle, StarRocks, SelectDB, and GaussDB (DWS) data sources are supported.
For data sources that do not support schemas, such as MySQL, Hive, StarRocks, and SelectDB, you can access physical tables using the format data_source_encoding.table_name.
For data sources that support schemas, such as Hologres and GaussDB (DWS), you can access physical tables using the format data_source_encoding.schema_name.table_name.
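The environment-variable form described above lets the same task resolve to the data source of the current environment (development or production) automatically. A minimal sketch, assuming a MySQL data source with the hypothetical encoding ds_demo_mysql:

```sql
-- ${ds_demo_mysql} is resolved by Dataphin to the data source
-- that corresponds to the current environment.
insert into test_demo
select id, name from ${ds_demo_mysql}.demo_mysql;
```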
Example
To insert data from the demo_mysql physical table of a MySQL data source with the encoding ds_demo_mysql into the test_demo physical table, refer to the following sample code.
insert into test_demo select id,name from ds_demo_mysql.demo_mysql;

To insert data from the demo_hologres physical table of a Hologres data source with the encoding ds_demo_hologres and schema name hologres into the test_demo physical table, refer to the following sample code.
insert into test_demo select id,name from ds_demo_hologres.hologres.demo_hologres;

Dataphin meta table development method
In Dataphin, a meta table is a logical concept that provides a higher level of abstraction than native DDL and DML development. A meta table is a cross-storage class table managed by Data Management. You can create and manage input, output, and dimension tables as meta tables during the development process. This method lets you create a table once and reference it multiple times. You do not need to write repetitive DDL statements or perform complex mappings for input, output, and dimension tables. This simplifies development and improves efficiency and the user experience. Meta tables also help prevent the exposure of sensitive information, which can occur when you write native Flink DDL statements directly.
Example
To create the demo01 and demo02 data tables and insert data from demo01 into demo02, follow these steps.
Use the Dataphin meta table feature to create the demo01 and demo02 data tables. For more information, see Create a meta table.
Write an INSERT statement in the Flink SQL task. The following is a sample code:
INSERT into demo02 select * from demo01;

Catalog-based development
Catalog-based development involves connecting to a database by creating a Catalog in a Flink SQL task and using the tables within that Catalog. This method eliminates the need to write DDL statements for tables, which simplifies Flink SQL development. For example, after you create Catalog01 and table t1 in a Flink SQL task, you can directly access table t1 by creating Catalog01 again in a new Flink SQL task.
Only the open source Flink real-time computing engine is supported.
Creating physical tables in a Catalog is not supported. Only in-memory temporary tables are supported.
The USE CATALOG and USE DATABASE statements are not supported.
The ALTER TABLE statement is supported only in Flink 1.17.
Accessing tables using the format catalog.database.'schema.table' is not supported. Only the format catalog.database.table is supported.
Supported Catalog types include Java Database Connectivity (JDBC) (MySQL, Oracle) and Paimon.
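Besides JDBC, a Paimon Catalog can be declared in a similar way. The following is a minimal sketch; the catalog name and warehouse path are placeholder assumptions, not values from this topic:

```sql
CREATE CATALOG paimon_catalog WITH (
  'type' = 'paimon',
  'warehouse' = 'file:///tmp/paimon'  -- placeholder warehouse path
);

-- Tables in the catalog are then referenced as catalog.database.table:
-- SELECT * FROM paimon_catalog.`default`.my_table;
```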
Example
CREATE CATALOG my_catalog WITH (
'type' = 'jdbc',
'base-url' = 'jdbc:mysql://rm-uf*******7o.mysql.rds.aliyuncs.com:3306',
'default-database' = 'dataphin_01',
'username' = '*******',
'password' = '*******'
);
CREATE TEMPORARY TABLE t2 (
id bigint,
name STRING
) WITH (
'connector' = 'print'
);
-- Write streaming data to the dynamic table
INSERT INTO t2 SELECT id,name FROM my_catalog.dataphin_01.pf_id_name;

Native DDL+DML development method
The native Data Definition Language (DDL) and Data Manipulation Language (DML) development method involves using Flink SQL statements directly in a Flink SQL task to create and manage data tables. For example, you can use CREATE TABLE/CREATE TEMPORARY TABLE to create a table. This method defines the table schema in code and uses SQL statements to manage the tables.
The native DDL and DML development method requires you to write plaintext usernames and passwords in the code. This practice is not secure and can lead to a data breach. Use this method with caution.
Examples
To use the native DDL and DML development method in a Flink SQL task, refer to the following sample code. The sample code demonstrates how to read simulated data generated by the datagen connector from table t1 and write it to table t2.
To create a table using native DDL and DML statements, you must disable the Prohibit the use of native Flink DDL statements setting in the Dataphin Coding standards. For more information, see Coding standards.
create temporary table t1 (
id bigint,
name varchar
) with (
'connector' = 'datagen',
'rows-per-second' = '1'
);
create temporary table t2 (
id bigint,
name varchar
) with (
'connector' = 'print'
);
-- begin statement set;
insert into t2 select id,replace(name, '\"', '"') as name from t1;
-- end;
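The commented markers in the sample above hint at Flink's statement set syntax, which groups multiple INSERT statements into a single job. A hedged sketch of the uncommented form (the second sink table t3 is hypothetical and would need its own DDL):

```sql
begin statement set;
insert into t2 select id, name from t1;
-- A second INSERT into another sink could be added here, for example:
-- insert into t3 select id, name from t1;
end;
```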