This topic describes how to create an ApsaraDB RDS for MySQL dimension table in Realtime Compute for Apache Flink. This topic also describes the parameters in the WITH clause, cache parameters, and data type mapping used when you create an ApsaraDB RDS for MySQL dimension table.
ApsaraDB RDS for MySQL
ApsaraDB RDS for MySQL is developed based on a branch of the MySQL source code. Its excellent performance has been proven over years of Double 11, during which it needs to handle large volumes of concurrent traffic. ApsaraDB RDS for MySQL provides basic features, such as instance management, account management, database management, backup and restoration, control access, Transparent Data Encryption (TDE), and data migration. ApsaraDB RDS for MySQL also provides the following advanced features and functions:
- ApsaraDB for MyBase dedicated clusters: An ApsaraDB for MyBase dedicated cluster consists of multiple hosts, such as ECS instances of the ecs.i2.xlarge instance type and ECS Bare Metal instances. You can run instances on these hosts. For more information, see What is ApsaraDB for MyBase?
- Read-only RDS instances: If the primary RDS instance is overwhelmed by a large number of read requests, your workloads may be interrupted. In this case, you can create one or more read-only RDS instances to offload read requests from the primary RDS instance. For more information, see Overview of ApsaraDB RDS for MySQL read-only instances. This scales up the read capability of your database system and increases the throughput of your application.
- Read/write splitting: The read/write splitting function provides a read/write splitting endpoint. This endpoint connects to the primary RDS instance and all of the read-only RDS instances to establish an automatic read/write splitting link. For more information, see Read/write splitting. Your application can read and write data into your database system after it connects to this endpoint. ApsaraDB for RDS distributes write requests to the primary RDS instance and read requests to the read-only RDS instances based on the specified read weights. You can create more read-only RDS instances to scale up the read capability of your database system. In addition, you do not need to modify your application.
- Dedicated proxy: A dedicated proxy uses dedicated computing resources. It provides more advanced functions, such as read/write splitting, short-lived connection optimization, and transaction splitting. For more information, see What is database proxies.
- Database Autonomy Service (DAS): DAS supports intelligent diagnostics and optimization at the instance level based on various metrics. These metrics include SQL execution performance, CPU utilization, input/output operations per second (IOPS) utilization, memory usage, disk usage, number of connections, locks, and hotspot tables. For more information, see DAS overview. DAS allows you to identify existing and potential issues that may compromise the health of your database system. In addition, DAS provides details and solutions for the identified issues. This facilitates database maintenance.
Realtime Compute for Apache Flink does not allow you to use ApsaraDB RDS for MySQL V8.0 by using the storage registration method. To use ApsaraDB RDS for MySQL V8.0, we recommend that you configure a plaintext AccessKey pair. For more information about the storage registration method, see Overview.
CREATE TABLE rds_dim_table( id INT, len INT, content VARCHAR, PRIMARY KEY (id), PERIOD FOR SYSTEM_TIME --Define the change period of the dimension table. ) with ( type='rds', url='<yourDatabaseURL>', tableName='<yourDatabaseTableName>', userName='<yourDatabaseUserName>', password='<yourDatabasePassword>' );
Parameters in the WITH clause
|type||The type of the dimension table.||Yes||Set the value to
|url||The Java Database Connectivity (JDBC) URL of the database.||Yes||Set the value in the
|tableName||The name of the table.||Yes||N/A.|
|userName||The username that is used to access the ApsaraDB RDS database.||Yes||N/A.|
|password||The password that is used to access the ApsaraDB RDS database.||Yes||N/A.|
|maxRetryTimes||The maximum number of connection attempts.||No||Default value: 10.|
|cache||The policy that is used to cache data.||No||Valid values:
Note If you set the cache parameter to ALL, you must increase the memory of the node for joining tables because the system asynchronously loads data from the dimension table. The increased memory size is twice the memory size of the remote table.
|cacheSize||The maximum number of rows of data records that can be cached.||No||This parameter is available only if you set the cache parameter to
|cacheTTLMs||The cache timeout period. Unit: milliseconds.||No||If the cache parameter is set to
|cacheReloadTimeBlackList||The time periods during which the cache is not refreshed. This parameter takes effect
when the cache parameter is set to
||No||This parameter is empty by default. For example, you can set this parameter to
|maxJoinRows||The maximum number of results that are returned each time a data record in the primary table is queried and matched with data records in the dimension table.||No||Default value: 1024. If you can estimate that a data record in the primary table corresponds
to a maximum of n data records in the dimension table, you can set the maxJoinRows parameter to n to ensure efficient matching in Realtime Compute for Apache Flink.
Note When you join a dimension table with another table, this parameter specifies the maximum number of results that can be returned after a data record in the primary table is matched with data records in the dimension table.
CREATE TABLE datahub_input1 ( id BIGINT, name VARCHAR, age BIGINT ) WITH ( type='datahub', endPoint='http://dh-cn-hangzhou.aliyun-inc.com', project='<yourProjectName>', topic='<yourTopic>', accessId='<yourAccessID>', accessKey='<yourAccessSecret>', startTime='2017-07-21 00:00:00' ); create table phoneNumber( name VARCHAR, phoneNumber BIGINT, primary key(name), PERIOD FOR SYSTEM_TIME--Define the change period of the dimension table. )WITH( type='rds', url='<yourDatabaseURL>', tableName='<yourDatabaseTableName>', userName='<yourDatabaseUserName>', password='<yourDatabasePassword>' ); CREATE table result_infor( id BIGINT, phoneNumber BIGINT, name VARCHAR )WITH( type='rds', url='<yourDatabaseURL>', tableName='<yourDatabaseTableName>', userName='<yourDatabaseUserName>', password='<yourDatabasePassword>' ); INSERT INTO result_infor SELECT t.id, w.phoneNumber, t.name FROM datahub_input1 as t JOIN phoneNumber FOR SYSTEM_TIME AS OF PROCTIME() as w --You must include this clause when you perform a JOIN operation on the dimension table. ON t.name = w.name;
For more information about the detailed syntax of the JOIN statements for a dimension table, see JOIN statements for dimension tables.
Data type mapping
|Data type of ApsaraDB RDS||Data type of Realtime Compute for Apache Flink|