Use MaxCompute external tables to read from and write to ApsaraDB RDS for MySQL databases over a Virtual Private Cloud (VPC) or the public network—without moving data into MaxCompute storage.
Prerequisites
Before you begin, ensure that you have:
-
An ApsaraDB RDS for MySQL 5.x or 8.0 instance (other RDS engines are not supported)
-
Network connectivity established between MaxCompute and your RDS instance. See Access the public network
-
A network connection configured in MaxCompute for the VPC where your RDS instance resides. Find it in the MaxCompute console under Manage Configurations > Network Connection
Limitations
-
Supported regions: China (Beijing), China (Shanghai), China (Zhangjiakou), China (Ulanqab), China (Hangzhou), China (Shenzhen), China (Hong Kong), China (Shanghai) Finance Cloud (Zone F), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), US (Silicon Valley), and US (Virginia).
-
PrivateZone domain names are not supported.
-
External tables do not support the cluster property.
-
Table and column names are case-insensitive. Forcing case conversion is not supported.
Write behavior
When writing large amounts of data, MaxCompute uses parallel multi-process writing. This can cause a write process to rewrite data and result in data duplication. Design your RDS tables with deduplication keys or handle duplicate records downstream if you write large datasets through external tables.
DECIMAL precision
The DECIMAL type in ApsaraDB RDS external tables defaults to 18 decimal places and can only be created as DECIMAL(38,18). If you need fewer decimal places, declare the column as STRING when creating the external table, then use the CAST function to convert values at query time.
Schema mismatch behavior
If the schema of the external table does not match the ApsaraDB RDS for MySQL source table:
| Mismatch type | Behavior |
|---|---|
| External table has more columns than RDS table | Error: Unknown column 'xxx' in 'field list' |
| External table has fewer columns than RDS table | Extra RDS columns are silently discarded |
| STRING to INT mapping | Not supported |
| INT to STRING mapping | Supported but not recommended |
Syntax
Enable Hive-compatible mode and run CREATE EXTERNAL TABLE with a JDBC storage handler and connection string:
-- Enable Hive-compatible mode
SET odps.sql.hive.compatible = true;
CREATE EXTERNAL TABLE <table_name> (
<col_name1> <data_type>,
<col_name2> <data_type>,
...
)
STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'
LOCATION '<jdbc:mysql://<host:port>/<rds_database_name>?useSSL=false&user=<user_name>&password=<password_value>&table=<rds_table_name>>'
TBLPROPERTIES (
['odps.federation.jdbc.colmapping'='<col_name1:rdstable_colname1|select_alias1>[,<col_name2:rdstable_colname2|select_alias2>,...]',]
'mcfed.mapreduce.jdbc.input.query'='<select_sentence>',
'networklink'='<networklink_name>'
);
Parameters
Replace the placeholders in the LOCATION string and TBLPROPERTIES with your actual values:
| Placeholder | Description | Example |
|---|---|---|
<table_name> |
Name of the external table to create | mc_rds_orders |
<col_name> / <data_type> |
Column definitions matching the RDS table columns | id INT, name STRING |
<host:port> |
Internal endpoint and port of the RDS instance | rm-xxx.mysql.rds.aliyuncs.com:3306 |
<rds_database_name> |
Name of the RDS database | production_db |
<user_name> |
RDS database account | rds_user |
<password_value> |
Password for the RDS account | — |
<rds_table_name> |
Name of the source table in RDS | orders |
<networklink_name> |
Name of the network connection in MaxCompute | rds-vpc-link |
If the connection string contains special characters, URL-encode them. See URL_ENCODE.
To find the RDS internal endpoint and port:
-
Log on to the RDS console and click Instances in the left navigation pane.
-
Click the Instance ID/Name of your target instance.
-
In the left navigation pane, click RDS consoleDatabase Connection to view the Internal Endpoint and Internal Port.
To find the network connection name:
-
Log on to the MaxCompute console and select your region.
-
In the left navigation pane, choose Manage Configurations > Network Connection.
-
Copy the network connection name for the VPC of your RDS instance.
TBLPROPERTIES reference
| Property | Required | Description |
|---|---|---|
networklink |
Yes | Name of the MaxCompute network connection for the VPC where the RDS instance resides |
mcfed.mapreduce.jdbc.input.query |
No | A passthrough query sent directly to RDS for reads. Format: SELECT xxx FROM <rds_database_name>.<rds_table_name>. Column names and types must match the external table definition, including any aliases. |
odps.federation.jdbc.colmapping |
No | Column mapping between the external table and the RDS source table. Format: col_name:rdstable_colname|select_alias. The number of mapped columns must equal the external table column count. If omitted, columns are mapped by name. If partially specified, unmapped columns are matched by name—errors occur if names or types do not match. |
Create an external table from an ApsaraDB RDS data source
This section walks through an end-to-end example: creating an RDS table with sample data, then creating two MaxCompute external tables that map to it.
Step 1: Set up the RDS table
-
Log on to the RDS console and click Instances in the left navigation pane.
-
Select your instance. If you do not have one, click Create Instance and set the engine to ApsaraDB RDS for MySQL 5.x or 8.0.
-
In the left navigation pane, click Databases, then click Create Database. Configure the following parameters:
Parameter Required Description Example Database Name Yes 2–64 characters. Starts with a letter, ends with a letter or digit. Lowercase letters, digits, underscores, and hyphens are allowed. Must be unique within the instance. If the name contains -, the folder name replaces it with@002d.rds_mc_testSupported Character Set Yes Select a character set for the database. utf8Authorized Account No Account that can access this database. Only standard accounts are listed. Privileged accounts have access to all databases automatically. You can attach an account after creation. DefaultDescription No Notes for database management. Up to 256 characters. RDS external table test -
Click Log On to Database. In the left navigation pane of the DMS console, select Database Instances and double-click the database you created.
-
On the SQLConsole page, create a test table and insert sample data:
If the target database is not visible after expanding the instance in DMS, the logon account may lack access. Go to the Accounts page of the RDS instance to modify account permissions or change the logon account. Alternatively, hover over the instance name and click the refresh button to sync the database list.
CREATE TABLE `rds_mc_external` ( `id` INT(11) DEFAULT NULL, `name` VARCHAR(32) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO `rds_mc_external` (`id`, `name`) VALUES (1, 'Alice'); INSERT INTO `rds_mc_external` (`id`, `name`) VALUES (1, 'Bob');
Step 2: Establish network connectivity
Establish network connectivity between MaxCompute and ApsaraDB RDS. For more information, see Access the public network.
After the network is connected, MaxCompute can connect only to the network of the specified VPC. To access other regions or other VPCs in the same region, you must establish a network connection between the VPC that is specified for the leased line connection and the other VPCs based on your cloud VPC connection solution.
Step 3: Create a MaxCompute external table
Choose the mapping method based on how your external table columns correspond to RDS columns.
Method 1: Map all columns by name
Use this method when the external table column names match the RDS column names exactly.
-
In the MaxCompute client, create the external table:
SET odps.sql.hive.compatible = true; CREATE EXTERNAL TABLE mc_vpc_rds_external ( id INT, name STRING ) STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler' LOCATION 'jdbc:mysql://rm-2ze01y92y1tzp****.mysql.rds.aliyuncs.com:3306/rds_mc_test?useSSL=false&user=<user_name>&password=<password>&table=rds_mc_external' TBLPROPERTIES ( 'odps.federation.jdbc.colmapping' = 'key:id,value:name', 'mcfed.mapreduce.jdbc.input.query' = 'select * from rds_mc_test.rds_mc_external', 'networklink' = '<your_network_connection_name>' ); -
Insert a row through the external table:
INSERT INTO TABLE mc_vpc_rds_external VALUES (2, 'Zoey'); -
Query the result:
SELECT * FROM mc_vpc_rds_external;Expected output:
+------------+------------+ | id | name | +------------+------------+ | 1 | Alice | | 1 | Bob | | 2 | Zoey | +------------+------------+
Method 2: Map specific columns
Use this method when you want to map the external table columns to the specified column names in the ApsaraDB RDS table.
-
In the MaxCompute client, create the external table:
SET odps.sql.hive.compatible = true; CREATE EXTERNAL TABLE mc_vpc_rds_external_mapping ( id INT, name STRING ) STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler' LOCATION 'jdbc:mysql://rm-2ze01y92y1tzp****.mysql.rds.aliyuncs.com:3306/rds_mc_test?useSSL=false&user=****&password=****&table=rds_mc_external' TBLPROPERTIES ( 'mcfed.mapreduce.jdbc.input.query' = 'select * from rds_mc_test.rds_mc_external', 'networklink' = '<your_network_connection_name>' ); -
Insert a row through the external table:
INSERT INTO TABLE mc_vpc_rds_external_mapping VALUES (4, 'Lisa'); -
Query the result:
SELECT * FROM mc_vpc_rds_external_mapping;Expected output:
+------------+------------+ | id | name | +------------+------------+ | 1 | Alice | | 1 | Bob | | 4 | Lisa | +------------+------------+
TBLPROPERTIES attributes
The TBLPROPERTIES of ApsaraDB RDS external tables also support the following attributes.
|
Attribute Name |
Function |
|
|
When data is retrieved from ApsaraDB RDS, this filter condition is added. The difference between setting
For example, assume an ApsaraDB RDS external table contains 100 rows of data. If you use |
|
|
The type of operation for writing data to MySQL. Only SimpleInsert, InsertOnDuplicateKeyUpdate, and ReplaceInto are supported. MaxCompute INSERT statements are parsed into the following three types of SQL statements to update the database:
If this attribute is not set, the default value is SimpleInsert. |
|
|
Specifies whether to automatically commit (autocommit) the instance when reading or writing data after a connection is established. A commit operation is performed at the end of each running instance. |
|
|
Specifies the batch size for each data insertion when writing data. The default value is 1000. An insert operation is performed at the end of each running instance. |
|
|
When you read data from an external table, you can use this attribute to call some built-in functions of ApsaraDB RDS. For example, the ApsaraDB RDS source table has a column named col_geometry of the GEOMETRY type. The MaxCompute external table specifies a column named odps_geo of the STRING type. If you set this attribute to odps_geo:astext(col_geometry), the GEOMETRY data from ApsaraDB RDS is converted to STRING and read into MaxCompute. |
|
|
When you write data to an external table, you can use this attribute to call some built-in functions of ApsaraDB RDS. For example, the ApsaraDB RDS source table has a column named col_geometry of the GEOMETRY type. The MaxCompute external table specifies a column named odps_geo of the STRING type. If you set this attribute to odps_geo:GEOMETRYFROMTEXT(?), the STRING data from MaxCompute is converted to GEOMETRY in ApsaraDB RDS and written to ApsaraDB RDS. |