AnalyticDB for MySQL Data Lakehouse Edition connects to ApsaraDB RDS for MySQL through Spark jobs using two methods: an elastic network interface (ENI) or an SSL connection. ENI-based access is simpler to configure. SSL-based access adds transport-layer encryption for environments with stricter data security requirements. Both methods support same-account and cross-account scenarios.
Choose a connection method
| Method | When to use | Security |
|---|---|---|
| ENI | The RDS instance and the AnalyticDB cluster are in the same VPC, or cross-account access with a RAM role is acceptable | Standard VPC-level isolation |
| SSL | Encrypted connections are required by compliance policy or the data is highly sensitive | Transport-layer encryption (SSL) on top of VPC |
Prerequisites
Before you begin, ensure that you have:
Cluster and instance setup:
An AnalyticDB for MySQL Data Lakehouse Edition cluster. See Create a cluster.
An ApsaraDB RDS for MySQL instance in the same region as the cluster. See Create an ApsaraDB RDS for MySQL instance.
A job resource group for the cluster. See Create a resource group.
An Object Storage Service (OSS) bucket in the same region as the cluster. See Activate OSS and Create buckets.
Accounts and permissions:
A database account for the AnalyticDB for MySQL cluster:
Alibaba Cloud account: create a privileged account. See the "Create a privileged account" section of Create a database account.
Resource Access Management (RAM) user: create a privileged account and a standard account, then associate the standard account with the RAM user. See Create a database account and Associate or disassociate a database account with or from a RAM user.
Authorization is complete. See Perform authorization.
Network and security:
The RDS instance is added to a security group whose inbound and outbound rules allow traffic on the instance's port. See Configure a security group for an ApsaraDB RDS for MySQL instance and Add a security group rule.
Same-account access requires the AliyunADBSparkProcessingDataRole permission. Cross-account access requires additional authorization for the other Alibaba Cloud accounts.
Prepare data
Run the following statements against your ApsaraDB RDS for MySQL instance to create the sample database and table used in the examples below:
CREATE DATABASE `test`;
CREATE TABLE `test`.`persons` (
  `id` int(11) DEFAULT NULL,
  `first_name` varchar(32) DEFAULT NULL,
  `last_name` varchar(32) DEFAULT NULL,
  `age` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `test`.`persons` VALUES(1,'a','b',5);
INSERT INTO `test`.`persons` VALUES(2,'c','d',6);
INSERT INTO `test`.`persons` VALUES(3,'e','f',7);
Access ApsaraDB RDS for MySQL over an ENI
All examples in this section use spark.adb.eni.enabled: true to route Spark job traffic through an elastic network interface to the RDS instance.
Step 1: Prepare the driver and JAR package
Write a Spark program that reads from the RDS table, then compile it into a JAR package. The example below produces a JAR named rds_test.jar:
package com.aliyun.spark
import org.apache.spark.sql.SparkSession
object SparkRDS {
  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder()
      .appName("rds mysql test")
      .getOrCreate()
    // Internal endpoint of the ApsaraDB RDS for MySQL instance.
    // See "View and change the endpoints and port numbers" for details.
    val url = "jdbc:mysql://rm-bp11mpql1e01****.mysql.rds.aliyuncs.com"
    // Table name in db_name.table_name format
    val dbtable = "test.persons"
    // Database account credentials. Pass via args or environment variables in production.
    val user = "mysql_username"
    val password = "mysql_password"
    val jdbcDF = sparkSession.read
      .format("jdbc")
      .option("url", url)
      // Driver class name for MySQL Connector/J 8.x
      .option("driver", "com.mysql.cj.jdbc.Driver")
      .option("dbtable", dbtable)
      .option("user", user)
      .option("password", password)
      .load()
    jdbcDF.show()
  }
}
Download the MySQL Connector/J driver that matches your RDS engine version from https://dev.mysql.com/downloads/connector/j/. This example uses mysql-connector-java-8.0.11.jar.
Upload both the JAR package (rds_test.jar) and the driver (mysql-connector-java-8.0.11.jar) to your OSS bucket. See Upload objects.
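The sample program above hardcodes the database credentials for brevity. A minimal sketch of the pattern its comment recommends for production: resolve each credential from a positional argument, falling back to an environment variable. The object and method names, and the RDS_USER/RDS_PASSWORD variable names, are illustrative choices, not part of any AnalyticDB or Spark API.

```scala
object Credentials {
  // Prefer the positional argument; fall back to an environment variable.
  // Fails fast with a clear message when neither is available.
  def resolve(args: Array[String], index: Int, envVar: String): String =
    if (args.length > index) args(index)
    else sys.env.getOrElse(envVar, sys.error(s"missing credential: set $envVar"))

  def main(args: Array[String]): Unit = {
    val user = resolve(args, 0, "RDS_USER")
    val password = resolve(args, 1, "RDS_PASSWORD")
    require(password.nonEmpty)
    // Hand user/password to the JDBC .option(...) calls instead of literals.
    println(s"resolved credentials for user $user")
  }
}
```

Submitting the credentials through `args` keeps them out of the compiled JAR; the SSL example later in this topic uses the same argument-passing approach.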
Step 2: Submit the Spark job
Same-account access
Log on to the AnalyticDB for MySQL console. In the upper-left corner, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab, find the cluster, and click its ID.
In the left-side navigation pane, choose Job Development > Spark JAR Development.
Select a job resource group and set the job type to Batch.
Paste the following configuration into the Spark editor, replacing the placeholder values:
| Parameter | Required | Description |
|---|---|---|
| name | Yes | Name of the Spark job |
| jars | Yes | OSS path of the MySQL Connector/J driver JAR |
| file | Yes | OSS path of the Spark job JAR package |
| className | Yes | Entry class of the Java or Scala application. In this example: com.aliyun.spark.SparkRDS |
| conf | Yes | Spark configuration parameters in key:value format, separated by commas. See Spark application configuration parameters |
| spark.adb.eni.enabled | Yes | Set to true to enable ENI-based routing |
| spark.adb.eni.vswitchId | Yes | vSwitch ID of the RDS instance. To find it, move the pointer over VPC on the Database Connection page |
| spark.adb.eni.securityGroupId | Yes | ID of the security group that the RDS instance belongs to. See Configure a security group for an ApsaraDB RDS for MySQL instance |
| spark.driver.resourceSpec | Yes | Resource specification for the Spark driver |
| spark.executor.instances | Yes | Number of Spark executor instances |
| spark.executor.resourceSpec | Yes | Resource specification for each Spark executor |
{
  "name": "rds-mysql-example",
  "jars": [
    "oss://testBucketName/mysql-connector-java-8.0.11.jar"
  ],
  "file": "oss://testBucketName/rds_test.jar",
  "className": "com.aliyun.spark.SparkRDS",
  "conf": {
    "spark.adb.eni.enabled": "true",
    "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
    "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
    "spark.driver.resourceSpec": "small",
    "spark.executor.instances": 1,
    "spark.executor.resourceSpec": "small"
  }
}
Click Run Now.
After the job completes, verify the result in the Spark job logs. See the View information about a Spark application section of the Spark editor topic.
Cross-account access
Follow steps 1–3 from Same-account access, then use the configuration below. It adds spark.adb.eni.roleArn to assume a RAM role in the account that owns the RDS instance.
{
"name": "rds-mysql-example",
"jars": [
"oss://testBucketName/mysql-connector-java-8.0.11.jar"
],
"file": "oss://testBucketName/rds_test.jar",
"className": "com.aliyun.spark.SparkRDS",
"conf": {
"spark.adb.eni.enabled": "true",
"spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
"spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
"spark.driver.resourceSpec": "small",
"spark.executor.instances": 1,
"spark.executor.resourceSpec": "small",
"spark.adb.eni.roleArn": "acs:ram::testAccountID:role/testUserName"
}
}
| Parameter | Required | Description |
|---|---|---|
| spark.adb.eni.roleArn | Yes (cross-account only) | ARN of the RAM role used to access the RDS instance across accounts. Separate multiple ARNs with commas. Format: acs:ram::testAccountID:role/testUserName, where testAccountID is the ID of the account that owns the RDS instance and testUserName is the RAM role name created during cross-account authorization. See Perform authorization across Alibaba Cloud accounts |
All other parameters are the same as in the same-account configuration above.
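To make the documented ARN format concrete, here is a small helper that assembles the spark.adb.eni.roleArn value from account ID and role name pairs. The object and method names are ours, chosen only for illustration; the acs:ram::&lt;accountId&gt;:role/&lt;roleName&gt; pattern and the comma separator come from the parameter description above.

```scala
object RoleArn {
  // One ARN in the documented format: acs:ram::<accountId>:role/<roleName>
  def build(accountId: String, roleName: String): String =
    s"acs:ram::$accountId:role/$roleName"

  // Multiple ARNs are separated with commas, per the parameter description.
  def buildAll(pairs: Seq[(String, String)]): String =
    pairs.map { case (id, role) => build(id, role) }.mkString(",")
}
```

For example, `RoleArn.build("testAccountID", "testUserName")` yields the placeholder value used in the configuration above.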
Click Run Now, then verify the result in the Spark job logs.
Access ApsaraDB RDS for MySQL over an SSL connection
Before proceeding, enable SSL encryption for the ApsaraDB RDS for MySQL instance and make sure the internal endpoint connection is encrypted. See Configure the SSL encryption feature.
Step 1: Download the CA certificate and upload to OSS
Log on to the ApsaraDB RDS console. In the upper-left corner, select a region. In the left-side navigation pane, click Instances, then click the instance ID.
In the left-side navigation pane, click Data Security.
Click Download CA Certificate.
Important: The CA certificate is valid for one year. After it expires, generate a new one. An expired certificate cannot be used to establish an SSL connection.
Decompress the downloaded package, then upload the extracted JKS file (ApsaraDB-CA-Chain.jks) to your OSS bucket. See Upload objects.
Step 2: Prepare the Spark JAR package
Write a Spark program that connects to RDS over SSL, then compile it into test.jar. The program reads the JKS certificate path and database credentials from command-line arguments, keeping sensitive values out of the source code:
package org.example
import org.apache.spark.sql.SparkSession
object Test {
def main(args: Array[String]): Unit = {
// OSS path of the JKS certificate file, e.g. oss://testBucketName/folder/ApsaraDB-CA-Chain.jks
val JKS_FILE_PATH = args(0)
// Database account credentials
val USERNAME = args(1)
val PASSWORD = args(2)
// Database and table names
val DATABASE_NAME = args(3)
val TABLE_NAME = args(4)
// Internal endpoint of the ApsaraDB RDS for MySQL instance.
// The JKS file is downloaded to /tmp/<OSS_PATH> at runtime by the ADB_SPARK_DOWNLOAD_FILES setting.
val mysqlUrl = "jdbc:mysql://rm-bp11mpql1e01****.mysql.rds.aliyuncs.com:3306/?" +
"useSSL=true" +
s"&trustCertificateKeyStoreUrl=file:///tmp/testBucketName/folder/ApsaraDB-CA-Chain.jks" +
"&trustCertificateKeyStorePassword=apsaradb" +
"&trustCertificateKeyStoreType=JKS"
val spark = SparkSession.builder().getOrCreate()
spark.read.format("jdbc")
.option("driver", "com.mysql.cj.jdbc.Driver")
.option("url", mysqlUrl)
.option("user", USERNAME)
.option("password", PASSWORD)
.option("dbtable", s"${DATABASE_NAME}.${TABLE_NAME}")
.load()
.show()
}
}
The JDBC connection URL uses the following SSL parameters:
| Parameter | Value | Required | Description |
|---|---|---|---|
| useSSL | true (default: false) | Yes | Enables SSL encryption for the JDBC connection |
| trustCertificateKeyStoreUrl | file:///tmp/<JKS_FILE_PATH> | Yes | Local path of the JKS certificate file. The runtime maps the OSS path to /tmp/<OSS_PATH>. For example, oss://testBucketName/folder/ApsaraDB-CA-Chain.jks maps to file:///tmp/testBucketName/folder/ApsaraDB-CA-Chain.jks |
| trustCertificateKeyStorePassword | apsaradb | Yes | Fixed password for the JKS certificate file |
| trustCertificateKeyStoreType | JKS | Yes | Certificate storage format |
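The OSS-to-local mapping in the trustCertificateKeyStoreUrl row follows a fixed rule: strip the oss:// scheme and prefix the rest with /tmp. A minimal sketch of that derivation, assuming only the mapping stated in the table (the object and method names are ours, for illustration):

```scala
object JksPath {
  // Map an OSS object path to the local file URL where the runtime downloads
  // it, per the ADB_SPARK_DOWNLOAD_FILES convention described in this topic:
  //   oss://bucket/folder/file.jks -> file:///tmp/bucket/folder/file.jks
  def toTrustStoreUrl(ossPath: String): String = {
    require(ossPath.startsWith("oss://"), s"not an OSS path: $ossPath")
    "file:///tmp/" + ossPath.stripPrefix("oss://")
  }
}
```

Deriving the URL this way keeps the Spark program consistent with whatever OSS path you set in ADB_SPARK_DOWNLOAD_FILES, instead of maintaining the two strings by hand.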
Upload test.jar to your OSS bucket. See Upload objects.
Step 3: Submit the Spark job
Same-account access
Log on to the AnalyticDB for MySQL console. Select a region, click Clusters, find the cluster, and click its ID.
In the left-side navigation pane, choose Job Development > Spark JAR Development.
Select a job resource group and set the job type to Batch.
Paste the following configuration into the Spark editor:
| Parameter | Required | Description |
|---|---|---|
| name | Yes | Name of the Spark job |
| file | Yes | OSS path of the Spark job JAR package |
| className | Yes | Entry class of the Java or Scala application. In this example: org.example.Test |
| conf | Yes | Spark configuration parameters in key:value format, separated by commas. See Spark application configuration parameters |
| spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES | Yes | OSS path of the JKS certificate file for the Spark driver. Separate multiple files with commas. Example: oss://testBucketName/a.jks,oss://testBucketName/b.jks |
| spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES | Yes | OSS path of the JKS certificate file for the Spark executor. Separate multiple files with commas |
| spark.adb.eni.enabled | Yes | Set to true to enable ENI-based routing |
| spark.adb.eni.vswitchId | Yes | vSwitch ID of the RDS instance. Find it by hovering over VPC on the Database Connection page |
| spark.adb.eni.securityGroupId | Yes | ID of the security group that the RDS instance belongs to. See Configure a security group for an ApsaraDB RDS for MySQL instance |
| spark.driver.resourceSpec | Yes | Resource specification for the Spark driver |
| spark.executor.instances | Yes | Number of Spark executor instances |
| spark.executor.resourceSpec | Yes | Resource specification for each Spark executor |
{
  "file": "oss://testBucketName/test.jar",
  "className": "org.example.Test",
  "name": "MYSQL PEM Test",
  "conf": {
    "spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
    "spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
    "spark.driver.resourceSpec": "small",
    "spark.executor.instances": 1,
    "spark.executor.resourceSpec": "small",
    "spark.adb.eni.enabled": "true",
    "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
    "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****"
  }
}
Click Run Now.
After the job completes, verify the result in the Spark job logs. See the View information about a Spark application section of the Spark editor topic.
Cross-account access
Follow steps 1–3 from Same-account access, then add spark.adb.eni.roleArn to the configuration:
{
"file": "oss://testBucketName/test.jar",
"className": "org.example.Test",
"name": "MYSQL PEM Test",
"conf": {
"spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
"spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
"spark.driver.resourceSpec": "small",
"spark.executor.instances": 1,
"spark.executor.resourceSpec": "small",
"spark.adb.eni.enabled": "true",
"spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
"spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
"spark.adb.eni.roleArn": "acs:ram::testAccountID:role/testUserName"
}
}
spark.adb.eni.roleArn is the only additional parameter. Its format and behavior are the same as described in Cross-account access for the ENI method.
Click Run Now, then verify the result in the Spark job logs.