
AnalyticDB for MySQL:Access ApsaraDB RDS for MySQL

Last Updated: Apr 07, 2024

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) allows you to access ApsaraDB RDS for MySQL within an Alibaba Cloud account or across Alibaba Cloud accounts. You can access ApsaraDB RDS for MySQL over an elastic network interface (ENI) or an SSL connection. Compared with ENI-based access, SSL-based access encrypts the network connection and provides higher data security. This topic describes how to access ApsaraDB RDS for MySQL over an ENI or an SSL connection.

Prerequisites

Prepare data

Create a database in the ApsaraDB RDS for MySQL instance, create a table in the database, and then write data to this table. Sample statements:

CREATE DATABASE `test`;

CREATE TABLE `test`.`persons` (
  `id` int(11) DEFAULT NULL,
  `first_name` varchar(32) DEFAULT NULL,
  `last_name` varchar(32) DEFAULT NULL,
  `age` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `test`.`persons` VALUES(1,'a','b',5);
INSERT INTO `test`.`persons` VALUES(2,'c','d',6);
INSERT INTO `test`.`persons` VALUES(3,'e','f',7);

Access ApsaraDB RDS for MySQL over an ENI

Upload the driver and the JAR package that is required for the Spark job

  1. Write a program that is used to access a table in the ApsaraDB RDS for MySQL instance. Then, compile the program into a JAR package that is required for the Spark job. In this example, the JAR package is named rds_test.jar. Sample code:

    package com.aliyun.spark

    import org.apache.spark.sql.SparkSession

    object SparkRDS {
      def main(args: Array[String]): Unit = {
        val sparkSession = SparkSession.builder()
          .appName("rds mysql test")
          .getOrCreate()
        // The internal endpoint of the ApsaraDB RDS for MySQL instance. For more information, see View and change the endpoints and port numbers.
        val url = "jdbc:mysql://rm-bp11mpql1e01****.mysql.rds.aliyuncs.com"
        // The name of the table in the ApsaraDB RDS for MySQL instance, in the db_name.table_name format.
        val dbtable = "test.persons"
        // The database account that is used to connect to the ApsaraDB RDS for MySQL database.
        val user = "mysql_username"
        // The password of the database account.
        val password = "mysql_password"
        val jdbcDF = sparkSession.read
          .format("jdbc")
          .option("url", url)
          // The driver class of MySQL Connector/J 8.x.
          .option("driver", "com.mysql.cj.jdbc.Driver")
          .option("dbtable", dbtable)
          .option("user", user)
          .option("password", password)
          .load()
        jdbcDF.show()
      }
    }
  2. Download the driver that is compatible with the engine version of the ApsaraDB RDS for MySQL instance from https://dev.mysql.com/downloads/connector/j/.

    In this example, mysql-connector-java-8.0.11.jar is downloaded.

  3. Upload the JAR package and the driver to OSS. For more information, see Upload objects.

Access ApsaraDB RDS for MySQL within an Alibaba Cloud account

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Data Lakehouse Edition (V3.0) tab, find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  3. Select a job resource group and a job type for the Spark job. In this example, the Batch type is selected.

  4. Run the following code in the Spark editor:

    {
        "name": "rds-mysql-example",
        "jars": [
            "oss://testBucketName/mysql-connector-java-8.0.11.jar"
        ],
        "file": "oss://testBucketName/rds_test.jar",
        "className": "com.aliyun.spark.SparkRDS",
        "conf": {
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
            "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small"
        }
    }

    The following list describes the parameters.

    • name: The name of the Spark job.

    • jars: The OSS path of the ApsaraDB RDS for MySQL driver. In this example, the OSS path of the mysql-connector-java-8.0.11.jar package is used.

    • file: The OSS path of the JAR package that is required for the Spark job.

    • className: The entry class of the Java or Scala application. In this example, com.aliyun.spark.SparkRDS is used.

    • spark.adb.eni.enabled: Specifies whether to enable ENI.

    • spark.adb.eni.vswitchId: The vSwitch ID of the ApsaraDB RDS for MySQL instance. To view the vSwitch ID, move the pointer over VPC on the Database Connection page.

    • spark.adb.eni.securityGroupId: The ID of the security group to which the ApsaraDB RDS for MySQL instance is added. For more information, see Configure a security group for an ApsaraDB RDS for MySQL instance.

    • conf: The configuration parameters that are required for the Spark job, which are similar to those of Apache Spark. Specify the parameters in the key:value format and separate multiple parameters with commas (,). For more information, see Spark application configuration parameters.
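    The job specification above can also be assembled and sanity-checked programmatically before you paste it into the Spark editor. The following Python sketch is illustrative and uses the placeholder values from this example; the required-field check reflects only the fields used in this topic, not an official schema.

```python
import json

# Placeholder values from the example above; replace them with your own.
job = {
    "name": "rds-mysql-example",
    "jars": ["oss://testBucketName/mysql-connector-java-8.0.11.jar"],
    "file": "oss://testBucketName/rds_test.jar",
    "className": "com.aliyun.spark.SparkRDS",
    "conf": {
        "spark.adb.eni.enabled": "true",
        "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
        "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
        "spark.driver.resourceSpec": "small",
        "spark.executor.instances": 1,
        "spark.executor.resourceSpec": "small",
    },
}

# Check the fields this example relies on before submitting the job.
required = {"name", "file", "className", "conf"}
missing = required - job.keys()
assert not missing, f"missing fields: {missing}"
# ENI must be enabled for the job to reach the RDS instance over the VPC.
assert job["conf"]["spark.adb.eni.enabled"] == "true"

print(json.dumps(job, indent=4))
```

    Serializing the dict with json.dumps guarantees the pasted text is valid JSON, which avoids easy-to-miss mistakes such as trailing or missing commas.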

  5. Click Run Now.

  6. After the Spark job is run, view the data of the ApsaraDB RDS for MySQL table in the logs of the Spark job. For more information, see the "View information about a Spark application" section of the Spark editor topic.

Access ApsaraDB RDS for MySQL across Alibaba Cloud accounts

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Data Lakehouse Edition (V3.0) tab, find the cluster that you want to manage and click the cluster ID.

    Important

    If you want to access ApsaraDB RDS for MySQL across Alibaba Cloud accounts, you must log on to the AnalyticDB for MySQL console as the RAM user that is created for the new Alibaba Cloud account. For more information, see Perform authorization.

  2. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  3. Select a job resource group and a job type for the Spark job. In this example, the Batch type is selected.

  4. Run the following code in the Spark editor:

    {
        "name": "rds-mysql-example",
        "jars": [
            "oss://testBucketName/mysql-connector-java-8.0.11.jar"
        ],
        "file": "oss://testBucketName/rds_test.jar",
        "className": "com.aliyun.spark.SparkRDS",
        "conf": {
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
            "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small",
            "spark.adb.eni.roleArn": "acs:ram::testAccountID:role/testUserName"
        }
    }

    The following list describes the parameter.

    • spark.adb.eni.roleArn: The RAM role that is used to access ApsaraDB RDS for MySQL across Alibaba Cloud accounts. Separate multiple roles with commas (,). Specify the parameter in the acs:ram::testAccountID:role/testUserName format.

      • testAccountID: the ID of the Alibaba Cloud account that owns the ApsaraDB RDS for MySQL data source.

      • testUserName: the name of the RAM role that is created when you perform authorization across Alibaba Cloud accounts. For more information, see the "Perform authorization across Alibaba Cloud accounts" section of the Perform authorization topic.

    For more information about the other parameters, see the parameter descriptions in the preceding section.
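    Because the roleArn value follows a fixed layout, it can be validated before the job is submitted. The following Python sketch is illustrative (the helper name is not part of any API); it splits an ARN in the documented acs:ram::testAccountID:role/testUserName format into its account ID and role name.

```python
def parse_role_arn(arn: str) -> tuple[str, str]:
    """Split an ARN of the form acs:ram::<accountID>:role/<roleName>
    into (accountID, roleName); raise ValueError for anything else."""
    prefix = "acs:ram::"
    if not arn.startswith(prefix):
        raise ValueError(f"not a RAM role ARN: {arn}")
    rest = arn[len(prefix):]  # "<accountID>:role/<roleName>"
    account_id, sep, role_name = rest.partition(":role/")
    if not sep or not account_id or not role_name:
        raise ValueError(f"malformed RAM role ARN: {arn}")
    return account_id, role_name

# Multiple roles are comma-separated, as the parameter description notes.
arns = "acs:ram::testAccountID:role/testUserName"
for arn in arns.split(","):
    account, role = parse_role_arn(arn)
    print(account, role)
```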

  5. Click Run Now.

  6. After the Spark job is run, view the data of the ApsaraDB RDS for MySQL table in the logs of the Spark job. For more information, see the "View information about a Spark application" section of the Spark editor topic.

Access ApsaraDB RDS for MySQL over an SSL connection

Before you access ApsaraDB RDS for MySQL over an SSL connection, make sure that SSL encryption is enabled for the ApsaraDB RDS for MySQL instance and the connection to the internal endpoint is encrypted. For more information, see Configure the SSL encryption feature.

Download the CA certificate from the ApsaraDB RDS console and upload the certificate to OSS

  1. Log on to the ApsaraDB RDS console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Instances. On the page that appears, click the ID of the instance that you want to view.

  2. In the left-side navigation pane, click Data Security.

  3. Click Download CA Certificate.

    Important

    The default validity period of a CA certificate is one year. If the CA certificate expires, you must generate another certificate. You cannot use an expired CA certificate to access ApsaraDB RDS for MySQL over an SSL connection.

  4. Decompress the CA certificate package and upload the extracted JKS certificate file to OSS. For more information, see Upload objects.

Upload the driver and the JAR package that is required for the Spark job

  1. Write a program that is used to access a table in the ApsaraDB RDS for MySQL instance. Then, compile the program into a JAR package that is required for the Spark job. In this example, the JAR package is named test.jar. Sample code:

    package org.example

    import org.apache.spark.sql.SparkSession

    object Test {
      def main(args: Array[String]): Unit = {
        // The OSS path of the JKS certificate file, in the oss://testBucketName/folder/ApsaraDB-CA-Chain.jks format.
        val JKS_FILE_PATH = args(0)
        // The database account that is used to connect to the ApsaraDB RDS for MySQL database.
        val USERNAME = args(1)
        // The password of the database account.
        val PASSWORD = args(2)
        // The name of the ApsaraDB RDS for MySQL database.
        val DATABASE_NAME = args(3)
        // The name of the table in the ApsaraDB RDS for MySQL instance.
        val TABLE_NAME = args(4)

        // The internal endpoint of the ApsaraDB RDS for MySQL instance.
        // The JKS file that is uploaded to OSS is downloaded to /tmp/<bucket>/<key> on the Spark nodes.
        val mysqlUrl = "jdbc:mysql://rm-bp11mpql1e01****.mysql.rds.aliyuncs.com:3306/?" +
          "useSSL=true" +
          s"&trustCertificateKeyStoreUrl=file:///tmp/${JKS_FILE_PATH.stripPrefix("oss://")}" +
          "&trustCertificateKeyStorePassword=apsaradb" +
          "&trustCertificateKeyStoreType=JKS"

        val spark = SparkSession.builder().getOrCreate()

        spark.read.format("jdbc")
          .option("driver", "com.mysql.cj.jdbc.Driver")
          .option("url", mysqlUrl)
          .option("user", USERNAME)
          .option("password", PASSWORD)
          .option("dbtable", s"${DATABASE_NAME}.${TABLE_NAME}")
          .load()
          .show()
      }
    }

    The following list describes the connection parameters.

    • useSSL: Specifies whether to enable SSL encryption. Valid values: true and false (default). In this example, true is used.

    • trustCertificateKeyStoreUrl: The on-premises path of the JKS certificate file, in the file:///tmp/<JKS_FILE_PATH> format, where JKS_FILE_PATH is the OSS path of the JKS certificate file without the oss:// prefix. For example, if the OSS path of the JKS certificate file is oss://testBucketName/folder/ApsaraDB-CA-Chain.jks, the on-premises path of the JKS certificate file is file:///tmp/testBucketName/folder/ApsaraDB-CA-Chain.jks.

    • trustCertificateKeyStorePassword: The password of the JKS certificate file. Set this parameter to apsaradb.

    • trustCertificateKeyStoreType: The storage format of the certificate file. Set this parameter to JKS.
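    The mapping from the OSS path of the JKS file to the file:///tmp/... URL in the JDBC connection string is mechanical, so it can be derived ahead of time. The following Python sketch is illustrative (the function names are not part of any API) and assumes the download location described above; the endpoint and bucket are the masked placeholders from this example.

```python
def jks_local_url(oss_path: str) -> str:
    """Map oss://<bucket>/<key> to the file:///tmp/<bucket>/<key> URL
    where the certificate is downloaded on the Spark nodes."""
    if not oss_path.startswith("oss://"):
        raise ValueError(f"expected an OSS path, got: {oss_path}")
    return "file:///tmp/" + oss_path[len("oss://"):]

def ssl_jdbc_url(endpoint: str, oss_jks_path: str) -> str:
    """Build the SSL-enabled JDBC URL used in the sample program."""
    params = {
        "useSSL": "true",
        "trustCertificateKeyStoreUrl": jks_local_url(oss_jks_path),
        "trustCertificateKeyStorePassword": "apsaradb",
        "trustCertificateKeyStoreType": "JKS",
    }
    query = "&".join(f"{k}={v}" for k, v in params.items())
    return f"jdbc:mysql://{endpoint}:3306/?{query}"

url = ssl_jdbc_url("rm-bp11mpql1e01****.mysql.rds.aliyuncs.com",
                   "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks")
print(url)
```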

  2. Upload the test.jar package to OSS. For more information, see Upload objects.

Access ApsaraDB RDS for MySQL within an Alibaba Cloud account

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Data Lakehouse Edition (V3.0) tab, find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  3. Select a job resource group and a job type for the Spark job. In this example, the Batch type is selected.

  4. Run the following code in the Spark editor:

    {
        "file": "oss://testBucketName/test.jar",
        "className": "org.example.Test",
        "name": "MYSQL PEM Test",
        "conf": {
            "spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
            "spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small",
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
            "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****"
        }
    }
    

    The following list describes the parameters.

    • name: The name of the Spark job.

    • file: The OSS path of the JAR package that is required for the Spark job.

    • className: The entry class of the Java or Scala application. In this example, org.example.Test is used.

    • spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES: The Spark driver parameter that specifies the OSS path of the JKS certificate file. Separate multiple JKS certificate files with commas (,). Example: "spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/a.jks,oss://testBucketName/b.jks".

    • spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES: The Spark executor parameter that specifies the OSS path of the JKS certificate file. Separate multiple JKS certificate files with commas (,). Example: "spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/a.jks,oss://testBucketName/b.jks".

    • spark.adb.eni.enabled: Specifies whether to enable ENI.

    • spark.adb.eni.vswitchId: The vSwitch ID of the ApsaraDB RDS for MySQL instance. To view the vSwitch ID, move the pointer over VPC on the Database Connection page.

    • spark.adb.eni.securityGroupId: The ID of the security group to which the ApsaraDB RDS for MySQL instance is added. For more information, see Configure a security group for an ApsaraDB RDS for MySQL instance.

    • conf: The configuration parameters that are required for the Spark job, which are similar to those of Apache Spark. Specify the parameters in the key:value format and separate multiple parameters with commas (,). For more information, see Spark application configuration parameters.
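    Because the download-files parameters accept a comma-separated list and each file lands under /tmp with its bucket and key preserved, the local paths can be derived before the job runs. The following Python sketch is illustrative (the helper name is not part of any API) and assumes the /tmp/<bucket>/<key> download location described for trustCertificateKeyStoreUrl.

```python
def downloaded_paths(value: str) -> list[str]:
    """Derive the local download paths for a comma-separated
    ADB_SPARK_DOWNLOAD_FILES value: oss://<bucket>/<key> -> /tmp/<bucket>/<key>."""
    paths = []
    for oss_path in value.split(","):
        oss_path = oss_path.strip()
        if not oss_path.startswith("oss://"):
            raise ValueError(f"expected an OSS path, got: {oss_path}")
        paths.append("/tmp/" + oss_path[len("oss://"):])
    return paths

value = "oss://testBucketName/a.jks,oss://testBucketName/b.jks"
print(downloaded_paths(value))
```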

  5. Click Run Now.

  6. After the Spark job is run, view the data of the ApsaraDB RDS for MySQL table in the logs of the Spark job. For more information, see the "View information about a Spark application" section of the Spark editor topic.

Access ApsaraDB RDS for MySQL across Alibaba Cloud accounts

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Data Lakehouse Edition (V3.0) tab, find the cluster that you want to manage and click the cluster ID.

    Important

    If you want to access ApsaraDB RDS for MySQL across Alibaba Cloud accounts, you must log on to the AnalyticDB for MySQL console as the RAM user that is created for the new Alibaba Cloud account. For more information, see Perform authorization.

  2. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  3. Select a job resource group and a job type for the Spark job. In this example, the Batch type is selected.

  4. Run the following code in the Spark editor:

    {
      "file": "oss://testBucketName/test.jar",
      "className": "org.example.Test",
      "name": "MYSQL PEM Test",
      "conf": {
            "spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
            "spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small",
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
            "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
            "spark.adb.eni.roleArn": "acs:ram::testAccountID:role/testUserName"
        }
    }

    The following list describes the parameter.

    • spark.adb.eni.roleArn: The RAM role that is used to access ApsaraDB RDS for MySQL across Alibaba Cloud accounts. Separate multiple roles with commas (,). Specify the parameter in the acs:ram::testAccountID:role/testUserName format.

      • testAccountID: the ID of the Alibaba Cloud account that owns the ApsaraDB RDS for MySQL data source.

      • testUserName: the name of the RAM role that is created when you perform authorization across Alibaba Cloud accounts. For more information, see the "Perform authorization across Alibaba Cloud accounts" section of the Perform authorization topic.

    For more information about the other parameters, see the parameter descriptions in the preceding section.

  5. Click Run Now.

  6. After the Spark job is run, view the data of the ApsaraDB RDS for MySQL table in the logs of the Spark job. For more information, see the "View information about a Spark application" section of the Spark editor topic.