
AnalyticDB: Read data from MySQL by using Spark SQL

Last Updated: Mar 30, 2026

AnalyticDB for MySQL lets you submit Spark SQL jobs to query data from self-managed MySQL databases, ApsaraDB RDS for MySQL, or PolarDB for MySQL. This topic shows how to read ApsaraDB RDS for MySQL data using two access modes: view mode and catalog mode.

Prerequisites

Before you begin, ensure that you have:

  - An AnalyticDB for MySQL Data Lakehouse Edition cluster with a job resource group.
  - An ApsaraDB RDS for MySQL instance that the cluster can reach.
  - An Object Storage Service (OSS) bucket in the same region, if you plan to use SSL encryption (see Step 2).

Step 1: Prepare data

Create a database and table in the ApsaraDB RDS for MySQL instance, then insert sample rows:

CREATE DATABASE `db`;

CREATE TABLE `db`.`test` (
  `id` int(11) DEFAULT NULL,
  `first_name` varchar(32) DEFAULT NULL,
  `last_name` varchar(32) DEFAULT NULL,
  `age` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `db`.`test` VALUES(1,'a','b',5);
INSERT INTO `db`.`test` VALUES(2,'c','d',6);
INSERT INTO `db`.`test` VALUES(3,'e','f',7);

Step 2: (Optional) Upload a CA certificate and the ApsaraDB RDS for MySQL driver

Skip this step if you do not need SSL encryption. Go directly to Step 3: Submit a Spark SQL job.
  1. Enable SSL encryption on the ApsaraDB RDS for MySQL instance and download the CA certificate. See Configure the SSL encryption feature.

    Important

    You can encrypt an internal or public connection. This example uses an internal connection. The CA certificate is valid for one year. Generate a new certificate before it expires — an expired certificate cannot establish an SSL connection.

  2. Download the JDBC driver that matches your ApsaraDB RDS for MySQL engine version from https://dev.mysql.com/downloads/connector/j/. This example uses mysql-connector-java-8.0.11.jar.

  3. Extract the CA certificate package. Upload the JKS file and the driver JAR to Object Storage Service (OSS). See Upload objects.

Step 3: Submit a Spark SQL job

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner, select a region. In the left-side navigation pane, click Clusters. On the Data Lakehouse Edition tab, click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > SQL Development.

  3. On the SQLConsole tab, select the Spark engine and a job resource group.

  4. Enter and run the SQL for your access mode.

Important

The examples below use placeholder values for user and password. Do not hardcode credentials in production jobs. Store them securely and reference them at runtime.
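One way to keep credentials out of the saved job text is to fill the user and password placeholders at submission time. The sketch below is illustrative only: the environment variable names RDS_USER and RDS_PASSWORD and the helper itself are assumptions, not an AnalyticDB convention.

```python
import os
from string import Template

# Hypothetical helper: substitute credentials from environment variables
# into the Spark SQL job template at submission time, so the stored job
# text never contains a literal password. RDS_USER / RDS_PASSWORD are
# illustrative names chosen for this sketch.
SQL_TEMPLATE = Template("""
CREATE TEMPORARY VIEW table_tmp
USING org.apache.spark.sql.jdbc
OPTIONS (
  url 'jdbc:mysql://${endpoint}:3306',
  dbtable 'db.test',
  user '${user}',
  password '${password}'
);
""")

def render_job_sql(endpoint: str) -> str:
    return SQL_TEMPLATE.substitute(
        endpoint=endpoint,
        user=os.environ["RDS_USER"],
        password=os.environ["RDS_PASSWORD"],
    )
```

The rendered SQL can then be pasted into the SQLConsole tab or passed to whatever submission tooling you use.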

View mode

In view mode, Spark SQL creates a temporary view over the MySQL table using JDBC. The view is session-scoped — recreate it each time you start a new job.

Choose one of the following connection methods.

Connect over an elastic network interface (ENI)

set spark.adb.eni.enabled=true;
set spark.adb.eni.vswitchId=<vsw-bp1sxxsodv28ey5dl****>;
set spark.adb.eni.securityGroupId=<sg-bp19mr685pmg4ihc****>;

CREATE TEMPORARY VIEW table_tmp
USING org.apache.spark.sql.jdbc
OPTIONS (
  url '<jdbc:mysql://rm-bp1k87323a7ia****.mysql.rds.aliyuncs.com:3306>',
  dbtable '<db.test>',
  user '<user>',
  password '<password>'
);

SELECT * FROM table_tmp;

Replace the placeholders with your actual values:

Parameter Description
spark.adb.eni.enabled Set to true to enable ENI routing
spark.adb.eni.vswitchId The vSwitch ID of the ApsaraDB RDS for MySQL instance. On the Database Connection page, hover over VPC to view the vSwitch ID
spark.adb.eni.securityGroupId The security group ID. See Configure a security group for an ApsaraDB RDS for MySQL instance
url The internal endpoint and port of the instance. Format: jdbc:mysql://<endpoint>:<port>
dbtable The table to read. Format: db_name.table_name. This example uses db.test
user The database account name
password The database account password
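A malformed `url` option is a common cause of connection failures. The helper below is a small sketch, not part of any SDK, that composes the url in the jdbc:mysql://<endpoint>:<port> format described above and rejects obviously malformed values before the job is submitted.

```python
import re

# Illustrative helper: build the `url` OPTION for view mode from an
# instance endpoint and port, and sanity-check the format
# jdbc:mysql://<endpoint>:<port> described in the parameter table.
def jdbc_url(endpoint: str, port: int = 3306) -> str:
    url = f"jdbc:mysql://{endpoint}:{port}"
    if not re.fullmatch(r"jdbc:mysql://[\w.-]+:\d+", url):
        raise ValueError(f"malformed JDBC URL: {url}")
    return url
```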

Connect over SSL

add jar oss://<bucketname>/mysql-connector-java-8.0.11.jar;
set spark.app.name=SSL_RDS_SQL;
set spark.adb.eni.enabled=true;
set spark.adb.eni.vswitchId=<vsw-bp1sxxsodv28ey5dl****>;
set spark.adb.eni.securityGroupId=<sg-bp19mr685pmg4ihc****>;
set spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES=oss://<bucketname>/ApsaraDB-CA-Chain.jks;
set spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES=oss://<bucketname>/ApsaraDB-CA-Chain.jks;

CREATE TEMPORARY VIEW table_view
USING org.apache.spark.sql.jdbc
OPTIONS (
  url '<jdbc:mysql://rm-bp1k87323a7ia****.mysql.rds.aliyuncs.com:3306/?useSSL=true&clientCertificateKeyStoreUrl=file:///tmp/<bucketname>/folder/ApsaraDB-CA-Chain.jks&clientCertificateKeyStorePassword=apsaradb&trustCertificateKeyStoreUrl=file:///tmp/<bucketname>/folder/ApsaraDB-CA-Chain.jks&trustCertificateKeyStorePassword=apsaradb&trustCertificateKeyStoreType=JKS&clientCertificateKeyStoreType=JKS>',
  dbtable '<db.test>',
  user '<user>',
  password '<password>'
);

Replace the placeholders with your actual values:

Parameter Description
add jar The OSS path of the JDBC driver. This example uses mysql-connector-java-8.0.11.jar
spark.app.name The Spark SQL job name
spark.adb.eni.enabled Set to true to enable ENI routing
spark.adb.eni.vswitchId The vSwitch ID of the instance. On the Database Connection page, hover over VPC to view the ID
spark.adb.eni.securityGroupId The security group ID
spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES The OSS path of the JKS certificate for the Spark driver. Separate multiple paths with commas. Example: oss://testBucketName/a.jks,oss://testBucketName/b.jks
spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES The OSS path of the JKS certificate for the Spark executor. Separate multiple paths with commas
url The internal endpoint, port, and SSL parameters. The JKS keystore password is apsaradb and the store type is JKS. For SSL parameter details, see the Parameters section
dbtable The table to read. Format: db_name.table_name. This example uses db.test
user The database account name
password The database account password
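The file:// paths inside the SSL url suggest that files listed in ADB_SPARK_DOWNLOAD_FILES are downloaded to /tmp/<bucket>/<object path> inside the Spark driver and executors. The sketch below assumes exactly that mapping, which is inferred from the example url rather than stated by the documentation, so verify the actual download location in your environment before relying on it.

```python
# Sketch, assuming files in ADB_SPARK_DOWNLOAD_FILES land under
# /tmp/<bucket>/<object path>, as the file:// paths in the example
# URL suggest. Verify against your own job logs.
def local_path(oss_uri: str) -> str:
    # oss://bucket/folder/file.jks -> /tmp/bucket/folder/file.jks
    assert oss_uri.startswith("oss://"), "expected an oss:// URI"
    return "/tmp/" + oss_uri[len("oss://"):]

def ssl_jdbc_url(endpoint: str, jks_oss_uri: str,
                 store_password: str = "apsaradb") -> str:
    # Compose the SSL connection URL with the same parameters as the
    # example above (JKS keystore used as both client and trust store).
    jks = "file://" + local_path(jks_oss_uri)
    params = {
        "useSSL": "true",
        "clientCertificateKeyStoreUrl": jks,
        "clientCertificateKeyStorePassword": store_password,
        "trustCertificateKeyStoreUrl": jks,
        "trustCertificateKeyStorePassword": store_password,
        "trustCertificateKeyStoreType": "JKS",
        "clientCertificateKeyStoreType": "JKS",
    }
    query = "&".join(f"{k}={v}" for k, v in params.items())
    return f"jdbc:mysql://{endpoint}:3306/?{query}"
```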

Catalog mode

In catalog mode, you configure the MySQL data source using Spark's JDBC catalog interface. Run the following statements to register the catalog and query data.

set spark.adb.eni.enabled=true;
set spark.adb.eni.vswitchId=<vsw-bp1d14ddiw46fkgu1****>;
set spark.adb.eni.securityGroupId=<sg-bp19varsa8j0hyb****>;
set spark.sql.catalog.jdbc=org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog;
set spark.sql.catalog.jdbc.url=<jdbc:mysql://rm-bp11mpql1e01l****.mysql.rds.aliyuncs.com:3306>;
set spark.sql.catalog.jdbc.user=<user>;
set spark.sql.catalog.jdbc.password=<password>;
USE jdbc;
SELECT * FROM db.test;

Replace the placeholders with your actual values:

Parameter Description
spark.sql.catalog.jdbc The catalog implementation class. Always set to org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog. The suffix jdbc is the catalog name — replace it with a custom name if needed (for example, spark.sql.catalog.mycatalog)
spark.sql.catalog.jdbc.url The internal endpoint and port of the instance. Format: jdbc:mysql://<endpoint>:<port>
spark.sql.catalog.jdbc.user The database account name
spark.sql.catalog.jdbc.password The database account password
For the full list of JDBC parameters supported by Spark SQL, see JDBC to other databases.
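The `set` statements above map naturally onto a small dictionary of configuration keys. The helper below is a hypothetical sketch that assumes only the key pattern spark.sql.catalog.<name>[.suffix] shown in the table; it is not an AnalyticDB or Spark API.

```python
# Illustrative sketch: generate the spark.sql.catalog.* settings for a
# JDBC catalog under a custom name, following the key pattern shown in
# the parameter table above.
JDBC_CATALOG_CLASS = (
    "org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog"
)

def jdbc_catalog_conf(name: str, url: str, user: str, password: str) -> dict:
    prefix = f"spark.sql.catalog.{name}"
    return {
        prefix: JDBC_CATALOG_CLASS,
        f"{prefix}.url": url,
        f"{prefix}.user": user,
        f"{prefix}.password": password,
    }
```

With a custom name such as mycatalog, the resulting keys match the renamed form described for spark.sql.catalog.jdbc, and the session would then select it with USE mycatalog.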