Spark natively supports accessing MySQL through a Java Database Connectivity (JDBC) connector. Serverless Spark automatically loads the MySQL JDBC driver (version 8.0.33) at startup. You can use SQL sessions, batch processing tasks, or Notebooks to connect to MySQL to read and write data.
Prerequisites
You have created a Serverless Spark workspace. For more information, see Create a workspace.
You have created a MySQL instance.
You can use a self-managed MySQL instance or an Alibaba Cloud database service such as ApsaraDB RDS for MySQL or PolarDB for MySQL.
The example in this topic uses an ApsaraDB RDS for MySQL instance. For more information, see Create an ApsaraDB RDS for MySQL instance and configure a database.
Important
Ensure that network connectivity is established between Serverless Spark and MySQL. For configuration details, see Establish network connectivity between EMR Serverless Spark and other VPCs.
When you add security group rules, open the required ports. This example requires you to open TCP port 3306.
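Before submitting any jobs, you can confirm that the MySQL endpoint and port are reachable from your environment. The following is a minimal sketch of a TCP reachability check; the hostname in the commented usage line is a placeholder, and this helper is not part of any Spark or EMR API:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Replace the placeholder with your MySQL instance's endpoint; 3306 is the
# default MySQL port opened in the security group rule above.
# print(port_is_open("<your-mysql-endpoint>", 3306))
```

A False result usually points to a missing security group rule or an unestablished network connection between Serverless Spark and the MySQL instance's VPC.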
Procedure
Method 1: Use an SQL session
Create an SQL session. In Session Manager, you can create a session and select a pre-configured Network Connection. For more information, see Create an SQL session.
In Data Development, you can create a SparkSQL task and use the following SQL statements to test the connection.
CREATE TEMPORARY VIEW test
USING org.apache.spark.sql.jdbc
OPTIONS (
  url 'jdbc:mysql://<jdbc_url>/',
  dbtable '<db>.<table>',
  user '<username>',
  password '<password>'
);

SELECT * FROM test;

The following table describes the parameters.
| Parameter | Description |
| --- | --- |
| url | The JDBC connection string. The format is jdbc:mysql://<jdbc_url>/. Replace <jdbc_url> with the actual value. |
| dbtable | The name of the database table to read. The format is <db>.<table>. This example uses test_mysql_db.test. |
| user | The username for the MySQL database. Note: The user must have read permissions on the target table. |
| password | The password for the MySQL database. |
If the table content is returned correctly, the connection is successful.

Insert data. Use the following commands to insert data into the MySQL table.
INSERT INTO test VALUES (4, 'd'), (5, 'e');
SELECT * FROM test;

If the inserted data can be queried, the write operation is successful.

Method 2: Use a Notebook session
Create a Notebook session. In Session Manager, you can create a Notebook session and select a pre-configured Network Connection. For more information, see Create a Notebook session.
In Data Development, create a task and use the following Python code to test the connection.
df = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:mysql://<jdbc_url>") \
    .option("dbtable", "<db>.<table>") \
    .option("user", "<username>") \
    .option("password", "<password>") \
    .load()
df.show()

If the table content is returned correctly, the connection is successful.

Insert data. Use the following code to insert data into the MySQL table.
df = spark.createDataFrame([(6, 'f'), (7, 'g')], ["id", "name"])
df.write \
    .format("jdbc") \
    .mode("append") \
    .option("url", "jdbc:mysql://<jdbc_url>") \
    .option("dbtable", "<db>.<table>") \
    .option("user", "<username>") \
    .option("password", "<password>") \
    .save()
df.show()

The mode("append") parameter specifies that new data is appended to the target table instead of overwriting or deleting existing data. If the inserted data is retrieved successfully, the write operation is successful.
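The same four JDBC options are repeated in every read and write call above. As an illustration only, a small hypothetical helper (not part of any Spark or EMR Serverless Spark API) can assemble the URL and option map once so they can be reused across reader and writer calls. The host, port, and credential values below are placeholders:

```python
def mysql_jdbc_options(host: str, port: int, database: str, table: str,
                       user: str, password: str) -> dict:
    """Build the option map expected by Spark's JDBC data source.

    Hypothetical convenience helper; the keys ("url", "dbtable",
    "user", "password") are the standard Spark JDBC option names.
    """
    return {
        # Matches the jdbc:mysql://<jdbc_url>/ format used in this topic.
        "url": f"jdbc:mysql://{host}:{port}/",
        # Qualified table name in <db>.<table> format.
        "dbtable": f"{database}.{table}",
        "user": user,
        "password": password,
    }

# Usage with an active SparkSession (placeholders shown):
# opts = mysql_jdbc_options("<host>", 3306, "test_mysql_db", "test",
#                           "<username>", "<password>")
# df = spark.read.format("jdbc").options(**opts).load()
```

Because options() accepts keyword arguments, the same dictionary works for both spark.read and df.write.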

Method 3: Use a Spark batch job
Write the test code. Compile the following Scala code and package it into a JAR file.
package spark.test

import org.apache.spark.sql.SparkSession

object Main {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("test")
      .getOrCreate()

    // Append two new rows to the MySQL table.
    val newRows = spark.createDataFrame(Seq((6, "f"), (7, "g"))).toDF("id", "name")
    newRows.write.format("jdbc")
      .mode("append")
      .option("url", "jdbc:mysql://<jdbc_url>")
      .option("dbtable", "<db>.<table>")
      .option("user", "<username>")
      .option("password", "<password>")
      .save()

    // Read the table back and print it to stdout for verification.
    spark.read.format("jdbc")
      .option("url", "jdbc:mysql://<jdbc_url>")
      .option("dbtable", "<db>.<table>")
      .option("user", "<username>")
      .option("password", "<password>")
      .load()
      .show()

    spark.stop()
  }
}

Create a batch job. In Data Development, create a task and configure the following parameters. For configuration details, see Develop a batch job or streaming job.
Main JAR Resource: Select or enter the path to the JAR file.
Main Class: Enter spark.test.Main.
Network Connection: Select an existing network connection.
To verify the results after the task is complete, click Log Investigation in the Run Record section. On the Stdout tab of the Driver Log, you can view the content of the corresponding table in the MySQL database.