
AnalyticDB: Read and write MaxCompute data

Last Updated: Dec 05, 2025

This topic describes how to use Spark SQL in AnalyticDB for MySQL to read and write data in MaxCompute.

Prerequisites

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > SQL Development.

  3. In the SQLConsole window, select the Spark engine and a job resource group.

  4. Write a Spark SQL job.

    SET spark.adb.version=3.3;
    SET spark.adb.connectors=odps;
    SET spark.sql.catalog.odps=org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog;
    SET spark.sql.extensions=org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions;
    SET spark.hadoop.odps.project.name=spark_on_maxcompute;
    SET spark.hadoop.odps.end.point=https://service.cn-hangzhou-vpc.maxcompute.aliyun-inc.com/api;
    SET spark.hadoop.odps.access.id=yourAccessKeyID;
    SET spark.hadoop.odps.access.key=yourAccessKeySecret;
    SET spark.sql.catalog.odps.tableWriteProvider=tunnel;
    SET spark.sql.catalog.odps.tableReadProvider=tunnel;
    
    -- In this example, the project name is "spark_on_maxcompute". Make sure that this name is the same as the value of the spark.hadoop.odps.project.name parameter.
    CREATE TABLE odps.spark_on_maxcompute.spark_test(k1 int, k2 string) PARTITIONED BY (part int);
    INSERT INTO odps.spark_on_maxcompute.spark_test PARTITION(part=1) VALUES(1, 'aaa');
    SELECT * FROM odps.spark_on_maxcompute.spark_test WHERE part=1;

    Parameters:

    Parameter

    Description

    spark.adb.version

    The Spark version. Only versions 3.3 and 3.5 support access to MaxCompute data sources. You can select either version.

    spark.adb.connectors

    The name of the data source. To read and write MaxCompute data, set this parameter to odps.

    spark.sql.catalog.<catalog_name>

    The method used to configure a data source in Spark SQL. In the parameter name, catalog_name is user-defined. To read and write MaxCompute data, set the value of this parameter to org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog.

    For example: spark.sql.catalog.odps=org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog.

    spark.sql.extensions

    The Spark SQL extension. To read and write MaxCompute data, set this parameter to org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions.

    spark.hadoop.odps.project.name

    The name of the MaxCompute project. In this example, the value is spark_on_maxcompute.

    Note

    This is the name of the MaxCompute project, not the workspace. Log on to the MaxCompute console, switch the region in the upper-left corner, and then choose Manage Configurations > Projects in the navigation pane on the left to view the name of the MaxCompute project.

    spark.hadoop.odps.end.point

    The VPC endpoint of the region where MaxCompute resides. In this example, the VPC endpoint of the China (Hangzhou) region is used.

    spark.hadoop.odps.access.id

    spark.hadoop.odps.access.key

    The AccessKey ID and AccessKey secret of an Alibaba Cloud account or a Resource Access Management (RAM) user that has access permissions on the MaxCompute project.

    spark.sql.catalog.odps.tableWriteProvider

    The type of interface used to write data. To write data to MaxCompute, set this parameter to tunnel. This indicates that data is written to MaxCompute using Tunnel.

    Note

    If you only need to read data from MaxCompute, you do not need to configure this parameter.

    spark.sql.catalog.odps.tableReadProvider

    The type of interface used to read data. To read data from MaxCompute, set this parameter to tunnel. This indicates that data is read from MaxCompute using Tunnel.

    Note

    If you only need to write data to MaxCompute, you do not need to configure this parameter.
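    As the notes above indicate, a job that only reads or only writes needs just one of the two provider parameters. The following is a minimal read-only sketch, reusing the placeholder project name, endpoint, and AccessKey values from the example above:

    ```sql
    -- Read-only job: spark.sql.catalog.odps.tableWriteProvider is omitted
    -- because no data is written to MaxCompute.
    SET spark.adb.version=3.3;
    SET spark.adb.connectors=odps;
    SET spark.sql.catalog.odps=org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog;
    SET spark.sql.extensions=org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions;
    SET spark.hadoop.odps.project.name=spark_on_maxcompute;
    SET spark.hadoop.odps.end.point=https://service.cn-hangzhou-vpc.maxcompute.aliyun-inc.com/api;
    SET spark.hadoop.odps.access.id=yourAccessKeyID;
    SET spark.hadoop.odps.access.key=yourAccessKeySecret;
    SET spark.sql.catalog.odps.tableReadProvider=tunnel;

    -- Query one partition of the table created in the example above.
    SELECT k1, k2 FROM odps.spark_on_maxcompute.spark_test WHERE part=1;
    ```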

  5. Click Execute.

  6. After the Spark job runs successfully, view the table data in the logs on the Applications tab of the Spark JAR Development page. For more information, see Spark development editor.
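If the logs alone are not conclusive, a short follow-up query can confirm that the write succeeded. This sketch assumes the example table and partition created in step 4 still exist:

```sql
-- Count the rows in partition part=1 of the example table;
-- the single INSERT in the example writes exactly one row.
SELECT COUNT(*) FROM odps.spark_on_maxcompute.spark_test WHERE part=1;

-- Optional cleanup of the example table once you are done testing.
DROP TABLE odps.spark_on_maxcompute.spark_test;
```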