AnalyticDB: Read data from and write data to MaxCompute

Last Updated: Oct 27, 2025

This topic describes how to use Spark SQL to read data from and write data to MaxCompute in AnalyticDB for MySQL.

Prerequisites

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > SQL Development.

  3. In the SQL Console window, select the Spark engine and a job resource group.

  4. Write a Spark SQL job. The following sample statements create a MaxCompute table, write data to the table, and then query the data:

    -- Specify the Spark version and load the MaxCompute (ODPS) connector.
    SET spark.adb.version=3.3;
    SET spark.adb.connectors=odps;
    -- Register a catalog named "odps" and enable the MaxCompute SQL extensions.
    SET spark.sql.catalog.odps=org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog;
    SET spark.sql.extensions=org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions;
    -- Connection information of the MaxCompute project. Replace the placeholder values with your own.
    SET spark.hadoop.odps.project.name=spark_on_maxcompute;
    SET spark.hadoop.odps.end.point=https://service.cn-hangzhou-vpc.maxcompute.aliyun-inc.com/api;
    SET spark.hadoop.odps.access.id=yourAccessKeyID;
    SET spark.hadoop.odps.access.key=yourAccessKeySecret;
    -- Read and write MaxCompute tables over Tunnel.
    SET spark.sql.catalog.odps.tableWriteProvider=tunnel;
    SET spark.sql.catalog.odps.tableReadProvider=tunnel;
    
    -- In this example, the project name is "spark_on_maxcompute". Make sure that this name is the same as the value of the spark.hadoop.odps.project.name parameter.
    CREATE TABLE odps.spark_on_maxcompute.spark_test(k1 INT, k2 STRING) PARTITIONED BY (part INT);
    INSERT INTO odps.spark_on_maxcompute.spark_test PARTITION(part=1) VALUES(1, 'aaa');
    SELECT * FROM odps.spark_on_maxcompute.spark_test WHERE part=1;

    The following table describes the parameters.

    spark.adb.version
    The Spark version. Set this parameter to 3.3 or 3.5. Only these versions support access to the MaxCompute data source.

    spark.adb.connectors
    The name of the data source. To read data from or write data to MaxCompute, set this parameter to odps.

    spark.sql.catalog.<catalog_name>
    The catalog implementation class that Spark SQL uses to access the data source. The catalog_name part of the parameter name is customizable. To read data from or write data to MaxCompute, set this parameter to org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog.

    Example: spark.sql.catalog.odps=org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog.

    spark.sql.extensions
    The Spark SQL extension. To read data from or write data to MaxCompute, set this parameter to org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions.

    spark.hadoop.odps.project.name
    The name of the MaxCompute project. In this example, the name is spark_on_maxcompute.

    Note

    This parameter specifies the name of the MaxCompute project, not the workspace. To view the project name, log on to the MaxCompute console, select a region in the upper-left corner, and then choose Workspace > Project Management in the left-side navigation pane.

    spark.hadoop.odps.end.point
    The VPC endpoint of MaxCompute in the region where the project resides.

    spark.hadoop.odps.access.id
    spark.hadoop.odps.access.key
    The AccessKey ID and AccessKey secret of an Alibaba Cloud account or a RAM user that has permissions on the MaxCompute project.

    spark.sql.catalog.odps.tableWriteProvider
    The type of API that is used to write data. To write data to MaxCompute, set this parameter to tunnel, which writes data to MaxCompute over Tunnel.

    Note

    You do not need to configure this parameter if you only read data from MaxCompute. See the read-only sketch after this table.

    spark.sql.catalog.odps.tableReadProvider
    The type of API that is used to read data. To read data from MaxCompute, set this parameter to tunnel, which reads data from MaxCompute over Tunnel.

    Note

    You do not need to configure this parameter if you only write data to MaxCompute.
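
    If a job only reads data from MaxCompute, the tableWriteProvider parameter can be omitted. The following statements are a minimal read-only sketch based on the preceding example; the catalog name my_odps is a custom name chosen here to illustrate that the catalog_name part is customizable, and the project name, endpoint, and AccessKey placeholders are the same assumed values as in the example above.

    SET spark.adb.version=3.3;
    SET spark.adb.connectors=odps;
    -- "my_odps" is a custom catalog name chosen for this sketch; any name can be used.
    SET spark.sql.catalog.my_odps=org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog;
    SET spark.sql.extensions=org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions;
    SET spark.hadoop.odps.project.name=spark_on_maxcompute;
    SET spark.hadoop.odps.end.point=https://service.cn-hangzhou-vpc.maxcompute.aliyun-inc.com/api;
    SET spark.hadoop.odps.access.id=yourAccessKeyID;
    SET spark.hadoop.odps.access.key=yourAccessKeySecret;
    -- Read-only access: tableReadProvider is set, and tableWriteProvider is omitted.
    SET spark.sql.catalog.my_odps.tableReadProvider=tunnel;

    -- Query the table by using the custom catalog name.
    SELECT * FROM my_odps.spark_on_maxcompute.spark_test WHERE part=1;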

  5. Click Execute SQL. In the message that appears, click Continue Execution.

  6. After the Spark job is complete, you can view the table data in the logs on the Applications tab of the Spark JAR Development page. For more information, see Spark development editor.