All Products
Search
Document Center

MaxCompute:Usage notes

Last Updated:Feb 29, 2024

This topic describes how to download the JAR package of the MaxCompute JDBC driver and use the driver to connect to MaxCompute. This topic also provides sample code.

Usage notes

  • To execute SQL statements and obtain execution results by using the MaxCompute JDBC driver, you must meet the following requirements:

    • You are a member of a project.

    • You have the CREATE INSTANCE permission on the project.

    • You have the SELECT and DOWNLOAD permissions on the table that you want to use.

      Note
      • When you use MaxCompute JDBC V1.9 or earlier, a temporary table is automatically created for each query. You can use Tunnel commands to obtain query results from the temporary table. To use these versions, you must have the CREATE TABLE permission.

      • When you use MaxCompute JDBC V2.2 or later, no temporary table is automatically created for each query. You can call the InstanceTunnel interface to obtain query results, regardless of whether you have the CREATE TABLE permission.

      For more information about MaxCompute permissions, see MaxCompute permissions.

  • MaxCompute provides the data protection feature. If the data protection feature is enabled for a project, you cannot move data out of the project. If you use MaxCompute JDBC of a version earlier than V2.4, no result sets can be obtained. If you use MaxCompute JDBC V2.4 or later, the number of result rows that you obtain cannot exceed the value of the READ_TABLE_MAX_ROW parameter. For more information about this parameter, see Project operations. For more information about the data protection feature, see Project data protection.

  • The MaxCompute V2.0 data type edition supports more data types, such as TINYINT, SMALLINT, DATETIME, TIMESTAMP, ARRAY, MAP, and STRUCT. To use these new data types, you must run the following command to enable the MaxCompute V2.0 data type edition. For more information, see Data type editions.

    set odps.sql.type.system.odps2=true

Download the JAR package of the MaxCompute JDBC driver

You can download the JAR packages of different versions of the MaxCompute JDBC driver from OSS, GitHub, or Maven. We recommend that you download the JAR package whose name includes jar-with-dependencies.

The following code describes the dependency in the Project Object Model (POM) file of the MaxCompute JDBC driver downloaded from Maven:

<dependency>
  <groupId>com.aliyun.odps</groupId>
  <artifactId>odps-jdbc</artifactId>
  <version>3.3.6</version>
  <classifier>jar-with-dependencies</classifier>
</dependency>
Note

The MaxCompute JDBC driver is an open source code project that is named aliyun-odps-jdbc.

You are welcome to participate in the development and improvement of the MaxCompute JDBC driver. You can report issues on the Issues tab and optimize source code on the Pull requests tab. Your operations on the Issues and Pull requests tabs must follow the template requirements for open source projects.

JDBC parameters

You can configure the URL and property parameters to use a JDBC driver. The property parameters have a higher priority than the URL parameters.

Note

If a URL parameter is odps_config=config_file, config_file is read as a property parameter.

  • Basic parameters

    URL key

    Property key

    Required

    Description

    project

    project_name

    Yes

    The name of the MaxCompute project.

    accessId

    access_id

    Yes

    The AccessKey ID of your Alibaba Cloud account.

    You can obtain the AccessKey ID from the AccessKey Pair page.

    accessKey

    access_key

    Yes

    The AccessKey secret of your Alibaba Cloud account.

    You can obtain the AccessKey secret on the AccessKey Pair page.

    logview

    logview_host

    No

    The URL of the MaxCompute LogView. Set the value to http://logview.odps.aliyun.com.

    tunnelEndpoint

    tunnel_endpoint

    No

    The endpoint of MaxCompute Tunnel.

    For more information about the Tunnel endpoints that correspond to different regions and network connection types, see Endpoints.

  • Log configuration parameters

    URL key

    Property key

    Required

    Description

    enableOdpsLogger

    enable_odps_logger

    No

    Specifies whether to enable the MaxCompute JDBC logger. Valid values:

    • False: The MaxCompute JDBC logger is disabled. Logs are not recorded in a file. This is the default value.

    • True: The MaxCompute JDBC logger is enabled. Logs are recorded in the file jdbc.log of the directory in which the JAR package of the MaxCompute JDBC driver is stored.

    logConfFile

    log_conf_file

    No

    The configuration file for Simple Logging Facade for Java (SLF4J). You can specify this configuration file to flexibly configure the log output. For example, you can specify the output file and configure the logLevel parameter to specify the log level. To specify this configuration file, you can add the following dependencies to the file pom.xml in the project:

    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-core</artifactId>
      <version>1.2.3</version>
    </dependency>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
      <version>1.2.3</version>
    </dependency>

    For more information about the configuration, see Example.

    logLevel

    log_level

    No

    The log level. Default value: INFO.

  • Other parameters

    URL key

    Property key

    Required

    Description

    stsToken

    sts_token

    No

    The Alibaba Cloud STS token.

    charset

    charset

    No

    The character set for the input and output. Default value: UTF-8.

    useProjectTimeZone

    use_project_time_zone

    No

    Specifies whether to use the time zone that is specified by the odps.sql.timezone parameter of the MaxCompute project. Valid values:

    • False: The time zone of the MaxCompute project is not used. This is the default value.

    • True: The time zone of the MaxCompute project is used.

    Note

    You can also configure set odps.sql.timezone=xxx in a statement to use the time zone of the MaxCompute project.

    The time zone specified in a statement takes precedence over the time zone specified in a project. The time zone specified in a project takes precedence over the time zones that you do not specify.

    disableConnectinosSetting

    disable_connection_setting

    No

    Specifies whether the SQL parameters of a connection can be configured. Valid values:

    • False: The SQL parameters of a connection cannot be configured. This is the default value.

    • True: The SQL parameters of a connection can be configured.

    If you set this parameter to True, when you run the set command in a statement, the SQL parameters of the statement and the connection are configured at the same time. If you set this parameter to False, when you run the set xxx command in a statement, only the SQL parameters of the statement are configured.

    settings

    settings

    No

    The default global SQL settings. The value of this parameter is passed in the JSON format, such as {"key":"value"}.

    tableList

    table_list

    No

    The name of the table in MaxCompute. The name of the table is in the projectname.tablename,projectname1.tablename1 format.

    connectTimeout

    connect_timeout

    No

    The timeout period for establishing a connection over the underlying network. Default value: 10. Unit: seconds.

    readTimeout

    read_timeout

    No

    The timeout period for reading data over an underlying network connection. Default value: 120. Unit: seconds.

    Note
    • The timeout period of a connection in each RESTful API request is the sum of the values of connectTimeout and readTimeout. The default timeout period of each RESTful API request is 130 seconds. By default, the maximum number of retries for establishing a connection in a RESTful API request is 3.

    • If you want to adjust the timeout period of a RESTful API request, change the value of the readTimeout parameter.

    enableCommandApi

    enable_command_api

    No

    Specifies whether to use the commandAPI. Valid values:

    • False: The commandAPI is not used. This is the default value.

    • True: The commandAPI is used.

      If you set this parameter to True, you can run specific commands that can be run only on the MaxCompute client (odpscmd) when you use a JDBC driver.

    httpsCheck

    https_check

    No

    Specifies whether to perform HTTPS-based certificate verification. Valid values:

    • False: HTTPS-based certificate verification is not performed. This is the default value.

    • True: HTTPS-based certificate verification is performed.

    tunnelConnectTimeout

    tunnel_connect_timeout

    No

    The timeout period for a tunnel connection when you run Tunnel commands to download data. Default value: 180. Unit: seconds.

    tunnelReadTimeout

    tunnel_read_timeout

    No

    The timeout period for reading data when you run Tunnel commands to download data. Default value: 300. Unit: seconds.

  • Parameters that are not related to MCQA (in offline mode)

    URL key

    Property key

    Required

    Description

    autoLimitFallback

    auto_limit_fallback

    No

    Specifies whether to enable auto limit fallback. Valid values:

    • False: Auto limit fallback is disabled. This is the default value.

    • True: Auto limit fallback is enabled. In offline mode, if this parameter is set to True, the maximum number of data records that can be downloaded is 10,000 when the error message no download permission appears.

  • MCQA-related parameters (in MCQA mode)

    • Basic parameters

    • URL key

      Property key

      Required

      Description

      interactiveMode

      interactive_mode

      No

      Specifies whether to enable MaxCompute Query Acceleration (MCQA). Valid values:

      • False: MCQA is disabled. This is the default value.

      • True: MCQA is enabled.

      executeProject

      execute_project_name

      No

      The name of the MaxCompute project in which the SQL task is executed.

      tunnelRetryTime

      tunnel_retry_time

      No

      The maximum number of retries for Tunnel commands specified by SQLExecutor. Default value: 6.

      attachTimeout

      attach_timeout

      No

      The timeout period for establishing a connection when MCQA is enabled. Default value: 60. Unit: seconds.

      fallbackQuota

      fallback_quota

      No

      The name of the quota that is selected when the MCQA job is rolled back. If you do not configure this parameter, the default quota of the MaxCompute project is used.

    • Limit-related parameters

      URL key

      Property key

      Required

      Description

      instanceTunnelMaxRecord

      instance_tunnel_max_record

      No

      The maximum number of records in the result set.

      Note

      This parameter takes effect only when the enableLimit parameter is set to False.

      instanceTunnelMaxSize

      instance_tunnel_max_size

      No

      The maximum size of the result set. Unit: bytes.

      autoSelectLimit

      auto_select_limit

      No

      The maximum number of rows of data in a query.

      In the elastic computing environment of Alibaba Cloud, a maximum of 1 million rows of data can be queried by default. If you want to query more than 1 million rows of data, you can configure this parameter.

      Note
      • This parameter takes effect only when the enableLimit parameter is set to False.

      • When you use a JDBC driver of V3.2.29 or later, if you configure the autoSelectLimit parameter, the enableLimit parameter is automatically set to False.

      • You can add odps.sql.select.auto.limit to the SQL settings of MaxCompute. By default, this parameter takes effect when you run an SQL job.

      enableLimit

      enable_limit

      No

      Specifies whether to enable the limit on the maximum number of data records in a query. Valid values:

      • False: The limit on the maximum number of data records in a query is disabled.

      • True: The limit on the maximum number of data records in a query is enabled. This is the default value.

        After the limit on the maximum number of data records in a query is enabled, the download control is not checked and the maximum number of result records is 10,000.

    • Rollback-related parameters

      URL key

      Property key

      Required

      Description

      fallbackForUnknownError

      fallback_for_unknownerror

      No

      Specifies whether to roll back the MCQA job to the offline mode when an unknown error occurs. Valid values:

      • False: The MCQA job is not rolled back to the offline mode.

      • True: The MCQA job is rolled back to the offline mode. This is the default value.

      fallbackForResourceNotEnough

      fallback_for_resourcenotenough

      No

      Specifies whether to roll back the MCQA job to the offline mode when resources are insufficient. Valid values:

      • False: The MCQA job is not rolled back to the offline mode.

      • True: The MCQA job is rolled back to the offline mode. This is the default value.

      fallbackForUpgrading

      fallback_for_upgrading

      No

      Specifies whether to roll back the MCQA job to the offline mode during an upgrade. Valid values:

      • False: The MCQA job is not rolled back to the offline mode.

      • True: The MCQA job is rolled back to the offline mode. This is the default value.

      fallbackForRunningTimeout

      fallback_for_runningtimeout

      No

      Specifies whether to roll back the MCQA job to the offline mode if a connection times out when you run a command. Valid values:

      • False: The MCQA job is not rolled back to the offline mode.

      • True: The MCQA job is rolled back to the offline mode. This is the default value.

      fallbackForUnsupportedFeature

      fallbackForUnsupportedFeature

      No

      Specifies whether to roll back the MCQA job to the offline mode in scenarios in which MCQA is not supported. Valid values:

      • False: The MCQA job is not rolled back to the offline mode.

      • True: The MCQA job is rolled back to the offline mode. This is the default value.

      alwaysFallback

      always_fallback

      No

      Specifies whether to roll back the MCQA job to the offline mode in all the preceding scenarios. Valid values:

      • False: Auto limit fallback is disabled. This is the default value.

      • True: The MCQA job is rolled back to the offline mode.

      Note

      This parameter takes effect only when you use a MaxCompute JDBC driver of V3.2.3 or later.

      disableFallback

      disable_fallback

      No

      Specifies whether not to roll back the MCQA job to the offline mode in all the preceding scenarios. Valid values:

      • False: The MCQA job is rolled back to the offline mode. This is the default value.

      • True: The MCQA job is not rolled back to the offline mode.

      fallbackQuota

      fallback_quota

      No

      The name of the quota that is selected when the MCQA job is rolled back. If you do not configure this parameter, the default quota of the MaxCompute project is used.

Connect to MaxCompute

  1. Load the MaxCompute JDBC driver.

    Class.forName("com.aliyun.odps.jdbc.OdpsDriver");
  2. Use DriverManager to establish a connection to MaxCompute.

    Connection cnct = DriverManager.getConnection(url, accessId, accessKey);
    • url: A URL is in the jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project_name>[&useProjectTimeZone={true|false}] format. Parameters:

      • <maxcompute_endpoint>: the endpoint of the MaxCompute service in a specific region. For example, the public endpoint of MaxCompute in the China (Hangzhou) region is http://service.cn-hangzhou.maxcompute.aliyun.com/api. For more information about the configuration information of endpoints, see Endpoints.

      • <maxcompute_project_name>: the name of your MaxCompute project.

      • useProjectTimeZone: specifies whether to use the time zone of the MaxCompute project.

      Sample command:

      jdbc:odps:http://service.cn-hangzhou.maxcompute.aliyun.com/api?project=test_project&useProjectTimeZone=true;
    • accessId: the AccessKey ID of the account used to create the project.

    • accessKey: the AccessKey secret that corresponds to the AccessKey ID of the account used to create the project.

      Note

      For more information about how to create and view an AccessKey ID and AccessKey secret, see Create an Alibaba Cloud account.

  3. Perform a data query.

    Statement stmt = cnct.createStatement();
    ResultSet rset = stmt.executeQuery("SELECT foo FROM bar");
    
    while (rset.next()) {
        // process the results
    }
    
    rset.close();
    stmt.close();
    cnct.close();

Sample code

  • Delete a table, create a table, and obtain metadata

    Note

    If JDBC dependencies are used in the project, you do not need to specify SDK dependencies. After you configure JDBC dependencies, the related SDK is automatically used. If you specify SDK dependencies, an error may occur due to version inconsistency when you run the code.

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    
    public class Main {
    
        private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver";
     		// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
    		// In this example, the AccessKey ID and AccessKey secret are saved in the environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
    		// To prevent AccessKey pair leaks, we recommend that you do not directly specify the AccessKey ID and AccessKey secret in the code.
    		private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
    		private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");
    
        public static void main(String[] args) throws SQLException {
            try {
                Class.forName(DRIVER_NAME);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
    				
            Connection conn = DriverManager.getConnection(
                "jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project>",
                Main.accessId, Main.accessKey);
    
            // create a table
            Statement stmt = conn.createStatement();
            final String tableName = "jdbc_test";
            stmt.execute("DROP TABLE IF EXISTS " + tableName);
            stmt.execute("CREATE TABLE " + tableName + " (key BIGINT, value STRING)");
    
            // get meta data
            DatabaseMetaData metaData = conn.getMetaData();
            System.out.println("product = " + metaData.getDatabaseProductName());
            System.out.println("jdbc version = "
                + metaData.getDriverMajorVersion() + ", "
                + metaData.getDriverMinorVersion());
            ResultSet tables = metaData.getTables(null, null, tableName, null);
            while (tables.next()) {
                String name = tables.getString("TABLE_NAME");
                System.out.println("inspecting table: " + name);
                ResultSet columns = metaData.getColumns(null, null, name, null);
                while (columns.next()) {
                    System.out.println(
                        columns.getString("COLUMN_NAME") + "\t" +
                            columns.getString("TYPE_NAME") + "(" +
                            columns.getInt("DATA_TYPE") + ")");
                }
                columns.close();
            }
    
            tables.close();
            stmt.close();
            conn.close();
        }
    }

    The following result is returned:

    product = MaxCompute/ODPS
    jdbc version = 3, 0
    inspecting table: jdbc_test
    key    BIGINT(-5)
    value    STRING(12)
  • Update a table

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;
    
    public class Main {
    
        private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver";
    		// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use these credentials to perform operations, security risks may arise. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
    		// In this example, the AccessKey ID and AccessKey secret are saved in the environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
    		// To prevent AccessKey pair leaks, we recommend that you do not directly specify the AccessKey ID and AccessKey secret in the code.
      	private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
      	private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");
      
        public static void main(String[] args) throws SQLException {
            try {
                Class.forName(DRIVER_NAME);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
    				
            Connection conn = DriverManager.getConnection(
                "jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project>",
                Main.accessId, Main.accessKey);
    
            Statement stmt = conn.createStatement();
            // The following DML also works
            //String dml = "INSERT INTO jdbc_test SELECT 1, \"foo\"";
            String dml = "INSERT INTO jdbc_test VALUES(1, \"foo\")";
            int ret = stmt.executeUpdate(dml);
    
            assert ret == 1;
    
            stmt.close();
            conn.close();
        }
    }
  • Update multiple tables at the same time

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    
    public class Main {
    
        private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver";
      	// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
    		// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
    		// We recommend that you do not hard-code the AccessKey ID and AccessKey secret in your code. Otherwise, the AccessKey pair may be leaked.
    		private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
    		private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");
      
        public static void main(String[] args) throws SQLException {
            try {
                Class.forName(DRIVER_NAME);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
    				
            Connection conn = DriverManager.getConnection(
                "jdbc:odps:<maxcompute endpoint>?project=<maxcompute project>",
                Main.accessId, Main.accessKey);
    
            PreparedStatement pstmt = conn.prepareStatement("INSERT INTO jdbc_test VALUES(?, ?)");
    
            pstmt.setLong(1, 1L);
            pstmt.setString(2, "foo");
            pstmt.addBatch();
    
            pstmt.setLong(1, 2L);
            pstmt.setString(2, "bar");
            pstmt.addBatch();
    
            int[] ret = pstmt.executeBatch();
    
            assert ret[0] == 1;
            assert ret[1] == 1;
    
            pstmt.close();
            conn.close();
        }
    }
    Note
    • The executeBatch method is not suitable for batch data writing to clustered tables. For example, you cannot use this method to write a batch of data to a Transaction Table 2.0 table at a time.

    • If you write a batch of data to a standard partitioned table at a time, you must specify partitions to which data is written in the INSERT INTO statement. The following code shows an example:

      -- Execute the following statement to create the partitioned table sale_detail. 
      create table if not exists sale_detail
      (
      shop_name string,
      customer_id string,
      total_price double
      )
      partitioned by (sale_date string, region string);
      
      -- If the partitioned table contains partitions sale_date='20240219' and region='hangzhou', execute the following INSERT INTO statement to write a batch of data to the partitioned table at a time: 
      INSERT INTO sale_detail PARTITION(sale_date='20240219', region='hangzhou') VALUES(?, ?, ?)
  • Query data from a table

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    
    public class Main {
    
        private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver";
      	// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
    		// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
    		// We recommend that you do not hard-code the AccessKey ID and AccessKey secret in your code. Otherwise, the AccessKey pair may be leaked.
    		private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
    		private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");
    
        public static void main(String[] args) throws SQLException {
            try {
                Class.forName(DRIVER_NAME);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
    				
            Connection conn = DriverManager.getConnection(
                "jdbc:odps:<maxcompute endpoint>?project=<maxcompute project>",
                Main.accessId, Main.accessKey);
            ResultSet rs;
    
            Statement stmt = conn.createStatement();
            String sql = "SELECT * FROM JDBC_TEST";
            stmt.executeQuery(sql);
    
            ResultSet rset = stmt.getResultSet();
            while (rset.next()) {
                System.out.println(String.valueOf(rset.getInt(1)) + "\t" + rset.getString(2));
            }
        }
    }
    Note

    OdpsStatement supports the following methods: execute(sql), executeQuery(sql), and executeUpdate(sql). The execute(sql) and executeQuery(sql) methods support the following common commands: desc table, show tables, and show partitions.