MaxCompute Query Acceleration (MCQA) is a built-in feature of MaxCompute. MCQA uses the native MaxCompute SQL language and supports the built-in functions and permission systems of MaxCompute. This topic describes how to use the MCQA feature.

Background information

You can use one of the following methods to enable the MCQA feature:

Enable MCQA on the MaxCompute client

To enable the MCQA feature on the MaxCompute client, perform the following steps:

  1. Download the latest version of the MaxCompute client (odpscmd) on the Client page.
    Note The client version must be V0.35.1 or later.
  2. Install and configure the client. For more information, see Install and configure the MaxCompute client.
  3. Modify the odps_config.ini configuration file in the conf folder, and add the following command at the end of the configuration file:
    enable_interactive_mode=true
  4. Run the MaxCompute client in the bin folder. Run ./bin/odpscmd in Linux, and run ./bin/odpscmd.bat in Windows. If the following information appears, the MaxCompute client is running properly.
    Client runs properly
  5. Run a query job to verify that the MCQA feature is enabled.
    If the returned results contain the following information after you run the query job, the MCQA feature is enabled. Verify MCQA

Enable MCQA on the Ad-Hoc Query or Manually Triggered Workflow page

By default, the MCQA feature is enabled on the Ad-Hoc Query or Manually Triggered Workflow page of DataWorks. Manual operations are not required. If you want to disable the MCQA feature, submit a ticket.

Run a query job on the Ad-Hoc Query page. If the returned results contain the following information, the MCQA feature is enabled. For more information about ad hoc queries, see Create an ad hoc query. Ad-Hoc Query
Run a query job on the Manually Triggered Workflow page. If the returned results contain the following information, the MCQA feature is enabled. For more information about the manually triggered workflow, see Manage manually triggered workflows. Manually Triggered Workflow

JDBC

You can use the MaxCompute JDBC driver to enable the MCQA feature in the following scenarios:

Enable MCQA by using the MaxCompute JDBC driver

If you use the MaxCompute JDBC driver to connect to MaxCompute, perform the following steps to enable the MCQA feature. For more information, see Usage notes of JDBC.

  1. Download the JDBC JAR file that supports the MCQA feature or download the source code that can be compiled.
  2. Add the following dependency to the pom.xml file in the Maven repository:
    <dependency>
      <groupId>com.aliyun.odps</groupId>
      <artifactId>odps-jdbc</artifactId>
      <version>3.2.8</version>
      <classifier>jar-with-dependencies</classifier>
    </dependency>
    Note The version of the MaxCompute JDBC driver must be V3.2.0 or later.
  3. Create a Java program based on the source code and configure the required information. For more information, see ODPS JDBC.
    The source code contains the following information:
    String accessId = "your_access_id";
    String accessKey = "your_access_key";
    String conn = "jdbc:odps:http://service.<regionid>.maxcompute.aliyun.com/api?project=<your_project_name>"&accessId&accessKey&charset=UTF-8&interactiveMode=true";
    Statement stmt = conn.createStatement();
    -- Replace your_access_id with the AccessKey ID of your Alibaba Cloud account and your_access_key with the AccessKey secret of your Alibaba Cloud account. Replace your_project_name with the name of the project for which the MCQA feature is enabled. 
    Connection conn = DriverManager.getConnection(conn, accessId, accessKey);
    Statement stmt = conn.createStatement();
    String tableName = "testOdpsDriverTable";
    stmt.execute("drop table if exists " + tableName);
    stmt.execute("create table " + tableName + " (key int, value string)");
    To enable the MCQA feature by using the MaxCompute JDBC driver, you need only to modify String conn or the code.
    • String conn: Add interactiveMode=true.
    • Code: Add interactiveMode=true.
  4. Optional:Configure the following parameters in String conn to optimize the processing logic.
    • enableOdpsLogger: is used to display logs. If you do not configure Simple Logging Facade for Java (SLF4J), we recommend that you set this parameter to True.
    • fallbackForUnknownError: The default value is False. If this parameter is set to True, the system rolls back to the offline mode if an unknown error occurs.
    • fallbackForResourceNotEnough: The default value is False. If this parameter is set to True, the system rolls back to the offline mode when resources are insufficient.
    • fallbackForUpgrading: The default value is False. If this parameter is set to True, the system rolls back to the offline mode during an upgrade.
    • fallbackForRunningTimeout: The default value is False. If this parameter is set to True, the system rolls back to the offline mode when an operation times out.
    • fallbackForUnsupportedFeature: The default value is False. If this parameter is set to True, the system rolls back to the offline mode when the MCQA feature is not supported.
    • alwaysFallback: The default value is False. If this parameter is set to True, the system rolls back to the offline mode in the preceding scenarios. This parameter is supported only for MaxCompute JDBC driver V3.2.3 and later.

Enable MCQA on Tableau Server by using the MaxCompute JDBC driver

Add interactiveMode=true to the URL of Tableau Server. We recommend that you also add enableOdpsLogger=true to display logs. For more information, see Configure MaxCompute JDBC on Tableau.

Sample URL:
http://service.cn-beijing.maxcompute.aliyun.com/api?project=****_beijing&interactiveMode=true&enableOdpsLogger=true
To enable MCQA for some tables in a MaxCompute project, add the table_list=table_name1, table_name2 property to the URL of Tableau Server. Then, use this property to specify the tables for which you want to enable MCQA. Separate table names with commas (,). If you specify an excessive number of tables, access to the URL of Tableau Server becomes time-consuming. We recommend that you specify only the required tables in the URL of Tableau Server. Sample URL:
http://service.cn-beijing.maxcompute.aliyun.com/api?project=****_beijing&interactiveMode=true&enableOdpsLogger=true&table_list=orders,customers
If a table contains a large number of partitions, we recommend that you do not use data from all the partitions as the data source. You can filter the required partitions or run custom SQL queries to obtain the required data.

Enable MCQA on SQL Workbench/J by using the MaxCompute JDBC driver

After you configure the MaxCompute JDBC driver, you can use the MCQA feature on SQL Workbench/J by modifying the JDBC URL that you specified on the profile configuration page. For more information about how to configure the profile, see Configure MaxCompute JDBC on SQL Workbench/J.

Specify the JDBC URL in the following format: jdbc:odps:<MaxCompute_endpoint>?project=<MaxCompute_project_name>&accessId=<AccessKey ID>&accessKey=<AccessKey Secret>&charset=UTF-8&interactiveMode=true. Parameter description:
  • maxcompute_endpoint: the endpoint of the region in which your MaxCompute project resides. For more information, see Endpoints.
  • maxcompute_project_name: the name of your MaxCompute project.
  • AccessKey ID: the AccessKey ID that is used to access your MaxCompute project.
  • AccessKey Secret: the AccessKey secret that corresponds to the AccessKey ID.
  • charset=UTF-8: the character set encoding format.
  • interactiveMode: specifies whether to enable the MCQA feature. To enable the MCQA feature, set this parameter to true.

Enable MCQA by using MaxCompute SDK for Java

For more information about Alibaba Cloud MaxCompute SDK for Java, see SDK for Java. You must add a specified dependency to the pom.xml file in the Maven repository. Sample dependency:
<dependency>
  <groupId>com.aliyun.odps</groupId>
  <artifactId>odps-sdk-core</artifactId>
  <version>0.35.5-public</version>
</dependency>
Note The version of MaxCompute SDK for Java must be V0.35.1 or later.
The following sample code shows how to create a Java program:
import com.aliyun.odps.Odps;
import com.aliyun.odps.OdpsException;
import com.aliyun.odps.OdpsType;
import com.aliyun.odps.account.Account;
import com.aliyun.odps.account.AliyunAccount;
import com.aliyun.odps.data.Record;
import com.aliyun.odps.data.ResultSet;
import com.aliyun.odps.sqa.*;

import java.io.IOException;
import java.util.*;

public class SQLExecutorExample {

    public static void SimpleExample() {
        // Specify the Alibaba Cloud account and project information. 
        Account account = new AliyunAccount("<your_access_id>", "<your_access_key>");
        Odps odps = new Odps(account);
        odps.setDefaultProject("<your_project_name>");
        odps.setEndpoint("http://service.<regionid>.maxcompute.aliyun.com/api");

        // Prepare to build an SQLExecutor. 
        SQLExecutorBuilder builder = SQLExecutorBuilder.builder();

        SQLExecutor sqlExecutor = null;
        try {
            // run in offline mode or run in interactive mode
            if (false) {
                // Create an SQLExecutor that runs offline SQL queries by default. 
                sqlExecutor = builder.odps(odps).executeMode(ExecuteMode.OFFLINE).build();
            } else {
                // Create an SQLExecutor that runs SQL queries with MCQA enabled by default. Make sure that the system automatically rolls back to the offline query mode if an SQL query with MCQA enabled fails. 
                sqlExecutor = builder.odps(odps).executeMode(ExecuteMode.INTERACTIVE).fallbackPolicy(FallbackPolicy.alwaysFallbackPolicy()).build();
            }
            // Pass special query settings if required. 
            Map<String, String> queryHint = new HashMap<>();
            queryHint.put("odps.sql.mapper.split.size", "128");
            // Submit a query job. You can pass hints. 
            sqlExecutor.run("select count(1) from test_table;", queryHint);

            // List the System.out.println() interfaces that can be used to query common information. 
            // UUID
            System.out.println("ExecutorId:" + sqlExecutor.getId());
            // Query the Logview URL of the current query job. 
            System.out.println("Logview:" + sqlExecutor.getLogView());
            // Query the instance on which the current query job is run. In interactive mode, multiple query jobs may be run on the same instance. 
            System.out.println("InstanceId:" + sqlExecutor.getInstance().getId());
            // Query the progress of the current query job. You can check the progress bar in the console. 
            System.out.println("QueryStageProgress:" + sqlExecutor.getProgress());
            // Query the changelogs about the status of the current query job, such as rollback messages. 
            System.out.println("QueryExecutionLog:" + sqlExecutor.getExecutionLog());

            // Obtain results of query jobs by calling one of the following API operations: 
            if(false) {
                // Query the results of all query jobs. The API operation that you called is a synchronous operation and may occupy a thread until the query succeeds or fails. 
                // Write the results of all query jobs to the memory at the same time. To prevent memory issues, we recommend that you do not perform this operation if the amount of data is large. 
                List<Record> records = sqlExecutor.getResult();
                printRecords(records);
            } else {
                // Query the ResultSet iterator of the query results. The API operation that you called is a synchronous operation and may occupy a thread until the query succeeds or fails. 
                // Read the results of query jobs in several batches. We recommend that you perform this operation if the amount of data is large. 
                ResultSet resultSet = sqlExecutor.getResultSet();
                while (resultSet.hasNext()) {
                    printRecord(resultSet.next());
                }
            }

            // run another query
            sqlExecutor.run("select * from test_table;", new HashMap<>());
            if(false) {
                // Query the results of all query jobs. The API operation that you called is a synchronous operation and may occupy a thread until the query succeeds or fails. 
                // Write the results of all query jobs to the memory at the same time. To prevent memory issues, we recommend that you do not perform this operation if the amount of data is large. 
                List<Record> records = sqlExecutor.getResult();
                printRecords(records);
            } else {
                // Query the ResultSet iterator of the query results. The API operation that you called is a synchronous operation and may occupy a thread until the query succeeds or fails. 
                // Read the results of query jobs in several batches. We recommend that you perform this operation if the amount of data is large. 
                ResultSet resultSet = sqlExecutor.getResultSet();
                while (resultSet.hasNext()) {
                    printRecord(resultSet.next());
                }
            }
        } catch (OdpsException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (sqlExecutor != null) {
                // Close the SQLExecutor and release related resources. 
                sqlExecutor.close();
            }
        }
    }

    // SQLExecutor can be reused by pool mode
    public static void ExampleWithPool() {
        // Specify the Alibaba Cloud account and project information. 
        Account account = new AliyunAccount("your_access_id", "your_access_key");
        Odps odps = new Odps(account);
        odps.setDefaultProject("your_project_name");
        odps.setEndpoint("http://service.<regionid>.maxcompute.aliyun.com/api");

        // Run query jobs by using a connection pool. 
        SQLExecutorPool sqlExecutorPool = null;
        SQLExecutor sqlExecutor = null;
        try {
            // Create a connection pool. Specify the connection pool size and the default execution mode. 
            SQLExecutorPoolBuilder builder = SQLExecutorPoolBuilder.builder();
            builder.odps(odps)
                    .initPoolSize(1) // init pool executor number
                    .maxPoolSize(5)  // max executors in pool
                    .executeMode(ExecuteMode.INTERACTIVE); // run in interactive mode

            sqlExecutorPool = builder.build();
            // Obtain an SQLExecutor from the connection pool. If you cannot obtain SQLExecutor from the connection pool, you can add an SQLExecutor. Make sure that the total number of SQLExecutors does not exceed the upper limit. 
            sqlExecutor = sqlExecutorPool.getExecutor();

            // Use the SQLExecutor the same way that you use it in the preceding example. 
            sqlExecutor.run("select count(1) from test_table;", new HashMap<>());
            System.out.println("InstanceId:" + sqlExecutor.getId());
            System.out.println("Logview:" + sqlExecutor.getLogView());

            List<Record> records = sqlExecutor.getResult();
            printRecords(records);
        } catch (OdpsException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            sqlExecutor.close();
        }
        sqlExecutorPool.close();
    }

    private static void printRecord(Record record) {
        for (int k = 0; k < record.getColumnCount(); k++) {

            if (k != 0) {
                System.out.print("\t");
            }

            if (record.getColumns()[k].getType().equals(OdpsType.STRING)) {
                System.out.print(record.getString(k));
            } else if (record.getColumns()[k].getType().equals(OdpsType.BIGINT)) {
                System.out.print(record.getBigint(k));
            } else {
                System.out.print(record.get(k));
            }
        }
    }

    private static void printRecords(List<Record> records) {
        for (Record record : records) {
            printRecord(record);
            System.out.println();
        }
    }

    public static void main(String args[]) {
        SimpleExample();
        ExampleWithPool();
    }
}

Enable MCQA by using MaxCompute Studio

MCQA is supported in MaxCompute Studio V3.5.0 or later. We recommend that you install the latest version of MaxCompute Studio. For more information about how to install MaxCompute Studio, see Install MaxCompute Studio.

In the SQL editor of MaxCompute Studio, set the SQL execution mode to UseShortQueryAcceleration or FallbackWhenFailed and execute query statements to enable MCQA.

  • UseShortQueryAcceleration: Use MCQA to executes query statements.
  • FallbackWhenFailed: Use MCQA to roll back the system to the default offline mode and execute SQL query statements again if the SQL query statements fail to be executed.

Enable MCQA by using PyODPS

MCQA is supported in PyODPS V0.10.7 or later. When you execute SQL statements, you can call the run_sql_interactive() method to enable MCQA. The following code shows an example:
odps = ODPS(...)
instance = odps.run_sql_interactive('select count(*) from test_table')
# logview
print(instance.get_logview_address()) 
# Optional. If you use Tunnel to download query results, the MaxCompute client does not need to wait for the completion of the task instance. 
# Call the wait_for_success method to wait for the completion of the task instance. You can also call the wait_for_completion method to perform this operation. The difference between the two methods is that an error is returned if the task instance fails when you call the wait_for_success method. 
instance.wait_for_success() 
# Download the results.
# Display alerts returned by PyODPS.
print(instance.get_warnings()) 
# Display results. An error is returned if the task instance fails.
print(instance.get_printable_result()) 
# Download results from the frontend.
for each_record in instance.open_reader():
    pass # The data type of each record in MaxCompute.
# Use Tunnel to download the results.
for each_record in instance.open_reader(tunnel=True):
    pass
The run_sql_interactive() method can contain the following parameters:
  • service_name: optional. The name of the MCQA session.
  • service_startup_timeout: optional. The timeout period for waiting for a session attach.
  • force_reattach: optional. run_sql_interactive() automatically reuses sessions. The force_reattach parameter is used to forcibly reattach to a session.

Enable MCQA by using SQLAlchemy of PyODPS or by using a third-party tool that supports SQLAlchemy interface

PyODPS is integrated with SQLAlchemy that is used to query data in MaxCompute. To enable MCQA, you must specify the following parameters in the connection string:
  • interactive_mode: required. This parameter specifies whether to enable the MCQA feature. Set this parameter to true.
  • reuse_odps: optional. This parameter specifies whether to forcibly reuse connections. Set this parameter to true. This value indicates that connections are forcibly reused. Connection reuse improves the performance of some third-party tools, such as Apache SuperSet.
You can configure the fallback_policy=<policy1>,<policy2>,... parameter in the connection string to optimize the processing logic. The rollback policies specified by this parameter are similar to the parameters that are configured when you enable MCQA by using the MaxCompute JDBC driver. These policies are used to specify the rollback operation that is performed when query acceleration fails.
  • generic: The default value is False. If this parameter is set to True, the system rolls back to the offline mode if an unknown error occurs.
  • noresource: The default value is False. If this parameter is set to True, the system rolls back to the offline mode if resources are insufficient.
  • upgrading: The default value is False. If this parameter is set to True, the system rolls back to the offline mode during an upgrade.
  • timeout: The default value is False. If this parameter is set to True, the system rolls back to the offline mode if a connection times out.
  • unsupported: The default value is False. If this parameter is set to True, the system rolls back to the offline mode if the MCQA feature is not supported.
  • default: If you specify this parameter, the system performs a rollback based on the settings of the unsupported, upgrading, noresource, or timeout parameters. If you do not specify fallback_policy in the connection string, the default rollback policy is used.
  • all: The default value is False. If this parameter is set to True, the system rolls back to the offline mode in the preceding scenarios. This parameter is supported only for MaxCompute JDBC driver V3.2.3 and later.
You can use the following connection string if you want to enable MCQA and forcible connection reuse, and roll back the system to the offline mode in the scenario of an upgrade, MCQA not supported, or insufficient resources:
odps://<access_id>:<access_key>@<project>/?endpoint=<endpoint>&interactive_mode=true&reuse_odps=true&fallback_policy=unsupported,upgrading,noresource