You can connect Superset to a MaxCompute project and use Superset to explore and visualize
data in the MaxCompute project. This topic describes how to use PyODPS to connect
Superset to a MaxCompute project and visualize MaxCompute data on the Superset UI.
Background information
Apache Superset is a modern data exploration and visualization platform. It is a fast,
lightweight, and intuitive platform that supports various types of charts, which range
from simple line charts to highly detailed geospatial charts. For more information
about Superset, see Superset.
Prerequisites
Make sure that the following conditions are met:
- A MaxCompute project is created.
For more information about how to create a MaxCompute project, see Create a MaxCompute project.
- The AccessKey ID and AccessKey secret that are used to access the MaxCompute project
are obtained.
You can obtain the AccessKey ID and AccessKey secret on the AccessKey Pair page.
- Superset is installed.
You can install Superset by following instructions in the official Superset documentation.
For more information, see Install Superset. You can also use Docker to install Superset and perform additional steps based on the official Superset documentation.
For more information, see Adding New Database Drivers in Docker.
In this topic, Superset 1.1.0 is used.
- PyODPS 0.10.7 or later is installed.
We recommend that you use Python 3 to install PyODPS. For more information, see Install PyODPS.
In this topic, PyODPS 0.10.7 is used.
Step 1: Connect Superset to MaxCompute
- Start Superset.
- In the top navigation bar, choose . Then, click +DATABASE in the upper-right corner.
- In the Add Database dialog box, configure the parameters as described in the following
table.

Parameter |
Description |
DATABASE NAME |
The name of the data source that you want to add. |
SQLALCHEMY URI |
The SQLAlchemy connection string that is used to connect Superset to the MaxCompute project. You must configure this
parameter in the format of odps://<accesskey_id>:<accesskey_secret>@<MaxCompute_project_name>/?endpoint=<MaxCompute_endpoint> , where:
- <accesskey_id>: required. The AccessKey ID that is used to access the MaxCompute project.
You can obtain the AccessKey ID on the AccessKey Pair page.
- <accesskey_secret>: required. The AccessKey secret that corresponds to the AccessKey ID.
You can obtain the AccessKey secret on the AccessKey Pair page.
- <MaxCompute_project_name>: required. The name of the MaxCompute project to which you want to connect Tableau.
This parameter specifies the name of your MaxCompute project instead of the DataWorks
workspace to which the MaxCompute project corresponds. You can log on to the MaxCompute console, select the region where your MaxCompute project resides in the top navigation bar,
and then view the name of the MaxCompute project on the Project management tab.
- <MaxCompute_endpoint>: required. The endpoint of MaxCompute. Configure this parameter based on the region
where the MaxCompute project resides.
For more information about the endpoints of MaxCompute in different regions, see Endpoints.
|
If you want to enable the MaxCompute Query Acceleration (MCQA) feature, add the parameters
that are described in the following table to the end of the SQLAlchemy connection
string.
Parameter |
Value |
Description |
interactive_mode |
true |
Specifies whether to enable the MCQA feature. |
reuse_odps |
true |
Optional. This parameter specifies whether to forcibly reuse connections. We recommend
that you set this parameter to true. By default, Superset forcibly creates a connection
for each SQL request. The forcible reuse of connections simplifies the creation process.
|
fallback_policy |
- unsupported
- upgrading
- noresource
- timeout
- generic
- default
- all
|
Optional. This parameter specifies the fallback policy that is used when query acceleration
fails. You must configure this parameter in the <policy1>,<policy2>... format. Valid values:
- unsupported: Fall back to the offline mode if the MCQA feature is not supported.
- upgrading: Fall back to the offline mode if MaxCompute is in upgrade.
- noresource: Fall back to the offline mode if resources are insufficient.
- timeout: Fall back to the offline mode if a connection timeout error occurs.
- generic: Fall back to the offline mode if an unknown error occurs.
- default: Fall back to the offline node if the fallback condition for unsupported,
upgrading, noresource, or timeout is met. If you do not specify fallback_policy in the SQLAlchemy connection string, this value is used as the fallback policy.
- all: Fall back to the offline mode regardless of the reason why query acceleration
fails.
|
For example, if you want to enable the MCQA feature and the forcible reuse of connections
and also want to fall back to the offline mode if the fallback condition for unsupported,
upgrading, or noresource is met, configure the SQLAlchemy connection string in the
following format: odps://<accesskey_id>:<accesskey_secret>@<MaxCompute_project_name>/?endpoint=<MaxCompute_endpoint>&interactive_mode=true&reuse_odps=true&fallback_policy=unsupported,upgrading,noresource
.
- Click TEST CONNECTION. If Connection looks good! appears in the lower-right corner of the page, the connectivity test is successful.
In this case, click ADD to add the MaxCompute project to Superset.
Step 2: Use Superset to query and visualize data
After you configure the MaxCompute data source, you can add datasets to query and
visualize data in MaxCompute tables. You can perform the following operations. For
more information, see
Superset.
- View all tables in the MaxCompute project
In the top navigation bar of the Superset web UI, choose . Then, click +DATASET in the upper-right corner of the page. In the Add dataset dialog box, set DATASOURCE to the name of the data source you added in Step 1 and
SCHEMA to the name of your MaxCompute project. Then, all tables in the MaxCompute
project are displayed in the TABLE drop-down list.

- View the schema of a table
In the top navigation bar of the Superset web UI, choose . Then, click +DATASET in the upper-right corner of the page. In the Add dataset dialog box, set DATASOURCE to the name of the data source that you added in Step
1, set SCHEMA to the name of your MaxCompute project, set TABLE to the name of the
table whose schema you want to view, and then click ADD. Then, the schema of the table is displayed in the Columns section of the page.

- View data in a table
Enter and run an SQL script in the SQL editor to view data in a table.

- Visualize data
On the Datasets page, click a table, select a chart type, and then configure properties to visualize
data in the table based on your business requirements.
