E-MapReduce: Superset (Available only to existing users)

Last Updated: Mar 26, 2026

Superset is a lightweight business intelligence (BI) tool. Connect it to E-MapReduce (EMR) Druid or Hive data sources to run queries, build charts, and publish dashboards, all from a browser. The examples in this topic use an EMR V3.34.0 cluster.

Prerequisites

Before you begin, make sure that you have:

Limitations

  • Superset runs on the emr-header-1 node only and cannot be deployed in high-availability (HA) mode.

  • Knox cannot be used to access the Superset web UI.

Access EMR Druid from Superset

Superset is deeply integrated with EMR Druid. You can query Druid using either SQL or Druid's native query language.
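
To make the two query styles concrete, the following minimal sketch builds one request of each kind against the Druid Broker's HTTP API. The SQL endpoint (/druid/v2/sql) and native endpoint (/druid/v2/) are the standard open-source Druid paths; the Broker address reuses the emr-header-1 host and the 18082 example port from this topic, and the wikipedia datasource name and the interval are hypothetical placeholders. Whether your cluster requires authentication or has Druid SQL enabled is not covered here, so treat this as an illustration rather than the way Superset itself issues queries.

```python
import json
from urllib import request

# Assumption: Broker host and port as used in this topic's example; adjust for your cluster.
BROKER_URL = "http://emr-header-1:18082"


def build_sql_request(sql: str) -> request.Request:
    """Build a POST request for Druid's SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return request.Request(
        f"{BROKER_URL}/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_native_request(datasource: str, interval: str) -> request.Request:
    """Build a POST request carrying a native timeseries query that counts rows."""
    query = {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [{"type": "count", "name": "row_count"}],
    }
    return request.Request(
        f"{BROKER_URL}/druid/v2/",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "wikipedia" and the interval are hypothetical placeholders.
sql_req = build_sql_request("SELECT COUNT(*) AS row_count FROM wikipedia")
native_req = build_native_request("wikipedia", "2020-01-01/2020-01-02")

# To send either request from a machine that can reach the Broker:
#   with request.urlopen(sql_req) as resp:
#       print(json.load(resp))
```

Both requests ask for the same row count; the SQL form is usually easier to write by hand, while the native form exposes Druid's query types directly.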

Step 1: Log in to Superset

Open the Superset web UI through your SSH tunnel. The default username and password are both admin. Change the password immediately after your first login.

The web UI displays in English on first login.
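
If you have not set up the SSH tunnel yet, the following sketch assembles a local port-forwarding command with the standard OpenSSH -N and -L options. This topic does not state the Superset port, so the port 18088, the IP address 203.0.113.10, and the root login user in the example are placeholder assumptions; replace them with the values for your cluster.

```python
import shlex
from typing import List, Optional


def ssh_tunnel_command(
    master_ip: str,
    remote_port: int,
    local_port: Optional[int] = None,
    user: str = "root",
) -> List[str]:
    """Return an ssh command that forwards a local port to a port on the master node."""
    if local_port is None:
        local_port = remote_port
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L",
        f"{local_port}:localhost:{remote_port}",
        f"{user}@{master_ip}",
    ]


# Placeholder values: 203.0.113.10 is a documentation-only address and
# 18088 is an assumed Superset port. Verify both for your cluster.
command = ssh_tunnel_command("203.0.113.10", 18088)
print(" ".join(shlex.quote(part) for part in command))

# To open the tunnel from Python instead of a terminal:
#   import subprocess
#   subprocess.run(command, check=True)
```

With the tunnel open, the web UI would be reachable in the browser at localhost on the chosen local port.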

Step 2: Add an EMR Druid cluster

  1. Choose Sources > Druid Clusters.

  2. Click the Add icon.

  3. In the Add Druid Cluster dialog box, configure the following parameters.

    Parameter       Description
    Broker Host     Enter emr-header-1. This value is fixed.
    Broker Port     Enter 1 followed by the open-source Broker port number. For example, if the Broker port is 8082, enter 18082.
    Cluster Name    Enter the name of the Druid cluster you created in the EMR console.

  4. Click Save.
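
The Broker Port rule above (the digit 1 placed in front of the open-source port number) can be sketched as a small helper. The function name emr_broker_port is hypothetical and is not part of Superset or EMR; it only illustrates the arithmetic.

```python
def emr_broker_port(open_source_port: int) -> int:
    """Return the EMR Druid Broker port: the digit 1 prefixed to the open-source port."""
    if not 0 < open_source_port <= 65535:
        raise ValueError(f"not a valid TCP port: {open_source_port}")
    emr_port = int(f"1{open_source_port}")
    if emr_port > 65535:
        raise ValueError(f"prefixed port {emr_port} exceeds the TCP port range")
    return emr_port


print(emr_broker_port(8082))  # 18082
```

Note that the rule is a string prefix, not an addition of 10000: the two coincide only for four-digit ports such as 8082.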

Step 3: Add a data source

  1. Choose Sources > Druid Datasources.

  2. Click the Add icon.

  3. In the Add Druid Datasource dialog box, configure the following parameters.

    Parameter          Description
    Datasource Name    Enter a name for the datasource.
    Cluster            Select the EMR Druid cluster you added in the previous step.

  4. Click Save. After saving, click the Edit icon to specify dimension columns and metric columns.

Step 4: Verify the connection

Click the datasource name to view the details of the EMR Druid cluster.

Access a Hive database from Superset

Superset uses SQLAlchemy to connect to relational databases and big data query engines, including MySQL, Oracle, PostgreSQL, Microsoft SQL Server, Hive, Presto, and Druid. Hive is installed by default on EMR Hadoop clusters.

For other supported database types, see the SQLAlchemy dialect documentation.

Step 1: Log in to Superset

Open the Superset web UI through your SSH tunnel. The default username and password are both admin.

Step 2: Add a Hive database

  1. Choose Sources > Databases.

  2. Click the Add icon.

  3. In the Add Database dialog box, configure the following parameters.

    Parameter         Description
    Database          Enter a name for the database connection.
    SQLAlchemy URI    Enter hive://emr-header-1:10000/.

  4. Click Save.
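
To show how the SQLAlchemy URI above is put together, the following sketch assembles it from its parts and splits it back again using only the Python standard library. The helper names build_hive_uri and describe_uri are hypothetical. If SQLAlchemy and a Hive dialect such as PyHive are installed, the same string could presumably be passed to sqlalchemy.create_engine, but that is not exercised here.

```python
from urllib.parse import urlsplit


def build_hive_uri(host: str, port: int = 10000, database: str = "") -> str:
    """Assemble a SQLAlchemy-style Hive URI: hive://host:port/database."""
    return f"hive://{host}:{port}/{database}"


def describe_uri(uri: str) -> dict:
    """Split a SQLAlchemy-style URI into dialect, host, port, and database."""
    parts = urlsplit(uri)
    return {
        "dialect": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


uri = build_hive_uri("emr-header-1")
print(uri)  # hive://emr-header-1:10000/
print(describe_uri(uri))
```

An empty database component, as in the URI used in this step, leaves the schema choice to query time; the query step in this topic selects the default schema.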

Step 3: Add a table

  1. Choose Sources > Tables.

  2. Click the Add icon.

  3. In the Import a table definition dialog box, configure the following parameters.

    Parameter     Description
    Database      Select the database you added in the previous step.
    Table Name    Enter the name of a table in that database. This example uses a table named test.

  4. Click Save.

Step 4: Query data

  1. Choose SQL Lab > SQL Editor.

  2. Select Hive JDBC Server as the database.

  3. Select default as the schema.

  4. Run your Hive query.
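
As an example of a query to paste into the SQL Editor, the following sketch generates a simple preview statement for the test table in the default schema used in this topic. The helper name preview_query is hypothetical; it only builds the query text and does not connect to Hive.

```python
import re

# Accept only plain identifiers so the generated SQL cannot be altered by odd input.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def preview_query(table: str, schema: str = "default", limit: int = 10) -> str:
    """Return a Hive query that previews the first rows of a table."""
    for name in (schema, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT * FROM `{schema}`.`{table}` LIMIT {limit}"


print(preview_query("test"))  # SELECT * FROM `default`.`test` LIMIT 10
```

Keeping a LIMIT clause on exploratory queries is a sensible habit, since an unbounded SELECT on a large Hive table can take a long time to return.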

FAQ

The admin user sees "invalid login" on first login.

This happens on EMR clusters with a minor version earlier than V4.6 or V3.33. To fix this, run the following commands on the master node as the root user:

  1. Log in to the master node via SSH. For instructions, see Log on to a cluster.

  2. Activate the Superset environment:

    source /usr/lib/superset-current/bin/activate

  3. Create an admin account:

    superset fab create-admin

    Enter the username, first name, last name, email, password, and confirmation as prompted. The defaults are:

    Username [admin]:
    User first name [admin]:
    User last name [user]:
    Email [admin@fab.org]:
    Password:
    Repeat for confirmation:

  4. Initialize the database:

    superset db upgrade

  5. Initialize Superset:

    superset init

After these steps, create an SSH tunnel and log in with the account you just created.