E-MapReduce: Manage gateways

Last Updated: Mar 05, 2025

Livy gateways and Kyuubi gateways provide APIs for you to submit jobs to E-MapReduce (EMR) Serverless Spark.  

Background information

  • Livy is a service that lets you interact with Spark by calling RESTful APIs. With Livy, you can use open source projects, such as the Airflow livy_operator and spark_magic, to submit jobs to EMR Serverless Spark, query the job status, and obtain computing results.

  • Kyuubi provides Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) APIs for you to connect to EMR Serverless Spark by using SQL queries or BI tools such as Tableau and Power BI. Kyuubi allows you to isolate resources in a multi-tenant environment to meet the requirements of enterprise-level applications.

Manage Livy gateways

Create a Livy gateway

  1. Go to the Gateways page.

    1. Log on to the EMR console.

    2. In the left-side navigation pane, choose EMR Serverless > Spark.

    3. On the Spark page, find the desired workspace and click the name of the workspace.

    4. In the left-side navigation pane of the EMR Serverless Spark page, choose Operation Center > Gateways.

  2. Click the Livy Gateways tab. On the tab, click Create Livy Gateway.

  3. On the Create Livy Gateway page, configure parameters and click Create. The following table describes the parameters.

    | Parameter | Description |
    | --- | --- |
    | Name | The name of the Livy gateway. The name can contain lowercase letters, digits, and hyphens (-). It must start and end with a letter or digit. |
    | Livy Gateway Resources | The resource configurations. Default value: 1 CPU, 4 GB. |
    | Livy Version | The Livy version. Default value: 0.8.0. |
    | Engine Version | The version of the Spark engine that is used by the Livy gateway. For more information about engine versions, see Engine versions. |
    | Use Fusion Acceleration | Specifies whether to enable Fusion acceleration. The Fusion engine helps accelerate the processing of Spark workloads and lower the overall cost of jobs. For more information about billing, see Billing. For more information about the Fusion engine, see Fusion engine. |
    | Associated Queue | The queue in which the Livy gateway is deployed. When a Spark job is submitted by using a gateway, the Spark job is submitted by using the identity of the gateway creator. |
    | Runtime Environment | The runtime environment. When you use a Livy gateway to submit a job, the resources used to run the job are pre-installed based on the runtime environment. |
    | Automatic Stop | By default, the switch is turned off. After you turn on the switch for a gateway, the system automatically stops the gateway if no activity is detected in the gateway in the previous 45 minutes. |
    | Authentication Method | The authentication mode. You can select only Token. After you create a gateway, you must generate a unique authentication token for the gateway. This way, you can use the token for identity authentication and access control when you submit requests over the gateway. For information about how to generate a token, see the Manage tokens section in this topic. |

  4. On the Livy Gateways tab, find the created Livy gateway and click Start in the Actions column.
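
The naming rule in the parameter table can be checked locally before you fill in the form. The following is a minimal sketch, assuming that the listed characters are the only ones allowed and that no length limit applies (the table states neither explicitly). The function name is hypothetical and is not part of any EMR SDK.

```python
import re

# Assumed reading of the documented rule: only lowercase letters, digits,
# and hyphens, and the first and last characters are a letter or digit.
_GATEWAY_NAME = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_gateway_name(name: str) -> bool:
    """Return True if `name` satisfies the character rules described above."""
    return _GATEWAY_NAME.fullmatch(name) is not None


print(is_valid_gateway_name("livy-gw-01"))  # True
print(is_valid_gateway_name("-livy"))       # False: starts with a hyphen
print(is_valid_gateway_name("Livy"))        # False: contains an uppercase letter
```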

Manage tokens

Note

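To use a token, include the header `x-acs-spark-livy-token: <token>` in each request that you send to the gateway. For example, if you use curl, add --header "x-acs-spark-livy-token: <token>" to the command.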

  1. On the Livy Gateways tab, find the desired Livy gateway and click Tokens in the Actions column.

  2. On the Tokens tab, click Create Token.

  3. In the Create Token dialog box, configure parameters and click OK. The following table describes the parameters.

    | Parameter | Description |
    | --- | --- |
    | Name | The name of the token. |
    | Expired At | The validity period of the token. The validity period must be greater than or equal to 1 day. By default, this parameter is enabled and set to 365 days. |

  4. Copy the token.

    Important

    After you create the token, you must immediately copy the token. You can no longer view the token after you leave the page. If the token expires or is lost, reset the token or create a new token.
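
After you copy the token, a client can attach it to each request by using the header named in the note at the start of this section. The following sketch only builds the request and does not send it. The gateway URL is a placeholder, and the /sessions path and the kind field are taken from the open source Livy REST API on the assumption that the gateway exposes the same interface. The function name is hypothetical.

```python
import json
import urllib.request

TOKEN_HEADER = "x-acs-spark-livy-token"  # header name from the note above


def build_session_request(gateway_url: str, token: str,
                          kind: str = "pyspark") -> urllib.request.Request:
    """Build (but do not send) a POST /sessions request that carries the token."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/sessions",  # assumed Livy-style path
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", TOKEN_HEADER: token},
    )


# Placeholder values: substitute the gateway endpoint and the copied token.
req = build_session_request("https://livy-gateway.example.com", "<token>")
print(req.get_method(), req.full_url)
# To send the request: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to confirm that the header is present before any traffic reaches the gateway.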

View information about a Spark session

After you create a Spark session by using the Livy interface, you can view the information about the Spark session, such as the session ID and status, on the Sessions tab of a specified Livy gateway.

  1. On the Livy Gateways tab, find the desired Livy gateway and click the name of the gateway.

  2. Click the Sessions tab.

    On the Sessions tab, you can view information about the Spark session that is created by using the Livy interface.
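
The same session ID and status can also be read from a Livy-style response in a script. The following sketch assumes that the gateway returns the session list in the shape that the open source Livy REST API uses for GET /sessions (a sessions array whose entries have id and state fields). The sample payload is made up for illustration and is not real gateway output.

```python
import json
from typing import List, Tuple


def summarize_sessions(payload: str) -> List[Tuple[int, str]]:
    """Extract (session ID, state) pairs from a Livy-style session list."""
    data = json.loads(payload)
    return [(s["id"], s["state"]) for s in data.get("sessions", [])]


# Hypothetical sample response, for illustration only.
sample = json.dumps({
    "from": 0,
    "total": 2,
    "sessions": [
        {"id": 0, "state": "idle"},
        {"id": 1, "state": "busy"},
    ],
})

for session_id, state in summarize_sessions(sample):
    print(session_id, state)
```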

Manage Kyuubi gateways

Note

You can create only one Kyuubi gateway for each workspace.

Create a Kyuubi gateway

  1. On the Kyuubi Gateways tab, click Create Kyuubi Gateway.

  2. On the Create Kyuubi Gateway page, configure parameters and click Create. The following table describes the parameters.

    | Parameter | Description |
    | --- | --- |
    | Name | The name of the Kyuubi gateway. The name can contain only lowercase letters, digits, and hyphens (-). It must start and end with a letter or digit. |
    | Kyuubi Gateway Resources | The resource configurations. Default value: 1 CPU, 4 GB. |
    | Kyuubi Version | The Kyuubi version. Default value: 1.9.2. |
    | Engine Version | The version of the Spark engine that is used by the Kyuubi gateway. For more information about engine versions, see Engine versions. |
    | Associated Queue | The queue in which the Kyuubi gateway is deployed. When a Spark job is submitted by using a gateway, the Spark job is submitted by using the identity of the gateway creator. |
    | Kyuubi Configuration | The Kyuubi configurations. Separate the key and value of a configuration item with spaces. Example: kyuubi.engine.pool.size 1. Only the configuration items in the list that follows this table are supported. |
    | Spark Configuration | The Spark configurations. Separate the key and value of a configuration item with spaces. Example: spark.sql.catalog.paimon.metastore dlf. spark.kubernetes.* configuration items are not supported. |
    | Authentication Type | The authentication mode. You can select only Token. After you create a gateway, you must generate a unique authentication token for the gateway. This way, you can use the token for identity authentication and access control when you submit requests over the gateway. |

    Supported Kyuubi configuration items:

    kyuubi.engine.pool.size
    kyuubi.engine.pool.size.threshold
    kyuubi.engine.share.level
    kyuubi.engine.single.spark.session
    kyuubi.session.engine.idle.timeout
    kyuubi.session.engine.initialize.timeout
    kyuubi.engine.security.token.max.lifetime
    kyuubi.session.engine.check.interval
    kyuubi.session.idle.timeout
    kyuubi.session.engine.request.timeout
    kyuubi.session.engine.login.timeout
    kyuubi.backend.engine.exec.pool.shutdown.timeout
    kyuubi.backend.server.exec.pool.shutdown.timeout
    kyuubi.backend.server.exec.pool.keepalive.time
    kyuubi.frontend.thrift.login.timeout
    kyuubi.operation.status.polling.timeout

  3. On the Kyuubi Gateways tab, find the created Kyuubi gateway and click Start in the Actions column.
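
The Kyuubi Configuration and Spark Configuration fields both take space-separated key-value items, so a draft can be checked before you paste it into the console. The following sketch assumes one item per line, which this topic does not state. The function names are hypothetical, and the supported keys are copied from the list in the parameter description.

```python
SUPPORTED_KYUUBI_KEYS = {
    "kyuubi.engine.pool.size",
    "kyuubi.engine.pool.size.threshold",
    "kyuubi.engine.share.level",
    "kyuubi.engine.single.spark.session",
    "kyuubi.session.engine.idle.timeout",
    "kyuubi.session.engine.initialize.timeout",
    "kyuubi.engine.security.token.max.lifetime",
    "kyuubi.session.engine.check.interval",
    "kyuubi.session.idle.timeout",
    "kyuubi.session.engine.request.timeout",
    "kyuubi.session.engine.login.timeout",
    "kyuubi.backend.engine.exec.pool.shutdown.timeout",
    "kyuubi.backend.server.exec.pool.shutdown.timeout",
    "kyuubi.backend.server.exec.pool.keepalive.time",
    "kyuubi.frontend.thrift.login.timeout",
    "kyuubi.operation.status.polling.timeout",
}


def parse_config(text: str) -> dict:
    """Parse 'key value' items (assumed: one per line) into a dict."""
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            raise ValueError(f"expected 'key value', got: {line!r}")
        result[parts[0]] = parts[1].strip()
    return result


def check_kyuubi_config(text: str) -> dict:
    """Reject keys that are not in the supported list."""
    config = parse_config(text)
    unsupported = sorted(k for k in config if k not in SUPPORTED_KYUUBI_KEYS)
    if unsupported:
        raise ValueError(f"unsupported Kyuubi items: {unsupported}")
    return config


def check_spark_config(text: str) -> dict:
    """Reject spark.kubernetes.* keys, which the gateway does not accept."""
    config = parse_config(text)
    blocked = sorted(k for k in config if k.startswith("spark.kubernetes."))
    if blocked:
        raise ValueError(f"unsupported Spark items: {blocked}")
    return config


print(check_kyuubi_config("kyuubi.engine.pool.size 1"))
print(check_spark_config("spark.sql.catalog.paimon.metastore dlf"))
```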

Manage tokens

  1. On the Kyuubi Gateways tab, find the desired gateway and click Tokens in the Actions column.

  2. On the Tokens tab, click Create Token.

  3. In the Create Token dialog box, configure parameters and click OK. The following table describes the parameters.

    | Parameter | Description |
    | --- | --- |
    | Name | The name of the token. |
    | Expired At | The validity period of the token. The validity period must be greater than or equal to 1 day. By default, this parameter is enabled and set to 365 days. |

  4. Copy the token.

    Important

    After you create the token, you must immediately copy the token. You can no longer view the token after you leave the page. If the token expires or is lost, reset the token or create a new token.

Connect to a Kyuubi gateway

When you connect to a Kyuubi gateway, configure the following parameters in jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token> based on your business requirements:

  • <endpoint>: the endpoint that you obtain on the Overview tab of the gateway.

  • <port>: the port number. If you use the public endpoint to connect to the Kyuubi gateway, the port number is 443. If you use the internal endpoint to connect to the Kyuubi gateway, the port number is 80.

  • <token>: the token that you copy on the Tokens tab of the gateway.

  • <tokenname>: the name of the token. You can view the name of the token on the Tokens tab. This parameter is required when you use Python to connect to a Kyuubi Gateway.
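
The placeholders above can be filled in programmatically. The following sketch assembles the connection URL and selects the port based on whether the public or internal endpoint is used, as described in the list. The function name and the sample endpoint and token values are hypothetical.

```python
def kyuubi_jdbc_url(endpoint: str, token: str, public: bool = True) -> str:
    """Assemble the JDBC URL for a Kyuubi gateway.

    Port 443 is used for the public endpoint and port 80 for the
    internal endpoint, as stated in the parameter list above.
    """
    port = 443 if public else 80
    return (
        f"jdbc:hive2://{endpoint}:{port}/"
        f";transportMode=http;httpPath=cliservice/token/{token}"
    )


# Placeholder endpoint and token, for illustration only.
print(kyuubi_jdbc_url("kyuubi.example.com", "my-token"))
# jdbc:hive2://kyuubi.example.com:443/;transportMode=http;httpPath=cliservice/token/my-token
```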

Use Beeline to connect to a Kyuubi gateway

When you use Beeline to connect to a Kyuubi gateway, make sure that the version of Beeline is compatible with the version of the Kyuubi server. If Beeline is not installed, install Beeline by referring to Getting Started.

beeline -u "jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token>"

If you use this method to connect to a Kyuubi gateway, you can add session-related parameters and modify the parameter values. Example: beeline -u "jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token>;#spark.sql.shuffle.partitions=100;spark.executor.instances=2;".
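
The session-parameter suffix in the preceding example follows a regular pattern: a ;# separator, then key=value pairs that are joined and terminated by semicolons. The following sketch reproduces that pattern for an arbitrary set of parameters. The function name is hypothetical, and the sketch assumes that keys and values contain no characters that need escaping.

```python
def with_session_conf(jdbc_url: str, conf: dict) -> str:
    """Append session parameters to a Kyuubi JDBC URL.

    Mirrors the format of the Beeline example above:
    <url>;#key1=value1;key2=value2;
    """
    if not conf:
        return jdbc_url
    pairs = ";".join(f"{key}={value}" for key, value in conf.items())
    return f"{jdbc_url};#{pairs};"


base = "jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token>"
print(with_session_conf(base, {
    "spark.sql.shuffle.partitions": 100,
    "spark.executor.instances": 2,
}))
```

The printed URL matches the one in the Beeline example and can be passed to beeline -u in the same way.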

Use Java to connect to a Kyuubi gateway

  • Update the pom.xml file.

    Replace the versions of the hadoop-common and hive-jdbc dependencies based on your business requirements.

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>3.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>2.3.9</version>
        </dependency>
    </dependencies>
  • Write Java code to connect to the desired Kyuubi gateway.

    import org.apache.hive.jdbc.HiveStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.ResultSetMetaData;
    
    public class Main {
        public static void main(String[] args) throws Exception {
            // Replace <endpoint>, <port>, and <token> with the values of your gateway.
            String url = "jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token>";
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            Connection conn = DriverManager.getConnection(url);
            HiveStatement stmt = (HiveStatement) conn.createStatement();
    
    
            String sql = "select * from students;";
            System.out.println("Running " + sql);
            ResultSet res = stmt.executeQuery(sql);
    
            ResultSetMetaData md = res.getMetaData();
            String[] columns = new String[md.getColumnCount()];
            for (int i = 0; i < columns.length; i++) {
                columns[i] = md.getColumnName(i + 1);
            }
            while (res.next()) {
                System.out.print("Row " + res.getRow() + "=[");
                for (int i = 0; i < columns.length; i++) {
                    if (i != 0) {
                        System.out.print(", ");
                    }
                    System.out.print(columns[i] + "='" + res.getObject(i + 1) + "'");
                }
                System.out.println("]");
            }
    
            conn.close();
        }
    }

Use Python to connect to a Kyuubi gateway

  1. Run the following command to install PyHive and Thrift:

    pip3 install pyhive thrift
  2. Write a Python script to connect to the desired Kyuubi gateway.

    The following Python sample code provides an example on how to connect to a Kyuubi gateway and query databases.

    from pyhive import hive
    
    if __name__ == '__main__':
        cursor = hive.connect('<endpoint>',
                              port="<port>",
                              scheme='http',
                              username='<tokenname>',
                              password='<token>').cursor()
        cursor.execute('show databases')
        print(cursor.fetchall())
        cursor.close()

References

For more information about applications of Livy gateways, see the following topics: