
E-MapReduce:Connect to EMR Serverless Spark by using Livy Interpreter for Apache Zeppelin

Last Updated: Apr 10, 2025

Apache Zeppelin provides an interactive development environment that enables users to write code, run queries, and perform data visualization and analysis in a web UI. This topic describes how to connect to E-MapReduce (EMR) Serverless Spark by using Livy Interpreter for Apache Zeppelin to efficiently build and optimize an interactive development environment.

Prerequisites

An EMR Serverless Spark workspace is created, and Apache Zeppelin is installed and accessible.

Procedure

Step 1: Create a gateway and a token

  1. Create and start a gateway.

    1. Go to the Gateways page.

      1. Log on to the EMR console.

      2. In the left-side navigation pane, choose EMR Serverless > Spark.

      3. On the Spark page, find the desired workspace and click the name of the workspace.

      4. In the left-side navigation pane of the EMR Serverless Spark page, choose Operation Center > Gateways.

    2. On the Gateways page, click the Livy Gateways tab.

    3. On the Livy Gateways tab, click Create Livy Gateway.

    4. On the Create Livy Gateway page, set the Name parameter and click Create. In this example, the Name parameter is set to Livy-gateway.

      You can configure other parameters based on your business requirements. For more information, see Manage gateways.

    5. On the Livy Gateways tab, find the created gateway and click Start in the Actions column.

  2. Create a token.

    1. On the Gateways page, find the gateway Livy-gateway and click Tokens in the Actions column.

    2. On the Tokens tab, click Create Token.

    3. In the Create Token dialog box, configure the Name parameter and click OK.

    4. Copy the token.

      Important

      Copy the token immediately after it is created. After you leave the page, you can no longer view the token. If the token expires or is lost, reset it or create a new one.
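Before you configure Zeppelin, you can verify the gateway endpoint and token with a quick smoke test against the Livy REST API. The following Python sketch builds an authenticated GET /sessions request using only the standard library; the endpoint and token values are placeholders that you must replace with the internal endpoint of your Livy gateway and the token you just copied.

```python
# Sketch: verify a Livy gateway token by listing sessions over the Livy REST API.
# The endpoint and token below are placeholders for illustration.
import json
import urllib.request


def build_livy_request(endpoint: str, token: str, path: str = "/sessions"):
    """Build an authenticated request for the Livy gateway.

    The x-acs-spark-livy-token header carries the gateway token that you
    created on the Tokens tab.
    """
    return urllib.request.Request(
        url=f"http://{endpoint}{path}",
        headers={"x-acs-spark-livy-token": token},
    )


def list_sessions(endpoint: str, token: str) -> dict:
    """Return the JSON body of GET /sessions. Raises on HTTP errors."""
    req = build_livy_request(endpoint, token)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

# Example (placeholder values, do not run as-is):
# sessions = list_sessions("<livy-gateway-internal-endpoint>", "<your-token>")
```

A 2xx response confirms that the endpoint is reachable and the token is valid; a 401 or 403 response usually indicates an expired or mistyped token.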

Step 2: Configure Livy Interpreter for Apache Zeppelin

  1. Log on to Apache Zeppelin, click the username in the upper-right corner, and then select Interpreter from the drop-down list.


  2. Click +Create in the upper-right corner and set the required parameters to create an interpreter.

    Interpreter Name: Enter a custom name, such as mylivy.

    Interpreter Group: Set this parameter to livy.

  3. After you set the Interpreter Group parameter to livy, configure the required parameters.


    The following list describes the required parameters. You can also configure other parameters based on your business requirements. For more information, see the official Apache Zeppelin documentation.

    zeppelin.livy.url: The URL of the Livy gateway, in the http://{endpoint} format. {endpoint} is the internal endpoint of the Livy gateway that you created.

    zeppelin.livy.session.create_timeout: The maximum amount of time, in seconds, that Apache Zeppelin waits for a session to be created. We recommend that you set this parameter to 600.

    zeppelin.livy.http.headers: The custom HTTP request header. Click the add icon to add the configuration and enter x-acs-spark-livy-token:{token}. {token} is the token that you created on the Tokens tab in Step 1.

  4. Click Save in the lower part of the page to save the settings.
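For reference, the three required interpreter properties from the steps above look like the following once saved. The endpoint and token are placeholders; use the values from your own Livy gateway.

```
zeppelin.livy.url                       http://<livy-gateway-internal-endpoint>
zeppelin.livy.session.create_timeout    600
zeppelin.livy.http.headers              x-acs-spark-livy-token:<your-token>
```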

Step 3: Create a notebook for data analytics

  1. In the top navigation bar, click Notebook. Then, select Create new note.

  2. Enter a custom note name and select mylivy from the Default Interpreter drop-down list.


  3. Click Create.

  4. Enter the following code in the created notebook to start a Spark session.

    The first startup takes 1 to 3 minutes. Enter %pyspark to use the Python environment, or %spark to use the Scala environment.

    %pyspark

    After the Spark session is started, you can view the link to the Spark UI and execute code. You can mix Python and Scala code in the same notebook.


  5. Enter the following code in the new notebook to query the available databases in the current Spark environment.

    %pyspark
    
    spark.sql("show databases").show()

    The output lists the databases that are available in the current Spark environment.
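Under the hood, each notebook paragraph is sent to the Livy gateway as a statement in the running session. The following Python sketch builds the equivalent POST /sessions/{id}/statements request with the standard library; the endpoint, session ID, and token are placeholders for illustration.

```python
# Sketch of what Zeppelin does under the hood: submit a statement to an
# existing Livy session over the Livy REST API. Endpoint, session ID, and
# token below are placeholders.
import json
import urllib.request


def submit_statement(endpoint: str, session_id: int, token: str, code: str,
                     kind: str = "pyspark") -> urllib.request.Request:
    """Build a POST /sessions/{id}/statements request for the Livy gateway."""
    body = json.dumps({"code": code, "kind": kind}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{endpoint}/sessions/{session_id}/statements",
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-acs-spark-livy-token": token,  # gateway token from Step 1
        },
        method="POST",
    )

# The same query as in the notebook paragraph above:
req = submit_statement("gw.example.internal:8998", 0, "<your-token>",
                       'spark.sql("show databases").show()')
```

Sending the request with `urllib.request.urlopen(req)` would return a statement object whose output you can poll, which is exactly how the notebook paragraph receives its result.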

  6. Optional. View session information.

    After you create a Spark session by using the Livy interface, you can view information about the Spark session, such as the session ID and status, on the Sessions tab of a specified Livy gateway.

    1. On the Livy Gateways tab, find the desired Livy gateway and click the name of the gateway.

    2. Click the Sessions tab.

      On the Sessions tab, you can view information about the Spark session that is created by using the Livy interface.

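Besides the console, you can check a session's state over the same Livy REST API. Livy reports states such as "starting", "idle", "busy", and "dead"; a session is ready to accept statements once it is "idle". The following sketch builds a GET /sessions/{id} request and extracts the state from a response body; the endpoint and token are placeholders.

```python
# Sketch: check the state of a Livy session programmatically.
# Endpoint and token are placeholders for illustration.
import urllib.request


def session_state_request(endpoint: str, session_id: int, token: str):
    """Build a GET /sessions/{id} request for the Livy gateway."""
    return urllib.request.Request(
        url=f"http://{endpoint}/sessions/{session_id}",
        headers={"x-acs-spark-livy-token": token},
    )


def extract_state(session_json: dict) -> str:
    """Return the session state from a GET /sessions/{id} response body."""
    return session_json.get("state", "unknown")

# Example response body (abridged) and its state:
sample = {"id": 0, "state": "idle", "appId": "application_123"}
print(extract_state(sample))  # idle
```

Polling this endpoint until the state becomes "idle" is a simple way to wait for the session before submitting statements from a script.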