A Livy gateway is a REST-based service that lets you submit Apache Spark jobs and query results over HTTP from multiple programming languages. With a Livy gateway, you can connect tools such as the livy_operator of Apache Airflow or the spark_magic of Jupyter Notebook directly to Serverless Spark to submit jobs and obtain status updates.
Prerequisites
Before you begin, make sure you have:
An EMR Serverless Spark workspace
A queue configured in the workspace
(Optional) A virtual private cloud (VPC) network connection, if your jobs need to access data sources or external services in a VPC. See Establish network connectivity between EMR Serverless Spark and other VPCs.
Create a Livy gateway
Log on to the EMR console.
In the left-side navigation pane, choose EMR Serverless > Spark.
On the Spark page, click the name of the target workspace.
In the left-side navigation pane of the EMR Serverless Spark page, click O&M Center > Gateway.
On the Livy Gateway page, click Create Livy Gateway.
Configure the parameters and click Create. For configuration file examples, see Livy configuration file examples.
Parameter | Description
Name | The gateway name. Use only lowercase letters, digits, and hyphens (-). The name must start and end with a letter or digit.
Livy Gateway Resource | The compute resource allocated to the gateway. Default: 1 CPU, 4 GB.
Livy Version | The Livy version. Default: 0.8.0.
Engine Version | The Spark engine version used by the gateway. See Engine version introduction.
Use Fusion Acceleration | Enables the Fusion engine to accelerate Spark workloads and reduce job costs. For billing details, see Billing. For engine details, see Fusion engine.
Associated Queue | The queue where the gateway runs. Jobs submitted through the gateway use the identity of the gateway creator.
Authentication Method | The authentication mode. Only Token is supported. After creating the gateway, generate an authentication token to control access. See Manage tokens.
Network Connection | The VPC network connection for accessing data sources or external services.
Environment | The runtime environment. Resources are pre-installed based on the environment you select.
Endpoint (Public) | Disabled by default. When enabled, the gateway is accessible through a public endpoint. When disabled, the gateway uses an internal endpoint.
Automatic Stop | Disabled by default. When enabled, the gateway stops automatically after 45 minutes of inactivity.
spark-defaults.conf | The Spark default configuration file. Sets global default parameters for all Spark jobs submitted through the gateway.
livy.conf | The Livy server configuration file. Defines global gateway behavior including authentication (LDAP), session management, and timeout settings. Applies to all jobs submitted through the gateway.
livy-client.conf | The Livy HTTP client configuration file. Defines the interaction behavior between the client and the gateway.
spark-blacklist.conf | A security configuration file that restricts which Spark parameters users can modify when submitting jobs. Parameters listed in this file are ignored by the system and cannot be overridden.
On the Livy Gateway page, click Start in the Actions column of the newly created gateway.
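The naming rule for the Name parameter can be checked before you open the console. The following is a minimal sketch in Python, not the console's own validation. The function name is hypothetical, and the sketch assumes that a single-character name is acceptable and that no length limit applies, neither of which is stated in this topic.

```python
import re

# Lowercase letters, digits, and hyphens only; the first and last character
# must be a letter or digit. Assumption: one-character names are allowed and
# there is no length limit (neither is specified in this topic).
_GATEWAY_NAME = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_gateway_name(name: str) -> bool:
    """Return True if name satisfies the documented naming rule."""
    return _GATEWAY_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("etl-gateway-01", "ETL-gateway", "-gateway", "gateway-"):
        print(candidate, is_valid_gateway_name(candidate))
```

If the console rejects a name that this check accepts, the console is authoritative.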
Manage tokens
Each gateway requires an authentication token. Include the token in the x-acs-spark-livy-token header of every request you send through the gateway.
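The header requirement can be sketched as follows in Python. This is an illustration under assumptions, not a reference client: the endpoint URL and token are placeholders, the helper name is hypothetical, and the /sessions path with a "kind" field assumes that the gateway exposes the standard Apache Livy REST API. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Header name taken from this topic.
TOKEN_HEADER = "x-acs-spark-livy-token"


def build_livy_request(endpoint, path, token, payload=None):
    """Build (but do not send) an HTTP request that carries the gateway token.

    A request with a payload is sent as POST with a JSON body;
    a request without one is sent as GET.
    """
    url = endpoint.rstrip("/") + "/" + path.lstrip("/")
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    method = "GET" if payload is None else "POST"
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header(TOKEN_HEADER, token)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


if __name__ == "__main__":
    # Placeholder endpoint and token; replace with your gateway's values.
    req = build_livy_request(
        "https://livy-gateway.example.com",
        "/sessions",
        "<your-token>",
        {"kind": "pyspark"},
    )
    print(req.get_method(), req.full_url)
    # To actually send the request: urllib.request.urlopen(req)
```

Keeping the token in one helper means every call, whether it creates a session or polls its status, carries the header without repeating it at each call site.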
Create a token
On the Livy Gateway page, click Tokens in the Actions column of the target gateway.
Click Create Token.
In the Create Token dialog box, configure the parameters and click OK.
Parameter | Description
Name | The token name.
Expired At | The token validity period. Must be at least 1 day. Default: 365 days.
Copy the token immediately after it is created.
Important: The token is visible only once. After you leave the page, it cannot be retrieved. If your token expires or is lost, reset it or create a new one.
View session information
After creating a Spark session through the Livy interface, view its details on the Sessions tab of the gateway.
On the Livy Gateway page, click the name of the target gateway.
Click the Sessions tab. The tab lists all Spark sessions created through the Livy interface, including each session's ID and status.
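The same ID and status information can also be read from a session listing returned over HTTP. The sketch below assumes that the gateway returns the response shape used by the Apache Livy REST API for GET /sessions, that is, a "sessions" list whose entries carry "id" and "state" fields. The function name is hypothetical and the sample payload is illustrative, not output captured from a gateway.

```python
import json


def summarize_sessions(body: str):
    """Return (id, state) pairs from a Livy GET /sessions response body."""
    parsed = json.loads(body)
    return [(s["id"], s["state"]) for s in parsed.get("sessions", [])]


if __name__ == "__main__":
    # Illustrative payload in the shape used by the Apache Livy REST API.
    sample = (
        '{"from": 0, "total": 2, "sessions": ['
        '{"id": 0, "state": "idle"}, {"id": 1, "state": "busy"}]}'
    )
    for session_id, state in summarize_sessions(sample):
        print(session_id, state)
```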
