Apache Ranger provides centralized permission management for Spark SQL, enabling fine-grained access control at the database, table, and column level. Configure the Ranger Plugin in a Livy Gateway to enforce data access policies for EMR Serverless Spark jobs submitted through that gateway.
Ranger handles authorization only — it does not verify user identity. To authenticate users, configure a separate authentication service such as LDAP. For more information, see Configure and enable LDAP authentication for Livy Gateway.
Prerequisites
Before you begin, ensure that you have:
- A Livy Gateway. For more information, see Gateway management.
- One of the following engine versions:
  - esr-4.x: esr-4.3.0 or later
  - esr-3.x: esr-3.3.0 or later
  - esr-2.x: esr-2.7.0 or later
Limitations
- Authorization scope only: Ranger controls which actions a user can perform on data. It does not verify who the user is. Pair it with LDAP authentication for complete access security.
- Ranger Audit disabled by default: In EMR Serverless Spark, Ranger Audit is disabled. Enable it explicitly if you need audit logs. See Step 3: (Optional) Configure Ranger Audit.
Step 1: Prepare the network
Establish network connectivity between EMR Serverless Spark and your virtual private cloud (VPC). This allows the Ranger Plugin to reach the Ranger Admin service and retrieve permission policies. For more information, see Establish network connectivity between EMR Serverless Spark and other VPCs.
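A quick TCP check can help confirm that the Ranger Admin port is reachable before you configure the plugin. The following is a minimal sketch, not an official tool: the function name `ranger_admin_reachable` is hypothetical, and the default port 6080 is the value this topic gives for EMR on ECS clusters. The check is only meaningful when run from a host that shares the network path set up in this step, and a successful connection shows only that the port accepts connections, not that policies can be retrieved.

```python
import socket


def ranger_admin_reachable(host: str, port: int = 6080, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call; replace the placeholder with the internal IP address of Ranger Admin.
# ranger_admin_reachable("<ranger_admin_ip>", 6080)
```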
Step 2: Configure the Ranger Plugin
Stop the gateway before making configuration changes. Then configure the following items.
- From the Network Connectivity drop-down list, select the connection created in Step 1.
- In livy.conf, add:
livy.impersonation.enabled true
- In Spark-defaults.conf, add:
| Parameter | Description |
|---|---|
| spark.ranger.plugin.enabled | Set to true to enable Ranger authorization. |
| spark.emr.serverless.user.defined.jars | Path to the custom JAR package. Set to /opt/ranger/ranger-spark.jar to use the Ranger Plugin built into EMR Serverless Spark. |
| ranger.plugin.spark.policy.rest.url | Address of the Ranger Admin service, in the format http://<ranger_admin_ip>:<ranger_admin_port>. For EMR on ECS clusters, set <ranger_admin_ip> to the internal IP address of the master node and <ranger_admin_port> to 6080. |
spark.ranger.plugin.enabled true
spark.emr.serverless.user.defined.jars /opt/ranger/ranger-spark.jar
ranger.plugin.spark.policy.rest.url http://<ranger_admin_ip>:<ranger_admin_port>
- Restart the session to apply the changes.
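If you manage several gateways, generating the three plugin entries from one place can reduce copy-and-paste mistakes. The sketch below renders them in the same "key value" form used in this step. The helper name `ranger_spark_defaults` is hypothetical, the address 192.0.2.10 is a documentation placeholder rather than a real Ranger Admin host, and the default port 6080 assumes an EMR on ECS cluster as described above.

```python
def ranger_spark_defaults(ranger_admin_ip: str, ranger_admin_port: int = 6080) -> str:
    """Render the Ranger Plugin entries as 'key value' lines."""
    entries = {
        "spark.ranger.plugin.enabled": "true",
        # Built-in Ranger Plugin JAR path given in this topic.
        "spark.emr.serverless.user.defined.jars": "/opt/ranger/ranger-spark.jar",
        "ranger.plugin.spark.policy.rest.url": f"http://{ranger_admin_ip}:{ranger_admin_port}",
    }
    return "\n".join(f"{key} {value}" for key, value in entries.items())


# 192.0.2.10 is a placeholder address; substitute your Ranger Admin IP.
print(ranger_spark_defaults("192.0.2.10"))
```

Paste the printed lines into the configuration field for the gateway.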
Step 3: (Optional) Configure Ranger Audit
Ranger supports storing audit logs in Apache Solr or Hadoop Distributed File System (HDFS). Ranger Audit is disabled by default in EMR Serverless Spark. To enable it, add the following parameters in the Spark Configuration section.
The following example configures Solr as the audit storage backend:
xasecure.audit.is.enabled true
xasecure.audit.destination.solr true
xasecure.audit.destination.solr.urls http://<solr_ip>:<solr_port>/solr/ranger_audits
xasecure.audit.destination.solr.user <user>
xasecure.audit.destination.solr.password <password>
| Parameter | Description |
|---|---|
| xasecure.audit.is.enabled | Set to true to enable Ranger Audit. |
| xasecure.audit.destination.solr | Set to true to send audit logs to Solr. |
| xasecure.audit.destination.solr.urls | URL of the Solr service. Replace <solr_ip> and <solr_port> with the actual IP address and port. |
| xasecure.audit.destination.solr.user and xasecure.audit.destination.solr.password | Credentials for Solr basic authentication, if enabled. |
If the Ranger service is from an EMR on ECS cluster, find the values for xasecure.audit.destination.solr.urls, xasecure.audit.destination.solr.user, and xasecure.audit.destination.solr.password in the Ranger-spark-audit.xml configuration file of the Ranger-plugin service.
After enabling audit, submit a task on EMR Serverless Spark and then go to the Ranger UI to view access records on the Access tab of Ranger Audit. For instructions on opening the Ranger UI, see Access the web UIs of open source components in the console.
Audit records are visible in the Ranger UI only when Solr is the configured storage backend. HDFS-based audit logs are not accessible from the Ranger UI.
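Because a missing or half-edited audit parameter tends to show up only as an empty Access tab, it can be worth linting the audit settings before you save them. The following sketch parses "key value" lines and reports common omissions for the Solr backend. Both function names are hypothetical, and the checks reflect only the parameters listed in this step, not an official validation rule set.

```python
from typing import Dict, List


def parse_conf(text: str) -> Dict[str, str]:
    """Parse 'key value' lines into a dict, skipping blank lines."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        conf[key] = value.strip()
    return conf


def solr_audit_problems(conf: Dict[str, str]) -> List[str]:
    """Return a list of problems found in a Solr audit configuration."""
    problems = []
    if conf.get("xasecure.audit.is.enabled") != "true":
        problems.append("xasecure.audit.is.enabled is not set to true")
    if conf.get("xasecure.audit.destination.solr") != "true":
        problems.append("xasecure.audit.destination.solr is not set to true")
    url = conf.get("xasecure.audit.destination.solr.urls", "")
    # A leftover '<' means a placeholder such as <solr_ip> was not replaced.
    if not url or "<" in url:
        problems.append("xasecure.audit.destination.solr.urls is missing or still contains a placeholder")
    return problems


sample = """
xasecure.audit.is.enabled true
xasecure.audit.destination.solr true
xasecure.audit.destination.solr.urls http://<solr_ip>:<solr_port>/solr/ranger_audits
"""
for problem in solr_audit_problems(parse_conf(sample)):
    print(problem)
```

The sample deliberately leaves the URL placeholders in place, so the check reports one problem.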
Step 4: Test the configuration
Use Jupyter Notebook to verify that Ranger policies are enforced. Accessing a database or table without the required permissions returns a Permission denied error.
Use separate user accounts for testing
Ranger's default policies grant all users the ability to switch between and create databases, and grant the database or table owner full permissions on their own resources. To test policies accurately:
- Use User A to create the test databases and tables.
- Use User B (a different account without ownership) to verify permission restrictions.
If you test with the same user who created the resources, the owner policy may silently grant access, making it appear that permission restrictions are not working.
If Ranger Admin is misconfigured, Spark SQL statements may run successfully without enforcing any permission checks — and no errors are returned. Verify that the Ranger Admin service is reachable and correctly configured before drawing conclusions from test results.
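When you run the check as User B in a notebook, a small helper can separate a Ranger denial from unrelated failures and from the silent-success case described above. This is a minimal sketch under stated assumptions: the helper name `is_denied` is hypothetical, it matches only on the "Permission denied" text mentioned in this step, and the commented usage assumes a PySpark session exposed as `spark` and a hypothetical table `test_db.test_table` created by User A.

```python
from typing import Callable


def is_denied(run_sql: Callable[[str], object], statement: str) -> bool:
    """Run a statement and report whether it failed with 'Permission denied'."""
    try:
        run_sql(statement)
    except Exception as exc:  # the concrete exception class depends on the client
        if "permission denied" in str(exc).lower():
            return True
        raise  # unrelated failure: surface it instead of misreporting it
    # The statement succeeded, so no restriction was enforced for this user.
    return False


# Usage in a notebook session running as User B (names are assumptions):
# denied = is_denied(lambda q: spark.sql(q).collect(),
#                    "SELECT * FROM test_db.test_table LIMIT 1")
# print("policy enforced" if denied else "statement was allowed")
```

A result of False for a statement that the policy should block is the signal to recheck the Ranger Admin address and the plugin settings.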
What's next
- Configure and enable LDAP authentication for Livy Gateway: add user identity verification to complement Ranger authorization.
- Access the web UIs of open source components in the console: open the Ranger UI to manage policies and view audit logs.