Apache Ranger provides centralized permission management for EMR Serverless Spark. With the Ranger plug-in integrated into Kyuubi Gateway, you can enforce fine-grained access control on databases, tables, and columns through Spark SQL.
Precautions
Ranger handles authorization only; it does not verify user identity. To authenticate users, use an authentication service such as LDAP. See Configure and enable LDAP authentication for Kyuubi Gateway for setup instructions.
Prerequisites
Before you begin, ensure that you have:
A Kyuubi Gateway. See Manage Kyuubi Gateway to create one.
An Apache Ranger Admin server that is running and reachable from your network.
One of the following engine versions (recommended):
esr-4.x: esr-4.3.0 or later
esr-3.x: esr-3.3.0 or later
esr-2.x: esr-2.7.0 or later
Step 1: Prepare the network
Configure network connectivity between EMR Serverless Spark and your virtual private cloud (VPC) so the Ranger plug-in can reach Ranger Admin and retrieve policies. See Network connectivity between EMR Serverless Spark and other VPCs.
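Before moving on, it can help to confirm that the Ranger Admin port accepts TCP connections. The following is a minimal sketch, not part of the product: `can_reach` is a hypothetical helper name. It only tests reachability from the machine where it runs, so run it from a host inside the connected VPC. It does not prove that the EMR Serverless Spark network connection itself works.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_reach("<ranger_admin_ip>", 6080)` with the placeholder replaced by your Ranger Admin private IP address should return `True` when the port is open.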
Step 2: Configure the Ranger plug-in
Stop Kyuubi Gateway before making changes. Then, in the Kyuubi Gateway configuration, select your connection from the Network Connection drop-down list and add the following parameters to Spark Configuration:
spark.ranger.plugin.enabled true
spark.emr.serverless.user.defined.jars /opt/ranger/ranger-spark.jar
ranger.plugin.spark.policy.rest.url http://<ranger_admin_ip>:<ranger_admin_port>

| Parameter | Description |
|---|---|
| spark.ranger.plugin.enabled | Set to true to enable Ranger authorization. |
| spark.emr.serverless.user.defined.jars | Path to the custom JAR file. Set to /opt/ranger/ranger-spark.jar to use the built-in Ranger plug-in in Serverless Spark. |
| ranger.plugin.spark.policy.rest.url | Address of the Ranger Admin service, in the format http://<ranger_admin_ip>:<ranger_admin_port>. Replace <ranger_admin_ip> with the private IP address and <ranger_admin_port> with the port of your Ranger Admin service. For an EMR on ECS cluster, use the master node's private IP address and port 6080. |
After saving, restart Kyuubi Gateway to apply the changes.
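To avoid typos when filling in the placeholders, the three Spark Configuration lines from this step can be generated rather than typed by hand. This is a minimal sketch: `ranger_spark_conf` is a hypothetical helper, and the default port 6080 is an assumption that matches the EMR on ECS case described in the table.

```python
def ranger_spark_conf(ranger_admin_ip: str, ranger_admin_port: int = 6080) -> str:
    """Build the three Spark Configuration lines used in Step 2."""
    lines = [
        "spark.ranger.plugin.enabled true",
        "spark.emr.serverless.user.defined.jars /opt/ranger/ranger-spark.jar",
        # The policy URL is the only line that depends on your environment.
        f"ranger.plugin.spark.policy.rest.url http://{ranger_admin_ip}:{ranger_admin_port}",
    ]
    return "\n".join(lines)
```

Print the result and paste it into Spark Configuration.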
Step 3: (Optional) Configure Ranger Audit
Ranger Audit is disabled by default in Serverless Spark. To enable it, add audit parameters to Spark Configuration.
The following example configures audit storage in Apache Solr:
xasecure.audit.is.enabled true
xasecure.audit.destination.solr true
xasecure.audit.destination.solr.urls http://<solr_ip>:<solr_port>/solr/ranger_audits
xasecure.audit.destination.solr.user <user>
xasecure.audit.destination.solr.password <password>

| Parameter | Description |
|---|---|
| xasecure.audit.is.enabled | Set to true to enable Ranger Audit. |
| xasecure.audit.destination.solr | Set to true to store audit data in Solr. |
| xasecure.audit.destination.solr.urls | URL of the Solr service. Replace <solr_ip> and <solr_port> with your Solr instance details. |
| xasecure.audit.destination.solr.user | Solr username. Required only if Basic authentication is enabled on the Solr service. |
| xasecure.audit.destination.solr.password | Solr password. Required only if Basic authentication is enabled on the Solr service. |
If you are connecting to Ranger in an EMR on ECS cluster, find the values for xasecure.audit.destination.solr.urls, xasecure.audit.destination.solr.user, and xasecure.audit.destination.solr.password in the ranger-spark-audit.xml configuration file of the Ranger-plugin service.
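If you have a copy of ranger-spark-audit.xml, the three Solr values can be extracted with a short script. This is a minimal sketch that assumes the file uses the Hadoop-style layout of `<property>` elements, each with a `<name>` and a `<value>` child. Check your file before relying on it. `solr_audit_settings` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The three properties named in the paragraph above.
SOLR_KEYS = (
    "xasecure.audit.destination.solr.urls",
    "xasecure.audit.destination.solr.user",
    "xasecure.audit.destination.solr.password",
)


def solr_audit_settings(xml_text: str) -> dict:
    """Return the Solr audit properties found in Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in SOLR_KEYS:
            found[name] = (prop.findtext("value") or "").strip()
    return found
```

Pass the file contents as a string, for example `solr_audit_settings(open("ranger-spark-audit.xml").read())`. Keys missing from the file are absent from the result.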

After submitting a task on EMR Serverless Spark, open the Ranger UI and check the Access tab under Ranger Audit to view user access records. See Access the web UIs of open source components from the console.
Audit records are visible in the Ranger UI only when Solr is the audit store. Records written to Hadoop Distributed File System (HDFS) or other storage backends cannot be browsed in the Ranger UI.

Step 4: Test the configuration
Use Spark Beeline to connect to Kyuubi Gateway and verify that Ranger policies are enforced. For connection options, see Connect to Kyuubi Gateway. Attempting to access a database, table, or column that the current user is not permitted to access returns a Permission denied error.
Ranger grants all users permission to switch and create databases by default. Database and table owners also have full permissions on resources they created. To get accurate test results, verify that User B cannot access resources created by User A, rather than testing a user's access to their own resources.
If Ranger Admin is misconfigured, SQL statements may execute without errors while policies are silently not enforced. If behavior is unexpected, confirm that Ranger Admin is reachable and that the ranger.plugin.spark.policy.rest.url value is correct.
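Because a malformed URL fails silently, a quick format check on the configured value can rule out the most common mistakes. This is a minimal sketch: `check_policy_rest_url` is a hypothetical helper, and treating a missing port as a problem is an assumption based on the http://<ranger_admin_ip>:<ranger_admin_port> format given in Step 2. It checks the shape of the value only, not whether Ranger Admin answers at that address.

```python
from urllib.parse import urlsplit


def check_policy_rest_url(value: str) -> list:
    """Return a list of problems with a policy REST URL; an empty list means none found."""
    # Unreplaced placeholders such as <ranger_admin_ip> make further parsing pointless.
    if "<" in value or ">" in value:
        return ["placeholder not replaced"]
    problems = []
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    parts = urlsplit(value.strip())
    if parts.scheme not in ("http", "https"):
        problems.append("scheme should be http or https")
    if not parts.hostname:
        problems.append("missing host")
    try:
        port = parts.port  # Raises ValueError if the port is not a valid number.
    except ValueError:
        problems.append("port is not a valid number")
    else:
        if port is None:
            problems.append("missing port")
    return problems
```

An empty list means the value looks well formed. Anything else names what to fix before restarting Kyuubi Gateway.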