Apache Ranger provides a centralized permission management framework. The Ranger Plugin for Spark offers fine-grained access control for Spark SQL access to databases, tables, and columns, which enhances data security. You can configure the Ranger Plugin in Livy Gateway to enable access control for your data.
Prerequisites
You have created a Livy Gateway. For more information, see Gateway management.
Use one of the following engine versions:
esr-4.x: esr-4.3.0 or later.
esr-3.x: esr-3.3.0 or later.
esr-2.x: esr-2.7.0 or later.
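The version requirements above can be checked with a small script. The following is a minimal sketch, assuming the engine version string starts with esr-X.Y.Z; the function name supports_ranger and the MIN_VERSIONS table are illustrative and not part of any EMR SDK.

```python
import re

# Minimum engine versions from the prerequisites, keyed by major version line.
MIN_VERSIONS = {4: (4, 3, 0), 3: (3, 3, 0), 2: (2, 7, 0)}


def supports_ranger(engine_version: str) -> bool:
    """Return True if an 'esr-X.Y.Z' version meets the minimum for its major line."""
    match = re.match(r"esr-(\d+)\.(\d+)\.(\d+)", engine_version.strip())
    if match is None:
        raise ValueError(f"unrecognized engine version: {engine_version!r}")
    version = tuple(int(part) for part in match.groups())
    minimum = MIN_VERSIONS.get(version[0])
    # Major lines that the prerequisites do not list are treated as unsupported.
    return minimum is not None and version >= minimum
```

Versions are compared as integer tuples rather than strings, so a hypothetical esr-3.10.0 is correctly ordered after esr-3.3.0.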
Notes
Ranger provides authorization services. However, user identity verification requires a separate authentication service, such as LDAP. For more information, see Configure and enable LDAP authentication for Livy Gateway.
Procedure
Step 1: Prepare the network
Before you configure Ranger, you must establish network connectivity between Serverless Spark and your VPC. This allows the Ranger Plugin to connect to your Ranger Admin service and retrieve permission information. For more information, see Establish network connectivity between EMR Serverless Spark and other VPCs.
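Before you continue, it can help to confirm that the Ranger Admin port accepts TCP connections from inside the VPC. The following is a minimal sketch that uses only the Python standard library; the helper name can_reach is illustrative. It is assumed to run on a machine in the same VPC as Ranger Admin, so a successful result shows that the service is listening but does not by itself prove that Serverless Spark can reach it.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, call can_reach with the internal IP address of your Ranger Admin host and its port (6080 for EMR on ECS, as described in Step 2).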
Step 2: Configure the Ranger Plugin
To enable Ranger authorization for a Livy Gateway, first stop the gateway, and then configure the following items.
From the Network Connectivity drop-down list, select the name of the connection that you created.
In livy.conf, add the following configuration item.
livy.impersonation.enabled true
In spark-defaults.conf, add the following configuration items.
spark.ranger.plugin.enabled true
spark.emr.serverless.user.defined.jars /opt/ranger/ranger-spark.jar
ranger.plugin.spark.policy.rest.url http://<ranger_admin_ip>:<ranger_admin_port>
The parameters are described as follows.
spark.ranger.plugin.enabled: Set to true to enable Ranger authorization.
spark.emr.serverless.user.defined.jars: The path of the custom JAR package. Set this parameter to /opt/ranger/ranger-spark.jar to use the Ranger Plugin that is built into Serverless Spark.
ranger.plugin.spark.policy.rest.url: The address of the Ranger Admin service. The format is http://<ranger_admin_ip>:<ranger_admin_port>. Replace <ranger_admin_ip> and <ranger_admin_port> with the internal IP address and port of the Ranger Admin service. If you connect to the Ranger service of an Alibaba Cloud EMR on ECS cluster, set <ranger_admin_ip> to the internal IP address of the master node and <ranger_admin_port> to 6080.
After you complete the configuration, restart the session to apply the changes.
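A common mistake is to save the configuration with a placeholder such as <ranger_admin_ip> still in place. The following is a minimal sketch of a local check on the text that you plan to paste into spark-defaults.conf. The helper names parse_conf and check_ranger_conf are illustrative, and the check assumes the whitespace-separated "key value" layout shown in this step.

```python
import re

# The three keys that this step adds to spark-defaults.conf.
REQUIRED_KEYS = (
    "spark.ranger.plugin.enabled",
    "spark.emr.serverless.user.defined.jars",
    "ranger.plugin.spark.policy.rest.url",
)


def parse_conf(text: str) -> dict:
    """Parse whitespace-separated 'key value' lines, skipping blanks and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def check_ranger_conf(text: str) -> list:
    """Return a list of problems found; an empty list means none were found."""
    conf = parse_conf(text)
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in conf]
    for key, value in conf.items():
        # Angle-bracket tokens such as <ranger_admin_ip> are unreplaced placeholders.
        if re.search(r"<[^>]+>", value):
            problems.append(f"placeholder not replaced in {key}")
    if "spark.ranger.plugin.enabled" in conf and conf["spark.ranger.plugin.enabled"] != "true":
        problems.append("spark.ranger.plugin.enabled is not set to true")
    return problems
```

The check only inspects the text locally. It does not contact Ranger Admin, so an empty result does not guarantee that the address is reachable or correct.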
Step 3: (Optional) Configure Ranger Audit
Ranger lets you configure the storage method for audits, such as Solr or HDFS. By default, the Ranger Audit feature is disabled in Serverless Spark. To use this feature, add the Ranger Audit parameters in the Spark Configuration section.
For example, to configure a connection to Solr in EMR, add the following configurations in the Spark Configuration section.
xasecure.audit.is.enabled true
xasecure.audit.destination.solr true
xasecure.audit.destination.solr.urls http://<solr_ip>:<solr_port>/solr/ranger_audits
xasecure.audit.destination.solr.user <user>
xasecure.audit.destination.solr.password <password>
The parameters are described as follows:
xasecure.audit.is.enabled: Specifies whether to enable Ranger Audit.
xasecure.audit.destination.solr: Specifies whether to store audits in the Solr service.
xasecure.audit.destination.solr.urls: The URL of the Solr service. Replace <solr_ip> and <solr_port> with the IP address and port of the Solr service.
xasecure.audit.destination.solr.user and xasecure.audit.destination.solr.password: If basic authentication is enabled for the Solr service, specify the username and password.
If you connect to Ranger in EMR on ECS, you can find the values for xasecure.audit.destination.solr.urls, xasecure.audit.destination.solr.user, and xasecure.audit.destination.solr.password in the Ranger-spark-audit.xml configuration file of the Ranger-plugin service.
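The audit settings can also be generated from a few inputs so that the username and password are included only when Solr uses basic authentication. The following is a minimal sketch; the function name ranger_audit_conf is illustrative, and leaving out the two credential keys when basic authentication is disabled is an assumption drawn from the parameter description above.

```python
from typing import Optional


def ranger_audit_conf(solr_url: str,
                      user: Optional[str] = None,
                      password: Optional[str] = None) -> str:
    """Render the Ranger Audit entries for the Spark Configuration section."""
    if (user is None) != (password is None):
        raise ValueError("provide both user and password, or neither")
    entries = {
        "xasecure.audit.is.enabled": "true",
        "xasecure.audit.destination.solr": "true",
        "xasecure.audit.destination.solr.urls": solr_url,
    }
    # Credentials are added only when basic authentication is in use (assumption).
    if user is not None:
        entries["xasecure.audit.destination.solr.user"] = user
        entries["xasecure.audit.destination.solr.password"] = password
    return "\n".join(f"{key} {value}" for key, value in entries.items())
```

Pass the full Solr URL, including the /solr/ranger_audits path, as solr_url.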
After the configuration is complete, submit a task on EMR Serverless Spark. You can then go to the Ranger UI and view user access audit information on the Access tab of Ranger Audit. For more information about how to access the Ranger UI, see Access the web UIs of open source components in the console.
You can view audit information in the Ranger UI only if audit storage is configured to use Solr. If you use another storage method, such as HDFS, the audit information is not accessible from the Ranger UI.

Step 4: Test the connection
You can use Jupyter Notebook to test the connection. If you attempt to access a resource for which you do not have permissions, such as a database or table, a Permission denied message is returned.
When you test permissions, be aware of the default access policies that Ranger adds. For example, all users can switch between and create databases, and the owner of a database or table has all permissions on that resource. For accurate testing, use one user (User A) to create resources, such as databases and tables, and a different user (User B) to verify the permissions. If you use the same user for both actions, the owner's access policy may grant permissions that make it seem as if your security settings are not in effect.
If the Ranger Admin service is incorrectly configured, SQL statements may be executed successfully without returning permission errors, even though the permission settings are not being enforced.
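Because a misconfiguration can make statements succeed silently, a negative test is the more informative one: run a statement as User B that should be denied and confirm that it fails. The following is a minimal sketch that classifies the output object of a Livy statement (the status, ename, and evalue fields of the Livy REST API). The function name classify_negative_test is illustrative, and matching on the text "permission denied" is an assumption based on the message described in this step.

```python
def classify_negative_test(output: dict) -> str:
    """Classify the output of a statement that User B should NOT be allowed to run.

    Returns 'enforced', 'not enforced', or 'inconclusive'.
    """
    status = output.get("status")
    message = " ".join(str(output.get(field, "")) for field in ("ename", "evalue"))
    if status == "error" and "permission denied" in message.lower():
        return "enforced"
    if status == "ok":
        # The statement ran even though it should have been denied.
        return "not enforced"
    # Any other error (syntax error, missing table, and so on) proves nothing.
    return "inconclusive"
```

A result of "not enforced" suggests checking the Ranger Admin address and the owner and default policies described above before you adjust individual policies.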