Limits
The engine version of Serverless Spark must meet the following requirements:
esr-4.x: esr-4.2.0 or later
esr-3.x: esr-3.0.1 or later
esr-2.x: esr-2.4.1 or later
Precautions
Ranger is responsible only for permission control (authorization). If you want to implement identity authentication, use the OpenLDAP service. For more information, see Configure LDAP authentication for a Spark Thrift Server.
Prerequisites
A Spark Thrift Server is created. For more information, see Manage Spark Thrift Servers.
Procedure
Step 1: Configure network connectivity
You must configure network connectivity between E-MapReduce (EMR) Serverless Spark and your virtual private cloud (VPC) to allow the specified Ranger plugin to connect to Ranger Admin and be granted related permissions. For more information, see Configure network connectivity between EMR Serverless Spark and a data source across VPCs.
Step 2: Configure a Ranger plugin
Before you configure Ranger authentication for a Spark Thrift Server, you must stop the Spark Thrift Server. Then, modify the Spark Thrift Server: select the connection that you created from the Network Connection drop-down list, and add the configuration items that correspond to the type of Ranger plugin that you use to the Spark Configuration parameter. After you modify the Spark Thrift Server, restart it for the modification to take effect.
Method 1: Use the built-in Ranger plugin
This method is available only for Spark Thrift Servers whose engine version is esr-3.1.0 or later.
spark.ranger.plugin.enabled true
spark.jars /opt/ranger/ranger-spark.jar
ranger.plugin.spark.policy.rest.url http://<ranger_admin_ip>:<ranger_admin_port>

<ranger_admin_ip> and <ranger_admin_port> specify the internal IP address and port number of Ranger Admin. Configure the parameters based on your business requirements. If you connect to the Ranger service that is deployed in an EMR on ECS cluster, set the <ranger_admin_ip> parameter to the internal IP address of the master node and the <ranger_admin_port> parameter to 6080.
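For example, if you connect to the Ranger service that is deployed in an EMR on ECS cluster and the internal IP address of the master node is 192.168.0.10 (a hypothetical address used here only for illustration), the complete configuration is similar to the following one:

spark.ranger.plugin.enabled true
spark.jars /opt/ranger/ranger-spark.jar
ranger.plugin.spark.policy.rest.url http://192.168.0.10:6080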
Method 2: Use a custom Ranger plugin
If you want to use a custom Ranger plugin, you can upload the plugin to Object Storage Service (OSS) and specify the OSS path of the JAR package and the name of the extension class.
spark.jars oss://<bucket>/path/to/user-ranger-spark.jar
spark.ranger.plugin.class <class_name>
spark.ranger.plugin.enabled true
ranger.plugin.spark.policy.rest.url http://<ranger_admin_ip>:<ranger_admin_port>

Configure the following parameters based on your business requirements:
spark.jars: the OSS path in which the custom JAR package is stored.
spark.ranger.plugin.class: the name of the Spark extension class of the custom Ranger plugin.
<ranger_admin_ip> and <ranger_admin_port>: the internal IP address and port number of Ranger Admin. If you connect to the Ranger service that is deployed in an EMR on ECS cluster, set the <ranger_admin_ip> parameter to the internal IP address of the master node and the <ranger_admin_port> parameter to 6080.
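This topic does not define the exact interface that the class specified by spark.ranger.plugin.class must implement, so the following Scala skeleton is only a minimal sketch that assumes the common Spark session extension pattern. The class and rule names are hypothetical, and a real plugin would evaluate each plan against Ranger policies instead of passing it through:

import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical authorization rule. A real plugin maps the logical plan to
// Ranger resources (database, table, column) and throws an
// AccessControlException if the current user lacks the required privilege.
case class MyRangerAuthorizationRule(spark: SparkSession) extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

// Hypothetical extension class. Package it into user-ranger-spark.jar and set
// spark.ranger.plugin.class to its fully qualified name.
class MyRangerSparkExtension extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectOptimizerRule(MyRangerAuthorizationRule)
  }
}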
(Optional) Step 3: Configure the audit feature of Ranger
Ranger allows you to specify the service that is used to store audit information, such as Solr and Hadoop Distributed File System (HDFS). By default, the audit feature of Ranger is disabled for EMR Serverless Spark. If you want to enable the audit feature, you can add the related configuration items to the Spark Configuration parameter.
If you want to connect to the Solr service that is deployed in an EMR on ECS cluster, add the following configuration items to the Spark Configuration parameter:
xasecure.audit.is.enabled true
xasecure.audit.destination.solr true
xasecure.audit.destination.solr.urls http://<solr_ip>:<solr_port>/solr/ranger_audits
xasecure.audit.destination.solr.user <user>
xasecure.audit.destination.solr.password <password>

Parameter description:
xasecure.audit.is.enabled: specifies whether to enable the audit feature of Ranger.
xasecure.audit.destination.solr: specifies whether to store audit information in Solr.
xasecure.audit.destination.solr.urls: the URL of Solr. <solr_ip> indicates the IP address of Solr, and <solr_port> indicates the port number of Solr. Configure other URL information based on your business requirements.
xasecure.audit.destination.solr.user and xasecure.audit.destination.solr.password: the username and password. The parameters are required if you enable basic authentication for Solr.

If you connect to the Ranger service that is deployed in an EMR on ECS cluster, you can view the values of the xasecure.audit.destination.solr.urls, xasecure.audit.destination.solr.user, and xasecure.audit.destination.solr.password configuration items on the ranger-spark-audit.xml tab of the Ranger-plugin service page.
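Ranger also supports HDFS as an audit destination through the standard audit configuration items of the Ranger plugin framework. Whether your Ranger plugin for EMR Serverless Spark supports this destination depends on the plugin, so treat the following configuration as a sketch; <namenode_ip> is a placeholder for the address of your NameNode:

xasecure.audit.is.enabled true
xasecure.audit.destination.hdfs true
xasecure.audit.destination.hdfs.dir hdfs://<namenode_ip>:8020/ranger/audit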
If you submit jobs in EMR Serverless Spark after you configure the audit feature, you can access the web UI of Ranger and view the audit information about user access on the Access tab. For information about how to access the web UI of Ranger, see Access the web UIs of open source components in the EMR console.
You can view audit information on the web UI of Ranger only if you use Solr to store audit information.

Step 4: Test the connectivity
If Ranger authentication takes effect, the following error is returned when you use Spark Beeline to access resources, such as databases or tables, on which you do not have permissions:
0: jdbc:hive2://pre-emr-spark-gateway-cn-hang> create table test(id int);
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [test] does not have [create] privilege on [database=testdb/table=test]
at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:44)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:325)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:230)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:79)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:63)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:230)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:225)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:239)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)

When you verify permissions, take note of the default permissions that users have in Ranger. For example, all users have the permissions to switch and create databases, and the owners of databases and tables have full permissions on the databases and tables. We recommend that you verify the permissions of User B on the resources created by User A, such as databases and tables. If you verify the permissions of users on the resources that they created, specific permission settings may appear not to take effect. This occurs because the owner of a resource has full permissions on the resource.
If Ranger Admin is incorrectly configured, the SQL statements may be successfully executed, and no error is reported. However, Ranger authentication does not take effect.
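To automate this check, you can run a statement over the JDBC endpoint of the Spark Thrift Server as a user who has not been granted the relevant privilege and confirm that the statement fails. The following Scala sketch assumes a hypothetical endpoint and an LDAP user named bob who has no Ranger policy on the testdb database:

import java.sql.{DriverManager, SQLException}

// Minimal smoke test. The endpoint, user, and password are placeholders, and
// the Hive JDBC driver (org.apache.hive:hive-jdbc) must be on the classpath.
object RangerSmokeTest {
  def main(args: Array[String]): Unit = {
    val url = "jdbc:hive2://<endpoint>:<port>/default"
    val conn = DriverManager.getConnection(url, "bob", "<password>")
    try {
      // Expected to fail with an AccessControlException wrapped in a
      // SQLException if the Ranger plugin is active.
      conn.createStatement().execute("CREATE TABLE testdb.test (id INT)")
      println("WARNING: statement succeeded; Ranger authorization may not be in effect")
    } catch {
      case e: SQLException => println(s"Access denied as expected: ${e.getMessage}")
    } finally {
      conn.close()
    }
  }
}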