You can use Kerberos with the Kyuubi Gateway in Serverless Spark for secure identity authentication and access control. After you complete the configuration, clients must use Kerberos authentication to submit tasks to the Kyuubi Gateway in the workspace. This process enhances task execution security.
Limitations
The cluster and the Serverless Spark workspace must be in the same region.
Only one Kyuubi Gateway can be created in a workspace where Kerberos is enabled.
Prerequisites
You have created an EMR on ECS cluster with Kerberos authentication enabled. For more information, see Create a cluster.
You have created a Serverless Spark workspace with Kerberos authentication enabled. For more information, see Enable Kerberos authentication.
Create network connectivity
To use Kerberos with the Kyuubi Gateway, you must configure PrivateLink to establish network connectivity between Serverless Spark and the Kerberos cluster.
Create an endpoint
An endpoint is created and maintained by the service consumer. You can associate the endpoint with an endpoint service to establish network connectivity for accessing external services through PrivateLink. For more information, see Endpoints.
Log on to the Endpoint console.
On the Create Endpoint page, configure the following parameters for the endpoint and click OK.

Configuration | Description |
Region | Select the region where the endpoint resides. Make sure it is the same region as the Kerberos cluster and the Serverless Spark workspace. |
Endpoint Name | Enter a custom name for the endpoint. |
Endpoint Type | Select Interface Endpoint. |
Endpoint Service | Click Select Service, and then select or enter the target endpoint service ID. Note: To obtain the endpoint service ID, submit a ticket that includes the following information: (1) the Serverless Spark workspace ID, such as w-f8cfXXXXXX; (2) the ID of the VPC of the Kerberos cluster that accesses the Kyuubi Gateway, such as vpc-bp1tXXXXXX (the VPC must have available vSwitches in two zones); (3) the two selected zones, such as I,J. To find out which zones are supported in the region, ask customer service in the ticket. |
VPC | Select the VPC of the Kerberos cluster that accesses the Kyuubi Gateway. |
Security Groups | Select the security group to associate with the endpoint elastic network interface (ENI). Note: By default, you can add a maximum of nine security groups to an endpoint. |
Zone and vSwitch | Select the two zones that you submitted in the ticket and their corresponding vSwitches. |
IP Version | The supported network type. IPv4: Supports client access using IPv4 addresses. Dual-stack: Supports client access using both IPv4 and IPv6 addresses. Note: The service consumer can select Dual-stack only after the service provider has completed the dual-stack configuration. |
Resource Group | Select the resource group to which the endpoint belongs. |
Tag | Select or enter a Tag Key and a Tag Value. |
On the Basic Information page, a Status of Active indicates that the endpoint service was created successfully. The domain name for the endpoint service is ep-xxxxxxxxxxx.epsrv-xxxxxxxxxxx.cn-hangzhou.privatelink.aliyuncs.com.
Log on to the Kerberos cluster and test the network connectivity.
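As a minimal sketch of such a connectivity test, the following helper checks whether a TCP port on the endpoint domain name accepts connections. It assumes that bash and the coreutils timeout command are available on the cluster node, and that the gateway listens on port 10009 (the port that appears in the JDBC connection string later in this topic). The function names check_tcp and report_endpoint are hypothetical.

```shell
# Hypothetical helper: returns 0 if a TCP connection to host:port opens within 5 seconds.
# Relies on bash's built-in /dev/tcp redirection, so no extra client tool is needed.
check_tcp() {
  local host="$1" port="$2"
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Prints a one-line result for the given endpoint.
# The default port 10009 is an assumption taken from the JDBC URL in this topic.
report_endpoint() {
  local host="$1" port="${2:-10009}"
  if check_tcp "$host" "$port"; then
    echo "reachable: ${host}:${port}"
  else
    echo "unreachable: ${host}:${port}"
  fi
}
```

For example, run report_endpoint with your endpoint domain name as the first argument. An unreachable result usually points to the security group, vSwitch, or zone selection of the endpoint.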

Configure domain name resolution (Optional)
The default endpoint service domain name is long. For convenience, you can configure a custom internal authoritative domain name. For more information, see Internal Authoritative Domain Names.
Log on to the console. On the Authoritative Zone tab, click the User Defined Zones tab, and then click Add Zone.
Enter an Authoritative Zone, select the VPCs where the domain name will apply, and then click OK. This topic uses kyuubi-kerberos.abc as an example.
Note: If the Domain Name Type option is available, select Internal Authoritative Acceleration Zone. If the Domain Name Type option is not available, you do not need to select a type because an Internal Authoritative Acceleration Zone domain name is created by default.
On the User Defined Zones tab, find the target domain name and click Settings in the Actions column. In the dialog box that appears, click Add Record and select Form Editor Mode.
Set Record Type to CNAME. For Hostname, enter a value, such as test. For Record Value, enter the endpoint service domain name ep-xxxxxxxxxxx.epsrv-xxxxxxxxxxx.cn-hangzhou.privatelink.aliyuncs.com. After you click OK, test.kyuubi-kerberos.abc resolves to the endpoint service domain name.
Log on to the Kerberos cluster and test the network connectivity.
ping test.kyuubi-kerberos.abc
Create a keytab
Run the following command to start the Kerberos kadmin.local tool.
kadmin.local
Create a principal in the format kyuubi/<fqdn>@<REALM>. For the fqdn part, use the endpoint domain name ep-xxxxxxxxxxx.epsrv-xxxxxxxxxxx.cn-hangzhou.privatelink.aliyuncs.com. If you configured a custom domain name with a CNAME record, use the custom domain name instead, such as test.kyuubi-kerberos.abc.
addprinc -randkey kyuubi/ep-xxxxxxxxxxx.epsrv-xxxxxxxxxxx.cn-hangzhou.privatelink.aliyuncs.com@EMR.C-DFD4*****C204.COM
Export the keytab file and exit the kadmin.local tool.
xst -kt /root/kyuubi.keytab kyuubi/ep-xxxxxxxxxxx.epsrv-xxxxxxxxxxx.cn-hangzhou.privatelink.aliyuncs.com@EMR.C-DFD4*****C204.COM
quit
Upload the generated keytab file to an OSS bucket.
hadoop fs -put /root/kyuubi.keytab oss://<YOUR_BUCKET>.<region>.oss-dls.aliyuncs.com/
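The interactive steps above can also be scripted. The following is a minimal sketch, assuming it runs as root on the KDC host where kadmin.local is available. It uses the -q option of kadmin.local to run a single query per invocation. The function names build_principal and create_keytab are hypothetical, and the realm and domain name are placeholders that you must replace with your own values.

```shell
# Hypothetical helper: builds the Kyuubi service principal from an FQDN and a realm.
build_principal() {
  local fqdn="$1" realm="$2"
  printf 'kyuubi/%s@%s\n' "$fqdn" "$realm"
}

# Creates the principal with a random key and exports its keytab non-interactively.
# kadmin.local -q runs one query and exits, so no interactive session is needed.
create_keytab() {
  local principal="$1" keytab="$2"
  kadmin.local -q "addprinc -randkey ${principal}" &&
  kadmin.local -q "xst -kt ${keytab} ${principal}" &&
  klist -kt "${keytab}"   # list the keytab entries to confirm the export
}
```

For example, create_keytab "$(build_principal test.kyuubi-kerberos.abc <REALM>)" /root/kyuubi.keytab creates the principal for the custom domain name and lists the resulting keytab entries.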
Configure Kyuubi Gateway
To use Kerberos with the Kyuubi Gateway, add the following settings to Kyuubi Configuration.
kyuubi.authentication KERBEROS
kyuubi.kinit.principal kyuubi/ep-xxxxxxxxxxx.epsrv-xxxxxxxxxxx.cn-hangzhou.privatelink.aliyuncs.com@EMR.C-DFD43******7C204.COM
kyuubi.kinit.keytab /opt/kyuubi/work-dir/kyuubi.keytab
kyuubi.files oss://bucket/path/to/kyuubi.keytab
Configuration Item | Description |
kyuubi.authentication | Specifies the authentication method used by the Kyuubi Gateway. Set it to KERBEROS. |
kyuubi.kinit.principal | Specifies the principal that the Kyuubi Gateway uses for Kerberos authentication. The format is kyuubi/<fqdn>@<REALM>. |
kyuubi.kinit.keytab | Specifies the keytab file used by the Kyuubi Gateway. Note: The directory /opt/kyuubi/work-dir/ is fixed. You only need to replace the keytab file name. |
kyuubi.files | The OSS path of the keytab file that you uploaded in the Create a keytab step. |
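The four settings depend on only two inputs: the principal and the OSS path of the keytab. The following sketch generates the settings from those inputs. It assumes that the file name under /opt/kyuubi/work-dir/ is the same as the file name of the uploaded OSS object. The function name render_kyuubi_conf is hypothetical.

```shell
# Hypothetical helper: prints the Kyuubi settings for Kerberos.
# $1: service principal, $2: OSS URI of the uploaded keytab file.
render_kyuubi_conf() {
  local principal="$1" keytab_oss_uri="$2"
  # Assumption: the keytab keeps its object name inside the fixed work directory.
  local keytab_name="${keytab_oss_uri##*/}"
  cat <<EOF
kyuubi.authentication KERBEROS
kyuubi.kinit.principal ${principal}
kyuubi.kinit.keytab /opt/kyuubi/work-dir/${keytab_name}
kyuubi.files ${keytab_oss_uri}
EOF
}
```

Paste the generated lines into Kyuubi Configuration. Generating them from one place helps keep the keytab file name consistent between kyuubi.kinit.keytab and kyuubi.files.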
The following Spark Configuration settings are required to connect to a Kerberos-enabled Hive Metastore (HMS).
spark.hadoop.hive.metastore.uris thrift://master-1-1.c-1d36*****e840c.cn-hangzhou.emr.aliyuncs.com:9083
spark.hadoop.hive.imetastoreclient.factory.class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClientFactory
spark.hive.metastore.kerberos.principal hive/_HOST@EMR.C-DFD4*****C204.COM
spark.hive.metastore.sasl.enabled true
spark.emr.serverless.network.service.name <network_name>
Configuration Item | Description |
spark.hadoop.hive.metastore.uris | The HMS address. |
spark.hadoop.hive.imetastoreclient.factory.class | Specifies the factory class for creating the HMS client. |
spark.hive.metastore.kerberos.principal | The principal for HMS in a Kerberos environment. |
spark.hive.metastore.sasl.enabled | Specifies whether to enable Kerberos authentication. Set it to true. |
spark.emr.serverless.network.service.name | The name of the network connection. |
In an HA cluster environment, you can configure multiple Thrift addresses for `metastore.uris`. Separate the addresses with commas. You must use hostnames, not IP addresses.
If you specify only one Thrift address for `metastore.uris`, you can use an IP address. However, `metastore.kerberos.principal` must be in the format `hive/<hostname of HMS>@<REALM>`.
You can simplify `metastore.kerberos.principal` to the `hive/_HOST@<REALM>` format only when `metastore.uris` uses a hostname.
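The three rules above can be checked before you save the configuration. The following is a minimal sketch, not an official validation tool. The function names is_ipv4 and check_hms_conf are hypothetical, and the check recognizes only dotted IPv4 addresses as IP addresses.

```shell
# Returns 0 if the argument looks like a dotted-quad IPv4 address.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
}

# Hypothetical checker for the rules above.
# $1: comma-separated metastore URIs, $2: HMS Kerberos principal.
check_hms_conf() {
  local uris="$1" principal="$2"
  local count=0 has_ip=0 uri host
  while IFS= read -r uri; do
    [ -n "$uri" ] || continue
    count=$((count + 1))
    host="${uri#thrift://}"   # drop the scheme
    host="${host%%:*}"        # drop the port
    if is_ipv4 "$host"; then has_ip=1; fi
  done <<EOF
$(printf '%s' "$uris" | tr ',' '\n')
EOF
  # Rule 1: several addresses (HA) must all use hostnames.
  if [ "$count" -gt 1 ] && [ "$has_ip" -eq 1 ]; then
    echo "error: multiple metastore URIs must use hostnames, not IP addresses"
    return 1
  fi
  # Rules 2 and 3: hive/_HOST is allowed only when the URI uses a hostname.
  case "$principal" in
    hive/_HOST@*)
      if [ "$has_ip" -eq 1 ]; then
        echo "error: hive/_HOST requires a hostname in metastore.uris"
        return 1
      fi
      ;;
  esac
  echo "ok"
}
```

For example, a single IP address combined with hive/_HOST@<REALM> is rejected, while the same IP address combined with hive/<hostname of HMS>@<REALM> passes.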
Save the configurations and start the Kyuubi Gateway.
Submit a job
You can use a show databases job to verify that the Kerberos cluster can connect to the Kyuubi Gateway and successfully start a Spark job.
Prepare a Kerberos user with the required permissions and export its keytab file.
Run the following commands to create the principal and export the keytab file.
kadmin.local
addprinc -randkey hadoop
xst -kt /root/hadoop.keytab hadoop
quit
Use the keytab file for Kerberos authentication.
kinit -kt /root/hadoop.keytab hadoop
Run the following command to connect to the Kyuubi Gateway and start a Spark job.
/opt/apps/KYUUBI/kyuubi-1.9.2-1.0.0/bin/kyuubi-beeline -u 'jdbc:hive2://ep-xxxxxxxxxxx.epsrv-xxxxxxxxxxx.cn-hangzhou.privatelink.aliyuncs.com:10009/;principal=kyuubi/_HOST@EMR.C-DFD43*****7C204.COM'
After you connect, run show databases.
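The authentication and connection steps can be combined into one non-interactive check. The following is a minimal sketch. It assumes that kyuubi-beeline accepts the standard Beeline -e option for running a single statement, and that the client is installed at the path used in this topic. The function names build_jdbc_url and run_show_databases are hypothetical.

```shell
# Hypothetical helper: builds the JDBC URL for the Kyuubi Gateway.
# $1: endpoint or custom domain name, $2: Kerberos realm, $3: port (defaults to 10009).
build_jdbc_url() {
  local host="$1" realm="$2" port="${3:-10009}"
  printf 'jdbc:hive2://%s:%s/;principal=kyuubi/_HOST@%s\n' "$host" "$port" "$realm"
}

# Obtains a ticket from the keytab, then runs one statement through kyuubi-beeline.
# Assumption: -e behaves as it does in Beeline (execute the statement and exit).
run_show_databases() {
  local keytab="$1" user="$2" url="$3"
  kinit -kt "$keytab" "$user" &&
  /opt/apps/KYUUBI/kyuubi-1.9.2-1.0.0/bin/kyuubi-beeline -u "$url" -e 'show databases;'
}
```

For example, run_show_databases /root/hadoop.keytab hadoop "$(build_jdbc_url test.kyuubi-kerberos.abc <REALM>)" returns the database list if both the network path and the Kerberos configuration work.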
If your Spark job needs to connect to an HMS or Hadoop Distributed File System (HDFS) service that has Kerberos authentication enabled, you must modify the core-site.xml file for the HADOOP-COMMON or HDFS component on the cluster. Add the following two configurations to the file. This allows the kyuubi user to impersonate other users when accessing the HDFS or HMS service. If you do not add these configurations, the connection may fail.
hadoop.proxyuser.kyuubi.hosts = *
hadoop.proxyuser.kyuubi.groups = *
Newer versions of EMR DataLake clusters include these parameters by default. After you add the parameters, you must restart the HDFS or HMS service.
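To see whether a cluster node already has the two proxy user parameters, you can query the local Hadoop client configuration. The following is a minimal sketch that uses hdfs getconf -confKey. The function names proxyuser_keys and show_proxyuser_conf are hypothetical. The command reads the configuration files on the node where it runs, so it does not prove that a running service has been restarted with the new values.

```shell
# Hypothetical helper: prints the two proxy user keys for the given user.
proxyuser_keys() {
  local user="$1"
  printf 'hadoop.proxyuser.%s.hosts\nhadoop.proxyuser.%s.groups\n' "$user" "$user"
}

# Prints each key with its value from the local Hadoop configuration,
# or <not set> when hdfs getconf cannot find the key.
show_proxyuser_conf() {
  local user="$1" key
  for key in $(proxyuser_keys "$user"); do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key" 2>/dev/null || echo '<not set>')"
  done
}
```

Run show_proxyuser_conf kyuubi on a cluster node. If either key is reported as <not set>, add it to core-site.xml as described above.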