By default, any client with a valid token can connect to a Spark Thrift Server and run SQL queries. Enable Lightweight Directory Access Protocol (LDAP) authentication to require a username and password, so that only users with valid LDAP credentials can connect.
Version requirements
LDAP authentication requires the following minimum engine versions for EMR Serverless Spark:
| Engine series | Minimum version |
|---|---|
| esr-4.x | esr-4.2.0 |
| esr-3.x | esr-3.0.1 |
| esr-2.x | esr-2.4.1 |
Prerequisites
Before you begin, ensure that you have:
A Spark Thrift Server session. For more information, see Manage Spark Thrift Server sessions.
(Optional) If you plan to use the OpenLDAP service from an Alibaba Cloud EMR on ECS cluster: a cluster with the OpenLDAP service enabled, and at least one user added. For more information, see Create a cluster and OpenLDAP user management.
Step 1: Prepare the network
Configure network connectivity between EMR Serverless Spark and your virtual private cloud (VPC) so that the Spark Thrift Server can reach the LDAP server. For more information, see Network connectivity between EMR Serverless Spark and other VPCs.
Step 2: Configure LDAP startup parameters
Stop the Spark Thrift Server.
From the Network Connectivity drop-down list, select the network connection you created.
Add the following parameters in Spark Configuration:
spark.hive.server2.authentication LDAP
spark.hive.server2.authentication.ldap.url ldap://<ldap_url>:<ldap_port>
spark.hive.server2.authentication.ldap.baseDN <ldap_base_dn>
Replace the placeholders with values for your LDAP server:
| Placeholder | Description | Example (EMR on ECS OpenLDAP) |
|---|---|---|
| <ldap_url> | IP address or hostname of the LDAP server | Internal IP address of the master node |
| <ldap_port> | Port of the LDAP server | 10389 |
| <ldap_base_dn> | Base DN for LDAP authentication | ou=people,o=emr |
Note: For high availability (HA) LDAP deployments, separate multiple LDAP addresses with a space: ldap://<ldap_url_1>:<ldap_port> ldap://<ldap_url_2>:<ldap_port>.
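As an illustration of how the three Spark Configuration entries fit together, including the space-separated URL list for HA deployments, the following sketch prints them from a base DN and one or more host:port pairs. The function name render_ldap_conf and the sample addresses are hypothetical; only the parameter names and the value format come from this page.

```shell
# Print the Spark Configuration entries for LDAP authentication.
# Usage: render_ldap_conf <ldap_base_dn> <host:port> [<host:port> ...]
# (render_ldap_conf is an illustrative helper, not part of EMR tooling.)
render_ldap_conf() {
  base_dn=$1
  shift
  urls=""
  for addr in "$@"; do
    # Join multiple LDAP addresses with a single space (HA deployments).
    urls="${urls:+$urls }ldap://$addr"
  done
  printf 'spark.hive.server2.authentication LDAP\n'
  printf 'spark.hive.server2.authentication.ldap.url %s\n' "$urls"
  printf 'spark.hive.server2.authentication.ldap.baseDN %s\n' "$base_dn"
}
```

For example, render_ldap_conf 'ou=people,o=emr' 192.168.0.10:10389 192.168.0.11:10389 prints an HA configuration with two LDAP addresses on the ldap.url line. The output can then be pasted into Spark Configuration.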
Restart the Spark Thrift Server to apply the configuration.
Step 3: Connect to the Spark Thrift Server
After LDAP authentication is enabled, clients must supply credentials when connecting. Collect the following values before connecting:
| Placeholder | Description |
|---|---|
| <endpoint> | Endpoint (Public) or Endpoint (Internal) from the Overview tab. Internal endpoints are accessible only within the same VPC. |
| <port> | 443 for a public endpoint; 80 for an internal same-region endpoint |
| <token> | Token from the Token Management tab |
| <username> | LDAP username. For OpenLDAP on EMR on ECS, use the username from the User Management page. |
| <password> | LDAP password corresponding to <username> |
Method 1: Beeline
beeline -u 'jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token>' \
-n <username> \
-p <password>
Method 2: JDBC URL
Build a JDBC URL for Java or other JDBC-compatible applications:
jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token>;user=<username>;password=<password>
Troubleshooting
Authentication fails immediately after configuration
LDAP settings take effect only after a full restart of the Spark Thrift Server. Stop the server, wait for it to stop completely, then start it again. Also confirm that spark.hive.server2.authentication is set to LDAP (case-sensitive) in Spark Configuration.
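One way to confirm that the restarted server is enforcing LDAP is a quick smoke test from Beeline. The sketch below assumes that beeline is on the PATH. The function names build_jdbc_url and ldap_smoke_test are illustrative; the URL format and the -u, -n, and -p options are the ones used in Step 3, and -e is Beeline's option for running a single query.

```shell
# Build the JDBC URL from <endpoint>, <port>, and <token> (see Step 3).
build_jdbc_url() {
  printf 'jdbc:hive2://%s:%s/;transportMode=http;httpPath=cliservice/token/%s' "$1" "$2" "$3"
}

# Run a trivial query with the given LDAP credentials.
# Usage: ldap_smoke_test <endpoint> <port> <token> <username> <password>
# (Illustrative helper; assumes the beeline client is installed locally.)
ldap_smoke_test() {
  beeline -u "$(build_jdbc_url "$1" "$2" "$3")" -n "$4" -p "$5" -e 'SELECT 1;'
}
```

Call ldap_smoke_test once with valid credentials and once with a deliberately wrong password. If LDAP authentication is active, the first call should return a result and the second should be rejected. If both succeed, the server is likely still running with the old configuration.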
Cannot connect to the LDAP server
If the Spark Thrift Server cannot reach the LDAP server, check the network configuration from Step 1. Confirm that the VPC routes and security group rules allow outbound traffic from EMR Serverless Spark to <ldap_url>:<ldap_port>. For HA deployments, verify that every LDAP address is reachable.
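A rough way to check that each LDAP address is listening is a TCP probe from a machine in the same VPC, for example an ECS instance. The sketch below assumes nc (netcat) is installed; parse_ldap_urls and check_ldap_reachable are hypothetical helper names. A successful probe shows only that the LDAP port is open from that machine. It does not prove that the route from EMR Serverless Spark itself works.

```shell
# Split a space-separated LDAP URL list into "host port" lines.
# Handles hostnames and IPv4 addresses of the form ldap://host:port.
parse_ldap_urls() {
  for url in $1; do            # intentionally unquoted: split on spaces
    hostport=${url#ldap://}    # strip the scheme
    printf '%s %s\n' "${hostport%:*}" "${hostport##*:}"
  done
}

# Probe every address with a 3-second TCP connect (requires nc).
# Usage: check_ldap_reachable 'ldap://<ldap_url_1>:<ldap_port> ldap://<ldap_url_2>:<ldap_port>'
check_ldap_reachable() {
  parse_ldap_urls "$1" | while read -r host port; do
    if nc -z -w 3 "$host" "$port" </dev/null; then
      echo "reachable: $host:$port"
    else
      echo "UNREACHABLE: $host:$port"
    fi
  done
}
```

Passing the same string that is set in spark.hive.server2.authentication.ldap.url keeps the check aligned with what the server is configured to use.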
Invalid credentials error
Check <ldap_base_dn>. For OpenLDAP on EMR on ECS, the correct value is ou=people,o=emr. A mismatched base DN can cause "invalid credentials" errors even when the username and password are correct.
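To separate a base DN problem from a password problem, the same credentials can be tested directly against the LDAP server with the OpenLDAP ldapsearch client. This is a sketch under one important assumption: that the server binds users as uid=<username>,<ldap_base_dn>, which is the conventional form for Hive-style LDAP authentication but is not stated on this page. The helper names ldap_bind_dn and ldap_bind_check are hypothetical.

```shell
# Build the bind DN. ASSUMPTION: users are bound as uid=<username>,<ldap_base_dn>.
ldap_bind_dn() {
  printf 'uid=%s,%s' "$1" "$2"
}

# Attempt a simple bind and read only the base entry.
# Usage: ldap_bind_check <ldap_url> <ldap_port> <ldap_base_dn> <username>
#   -x       simple authentication
#   -W       prompt for the password instead of passing it on the command line
#   -s base  search only the base DN entry
#   1.1      request no attributes
ldap_bind_check() {
  ldapsearch -x -H "ldap://$1:$2" -D "$(ldap_bind_dn "$4" "$3")" -W -b "$3" -s base 1.1
}
```

If the bind is rejected with LDAP result code 49 (invalid credentials) for a password known to be correct, the base DN or the assumed uid= prefix is the likely mismatch.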