EMR Serverless StarRocks uses Alibaba Cloud DNS PrivateZone for FE node access by default. PrivateZone does not support load balancing, so connections may concentrate on a single Frontend (FE) node while others remain idle. Under high concurrency, this creates a bottleneck.
Two complementary techniques solve this problem:
Server Load Balancer (SLB) distributes connections evenly across FE nodes at Layer 4 (TCP).
Client-side connection pools reuse connections, manage lifecycles, and prevent leaks.
Together, they deliver high-concurrency, high-availability database access.
+-------------------+
| App / BI Tool |
+--------+----------+
|
+--------v----------+
| Connection Pool |
| (HikariCP) |
+--------+----------+
|
+--------v----------+
| SLB (Layer 4) |
+--+------+------+--+
| | |
+----v+ +---v--+ +-v----+
| FE 1| | FE 2 | | FE 3 |
+-----+ +------+ +------+
| | |
+--v------v------v--+
| BE Nodes |
+--------------------+Prerequisites
Three or more FE nodes
Client IP address in the security group for internal access or the whitelist for internet access
Step 1: Enable SLB access
Enable SLB to distribute connections across all FE nodes instead of routing through a single PrivateZone endpoint.
Go to the details page of an EMR Serverless StarRocks instance.
Log on to the EMR console.
In the left-side navigation pane, choose .
In the top navigation bar, select a region based on your business requirements.
In the Instances section, find the desired StarRocks instance and click the name of the instance.
In the Gateway Information section, click Activate SLB.
Click OK.
Step 2: Configure a client-side connection pool
A connection pool sits between your application and SLB. It provides five key benefits:
| Benefit | What it does |
|---|---|
| Connection reuse | Reduces the overhead of creating and destroying TCP connections. Lowers system resource consumption. |
| Lifecycle management | Manages connection lifecycles centrally to prevent connection leaks. |
| Concurrency control | Queues excess connection requests during peak hours instead of failing immediately. |
| Validity checks | Detects and removes invalid connections automatically. Each query uses a healthy connection. |
| Monitoring and tuning | Exposes metrics such as the number of FE connections for real-time monitoring. |
SDK method (Java and HikariCP example)
Maven dependencies
Add the following dependencies to the pom.xml file of your project.
<dependency>
<groupId>com.zaxxer</groupId>
<artifactId>HikariCP</artifactId>
<version>5.0.1</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.33</version>
</dependency>Starting from version 8.0.31, Oracle renamed the MySQL JDBC driver artifact frommysql:mysql-connector-javatocom.mysql:mysql-connector-j. Version 8.0.33 is published under both artifact IDs. For new projects, use the updated coordinates:
<groupId>com.mysql</groupId>
<artifactId>mysql-connector-j</artifactId>
<version>8.0.33</version>HikariCP connection pool configuration
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
public class StarRocksConnectionPoolExample {
public static void main(String[] args) {
// Configure the HikariCP connection pool
HikariConfig config = new HikariConfig();
// Use the internal same-region SLB endpoint. Replace the placeholder
// with the endpoint shown in the Gateway Information section.
config.setJdbcUrl("jdbc:mysql://fe-c-xxx-internal.starrocks.aliyuncs.com:9030/<yourDatabase>");
config.setUsername("<yourUsername>");
config.setPassword("<yourPassword>");
// --- Core pool parameters ---
// Maximum connections in the pool. Start with 20 for typical OLAP workloads.
// StarRocks queries are longer-running than OLTP queries, so fewer
// connections are usually sufficient.
config.setMaximumPoolSize(20);
// Keep at least 5 idle connections warm to avoid cold-start latency
// when traffic resumes after a quiet period.
config.setMinimumIdle(5);
// Wait up to 30 seconds for a connection from the pool.
// If exceeded, an exception is thrown.
config.setConnectionTimeout(30000);
// Close idle connections after 10 minutes.
// Frees resources when traffic drops.
config.setIdleTimeout(600000);
// Force-close and recreate every connection after 30 minutes.
// This ensures connections rotate across SLB backends and
// prevents stale connections after FE node changes.
config.setMaxLifetime(1800000);
// Validate connections before use. StarRocks supports SELECT 1
// as a lightweight health check.
config.setConnectionTestQuery("SELECT 1");
// Allow up to 5 seconds for the validation query to complete.
config.setValidationTimeout(5000);
// Log a warning if a connection is checked out for more than
// 1 minute. Helps detect connection leaks in application code.
config.setLeakDetectionThreshold(60000);
// Create the data source
HikariDataSource dataSource = new HikariDataSource(config);
// Use the connection pool to execute a query
try (Connection conn = dataSource.getConnection();
PreparedStatement stmt = conn.prepareStatement("SELECT count(*) FROM your_table");
ResultSet rs = stmt.executeQuery()) {
if (rs.next()) {
System.out.println("Count: " + rs.getLong(1));
}
} catch (Exception e) {
e.printStackTrace();
}
}
}BI tool configuration
Common BI tools such as Tableau, Superset, Metabase, and Quick BI support connection pool parameters in their advanced options. Set pool parameters (maximum connections, idle timeout, validation query) in the connection or data source configuration of your chosen BI tool.
Step 3: Verify load balancing
Go to the Monitoring And Alerting page of the target StarRocks instance.
Click the Instance tab.
In the FE Net section, view the FE Connections monitoring graph.
Confirm that connections are evenly distributed among all FE nodes.
If the distribution is uneven, adjust client-side connection pool parameters or check the SLB configuration.
Step 4: Tune connection pool parameters
Adjust the following parameters to match your workload.
| Parameter | Recommended value | Guidance |
|---|---|---|
maximumPoolSize | 10 to 50 | Set the upper limit of concurrent database connections. Start with 20 for typical OLAP workloads. Too many connections waste resources and increase context-switching overhead. |
minimumIdle | 5 to 10 | Keep enough idle connections warm to avoid cold-start latency when traffic resumes after a quiet period. |
connectionTimeout | 30 s | Set the maximum wait time for a connection from the pool. The pool throws an exception when this time is exceeded. |
idleTimeout | 600 s (10 min) | Control how long an idle connection stays in the pool before it is closed. Lower this value to free resources faster when traffic drops. |
maxLifetime | 1800 s (30 min) | Force-close and recreate connections after this interval. Shorter values rotate connections across SLB backends more frequently. |
connectionTestQuery | SELECT 1 | Validate connection health before each use. StarRocks supports this statement as a lightweight health check. |
Troubleshooting
Connection pool exhaustion or timeout errors
Cause: The pool is too small, or application code does not return connections. Missing try-with-resources blocks cause connection leaks.
Solution: Increase maximumPoolSize. Wrap every getConnection() call in a try-with-resources block. Check the leakDetectionThreshold log warnings to find leaked connections.
Uneven load distribution across FE nodes
Cause: Long-lived connections stay bound to the same FE node. SLB distributes connections only at creation time, so persistent connections cause imbalance over time.
Solution: Reduce maxLifetime to rotate connections more frequently. Verify the distribution in the FE Connections monitoring graph.
"Connection refused" errors after an FE node restart
Cause: SLB health checks take time to detect that a node is back online. During this interval, stale pool connections may target the restarted node.
Solution: Set connectionTestQuery to SELECT 1. The pool validates each connection before use and discards any that fail the health check.
Slow query performance with a connection pool
Cause: An oversized pool creates excessive context switching on FE nodes. StarRocks OLAP queries are typically long-running, so fewer concurrent connections perform better.
Solution: Reduce maximumPoolSize. Monitor FE node CPU utilization and adjust until throughput stabilizes.