Optimize StarRocks access performance with SLB and client-side connection pools - E-MapReduce

EMR Serverless StarRocks uses Alibaba Cloud DNS PrivateZone for FE node access by default. PrivateZone does not support load balancing, so connections may concentrate on a single Frontend (FE) node while others remain idle. Under high concurrency, this creates a bottleneck.

Two complementary techniques solve this problem:

Server Load Balancer (SLB) distributes connections evenly across FE nodes at Layer 4 (TCP).
Client-side connection pools reuse connections, manage lifecycles, and prevent leaks.

Together, they deliver high-concurrency, high-availability database access.

                          +-------------------+
                          |   App / BI Tool   |
                          +--------+----------+
                                   |
                          +--------v----------+
                          |  Connection Pool   |
                          |    (HikariCP)      |
                          +--------+----------+
                                   |
                          +--------v----------+
                          |   SLB (Layer 4)    |
                          +--+------+------+--+
                             |      |      |
                        +----v+ +---v--+ +-v----+
                        | FE 1| | FE 2 | | FE 3 |
                        +-----+ +------+ +------+
                             |      |      |
                          +--v------v------v--+
                          |    BE Nodes        |
                          +--------------------+

Prerequisites

Three or more FE nodes
Client IP address in the security group for internal access or the whitelist for internet access

Step 1: Enable SLB access

Enable SLB to distribute connections across all FE nodes instead of routing through a single PrivateZone endpoint.

Go to the details page of an EMR Serverless StarRocks instance.
1. Log on to the EMR console.
2. In the left-side navigation pane, choose EMR Serverless > StarRocks.
3. In the top navigation bar, select a region based on your business requirements.
4. In the Instances section, find the desired StarRocks instance and click the name of the instance.
In the Gateway Information section, click Activate SLB.
Click OK.

Step 2: Configure a client-side connection pool

A connection pool sits between your application and SLB. It provides five key benefits:

Benefit	What it does
Connection reuse	Reduces the overhead of creating and destroying TCP connections. Lowers system resource consumption.
Lifecycle management	Manages connection lifecycles centrally to prevent connection leaks.
Concurrency control	Queues excess connection requests during peak hours instead of failing immediately.
Validity checks	Detects and removes invalid connections automatically. Each query uses a healthy connection.
Monitoring and tuning	Exposes metrics such as the number of FE connections for real-time monitoring.

SDK method (Java and HikariCP example)

Maven dependencies

Add the following dependencies to the pom.xml file of your project.

<dependency>
    <groupId>com.zaxxer</groupId>
    <artifactId>HikariCP</artifactId>
    <version>5.0.1</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.33</version>
</dependency>

Starting from version 8.0.31, Oracle renamed the MySQL JDBC driver artifact from mysql:mysql-connector-java to com.mysql:mysql-connector-j. Version 8.0.33 is published under both artifact IDs. For new projects, use the updated coordinates:

<groupId>com.mysql</groupId>
<artifactId>mysql-connector-j</artifactId>
<version>8.0.33</version>

HikariCP connection pool configuration

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class StarRocksConnectionPoolExample {
    public static void main(String[] args) {
        // Configure the HikariCP connection pool
        HikariConfig config = new HikariConfig();

        // Use the internal same-region SLB endpoint. Replace the placeholder
        // with the endpoint shown in the Gateway Information section.
        config.setJdbcUrl("jdbc:mysql://fe-c-xxx-internal.starrocks.aliyuncs.com:9030/<yourDatabase>");
        config.setUsername("<yourUsername>");
        config.setPassword("<yourPassword>");

        // --- Core pool parameters ---

        // Maximum connections in the pool. Start with 20 for typical OLAP workloads.
        // StarRocks queries are longer-running than OLTP queries, so fewer
        // connections are usually sufficient.
        config.setMaximumPoolSize(20);

        // Keep at least 5 idle connections warm to avoid cold-start latency
        // when traffic resumes after a quiet period.
        config.setMinimumIdle(5);

        // Wait up to 30 seconds for a connection from the pool.
        // If exceeded, an exception is thrown.
        config.setConnectionTimeout(30000);

        // Close idle connections after 10 minutes.
        // Frees resources when traffic drops.
        config.setIdleTimeout(600000);

        // Force-close and recreate every connection after 30 minutes.
        // This ensures connections rotate across SLB backends and
        // prevents stale connections after FE node changes.
        config.setMaxLifetime(1800000);

        // Validate connections before use. StarRocks supports SELECT 1
        // as a lightweight health check.
        config.setConnectionTestQuery("SELECT 1");

        // Allow up to 5 seconds for the validation query to complete.
        config.setValidationTimeout(5000);

        // Log a warning if a connection is checked out for more than
        // 1 minute. Helps detect connection leaks in application code.
        config.setLeakDetectionThreshold(60000);

        // Create the data source
        HikariDataSource dataSource = new HikariDataSource(config);

        // Use the connection pool to execute a query
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement("SELECT count(*) FROM your_table");
             ResultSet rs = stmt.executeQuery()) {

            if (rs.next()) {
                System.out.println("Count: " + rs.getLong(1));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

BI tool configuration

Common BI tools such as Tableau, Superset, Metabase, and Quick BI support connection pool parameters in their advanced options. Set pool parameters (maximum connections, idle timeout, validation query) in the connection or data source configuration of your chosen BI tool.

Step 3: Verify load balancing

Go to the Monitoring And Alerting page of the target StarRocks instance.
Click the Instance tab.
In the FE Net section, view the FE Connections monitoring graph.
Confirm that connections are evenly distributed among all FE nodes.

If the distribution is uneven, adjust client-side connection pool parameters or check the SLB configuration.

Step 4: Tune connection pool parameters

Adjust the following parameters to match your workload.

Parameter	Recommended value	Guidance
`maximumPoolSize`	10 to 50	Set the upper limit of concurrent database connections. Start with 20 for typical OLAP workloads. Too many connections waste resources and increase context-switching overhead.
`minimumIdle`	5 to 10	Keep enough idle connections warm to avoid cold-start latency when traffic resumes after a quiet period.
`connectionTimeout`	30 s	Set the maximum wait time for a connection from the pool. The pool throws an exception when this time is exceeded.
`idleTimeout`	600 s (10 min)	Control how long an idle connection stays in the pool before it is closed. Lower this value to free resources faster when traffic drops.
`maxLifetime`	1800 s (30 min)	Force-close and recreate connections after this interval. Shorter values rotate connections across SLB backends more frequently.
`connectionTestQuery`	`SELECT 1`	Validate connection health before each use. StarRocks supports this statement as a lightweight health check.

Troubleshooting

Connection pool exhaustion or timeout errors

Cause: The pool is too small, or application code does not return connections. Missing try-with-resources blocks cause connection leaks.

Solution: Increase maximumPoolSize. Wrap every getConnection() call in a try-with-resources block. Check the leakDetectionThreshold log warnings to find leaked connections.

Uneven load distribution across FE nodes

Cause: Long-lived connections stay bound to the same FE node. SLB distributes connections only at creation time, so persistent connections cause imbalance over time.

Solution: Reduce maxLifetime to rotate connections more frequently. Verify the distribution in the FE Connections monitoring graph.

"Connection refused" errors after an FE node restart

Cause: SLB health checks take time to detect that a node is back online. During this interval, stale pool connections may target the restarted node.

Solution: Set connectionTestQuery to SELECT 1. The pool validates each connection before use and discards any that fail the health check.

Slow query performance with a connection pool

Cause: An oversized pool creates excessive context switching on FE nodes. StarRocks OLAP queries are typically long-running, so fewer concurrent connections perform better.

Solution: Reduce maximumPoolSize. Monitor FE node CPU utilization and adjust until throughput stabilizes.