Alibaba Cloud Model Studio: Configure the connection pool for the DashScope Java SDK

Last Updated: Mar 26, 2025

When calling models in high-concurrency scenarios, you may encounter request timeouts or high resource consumption. You can optimize connection reuse by configuring the connection pool.

Connection pool

In high-concurrency scenarios, frequently creating and destroying network connections causes significant performance overhead. The DashScope Java SDK therefore enables connection pooling by default, which reduces resource consumption and improves request processing efficiency through connection reuse.

You can optimize connection reuse by adjusting the number of connections and timeout settings. Here are the related parameters:

| Parameter | Description | Default value | Unit | Note |
| --- | --- | --- | --- | --- |
| connectTimeout | The timeout period for establishing a connection. | 120 | seconds | In low-latency scenarios, you typically need a shorter connection timeout to reduce waiting time and improve response speed. |
| readTimeout | The timeout period for reading data. | 300 | seconds | |
| writeTimeout | The timeout period for writing data. | 60 | seconds | |
| connectionIdleTimeout | The timeout period for idle connections in the connection pool. | 300 | seconds | In high-concurrency scenarios, extending the idle connection timeout helps avoid frequent connection creation and reduces resource consumption. |
| connectionPoolSize | The maximum number of connections in the connection pool. | 32 | connections | In high-concurrency scenarios: if the number of connections is too low, requests may be blocked or time out, or connections may be created frequently, which increases resource consumption. If the number of connections is too high, the server may be overloaded. Adjust the value based on your business requirements. |
| maximumAsyncRequests | The maximum number of concurrent requests. This is a global limit that covers all hosts. This value must be less than or equal to the maximum number of connections. Otherwise, requests may be blocked. | 32 | requests | |
| maximumAsyncRequestsPerHost | The maximum number of concurrent requests per host. This value must be less than or equal to the maximum number of concurrent requests. | 32 | requests | |
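
The ordering rules between the three size parameters (per-host limit ≤ global request limit ≤ pool size) are easy to break when only one value is tuned. The following is a minimal sketch of a sanity check you could run on your chosen values before applying them. The class `PoolSettingsCheck` and its `validate` method are hypothetical names used for illustration only; they are not part of the DashScope SDK, and the sketch uses only the Java standard library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the DashScope SDK.
class PoolSettingsCheck {

    /**
     * Checks the two ordering rules stated in the parameter descriptions.
     * Returns one message per violated rule; an empty list means the values are consistent.
     */
    static List<String> validate(int connectionPoolSize,
                                 int maximumAsyncRequests,
                                 int maximumAsyncRequestsPerHost) {
        List<String> problems = new ArrayList<>();
        if (maximumAsyncRequests > connectionPoolSize) {
            problems.add("maximumAsyncRequests (" + maximumAsyncRequests
                    + ") exceeds connectionPoolSize (" + connectionPoolSize + ")");
        }
        if (maximumAsyncRequestsPerHost > maximumAsyncRequests) {
            problems.add("maximumAsyncRequestsPerHost (" + maximumAsyncRequestsPerHost
                    + ") exceeds maximumAsyncRequests (" + maximumAsyncRequests + ")");
        }
        return problems;
    }

    public static void main(String[] args) {
        // The documented defaults satisfy both rules.
        System.out.println(validate(32, 32, 32)); // prints []
        // Raising only the request limits, but not the pool size, violates the first rule.
        System.out.println(validate(32, 256, 256));
    }
}
```

If the returned list is not empty, adjust the values so that all three limits are raised together.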

Sample requests

Install the SDK

After you install Java, install the DashScope Java SDK. For information about the SDK versions, see DashScope Java SDK. Add the SDK dependency to your project as described below, and replace the-latest-version with the actual latest version number.

XML

  1. Open your Maven project's pom.xml file.

  2. Add the dependency information within the <dependencies> tag.

    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>dashscope-sdk-java</artifactId>
        <!-- Please replace 'the-latest-version' with the latest version number: https://mvnrepository.com/artifact/com.alibaba/dashscope-sdk-java -->
        <version>the-latest-version</version>
    </dependency>
  3. Save pom.xml.

  4. Use Maven commands, such as mvn compile or mvn clean install, to refresh your project's dependencies. DashScope Java SDK will be downloaded and integrated into your project automatically.


Gradle

  1. Open your Gradle project's build.gradle file.

  2. Add the dependency information within the dependencies block.

    dependencies {
        // Please replace 'the-latest-version' with the latest version number: https://mvnrepository.com/artifact/com.alibaba/dashscope-sdk-java
        implementation group: 'com.alibaba', name: 'dashscope-sdk-java', version: 'the-latest-version'
    }
  3. Save build.gradle.

  4. Navigate to your project's root directory in the command line and run the following Gradle command to refresh your project dependencies. DashScope Java SDK will be downloaded and integrated into your project automatically.

    ./gradlew build --refresh-dependencies


The following sample code shows how to configure connection pool parameters, such as timeout periods and maximum number of connections, and call the model. You can adjust these parameters to optimize concurrent performance and resource utilization based on your requirements.

// We recommend DashScope SDK version >= 2.12.0
import java.time.Duration;
import java.util.Arrays;

import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.protocol.ConnectionConfigurations;
// Protocol is used below in Protocol.HTTP.getValue()
import com.alibaba.dashscope.protocol.Protocol;
import com.alibaba.dashscope.utils.Constants;

public class Main {
    public static GenerationResult callWithMessage() throws ApiException, NoApiKeyException, InputRequiredException {
        Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-intl.aliyuncs.com/api/v1");
        Message systemMsg = Message.builder()
                .role(Role.SYSTEM.getValue())
                .content("You are a helpful assistant.")
                .build();
        Message userMsg = Message.builder()
                .role(Role.USER.getValue())
                .content("Who are you?")
                .build();
        GenerationParam param = GenerationParam.builder()
                // If environment variables are not configured, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // This example uses qwen-plus. You can change the model name as needed. For the model list, visit: https://www.alibabacloud.com/help/en/model-studio/getting-started/models
                .model("qwen-plus")
                .messages(Arrays.asList(systemMsg, userMsg))
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .build();
                
        System.out.println(userMsg.getContent());
        return gen.call(param);
    }
    public static void main(String[] args) {
        // Connection pool configuration
        Constants.connectionConfigurations = ConnectionConfigurations.builder()
                .connectTimeout(Duration.ofSeconds(10)) // Connection establishment timeout, default: 120s
                .readTimeout(Duration.ofSeconds(300)) // Data read timeout, default: 300s
                .writeTimeout(Duration.ofSeconds(60)) // Data write timeout, default: 60s
                .connectionIdleTimeout(Duration.ofSeconds(300)) // Idle connection timeout in the connection pool, default: 300s
                .connectionPoolSize(256) // Maximum number of connections in the connection pool, default: 32
                .maximumAsyncRequests(256) // Maximum number of concurrent requests, default: 32
                .maximumAsyncRequestsPerHost(256) // Maximum number of concurrent requests per host, default: 32
                .build();

        try {
            GenerationResult result = callWithMessage();
            System.out.println(result.getOutput().getChoices().get(0).getMessage().getContent());
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            // Use a logging framework to record exception information
            System.err.println("An error occurred while calling the generation service: " + e.getMessage());
        }
        System.exit(0);
    }
}
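
The sample above issues a single call, so the pool limits never come into play. One way to keep application-side concurrency within the configured limits is to submit calls through a fixed-size thread pool. The following is a minimal sketch under that assumption, using only the Java standard library. `BoundedCalls` and `runAll` are hypothetical names, and each task uses a short sleep as a placeholder for a blocking call such as `gen.call(param)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not part of the DashScope SDK.
class BoundedCalls {

    /** Runs all tasks on at most maxInFlight threads and returns results in submission order. */
    static <T> List<T> runAll(List<Callable<T>> tasks, int maxInFlight) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxInFlight);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Callable<String>> tasks = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            final int id = i;
            tasks.add(() -> {
                // Track how many tasks run at the same time.
                peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                try {
                    Thread.sleep(20); // placeholder for a blocking model call
                } finally {
                    inFlight.decrementAndGet();
                }
                return "reply-" + id;
            });
        }
        List<String> replies = runAll(tasks, 4);
        System.out.println(replies.size() + " replies, peak in-flight = " + peak.get());
    }
}
```

In a real application, you would choose `maxInFlight` so that it does not exceed `maximumAsyncRequests`, and replace the placeholder with the actual model call.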

Error code

If a call fails and returns an error message, see Error messages.