Alibaba Cloud Model Studio: High-concurrency scenarios

Last Updated: Mar 18, 2026

CosyVoice uses WebSocket for real-time audio streaming. Creating a connection for every request adds latency in high-concurrency scenarios. The DashScope SDK provides connection pools and object pools that reuse connections, eliminating setup costs and reducing first-packet latency.

Prerequisites

Python SDK: Object pool optimization

The Python SDK uses SpeechSynthesizerObjectPool to manage and reuse SpeechSynthesizer objects. The pool creates a specified number of instances with pre-established WebSocket connections. Borrowing an object returns one with an active connection for immediate use. Returning the object keeps its connection open for reuse.

Implementation steps

  1. Install dependencies.

       pip install -U dashscope
  2. Create and configure the object pool. Set pool size to 1.5x to 2x your peak concurrency (do not exceed your QPS limit). The code below creates a global singleton pool that initializes SpeechSynthesizer objects and establishes connections (this takes time).

       from dashscope.audio.tts_v2 import SpeechSynthesizerObjectPool
    
       connectionPool = SpeechSynthesizerObjectPool(max_size=20)
  3. Borrow a SpeechSynthesizer object from the pool. If more objects are borrowed at the same time than the pool's maximum size allows, the SDK creates a new SpeechSynthesizer object. That object must establish its own connection, so it does not benefit from pooling.

       speech_synthesizer = connectionPool.borrow_synthesizer(
           model='cosyvoice-v3-flash',
           voice='longanyang',
           seed=12382,
           callback=synthesizer_callback
       )
  4. Perform speech synthesis. Call the call or streaming_call method of the SpeechSynthesizer object to synthesize speech.

  5. Return the SpeechSynthesizer object. After the speech synthesis task completes, return the SpeechSynthesizer object for reuse by later tasks.

    Important

    Do not return objects from incomplete or failed tasks.

       connectionPool.return_synthesizer(speech_synthesizer)
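
The sizing rule in step 2 can be written down as a small helper. This is a minimal sketch: recommended_pool_size, its parameters, and the default factor are illustrative names chosen here and are not part of the DashScope SDK.

```python
import math


def recommended_pool_size(peak_concurrency: int, qps_limit: int, factor: float = 1.5) -> int:
    """Return factor x peak concurrency, capped at the QPS limit.

    Hypothetical helper for illustration only; not a DashScope SDK function.
    """
    if not 1.5 <= factor <= 2.0:
        raise ValueError('factor should be between 1.5 and 2.0')
    if peak_concurrency < 1 or qps_limit < 1:
        raise ValueError('peak_concurrency and qps_limit must be positive')
    # Round up so the pool is never smaller than the requested multiple.
    return min(math.ceil(peak_concurrency * factor), qps_limit)
```

The result can be passed as max_size when creating SpeechSynthesizerObjectPool. The cap matters when twice the peak concurrency would exceed the QPS limit of your account.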

Complete code

#!/usr/bin/env python3
# Copyright (C) Alibaba Group. All Rights Reserved.
# MIT License (https://opensource.org/licenses/MIT)

import os
import time
import threading

import dashscope
from dashscope.audio.tts_v2 import *


USE_CONNECTION_POOL = True
text_to_synthesize = [
    'First sentence: Welcome to Alibaba Cloud speech synthesis.',
    'Second sentence: Welcome to Alibaba Cloud speech synthesis.',
    'Third sentence: Welcome to Alibaba Cloud speech synthesis.',
]
connectionPool = None
if USE_CONNECTION_POOL:
    print('creating connection pool')
    start_time = time.time() * 1000
    connectionPool = SpeechSynthesizerObjectPool(max_size=3)
    end_time = time.time() * 1000
    print('connection pool created, cost: {} ms'.format(end_time - start_time))

def init_dashscope_api_key():
    '''
    Set your DashScope API-key. More information:
    https://github.com/aliyun/alibabacloud-bailian-speech-demo/blob/master/PREREQUISITES.md
    '''
    # API keys differ between the Singapore and Beijing regions. Get your API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    if 'DASHSCOPE_API_KEY' in os.environ:
        dashscope.api_key = os.environ[
            'DASHSCOPE_API_KEY']  # load API-key from environment variable DASHSCOPE_API_KEY
    else:
        dashscope.api_key = '<your-dashscope-api-key>'  # set API-key manually


def synthesis_text_to_speech_and_play_by_streaming_mode(text, task_id):
    '''
    Synthesize speech from the given text with an asynchronous streaming call and
    handle the synthesized audio as it arrives (this sample writes it to a file).
    For more information, see https://www.alibabacloud.com/help/document_detail/2712523.html
    '''
    global USE_CONNECTION_POOL, connectionPool

    complete_event = threading.Event()

    # Define a callback to handle the result

    class Callback(ResultCallback):
        # Output file; stays None until on_open is called.
        file = None
        # Set to True in on_error so the caller can tell failure from success.
        failed = False

        def on_open(self):
            # when using the object pool, on_open is called after the task starts
            self.file = open(f'result_{task_id}.mp3', 'wb')
            print(f'[task_{task_id}] start')

        def on_complete(self):
            print(f'[task_{task_id}] speech synthesis task completed successfully.')
            complete_event.set()

        def on_error(self, message: str):
            print(f'[task_{task_id}] speech synthesis task failed, {message}')
            self.failed = True
            # wake up the waiting thread so it does not block forever
            complete_event.set()

        def on_close(self):
            # when using the object pool, on_close is called after the task finishes
            if self.file is not None:
                self.file.close()
            print(f'[task_{task_id}] finished')

        def on_event(self, message):
            # print(f'recv speech synthesis message {message}')
            pass

        def on_data(self, data: bytes) -> None:
            # save audio to file (or send it to a player)
            self.file.write(data)

    # Create the callback instance
    synthesizer_callback = Callback()

    # Initialize the speech synthesizer
    # you can customize the synthesis parameters, like voice, format, sample_rate or other parameters
    if USE_CONNECTION_POOL:
        speech_synthesizer = connectionPool.borrow_synthesizer(
            model='cosyvoice-v3-flash',
            voice='longanyang',
            seed=12382,
            callback=synthesizer_callback
        )
    else:
        speech_synthesizer = SpeechSynthesizer(model='cosyvoice-v3-flash',
                                               voice='longanyang',
                                               seed=12382,
                                               callback=synthesizer_callback)
    try:
        speech_synthesizer.call(text)
    except Exception as e:
        print(f'[task_{task_id}] speech synthesis task failed, {e}')
        if USE_CONNECTION_POOL:
            # close the synthesizer connection manually if the task failed when using the connection pool.
            speech_synthesizer.close()
        return

    print('[task_{}] Synthesized text: {}'.format(task_id, text))
    complete_event.wait()
    if synthesizer_callback.failed:
        # The service reported a failure through on_error: do not return the object to the pool.
        return
    print('[task_{}][Metric] requestId: {}, first package delay ms: {}'.format(
        task_id,
        speech_synthesizer.get_last_request_id(),
        speech_synthesizer.get_first_package_delay()))
    if USE_CONNECTION_POOL:
        connectionPool.return_synthesizer(speech_synthesizer)


# main function
if __name__ == '__main__':
    # The following URL is for the Singapore region. If you use models in the Beijing region, replace it with: wss://dashscope.aliyuncs.com/api-ws/v1/inference
    dashscope.base_websocket_api_url='wss://dashscope-intl.aliyuncs.com/api-ws/v1/inference'
    init_dashscope_api_key()
    task_thread_list = []
    for task_id in range(3):
        thread = threading.Thread(
            target=synthesis_text_to_speech_and_play_by_streaming_mode,
            args=(text_to_synthesize[task_id], task_id))
        task_thread_list.append(thread)

    for task_thread in task_thread_list:
        task_thread.start()

    for task_thread in task_thread_list:
        task_thread.join()

    if USE_CONNECTION_POOL:
        connectionPool.shutdown()

Resource management and error handling

  • Successful task: Call connectionPool.return_synthesizer(speech_synthesizer) to return the object for reuse.

    Important

    Do not return SpeechSynthesizer objects from incomplete or failed tasks.

  • Failed task: If an SDK exception or business logic error interrupts the task, close the connection manually: speech_synthesizer.close().

  • After all speech synthesis tasks complete, shut down the object pool by calling connectionPool.shutdown().

  • No extra action is needed when the service returns a TaskFailed error.
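
The return-on-success and close-on-failure rules above can be wrapped in a context manager so that callers cannot forget either branch. This is a minimal sketch: pooled_synthesizer is a hypothetical helper, and it assumes only the borrow_synthesizer, return_synthesizer, and close calls already shown in this document.

```python
from contextlib import contextmanager


@contextmanager
def pooled_synthesizer(pool, **borrow_kwargs):
    """Borrow a synthesizer, return it on success, close it on failure.

    Hypothetical helper: `pool` must expose borrow_synthesizer() and
    return_synthesizer(), and the borrowed object must expose close().
    """
    synthesizer = pool.borrow_synthesizer(**borrow_kwargs)
    try:
        yield synthesizer
    except Exception:
        # Failed task: close the connection and do not return the object.
        synthesizer.close()
        raise
    else:
        # Successful task: hand the object back for reuse.
        pool.return_synthesizer(synthesizer)
```

When a callback is used, the body of the with block should wait for the completion event before it exits, so that the object is returned only after the task has actually finished.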

Java SDK: Connection pool and object pool optimization

The Java SDK combines an internal connection pool with a custom object pool for optimal performance:

  • Connection pool: The SDK uses OkHttp3's connection pool to manage and reuse WebSocket connections, reducing handshake overhead (enabled by default).

  • Object pool: Built on commons-pool2, this maintains SpeechSynthesizer objects with pre-established connections. Borrowing eliminates connection setup latency and reduces first-packet latency.

Implementation steps

  1. Add dependencies. Add dashscope-sdk-java and commons-pool2 to your project's dependency configuration. Examples:

    Maven

    1. Open your Maven project’s pom.xml file.

    2. Add the following dependencies inside the <dependencies> tag.

    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>dashscope-sdk-java</artifactId>
        <!-- Replace 'the-latest-version' with version 2.16.9 or later. Check available versions at: https://mvnrepository.com/artifact/com.alibaba/dashscope-sdk-java -->
        <version>the-latest-version</version>
    </dependency>
    
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-pool2</artifactId>
        <!-- Replace 'the-latest-version' with the latest version. Check available versions at: https://mvnrepository.com/artifact/org.apache.commons/commons-pool2 -->
        <version>the-latest-version</version>
    </dependency>
    3. Save the pom.xml file.

    4. Run a Maven command such as mvn clean install or mvn compile to update project dependencies.

    Gradle

    1. Open your Gradle project’s build.gradle file.

    2. Add the following dependencies inside the dependencies block.

      dependencies {
          // Replace 'the-latest-version' with version 2.16.6 or later. Check available versions at: https://mvnrepository.com/artifact/com.alibaba/dashscope-sdk-java
          implementation group: 'com.alibaba', name: 'dashscope-sdk-java', version: 'the-latest-version'
          
          // Replace 'the-latest-version' with the latest version. Check available versions at: https://mvnrepository.com/artifact/org.apache.commons/commons-pool2
          implementation group: 'org.apache.commons', name: 'commons-pool2', version: 'the-latest-version'
      }
    3. Save the build.gradle file.

    4. In your terminal, navigate to your project root directory and run the following Gradle command to update dependencies.

      ./gradlew build --refresh-dependencies

      Or, if you are using Windows, run:

      gradlew build --refresh-dependencies
  2. Configure the connection pool. Set connection pool parameters via environment variables:

    | Environment variable | Description | Default |
    | --- | --- | --- |
    | DASHSCOPE_CONNECTION_POOL_SIZE | Connection pool size. Set to at least 2x your peak concurrency. | 32 |
    | DASHSCOPE_MAXIMUM_ASYNC_REQUESTS | Maximum async requests. Match DASHSCOPE_CONNECTION_POOL_SIZE. | 32 |
    | DASHSCOPE_MAXIMUM_ASYNC_REQUESTS_PER_HOST | Maximum async requests per host. Match DASHSCOPE_CONNECTION_POOL_SIZE. | 32 |

  3. Configure the object pool. Set the object pool size using an environment variable. Use the following code to create the object pool:

    Important

    The object pool size (COSYVOICE_OBJECTPOOL_SIZE) must be less than or equal to the connection pool size (DASHSCOPE_CONNECTION_POOL_SIZE). If the connection pool is full when borrowing, the calling thread blocks until a connection is available. The object pool size must not exceed your QPS limit.

    | Environment variable | Description | Default |
    | --- | --- | --- |
    | COSYVOICE_OBJECTPOOL_SIZE | Object pool size. Set to 1.5x to 2x your peak concurrency. | 500 |

       class CosyvoiceObjectPool {
           // ... Other code omitted. See full example below.
           public static GenericObjectPool<SpeechSynthesizer> getInstance() {
               lock.lock();
               try {
                   if (synthesizerPool == null) {
                       // You can set the object pool size here. Or set it via the COSYVOICE_OBJECTPOOL_SIZE environment variable.
                       // Recommended value: 1.5x to 2x your server's maximum concurrent connections.
                       int objectPoolSize = getObjectivePoolSize();
                       SpeechSynthesizerObjectFactory speechSynthesizerObjectFactory =
                               new SpeechSynthesizerObjectFactory();
                       GenericObjectPoolConfig<SpeechSynthesizer> config =
                               new GenericObjectPoolConfig<>();
                       config.setMaxTotal(objectPoolSize);
                       config.setMaxIdle(objectPoolSize);
                       config.setMinIdle(objectPoolSize);
                       synthesizerPool =
                               new GenericObjectPool<>(speechSynthesizerObjectFactory, config);
                   }
               } finally {
                   // Always release the lock, even if pool creation throws.
                   lock.unlock();
               }
               return synthesizerPool;
           }
       }
  4. Borrow a SpeechSynthesizer object from the pool. The pool creates at most the configured maximum number of objects (setMaxTotal). Once all of them are borrowed, borrowObject() blocks until another task returns an object, because GenericObjectPoolConfig sets blockWhenExhausted to true by default. Size the pool for your peak concurrency so that tasks do not wait for an object.

       synthesizer = CosyvoiceObjectPool.getInstance().borrowObject();
  5. Perform speech synthesis. Call the call or streamingCall method of the SpeechSynthesizer object to perform speech synthesis.

  6. Return the SpeechSynthesizer object. After the speech synthesis task completes, return the SpeechSynthesizer object for reuse by later tasks.

    Important

    Do not return objects from incomplete or failed tasks.

       CosyvoiceObjectPool.getInstance().returnObject(synthesizer);
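
The constraints between the pool-related environment variables in steps 2 and 3 can be checked before the service starts. The following is a minimal, language-neutral sketch written in Python: check_pool_config is a hypothetical helper, and the variable names and defaults are the ones listed in this document.

```python
def check_pool_config(env):
    """Return warnings for inconsistent pool settings.

    Hypothetical helper: `env` is any mapping of environment variable
    names to string values, for example os.environ.
    """
    conn = int(env.get('DASHSCOPE_CONNECTION_POOL_SIZE', 32))
    max_req = int(env.get('DASHSCOPE_MAXIMUM_ASYNC_REQUESTS', 32))
    max_req_host = int(env.get('DASHSCOPE_MAXIMUM_ASYNC_REQUESTS_PER_HOST', 32))
    obj = int(env.get('COSYVOICE_OBJECTPOOL_SIZE', 500))

    warnings = []
    if obj > conn:
        warnings.append(
            f'object pool size {obj} exceeds connection pool size {conn}')
    if max_req != conn:
        warnings.append(
            f'DASHSCOPE_MAXIMUM_ASYNC_REQUESTS {max_req} does not match {conn}')
    if max_req_host != conn:
        warnings.append(
            f'DASHSCOPE_MAXIMUM_ASYNC_REQUESTS_PER_HOST {max_req_host} does not match {conn}')
    return warnings
```

With no variables set, the default object pool size (500) is larger than the default connection pool size (32), so the check reports a warning. This is why the connection pool variables should be set explicitly for high-concurrency use.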

Complete code

import com.alibaba.dashscope.audio.tts.SpeechSynthesisResult;
import com.alibaba.dashscope.audio.ttsv2.SpeechSynthesisAudioFormat;
import com.alibaba.dashscope.audio.ttsv2.SpeechSynthesisParam;
import com.alibaba.dashscope.audio.ttsv2.SpeechSynthesizer;
import com.alibaba.dashscope.common.ResultCallback;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

import java.time.LocalDateTime;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

/**
 * Your project must include org.apache.commons.pool2 and DashScope packages.
 *
 * DashScope SDK version 2.16.6 and later are optimized for high-concurrency scenarios.
 * DashScope SDK versions earlier than 2.16.6 are not recommended for high-concurrency use.
 *
 *
 * Before making high-concurrency calls to the TTS service,
 * configure connection pool parameters using the following environment variables:
 *
 * DASHSCOPE_MAXIMUM_ASYNC_REQUESTS
 * DASHSCOPE_MAXIMUM_ASYNC_REQUESTS_PER_HOST
 * DASHSCOPE_CONNECTION_POOL_SIZE
 *
 */

class SpeechSynthesizerObjectFactory
        extends BasePooledObjectFactory<SpeechSynthesizer> {
    public SpeechSynthesizerObjectFactory() {
        super();
    }
    @Override
    public SpeechSynthesizer create() throws Exception {
        return new SpeechSynthesizer();
    }

    @Override
    public PooledObject<SpeechSynthesizer> wrap(SpeechSynthesizer obj) {
        return new DefaultPooledObject<>(obj);
    }
}

class CosyvoiceObjectPool {
    public static GenericObjectPool<SpeechSynthesizer> synthesizerPool;
    public static String COSYVOICE_OBJECTPOOL_SIZE_ENV = "COSYVOICE_OBJECTPOOL_SIZE";
    public static int DEFAULT_OBJECT_POOL_SIZE = 500;
    private static Lock lock = new java.util.concurrent.locks.ReentrantLock();
    public static int getObjectivePoolSize() {
        try {
            Integer n = Integer.parseInt(System.getenv(COSYVOICE_OBJECTPOOL_SIZE_ENV));
            System.out.println("Using Object Pool Size In Env: "+ n);
            return n;
        } catch (NumberFormatException e) {
            System.out.println("Using Default Object Pool Size: "+ DEFAULT_OBJECT_POOL_SIZE);
            return DEFAULT_OBJECT_POOL_SIZE;
        }
    }
    public static GenericObjectPool<SpeechSynthesizer> getInstance() {
        lock.lock();
        try {
            if (synthesizerPool == null) {
                // You can set the object pool size here. Or set it via the COSYVOICE_OBJECTPOOL_SIZE environment variable.
                // Recommended value: 1.5× to 2× your server's maximum concurrent connections.
                int objectPoolSize = getObjectivePoolSize();
                SpeechSynthesizerObjectFactory speechSynthesizerObjectFactory =
                        new SpeechSynthesizerObjectFactory();
                GenericObjectPoolConfig<SpeechSynthesizer> config =
                        new GenericObjectPoolConfig<>();
                config.setMaxTotal(objectPoolSize);
                config.setMaxIdle(objectPoolSize);
                config.setMinIdle(objectPoolSize);
                synthesizerPool =
                        new GenericObjectPool<>(speechSynthesizerObjectFactory, config);
            }
        } finally {
            // Always release the lock, even if pool creation throws.
            lock.unlock();
        }
        return synthesizerPool;
    }
}

class SynthesizeTaskWithCallback implements Runnable {
    String[] textArray;
    String requestId;
    long timeCost;
    public SynthesizeTaskWithCallback(String[] textArray) {
        this.textArray = textArray;
    }
    @Override
    public void run() {
        SpeechSynthesizer synthesizer = null;
        long startTime = System.currentTimeMillis();
        // if recv onError
        final boolean[] hasError = {false};
        try {
            class ReactCallback extends ResultCallback<SpeechSynthesisResult> {
                ReactCallback() {}

                @Override
                public void onEvent(SpeechSynthesisResult message) {
                    if (message.getAudioFrame() != null) {
                        try {
                            byte[] bytesArray = message.getAudioFrame().array();
                            System.out.println("Received audio. Audio stream length: " + bytesArray.length);
                        } catch (Exception e) {
                            throw new RuntimeException(e);
                        }
                    }
                }

                @Override
                public void onComplete() {}

                @Override
                public void onError(Exception e) {
                    System.out.println(e.getMessage());
                    e.printStackTrace();
                    hasError[0] = true;
                }
            }

            // API keys differ between the Singapore and Beijing regions. Get your API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
            // If you do not set the DASHSCOPE_API_KEY environment variable, replace the argument of apiKey() below with your key, for example "sk-xxx".
            SpeechSynthesisParam param =
                    SpeechSynthesisParam.builder()
                            .model("cosyvoice-v3-flash")
                            .voice("longanyang")
                            .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                            .format(SpeechSynthesisAudioFormat
                                    .MP3_22050HZ_MONO_256KBPS) // Use PCM or MP3 for streaming synthesis
                            .build();

            try {
                synthesizer = CosyvoiceObjectPool.getInstance().borrowObject();
                synthesizer.updateParamAndCallback(param, new ReactCallback());
                for (String text : textArray) {
                    synthesizer.streamingCall(text);
                }
                Thread.sleep(20);
                synthesizer.streamingComplete(60000);
                requestId = synthesizer.getLastRequestId();
            } catch (Exception e) {
                System.out.println("Exception e: " + e.toString());
                hasError[0] = true;
            }
        } catch (Exception e) {
            hasError[0] = true;
            throw new RuntimeException(e);
        }
        if (synthesizer != null) {
            try {
                if (hasError[0] == true) {
                    // If an error occurs, close the connection and invalidate the object in the pool.
                    synthesizer.getDuplexApi().close(1000, "bye");
                    CosyvoiceObjectPool.getInstance().invalidateObject(synthesizer);
                } else {
                    // If the task completes normally, return the object.
                    CosyvoiceObjectPool.getInstance().returnObject(synthesizer);
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
            long endTime = System.currentTimeMillis();
            timeCost = endTime - startTime;
            System.out.println("[Thread " + Thread.currentThread() + "] Speech synthesis task completed. Time taken: " + timeCost + " ms, RequestId: " + requestId);
        }
    }
}

@Slf4j
public class SynthesizeTextToSpeechWithCallbackConcurrently {
    public static void checkoutEnv(String envName, int defaultSize) {
        if (System.getenv(envName) != null) {
            System.out.println("[ENV CHECK]: " + envName + " "
                    + System.getenv(envName));
        } else {
            System.out.println("[ENV CHECK]: " + envName
                    + " Using Default which is " + defaultSize);
        }
    }

    public static void main(String[] args)
            throws InterruptedException, NoApiKeyException {
        // The following URL is for the Singapore region. If you use models in the Beijing region, replace it with: wss://dashscope.aliyuncs.com/api-ws/v1/inference
        Constants.baseWebsocketApiUrl = "wss://dashscope-intl.aliyuncs.com/api-ws/v1/inference";
        // Check for connection pool env
        checkoutEnv("DASHSCOPE_CONNECTION_POOL_SIZE", 32);
        checkoutEnv("DASHSCOPE_MAXIMUM_ASYNC_REQUESTS", 32);
        checkoutEnv("DASHSCOPE_MAXIMUM_ASYNC_REQUESTS_PER_HOST", 32);
        checkoutEnv(CosyvoiceObjectPool.COSYVOICE_OBJECTPOOL_SIZE_ENV, CosyvoiceObjectPool.DEFAULT_OBJECT_POOL_SIZE);

        int runTimes = 3;
        // Create a thread pool that runs the synthesis tasks concurrently
        ExecutorService executorService = Executors.newFixedThreadPool(runTimes);

        for (int i = 0; i < runTimes; i++) {
            // Record the task submission time
            LocalDateTime submissionTime = LocalDateTime.now();
            executorService.submit(new SynthesizeTaskWithCallback(new String[] {
                    "Before my bed, moonlight gleams,", "It seems like frost upon the ground.", "I lift my gaze to watch the bright moon,", "Then bow my head, thinking of home."}));
        }

        // Shut down the ExecutorService and wait for all tasks to complete
        executorService.shutdown();
        executorService.awaitTermination(1, TimeUnit.MINUTES);
        System.exit(0);
    }
}

Recommended ECS configurations

The following configurations are based on tests where only CosyVoice runs on ECS instances of these specs. Exceeding limits may cause delays.

Single-machine concurrency refers to the number of CosyVoice tasks running simultaneously (equivalent to worker threads).

Audio format affects bandwidth. At 200 concurrent connections, PCM requires significantly more bandwidth than MP3. Use compressed formats like MP3 to reduce network overhead.

| Machine spec (Alibaba Cloud ECS) | Max single-machine concurrency | Object pool size | Connection pool size |
| --- | --- | --- | --- |
| 4 vCPUs, 8 GiB memory | 100 | 500 | 2000 |
| 8 vCPUs, 16 GiB memory | 150 | 500 | 2000 |
| 16 vCPUs, 32 GiB memory | 200 | 500 | 2000 |
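
The concurrency limits above translate directly into a machine count for a given target load. The following is a minimal sketch: machines_needed and MAX_CONCURRENCY_BY_VCPUS are hypothetical names, and the values are copied from the recommended configurations above.

```python
import math

# vCPUs -> max single-machine concurrency, copied from the recommended configurations.
MAX_CONCURRENCY_BY_VCPUS = {4: 100, 8: 150, 16: 200}


def machines_needed(target_concurrency: int, vcpus: int) -> int:
    """Return how many instances of the given spec cover the target concurrency.

    Hypothetical helper for capacity planning; not part of any SDK.
    """
    if target_concurrency < 1:
        raise ValueError('target_concurrency must be positive')
    per_machine = MAX_CONCURRENCY_BY_VCPUS[vcpus]
    # Round up: a partially loaded machine is still a machine.
    return math.ceil(target_concurrency / per_machine)
```

For example, 450 concurrent tasks on 16-vCPU instances require 3 machines, because each instance handles at most 200 tasks.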

Resource management and error handling

  • Successful task: Call the returnObject method of GenericObjectPool to return the SpeechSynthesizer object to the pool for reuse. In the sample code, this corresponds to CosyvoiceObjectPool.getInstance().returnObject(synthesizer).

    Important

    Do not return SpeechSynthesizer objects from incomplete or failed tasks.

  • Failed task: If an exception from the SDK or your business logic interrupts a task, perform both of the following actions:

    1. Manually close the underlying WebSocket connection.

    2. Invalidate the object in the object pool to prevent reuse.

      // In the current code, this corresponds to:
      // Close the connection
      synthesizer.getDuplexApi().close(1000, "bye");
      // Invalidate the faulty synthesizer in the pool
      CosyvoiceObjectPool.getInstance().invalidateObject(synthesizer);
  • No extra action is needed when the service returns a TaskFailed error.

Pre-warming and timing metrics

Pre-warm the system before testing concurrent calls. Pre-warming ensures metrics reflect stable-state performance, excluding one-time connection setup costs.

Connection reuse mechanism

The DashScope Java SDK uses a global singleton connection pool to manage and reuse WebSocket connections, reducing frequent connection establishment and teardown and improving throughput in high-concurrency scenarios.

Key behaviors:

  • On-demand creation: The SDK creates connections on-demand (on first call), not at startup.

  • Time-limited reuse: After a request completes, the connection remains in the pool for up to 60 seconds for reuse.

    • Within 60 seconds, new requests reuse the existing connection, avoiding handshake overhead.

    • After 60 seconds idle, connections close automatically to free resources.

Why pre-warming matters

The connection pool contains no reusable active connections in the following cases:

  • The application just started with no calls made yet.

  • The service was idle for 60+ seconds, and all connections timed out.

In these cases, the first requests trigger full WebSocket connection setup, including TCP/TLS/WebSocket handshakes. Their end-to-end latency is much higher than later requests that reuse connections. This extra time comes from connection initialization, not service processing. Without pre-warming, test results include initial setup time and don't reflect real performance.

Best practice

To collect reliable performance data, follow these pre-warming steps before formal stress testing or latency measurement:

  1. Send warm-up calls at your test's concurrency level for 1 to 2 minutes to populate the connection pool.

  2. Confirm the pool has sufficient active connections before starting data collection.

Proper pre-warming brings the SDK connection pool to a stable reuse state, yielding latency metrics that reflect real-world production performance.
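
The warm-up phase described above can be sketched as a small driver that runs a task at a fixed concurrency for a fixed duration before measurement starts. This is an illustration under stated assumptions: warm_up and its parameters are hypothetical, and task stands for whatever function performs one synthesis call in your application.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def warm_up(task, concurrency, duration_s):
    """Run `task` on `concurrency` workers until `duration_s` seconds elapse.

    Hypothetical helper. Returns the number of warm-up calls completed.
    """
    deadline = time.monotonic() + duration_s

    def worker():
        calls = 0
        # Keep issuing calls until the warm-up window closes.
        while time.monotonic() < deadline:
            task()
            calls += 1
        return calls

    with ThreadPoolExecutor(max_workers=concurrency) as executor:
        futures = [executor.submit(worker) for _ in range(concurrency)]
        return sum(future.result() for future in futures)
```

Call warm_up with the same concurrency as the planned test and a duration of 60 to 120 seconds, then start collecting latency metrics only after it returns.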

Common Java SDK errors

Exception 1: The service traffic is stable, but the number of TCP connections on the server continues to increase

Cause:

Type 1:

Each SDK object requests a connection upon creation. If you do not use an object pool, the object is destroyed after each task completes. This action leaves the connection in an unreferenced state, and it is disconnected only after the server-side connection timeout of 61 seconds. Consequently, the connection cannot be reused during this 61-second period.

In high-concurrency scenarios, a new task creates a new connection if no reusable connections are available. This leads to the following issues:

  1. The number of connections continues to increase.

  2. Server performance degrades because an excessive number of connections consumes available server resources.

  3. The connection pool becomes full, and new tasks are blocked while they wait for available connections.

Type 2:

The `MaxIdle` parameter of the object pool is set to a value that is smaller than the `MaxTotal` parameter. As a result, when the pool has idle objects, any objects that exceed the `MaxIdle` limit are destroyed. This process can cause connection leaks. These leaked connections are disconnected only after a 61-second timeout. Similar to the Type 1 cause, this leads to a continuous increase in the number of connections.

Solution:

For the Type 1 cause, use an object pool.

For the Type 2 cause, check the object pool configuration parameters. Set `MaxIdle` and `MaxTotal` to the same value, and disable any policy that automatically evicts idle objects from the pool.

Exception 2: The task takes 60 seconds longer than a normal call

The cause is the same as for Exception 1. The connection pool has reached its maximum number of connections. A new task must wait 61 seconds for an unreferenced connection to time out before the task can obtain a new connection.

Exception 3: Tasks are slow when the service starts and then gradually return to normal

Cause:

During high-concurrency calls, each pooled object reuses its WebSocket connection across tasks, so connections are normally created only when the service starts. If high-concurrency calls begin immediately after the service starts, many WebSocket connections are created at the same time, which can cause blocking.

Solution:

Gradually increase the concurrency, or run warm-up tasks after the service starts.
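
A gradual ramp-up can be planned as a list of increasing concurrency levels. The following is a minimal sketch: ramp_schedule is a hypothetical helper, and the number of steps and how long to hold each level are choices for your own service.

```python
def ramp_schedule(target_concurrency: int, steps: int) -> list:
    """Return increasing concurrency levels that end at target_concurrency.

    Hypothetical helper for illustration only.
    """
    if target_concurrency < 1 or steps < 1:
        raise ValueError('target_concurrency and steps must be positive')
    # Never use more steps than there are concurrency units to add.
    steps = min(steps, target_concurrency)
    # Integer ceiling division keeps the levels exact and strictly increasing.
    return [(target_concurrency * i + steps - 1) // steps
            for i in range(1, steps + 1)]
```

For example, ramp_schedule(100, 4) returns [25, 50, 75, 100]. Hold each level for a few seconds after startup so that WebSocket connections are created in batches instead of all at once.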

Exception 4: The server reports the "Invalid action('run-task')! Please follow the protocol!" error

Cause:

When a client-side error occurs, the server is not notified, and the connection remains in a task-in-progress state. If this connection and its associated object are then reused for a new task, a protocol error occurs, which causes the new task to fail.

Solution:

After a client-side exception is thrown, explicitly close the WebSocket connection and invalidate the object in the object pool (invalidateObject) instead of returning it, so that the object is never reused.

Exception 5: The service traffic is stable, but the call volume has abnormal spikes

Cause:

Creating too many WebSocket connections at the same time causes blocking. Because incoming service traffic continues, a short-term backlog of tasks is created. After the blocking is resolved, all backlogged tasks are called at once. This causes a spike in call volume that can momentarily exceed the concurrency limit for your Alibaba Cloud account, which can result in task failures, server performance degradation, and other issues.

Creating too many WebSocket connections at once typically occurs in the following scenarios:

  • During the service startup stage

  • A network exception occurs that causes many WebSocket connections to be interrupted and reconnected at the same time.

  • Many server-side errors occur at the same time, which leads to many WebSocket reconnections. A common error occurs when the concurrency exceeds the account limit ("Requests rate limit exceeded, please try again later.").

Solution:

  1. Check your network conditions.

  2. Check whether many other server-side errors occurred before the spike.

  3. Increase the concurrency limit for your Alibaba Cloud account.

  4. Reduce the sizes of the object pool and connection pool. You can also limit the maximum concurrency using the upper limit of the object pool.

  5. Upgrade your server configuration or increase the number of servers.
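
Besides relying on the upper limit of the object pool, the concurrency cap from solution 4 can be enforced explicitly in application code. The following sketch uses a semaphore for that purpose. It is a different technique from the object pool limit, and ConcurrencyLimiter is a hypothetical class, not part of the DashScope SDK.

```python
import threading


class ConcurrencyLimiter:
    """Cap the number of tasks that run at the same time.

    Hypothetical helper for illustration only.
    """

    def __init__(self, max_concurrency: int):
        if max_concurrency < 1:
            raise ValueError('max_concurrency must be positive')
        self._semaphore = threading.BoundedSemaphore(max_concurrency)

    def run(self, task, *args, **kwargs):
        # Blocks while max_concurrency tasks are already running.
        with self._semaphore:
            return task(*args, **kwargs)
```

Wrapping each synthesis call in limiter.run keeps a backlog of queued tasks from being released all at once, which is the pattern that produces the spike described above. Set max_concurrency at or below the concurrency limit of your account.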

Exception 6: All tasks slow down as the concurrency increases

Solution:

  1. Check whether you have reached the network bandwidth limit.

  2. Check whether the actual concurrency is too high for your server's specifications.