Configure a connection pool for an application to connect to a PolarDB-X instance - PolarDB

This topic describes how to calculate the number of connections that an application requires to connect to a PolarDB-X instance.

Overview

When an application connects to a PolarDB-X instance to perform operations, the following types of connections to the PolarDB-X instance are established:

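Frontend connection: a connection that the application establishes to a logical database on a compute node of the PolarDB-X instance.
Backend connection: a connection that a compute node of the PolarDB-X instance establishes to a physical database on a backend data node of the PolarDB-X instance.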

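Backend connections are managed by compute nodes. The system uses a private protocol instead of the TCP protocol to establish backend connections. You do not need to specify the protocol for backend connections, and backend connections are transparent to the application. Frontend connections are established and managed by the application. This topic focuses on how to manage frontend connections.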

Note: In the following sections, the term "connection" refers to a frontend connection.

Calculate the required connections based on QPS and RT

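Queries per second (QPS) and response time (RT) are two metrics that measure the database performance that an application requires. QPS is the number of query requests that the application sends per second. RT is the time that the system takes to process a single statement. RT varies with the complexity of the SQL statement and the amount of data that is scanned. In online transaction processing (OLTP) systems, RT is low and is typically measured in milliseconds.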

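PolarDB-X is compatible with the MySQL protocol. The system processes requests on a single connection serially and can process requests on different connections in parallel. You can use the following formulas to calculate the maximum QPS for a single connection and the number of connections that an application requires: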

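Maximum QPS for a single connection = 1,000/RT
Note: In this formula, QPS is the number of query requests that can be sent over a single connection per second. RT is measured in milliseconds. The number 1,000 is the number of milliseconds in 1 second.
Number of connections = Maximum QPS at which the application accesses a single compute node / Maximum QPS for a single connection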

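For example, if the average RT is 5 milliseconds, a single connection can send at most 200 query requests per second (1,000/5 = 200). If the application needs to run at about 5,000 QPS, at least 25 connections are required (5,000/200 = 25).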
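
The two formulas above can be sketched in Java as follows. This is a minimal illustration, not part of any PolarDB-X or Druid API: the class name ConnectionEstimator and its method names are made up for this example, and the connection count is rounded up because a fraction of a connection cannot be opened.

```java
// Illustrative helper; the class and method names are assumptions, not a product API.
final class ConnectionEstimator {

    /** Maximum QPS for a single connection = 1,000 / RT, with RT in milliseconds. */
    static double maxQpsPerConnection(double avgRtMillis) {
        if (avgRtMillis <= 0) {
            throw new IllegalArgumentException("RT must be positive");
        }
        return 1000.0 / avgRtMillis;
    }

    /**
     * Number of connections = target QPS on one compute node / maximum QPS per connection.
     * Computed as targetQps * RT / 1,000 and rounded up.
     */
    static int requiredConnections(double targetQps, double avgRtMillis) {
        if (targetQps < 0 || avgRtMillis <= 0) {
            throw new IllegalArgumentException("QPS must be non-negative and RT positive");
        }
        return (int) Math.ceil(targetQps * avgRtMillis / 1000.0);
    }

    public static void main(String[] args) {
        // Values from the example above: RT = 5 ms, target = 5,000 QPS.
        System.out.println(maxQpsPerConnection(5));        // prints 200.0
        System.out.println(requiredConnections(5000, 5));  // prints 25
    }
}
```

The result is a lower bound for steady traffic with a stable average RT. It does not account for slow queries or bursts.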

Limits on the number of connections

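An application can connect to a PolarDB-X instance only by using the network module of the PolarDB-X instance. In theory, the maximum number of connections is determined by the available memory and the number of network connections of the compute nodes in the PolarDB-X instance. In practice, an application establishes connections in order to send query requests. Optimal performance is achieved only when the number of connections matches the number of threads that are allocated to run queries.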

The preceding figure shows that after an application sends a request to establish a connection, the network module of the PolarDB-X instance verifies the identity of the application. If the verification passes, a connection is established. PolarDB-X processes a query request in a way that is similar to how MariaDB processes a query request. When a compute node in the PolarDB-X instance receives a query request, the compute node attempts to allocate a thread from its thread pool to process the request. By default, the thread pool of a single compute node contains 1,024 threads. If the number of concurrent query requests exceeds 1,024, the excess query requests wait in a queue. You can use the following formulas to calculate the maximum QPS supported by a single compute node and the maximum QPS supported by the PolarDB-X database that your application uses:

Maximum QPS supported by a single compute node = Maximum QPS for a single connection × MIN(Number of connections, Number of threads in the thread pool).
Maximum QPS supported by a PolarDB-X database = Maximum QPS for a single connection × MIN(Number of connections, Number of threads in the thread pool) × Number of compute nodes.

The following examples describe how to calculate the maximum values based on the formulas:

Example 1
Scenario: The average response time of a PolarDB-X instance for queries is 10 milliseconds, and the PolarDB-X instance contains two compute nodes. What is the maximum QPS that the PolarDB-X instance can support?
If the average response time for queries is 10 milliseconds, the maximum QPS for a single connection can be calculated based on the following equation: 1,000/10 = 100. If no CPU bottlenecks occur, a PolarDB-X instance that contains two compute nodes can support a maximum QPS of 204,800. The number 204,800 is calculated based on the following equation: 100 × 1,024 × 2 = 204,800. Note: The number of query requests that a compute node can process in parallel is determined based on the specification of the compute node and the complexity of the queries. In practice, the maximum QPS is less than 204,800 because each compute node cannot use all the 1,024 threads to process queries in parallel.
Example 2
Scenario: A stress test for an application is performed on a PolarDB-X instance that contains a compute node of 16 CPU cores. The result of the test shows that the average response time for queries is 5 milliseconds when the CPU utilization of the compute node is 100%. If the compute nodes of a PolarDB-X instance are required to support a maximum QPS of 400,000, how many compute nodes of 16 CPU cores are required for the PolarDB-X instance and how many connections are required for the connection pool of the application?
If the average response time of queries is 5 milliseconds, the maximum QPS for a single connection is 200 (1,000/5 = 200). To reach 400,000 QPS without excessive overhead, you can set the number of connections in the connection pool of the application to 2,000 (400,000/200 = 2,000). To ensure that the number of threads that run in parallel on a single compute node does not exceed 1,024, you must purchase a PolarDB-X instance that contains at least two 16-core compute nodes (2,000/1,024 ≈ 1.95, rounded up to 2).
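
The calculations in Example 1 and Example 2 can be reproduced with a short Java sketch. The class CapacityPlanner and its methods are hypothetical names used only for illustration. For Example 1, the sketch assumes that each compute node has at least 1,024 connections, which the scenario does not state explicitly.

```java
// Illustrative helper; the class and method names are assumptions, not a product API.
final class CapacityPlanner {

    /** Default thread pool size of a single compute node, as stated in the text above. */
    static final int DEFAULT_THREADS_PER_NODE = 1024;

    /** Max QPS of one compute node = QPS per connection x MIN(connections, threads). */
    static long maxQpsPerNode(long qpsPerConnection, int connections, int threads) {
        return qpsPerConnection * Math.min(connections, threads);
    }

    /** Max QPS of the database = max QPS of one compute node x number of compute nodes. */
    static long maxQpsPerInstance(long qpsPerConnection, int connectionsPerNode,
                                  int threads, int nodes) {
        return maxQpsPerNode(qpsPerConnection, connectionsPerNode, threads) * nodes;
    }

    /** Compute nodes needed so that no node serves more connections than it has threads. */
    static int requiredNodes(int totalConnections, int threadsPerNode) {
        return (totalConnections + threadsPerNode - 1) / threadsPerNode; // ceiling division
    }

    public static void main(String[] args) {
        // Example 1: RT = 10 ms gives 100 QPS per connection, with two compute nodes.
        System.out.println(maxQpsPerInstance(100, 1024, DEFAULT_THREADS_PER_NODE, 2)); // prints 204800

        // Example 2: RT = 5 ms gives 200 QPS per connection; the target is 400,000 QPS.
        int connections = 400_000 / 200;
        System.out.println(connections);                                          // prints 2000
        System.out.println(requiredNodes(connections, DEFAULT_THREADS_PER_NODE)); // prints 2
    }
}
```

As the note in Example 1 points out, these figures are upper bounds. The QPS that a compute node actually sustains depends on its specification and on the complexity of the queries.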

Use Druid to configure a connection pool for a PolarDB-X database

You can use a connection pool to centrally manage database connections. A connection pool provides the following benefits:

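Improved response efficiency: After connections are initialized, all requests can use the existing connections. This reduces the resource overhead of initializing and releasing connections and improves the response efficiency of the system.
Resource reuse: Connections can be reused, so the system does not need to frequently establish and release connections. This reduces performance overhead and improves system stability.
Connection leak prevention: The connection pool forcibly reclaims connections based on the policy that you specify. This helps prevent connection leaks.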

If your application is developed in Java, we recommend that you use the Druid connection pool. The Druid library must be V1.1.11 or later. For more information, see "Druid connection pool".

The following sample code shows a standard Spring configuration for the Druid connection pool:

<bean id="dataSource" class="com.alibaba.druid.pool.DruidDataSource" init-method="init" destroy-method="close">
    <property name="driverClassName" value="com.mysql.jdbc.Driver" />
    <!-- Specify the basic properties: URL, username, and password. Replace ip, port, and db with your own values. -->
    <property name="url" value="jdbc:mysql://ip:port/db?autoReconnect=true&amp;rewriteBatchedStatements=true&amp;socketTimeout=30000&amp;connectTimeout=3000" />
    <property name="username" value="root" />
    <property name="password" value="123456" />
    <!-- Specify the maximum size, initial size, and minimum number of idle connections of the connection pool. -->
    <property name="maxActive" value="20" />
    <property name="initialSize" value="3" />
    <property name="minIdle" value="3" />
    <!-- Specify the maximum time to wait for a connection from the pool. Unit: milliseconds. -->
    <property name="maxWait" value="60000" />
    <!-- Specify the interval at which the system checks for idle connections that need to be closed. Unit: milliseconds. -->
    <property name="timeBetweenEvictionRunsMillis" value="60000" />
    <!-- Specify the minimum period for which a connection can remain idle in the connection pool. Unit: milliseconds. -->
    <property name="minEvictableIdleTimeMillis" value="300000" />
    <!-- Specify the SQL statement that is used to check whether a connection is available. -->
    <property name="validationQuery" value="select 'z' from dual" />
    <!-- Specify whether to validate idle connections. -->
    <property name="testWhileIdle" value="true" />
    <!-- Specify whether to validate a connection when it is borrowed from the pool. -->
    <property name="testOnBorrow" value="false" />
    <!-- Specify whether to validate a connection when it is returned to the pool. -->
    <property name="testOnReturn" value="false" />
    <!-- Specify the lifetime of each connection. Unit: milliseconds. The system automatically closes a connection when its lifetime ends. This parameter helps balance the loads on backend nodes. -->
    <property name="phyTimeoutMillis" value="600000" />
    <!-- Specify the maximum number of SQL query requests that can be sent over each connection. When a connection reaches this number, the system closes the connection. This parameter helps balance the loads on backend nodes. -->
    <property name="phyMaxUseCount" value="10000" />
</bean>

Impact of connection pools on load balancing

When you use a connection pool of long-lived TCP connections for an application, the service efficiency of the application is improved. In specific scenarios, the connection pool may affect distributed load balancing in a negative manner and may result in unbalanced loads on compute nodes.

Unbalanced loads caused by surging connections
If an application establishes a large number of connections in a short period, the load balancer cannot update its connection statistics in real time. As a result, specific compute nodes may end up managing more connections than others. Because pooled connections are long-lived, this imbalance persists: the loads on those compute nodes stay higher than the loads on other compute nodes, which reduces the overall performance of the system.
Unbalanced loads caused by liveness detection exceptions during load balancing
A load balancer uses liveness detection to determine whether a compute node is healthy. If liveness detection behaves abnormally, the load balancer may distribute fewer connections to specific compute nodes. Because pooled connections are long-lived, this imbalance persists: the loads on those compute nodes stay lower than the loads on other compute nodes, which reduces the overall performance of the system.

You can specify the phyTimeoutMillis parameter or phyMaxUseCount parameter for your Druid connection pool to update the connections in the Druid connection pool on a regular basis. For example, you can set the value of the phyMaxUseCount parameter to 10000 or set the value of the phyTimeoutMillis parameter to 600000. This way, you can resolve the preceding issues and maintain the system performance. We recommend that you specify the phyTimeoutMillis parameter and phyMaxUseCount parameter for your Druid connection pool.
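
The effect of recycling connections can be illustrated with a toy simulation. The following sketch is not Druid code and does not model a real load balancer. The class RecycleSimulation, its method, and the rule that a reopened connection goes to the compute node with the fewest connections are all assumptions made for illustration. A small use limit stands in for phyMaxUseCount.

```java
import java.util.Arrays;

// Toy model only: all names and the "fewest connections" placement rule are assumptions.
final class RecycleSimulation {

    /**
     * Each round sends one request over every connection. A connection that reaches
     * maxUseCount requests is closed and reopened on the compute node that currently
     * holds the fewest connections. Returns the number of connections per compute node.
     */
    static int[] simulate(int[] initialPerNode, int maxUseCount, int rounds) {
        int[] perNode = initialPerNode.clone();
        int total = Arrays.stream(perNode).sum();
        int[] nodeOf = new int[total]; // compute node that each connection is attached to
        int[] uses = new int[total];   // requests sent over each connection since it was opened
        int next = 0;
        for (int node = 0; node < perNode.length; node++) {
            for (int k = 0; k < initialPerNode[node]; k++) {
                nodeOf[next++] = node;
            }
        }
        for (int round = 0; round < rounds; round++) {
            for (int conn = 0; conn < total; conn++) {
                uses[conn]++;
                if (uses[conn] >= maxUseCount) {
                    perNode[nodeOf[conn]]--;          // close the used-up connection
                    int target = 0;                   // pick the least-loaded compute node
                    for (int node = 1; node < perNode.length; node++) {
                        if (perNode[node] < perNode[target]) {
                            target = node;
                        }
                    }
                    nodeOf[conn] = target;            // reopen the connection there
                    perNode[target]++;
                    uses[conn] = 0;
                }
            }
        }
        return perNode;
    }

    public static void main(String[] args) {
        int[] skewed = {20, 0}; // all 20 connections initially landed on the first compute node
        System.out.println(Arrays.toString(simulate(skewed, 100, 99)));  // prints [20, 0]
        System.out.println(Arrays.toString(simulate(skewed, 100, 100))); // prints [10, 10]
    }
}
```

In this model, the skew remains for as long as no connection reaches its use limit and disappears once the connections are recycled. A time-based limit such as phyTimeoutMillis would act in the same way, with elapsed time in place of the request count.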

How to configure a connection pool and a thread pool

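In most cases, an application creates multiple threads to run queries on a database. Each thread obtains a connection to the database and runs queries. You can use a thread pool to manage the threads of the application and reduce the overhead of creating and releasing threads. The maximum number of threads is a key setting of a thread pool. You can change the maximum number of threads based on your business requirements.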

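In theory, if the RTs of your queries are similar, you can use the formulas described in the "Calculate the required connections based on QPS and RT" section to calculate a reasonable connection pool size. You can also determine the maximum number of threads based on the following rule: one thread is used for each database connection. In practice, factors such as hot spots, locks, and data skew can increase the RT of queries. In specific cases, connections may be interrupted. If you configure the connection pool and the thread pool for the ideal scenario, slow queries can exhaust the resources of both pools. In this case, the service of the application is interrupted, and the systems associated with the application are also adversely affected. To prevent this issue, we recommend that you set the number of connections and the maximum number of threads to 1.5 to 2 times the calculated values.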
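
The recommended headroom can be expressed as a small Java sketch. The class PoolSizing and its method are hypothetical names used only for illustration. Following the rule of one thread per connection, the same range applies to both the connection pool size and the maximum number of threads.

```java
// Illustrative helper; the class and method names are assumptions, not a product API.
final class PoolSizing {

    /**
     * Returns {lower, upper}: 1.5 times (rounded up) and 2 times the calculated value.
     * The range applies to the connection pool size and to the maximum number of threads.
     */
    static int[] recommendedRange(int calculatedConnections) {
        if (calculatedConnections < 0) {
            throw new IllegalArgumentException("connection count must be non-negative");
        }
        int lower = (3 * calculatedConnections + 1) / 2; // ceiling of 1.5 x value
        int upper = 2 * calculatedConnections;
        return new int[] {lower, upper};
    }

    public static void main(String[] args) {
        // 25 connections is the result of the earlier example (5,000 QPS at an RT of 5 ms).
        int[] range = recommendedRange(25);
        System.out.println(range[0] + " to " + range[1]); // prints 38 to 50
    }
}
```

A value from this range can then be used for settings such as maxActive in the Druid configuration and for the maximum size of the application thread pool.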