OpenSearch: Data synchronization latency caused by multi-table joins

Last Updated: Feb 13, 2023

OpenSearch supports JOIN operations on multiple tables.

If ApsaraDB RDS for MySQL or PolarDB for MySQL data sources are configured for multiple tables and automatic incremental data synchronization is enabled through Data Transmission Service (DTS), the data in the primary table and secondary tables is synchronized to OpenSearch. To keep synchronization latency low, the following limits apply:

Note

  1. The updates on primary and secondary tables cannot exceed 1,500 transactions per second (TPS). Otherwise, the data in the primary and secondary tables may not be synchronized in real time.

  2. The updates on the primary table that are triggered by the updates on secondary tables cannot exceed 1,000 TPS. Otherwise, the data synchronization from the primary table and secondary tables may be delayed.

  3. When the ratio of primary table records to secondary table records is N:1, we recommend that the value of N be less than or equal to 10.
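The three limits in the note can be checked against observed workload figures. The following is a minimal sketch in Python, not an OpenSearch API: the constant names, the `Workload` class, and `check_limits` are hypothetical, and the thresholds are copied from the note above.

```python
from dataclasses import dataclass

# Thresholds copied from the note above. The names are illustrative only;
# they are not OpenSearch or DTS configuration keys.
MAX_TOTAL_UPDATE_TPS = 1500       # updates on primary and secondary tables
MAX_TRIGGERED_PRIMARY_TPS = 1000  # primary updates triggered by secondary updates
MAX_RECOMMENDED_N = 10            # primary:secondary record ratio N:1


@dataclass
class Workload:
    """Observed figures for one application (hypothetical structure)."""
    total_update_tps: float
    triggered_primary_tps: float
    n_ratio: float


def check_limits(w: Workload) -> list[str]:
    """Return one warning per limit that the workload exceeds."""
    warnings = []
    if w.total_update_tps > MAX_TOTAL_UPDATE_TPS:
        warnings.append("total update TPS exceeds 1,500")
    if w.triggered_primary_tps > MAX_TRIGGERED_PRIMARY_TPS:
        warnings.append("triggered primary table update TPS exceeds 1,000")
    if w.n_ratio > MAX_RECOMMENDED_N:
        warnings.append("N is greater than the recommended maximum of 10")
    return warnings


if __name__ == "__main__":
    for message in check_limits(Workload(1600, 1200, 20)):
        print(message)
```

An empty list means the workload stays within all three limits.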

JOIN operations on primary and secondary tables

The following figure shows how a primary table and a secondary table are joined.

[Figure: JOIN of a primary table and a secondary table]

Update operations on secondary tables

The following figure shows a primary table and a secondary table. The ratio of the primary table records to the secondary table records is N:1.

[Figure: a primary table and a secondary table with a record ratio of N:1]

As shown in the preceding figure:

  • After a JOIN operation is performed on the primary table and the secondary table, a wide table is generated. All update operations are performed on the wide table.

  • The primary key of the primary table is unique. When the data record that corresponds to a primary key ID is updated in the primary table, the corresponding data record in the wide table is also updated.

  • The ratio of the primary table records to the secondary table records is N:1. Multiple data records in the primary table may correspond to a primary key ID in the secondary table.

    When the data record that corresponds to a primary key ID is updated in the secondary table, multiple data records in the wide table may also be updated.
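The fan-out described above can be illustrated with a small in-memory model. This is only a sketch under assumptions: the sample rows, the `sec_id` join column, and both function names are hypothetical, and the join is modeled as keeping every primary row even when no secondary row matches, which the source does not specify.

```python
# Hypothetical sample data: each primary row references a secondary row via "sec_id".
primary = {
    1: {"title": "a", "sec_id": 100},
    2: {"title": "b", "sec_id": 100},
    3: {"title": "c", "sec_id": 200},
}
secondary = {
    100: {"category": "books"},
    200: {"category": "music"},
}


def build_wide_table(primary, secondary):
    """Join each primary row with its secondary row into one wide row.

    The wide table is keyed by the primary key of the primary table.
    """
    wide = {}
    for pk, row in primary.items():
        sec_row = secondary.get(row["sec_id"], {})  # assumption: keep unmatched rows
        wide[pk] = {**row, **sec_row}
    return wide


def rows_affected_by_secondary_update(primary, sec_id):
    """List the wide-table rows that must be rewritten when one secondary row changes."""
    return [pk for pk, row in primary.items() if row["sec_id"] == sec_id]


if __name__ == "__main__":
    wide = build_wide_table(primary, secondary)
    print(wide[1])
    print(rows_affected_by_secondary_update(primary, 100))
```

In this sample, updating the primary row with key 3 touches one wide row, while updating secondary row 100 touches the two wide rows that reference it.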

Conclusion:

  • If a large amount of data in a secondary table is updated, the updates on the primary table may be delayed. To keep the latency of primary table updates low, the updates on the primary table that are triggered by the updates on secondary tables cannot exceed 1,500 TPS.

  • If the updates on the primary table that are triggered by the updates on secondary tables exceed 1,500 TPS, the update speed of the secondary tables is throttled. As a result, the secondary tables may be updated with high latency. The more data is updated in the secondary tables, the higher the update latency.

  • When the ratio of primary table records to secondary table records is N:1, we recommend that the value of N be less than or equal to 10.
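The relationship between secondary table updates and the primary table updates they trigger can be estimated with simple arithmetic. The sketch below assumes that each updated secondary record maps to N primary records on average, so the triggered rate is roughly the secondary update rate multiplied by N. Both function names are hypothetical, and the limit is passed in as a parameter rather than fixed.

```python
def triggered_primary_tps(secondary_update_tps: float, n: float) -> float:
    """Estimate primary table updates per second triggered by secondary updates.

    Assumes each secondary record corresponds to n primary records on average.
    """
    return secondary_update_tps * n


def max_secondary_update_tps(triggered_limit_tps: float, n: float) -> float:
    """Estimate the secondary update rate that keeps triggered updates within a limit."""
    if n <= 0:
        raise ValueError("n must be positive")
    return triggered_limit_tps / n


if __name__ == "__main__":
    # With N = 10, 100 secondary updates per second trigger about 1,000 primary updates.
    print(triggered_primary_tps(100, 10))
    # With N = 10 and a triggered limit of 1,500 TPS, about 150 secondary updates fit.
    print(max_secondary_update_tps(1500, 10))
```

Under this assumption, a smaller N leaves more headroom for secondary table updates before the triggered updates reach the limit, which is consistent with the recommendation to keep N at 10 or below.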

Related suggestions

  • If the latency of primary table updates can be tolerated, you can submit a ticket to increase the maximum TPS of the updates on secondary tables.

  • If the latency of secondary table updates can be tolerated and the latency of primary table updates that are triggered by secondary table updates is high, you can submit a ticket to set a limit on the maximum TPS of the updates on secondary tables.