Precautions and limits for synchronizing data from a Redis database - Data Transmission Service

This topic describes the precautions and limits that you must take note of when you synchronize data from a Redis database, such as a self-managed Redis database or an ApsaraDB for Redis database. To ensure that your data synchronization task runs as expected, read the precautions and limits before you configure the task.

Scenarios of synchronizing data from a Redis database

The following list shows the scenarios for synchronizing data from a Redis database. The precautions and limits vary by scenario. Go to the relevant section to view the precautions and limits for a specific scenario.

Configure two-way synchronization between Redis databases
Configure one-way synchronization between Redis databases

Configure two-way synchronization between Redis databases

Category	Description
Limits on the source database	To ensure synchronization quality, Data Transmission Service (DTS) adds a key prefixed with DTS_REDIS_TIMESTAMP_HEARTBEAT to the source database. This key records the time when data is synchronized to the destination database. If the source database is deployed in a cluster architecture, DTS adds this key to each shard. The key is filtered out during data synchronization and expires after the data synchronization task is complete.
	If the source database is a read-only database, or the source database account that is used to run the data synchronization task does not have the permissions to run the SETEX command, the reported latency may be inaccurate.
	To ensure the stability of data synchronization, we recommend that you increase the value of the `repl-backlog-size` parameter in the redis.conf file.
	We recommend that you do not run the `FLUSHDB` or `FLUSHALL` command in the source database. If you run either command, data inconsistency may occur between the source and destination databases.
	You must enable the append-only file (AOF) logging feature for the source database.
	If an expiration policy is enabled for specific keys in the source database, these keys may not be deleted at the earliest opportunity after they expire. Therefore, the destination database may contain fewer keys than the source database. You can run the INFO command to view the number of keys in the destination database.
	Limits on synchronizing data from a standalone Redis instance to a Redis cluster: Each command can be run only on a single slot in a Redis cluster. If a command operates on multiple keys in the source database and the keys belong to different slots, the following error occurs: `CROSSSLOT Keys in request don't hash to the same slot`. We recommend that each command operate on only one key during data synchronization. This prevents the data synchronization task from being interrupted.
Other limits	During data synchronization, if the number of shards in the source Redis database is increased or decreased, or if you change the database specifications, such as scaling up the memory capacity, you must reconfigure the data synchronization task. To ensure data consistency, we recommend that you clear the data that has been synchronized to the destination Redis database before you reconfigure the data synchronization task.
	During data synchronization, if the endpoint of the source Redis database is changed, you must reconfigure the data synchronization task.
	To ensure compatibility, the version of the destination database must be the same as or later than that of the source database. If the version of the destination database is earlier than that of the source database, database compatibility issues may occur.
	During initial full data synchronization, DTS uses the read and write resources of the source and destination databases. This may increase the loads on the database servers. Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours.
	If the source or destination instance resides in a region outside the Chinese mainland, two-way synchronization is supported only between instances within the same region. For example, if a Tair instance resides in the Japan (Tokyo) region, data can be synchronized only within the Japan (Tokyo) region and cannot be synchronized to or from the Germany (Frankfurt) region in two-way synchronization scenarios.
	During data synchronization, we recommend that you use only DTS to write data to the destination database. This prevents data inconsistency between the source and destination databases.
	If a table is synchronized in both the forward and reverse directions, and both the full data and incremental data of the table are synchronized in the forward direction, DTS synchronizes only the incremental data of the table in the reverse direction.
	If the destination instance is deployed in a cluster architecture and the amount of memory used by a shard in the destination instance reaches the upper limit, or if the available storage space of the destination instance is insufficient, the data synchronization task fails because of an out of memory (OOM) error.
	By default, the maxmemory-policy parameter, which specifies how data is evicted, is set to volatile-lru for ApsaraDB for Redis instances. If the destination instance has insufficient memory, data is evicted and data inconsistency may occur between the source and destination instances. In this case, the data synchronization task does not stop running. To prevent data inconsistency, we recommend that you set maxmemory-policy to noeviction for the destination instance. This way, the data synchronization task fails if the destination instance has insufficient memory, but no data is lost from the destination instance. Note: For more information about data eviction policies, see "How does ApsaraDB for Redis evict data by default?"
	If the transparent data encryption (TDE) feature is enabled for the source or destination instance, you cannot use DTS to synchronize data from the source database to the destination database.
	During data synchronization, if resumable upload fails because of transient connection errors on the source Redis database, full data may be re-synchronized to the destination database. This may cause data inconsistency between the source and destination databases.
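The CROSSSLOT limit described in the table above can be checked before a multi-key command is run. The following Python sketch computes the hash slot of a key in the way the Redis Cluster specification describes (CRC16 of the key, or of its hash tag, modulo 16384) and reports whether a group of keys maps to a single slot. The function names `key_slot` and `same_slot` are illustrative only; they are not part of DTS or of any Redis client library.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in a Redis cluster


def key_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for a key."""
    data = key.encode("utf-8")
    # If the key contains a non-empty hash tag such as {user1000},
    # only the text between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is CRC16-CCITT (XMODEM),
    # the CRC16 variant that Redis Cluster uses for key hashing.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


def same_slot(*keys: str) -> bool:
    """Return True if all keys hash to one slot."""
    return len({key_slot(k) for k in keys}) <= 1


if __name__ == "__main__":
    # Two unrelated keys usually land in different slots.
    print(same_slot("foo", "bar"))
    # Keys that share a hash tag always land in the same slot.
    print(same_slot("{user1000}.following", "{user1000}.followers"))
```

If `same_slot` returns False for the keys of a multi-key command, that command is a candidate for the CROSSSLOT error in a cluster destination. Splitting it into single-key commands, as recommended above, avoids the problem.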

Configure one-way synchronization between Redis databases

Category	Description
Limits on the source database	To ensure synchronization quality, Data Transmission Service (DTS) adds a key prefixed with DTS_REDIS_TIMESTAMP_HEARTBEAT to the source database. This key records the time when data is synchronized to the destination database. If the source database is deployed in a cluster architecture, DTS adds this key to each shard. The key is filtered out during data synchronization and expires after the data synchronization task is complete.
	If the source database is a read-only database, or the source database account that is used to run the data synchronization task does not have the permissions to run the SETEX command, the reported latency may be inaccurate.
	To ensure the stability of data synchronization, we recommend that you increase the value of the `repl-backlog-size` parameter in the redis.conf file.
	You must enable the append-only file (AOF) logging feature for the source database.
	If an expiration policy is enabled for specific keys in the source database, these keys may not be deleted at the earliest opportunity after they expire. Therefore, the destination database may contain fewer keys than the source database. You can run the INFO command to view the number of keys in the destination database.
	Limits on synchronizing data from a standalone Redis instance to a Redis cluster: Each command can be run only on a single slot in a Redis cluster. If a command operates on multiple keys in the source database and the keys belong to different slots, the following error occurs: `CROSSSLOT Keys in request don't hash to the same slot`. We recommend that each command operate on only one key during data synchronization. This prevents the data synchronization task from being interrupted.
	If the source database is a Tair cloud-native persistent memory-optimized instance, you must set the appendonly parameter to yes for the instance.
Other limits	The timeout period for data replication between the master and replica nodes in the source Redis instance is specified by the repl-timeout parameter. If the source instance is a self-managed Redis database, we recommend that you run the `config set repl-timeout 600` command to set this timeout period to 600 seconds. If the source database stores a large amount of data, you can increase the value of the repl-timeout parameter based on your business requirements.
	During data synchronization, if the number of shards in the source Redis database is increased or decreased, or if you change the database specifications, such as scaling up the memory capacity, you must reconfigure the data synchronization task. To ensure data consistency, we recommend that you clear the data that has been synchronized to the destination Redis database before you reconfigure the data synchronization task.
	During data synchronization, if the endpoint of the source Redis database is changed, you must reconfigure the data synchronization task.
	To ensure compatibility, the version of the destination database must be the same as or later than that of the source database. If the version of the destination database is earlier than that of the source database, database compatibility issues may occur.
	During initial full data synchronization, DTS uses the read and write resources of the source and destination databases. This may increase the loads on the database servers. Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours.
	During data synchronization, we recommend that you use only DTS to write data to the destination database. This prevents data inconsistency between the source and destination databases.
	If the destination instance is deployed in a cluster architecture and the amount of memory used by a shard in the destination instance reaches the upper limit, or if the available storage space of the destination instance is insufficient, the data synchronization task fails because of an out of memory (OOM) error.
	By default, the maxmemory-policy parameter, which specifies how data is evicted, is set to volatile-lru for ApsaraDB for Redis instances. If the destination instance has insufficient memory, data is evicted and data inconsistency may occur between the source and destination instances. In this case, the data synchronization task does not stop running. To prevent data inconsistency, we recommend that you set maxmemory-policy to noeviction for the destination instance. This way, the data synchronization task fails if the destination instance has insufficient memory, but no data is lost from the destination instance. Note: For more information about data eviction policies, see "How does ApsaraDB for Redis evict data by default?"
	If the transparent data encryption (TDE) feature is enabled for the source or destination instance, you cannot use DTS to synchronize data from the source database to the destination database.
	During data synchronization, if resumable upload fails because of transient connection errors on the source Redis database, full data may be re-synchronized to the destination database. This may cause data inconsistency between the source and destination databases.
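The parameter recommendations in the table above (appendonly, repl-timeout, and maxmemory-policy) can be collected into a simple pre-check. The following Python sketch is one possible way to do this. It assumes that you have already read the parameter values into plain dictionaries, for example from the output of the Redis `CONFIG GET` command. The function names `check_source` and `check_destination` and the message texts are hypothetical and are not part of DTS.

```python
from typing import Dict, List


def check_source(config: Dict[str, str], self_managed: bool = True) -> List[str]:
    """Return warnings for source database parameters (illustrative helper)."""
    warnings = []
    # AOF logging must be enabled for the source database.
    if config.get("appendonly") != "yes":
        warnings.append("appendonly should be set to yes on the source database")
    # For a self-managed source, a repl-timeout of 600 seconds is recommended.
    if self_managed:
        timeout = int(config.get("repl-timeout", "0"))
        if timeout < 600:
            warnings.append("repl-timeout should be at least 600 on the source database")
    return warnings


def check_destination(config: Dict[str, str]) -> List[str]:
    """Return warnings for destination instance parameters (illustrative helper)."""
    warnings = []
    # noeviction makes the task fail on insufficient memory instead of evicting data.
    if config.get("maxmemory-policy") != "noeviction":
        warnings.append("maxmemory-policy should be set to noeviction on the destination instance")
    return warnings


if __name__ == "__main__":
    source = {"appendonly": "no", "repl-timeout": "60"}
    destination = {"maxmemory-policy": "volatile-lru"}
    for message in check_source(source) + check_destination(destination):
        print(message)
```

An empty list from both functions means that the checked values match the recommendations in this topic. The sketch does not cover the other limits, such as version compatibility or TDE.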