Synchronize a self-managed Redis database to Alibaba Cloud Tair - Data Transmission Service

Data Transmission Service (DTS) supports data synchronization from a self-managed Redis database on an ECS instance to a Tair (Redis-Compatible) instance.

Warning

After you configure a data synchronization task, do not change the architecture of the source or destination database. Otherwise, the data synchronization task will fail.

Prerequisites

Create a source self-managed Redis instance and a destination Tair (Redis-Compatible) instance. For more information about how to create a Tair (Redis-Compatible) instance, see Step 1: Create an instance.
Note
- DTS currently supports only Tair (Redis-Compatible) instances that use the direct connection mode. For more information about the supported versions, see Synchronization solutions.
- DTS also supports one-way synchronization between Tair (Redis-Compatible) instances. The configuration process is similar to synchronizing data from a self-managed Redis database to a Tair (Redis-Compatible) instance. For more information, see the instructions in this topic.
The storage space of the destination Tair (Redis-Compatible) instance must be larger than the amount of data in the source self-managed Redis database.
If the source Redis database is deployed in a cluster architecture, each cluster node must be able to execute the PSYNC command and use the same connection password.
The replication timeout parameter, repl-timeout, between the primary and replica nodes of the source Redis instance is 60 seconds by default. We recommend that you run the config set repl-timeout 600 command to increase the timeout to 600 seconds. If the source database contains a large amount of data, you can increase the value of the repl-timeout parameter as needed.
The source account must have the PSYNC and SYNC permissions.
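
As a rough illustration, the storage and repl-timeout prerequisites above can be checked against the text returned by the Redis INFO command. The sketch below is not part of DTS. The function names, the destination capacity argument, and the sample values are assumptions made for this example, and `used_memory` is used only as an approximation of the amount of data in the source database.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse the `field:value` lines returned by the Redis INFO command."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers such as "# Memory"
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def check_prerequisites(source_info: str, dest_capacity_bytes: int,
                        repl_timeout_seconds: int) -> list[str]:
    """Return a list of warnings; an empty list means both checks passed."""
    warnings = []
    used = int(parse_info(source_info)["used_memory"])
    if dest_capacity_bytes <= used:
        warnings.append("destination storage must be larger than the source data")
    if repl_timeout_seconds < 600:
        warnings.append("consider running: config set repl-timeout 600")
    return warnings


# Hypothetical sample: 1 MiB used on the source, 2 MiB destination, default timeout.
sample_info = "# Memory\r\nused_memory:1048576\r\nused_memory_human:1.00M\r\n"
print(check_prerequisites(sample_info, 2 * 1024 * 1024, 60))
```

With the sample values, only the repl-timeout recommendation is reported, because the destination capacity exceeds the source memory usage.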

Precautions

Source database limits
- Do not run the `FLUSHDB` or `FLUSHALL` command in the source database. Otherwise, data inconsistency occurs between the source and destination databases.
- If the `bind` parameter is configured in the redis.conf file of the source database, set this parameter to the private IP address of the ECS instance to ensure that DTS can connect to the source database.
- To ensure synchronization quality, DTS inserts a key with the prefix `DTS_REDIS_TIMESTAMP_HEARTBEAT` into the source database to record update timestamps. If the source database uses a cluster architecture, DTS inserts this key into each shard. This key is filtered out during synchronization and expires after the sync task ends. If the source database is a read-only instance or the DTS account does not have write (SETEX) permission, the reported latency may be inaccurate.
- To ensure the stability of the synchronization link, we recommend that you increase the value of the `repl-backlog-size` parameter in the redis.conf file of the source Redis database.
- If an expiration policy is enabled for specific keys in the source database, these keys may not be deleted at the earliest opportunity after they expire. Therefore, the number of keys in the destination database may be less than that in the source database. You can run the INFO command to view the number of keys in the destination database.
  Note: The number of keys that do not have the expiration policy enabled or have not expired is the same between the source and destination databases.
- Limits on synchronizing data from a Basic Edition Redis instance to a cluster architecture Redis instance: A cluster allows a command to operate on only a single slot. If you run a command that operates on multiple keys in different slots, the following error occurs: `CROSSSLOT Keys in request don't hash to the same slot`. During DTS synchronization, run only single-key commands to prevent the synchronization link from breaking.

Other limits
- We recommend that the source and destination databases have the same version, or that you synchronize data from an earlier version to a later version to ensure compatibility. If you synchronize data from a later version to an earlier version, database compatibility issues may occur.
- If the source Redis instance is scaled in or out (shards are added or removed) or its specifications are changed (memory is scaled up) during synchronization, you must reconfigure the task. To ensure data consistency, clear the data in the destination Redis instance before you reconfigure the task.
- If the source or destination instance is a self-managed Redis instance (the Access Method is not Alibaba Cloud Instance) and its endpoint changes during synchronization, for example, due to instance migration or a primary/secondary failover, the sync task may retry, experience latency, fail, or even cause data inconsistency. Check the status of the sync task promptly. If the DTS task retries, experiences latency, fails, or becomes abnormal, reconfigure the sync task.
- If an instance migration, such as a primary/secondary failover, is triggered on the destination Redis instance, data might be written only to the memory and not to the secondary database. This can cause data loss.
- DTS uses the resources of the source and destination instances during initial full data synchronization. This may increase the loads on the database servers. If you synchronize a large volume of data or if the server specifications cannot meet your requirements, database services may become unavailable. Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination instances. We recommend that you synchronize data during off-peak hours.
- If you configure data synchronization between Redis clusters, do not run the `FLUSHDB` or `FLUSHALL` command in the source cluster. Otherwise, data inconsistency occurs between the source and destination databases.
- If the destination database is out of memory and data eviction is triggered, data inconsistency may occur between the source and destination databases. This is because the default data eviction policy (maxmemory-policy) of Tair (Redis-Compatible) is volatile-lru. This does not affect the normal operation of the task. To prevent this, we recommend that you set the data eviction policy of the destination database to noeviction. With this policy, if the destination database is out of memory, data writes fail and the task fails, but no data is lost in the destination database due to eviction.
  Note: For more information about data eviction policies, see Redis data eviction policies.
- During DTS synchronization, do not allow data to be written to the destination database from sources other than DTS. Otherwise, data inconsistency occurs between the source and destination databases.
- If the destination instance is a cluster-edition instance and a shard reaches its memory limit, or if the destination instance has insufficient storage space, the DTS task fails due to an out of memory (OOM) error.
- If Transparent Data Encryption (TDE) is enabled for the destination instance, you cannot use DTS to synchronize data.
- If any of the following situations occur during data synchronization, the full data might be synchronized to the destination instance again. This can cause data inconsistency.
  A transient connection occurs on the source or destination Redis instance, which causes the task to fail to resume from a breakpoint.
  A primary/secondary failover occurs on the source or destination Redis instance.
  The endpoint of the source or destination Redis instance changes.
  The synchronization objects of the DTS instance are modified.
- If a Tair (Redis OSS-compatible) instance has Transport Layer Security (TLS) encryption enabled, you must connect to DTS using an SSL-encrypted connection. TLSv1.3 is not supported. You cannot connect a Tair (Redis OSS-compatible) instance that has SSL enabled to DTS as an Alibaba Cloud Instance.
- If a sync instance includes both full and incremental data synchronization tasks, restarting the instance may cause DTS to run the full and incremental tasks again.
- If an instance fails, DTS helpdesk will try to recover the instance within 8 hours. During the recovery process, operations such as restarting the instance and adjusting parameters may be performed.
  Note: When parameters are adjusted, only the parameters of the DTS instance are modified. The parameters of the database are not modified. The parameters that may be modified include but are not limited to those described in Modify instance parameters.
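
To make the CROSSSLOT limit more concrete, the sketch below computes the hash slot that a Redis cluster assigns to a key (CRC16/XMODEM of the key modulo 16384, honoring `{hash tag}` sections) and checks whether several keys land in the same slot. It is an illustration only, not something DTS exposes. The function names and the sample keys are assumptions made for this example.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in a Redis cluster


def key_slot(key: str) -> int:
    """Compute the cluster hash slot of a key (CRC16/XMODEM mod 16384)."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {tag} narrows the part of the key that is hashed.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the XMODEM CRC16 variant.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


def same_slot(*keys: str) -> bool:
    """True if a multi-key command on these keys would stay within one slot."""
    return len({key_slot(k) for k in keys}) <= 1


# Hypothetical keys: a shared hash tag forces both keys into the same slot.
print(same_slot("{user:1}:name", "{user:1}:email"))
```

Keys that share a hash tag always map to one slot, which is why the tagged pair above passes the check. Keys without a common tag usually map to different slots, so multi-key commands on them are the ones at risk of the CROSSSLOT error.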

Billing

Synchronization type	Task configuration fee
Full data synchronization	Free of charge.
Incremental data synchronization	Charged. For more information, see Billing overview.

Supported synchronization topologies

One-way one-to-one synchronization
One-way one-to-many synchronization
One-way cascade synchronization

For more information, see Synchronization topologies.

Operations that can be synchronized

APPEND
BITOP, BLPOP, BRPOP, and BRPOPLPUSH
DECR, DECRBY, and DEL
EVAL, EVALSHA, EXEC, EXPIRE, and EXPIREAT
GEOADD and GETSET
HDEL, HINCRBY, HINCRBYFLOAT, HMSET, HSET, and HSETNX
INCR, INCRBY, and INCRBYFLOAT
LINSERT, LPOP, LPUSH, LPUSHX, LREM, LSET, and LTRIM
MOVE, MSET, MSETNX, and MULTI
PERSIST, PEXPIRE, PEXPIREAT, PFADD, PFMERGE, and PSETEX
RENAME, RENAMENX, RESTORE, RPOP, RPOPLPUSH, RPUSH, and RPUSHX
SADD, SDIFFSTORE, SELECT, SET, SETBIT, SETEX, SETNX, SETRANGE, SINTERSTORE, SMOVE, SPOP, SREM, and SUNIONSTORE
ZADD, ZINCRBY, ZINTERSTORE, ZREM, ZREMRANGEBYLEX, ZUNIONSTORE, ZREMRANGEBYRANK, and ZREMRANGEBYSCORE
SWAPDB and UNLINK (supported only if the engine version of the source instance is 4.0)
XADD, XCLAIM, XDEL, XAUTOCLAIM, XGROUP CREATECONSUMER, and XTRIM

Note

PUBLISH operations cannot be synchronized.
If you run the EVAL or EVALSHA command to call Lua scripts, DTS cannot identify whether these Lua scripts are executed on the destination instance. This is because the destination instance does not explicitly return the execution results of Lua scripts during incremental data synchronization.
When DTS runs the SYNC or PSYNC command to transfer data of the LIST type, DTS does not clear the existing data in the destination instance. As a result, the destination instance may contain duplicate data records.
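
The command list above can be turned into a simple lookup, for example to review which write commands issued by an application fall outside the synchronized set. This is a minimal sketch: the helper names and the sample command lines are assumptions, and read-only commands such as GET are also absent from the list, so the check is meaningful only for commands that modify data. SWAPDB and UNLINK are included even though the list attaches an engine version condition to them.

```python
SYNCED_COMMANDS = frozenset({
    "APPEND", "BITOP", "BLPOP", "BRPOP", "BRPOPLPUSH", "DECR", "DECRBY", "DEL",
    "EVAL", "EVALSHA", "EXEC", "EXPIRE", "EXPIREAT", "GEOADD", "GETSET",
    "HDEL", "HINCRBY", "HINCRBYFLOAT", "HMSET", "HSET", "HSETNX",
    "INCR", "INCRBY", "INCRBYFLOAT",
    "LINSERT", "LPOP", "LPUSH", "LPUSHX", "LREM", "LSET", "LTRIM",
    "MOVE", "MSET", "MSETNX", "MULTI",
    "PERSIST", "PEXPIRE", "PEXPIREAT", "PFADD", "PFMERGE", "PSETEX",
    "RENAME", "RENAMENX", "RESTORE", "RPOP", "RPOPLPUSH", "RPUSH", "RPUSHX",
    "SADD", "SDIFFSTORE", "SELECT", "SET", "SETBIT", "SETEX", "SETNX",
    "SETRANGE", "SINTERSTORE", "SMOVE", "SPOP", "SREM", "SUNIONSTORE",
    "ZADD", "ZINCRBY", "ZINTERSTORE", "ZREM", "ZREMRANGEBYLEX", "ZUNIONSTORE",
    "ZREMRANGEBYRANK", "ZREMRANGEBYSCORE",
    "SWAPDB", "UNLINK",  # listed with an engine version condition
    "XADD", "XCLAIM", "XDEL", "XAUTOCLAIM", "XGROUP CREATECONSUMER", "XTRIM",
})


def command_name(command_line: str) -> str:
    """Return the upper-case command name, including two-word forms."""
    tokens = command_line.upper().split()
    if not tokens:
        return ""
    two_words = " ".join(tokens[:2])
    return two_words if two_words in SYNCED_COMMANDS else tokens[0]


def unsynced(command_lines: list[str]) -> list[str]:
    """Return the non-blank command lines whose command is not in the list."""
    return [line for line in command_lines
            if line.strip() and command_name(line) not in SYNCED_COMMANDS]


# Hypothetical sample of commands issued by an application.
print(unsynced(["SET a 1", "publish news hello", "xgroup createconsumer s g c"]))
```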

Procedure

Use one of the following methods to go to the Data Synchronization page and select the region in which the data synchronization instance resides.
DTS console
1. Log on to the DTS console.
2. In the left-side navigation pane, click Data Synchronization.
3. In the upper-left corner of the page, select the region in which the data synchronization task resides.
DMS console
Note
The actual operations may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.
1. Log on to the DMS console.
2. In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.
3. From the drop-down list to the right of Data Synchronization Tasks, select the region in which the data synchronization instance resides.
Click Create Task to go to the task configuration page.

Configure the source and destination databases. The following table describes the parameters.

Category	Configuration	Description
None	Task Name	The name of the DTS task. DTS automatically generates a task name. We recommend that you specify a descriptive name that makes it easy to identify the task. You do not need to specify a unique task name.
Source Database	Select Existing Connection	If you use a database instance that is registered with DTS, select the instance from the drop-down list. DTS automatically populates the following database parameters for the instance. For more information, see Manage database connections. Note In the DMS console, you can select the database instance from the Select a DMS database instance drop-down list. If you have not registered the instance with DTS, or you do not want to use a registered instance, you must configure the following database information.
	Database Type	Select Tair/Redis.
	Access Method	Select Self-managed Database On ECS.
	Instance Region	Select the region of the ECS instance where the source Redis database resides.
	Cross-Account (Alibaba Cloud)	In this example, data is synchronized within the same Alibaba Cloud account. Select No.
	ECS Instance ID	Select the ID of the ECS instance where the source Redis database resides. Note If the source Redis database is deployed in a cluster architecture, select the ID of the ECS instance where a master node resides. You also need to manually add the IP address CIDR blocks of DTS servers in the corresponding region to the security rules of each of the other ECS instances. For more information, see Create a security group, Associate a security group with an instance (primary ENI), and Add the CIDR blocks of DTS servers to a whitelist.
	Instance Mode	Select Basic Edition or Cluster based on the architecture of the source Redis database.
	Port	Enter the service port of the source Redis database. The default value is 6379. Note If the source Redis database is deployed in a cluster architecture, enter the service port of a master node.
	Authentication Method	Select an authentication method based on your business requirements. In this example, select Password Login. Note Only Redis databases of version 6.0 or later support Account + Password Login. If you select Secret-free login, make sure that you enable the password-free access feature in the Redis database. For information about how to enable password-free access for a Tair (Redis OSS-Compatible) instance, see Enable password-free access.
	Database Password	Enter the password for connecting to the source Redis database. Note This parameter is optional. If no password is set, you can leave it empty. The database password is in the <user>:<password> format. For example, if the custom username for the Redis instance is admin and the password is Rp829dlwa, enter admin:Rp829dlwa.
	Encryption	Specifies whether to encrypt the connection to the source database. Select Non-encrypted or SSL-encrypted based on your business requirements. Note If Access Method is not set to Alibaba Cloud Instance and you select SSL-encrypted for the self-managed Redis database, you must upload a CA Certificate and enter a CA Key.
Destination Database	Select Existing Connection	If you use a database instance that is registered with DTS, select the instance from the drop-down list. DTS automatically populates the following database parameters for the instance. For more information, see Manage database connections. Note In the DMS console, you can select the database instance from the Select a DMS database instance drop-down list. If you have not registered the instance with DTS, or you do not want to use a registered instance, you must configure the following database information.
	Database Type	Select Tair/Redis.
	Access Method	Select Alibaba Cloud Instance.
	Instance Region	Select the region where the destination Tair (Redis-Compatible) instance resides.
	Replicate Data Across Alibaba Cloud Accounts	In this example, data is synchronized within the same Alibaba Cloud account. Select No.
	Instance ID	Select the ID of the destination Tair (Redis-Compatible) instance.
	Authentication Method	Select an authentication method based on your business requirements. In this example, select Password Login. Note Only Redis databases of version 6.0 or later support Account + Password Login. If you select Secret-free login, make sure that you enable the password-free access feature in the Redis database. For information about how to enable password-free access for a Tair (Redis OSS-Compatible) instance, see Enable password-free access.
	Database Password	Enter the password for connecting to the destination Tair (Redis-Compatible) instance. Note The database password is in the <user>:<password> format. For example, if the custom username for the Tair instance is admin and the password is Rp829dlwa, enter admin:Rp829dlwa.
	Encryption	Specifies whether to encrypt the connection to the destination database. Select Non-encrypted or SSL-encrypted based on your business requirements. Note If Access Method is not set to Alibaba Cloud Instance and you select SSL-encrypted for a self-managed Redis database, you must upload a CA Certificate and enter a CA Key.
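
The `<user>:<password>` format described for the Database Password fields can be sketched as a small helper. The function name is hypothetical, and returning the bare password when no custom username exists is an assumption of this sketch rather than a documented rule.

```python
def dts_database_password(password: str = "", user: str = "") -> str:
    """Build the value for the Database Password field."""
    if user and not password:
        raise ValueError("a custom username requires a password")
    if user:
        return f"{user}:{password}"  # documented <user>:<password> format
    # Assumption: without a custom username, only the password (or nothing) is entered.
    return password


# Values taken from the example in the parameter description.
print(dts_database_password("Rp829dlwa", user="admin"))
```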

In the lower part of the page, click Test Connectivity and Proceed.
Note
- Make sure that the CIDR blocks of DTS servers can be automatically or manually added to the security settings of the source and destination databases to allow access from DTS servers. For more information, see Add DTS server IP addresses to a whitelist.
- If the source or destination database is a self-managed database and its Access Method is not set to Alibaba Cloud Instance, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.

Configure the objects to be synchronized.

In the Configure Objects step, configure the objects that you want to synchronize.

Configuration	Description
Synchronization Types	Full Data Synchronization + Incremental Data Synchronization is selected by default.
Processing Mode of Conflicting Tables	Precheck and Report Errors: checks whether data exists in the destination database. If no data exists in the destination database, the precheck is passed. If data exists in the destination database, an error is returned during the precheck, and the data synchronization instance cannot be started. Ignore Errors and Proceed: skips the "Check the existence of objects in the destination database" check item. Warning If you select Ignore Errors and Proceed, data loss may occur in the destination database because data records in the source database overwrite the data records that have the same keys in the destination database. Proceed with caution.
Source Objects	Select one or more objects from the Source Objects section and click the icon to add the objects to the Selected Objects section. Note You can select only databases as the objects to be synchronized. You cannot select keys as the objects to be synchronized.
Selected Objects	To select the destination database (DB 0 to DB 255) to which data is synchronized, or to filter the data to be synchronized by prefix, use the object name mapping feature or the filtering feature. In the Selected Objects section, right-click the database that you want to synchronize. In the Edit Schema dialog box, configure the parameters. For more information, see Map object names and Specify filter conditions. Note You cannot map multiple object names at a time.

Click Next: Advanced Settings to configure advanced settings.

Configuration	Description
Dedicated Cluster for Task Scheduling	By default, DTS schedules the task to the shared cluster if you do not specify a dedicated cluster. If you want to improve the stability of data synchronization instances, purchase a dedicated cluster. For more information, see What is a DTS dedicated cluster.
Retry Time for Failed Connections	The retry time range for failed connections. If the source or destination database fails to be connected after the data synchronization task is started, DTS immediately retries a connection within the time range. Valid values: 10 to 1440. Unit: minutes. Default value: 720. We recommend that you set this parameter to a value greater than 30. If DTS reconnects to the source and destination databases within the specified time range, DTS resumes the data synchronization task. Otherwise, the data synchronization task fails. Note If you specify different retry time ranges for multiple data synchronization tasks that have the same source or destination database, the shortest retry time range takes precedence. When DTS retries a connection, you are charged for the DTS instance. We recommend that you specify the retry time range based on your business requirements. You can also release the DTS instance at your earliest opportunity after the source and destination instances are released.
Retry Time for Other Issues	The retry time range for other issues. For example, if the DDL or DML operations fail to be performed after the data synchronization task is started, DTS immediately retries the operations within the time range. Valid values: 1 to 1440. Unit: minutes. Default value: 10. We recommend that you set this parameter to a value greater than 10. If the failed operations are successfully performed within the specified time range, DTS resumes the data synchronization task. Otherwise, the data synchronization task fails. Important The value of the Retry Time for Other Issues parameter must be smaller than the value of the Retry Time for Failed Connections parameter.
Enable Throttling for Full Data Synchronization	During full data synchronization, DTS uses the read and write resources of the source and destination databases. This may increase the load on the database servers. You can configure the Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s) parameters for full data synchronization tasks to reduce the load on the destination database server. Note This configuration item is available only when Synchronization Types includes Full Data Synchronization.
Enable Throttling for Incremental Data Synchronization	Specifies whether to enable throttling for incremental data synchronization. You can enable throttling for incremental data synchronization based on your business requirements. To configure throttling, you must configure the RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s) parameters. This reduces the load on the destination database server.
Environment Tag	You can select an environment tag to identify the instance. In this example, no selection is required.
Extend Expiration Time of Destination Database Key	Set an extended expiration time for keys that are synchronized from the source database to the destination database. To ensure data consistency, we recommend that you set an extended expiration time for keys if any of the following commands, which set expiration times, are used: `EXPIRE key seconds`, `PEXPIRE key milliseconds`, `EXPIREAT key timestamp`, `PEXPIREAT key timestampMs`. Note In scenarios involving distributed locks, this may prevent the locks from being released promptly.
Use Slave Node	When the Instance Mode of the source self-managed Redis is Cluster, you can choose to read data from the primary or replica nodes. The default value is No, which means data is read from the primary node.
Configure ETL	Specifies whether to enable the extract, transform, and load (ETL) feature. For more information, see What is ETL? Valid values: Yes: configures the ETL feature. You can enter data processing statements in the code editor. For more information, see Configure ETL in a data migration or data synchronization task. No: does not configure the ETL feature.
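
The value ranges and the ordering rule stated for the two retry parameters can be summarized in a small validation sketch. The function and argument names are hypothetical. The ranges, defaults, and the "must be smaller" rule come from the table above.

```python
def validate_retry_settings(failed_connections_min: int = 720,
                            other_issues_min: int = 10) -> list[str]:
    """Return the rule violations for the two retry-time settings (minutes)."""
    errors = []
    if not 10 <= failed_connections_min <= 1440:
        errors.append("Retry Time for Failed Connections must be 10 to 1440")
    if not 1 <= other_issues_min <= 1440:
        errors.append("Retry Time for Other Issues must be 1 to 1440")
    if other_issues_min >= failed_connections_min:
        errors.append("Retry Time for Other Issues must be smaller than "
                      "Retry Time for Failed Connections")
    return errors


print(validate_retry_settings())        # documented defaults
print(validate_retry_settings(10, 10))  # equal values break the ordering rule
```

The first call uses the documented defaults and reports no violations. The second call reports one violation because the two values are equal.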

Click Next Step: Data Verification to configure data verification.
For more information about how to use the data verification feature, see Configure a data verification task.

Save the task settings and run a precheck.
- To view the parameters to be specified when you call the relevant API operation to configure the DTS task, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
- If you do not need to view or have viewed the parameters, click Next: Save Task Settings and Precheck in the lower part of the page.
Note
- Before you can start the data synchronization task, DTS performs a precheck. You can start the data synchronization task only after the task passes the precheck.
- If the data synchronization task fails the precheck, click View Details next to each failed item. After you analyze the causes based on the check results, troubleshoot the issues. Then, rerun the precheck.
- If an alert is triggered for an item during the precheck:
  If an alert item cannot be ignored, click View Details next to the failed item and troubleshoot the issue. Then, run a precheck again.
  If an alert item can be ignored, click Confirm Alert Details. In the View Details dialog box, click Ignore. In the message that appears, click OK. Then, click Precheck Again to run a precheck again. If you ignore the alert item, data inconsistency may occur, and your business may be exposed to potential risks.

Purchase the instance.

Wait until the Success Rate becomes 100%. Then, click Next: Purchase Instance.

On the buy page, configure the Billing Method and Instance Class parameters for the data synchronization task. The following table describes the parameters.

Section	Parameter	Description
New Instance Class	Billing Method	Subscription: You pay for a subscription when you create a data synchronization instance. The subscription billing method is more cost-effective than the pay-as-you-go billing method for long-term use. Pay-as-you-go: A pay-as-you-go instance is billed on an hourly basis. The pay-as-you-go billing method is suitable for short-term use. If you no longer require a pay-as-you-go data synchronization instance, you can release the instance to reduce costs.
	Resource Group Settings	The resource group to which the data synchronization instance belongs. Default value: default resource group. For more information, see What is Resource Management?
	Instance Class	DTS provides instance classes that vary in synchronization speed. You can select an instance class based on your business requirements. For more information, see Instance classes of data synchronization instances.
	Subscription Duration	If you select the subscription billing method, specify the subscription duration and the number of data synchronization instances that you want to create. The subscription duration can be one to nine months, one year, two years, three years, or five years. Note This parameter is available only if you select the Subscription billing method.

Read and select Data Transmission Service (Pay-as-you-go) Service Terms.
Click Purchase And Start, and then click OK in the dialog box that appears.
You can view the task progress on the data synchronization page.
Note
If the DTS instance you configure includes both full and incremental tasks (Synchronization Types includes both Full Data Synchronization and Incremental Data Synchronization), they are displayed as a single Incremental Data Synchronization task on the synchronization task list page.