DataWorks Data Integration allows you to use Redis Writer to write data to Redis. This topic describes how to perform batch writes to a Redis data source by using DataWorks.
Limitations
Data import tasks can be run on serverless resource groups (recommended) and exclusive resource groups for Data Integration.
When you use Redis Writer to write data, rerunning a synchronization task is not an idempotent operation if the value is a List. Therefore, if the value is a List, you must manually clear the corresponding data from Redis before you rerun the task.
Important: Redis Writer does not currently support Bloom filter configurations. To handle duplicate data, add a node, such as a Shell, Python, or PyODPS node, before or after the synchronization node in your workflow to perform deduplication.
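The rerun limitation for List values can be illustrated with a small in-memory model. The sketch below is not Redis Writer code: `FakeRedis`, `run_sync`, and the append-style `rpush` write are hypothetical stand-ins, assumed here only to show why replaying a list write duplicates elements, why replaying a string write does not, and how clearing the affected keys first restores a clean rerun.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis server (illustration only)."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        # Overwrites the previous value, so replaying the call changes nothing.
        self.data[key] = value

    def rpush(self, key, value):
        # Appends to the list, so replaying the call adds a duplicate element.
        self.data.setdefault(key, []).append(value)

    def delete(self, *keys):
        for key in keys:
            self.data.pop(key, None)


def run_sync(client, rows, value_type):
    """Write (key, value) rows once, as a single synchronization run would."""
    for key, value in rows:
        if value_type == "string":
            client.set(key, value)
        else:  # treated as a list write in this sketch
            client.rpush(key, value)


rows = [("user:1", "a"), ("user:1", "b")]

strings = FakeRedis()
run_sync(strings, rows, "string")
run_sync(strings, rows, "string")  # rerun: same final state

lists = FakeRedis()
run_sync(lists, rows, "list")
run_sync(lists, rows, "list")  # rerun: elements are duplicated

cleaned = FakeRedis()
run_sync(cleaned, rows, "list")
cleaned.delete(*{key for key, _ in rows})  # manual cleanup before the rerun
run_sync(cleaned, rows, "list")
```

In a real workflow, the cleanup step would run against the actual Redis instance, for example from a Python or Shell node placed before the synchronization node.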
Supported data types
Redis supports a rich set of value data types, including String, List, Set, ZSet (sorted set), and Hash. For more information about Redis, see redis.io.
Data synchronization task development
For the entry point and the procedure for configuring a synchronization task, see Configure a task in the codeless UI and Configure a task in the code editor.
For a complete list of parameters and code samples for the code editor, see Appendix: Code samples and parameters.
Appendix: Code samples and parameters
Configure a batch synchronization task by using the code editor
If you want to configure a batch synchronization task by using the code editor, you must configure the related parameters in the script based on the unified script format requirements. For more information, see Use the Code Editor. The following information describes the parameters that you must configure for data sources when you configure a batch synchronization task by using the code editor.
Writer code sample
The following code is a sample data synchronization task that reads data from a MySQL database and writes it to Redis. It shows the code for both the MySQL Reader and the Redis Writer.
{
"type":"job",
"version":"2.0", // The version number.
"steps":[
{ // The following is a code sample for the Reader. For more information about Reader parameters, see the documentation for the corresponding Reader plugin.
"stepType":"mysql",
"parameter": {
"envType": 0,
"datasource": "xc_mysql_demo2",
"column": [
"id",
"value",
"table"
],
"connection": [
{
"datasource": "xc_mysql_demo2",
"table": []
}
],
"where": "",
"splitPk": "",
"encoding": "UTF-8"
},
"name":"Reader",
"category":"reader"
},
{// The following is a code sample for the Writer.
"stepType":"redis", // The plugin name for Redis Writer. Set this parameter to redis.
"parameter":{ // The following section describes the main parameters of Redis Writer.
"expireTime":{ // The cache expiration time for Redis values. You can set this parameter to the seconds type or the unixtime type."seconds":"1000"
},
"keyFieldDelimiter":"u0001", // The delimiter for Redis keys.
"dateFormat":"yyyy-MM-dd HH:mm:ss",// The date format used when data is written to Redis.
"datasource":"xc_mysql_demo2", // The data source name. This must be the same as the name of the data source you added.
"envType": 0, // The environment type. Development environment: 1. Production environment: 0.
"writeMode":{ // The write mode.
"type":"string" // The value type.
"mode":"set", // The write mode for a specific value type.
"valueFieldDelimiter":"u0001", // The delimiter between values.
},
"keyIndexes":[0,1], // Used for mapping from the source to Redis. Specifies the source columns to be used as the key (the first column starts from 0). For example, if the first and second columns of the source are combined as the Redis key, set this parameter to [0,1].
"batchSize":"1000" // The number of records in each batch.
"column": [ // For Redis type string with set operation: if this column is not configured, the value format is a delimiter-separated string (CSV format. Assuming ID=1, name="John", age=18, sex=male, the Redis value example: "18::male"). If column is configured in the following format, the Redis value will be written in JSON format. Assuming ID=1, name="John", age=18, sex=male, the Redis value example: {"id":1,"name":"John","age":18,"sex":"male"}
{
"name": "id",
"index": "0"
},
{
"name": "name",
"index": "1"
},
{
"name": "age",
"index": "2"
},
{
"name": "sex",
"index": "3"
}
]
},
"name":"Writer",
"category":"writer"
}
],
"setting":{
"errorLimit":{
"record":"0" // The error count.
},
"speed":{
"throttle":true,// When throttle is set to false, the mbps parameter does not take effect, which means throttling is disabled. When throttle is set to true, throttling is enabled.
"concurrent":1, // The concurrency of the job.
"mbps":"12"// The throttling rate. 1 mbps = 1 MB/s.
}
},
"order":{
"hops":[
{
"from":"Reader",
"to":"Writer"
}
]
}
}
Writer parameters
Parameter | Description | Required | Default value |
expireTime | The cache expiration time for Redis values, in seconds. If this parameter is not specified, the default value 0 is used, which indicates that the data never expires. You can configure expireTime in one of two ways: the seconds type or the unixtime type. | No | 0 (0 indicates that the data never expires) |
keyFieldDelimiter | The delimiter for Redis keys. For example, key=key1\u0001id. This parameter is required if multiple keys need to be concatenated. If only one key is used, you can skip this parameter. | No | \u0001 |
dateFormat | The date format used when data is written to Redis: yyyy-MM-dd HH:mm:ss. | No | N/A |
datasource | The data source name. The value must be the same as the name of the data source you added. | Yes | N/A |
selectDatabase | The Redis database to write to. | No | Database 0 by default |
writeMode | Redis Writer supports the following five value types for writing data to Redis: String, List, Set, ZSet (sorted set), and Hash. The writeMode configuration varies slightly depending on the value type. For more information, see writeMode parameter description below. Note: When you configure Redis Writer, you must set writeMode to one of the five supported data types, and only one type can be specified. If you do not configure this parameter, writeMode uses the default value string. | No | string |
keyIndexes | Specifies the indexes of the source columns to be used as the key. Column indexes start from 0 (the first column has an index of 0, the second column has an index of 1, and so on). Note: After you configure keyIndexes, Redis Writer uses the remaining columns as the value. To use only some columns as the key and some others as the value, you do not need to synchronize all columns. Specify the column parameter in the Reader plugin to filter columns. | Yes | N/A |
batchSize | The number of records in each batch. This parameter can significantly reduce the number of network interactions between the data synchronization system and Redis, and improve overall throughput. If this value is set too large, the data synchronization process may encounter an out-of-memory (OOM) error. | No | 1,000 |
timeout | The timeout for writing data to Redis, in milliseconds. | No | 30,000 |
redisMode | The running mode of Redis. Valid values:
Note: Supported on serverless resource groups (recommended) and exclusive resource groups for Data Integration. | No | N/A |
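column | The column configuration for writing data to Redis. For the string type with the set mode: if column is not configured, the value is written as a delimiter-separated string (CSV format), for example 18::male. If column is configured as a list of name and index entries, the value is written in JSON format, for example {"id":1,"name":"John","age":18,"sex":"male"}. | No | N/A |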
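The interaction between keyIndexes, the two delimiters, and column can be sketched in a few lines. This is an illustration of the behavior described in the table, not the plugin's implementation: `build_string_entry` is a hypothetical helper, and the `::` value delimiter is borrowed from the example in the code sample.

```python
import json


def build_string_entry(row, key_indexes, key_delim="\u0001",
                       value_delim="\u0001", column=None):
    """Return the (key, value) pair for one source row (illustrative sketch)."""
    # The columns listed in key_indexes are joined to form the Redis key.
    key = key_delim.join(str(row[i]) for i in key_indexes)
    if column is None:
        # No column setting: the remaining columns become a delimited string.
        rest = [str(v) for i, v in enumerate(row) if i not in key_indexes]
        value = value_delim.join(rest)
    else:
        # column setting: each configured name maps to the column at its index.
        obj = {c["name"]: row[int(c["index"])] for c in column}
        value = json.dumps(obj, separators=(",", ":"))
    return key, value


row = [1, "John", 18, "male"]
column = [
    {"name": "id", "index": "0"},
    {"name": "name", "index": "1"},
    {"name": "age", "index": "2"},
    {"name": "sex", "index": "3"},
]

csv_key, csv_value = build_string_entry(row, [0, 1], value_delim="::")
json_key, json_value = build_string_entry(row, [0, 1], column=column)
```

With keyIndexes set to [0,1], both calls produce a key built from the id and name columns. Only the value format differs, depending on whether column is configured.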
writeMode parameter description
When you configure Redis Writer, you must set writeMode to one of the five supported data types, and only one type can be specified. If you do not configure this parameter, writeMode uses the default value string.
Value type | type parameter (required) | mode parameter (required) | valueFieldDelimiter parameter (optional) | writeMode configuration example |
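String | Set type to string. | mode is the write mode parameter. When the value is a string, set mode to set, as in the Writer code sample. | valueFieldDelimiter is the delimiter between values. | "writeMode":{"type":"string","mode":"set","valueFieldDelimiter":"\u0001"} |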
List | Set type to | mode is the write mode parameter. When the value is a list, the following options are available:
| | |
Set | Set type to | mode is the write mode parameter. When the value is a set:
| | |
ZSet (sorted set) | Set type to zset. | mode is the write mode parameter. When the value is a ZSet (sorted set):
| This parameter does not need to be configured. | Note: When the value type is zset, each row of source data must contain exactly one score-value pair in addition to the key. The score must precede the value so that Redis Writer can identify which column is the score and which is the value. |
Hash | Set type to hash. | mode is the write mode parameter. When the value is a hash:
| This parameter does not need to be configured. | Note: When the value type is hash, each row of source data must contain exactly one attribute-value pair in addition to the key. The attribute must precede the value so that Redis Writer can identify which column is the attribute and which is the value. |
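The row-format rule for zset and hash values can be checked before a task runs. The following sketch assumes rows arrive as Python lists. `split_pair_row` is a hypothetical helper, not part of Redis Writer, and it only mirrors the rule stated in the notes above: besides the key columns, a row carries exactly one pair, with the score or attribute first.

```python
def split_pair_row(row, key_indexes, value_type):
    """Split one source row into (key columns, first field, second field)."""
    key_cols = [row[i] for i in key_indexes]
    rest = [v for i, v in enumerate(row) if i not in key_indexes]
    if len(rest) != 2:
        raise ValueError(
            f"{value_type} rows need exactly one pair besides the key, "
            f"got {len(rest)} column(s)"
        )
    first, second = rest
    if value_type == "zset":
        # The score comes first; it must be numeric for a sorted set.
        return key_cols, float(first), second
    # hash: the attribute (field name) comes first, then its value.
    return key_cols, str(first), second


zset_row = ["leaderboard", 97.5, "alice"]
hash_row = ["user:1", "age", 18]
zset_parts = split_pair_row(zset_row, [0], "zset")
hash_parts = split_pair_row(hash_row, [0], "hash")
```

A row with extra columns, such as `["k", 1, "a", "b"]` with a single key column, raises `ValueError` in this sketch. Filtering columns in the Reader plugin is the way to bring such rows into the expected shape.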