LogHub Writer uses the Log Service Java SDK to push the data obtained from a DataX Reader to a specified LogHub of Log Service, so that the data can be consumed by other applications.
Note:
LogHub does not support idempotent writes, so re-running a task after a failover may produce duplicate data.
How it works
LogHub Writer obtains the data generated by a Reader through the DataX framework and converts each value from the DataX internal types to the String type. When the number of buffered entries reaches the configured batchSize, LogHub Writer pushes them to LogHub in a single request using the Log Service Java SDK. By default, 1,024 entries are pushed at a time; the maximum batchSize is 4,096.
The following table shows how LogHub Writer converts the DataX internal types before writing to LogHub:

| DataX internal type | LogHub data type |
| --- | --- |
| Long | String |
| Double | String |
| String | String |
| Date | String |
| Boolean | String |
| Bytes | String |
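Conceptually, the push loop looks like the following minimal sketch built on the public Log Service Java SDK (aliyun-log). This is an illustration rather than the plugin's actual source code; the class name, sample rows, and column names are made up for the example.

```java
import com.aliyun.openservices.log.Client;
import com.aliyun.openservices.log.common.LogItem;
import com.aliyun.openservices.log.exception.LogException;

import java.util.ArrayList;
import java.util.List;

public class LogHubPushSketch {
    public static void main(String[] args) throws LogException {
        // Illustrative values; LogHub Writer reads these from the job configuration.
        Client client = new Client("http://cn-hangzhou.sls.aliyuncs.com",
                "<accessKeyID>", "<accessKeySecret>");
        String project = "ggg";
        String logstore = "store";
        int batchSize = 1024; // the default; must not exceed 4,096

        // Stand-in for rows handed over by a DataX Reader: every value has
        // already been converted to String, as the table above describes.
        String[] column = {"id", "name"};
        String[][] rows = {{"1001", "alice"}, {"1002", "bob"}};

        List<LogItem> batch = new ArrayList<>();
        for (String[] row : rows) {
            LogItem item = new LogItem(); // log time defaults to "now"
            for (int i = 0; i < column.length; i++) {
                item.PushBack(column[i], row[i]); // configured column name -> string value
            }
            batch.add(item);
            if (batch.size() >= batchSize) {
                // One PutLogs request per full batch; topic and source left empty here.
                client.PutLogs(project, logstore, "", batch, "");
                batch = new ArrayList<>();
            }
        }
        if (!batch.isEmpty()) {
            client.PutLogs(project, logstore, "", batch, "");
        }
    }
}
```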
Parameter description
endpoint
Description: Log Service address
Required: Yes
Default value: None
accessKeyID
Description: AccessKeyID for accessing Log Service
Required: Yes
Default value: None
accessKeySecret
Description: AccessKeySecret for accessing Log Service
Required: Yes
Default value: None
project
Description: Project name of target Log Service
Required: Yes
Default value: None
logstore
Description: LogStore name of target Log Service
Required: Yes
Default value: None
topic
Description: The topic of the logs to be written
Required: No
Default value: Null string
batchSize
Description: The number of data entries pushed at a time
Required: No
Default value: 1,024
column
Description: Column names in each data entry
Required: Yes
Default value: None
Note: A record is treated as dirty data if its number of columns does not match the configured column list.
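For example, with "column" set to ["id", "name"], a two-column record (1001, "alice") is written as a log entry carrying the fields id=1001 and name=alice; a record that arrives with any other number of values is treated as dirty data.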
Development in script mode
The following is a script configuration sample. For more information about parameters, see the preceding Parameter description.
```json
{
  "type": "job",
  "version": "1.0",
  "configuration": {
    "setting": {
      "errorLimit": {
        "record": "0"
      },
      "speed": {
        "mbps": "1",
        "concurrent": "1"
      }
    },
    "reader": {
      "plugin": "odps",
      "parameter": {
        "accessKey": "*****",
        "accessId": "*****",
        "isCompress": "false",
        "odpsServer": "http://service-corp.odps.aliyun-inc.com/api",
        "project": "xxxx",
        "table": "ttt",
        "column": [
          "*"
        ],
        "partition": "pt=20161226"
      }
    },
    "writer": {
      "plugin": "loghubwriter",
      "parameter": {
        "endpoint": "http://cn-hangzhou.sls.aliyuncs.com",
        "accessId": "*****",
        "accessKey": "*****",
        "project": "ggg",
        "logstore": "store",
        "batchSize": 1096,
        "topic": "",
        "column": [
          "col0",
          "col1",
          "col2",
          "col3",
          "col4",
          "col5",
          "col6",
          "col7",
          "col8",
          "col9"
        ]
      }
    }
  }
}
```
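In this sample, the ODPS Reader reads the pt=20161226 partition of table ttt, and LogHub Writer pushes the rows to the store Logstore in batches of up to 1,096 entries, within the 4,096 limit. Because errorLimit.record is 0, the job fails as soon as a single dirty record (for example, a column-count mismatch) is encountered.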