The SAP HANA data source lets you read data from and write data to SAP HANA in DataWorks.
Supported versions
- SAP HANA Database 2.00.048.06.1623425628
- SAP HANA Database 2.00.055.00.161
Limitations
- Offline sync supports reading data from views.
- This data source supports serverless resource groups (recommended) and exclusive resource groups for Data Integration.
Supported field types
The SAP HANA Reader supports the following data types.
| Type category | Data types |
|---|---|
| Integer | INT, TINYINT, SMALLINT, MEDIUMINT, and BIGINT |
| Floating-point | FLOAT, DOUBLE, and DECIMAL |
| String | VARCHAR, CHAR, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT |
| Date and time | DATE, DATETIME, TIMESTAMP, TIME, and YEAR |
| Boolean | BIT and BOOL |
| Binary | TINYBLOB, MEDIUMBLOB, BLOB, LONGBLOB, and VARBINARY |
Data types not listed above are not supported. The SAP HANA Reader treats tinyint(1) as an integer.
Add a data source
Add SAP HANA as a data source in DataWorks before developing a synchronization task. For instructions, see Data source management.
Parameter descriptions are available directly in the DataWorks console when you add the data source.
Develop a data synchronization task
Configure an offline sync task for a single table
You can configure the task in either the codeless UI or the code editor. For all script parameters and a script demo, see Appendix: Script demo and parameter descriptions.
Appendix: Script demo and parameter descriptions
The following sections describe the Reader and Writer script parameters for batch synchronization tasks configured in the code editor. For the script format requirements, see Configure a task in the code editor.
Reader script demo
All Reader examples use "stepType": "saphana" to identify the SAP HANA plugin.
Single table in a single database
```json
{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "saphana",
            "parameter": {
                "column": [
                    "id"
                ],
                "connection": [
                    {
                        "querySql": ["select a,b from join1 c join join2 d on c.id = d.id;"],
                        "datasource": "",
                        "table": [
                            "xxx"
                        ]
                    }
                ],
                "where": "",
                "splitPk": "",
                "encoding": "UTF-8"
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}
```
Sharded tables
Sharding lets the SAP HANA Reader read data from multiple SAP HANA tables. All sharded tables must share the same schema.
```json
{
    "type": "job",
    "version": "1.0",
    "configuration": {
        "reader": {
            "plugin": "saphana",
            "parameter": {
                "connection": [
                    {
                        "table": [
                            "tbl1",
                            "tbl2",
                            "tbl3"
                        ],
                        "datasource": "datasourceName1"
                    },
                    {
                        "table": [
                            "tbl4",
                            "tbl5",
                            "tbl6"
                        ],
                        "datasource": "datasourceName2"
                    }
                ],
                "singleOrMulti": "multi",
                "splitPk": "db_id",
                "column": [
                    "id", "name", "age"
                ],
                "where": "1 < id and id < 100"
            }
        },
        "writer": {}
    }
}
```
Reader script parameters
| Parameter | Description | Required | Default value |
|---|---|---|---|
| username | The username for the SAP HANA database. | Yes | None |
| password | The password for the SAP HANA database. | Yes | None |
| column | The columns to synchronize. Use * to synchronize all columns. If a column name contains a forward slash (/), enclose it in double quotation marks and escape the quotes with backslashes. For example, for a column named /abc/efg, set the value to \\"/abc/efg\\". | Yes | None |
| table | The name of the table to synchronize. Specify the value as an array, even for a single table. | Yes | None |
| jdbcUrl | The Java Database Connectivity (JDBC) URL for the SAP HANA database. Example: jdbc:sap://127.0.0.1:30215?currentschema=TEST. | Yes | None |
| splitPk | An integer column used as the split key for concurrent synchronization. If no such column exists, leave this parameter empty. | No | None |
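To show how these parameters fit together, the following is a minimal sketch of a reader parameter fragment. The data source name saphana_test and the slash-named column /abc/efg are hypothetical; the escaped entry follows the column escaping rule described above, and splitPk points at the integer id column so the task can be split for concurrent reads:

```json
{
    "datasource": "saphana_test",
    "column": [
        "id",
        "\"/abc/efg\""
    ],
    "where": "1 < id and id < 100",
    "splitPk": "id"
}
```

If the table has no suitable integer column, leave splitPk empty as noted in the table above.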
Writer script demo
```json
{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "saphana",
            "parameter": {
                "postSql": [],
                "datasource": "",
                "column": [
                    "id",
                    "value"
                ],
                "batchSize": 1024,
                "table": "",
                "preSql": [
                    "delete from XXX;"
                ]
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}
```
Writer script parameters
| Parameter | Description | Required | Default value |
|---|---|---|---|
| datasource | The data source name. It must match the name configured in the code editor. | Yes | None |
| table | The destination table name. | Yes | None |
| column | The destination columns to write data to, specified as an array. Example: ["id", "name", "age"]. Use ["*"] to write to all columns in order. If a column name contains a forward slash (/), enclose it in double quotation marks and escape the quotes with backslashes. For example, for a column named /abc/efg, set the value to \\"/abc/efg\\". | Yes | None |
| preSql | An SQL statement to run before the synchronization task starts. Use this to clear old data before loading. Example: truncate table tablename. In the codeless UI, only one statement is supported. In the code editor, multiple statements are supported, but transaction atomicity is not supported. | No | None |
| postSql | An SQL statement to run after the synchronization task completes. In the codeless UI, only one statement is supported. In the code editor, multiple statements are supported. Example: ALTER TABLE tablename ADD colname TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP. | No | None |
| batchSize | The number of records committed in a single batch. A larger value reduces network round-trips and improves throughput, but setting it too high may cause out-of-memory (OOM) errors. | No | 1024 |
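As a sketch of how preSql, postSql, and batchSize combine, the writer parameter fragment below truncates the destination table before loading and commits 2048 records per batch. The data source name saphana_test and the table and column names are illustrative, not values from this product:

```json
{
    "datasource": "saphana_test",
    "table": "ods_user",
    "column": [
        "id",
        "name"
    ],
    "preSql": [
        "truncate table ods_user"
    ],
    "postSql": [],
    "batchSize": 2048
}
```

Because multiple preSql statements do not run in a single transaction, keep cleanup logic to statements that are safe to re-run if the task is retried.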