DataWorks:RestAPI (HTTP)

Last Updated: Mar 27, 2026

The REST API data source lets you read JSON data from RESTful APIs and sync it to destinations such as MaxCompute, or write data from other data sources to a REST API endpoint. This topic describes the supported capabilities, configuration parameters, and script reference for the REST API data source in DataWorks Data Integration.

Limitations

  • Resource groups: Only Serverless resource groups and exclusive resource groups for Data Integration are supported.

  • Request timeout: The built-in timeout is 60 seconds. You cannot configure a custom timeout value. If an API query takes longer than 60 seconds to respond, the task fails.

  • Table schema: Only a flat (single-layer) table schema is supported at the destination. Nested field structures are not supported. For example, if an API returns {"data": {"user": {"id": 1, "name": "lily"}, "value": 123}}, flatten the fields into parallel columns such as user_id, user_name, and value at the destination.

  • Scheduling parameters: The REST API plugin does not support scheduling parameters.

  • Paging: Manual paging is supported. Specify the page range using startIndex, endIndex, and step. Automatic paging (stopping when no more data is returned) is not supported. If the specified page count exceeds the actual number of pages, empty pages are treated as empty query results and the task continues to the next page without failing.
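
The flattening described under Table schema can be sketched in Python as follows. This is an illustrative preprocessing helper, not part of the REST API plugin; the function name flatten and the underscore separator are assumptions chosen to reproduce the user_id, user_name, and value columns from the example.

```python
def flatten(obj, prefix="", sep="_"):
    """Flatten nested dicts into a single-level dict with joined keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined key as prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


response = {"data": {"user": {"id": 1, "name": "lily"}, "value": 123}}
row = flatten(response["data"])
print(row)  # {'user_id': 1, 'user_name': 'lily', 'value': 123}
```

Each key of the resulting dictionary then maps to one column of the flat destination table.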

Supported field types

| Type | Data Integration column type |
| --- | --- |
| Integer | LONG, INT |
| String | STRING |
| Floating-point | DOUBLE, FLOAT |
| Boolean | BOOLEAN |
| Date and time | DATE |

Add a data source

Before you develop a synchronization task, add the REST API data source in the DataWorks console. For instructions, see Data source management.

Develop a synchronization task

To configure a single-table offline synchronization task, use either the codeless UI or the code editor.

For the full parameter reference and sample scripts, see Appendix: Script reference.

FAQ

Can I specify only the number of pages for data requests?

Yes. Use requestTimes: "multiple" with startIndex, endIndex, and step to define the page range.

Is automatic paging supported?

No. Specify the page range in advance using startIndex, endIndex, and step. The plugin cannot detect when there is no more data and stop automatically.

What happens if the specified page count is greater than the actual number of pages?

Empty pages are treated as empty query results. The task continues to the next page without failing.
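
As a rough illustration of manual paging, the sketch below enumerates the page values implied by startIndex, endIndex, and step, and shows an empty page being skipped rather than failing. It assumes both bounds are inclusive, as the parameter reference states. fetch_page is a hypothetical stand-in for the HTTP call, and the actual request loop inside the plugin may differ.

```python
def page_values(start_index, end_index, step):
    """Return the loop values for the page parameter; both bounds are inclusive."""
    return list(range(start_index, end_index + 1, step))


def fetch_page(page):
    """Hypothetical stand-in for the API call: only pages 1-3 hold data."""
    data = {1: ["a", "b"], 2: ["c"], 3: ["d"]}
    return data.get(page, [])


def read_all(start_index, end_index, step):
    records = []
    for page in page_values(start_index, end_index, step):
        rows = fetch_page(page)
        # An empty page is an empty result, not an error: continue with the next page.
        records.extend(rows)
    return records


pages = page_values(1, 5, 1)
records = read_all(1, 5, 1)
print(pages)    # [1, 2, 3, 4, 5]
print(records)  # ['a', 'b', 'c', 'd']
```

Pages 4 and 5 return nothing here, yet the loop still runs to endIndex, which is why the range has to be chosen in advance.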

Is only single-layer JSON parsing supported?

Yes. Deep (nested) parsing is not supported. Use dataPath to point to the target field, and flatten nested structures at the destination.

How do I read non-array data from a REST API?

Set dataPath to the path of the non-array field (for example, dataPath: "data.list") and set dataMode to multiData. In multiData mode, the column configuration is not applicable. Specify the data path directly in dataPath.

Example:

{
  "reader": {
    "name": "restapi",
    "parameter": {
      "dataPath": "data.list",
      "dataMode": "multiData"
    }
  }
}

Appendix: Script reference

How it works

The REST API plugin sends an HTTP or HTTPS request and receives a JSON response body. Use dataPath to specify the JSONPath for extracting data from the response, and dataMode to control how the extracted data is passed to the writer.

Example 1: Array response (multiData mode)

The API returns a response where DATA is an array containing multiple records:

{
  "HEADER": { "BUSID": "bid1", "RECID": "uuid", "SENDER": "dc", "RECEIVER": "pre", "DTSEND": "202201250000" },
  "DATA": [
    { "SERNR": "sernr1" },
    { "SERNR": "sernr2" }
  ]
}

To extract each item in DATA as a separate synchronization record:

column:   ["SERNR"]
dataMode: "multiData"
dataPath: "DATA"

Example 2: Single-object response (oneData mode)

The API returns a response where content.DATA is a single object:

{
  "HEADER": { "BUSID": "bid1", "RECID": "uuid", "SENDER": "dc", "RECEIVER": "pre", "DTSEND": "202201250000" },
  "content": {
    "DATA": { "SERNR": "sernr2" }
  }
}

To extract content.DATA as a single synchronization record:

column:   ["SERNR"]
dataMode: "oneData"
dataPath: "content.DATA"
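
The two examples can be approximated in Python as follows. This is a simplified sketch, not the plugin's implementation: it assumes dataPath is a dot-separated path and that column names are plain keys, and resolve_path and extract_records are hypothetical helper names.

```python
def resolve_path(doc, path):
    """Walk a dot-separated path such as "content.DATA" through nested dicts."""
    node = doc
    for key in path.split("."):
        node = node[key]
    return node


def extract_records(doc, data_path, data_mode, columns):
    node = resolve_path(doc, data_path)
    # multiData: every array element is one record.
    # oneData: the node itself is the single record.
    items = node if data_mode == "multiData" else [node]
    return [[item.get(col) for col in columns] for item in items]


array_response = {
    "HEADER": {"BUSID": "bid1"},
    "DATA": [{"SERNR": "sernr1"}, {"SERNR": "sernr2"}],
}
object_response = {
    "HEADER": {"BUSID": "bid1"},
    "content": {"DATA": {"SERNR": "sernr2"}},
}

multi = extract_records(array_response, "DATA", "multiData", ["SERNR"])
one = extract_records(object_response, "content.DATA", "oneData", ["SERNR"])
print(multi)  # [['sernr1'], ['sernr2']]
print(one)    # [['sernr2']]
```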

Reader script example

{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "restapi",
      "parameter": {
        "url": "http://127.0.0.1:5000/get_array5",
        "dataMode": "oneData",
        "responseType": "json",
        "column": [
          {
            "type": "long",
            "name": "a.b"
          },
          {
            "type": "string",
            "name": "a.c"
          }
        ],
        "dirtyData": "null",
        "method": "get",
        "socketTimeout": "60000",
        "defaultHeader": {
          "X-Custom-Header": "test header"
        },
        "customHeader": {
          "X-Custom-Header2": "test header2"
        },
        "parameters": "abc=1&def=1"
      },
      "name": "restapireader",
      "category": "reader"
    },
    {
      "stepType": "stream",
      "parameter": {},
      "name": "Writer",
      "category": "writer"
    }
  ],
  "setting": {
    "errorLimit": {
      "record": ""
    },
    "speed": {
      "throttle": true,
      "concurrent": 1,
      "mbps": "12"
    }
  },
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  }
}

Reader parameters

The following parameters apply when adding a data source and configuring a Data Integration node. The plugin does not support scheduling parameters.

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| url | Yes | | The address of the RESTful API. |
| method | Yes | | The HTTP request method. Valid values: get, post. |
| dataMode | Yes | | How the JSON response is processed. oneData: reads one record from the response. multiData: reads a JSON array and passes multiple records to the writer. |
| responseType | Yes | json | The response format. Only json is supported. |
| column | Yes | | The list of fields to read. Each field requires type (the data type) and name (the JSONPath to the value). Example: [{"type": "long", "name": "a.b"}, {"type": "string", "name": "a.c"}] |
| dirtyData | Yes | dirty | How to handle records where a column's JSONPath returns no value. dirty: marks the record as dirty data. null: sets the column value to null. |
| requestTimes | Yes | single | Whether to send one or multiple requests. single: sends one request. multiple: loops through page parameters defined by startIndex, endIndex, and step. |
| dataPath | No | | The JSONPath to a single object or array in the response. |
| socketTimeout | No | 60000 | The socket timeout for the API request, in milliseconds. |
| customHeader | No | | Custom HTTP headers to include in the request. |
| parameters | No | | Request parameters. For GET requests, use the key=value&key=value format. For POST requests, use JSON format. |
| requestParam | No | | The loop parameter name (for example, pageNumber) when requestTimes is multiple. |
| startIndex | No | | The start index for the loop request (inclusive). |
| endIndex | No | | The end index for the loop request (inclusive). |
| step | No | | The step size for the loop request. |
| authType | No | | The authentication method. See Authentication methods. |
| authUsername / authPassword | No | | The username and password for Basic authentication. |
| authToken | No | | The token for token-based authentication. Example: {"Authorization": "Bearer TokenXXXXXX"}. To use a custom encryption method, provide the encrypted credentials as the authToken value. |
| accessKey / accessSecret | No | | The access key and access secret for Alibaba Cloud API signature authentication. |
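
To illustrate the dirtyData setting, the sketch below resolves each column's name as a dot-separated path and applies either policy when a path has no value. It is an approximation built on assumptions: read_record, lookup, and the DirtyRecord exception are hypothetical names, and the plugin's actual JSONPath handling and dirty-data accounting may differ.

```python
class DirtyRecord(Exception):
    """Hypothetical signal that a record should be counted as dirty data."""


_MISSING = object()


def lookup(doc, path):
    """Return the value at a dot-separated path, or _MISSING if any key is absent."""
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def read_record(doc, columns, dirty_data="dirty"):
    row = []
    for col in columns:
        value = lookup(doc, col["name"])
        if value is _MISSING:
            if dirty_data == "dirty":
                raise DirtyRecord(f"no value at {col['name']}")
            value = None  # dirtyData "null": fill the column with null
        row.append(value)
    return row


columns = [{"type": "long", "name": "a.b"}, {"type": "string", "name": "a.c"}]
doc = {"a": {"b": 7}}  # "a.c" is missing from this response

null_row = read_record(doc, columns, dirty_data="null")
print(null_row)  # [7, None]

try:
    read_record(doc, columns, dirty_data="dirty")
    marked_dirty = False
except DirtyRecord:
    marked_dirty = True
print(marked_dirty)  # True
```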

Authentication methods

Check your API documentation to identify the authentication method, then configure the corresponding parameters. The following keywords in your API documentation indicate which method to use:

| Method | Keywords in your API docs | Parameters to configure |
| --- | --- | --- |
| Basic authentication | "Basic Auth", "Basic HTTP", Authorization: Basic | authUsername, authPassword |
| Token-based authentication | "Bearer token", "API token", Authorization: Bearer | authToken |
| Alibaba Cloud API signature | "AK/SK", "AccessKey", Alibaba Cloud signature | accessKey, accessSecret |
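
For orientation, the headers that Basic and Bearer authentication conventionally produce can be built as shown below. This reflects general HTTP conventions (RFC 7617 and RFC 6750), not a confirmed description of how the plugin constructs requests; the function names and sample credentials are made up for illustration. The Alibaba Cloud signature method is omitted because its signing steps are not covered in this topic.

```python
import base64


def basic_auth_header(username, password):
    """Build an Authorization header per the HTTP Basic scheme."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_auth_header(token):
    """Build an Authorization header carrying a bearer token."""
    return {"Authorization": f"Bearer {token}"}


# Placeholder credentials for illustration only.
basic = basic_auth_header("demo_user", "demo_pass")
bearer = bearer_auth_header("TokenXXXXXX")
print(basic)
print(bearer)
```

Comparing these shapes with the sample requests in your API documentation is a quick way to confirm which row of the table applies.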

Writer script example

{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "stream",
      "parameter": {},
      "name": "Reader",
      "category": "reader"
    },
    {
      "stepType": "restapi",
      "parameter": {
        "url": "http://127.0.0.1:5000/writer1",
        "dataMode": "oneData",
        "responseType": "json",
        "column": [
          {
            "type": "long",
            "name": "a.b"
          },
          {
            "type": "string",
            "name": "a.c"
          }
        ],
        "method": "post",
        "defaultHeader": {
          "X-Custom-Header": "test header"
        },
        "customHeader": {
          "X-Custom-Header2": "test header2"
        },
        "parameters": "abc=1&def=1",
        "batchSize": 256
      },
      "name": "restapiwriter",
      "category": "writer"
    }
  ],
  "setting": {
    "errorLimit": {
      "record": "0"
    },
    "speed": {
      "throttle": true,
      "concurrent": 1,
      "mbps": "12"
    }
  },
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  }
}

Writer parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| url | Yes | | The address of the RESTful API. |
| method | Yes | | The HTTP request method. Valid values: post, put. |
| dataMode | Yes | | How records are sent. oneData: sends one record per request. multiData: sends a batch of records per request; the number of requests depends on the tasks split on the reader side. |
| column | Yes | | The list of field paths for the generated JSON. Each field requires type and name (the JSONPath where the column's data is placed). Example: [{"type": "long", "name": "a.b"}, {"type": "string", "name": "a.c"}] |
| batchSize | Yes | 512 | The maximum number of records per request when dataMode is multiData. |
| dataPath | No | | The JSONPath of the object where the output data is placed. |
| customHeader | No | | Custom HTTP headers to include in the request. |
| authType | No | | The authentication method. See Authentication methods. |
| authUsername / authPassword | No | | The username and password for Basic authentication. |
| authToken | No | | The token for token-based authentication. |
| accessKey / accessSecret | No | | The access key and access secret for Alibaba Cloud API signature authentication. |
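
As a rough picture of how column paths and batchSize interact on the writer side, the sketch below places each column value at its dot-separated name and groups the resulting records into batches. It is an assumption-based approximation: build_body and batches are hypothetical helpers, and the real request payload layout, including how dataPath wraps the output, is not specified in this topic.

```python
def build_body(row, columns):
    """Place each value at its dot-separated path, e.g. "a.b" -> {"a": {"b": value}}."""
    body = {}
    for value, col in zip(row, columns):
        *parents, leaf = col["name"].split(".")
        node = body
        for key in parents:
            # Create intermediate objects on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return body


def batches(records, batch_size):
    """Yield consecutive groups of at most batch_size records (multiData mode)."""
    for i in range(0, len(records), batch_size):
        yield records[i:i + batch_size]


columns = [{"type": "long", "name": "a.b"}, {"type": "string", "name": "a.c"}]
rows = [[1, "x"], [2, "y"], [3, "z"]]

bodies = [build_body(r, columns) for r in rows]
groups = list(batches(bodies, 2))
print(bodies[0])                 # {'a': {'b': 1, 'c': 'x'}}
print([len(g) for g in groups])  # [2, 1]
```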