DataWorks: RestAPI data source

Last Updated: Jul 18, 2025

You can create a RestAPI data source to read JSON data from RESTful APIs and write it to other data sources, such as MaxCompute, through synchronization tasks. A RestAPI data source can also serve as a destination that receives data from other data sources. This topic describes the data synchronization capabilities of RestAPI data sources.

Limits

  • RestAPI data sources support only exclusive resource groups for Data Integration.

  • DataWorks does not allow you to configure a timeout period when you use this type of data source. The built-in timeout period for a request in DataWorks is 60 seconds. If the time required to return the result of your API call exceeds 60 seconds, your task may fail.

Supported field types

Category          Data Integration column configuration types
Integer           LONG, INT
String            STRING
Floating point    DOUBLE, FLOAT
Boolean           BOOLEAN
Date and time     DATE

Add a data source

Before you develop a synchronization task in DataWorks, you must add the required data source to DataWorks by following the instructions in Add and manage data sources. You can view the infotips of parameters in the DataWorks console to understand the meanings of the parameters when you add a data source.

Develop a data synchronization task

For information about the entry point for and the procedure of configuring a synchronization task, see the following configuration guides.

Configure a batch synchronization task to synchronize data of a single table

FAQ

  • Can I specify only the number of page flips for a response?

    Yes.

  • Does RestAPI Reader support automatic pagination that stops when a request returns no more data?

    No. This is not supported because such a task cannot be split into chunks.

  • The specified number of page flips is greater than the actual number of pages in the response, so the additional pages contain no data. How does the system handle this?

    If a query returns no result, the additional pages contain no data. In this case, the system continues with the next query.

  • Can RestAPI Reader parse only one level of data in the JSON-formatted response?

    Yes. RestAPI Reader does not perform deep parsing.

  • How do I configure RestAPI Reader to read data of a non-array type?

    Make sure that in the reader's parameter block, you set the dataPath parameter to the path that points to your data of a non-array type, such as dataPath:"data.list". This helps the plugin correctly locate the data fields to read. Next, set the dataMode parameter to multiData. This way, DataWorks processes the data of a non-array type as multiple separate data records.

    Note

    In multiData mode, the column parameter does not apply. You must specify the data path directly in dataPath.

    The following code provides a configuration example:

    "reader": {
      "name": "restapi",
      "parameter": {
        "dataPath": "data.list",
        "dataMode": "multiData"
        // Other parameters
      }
    }
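The page-flipping answers earlier in this FAQ can be illustrated with a small Python sketch. It assumes that the repeated parameter is appended to the query string of a get request and that the range covers both end points, as the startIndex and endIndex descriptions in the appendix state. The function names and the way the URL is assembled are hypothetical and are not taken from the RestAPI Reader implementation.

```python
from urllib.parse import urlencode


def page_values(start_index, end_index, step):
    """Values sent for the repeated parameter; both end points are included."""
    if step <= 0:
        raise ValueError("step must be a positive integer")
    return list(range(start_index, end_index + 1, step))


def build_request_urls(url, parameters, request_param, start_index, end_index, step):
    """Build one URL per request (assumed layout: fixed parameters, then the page value)."""
    fixed = f"{parameters}&" if parameters else ""
    return [
        f"{url}?{fixed}{urlencode({request_param: value})}"
        for value in page_values(start_index, end_index, step)
    ]


def read_all(fetch_page, values):
    """Issue a fixed number of requests; an empty page adds nothing and the loop goes on."""
    records = []
    for value in values:
        records.extend(fetch_page(value))
    return records


# Hypothetical stand-in for the remote API: only pages 1 and 2 contain data.
FAKE_PAGES = {1: ["record-1"], 2: ["record-2"]}
print(read_all(lambda page: FAKE_PAGES.get(page, []), page_values(1, 4, 1)))
```

Because the number of requests is fixed up front, the loop does not stop early when a page is empty. This mirrors the answers above: only the number of page flips is configured, and empty pages are skipped.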

Appendix: Code and parameters

Configure a batch synchronization task by using the code editor

If you want to configure a batch synchronization task by using the code editor, you must configure the related parameters in the script based on the unified script format requirements. For more information, see Configure a batch synchronization task by using the code editor. The following information describes the parameters that you must configure for data sources when you configure a batch synchronization task by using the code editor.

Reader script example

  • Sample code:

    {
        "type":"job",
        "version":"2.0",
        "steps":[
            {
                "stepType":"restapi",
                "parameter":{
                    "url":"http://127.0.0.1:5000/get_array5",
                    "dataMode":"oneData",
                    "responseType":"json",
                    "column":[
                        {
                            "type":"long",
                            "name":"a.b"  // Query data in the a.b path.
                        },
                        {
                            "type":"string",  // Query data in the a.c path.
                            "name":"a.c"
                        }
                    ],
                    "dirtyData":"null",
                    "method":"get",
                    "defaultHeader":{
                        "X-Custom-Header":"test header"
                    },
                    "customHeader":{
                        "X-Custom-Header2":"test header2"
                    },
                    "parameters":"abc=1&def=1"
                },
                "name":"restapireader",
                "category":"reader"
            },
            {
                "stepType":"stream",
                "parameter":{
    
                },
                "name":"Writer",
                "category":"writer"
            }
        ],
        "setting":{
            "errorLimit":{
                "record":""
            },
            "speed":{
                "throttle":true,  // Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true.
                "concurrent":1,  // The maximum number of parallel threads.
                "mbps":"12"  // The maximum transmission rate. Unit: MB/s.
            }
        },
        "order":{
            "hops":[
                {
                    "from":"Reader",
                    "to":"Writer"
                }
            ]
        }
    }
  • Take note of the following information when you configure RestAPI Reader using the code editor:

    After RestAPI Reader sends an HTTP or HTTPS request, a JSON-formatted response is returned. The dataPath parameter is used to specify the path of the JSON-formatted data record or JSON array that is queried. Examples:
    
    
    In the following sample response, a JSON array is returned for the DATA parameter that contains the business data.
    {
        "HEADER": {
            "BUSID": "bid1",
            "RECID": "uuid",
            "SENDER": "dc",
            "RECEIVER": "pre",
            "DTSEND": "202201250000"
        },
        "DATA": [
            {
                "SERNR": "sernr1"
            },
            {
                "SERNR": "sernr2"
            }
        ]
    }
    
    To extract multiple data records from the JSON array and transfer them to a writer, set "column": [ "SERNR" ], "dataMode": "multiData", and "dataPath": "DATA".
    
    
    In the following sample response, a JSON object is returned for the content.DATA parameter that contains the business data.
    {
        "HEADER": {
            "BUSID": "bid1",
            "RECID": "uuid",
            "SENDER": "dc",
            "RECEIVER": "pre",
            "DTSEND": "202201250000"
        },
        "content": {
            "DATA": {
                "SERNR": "sernr2"
            }
        }
    }
    
    To extract one data record from the JSON object and transfer it to a writer, set "column": [ "SERNR" ], "dataMode": "oneData", and "dataPath": "content.DATA".
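The two cases above can be reproduced with a short Python sketch. It only models what the examples describe: dataPath selects a node in the response, multiData treats that node as an array of records, and oneData treats it as a single record. The helper names are hypothetical, the HEADER blocks are shortened, and the real reader may behave differently in cases that the examples do not cover.

```python
import json

# Shortened copies of the two sample responses shown above.
ARRAY_RESPONSE = '{"HEADER": {"BUSID": "bid1"}, "DATA": [{"SERNR": "sernr1"}, {"SERNR": "sernr2"}]}'
OBJECT_RESPONSE = '{"HEADER": {"BUSID": "bid1"}, "content": {"DATA": {"SERNR": "sernr2"}}}'


def resolve_path(document, data_path):
    """Walk a dotted path such as "content.DATA" through nested JSON objects."""
    node = document
    if data_path:
        for key in data_path.split("."):
            node = node[key]
    return node


def extract_rows(response_text, data_path, data_mode, column):
    """Return one list of column values per extracted record."""
    node = resolve_path(json.loads(response_text), data_path)
    if data_mode == "multiData":
        # Only the array case from the example above is modeled here.
        if not isinstance(node, list):
            raise ValueError("this sketch expects dataPath to point at a JSON array")
        records = node
    elif data_mode == "oneData":
        records = [node]
    else:
        raise ValueError(f"unknown dataMode: {data_mode}")
    return [[record.get(name) for name in column] for record in records]


print(extract_rows(ARRAY_RESPONSE, "DATA", "multiData", ["SERNR"]))
# prints [['sernr1'], ['sernr2']]
print(extract_rows(OBJECT_RESPONSE, "content.DATA", "oneData", ["SERNR"]))
# prints [['sernr2']]
```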
                    

Parameters in code for RestAPI Reader

Note

You must configure the parameters that are described in the following table when you add a RestAPI data source and configure a data integration node.

Scheduling parameters are not supported for a data synchronization node that uses RestAPI Reader.

url
    Description: The URL of the RESTful API.
    Required: Yes
    Default value: No default value

dataMode
    Description: The format of the JSON data returned by a RESTful request.
      • oneData: RestAPI Reader extracts one data record.
      • multiData: RestAPI Reader extracts a JSON array and transfers multiple data records to a writer.
    Required: Yes
    Default value: No default value

responseType
    Description: The format of the response returned by the RESTful API. Only the JSON format is supported.
    Required: Yes
    Default value: JSON

column
    Description: The names of the fields from which you want to read data. The type parameter specifies the data type of a field. The name parameter specifies the JSON-formatted path in which the field is located. You must configure the type and name parameters for each field. You can configure the column parameter in the following format:

        "column":[
            {"type":"long","name":"a.b"},   // Query data in the a.b path.
            {"type":"string","name":"a.c"}  // Query data in the a.c path.
        ]
    Required: Yes
    Default value: No default value

dataPath
    Description: The path of the JSON-formatted data record or JSON array that is queried.
    Required: No
    Default value: No default value

method
    Description: The request method. Valid values: get and post.
    Required: Yes
    Default value: No default value

customHeader
    Description: The header information transferred to the RESTful API.
    Required: No
    Default value: No default value

parameters
    Description: The parameter information transferred to the RESTful API.
      • If the method parameter is set to get, specify the value as a query string, such as abc=1&def=1.
      • If the method parameter is set to post, configure JSON parameters.
    Required: No
    Default value: No default value

dirtyData
    Description: The processing mechanism that is used when no data is found in the JSON-formatted path specified by the column parameter. Valid values:
      • dirty: If data cannot be found in the specified JSON-formatted path, the data record is treated as a dirty data record.
      • null: If data cannot be found in the specified JSON-formatted path, the column value is set to null.
    Required: Yes
    Default value: dirty

requestTimes
    Description: The number of times data is requested from the RESTful API.
      • single: Data is requested only once.
      • multiple: Data is requested multiple times.
    Required: Yes
    Default value: single

requestParam
    Description: If you set the requestTimes parameter to multiple, you must configure a parameter that you want to repeatedly pass to the RESTful API in each request. For example, if you configure the pageNumber parameter, RestAPI Reader passes the pageNumber parameter to the RESTful API based on the settings of the startIndex, endIndex, and step parameters.
    Required: No
    Default value: No default value

startIndex
    Description: The start point of requests. The data at the start point is also requested.
    Required: No
    Default value: No default value

endIndex
    Description: The end point of requests. The data at the end point is also requested.
    Required: No
    Default value: No default value

step
    Description: The step at which requests are sent.
    Required: No
    Default value: No default value

authType
    Description: The authentication method. Valid values:
      • Basic Authentication: basic authentication

        If the data source supports username- and password-based authentication, you can select Basic Authentication and configure the username and password that can be used for authentication. During data integration, the username and password are transferred to the RESTful API URL for authentication. The data source is connected only after the authentication is successful.

      • Token Authentication: token-based authentication

        If the data source supports token-based authentication, you can select Token Authentication and configure a fixed token value that can be used for authentication. During data integration, the token is contained in the request header, such as {"Authorization":"Bearer TokenXXXXXX"}, and transferred to the RESTful API URL for authentication. The data source is connected only after the authentication is successful.

        Note

        If you want to use a custom authentication method, you can select Token Authentication and configure a fixed token value in the Token field. The token value can be used for authentication after it is encrypted.
    Required: No
    Default value: No default value

authUsername/authPassword
    Description: The username and password used for basic authentication.
    Required: No
    Default value: No default value

authToken
    Description: The token used for token-based authentication.
    Required: No
    Default value: No default value

accessKey/accessSecret
    Description: The AccessKey pair used for authentication based on Alibaba Cloud API signature.
    Required: No
    Default value: No default value
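The interaction between the column and dirtyData parameters described above can be sketched in Python. The sketch assumes that a dotted name such as a.b addresses nested keys, as the comments in the reader script example suggest. The DirtyRecord exception and the function names are hypothetical, and conversion to the configured type is not modeled.

```python
class DirtyRecord(Exception):
    """Signals that a configured column path is missing while dirtyData is "dirty"."""


_MISSING = object()


def lookup(record, path):
    """Follow a dotted path; return _MISSING if any key along the way is absent."""
    node = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def read_columns(record, columns, dirty_data="dirty"):
    """Return the values for the configured columns, applying the dirtyData rule."""
    if dirty_data not in ("dirty", "null"):
        raise ValueError(f"unknown dirtyData value: {dirty_data}")
    row = []
    for column in columns:
        value = lookup(record, column["name"])
        if value is _MISSING:
            if dirty_data == "dirty":
                # The whole record is treated as a dirty data record.
                raise DirtyRecord(column["name"])
            value = None  # dirtyData "null": the column value becomes null.
        row.append(value)
    return row


COLUMNS = [{"type": "long", "name": "a.b"}, {"type": "string", "name": "a.c"}]
print(read_columns({"a": {"b": 1}}, COLUMNS, dirty_data="null"))
# prints [1, None]
```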

Writer script example

{
    "type":"job",
    "version":"2.0",
    "steps":[
        {
            "stepType":"stream",
            "parameter":{

            },
            "name":"Reader",
            "category":"reader"
        },
        {
            "stepType":"restapi",
            "parameter":{
                "url":"http://127.0.0.1:5000/writer1",
                "dataMode":"oneData",
                "responseType":"json",
                "column":[
                    {
                        "type":"long", // Store data in the a.b path.
                        "name":"a.b"
                    },
                    {
                        "type":"string", // Store data in the a.c path.
                        "name":"a.c"
                    }
                ],
                "method":"post",
                "defaultHeader":{
                    "X-Custom-Header":"test header"
                },
                "customHeader":{
                    "X-Custom-Header2":"test header2"
                },
                "parameters":"abc=1&def=1",
                "batchSize":256
            },
            "name":"restapiwriter",
            "category":"writer"
        }
    ],
    "setting":{
        "errorLimit":{
            "record":"0" // The maximum number of dirty data records allowed.
        },
        "speed":{
            "throttle":true,  // Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true.
            "concurrent":1,  // The maximum number of parallel threads.
            "mbps":"12"  // The maximum transmission rate. Unit: MB/s.
        }
    },
    "order":{
        "hops":[
            {
                "from":"Reader",
                "to":"Writer"
            }
        ]
    }
}
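As a rough illustration of the column, dataMode, and batchSize settings in the script above, the following Python sketch turns source rows into request bodies. It assumes that each dotted column name becomes a nested key in the JSON record, that oneData sends one record per request, and that multiData sends a JSON array of at most batchSize records per request. These body shapes and all function names are assumptions made for illustration. The dataPath parameter is not modeled.

```python
def set_path(target, path, value):
    """Store value under a dotted path such as "a.b", creating nested objects as needed."""
    keys = path.split(".")
    node = target
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def to_json_record(row, columns):
    """Map one source row onto the configured column paths."""
    record = {}
    for column, value in zip(columns, row):
        set_path(record, column["name"], value)
    return record


def request_bodies(rows, columns, data_mode, batch_size=512):
    """Return the list of request bodies that would be sent for the given rows."""
    records = [to_json_record(row, columns) for row in rows]
    if data_mode == "oneData":
        return records
    if data_mode == "multiData":
        return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]
    raise ValueError(f"unknown dataMode: {data_mode}")


COLUMNS = [{"type": "long", "name": "a.b"}, {"type": "string", "name": "a.c"}]
ROWS = [[1, "x"], [2, "y"], [3, "z"]]
print(request_bodies(ROWS, COLUMNS, "multiData", batch_size=2))
```

With three rows and a batch size of 2, the sketch produces two bodies: the first holds two records and the second holds one.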

Parameters in code for RestAPI Writer

url
    Description: The URL of the RESTful API.
    Required: Yes
    Default value: No default value

dataMode
    Description: The format in which RestAPI Writer transfers JSON-formatted data.
      • oneData: RestAPI Writer transfers one data record in each request.
      • multiData: RestAPI Writer transfers multiple data records in each request. The number of requests is determined by the number of tasks generated by the reader.
    Required: Yes
    Default value: No default value

column
    Description: The columns to which you want to write the generated JSON-formatted data. The type field specifies the data type of a column. The name field specifies the JSON-formatted path where the column is stored. You can configure the column parameter in the following format:

        "column":[
            {"type":"long","name":"a.b"},   // Store data in the a.b path.
            {"type":"string","name":"a.c"}  // Store data in the a.c path.
        ]

    Note

    You must configure the type and name parameters for each field.
    Required: Yes
    Default value: No default value

dataPath
    Description: The path that is used to store the JSON-formatted data.
    Required: No
    Default value: No default value

method
    Description: The request method. Valid values: post and put.
    Required: Yes
    Default value: No default value

customHeader
    Description: The header information transferred to the RESTful API.
    Required: No
    Default value: No default value

authType
    Description: The authentication method. Valid values:
      • Basic Authentication: basic authentication

        If the data source supports username- and password-based authentication, you can select Basic Authentication and configure the username and password that can be used for authentication. During data integration, the username and password are transferred to the RESTful API URL for authentication. The data source is connected only after the authentication is successful.

      • Token Authentication: token-based authentication

        If the data source supports token-based authentication, you can select Token Authentication and configure a fixed token value that can be used for authentication. During data integration, the token is contained in the request header, such as {"Authorization":"Bearer TokenXXXXXX"}, and transferred to the RESTful API URL for authentication. The data source is connected only after the authentication is successful.

        Note

        If you want to use a custom authentication method, you can select Token Authentication and configure a fixed token value in the Token field. The token value can be used for authentication after it is encrypted.
    Required: No
    Default value: No default value

authUsername/authPassword
    Description: The username and password used for basic authentication.
    Required: No
    Default value: No default value

authToken
    Description: The token used for token-based authentication.
    Required: No
    Default value: No default value

accessKey/accessSecret
    Description: The AccessKey pair used for authentication based on Alibaba Cloud API signature.
    Required: No
    Default value: No default value

batchSize
    Description: The maximum number of data records that can be transferred in each request when the dataMode parameter is set to multiData.
    Required: Yes
    Default value: 512
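To make the two authType values above more concrete, the following Python sketch shows the request headers they might translate to. The token case follows the {"Authorization":"Bearer TokenXXXXXX"} header quoted above. The basic case assumes the standard HTTP Basic scheme, which this topic does not confirm. The function name is hypothetical, and signature-based authentication with accessKey/accessSecret is not modeled.

```python
import base64


def auth_headers(auth_type, username=None, password=None, token=None):
    """Return the Authorization header assumed for the given authType value."""
    if auth_type == "Basic Authentication":
        # Assumption: standard HTTP Basic scheme, base64 of "username:password".
        credentials = f"{username}:{password}".encode("utf-8")
        encoded = base64.b64encode(credentials).decode("ascii")
        return {"Authorization": f"Basic {encoded}"}
    if auth_type == "Token Authentication":
        # Mirrors the header example in the authType description.
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"unsupported authType: {auth_type}")


print(auth_headers("Token Authentication", token="TokenXXXXXX"))
# prints {'Authorization': 'Bearer TokenXXXXXX'}
```

Before you configure the data source, you can send the same header with any HTTP client to check whether the target API accepts it.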