DataWorks:UpdateFile

Last Updated: Mar 25, 2024

Updates a file.

When you debug or call this operation, you must specify new values for the parameters of the node. For example, if the current value of a parameter is A, change it to B before you commit the node. If you pass A again, an exception that indicates invalid data is thrown.

Debugging

OpenAPI Explorer automatically calculates the signature value. For your convenience, we recommend that you call this operation in OpenAPI Explorer. OpenAPI Explorer dynamically generates the sample code of the operation for different SDKs.

Request parameters

Parameter

Type

Required

Example

Description

Action

String

Yes

UpdateFile

The operation that you want to perform.

FileFolderPath

String

No

Workflow/1/Data Integration/Folder 1/Folder 2

The path of the file.

ProjectId

Long

No

10000

The DataWorks workspace ID. You can log on to the DataWorks console and go to the Workspace page to obtain the workspace ID.

FileName

String

No

ods_user_info_d

The name of the file. To rename a file, call the ListFiles operation to query the ID of the file, and then call the UpdateFile operation with the FileId parameter set to that ID and the FileName parameter set to the new name.

FileDescription

String

No

File description

The description of the file.

Content

String

No

SELECT "1";

The code for the file. The code format varies based on the file type. To view the code format for a specific file type, go to Operation Center, open the DAG of a node of the file type, right-click the node, and then select View Code.

AutoRerunTimes

Integer

Yes

3

The number of automatic reruns that are allowed after an error occurs.

AutoRerunIntervalMillis

Integer

No

120000

The interval between automatic reruns after an error occurs. Unit: millisecond. Maximum value: 1800000 (30 minutes).

This parameter corresponds to the Rerun interval parameter that is displayed after the Auto Rerun upon Error check box is selected in the Schedule section of the Properties tab in the DataWorks console.

The DataWorks console measures this interval in minutes. Convert the value to milliseconds before you call the operation.
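
The unit conversion can be sketched as a small helper. This is an illustrative example, not part of the DataWorks SDK: the function name `rerun_interval_millis` is hypothetical, and rejecting non-positive intervals is an assumption. Only the 30-minute maximum comes from the description above.

```python
MAX_RERUN_INTERVAL_MS = 1_800_000  # 30 minutes, the documented maximum


def rerun_interval_millis(minutes: float) -> int:
    """Convert a console-style interval in minutes to milliseconds."""
    millis = int(round(minutes * 60_000))
    if millis <= 0:
        # Assumption: a zero or negative interval is not meaningful.
        raise ValueError("interval must be positive")
    if millis > MAX_RERUN_INTERVAL_MS:
        raise ValueError("interval exceeds the documented maximum of 30 minutes")
    return millis


print(rerun_interval_millis(2))  # 120000
```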

RerunMode

String

No

ALL_ALLOWED

Specifies whether the node that corresponds to the file can be rerun. Valid values:

  • ALL_ALLOWED: The node can be rerun regardless of whether it is successfully run or fails to run.

  • FAILURE_ALLOWED: The node can be rerun only after it fails to run.

  • ALL_DENIED: The node cannot be rerun regardless of whether it is successfully run or fails to run.

This parameter corresponds to the Rerun parameter in the Schedule section of the Properties tab in the DataWorks console.

Stop

Boolean

No

false

Specifies whether to suspend the scheduling of the node. Valid values:

  • true

  • false

This parameter corresponds to the Recurrence parameter in the Schedule section of the Properties tab in the DataWorks console.

ParaValue

String

No

x=a y=b z=c

The scheduling parameters of the node.

This parameter corresponds to the Scheduling Parameter section of the Properties tab in the DataWorks console. For more information about the configurations of scheduling parameters, see Configure scheduling parameters.
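
A minimal sketch of assembling a ParaValue string from a mapping. The space-separated `name=value` layout is inferred from the example value `x=a y=b z=c`. The helper name `build_para_value` is hypothetical, and rejecting whitespace inside names or values is an assumption made to keep the output unambiguous.

```python
def build_para_value(params: dict[str, str]) -> str:
    """Join scheduling parameters as space-separated name=value pairs."""
    pairs = []
    for name, value in params.items():
        text = str(value)
        # Assumption: whitespace would make the pair boundaries ambiguous.
        if any(ch.isspace() for ch in name + text) or "=" in name:
            raise ValueError(f"unsupported characters in parameter {name!r}")
        pairs.append(f"{name}={text}")
    return " ".join(pairs)


print(build_para_value({"x": "a", "y": "b", "z": "c"}))  # x=a y=b z=c
```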

StartEffectDate

Long

No

936923400000

The start time of automatic scheduling. Set the value to a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

This parameter corresponds to the start time specified for the Validity Period parameter in the Schedule section of the Properties tab in the DataWorks console.

EndEffectDate

Long

No

4155787800000

The end time of automatic scheduling. Set the value to a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

This parameter corresponds to the end time specified for the Validity Period parameter in the Schedule section of the Properties tab in the DataWorks console.
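
The StartEffectDate and EndEffectDate values are millisecond UNIX timestamps. The following sketch shows one way to derive such a value from a timezone-aware `datetime`. The helper name `to_epoch_millis` is hypothetical, and requiring a timezone-aware input is a design choice of this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Return milliseconds elapsed since January 1, 1970, 00:00:00 UTC."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


start = datetime(1999, 9, 10, 0, 30, tzinfo=timezone.utc)
print(to_epoch_millis(start))  # 936923400000
```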

CronExpress

String

No

00 00-59/5 1-23 * * ?

The CRON expression that represents the periodic scheduling policy of the node. This parameter corresponds to the Cron Expression parameter in the Schedule section of the Properties tab in the DataWorks console. After you configure the Scheduling Cycle and Scheduled time parameters in the DataWorks console, DataWorks generates the value of the Cron Expression parameter.

Examples:

  • CRON expression for a node that is scheduled to run at 05:30 every day: 00 30 05 * * ?

  • CRON expression for a node that is scheduled to run at the fifteenth minute of each hour: 00 15 * * * ?

  • CRON expression for a node that is scheduled to run every 10 minutes: 00 00/10 * * * ?

  • CRON expression for a node that is scheduled to run every 10 minutes from 08:00 to 17:00 every day: 00 00-59/10 8-17 * * ?

  • CRON expression for a node that is scheduled to run at 00:20 on the first day of each month: 00 20 00 1 * ?

  • CRON expression for a node that is scheduled to run every three months from 00:10 on January 1: 00 10 00 1 1-12/3 ?

  • CRON expression for a node that is scheduled to run at 00:05 every Tuesday and Friday: 00 05 00 * * 2,5

The scheduling system of DataWorks imposes the following limits on CRON expressions:

  • The minimum interval specified in a CRON expression to schedule a node is 5 minutes.

  • The earliest time a node can be scheduled to run every day is 00:05.
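
The examples and limits above can be sketched as two small builders. Both function names are hypothetical. The field order (second, minute, hour, day of month, month, day of week) is inferred from the examples, not taken from a formal grammar.

```python
def daily_cron(hour: int, minute: int) -> str:
    """Build a CRON expression for a node that runs once a day at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    if (hour, minute) < (0, 5):
        raise ValueError("the earliest daily run time is 00:05")
    return f"00 {minute:02d} {hour:02d} * * ?"


def every_n_minutes_cron(interval: int) -> str:
    """Build a CRON expression for a node that runs every `interval` minutes."""
    if not 5 <= interval <= 59:
        raise ValueError("interval must be between 5 and 59 minutes")
    return f"00 00/{interval} * * * ?"


print(daily_cron(5, 30))         # 00 30 05 * * ?
print(every_n_minutes_cron(10))  # 00 00/10 * * * ?
```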

CycleType

String

No

NOT_DAY

The type of the scheduling cycle of the node that corresponds to the file. Valid values: NOT_DAY and DAY. The value NOT_DAY indicates that the node is scheduled to run by minute or hour. The value DAY indicates that the node is scheduled to run by day, week, or month.

This parameter corresponds to the Scheduling Cycle parameter in the Schedule section of the Properties tab in the DataWorks console.

DependentType

String

No

USER_DEFINE

The type of the cross-cycle scheduling dependency of the node. Valid values:

  • SELF: The instance generated for the node in the current cycle depends on the instance generated for the node in the previous cycle.

  • CHILD: The instance generated for the node in the current cycle depends on the instances generated for the descendant nodes at the nearest level of the node in the previous cycle.

  • USER_DEFINE: The instance generated for the node in the current cycle depends on the instances generated for one or more specified nodes in the previous cycle.

  • NONE: No cross-cycle scheduling dependency type is selected for the node.

DependentNodeIdList

String

No

5,10,15,20

The IDs of the nodes on which the node that corresponds to the file depends. This parameter takes effect when the DependentType parameter is set to USER_DEFINE. Separate multiple IDs with commas (,).

This parameter corresponds to the node IDs that you specify after you select Previous Cycle and set Depend On to Other Nodes in the Dependencies section of the Properties tab in the DataWorks console.
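
The relationship between DependentType and DependentNodeIdList can be sketched as follows. The helper name `dependency_params` is hypothetical. Requiring at least one ID for USER_DEFINE, and rejecting IDs for the other types, are assumptions of this example; the service may behave differently.

```python
VALID_DEPENDENT_TYPES = {"SELF", "CHILD", "USER_DEFINE", "NONE"}


def dependency_params(dependent_type: str, node_ids=()) -> dict:
    """Return the cross-cycle dependency parameters as a request-ready dict."""
    if dependent_type not in VALID_DEPENDENT_TYPES:
        raise ValueError(f"unknown DependentType: {dependent_type}")
    params = {"DependentType": dependent_type}
    if dependent_type == "USER_DEFINE":
        if not node_ids:
            # Assumption: USER_DEFINE is only useful with at least one node ID.
            raise ValueError("USER_DEFINE requires at least one node ID")
        params["DependentNodeIdList"] = ",".join(str(i) for i in node_ids)
    elif node_ids:
        raise ValueError("node IDs are only used with USER_DEFINE")
    return params


print(dependency_params("USER_DEFINE", [5, 10, 15, 20]))
```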

InputList

String

No

project_root,project.file1,project.001_out

The output name of the parent file on which the current file depends. If you specify multiple output names, separate them with commas (,).

This parameter corresponds to the Parent Nodes parameter that is displayed after you select Same Cycle in the Dependencies section of the Properties tab in the DataWorks console.

ProjectIdentifier

String

No

dw_project

The name of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace page to obtain the workspace name.

You must configure either this parameter or the ProjectId parameter to determine the DataWorks workspace to which the operation is applied.
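
A client can enforce this rule before it sends a request. The following sketch uses a hypothetical helper name, `workspace_params`. It accepts both parameters together because the sample request in this topic passes both.

```python
def workspace_params(project_id=None, project_identifier=None) -> dict:
    """Return the workspace-selecting parameters, requiring at least one."""
    if project_id is None and not project_identifier:
        raise ValueError("specify ProjectId or ProjectIdentifier")
    params = {}
    if project_id is not None:
        params["ProjectId"] = int(project_id)
    if project_identifier:
        params["ProjectIdentifier"] = project_identifier
    return params


print(workspace_params(project_identifier="dw_project"))
```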

FileId

Long

Yes

100000001

The file ID. You can call the ListFiles operation to obtain the ID.

OutputList

String

No

dw_project.ods_user_info_d

The output name of the current file.

This parameter corresponds to the Output parameter in the Dependencies section of the Properties tab in the DataWorks console.

ResourceGroupIdentifier

String

No

default_group

The identifier of the resource group that is used to run the node that corresponds to the file. You can call the ListResourceGroups operation to query the available resource groups in the workspace.

ConnectionName

String

No

odps_source

The name of the data source that is used to run the node. You can call the ListDataSources operation to query the available data sources of the workspace.

Owner

String

No

18023848927592

The file owner ID.

AutoParsing

Boolean

No

true

Specifies whether to enable the automatic parsing feature for the file. Valid values:

  • true

  • false

This parameter corresponds to the Automatic Parsing From Code Before Node Committing parameter that is displayed after you select Same Cycle in the Dependencies section of the Properties tab in the DataWorks console.

SchedulerType

String

No

NORMAL

The scheduling type of the node. Valid values:

  • NORMAL: The node is an auto triggered node.

  • MANUAL: The node is a manually triggered node. Manually triggered nodes cannot be automatically triggered. They correspond to the nodes in the Manually Triggered Workflows pane.

  • PAUSE: The node is a paused node.

  • SKIP: The node is a dry-run node. Dry-run nodes are started as scheduled but the scheduling system sets the status of the nodes to successful when the scheduling system starts to run the nodes.

AdvancedSettings

String

No

{"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"}

The advanced configurations of the node.

This parameter is valid only for an EMR Spark Streaming node or an EMR Streaming SQL node. This parameter corresponds to the Advanced Settings tab of the node in the DataWorks console.

The value of this parameter must be in the JSON format.

StartImmediately

Boolean

No

true

Specifies whether to immediately run a node after the node is deployed to the production environment. Valid values:

  • true

  • false

This parameter is valid only for an EMR Spark Streaming node or an EMR Streaming SQL node. This parameter corresponds to the Start Method parameter in the Schedule section of the Configure tab in the DataWorks console.

InputParameters

String

No

[{"ValueSource": "project_001.first_node:bizdate_param","ParameterName": "bizdate_input"}]

The input parameters of the node. The value of this parameter must be in the JSON format. For more information about the input parameters, see the InputContextParameterList parameter in the Response parameters section of the GetFile operation.

This parameter corresponds to the Input Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.

OutputParameters

String

No

[{"Type": 1,"Value": "${bizdate}","ParameterName": "bizdate_param"}]

The output parameters of the node. The value of this parameter must be in the JSON format. For more information about the output parameters, see the OutputContextParameterList parameter in the Response parameters section of the GetFile operation.

This parameter corresponds to the Output Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.
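
InputParameters and OutputParameters are JSON strings, so building them from native structures avoids quoting mistakes. In the following sketch, the helper name `encode_context_parameters` is hypothetical, and the key names are taken only from the example values above. Treating those keys as mandatory is an assumption.

```python
import json


def encode_context_parameters(entries: list, required_keys: set) -> str:
    """Serialize context parameters to JSON after a basic key check."""
    for entry in entries:
        missing = required_keys - set(entry)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
    return json.dumps(entries)


input_parameters = encode_context_parameters(
    [{"ValueSource": "project_001.first_node:bizdate_param",
      "ParameterName": "bizdate_input"}],
    {"ValueSource", "ParameterName"},
)
output_parameters = encode_context_parameters(
    [{"Type": 1, "Value": "${bizdate}", "ParameterName": "bizdate_param"}],
    {"Type", "Value", "ParameterName"},
)
print(input_parameters)
print(output_parameters)
```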

IgnoreParentSkipRunningProperty

Boolean

No

true

Specifies whether to skip the dry-run property of the ancestor nodes of the node. This parameter corresponds to the Skip the dry-run property of the ancestor node parameter that is displayed after you configure the Depend On parameter in the Dependencies section of the Properties tab in the DataWorks console.

Response parameters

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| HttpStatusCode | Integer | 200 | The HTTP status code. |
| ErrorMessage | String | The connection does not exist. | The error message. |
| RequestId | String | 0000-ABCD-EFGH-IJKLMNOPQ | The request ID. |
| ErrorCode | String | Invalid.Tenant.ConnectionNotExists | The error code. |
| Success | Boolean | true | Indicates whether the request was successful. Valid values: true and false. |

Examples

Sample requests

http(s)://[Endpoint]/?Action=UpdateFile
&FileFolderPath=Workflow/1/Data Integration/Folder 1/Folder 2
&ProjectId=10000
&FileName=ods_user_info_d
&FileDescription=File description
&Content=SELECT "1";
&AutoRerunTimes=3
&AutoRerunIntervalMillis=120000
&RerunMode=ALL_ALLOWED
&Stop=false
&ParaValue=x=a y=b z=c
&StartEffectDate=936923400000
&EndEffectDate=4155787800000
&CronExpress=00 00-59/5 1-23 * * ?
&CycleType=NOT_DAY
&DependentType=USER_DEFINE
&DependentNodeIdList=5,10,15,20
&InputList=project_root,project.file1,project.001_out
&ProjectIdentifier=dw_project
&FileId=100000001
&OutputList=dw_project.ods_user_info_d
&ResourceGroupIdentifier=default_group
&ConnectionName=odps_source
&Owner=18023848927592
&AutoParsing=true
&SchedulerType=NORMAL
&AdvancedSettings={"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"}
&StartImmediately=true
&InputParameters=[{"ValueSource": "project_001.first_node:bizdate_param","ParameterName": "bizdate_input"}]
&OutputParameters=[{"Type": 1,"Value": "${bizdate}","ParameterName": "bizdate_param"}]
&IgnoreParentSkipRunningProperty=true
&<Common request parameters>
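
The sample request above is shown unencoded for readability. Values that contain spaces, quotation marks, or characters such as `*` and `?` must be URL-encoded before they are sent. The following sketch builds only the operation-specific part of the query string. The helper name `build_update_file_query` is hypothetical, and the common request parameters and the request signature are deliberately left out.

```python
from urllib.parse import parse_qs, quote, urlencode


def build_update_file_query(file_id: int, auto_rerun_times: int, **fields) -> str:
    """URL-encode UpdateFile parameters; None values are skipped."""
    params = {
        "Action": "UpdateFile",
        "FileId": file_id,
        "AutoRerunTimes": auto_rerun_times,
    }
    for name, value in fields.items():
        if value is None:
            continue
        if isinstance(value, bool):
            # The samples use lowercase true/false for Boolean values.
            value = "true" if value else "false"
        params[name] = value
    return urlencode(params, quote_via=quote)


query = build_update_file_query(
    100000001, 3, ProjectId=10000, CronExpress="00 30 05 * * ?", Stop=False
)
print(query)
print(parse_qs(query)["CronExpress"])  # ['00 30 05 * * ?']
```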

Sample success responses

XML format

HTTP/1.1 200 OK
Content-Type:application/xml

<UpdateFileResponse>
    <HttpStatusCode>200</HttpStatusCode>
    <ErrorMessage>The connection does not exist.</ErrorMessage>
    <RequestId>0000-ABCD-EFGH-IJKLMNOPQ</RequestId>
    <ErrorCode>Invalid.Tenant.ConnectionNotExists</ErrorCode>
    <Success>true</Success>
</UpdateFileResponse>

JSON format

HTTP/1.1 200 OK
Content-Type:application/json

{
  "HttpStatusCode" : 200,
  "ErrorMessage" : "The connection does not exist.",
  "RequestId" : "0000-ABCD-EFGH-IJKLMNOPQ",
  "ErrorCode" : "Invalid.Tenant.ConnectionNotExists",
  "Success" : true
}
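
A caller typically checks the Success field before it trusts the result. The following sketch parses a JSON response body shaped like the sample above. The names `UpdateFileError` and `check_update_file_response` are hypothetical and not part of any SDK.

```python
import json


class UpdateFileError(Exception):
    """Raised when the response reports Success=false."""


def check_update_file_response(body: str) -> str:
    """Return the request ID on success, or raise with the error details."""
    payload = json.loads(body)
    if not payload.get("Success"):
        raise UpdateFileError(
            payload.get("ErrorCode"),
            payload.get("ErrorMessage"),
            payload.get("RequestId"),
        )
    return payload["RequestId"]


sample_body = '{"HttpStatusCode": 200, "RequestId": "0000-ABCD-EFGH-IJKLMNOPQ", "Success": true}'
print(check_update_file_response(sample_body))  # 0000-ABCD-EFGH-IJKLMNOPQ
```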

Error codes

| HTTP status code | Error code | Error message | Description |
| --- | --- | --- | --- |
| 429 | Throttling.Api | The request for this resource has exceeded your available limit. | The number of requests for the resource has exceeded the upper limit. |
| 429 | Throttling.System | The DataWorks system is busy. Try again later. | The DataWorks system is busy. Try again later. |
| 429 | Throttling.User | Your request is too frequent. Try again later. | Excessive requests have been submitted within a short period of time. Try again later. |
| 500 | InternalError.System | An internal system error occurred. Try again later. | An internal error occurred. Try again later. |
| 500 | InternalError.UserId.Missing | An internal system error occurred. Try again later. | An internal error occurred. Try again later. |

For a list of error codes, see Service error codes.
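
Because the throttling errors ask the caller to try again later, a client can wrap the call in a retry loop with exponential backoff. The following is a minimal sketch under stated assumptions: `send` is a hypothetical callable that performs the request and returns the parsed response as a dict, and the attempt count and delays are arbitrary illustrative defaults. Retrying only the three 429 codes is a choice of this example.

```python
import time

RETRYABLE_CODES = {"Throttling.Api", "Throttling.System", "Throttling.User"}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it is no longer throttled or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        throttled = (not response.get("Success")
                     and response.get("ErrorCode") in RETRYABLE_CODES)
        if not throttled or attempt == max_attempts:
            return response
        # Wait 1x, 2x, 4x, ... the base delay between attempts.
        sleep(base_delay * 2 ** (attempt - 1))


# Demonstration with a fake sender that is throttled twice, then succeeds.
replies = iter([
    {"Success": False, "ErrorCode": "Throttling.User"},
    {"Success": False, "ErrorCode": "Throttling.System"},
    {"Success": True, "RequestId": "0000-ABCD-EFGH-IJKLMNOPQ"},
])
delays = []
result = call_with_retry(lambda: next(replies), sleep=delays.append)
print(result["Success"], delays)  # True [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.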