CreateFile - DataWorks - Alibaba Cloud Documentation Center

Creates a file in DataStudio. You cannot call this operation to create Data Integration nodes.

Debugging

You can call this operation directly in OpenAPI Explorer, which saves you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

No authorization information is currently disclosed for this operation.

Request parameters

Parameter	Type	Required	Description	Example
FileFolderPath	string	Yes	The file path.	Business_process/First_Business_Process/MaxCompute/Folder_1/Folder_2
ProjectId	long	No	The DataWorks workspace ID. To obtain the workspace ID, log on to the DataWorks console and navigate to the workspace configuration page. You must specify either this parameter or ProjectIdentifier to identify the target DataWorks workspace for this API call.	10000
FileName	string	Yes	The file name.	File name
FileDescription	string	No	The description of the file.	test
FileType	integer	Yes	The code of the file type. Each file type has its own code. For more information, see DataWorks node types. You can call the ListFileType operation to query the codes of file types.	10
Owner	string	No	The Alibaba Cloud account ID of the file owner. If this parameter is not specified, the Alibaba Cloud account ID of the caller is used by default.	1000000000001
Content	string	No	The code content of the file. The code format varies by file type (FileType). To view the format for a given type, find a task of that type in Operation Center, right-click it, and select View Code.	SHOW TABLES;
AutoRerunTimes	integer	No	The number of automatic reruns after an error occurs. Maximum value: 10.	3
AutoRerunIntervalMillis	integer	No	The interval at which the node is automatically rerun after a failure. Unit: milliseconds. Maximum value: 1800000 milliseconds (30 minutes). This parameter corresponds to the Rerun interval parameter in Properties > Schedule > Auto Rerun upon Failure for data development nodes in the DataWorks console. In the console, the unit of the rerun interval is minutes. Convert the time unit when you call this operation.	120000
RerunMode	string	No	The rerun policy. Valid values: ALL_ALLOWED: Reruns are allowed regardless of whether the task succeeds or fails. FAILURE_ALLOWED: Reruns are allowed only when the task fails. ALL_DENIED: Reruns are not allowed regardless of whether the task succeeds or fails. This parameter corresponds to the Support for Rerun setting in Scheduling > Scheduling Policies for Data Studio tasks in the DataWorks console.	ALL_ALLOWED
Stop	boolean	No	Specifies whether to skip execution. Valid values: true and false. This parameter corresponds to the Skip Execution option in Properties > Schedule > Recurrence for data development nodes in the DataWorks console.	false
ParaValue	string	No	The scheduling parameters of the node. Separate multiple parameters with spaces. This parameter corresponds to the Scheduling Parameter setting in Properties for data development nodes in the DataWorks console. For more information, see Scheduling parameters.	a=x b=y
StartEffectDate	long	No	The timestamp (in milliseconds) when automatic scheduling starts. This parameter corresponds to the start time of Effective Period in Scheduling > Scheduling Time for Data Studio tasks in the DataWorks console.	1671608450000
EndEffectDate	long	No	The timestamp (in milliseconds) when automatic scheduling stops. This parameter corresponds to the end time of Effective Period in Scheduling > Scheduling Time for Data Studio tasks in the DataWorks console.	1671694850000
CronExpress	string	No	The cron expression for scheduled execution. This parameter corresponds to the Cron Expression setting in Scheduling > Scheduling Time for Data Studio tasks in the DataWorks console. After you configure Scheduling Cycle and Scheduled Time, DataWorks automatically generates a cron expression. Examples: Scheduled at 05:30 every day: `00 30 05 * * ?`. Scheduled at the 15th minute of every hour: `00 15 00-23/1 * * ?`. Scheduled every 10 minutes: `00 00/10 * * * ?`. Scheduled every 10 minutes between 08:00 and 17:00 every day: `00 00-59/10 8-17 * * ?`. Scheduled at 00:20 on the 1st day of every month: `00 20 00 1 * ?`. Scheduled every 3 months starting from 00:10 on January 1: `00 10 00 1 1-12/3 ?`. Scheduled at 00:05 on every Tuesday and Friday: `00 05 00 * * 2,5`. Due to the rules of the DataWorks scheduling system, cron expressions have the following restrictions: The minimum scheduling interval is 5 minutes. The earliest scheduling time each day is 00:05.	00 05 00 * * ?
CycleType	string	No	The type of scheduling cycle. Valid values: NOT_DAY (minute, hour) and DAY (day, week, month). This parameter corresponds to the Scheduling Cycle setting in Scheduling > Scheduling Time for Data Studio tasks in the DataWorks console.	DAY
DependentType	string	No	The dependency mode on the previous cycle. Valid values: SELF: Depends on the current node. CHILD: Depends on the child nodes. USER_DEFINE: Depends on other nodes. NONE: No dependencies. Does not depend on the previous cycle. USER_DEFINE_AND_SELF: Depends on both the current node and other nodes in the previous cycle. CHILD_AND_SELF: Depends on both the current node and its child nodes in the previous cycle.	NONE
DependentNodeIdList	string	No	The IDs of the nodes on which the current node depends. This parameter takes effect only when the DependentType parameter is set to USER_DEFINE. Separate multiple node IDs with commas (,). This parameter corresponds to the Other Nodes option in Properties > Dependencies > Cross-cycle Dependency (Original Previous-cycle Dependency) for data development nodes in the DataWorks console.	abc
InputList	string	Yes	The output names of the ancestor nodes on which the current node depends. Separate multiple output names with commas (,). This parameter corresponds to the Output Name of Ancestor Node setting in Properties > Dependencies for data development nodes in the DataWorks console.	project_root,project.file1,project.001_out
ProjectIdentifier	string	No	The DataWorks workspace name. To obtain the workspace name, log on to the DataWorks console and navigate to the workspace configuration page. You must specify either this parameter or ProjectId to identify the target DataWorks workspace for this API call.	dw_project
ResourceGroupIdentifier	string	No	The resource group for the task published from the file. To obtain the ID, log on to the DataWorks console, navigate to the workspace configuration page, and click Resource Groups in the left-side navigation pane to view the IDs of resource groups bound to the current workspace.	S_res_group_559_1613715566828
ResourceGroupId	long	No	This parameter is deprecated.	375827434852437
ConnectionName	string	No	The data source used when the task published from the file is run. You can call the ListDataSources operation to query the available data sources in the workspace.	odps_source
AutoParsing	boolean	No	Specifies whether to enable automatic parsing for the file. Valid values: true and false. This parameter corresponds to the Analyze Code setting in Properties > Dependencies for data development nodes in the DataWorks console.	true
SchedulerType	string	No	The scheduling type. Valid values: NORMAL: Normal scheduled task. MANUAL: Manually triggered node. Not scheduled for daily execution. Corresponds to nodes in manually triggered workflows. PAUSE: Paused task. SKIP: Dry-run task. Scheduled for daily execution but is directly marked as successful when scheduling starts.	NORMAL
AdvancedSettings	string	No	The advanced settings of the node. This parameter corresponds to the Advanced Settings section in the right-side navigation pane on the configuration tab of EMR Spark Streaming and EMR Streaming SQL nodes in the DataWorks console. Only EMR Spark Streaming and EMR Streaming SQL nodes support this parameter. The value must be in the JSON format.	{"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"}
StartImmediately	boolean	No	Specifies whether to immediately run the node after the node is deployed. This parameter corresponds to the Start Method setting in Settings > Schedule in the right-side navigation pane on the configuration tab of EMR Spark Streaming and EMR Streaming SQL nodes in the DataWorks console.	true
InputParameters	string	No	The input context parameters of the node. The value must be in the JSON format. For more information about the parameter structure, see the InputContextParameterList parameter in the response parameters of the GetFile operation. This parameter corresponds to the Input Parameters setting in Properties > Input and Output Parameters for data development nodes in the DataWorks console.	[{"ValueSource": "project_001.first_node:bizdate_param","ParameterName": "bizdate_input"}]
OutputParameters	string	No	The output context parameters of the node. The value must be in the JSON format. For more information about the parameter structure, see the OutputContextParameterList parameter in the response parameters of the GetFile operation. This parameter corresponds to the Output Parameters setting in Properties > Input and Output Parameters for data development nodes in the DataWorks console.	[{"Type": 1,"Value": "${bizdate}","ParameterName": "bizdate_param"}]
IgnoreParentSkipRunningProperty	boolean	No	Specifies whether to inherit the dry-run status from the previous cycle. Valid values: true: Inherit the dry-run status from the previous cycle. false: Do not inherit the dry-run status from the previous cycle.	false
CreateFolderIfNotExists	boolean	No	Specifies whether to automatically create the directory specified by FileFolderPath if the directory does not exist. Valid values: true: If the directory does not exist, automatically create it. false: If the directory does not exist, the call fails.	false
ApplyScheduleImmediately	boolean	No	Specifies whether to apply the scheduling configuration immediately after the file is published.	true
Timeout	integer	No	The timeout settings for scheduling configuration.	1
ImageId	string	No	The custom image ID.	m-bp1h4b5a8ogkbll2f3tr
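
Several of the parameters above carry constraints that are easy to get wrong when assembling a request by hand: the rerun interval is in milliseconds while the console shows minutes, InputList is comma-separated, ParaValue is space-separated, and one of ProjectId or ProjectIdentifier must be present. The following is a minimal sketch of a validation helper under those documented rules. The function name `build_create_file_params` and its argument names are hypothetical and are not part of any Alibaba Cloud SDK; the helper only builds a plain dictionary and does not send a request.

```python
from typing import Any, Dict, List, Optional

MAX_AUTO_RERUN_TIMES = 10          # documented maximum for AutoRerunTimes
MAX_RERUN_INTERVAL_MS = 1_800_000  # documented maximum: 30 minutes


def build_create_file_params(
    file_folder_path: str,
    file_name: str,
    file_type: int,
    input_list: List[str],
    project_id: Optional[int] = None,
    project_identifier: Optional[str] = None,
    auto_rerun_times: Optional[int] = None,
    rerun_interval_minutes: Optional[int] = None,
    scheduling_params: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble and validate CreateFile parameters."""
    # The workspace must be identified by ID or by name.
    if project_id is None and project_identifier is None:
        raise ValueError("Specify ProjectId or ProjectIdentifier")

    params: Dict[str, Any] = {
        "FileFolderPath": file_folder_path,
        "FileName": file_name,
        "FileType": file_type,
        # Output names of ancestor nodes are comma-separated.
        "InputList": ",".join(input_list),
    }
    if project_id is not None:
        params["ProjectId"] = project_id
    if project_identifier is not None:
        params["ProjectIdentifier"] = project_identifier

    if auto_rerun_times is not None:
        if not 0 <= auto_rerun_times <= MAX_AUTO_RERUN_TIMES:
            raise ValueError("AutoRerunTimes must be between 0 and 10")
        params["AutoRerunTimes"] = auto_rerun_times

    if rerun_interval_minutes is not None:
        # The console uses minutes; the API expects milliseconds.
        interval_ms = rerun_interval_minutes * 60 * 1000
        if not 0 <= interval_ms <= MAX_RERUN_INTERVAL_MS:
            raise ValueError("AutoRerunIntervalMillis must not exceed 1800000")
        params["AutoRerunIntervalMillis"] = interval_ms

    if scheduling_params:
        # Scheduling parameters are space-separated key=value pairs.
        params["ParaValue"] = " ".join(
            f"{key}={value}" for key, value in scheduling_params.items()
        )
    return params
```

For example, passing `rerun_interval_minutes=2` yields `AutoRerunIntervalMillis` of 120000, which matches the example value in the table.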

Response parameters

Parameter	Type	Description	Example
	object	The response.
HttpStatusCode	integer	The HTTP status code.	200
Data	long	The file ID.	1000001
RequestId	string	The request ID. Use this ID to troubleshoot issues.	0000-ABCD-EFG
ErrorMessage	string	The error message.	The connection does not exist.
Success	boolean	Indicates whether the call succeeded. Valid values: true: The call succeeded. false: The call failed.	true
ErrorCode	string	The error code.	Invalid.Tenant.ConnectionNotExists

Examples

Sample success responses

JSON format

{
  "HttpStatusCode": 200,
  "Data": 1000001,
  "RequestId": "0000-ABCD-EFG",
  "ErrorMessage": "The connection does not exist.",
  "Success": true,
  "ErrorCode": "Invalid.Tenant.ConnectionNotExists"
}
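
A caller typically needs only the file ID from a successful response, plus enough detail to troubleshoot a failure. The sketch below shows one way to do that with the response fields documented above. The names `CreateFileError` and `extract_file_id` are hypothetical. Because the sample response carries ErrorMessage and ErrorCode values even though Success is true, this sketch relies only on the Success field to decide the outcome; that choice is an assumption, not documented behavior.

```python
import json
from typing import Any, Dict


class CreateFileError(Exception):
    """Hypothetical exception that carries the documented error fields."""

    def __init__(self, code: Any, message: Any, request_id: Any) -> None:
        super().__init__(f"{code}: {message} (RequestId={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def extract_file_id(response: Dict[str, Any]) -> int:
    """Return the file ID (Data) or raise if Success is not true."""
    if not response.get("Success"):
        raise CreateFileError(
            response.get("ErrorCode"),
            response.get("ErrorMessage"),
            response.get("RequestId"),
        )
    return int(response["Data"])


# The sample success response from this page.
sample = json.loads("""
{
  "HttpStatusCode": 200,
  "Data": 1000001,
  "RequestId": "0000-ABCD-EFG",
  "ErrorMessage": "The connection does not exist.",
  "Success": true,
  "ErrorCode": "Invalid.Tenant.ConnectionNotExists"
}
""")
print(extract_file_id(sample))  # prints 1000001
```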

Error codes

HTTP status code	Error code	Error message	Description
403	Forbidden.Access	Access is forbidden. Please first activate DataWorks Enterprise Edition or Flagship Edition.	You do not have the required permissions. Obtain authorization first.
429	Throttling.Api	The request for this resource has exceeded your available limit.	-
429	Throttling.System	The DataWorks system is busy. Try again later.	-
429	Throttling.User	Your request is too frequent. Try again later.	-
500	InternalError.System	An internal system error occurred. Try again later.	-
500	InternalError.UserId.Missing	An internal system error occurred. Try again later.	-

For a complete list of error codes, see Service error codes.
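
Most of the 429 and 500 errors listed above ask the caller to try again later, while the 403 error does not. The following is a minimal sketch of a retry loop with exponential backoff built on that observation. Treating every 429 and 500 response as retryable, the delay values, and the names `call_with_retry` and `send` are all assumptions made for illustration; `send` stands for any callable that performs the request and returns the parsed response as a dictionary.

```python
import time
from typing import Any, Callable, Dict

# Assumption: 429 (throttling) and 500 (internal error) are worth retrying.
RETRYABLE_STATUS = {429, 500}


def call_with_retry(
    send: Callable[[], Dict[str, Any]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    response: Dict[str, Any] = {}
    for attempt in range(1, max_attempts + 1):
        response = send()
        retryable = response.get("HttpStatusCode") in RETRYABLE_STATUS
        if not retryable or attempt == max_attempts:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
    return response
```

The `sleep` argument is injectable so that the loop can be exercised in tests without waiting. The last response is returned as-is, so the caller can still inspect ErrorCode and RequestId when all attempts fail.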