Creates a file in Data Analytics.

Debugging

OpenAPI Explorer automatically calculates the signature value. For your convenience, we recommend that you call this operation in OpenAPI Explorer. OpenAPI Explorer dynamically generates the sample code of the operation for different SDKs.

Request parameters

Parameter Type Required Example Description
Action String Yes CreateFile

The operation that you want to perform.

AutoRerunTimes Integer Yes 3

The number of automatic reruns after an error occurs. Maximum value: 10.

FileDescription String Yes Here is the file description

The description of the file.

FileFolderPath String Yes Workflow/first workflow/data integration/Folder 1/Folder 2

The path of the file.

FileName String Yes File

The name of the file.

FileType Integer Yes 10

The code type of the file. Common code types include 6 (Shell), 10 (ODPS SQL), 11 (ODPS MR), 23 (data integration), 24 (ODPS Script), 99 (virtual node), 221 (PyODPS 2), 225 (ODPS Spark), 227 (EMR Hive), 228 (EMR Spark), 229 (EMR Spark SQL), 230 (EMR MR), 239 (OSS object inspection), 257 (EMR Shell), 258 (EMR Spark Shell), 259 (EMR Presto), 260 (EMR Impala), 900 (real-time synchronization), 1089 (cross-tenant collaboration node), 1091, 1093, and 1221 (PyODPS 3).
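Because FileType is a bare integer, a small lookup table can make request code easier to read. The following is a minimal sketch, not part of any SDK: `FILE_TYPE_CODES` and `file_type_code` are hypothetical names, and the table covers only the codes that the description above names unambiguously.

```python
# Hypothetical lookup table built from the codes named in the FileType description.
FILE_TYPE_CODES = {
    "Shell": 6,
    "ODPS SQL": 10,
    "ODPS MR": 11,
    "data integration": 23,
    "ODPS Script": 24,
    "virtual node": 99,
    "PyODPS 2": 221,
    "ODPS Spark": 225,
    "EMR Hive": 227,
    "EMR Spark": 228,
    "EMR Spark SQL": 229,
    "EMR MR": 230,
    "OSS object inspection": 239,
    "EMR Shell": 257,
    "EMR Spark Shell": 258,
    "EMR Presto": 259,
    "EMR Impala": 260,
    "real-time synchronization": 900,
    "PyODPS 3": 1221,
}


def file_type_code(name: str) -> int:
    """Return the FileType code for a node type name, ignoring case."""
    for known_name, code in FILE_TYPE_CODES.items():
        if known_name.lower() == name.lower():
            return code
    raise ValueError(f"unknown file type: {name!r}")
```

For example, `file_type_code("ODPS SQL")` yields the value 10 used in the sample request.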

InputList String Yes project_root,project.file1,project.001_out

The output names of the parent nodes on which the file depends. This parameter corresponds to scheduling configuration > dependencies > parent node output name on the properties page. Separate multiple output names with commas (,).

ProjectId Long Yes 10000

The ID of the DataWorks workspace. To find the ID, click the workspace management icon in the upper-right corner and view the workspace information.

RegionId String Yes cn-zhangjiakou

The ID of the region where the service is located.

Owner String No 1000000000001

The Alibaba Cloud User ID of the owner of the file. If this parameter is not specified, the user ID of the caller is used.

Content String No SHOW TABLES;

The code of the file. The code format varies with the file type. To see the format for a given type, find a task of that type in O&M, right-click the task, and select View code.

AutoRerunIntervalMillis Integer No 120000

The interval between automatic reruns after an error. Unit: milliseconds. Maximum value: 1800000. This parameter corresponds to the automatic rerun interval under scheduling configuration > time properties on the properties page. The page shows the interval in minutes, so convert the value to milliseconds before you call this operation.
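The minutes-to-milliseconds conversion is easy to get wrong, so a small helper can do it and enforce the documented maximum. This is a minimal sketch; `rerun_interval_millis` is a hypothetical helper name, not an SDK function.

```python
MAX_RERUN_INTERVAL_MS = 1_800_000  # documented maximum (30 minutes)


def rerun_interval_millis(minutes: int) -> int:
    """Convert a rerun interval in minutes (as shown on the page) to milliseconds."""
    millis = minutes * 60 * 1000
    if not 0 <= millis <= MAX_RERUN_INTERVAL_MS:
        raise ValueError(
            f"interval must be between 0 and {MAX_RERUN_INTERVAL_MS} ms, got {millis}"
        )
    return millis
```

A two-minute interval on the page corresponds to the example value 120000.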

RerunMode String No ALL_ALLOWED

The rerun mode. Valid values: ALL_ALLOWED (the task can be rerun after either a successful or a failed run), FAILURE_ALLOWED (the task can be rerun only after a failed run), and ALL_DENIED (the task cannot be rerun after either a successful or a failed run). This parameter corresponds to the rerun property in the scheduling configuration on the properties page.

Stop Boolean No false

Specifies whether to disable scheduling. This parameter corresponds to scheduling configuration > scheduling > disable scheduling on the properties page.

ParaValue String No a=x b=y

The scheduling parameters. This parameter corresponds to scheduling configuration > basic configuration > parameters on the properties page.
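Judging from the example value `a=x b=y`, ParaValue appears to be a space-separated list of `name=value` pairs. The sketch below builds such a string from a dictionary; the format is inferred from the example only, and `build_para_value` is a hypothetical helper name.

```python
def build_para_value(parameters: dict) -> str:
    """Join scheduling parameters as space-separated name=value pairs.

    Assumption: the format is inferred from the documented example "a=x b=y".
    """
    return " ".join(f"{name}={value}" for name, value in parameters.items())
```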

StartEffectDate Long No 936923400000

The time when automatic scheduling takes effect, as a timestamp in milliseconds. This parameter corresponds to the start time under scheduling configuration > time properties > effective time on the properties page.

EndEffectDate Long No 4155787800000

The time when automatic scheduling stops, as a timestamp in milliseconds. This parameter corresponds to the end time under scheduling configuration > time properties > effective time on the properties page.
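StartEffectDate and EndEffectDate both expect milliseconds since the Unix epoch. The following sketch converts between timezone-aware `datetime` values and such timestamps using only the Python standard library. The helper names are hypothetical, and UTC is used purely for illustration, since this page does not state which time zone the console displays.

```python
from datetime import datetime, timezone


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid ambiguity")
    return round(moment.timestamp() * 1000)


def from_epoch_millis(millis: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
```

The example value 936923400000 corresponds to 1999-09-10 00:30:00 UTC.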

CronExpress String No 00 05 00 * * ?

The cron expression used for automatic scheduling. This parameter corresponds to scheduling configuration > time properties > cron expression on the properties page. Examples:

  • Run at 05:30 every day: 00 30 05 * * ?
  • Run at minute 15 of every hour: 00 15 * * * ?
  • Run every 10 minutes: 00 00/10 * * * ?
  • Run every 10 minutes from 08:00 to 17:00 every day: 00 00-59/10 8-17 * * ?
  • Run at 00:20 on the 1st day of every month: 00 20 00 1 * ?
  • Run at 00:10 on the 1st day of every third month, starting from January: 00 10 00 1 1-12/3 ?
  • Run at 00:05 every Tuesday and Friday: 00 05 00 * * 2,5

Under the DataWorks scheduling rules, cron expressions have the following restrictions:

  • The shortest scheduling gap is 5 minutes.
  • The earliest scheduling time is 00:05 every day.
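
The examples above use six space-separated fields (second, minute, hour, day of month, month, day of week). As a minimal sketch under that assumption, the hypothetical helper below builds a daily expression in the same layout and rejects times earlier than 00:05. How the service itself enforces the restrictions is not described here, so treat the check as illustrative only.

```python
def daily_cron(hour: int, minute: int) -> str:
    """Build a six-field cron expression for a task that runs once a day."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Documented restriction: the earliest scheduling time is 00:05 every day.
    if hour == 0 and minute < 5:
        raise ValueError("the earliest scheduling time is 00:05")
    return f"00 {minute:02d} {hour:02d} * * ?"
```

`daily_cron(5, 30)` reproduces the "05:30 every day" example, and `daily_cron(0, 5)` reproduces the example value of CronExpress.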
CycleType String No DAY

The type of the scheduling cycle. Valid values: NOT_DAY (minute, hour) and DAY (day, week, month). This parameter corresponds to scheduling configuration > time properties > scheduling cycle on the properties page.

DependentType String No NONE

The type of cross-cycle dependency. Valid values: NONE (no cross-cycle dependency), SELF, CHILD, and USER_DEFINE (custom).

DependentNodeIdList String No abc

The list of nodes from the previous cycle on which the file depends.

ProjectIdentifier String No dw_project

The unique identifier of the DataWorks workspace. It is the workspace identifier displayed at the top of the workspace page.

You must specify either this parameter or the ProjectId parameter to identify the DataWorks workspace that the call targets.

ResourceGroupIdentifier String No group_375827434852437

The identifier of the resource group on which the task runs after the file is published as a task. This parameter corresponds to the resource group setting on the scheduling configuration page.

You can call the ListResourceGroups operation to obtain the resource groups available to a workspace. Set ResourceGroupType to 1 and read the Identifier field from the result.

ResourceGroupId Long No 375827434852437

The ID of the resource group on which the task runs after the file is published as a task. This parameter corresponds to the resource group setting on the scheduling configuration page. Specify either this parameter or ResourceGroupIdentifier.

You can call the ListResourceGroups operation to obtain the resource groups available to a workspace. Set ResourceGroupType to 1 and read the Id field from the result.

ConnectionName String No odps_first

The name of the data source that the task connects to when it runs after the file is published as a task. This parameter corresponds to the select data source drop-down list at the top of the page.

You can call the ListConnections operation to obtain the data sources available to the workspace.

Response parameters

Parameter Type Example Description
Data Long 1000001

The ID of the file after it is created.

ErrorCode String Invalid.Tenant.ConnectionNotExists

The error code.

ErrorMessage String The connection does not exist.

The error message.

HttpStatusCode Integer 200

The HTTP status code.

RequestId String 0000-ABCD-EFG

The unique ID of the request. You can use this ID to troubleshoot errors.

Success Boolean true

Indicates whether the request was successful.

Examples

Sample requests


     http(s)://[Endpoint]/?Action=CreateFile
     &AutoRerunTimes=3
     &FileDescription=Here is the file description
     &FileFolderPath=Workflow/first workflow/data integration/Folder 1/Folder 2
     &FileName=File
     &FileType=10
     &InputList=project_root,project.file1,project.001_out
     &ProjectId=10000
     &RegionId=cn-zhangjiakou
     &<common request parameters>
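
The query string of such a request can be assembled with the Python standard library, which also takes care of URL-encoding spaces, slashes, and commas. This is a minimal sketch: `create_file_query` is a hypothetical helper, and it deliberately leaves out the common request parameters and the signature, which OpenAPI Explorer or an SDK would normally add.

```python
from urllib.parse import urlencode

# Parameters marked as required in the request parameter table.
REQUIRED_PARAMETERS = (
    "AutoRerunTimes", "FileDescription", "FileFolderPath", "FileName",
    "FileType", "InputList", "ProjectId", "RegionId",
)


def create_file_query(**parameters) -> str:
    """Return the URL-encoded query string for a CreateFile request.

    Common request parameters and the signature are not included.
    """
    missing = [name for name in REQUIRED_PARAMETERS if name not in parameters]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    return urlencode({"Action": "CreateFile", **parameters})


query = create_file_query(
    AutoRerunTimes=3,
    FileDescription="Here is the file description",
    FileFolderPath="Workflow/first workflow/data integration/Folder 1/Folder 2",
    FileName="File",
    FileType=10,
    InputList="project_root,project.file1,project.001_out",
    ProjectId=10000,
    RegionId="cn-zhangjiakou",
)
```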
   

Sample success responses

XML format


     <RequestId>0000-ABCD-EFG</RequestId>
     <HttpStatusCode>200</HttpStatusCode>
     <Data>1000001</Data>
     <ErrorCode>Invalid.Tenant.ConnectionNotExists</ErrorCode>
     <ErrorMessage>The connection does not exist.</ErrorMessage>
     <Success>true</Success>
   

JSON format


     {
       "RequestId": "0000-ABCD-EFG",
       "HttpStatusCode": 200,
       "Data": 1000001,
       "ErrorCode": "Invalid.Tenant.ConnectionNotExists",
       "ErrorMessage": "The connection does not exist.",
       "Success": true
     }
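
A caller would typically check Success before reading Data. The sketch below parses a JSON response body and returns the new file ID, or raises an error that carries ErrorCode, ErrorMessage, and RequestId for troubleshooting. `file_id_from_response` is a hypothetical helper, not an SDK function.

```python
import json


def file_id_from_response(body: str) -> int:
    """Return the ID of the created file, or raise if the request failed."""
    payload = json.loads(body)
    if not payload.get("Success"):
        raise RuntimeError(
            f"{payload.get('ErrorCode')}: {payload.get('ErrorMessage')} "
            f"(RequestId: {payload.get('RequestId')})"
        )
    return payload["Data"]
```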
   

Error codes

HttpCode Error code Error message
403 Forbidden.Access Access is forbidden. Activate DataWorks first.

Access is restricted. Activate DataWorks Enterprise Edition or a more advanced edition.

500 InternalError.System An internal system error occurred. Try again later.

An internal system error occurred. Try again later.

500 InternalError.UserId.Missing An internal system error occurred. Try again later.

An internal system error occurred. Try again later.

403 ResourceNotAuthorized.Api You are not authorized to access the resources.

You are not authorized to access the resources.

429 Throttling.Api The request for this resource has exceeded your available limit.

The number of requests for the resource exceeds your quota.

429 Throttling.System The DataWorks system is busy. Try again later.

The DataWorks system is busy. Try again later.

429 Throttling.User Your request is too frequent. Try again later.

Your requests are too frequent. Reduce the request rate.
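
Several of these errors advise trying again later, which suggests a retry with increasing delays. The following is a minimal sketch of that idea, not a documented client behavior: `call_with_retry` and its `send` argument (any callable that returns the parsed response as a dictionary) are hypothetical, and the set of retryable codes and the delay values are illustrative choices.

```python
import time

# Illustrative choice: retry the throttling errors and the generic internal error.
RETRYABLE_ERROR_CODES = {
    "Throttling.Api",
    "Throttling.System",
    "Throttling.User",
    "InternalError.System",
}


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, fails with a non-retryable code,
    or max_attempts is reached. Delays double after each retryable failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        retryable = response.get("ErrorCode") in RETRYABLE_ERROR_CODES
        if response.get("Success") or not retryable or attempt == max_attempts:
            return response
        sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.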

For more information about error codes, go to the Error Center.