Queries the details of a file.

Debugging

We recommend that you call this operation in OpenAPI Explorer. OpenAPI Explorer automatically calculates the signature value and dynamically generates sample code for the operation in different SDKs.

Request parameters

Parameter Type Required Example Description
Action String Yes GetFile

The operation that you want to perform. Set the value to GetFile.

ProjectId Long No 10000

The ID of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace Management page to view the workspace ID.

You must set either this parameter or the ProjectIdentifier parameter to determine the DataWorks workspace to which the operation is applied.

RegionId String Yes cn-zhangjiakou

The region ID. For example, the ID of the China (Shanghai) region is cn-shanghai, and that of the China (Zhangjiakou) region is cn-zhangjiakou. The system automatically determines the value of this parameter based on the endpoint that is used to call the operation.

ProjectIdentifier String No dw_project

The name of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace Management page to view the workspace name.

You must set either this parameter or the ProjectId parameter to determine the DataWorks workspace to which the operation is applied.

FileId Long No 100000001

The ID of the file. You can call the ListFiles operation to query the ID of the file.

NodeId Long No 200000001

The ID of the node that is scheduled.
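
The following Python sketch shows one way to assemble the request parameters described above before sending them. It is only an illustration of the either/or rule for ProjectId and ProjectIdentifier; the helper name build_get_file_params and its validation behavior are assumptions made for this example, not part of the API.

def build_get_file_params(file_id=None, node_id=None,
                          project_id=None, project_identifier=None):
    """Assemble query parameters for a GetFile call (illustrative sketch)."""
    # At least one of ProjectId or ProjectIdentifier is required so that the
    # service can determine the DataWorks workspace to operate on.
    if project_id is None and project_identifier is None:
        raise ValueError("Specify ProjectId or ProjectIdentifier.")
    params = {"Action": "GetFile"}
    if project_id is not None:
        params["ProjectId"] = str(project_id)
    if project_identifier is not None:
        params["ProjectIdentifier"] = project_identifier
    if file_id is not None:
        params["FileId"] = str(file_id)
    if node_id is not None:
        params["NodeId"] = str(node_id)
    return params

# Matches the sample request later in this topic.
params = build_get_file_params(file_id=100000001, node_id=200000001,
                               project_id=10000, project_identifier="dw_project")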

Response parameters

Parameter Type Example Description
HttpStatusCode Integer 200

The HTTP status code returned.

ErrorMessage String The connection does not exist.

The error message.

RequestId String 0000-ABCD-EFG****

The ID of the request. You can troubleshoot errors based on the ID.

ErrorCode String Invalid.Tenant.ConnectionNotExists

The error code.

Success Boolean true

Indicates whether the request is successful. Valid values:

  • true: The request is successful.
  • false: The request fails.
Data Object

The details of the file.

File Object

The basic information about the file.

CommitStatus Integer 0

Indicates whether the latest code in the file is committed. Valid values:

  • 0: The latest code in the file is not committed.
  • 1: The latest code in the file is committed.

AutoParsing Boolean true

Indicates whether the automatic parsing feature is enabled for the file. Valid values:

  • true: The automatic parsing feature is enabled for the file.
  • false: The automatic parsing feature is not enabled for the file.

This parameter is equivalent to the Auto Parse parameter in the Dependencies section of the Properties panel in the DataWorks console.

Owner String 7775674356****

The ID of the Alibaba Cloud account used by the file owner.

CreateTime Long 1593879116000

The time when the file was created. This value is a UNIX timestamp representing the number of milliseconds that have elapsed since the epoch time January 1, 1970, 00:00:00 UTC.

FileType Integer 10

The type of the code in the file. Examples: 6 (Shell), 10 (ODPS SQL), 11 (ODPS MR), 23 (Data Integration), 24 (ODPS Script), 99 (Zero load), 221 (PyODPS 2), 225 (ODPS Spark), 227 (EMR Hive), 228 (EMR Spark), 229 (EMR Spark SQL), 230 (EMR MR), 239 (OSS object inspection), 257 (EMR Shell), 258 (EMR Spark Shell), 259 (EMR Presto), 260 (EMR Impala), 900 (Real-time synchronization), 1089 (Cross-tenant collaboration), 1091 (Hologres development), 1093 (Hologres SQL), 1100 (Assignment), and 1221 (PyODPS 3).
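
When you process responses programmatically, the numeric FileType codes listed above can be mapped to readable labels. The following Python dictionary is only a convenience built from that list; the names FILE_TYPE_NAMES and file_type_name are hypothetical helpers, not part of the API.

# Mapping of FileType codes to labels, taken from the list above.
FILE_TYPE_NAMES = {
    6: "Shell", 10: "ODPS SQL", 11: "ODPS MR", 23: "Data Integration",
    24: "ODPS Script", 99: "Zero load", 221: "PyODPS 2", 225: "ODPS Spark",
    227: "EMR Hive", 228: "EMR Spark", 229: "EMR Spark SQL", 230: "EMR MR",
    239: "OSS object inspection", 257: "EMR Shell", 258: "EMR Spark Shell",
    259: "EMR Presto", 260: "EMR Impala", 900: "Real-time synchronization",
    1089: "Cross-tenant collaboration", 1091: "Hologres development",
    1093: "Hologres SQL", 1100: "Assignment", 1221: "PyODPS 3",
}

def file_type_name(code):
    """Return a readable label for a FileType code, or a fallback string."""
    return FILE_TYPE_NAMES.get(code, "Unknown file type (%d)" % code)

# file_type_name(10) -> "ODPS SQL"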

CurrentVersion Integer 3

The latest version number of the file.

BizId Long 1000001

The ID of the workflow to which the file belongs. This parameter is deprecated and replaced by the BusinessId parameter.

LastEditUser String 62465892****

The ID of the Alibaba Cloud account used by the user who last modified the file.

FileName String ods_user_info_d

The name of the file.

ConnectionName String odps_first

The ID of the compute engine instance that is used to run the node that corresponds to the file.

UseType String NORMAL

The module to which the file belongs. Valid values:

  • NORMAL: The file is used for DataStudio.
  • MANUAL: The file is used for a manually triggered node.
  • MANUAL_BIZ: The file is used for a manually triggered workflow.
  • SKIP: The file is used for a dry-run DataStudio node.
  • ADHOCQUERY: The file is used for an ad hoc query.
  • COMPONENT: The file is used for a snippet.
FileFolderId String 2735c2****

The ID of the folder to which the file belongs.

ParentId Long -1

The ID of the node group file to which the current file belongs. This parameter is returned only if the current file is an inner file of the node group file.

CreateUser String 424732****

The ID of the Alibaba Cloud account used by the user who created the file.

IsMaxCompute Boolean true

Indicates whether the file needs to be uploaded to MaxCompute.

This parameter is returned only if the file is a MaxCompute resource file.

BusinessId Long 1000001

The ID of the workflow to which the file belongs.

FileDescription String My first DataWorks file

The description of the file.

DeletedStatus String RECYCLE

The status of the file. Valid values:

  • NORMAL: The file is not deleted.
  • RECYCLE_BIN: The file is moved to the recycle bin.
  • DELETED: The file is deleted.
LastEditTime Long 1593879116000

The time when the file was last modified. This value is a UNIX timestamp representing the number of milliseconds that have elapsed since the epoch time January 1, 1970, 00:00:00 UTC.

Content String SHOW TABLES;

The code in the file.

NodeId Long 300001

The ID of the auto triggered node that is generated in the scheduling system after the file is committed.

AdvancedSettings String null

The advanced configurations of an EMR node. This parameter is valid only for an EMR Spark Streaming node or an EMR Streaming SQL node. It is equivalent to the settings in the Advanced Settings panel of the node in the DataWorks console.

The value is in the JSON format.

NodeConfiguration Object

The scheduling configurations of the file.

RerunMode String ALL_ALLOWED

Indicates whether the node can be rerun. Valid values:

  • ALL_ALLOWED: The node can be rerun regardless of whether it is run as expected or fails to run.
  • FAILURE_ALLOWED: The node can be rerun only after it fails to run.
  • ALL_DENIED: The node cannot be rerun regardless of whether it is run as expected or fails to run.

This parameter is equivalent to the Rerun parameter in the Schedule section of the Properties panel in the DataWorks console.

SchedulerType String NORMAL

The scheduling type of the node. Valid values:

  • NORMAL: The node is an auto triggered node.
  • MANUAL: The node is a manually triggered node. Manually triggered nodes cannot be automatically triggered. They correspond to the nodes in the Manually Triggered Workflows pane.
  • PAUSE: The node is a paused node.
  • SKIP: The node is a dry-run node. Dry-run nodes are started as scheduled, but the system sets their status to successful when it starts to run them.
Stop Boolean false

Indicates whether the scheduling for the node is suspended. Valid values:

  • true: The scheduling for the node is suspended.
  • false: The scheduling for the node is not suspended.

This parameter is equivalent to the Skip Execution option in the Schedule section of the Properties panel in the DataWorks console.

ParaValue String a=x b=y

The scheduling parameters of the node.

This parameter is equivalent to the configuration of the scheduling parameters in the Parameters section of the Properties panel in the DataWorks console. For more information, see Configure scheduling parameters.

StartEffectDate Long 936923400000

The start time of automatic scheduling. This value is a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

This parameter is equivalent to the start time specified for the Validity Period parameter in the Schedule section of the Properties panel in the DataWorks console.

EndEffectDate Long 4155787800000

The end time of automatic scheduling. This value is a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

This parameter is equivalent to the end time specified for the Validity Period parameter in the Schedule section of the Properties panel in the DataWorks console.

CycleType String DAY

The type of the scheduling cycle of the node that corresponds to the file. Valid values: NOT_DAY and DAY. A value of NOT_DAY indicates that the node is scheduled to run by minute or hour. A value of DAY indicates that the node is scheduled to run by day, week, or month.

This parameter is equivalent to the Scheduling Cycle parameter in the Schedule section of the Properties panel in the DataWorks console.

DependentNodeIdList String 5,10,15,20

The IDs of the nodes on which the node corresponding to the file depends when the DependentType parameter is set to USER_DEFINE. Multiple IDs are separated by commas (,).

This parameter is equivalent to the field that appears after Previous Cycle is selected and the Depend On parameter is set to Other Nodes in the Dependencies section of the Properties panel in the DataWorks console.
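
If your client code needs the individual node IDs from DependentNodeIdList, the comma-separated string can be split locally. The following helper is a minimal sketch for this example, not part of the API.

def parse_dependent_node_ids(dependent_node_id_list):
    """Split a DependentNodeIdList value such as "5,10,15,20" into integers."""
    if not dependent_node_id_list:
        return []
    return [int(node_id) for node_id in dependent_node_id_list.split(",")]

# parse_dependent_node_ids("5,10,15,20") -> [5, 10, 15, 20]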

ResourceGroupId Long 375827434852437

The ID of the resource group that is used to run the node. You can call the ListResourceGroups operation to query the available resource groups in the workspace.

DependentType String USER_DEFINE

The type of the cross-cycle scheduling dependency of the node that corresponds to the file. Valid values:

  • SELF: The instance generated for the node in the current cycle depends on the instance generated for the node in the previous cycle.
  • CHILD: The instance generated for the node in the current cycle depends on the instances generated for the descendant nodes at the nearest level of the node in the previous cycle.
  • USER_DEFINE: The instance generated for the node in the current cycle depends on the instances generated for one or more specified nodes in the previous cycle.
  • NONE: No cross-cycle scheduling dependency type is selected for the node.
AutoRerunTimes Integer 3

The maximum number of automatic reruns that are allowed after an error occurs.

AutoRerunIntervalMillis Integer 120000

The interval between two consecutive automatic reruns after an error occurs. Unit: milliseconds.

This parameter is equivalent to the Rerun Interval parameter in the Schedule section of the Properties panel in the DataWorks console.

The interval that you specify in the DataWorks console is measured in minutes. Convert the value to milliseconds when you call this operation. For example, an interval of 2 minutes in the console corresponds to a value of 120000 for this parameter.

CronExpress String 00 05 00 * * ?

The CRON expression that represents the periodic scheduling policy of the node. For example, the expression 00 05 00 * * ? indicates that the node is scheduled to run at 00:05:00 every day.

InputList Array of NodeInputOutput

The output names of the parent files on which the current file depends.

Input String project.001_out

The output name of the parent file on which the current file depends.

This parameter is equivalent to the Output Name parameter under Parent Nodes in the Dependencies section of the Properties panel in the DataWorks console.

ParseType String MANUAL

The mode in which the scheduling dependencies of the file are configured. Valid values:

  • MANUAL: The scheduling dependencies are manually configured.
  • AUTO: The scheduling dependencies are automatically parsed.
OutputList Array of NodeInputOutput

The output names of the current file.

This parameter is equivalent to the Output Name parameter under Output in the Dependencies section of the Properties panel in the DataWorks console.

RefTableName String ods_user_info_d

The output table name of the current file.

This parameter is equivalent to the Output Table Name parameter under Output in the Dependencies section of the Properties panel in the DataWorks console.

Output String dw_project.002_out

The output name of the current file.

This parameter is equivalent to the Output Name parameter under Output in the Dependencies section of the Properties panel in the DataWorks console.

StartImmediately Boolean true

Indicates whether the node is run immediately after it is deployed to the production environment. This parameter is equivalent to the Instance Generation Mode parameter in the Schedule section of the Properties panel in the DataWorks console.

Examples

Sample requests

http(s)://[Endpoint]/?Action=GetFile
&ProjectId=10000
&ProjectIdentifier=dw_project
&FileId=100000001
&NodeId=200000001
&<Common request parameters>
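
The sample request above shows the raw query string. If you prefer to call the operation through an SDK, the general pattern resembles the following Python sketch, which uses CommonRequest from the Alibaba Cloud core SDK (aliyun-python-sdk-core). The endpoint, the API version 2020-05-18, and the credential placeholders are assumptions for this example; the sample code generated by OpenAPI Explorer for your SDK is authoritative.

from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

# Placeholder credentials and region; replace them with your own values.
client = AcsClient("<access_key_id>", "<access_key_secret>", "cn-zhangjiakou")

request = CommonRequest()
request.set_accept_format("json")
# Assumed endpoint pattern and API version for the dataworks-public service.
request.set_domain("dataworks.cn-zhangjiakou.aliyuncs.com")
request.set_method("POST")
request.set_version("2020-05-18")
request.set_action_name("GetFile")
request.add_query_param("ProjectId", "10000")
request.add_query_param("FileId", "100000001")

response = client.do_action_with_exception(request)
print(response)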

Sample success responses

XML format

HTTP/1.1 200 OK
Content-Type:application/xml

<GetFileResponse>
    <HttpStatusCode>200</HttpStatusCode>
    <ErrorMessage>The connection does not exist.</ErrorMessage>
    <RequestId>0000-ABCD-EFG****</RequestId>
    <ErrorCode>Invalid.Tenant.ConnectionNotExists</ErrorCode>
    <Success>true</Success>
    <Data>
        <File>
            <CommitStatus>0</CommitStatus>
            <AutoParsing>true</AutoParsing>
            <Owner>7775674356****</Owner>
            <CreateTime>1593879116000</CreateTime>
            <FileType>10</FileType>
            <CurrentVersion>3</CurrentVersion>
            <BizId>1000001</BizId>
            <LastEditUser>62465892****</LastEditUser>
            <FileName>ods_user_info_d</FileName>
            <ConnectionName>odps_first</ConnectionName>
            <UseType>NORMAL</UseType>
            <FileFolderId>2735c2****</FileFolderId>
            <ParentId>-1</ParentId>
            <CreateUser>424732****</CreateUser>
            <IsMaxCompute>true</IsMaxCompute>
            <BusinessId>1000001</BusinessId>
            <FileDescription>My first DataWorks file</FileDescription>
            <DeletedStatus>RECYCLE</DeletedStatus>
            <LastEditTime>1593879116000</LastEditTime>
            <Content>SHOW TABLES;</Content>
            <NodeId>300001</NodeId>
            <AdvancedSettings>{"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"}</AdvancedSettings>
        </File>
        <NodeConfiguration>
            <RerunMode>ALL_ALLOWED</RerunMode>
            <SchedulerType>NORMAL</SchedulerType>
            <Stop>false</Stop>
            <ParaValue>a=x b=y</ParaValue>
            <StartEffectDate>936923400000</StartEffectDate>
            <EndEffectDate>4155787800000</EndEffectDate>
            <CycleType>DAY</CycleType>
            <DependentNodeIdList>5,10,15,20</DependentNodeIdList>
            <ResourceGroupId>375827434852437</ResourceGroupId>
            <DependentType>USER_DEFINE</DependentType>
            <AutoRerunTimes>3</AutoRerunTimes>
            <AutoRerunIntervalMillis>120000</AutoRerunIntervalMillis>
            <CronExpress>00 05 00 * * ?</CronExpress>
            <InputList>
                <Input>project.001_out</Input>
                <ParseType>MANUAL</ParseType>
            </InputList>
            <OutputList>
                <RefTableName>ods_user_info_d</RefTableName>
                <Output>dw_project.002_out</Output>
            </OutputList>
            <StartImmediately>true</StartImmediately>
        </NodeConfiguration>
    </Data>
</GetFileResponse>

JSON format

HTTP/1.1 200 OK
Content-Type:application/json

{
  "HttpStatusCode" : 200,
  "ErrorMessage" : "The connection does not exist.",
  "RequestId" : "0000-ABCD-EFG****",
  "ErrorCode" : "Invalid.Tenant.ConnectionNotExists",
  "Success" : true,
  "Data" : {
    "File" : {
      "CommitStatus" : 0,
      "AutoParsing" : true,
      "Owner" : "7775674356****",
      "CreateTime" : 1593879116000,
      "FileType" : 10,
      "CurrentVersion" : 3,
      "BizId" : 1000001,
      "LastEditUser" : "62465892****",
      "FileName" : "ods_user_info_d",
      "ConnectionName" : "odps_first",
      "UseType" : "NORMAL",
      "FileFolderId" : "2735c2****",
      "ParentId" : -1,
      "CreateUser" : "424732****",
      "IsMaxCompute" : true,
      "BusinessId" : 1000001,
      "FileDescription" : "My first DataWorks file",
      "DeletedStatus" : "RECYCLE",
      "LastEditTime" : 1593879116000,
      "Content" : "SHOW TABLES;",
      "NodeId" : 300001,
      "AdvancedSettings" : "{\"queue\":\"default\",\"SPARK_CONF\":\"--conf spark.driver.memory=2g\"}"
    },
    "NodeConfiguration" : {
      "RerunMode" : "ALL_ALLOWED",
      "SchedulerType" : "NORMAL",
      "Stop" : false,
      "ParaValue" : "a=x b=y",
      "StartEffectDate" : 936923400000,
      "EndEffectDate" : 4155787800000,
      "CycleType" : "DAY",
      "DependentNodeIdList" : "5,10,15,20",
      "ResourceGroupId" : 375827434852437,
      "DependentType" : "USER_DEFINE",
      "AutoRerunTimes" : 3,
      "AutoRerunIntervalMillis" : 120000,
      "CronExpress" : "00 05 00 * * ?",
      "InputList" : {
        "Input" : "project.001_out",
        "ParseType" : "MANUAL"
      },
      "OutputList" : {
        "RefTableName" : "ods_user_info_d",
        "Output" : "dw_project.002_out"
      },
      "StartImmediately" : true
    }
  }
}
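
The following Python sketch shows one way to consume a JSON response that has the structure shown above. The function name and the choice of fields to extract are illustrative, not part of the API.

import json

def summarize_get_file_response(payload):
    """Extract a few commonly used fields from a GetFile JSON response body."""
    body = json.loads(payload)
    if not body.get("Success"):
        # On failure, surface the diagnostics returned by the service.
        raise RuntimeError("%s: %s (RequestId=%s)" % (
            body.get("ErrorCode"), body.get("ErrorMessage"), body.get("RequestId")))
    file_info = body["Data"]["File"]
    node_conf = body["Data"].get("NodeConfiguration", {})
    return {
        "file_name": file_info["FileName"],
        "file_type": file_info["FileType"],
        "committed": file_info["CommitStatus"] == 1,
        "content": file_info["Content"],
        "cron_express": node_conf.get("CronExpress"),
    }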

Error codes

HTTP status code Error code Error message Description
403 Forbidden.Access Access is forbidden. Please first activate DataWorks Enterprise Edition or Flagship Edition. The error message returned because you are not allowed to perform this operation. Activate DataWorks Enterprise Edition or DataWorks Flagship Edition.
429 Throttling.Api The request for this resource has exceeded your available limit. The error message returned because the number of requests for the resource has exceeded the upper limit.
429 Throttling.System The DataWorks system is busy. Try again later. The error message returned because the DataWorks system is busy. Try again later.
429 Throttling.User Your request is too frequent. Try again later. The error message returned because excessive requests have been submitted within a short period of time. Try again later.
500 InternalError.System An internal system error occurred. Try again later. The error message returned because an internal error has occurred. Try again later.
500 InternalError.UserId.Missing An internal system error occurred. Try again later. The error message returned because an internal error has occurred. Try again later.

For a list of error codes, visit the API Error Center.