DataWorks:GetFile

Last Updated: Mar 30, 2026

Retrieves the details of a file.

RAM authorization

No authorization for this operation. If you encounter issues with this operation, contact technical support.

Request parameters

Parameter

Type

Required

Description

Example

ProjectId

integer

No

The ID of the DataWorks workspace. You can log on to the DataWorks console, and go to the workspace configuration page to obtain the workspace ID.

You must specify either this parameter or the ProjectIdentifier parameter to identify the DataWorks workspace for this API call.

10000

ProjectIdentifier

string

No

The name of the DataWorks workspace. You can log on to the DataWorks console, and go to the workspace configuration page to obtain the workspace name.

You must specify either this parameter or the ProjectId parameter to identify the DataWorks workspace for this API call.

dw_project

FileId

integer

No

The ID of the file. You can call the ListFiles operation to obtain the file ID.

100000001

NodeId

integer

No

The ID of the scheduling node. You can invoke the ListFiles API to obtain the node ID.

200000001
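The workspace rule above (one of ProjectId or ProjectIdentifier must be present) can be checked on the client before a request is sent. The following is a minimal sketch, not SDK code: `build_get_file_params` is a hypothetical helper introduced here for illustration, and it only assembles a plain dictionary keyed by the parameter names in the table. It rejects a request that names no workspace; whether the service accepts both workspace parameters at once is not stated in this reference, so the sketch does not forbid it.

```python
from typing import Optional


def build_get_file_params(
    project_id: Optional[int] = None,
    project_identifier: Optional[str] = None,
    file_id: Optional[int] = None,
    node_id: Optional[int] = None,
) -> dict:
    """Assemble GetFile request parameters (hypothetical helper, not part of any SDK)."""
    # The reference requires ProjectId or ProjectIdentifier to identify the workspace.
    if project_id is None and project_identifier is None:
        raise ValueError("Specify either ProjectId or ProjectIdentifier.")
    params = {
        "ProjectId": project_id,
        "ProjectIdentifier": project_identifier,
        "FileId": file_id,
        "NodeId": node_id,
    }
    # Drop parameters that were not supplied so only explicit values are sent.
    return {name: value for name, value in params.items() if value is not None}


params = build_get_file_params(project_id=10000, file_id=100000001)
print(params)
```

The resulting dictionary can then be passed to whichever client or signing layer you use to call the operation.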

Response elements

Element

Type

Description

Example

object

HttpStatusCode

integer

HTTP status code.

200

ErrorMessage

string

Error message.

The connection does not exist.

RequestId

string

Request ID. Used for troubleshooting when a fault occurs.

0000-ABCD-EFG****

ErrorCode

string

Error code.

Invalid.Tenant.ConnectionNotExists

Success

boolean

Indicates whether the invocation succeeded. Valid values:

  • true: The invocation succeeded.

  • false: The invocation failed.

true

Data

object

Details of the file.

File

object

Basic information about the file.

CommitStatus

integer

The current commit status of the file. Valid values:

  • 0: The latest code has not been submitted.

  • 1: The latest code has been submitted.

0

AutoParsing

boolean

Indicates whether automatic parsing is enabled for the file. Valid values:

  • true: The code in the file is automatically parsed.

  • false: The code in the file is not automatically parsed.

This parameter corresponds to the "Code Parsing" option in the DataWorks console (https://workbench.data.aliyun.com/console) when you select "Same Cycle" under Schedule Configuration > Schedule Dependency for a Data Development job.

true

Owner

string

Alibaba Cloud User ID of the file owner.

7775674356****

CreateTime

integer

UNIX timestamp when the file was created, in milliseconds.

1593879116000

FileType

integer

The type code of the file. Each file type has its own code. For more information, see DataWorks Node Collection.

10

CurrentVersion

integer

Version number of the latest submitted version of the file.

3

BizId

integer

The ID of the Business Process to which the file belongs. This field is deprecated. Use the BusinessId field instead.

1000001

LastEditUser

string

The Alibaba Cloud User ID of the user who last edited the file.

424732****

FileName

string

Name of the file.

ods_user_info_d

ConnectionName

string

The name of the data source used when executing the job corresponding to the file.

odps_source

UseType

string

The function module to which the file belongs. Valid values:

  • NORMAL: Data Development.

  • MANUAL: One-time task.

  • MANUAL_BIZ: Manually triggered workflow.

  • SKIP: Dry-run scheduling in Data Development.

  • ADHOCQUERY: Ad-hoc query.

  • COMPONENT: Widget Management.

NORMAL

FileFolderId

string

The ID of the folder to which the file belongs.

2735c2****

ParentId

integer

If the file is an inner file of a composite node file, this field contains the ID of that composite node file.

-1

CreateUser

string

The Alibaba Cloud User ID of the file creator.

424732****

IsMaxCompute

boolean

Indicates whether the resource file needs to be uploaded to MaxCompute. Configure this parameter only when the file is a MaxCompute resource file.

true

BusinessId

integer

The Business Process ID of the file.

1000001

FileDescription

string

The description of the file.

My first DataWorks file

DeletedStatus

string

The deletion status of the file. Valid values:

  • NORMAL: Not deleted.

  • RECYCLE_BIN: In the recycle bin.

  • DELETED: Deleted.

RECYCLE

LastEditTime

integer

The UNIX timestamp of the most recent edit to the file, in milliseconds.

1593879116000

Content

string

The code of the file.

SHOW TABLES;

NodeId

integer

The ID of the scheduling task generated in the scheduling system after the file is submitted.

300001

AdvancedSettings

string

Advanced configuration of the job.

This parameter corresponds to "Advanced Settings" in the right-side navigation bar on the editing page of an EMR Data Development job in the DataWorks console.

Note

Currently, EMR Shell jobs do not support advanced parameters.

For details about advanced parameters for different EMR job types, see EMR Job Development.

{\"priority\":\"1\",\"ENABLE_SPARKSQL_JDBC\":false,\"FLOW_SKIP_SQL_ANALYZE\":false,\"queue\":\"default\"}

FileId

integer

The ID of the file.

100000001

NodeConfiguration

object

The schedule configuration of the file.

RerunMode

string

Rerun property. Valid values:

  • ALL_ALLOWED: The job can be rerun regardless of whether it previously succeeded or failed.

  • FAILURE_ALLOWED: The job can be rerun only if it previously failed.

  • ALL_DENIED: The job cannot be rerun regardless of whether it previously succeeded or failed.

This parameter corresponds to the "Schedule Configuration > Time Properties > Rerun Property" setting for a Data Development job in the DataWorks console.

ALL_ALLOWED

SchedulerType

string

The schedule type. Valid values:

  • NORMAL: Normal scheduling task.

  • MANUAL: One-time task, which is not included in regular scheduling and corresponds to a node in a manually triggered workflow.

  • PAUSE: Paused task.

  • SKIP: Dry-run task, which is included in regular scheduling but is immediately marked as succeeded when scheduled.

NORMAL

Stop

boolean

Indicates whether to skip execution. Valid values:

  • true: Skip execution.

  • false: Do not skip execution.

This parameter corresponds to the "skip execution" option of "Schedule Type" under "Schedule Configuration > Time Properties" for a Data Development job in the DataWorks console.

false

ParaValue

string

Schedule parameter.

This parameter corresponds to the "Schedule Configuration > Parameters" setting for a Data Development job in the DataWorks console. For configuration details, see the Schedule Parameters documentation.

a=x b=y

StartEffectDate

integer

The UNIX timestamp (in milliseconds) indicating when automatic scheduling starts.

This parameter corresponds to the start time (as a UNIX timestamp in milliseconds) configured under "Schedule Configuration > Time Properties > Effective Date" for a Data Development job in the DataWorks console.

936923400000

EndEffectDate

integer

The UNIX timestamp, in milliseconds, when automatic scheduling stops.

This parameter corresponds to the end time (as a UNIX timestamp in milliseconds) configured under "Schedule Configuration > Time Properties > Effective Date" for a Data Development job in the DataWorks console.

4155787800000

CycleType

string

The recurrence type. Valid values: NOT_DAY (scheduled by minute or hour) and DAY (scheduled by day, week, or month).

This parameter corresponds to "Schedule Configuration > Time Properties > Recurrence" for a Data Development job in the DataWorks console.

DAY

DependentNodeIdList

string

When the DependentType parameter is set to USER_DEFINE, this parameter specifies the IDs of the nodes on which the current file depends. Separate multiple node IDs with commas (,).

This parameter corresponds to the "Other Nodes" dependency option when "Previous Cycle" is selected under "Schedule Configuration > Schedule Dependency" for a Data Development job in the DataWorks console.

5,10,15,20

ResourceGroupId

integer

The ID of the resource group used when the file is published as a job and executed. You can call ListResourceGroups to obtain the resource groups available in the workspace.

375827434852437

DependentType

string

The method of depending on the previous cycle. Valid values:

  • SELF: The dependency is the current node itself.

  • CHILD: The dependency is direct child nodes.

  • USER_DEFINE: The dependency is other specified nodes.

  • NONE: No dependency is selected, meaning the node does not depend on the previous cycle.

USER_DEFINE

AutoRerunTimes

integer

The number of automatic reruns after an error.

3

AutoRerunIntervalMillis

integer

The time interval between automatic reruns after an error, in milliseconds.

This parameter corresponds to the "Rerun Interval" setting under "Schedule Configuration > Time Properties > Auto Rerun on Error" for a Data Development job in the DataWorks console.
Note that the time unit for "Rerun Interval" in the console is minutes; convert the time accordingly when invoking the API.

120000

CronExpress

string

The cron expression that defines the schedule of the file.

00 05 00 * * ?

InputList

array<object>

Information about outputs from upstream files on which this file depends.

object

Information about outputs from upstream files on which this file depends.

Input

string

The output name of the upstream file on which this file depends.

This parameter corresponds to "Parent Node Output Name" when "Same Cycle" is selected under "Schedule Configuration > Schedule Dependency" for a Data Development job in the DataWorks console.

project.001_out

ParseType

string

The method for configuring file dependencies. Valid values:

  • MANUAL: Manually configured.

  • AUTO: Automatically parsed.

MANUAL

OutputList

array<object>

Output information of the file.

object

Output information of the file.

RefTableName

string

Output value of the file.

This parameter corresponds to the value in the "Output Table" column when "Same Cycle" is selected under "Schedule Configuration > Schedule Dependency" for a Data Development job in the DataWorks console.

ods_user_info_d

Output

string

Output name of the file.

This parameter corresponds to the value in the "Output Name" column when "Same Cycle" is selected under "Schedule Configuration > Schedule Dependency" for a Data Development job in the DataWorks console.

dw_project.002_out

StartImmediately

boolean

Indicates whether to start immediately after publishing.

This parameter corresponds to the "Start Method" setting under "Configuration > Time Properties" in the right-side navigation bar on the editing page for EMR Spark Streaming and EMR Streaming SQL Data Development jobs in the DataWorks console.

true

InputParameters

array<object>

The input parameters of the node context.

object

An input parameter of the node context.

ParameterName

string

The parameter name of the input parameter in the node context. You can reference this parameter in code by using the ${...} syntax.

This parameter corresponds to the "Parameter Name" field under "Schedule Configuration > Node Context > Input Parameters of This Node" in the DataWorks console.

input

ValueSource

string

The value source of the input parameter in the node context.

This parameter corresponds to the "Value Source" field under "Schedule Configuration > Node Context > Input Parameters of This Node" in the DataWorks console.

project_001.parent_node:outputs

OutputParameters

array<object>

The output parameters of the node context.

object

An output parameter of the node context.

ParameterName

string

The parameter name of the output parameter in the node context.

This parameter corresponds to the "Parameter Name" field under "Schedule Configuration > Node Context > Output Parameters of This Node" for a Data Development job in the DataWorks console.

output

Value

string

The expression of the output parameter in the node context.

This parameter corresponds to the "Value" field under "Schedule Configuration > Node Context > Output Parameters of This Node" for a Data Development job in the DataWorks console.

${bizdate}

Type

string

The type of the expression for the node context output parameter. Valid values:

  • 1: constant

  • 2: variable

  • 3: pass-through variable from a parameter node

This parameter corresponds to the "Type" field under "Schedule Configuration > Node Context > Output Parameters of This Node" for a Data Development job in the DataWorks console.

1

Description

string

The description of the output parameter in the node context.

It's a context output parameter.

ApplyScheduleImmediately

string

Indicates whether the schedule configuration takes effect immediately after publishing.

true

IgnoreParentSkipRunningProperty

string

Indicates whether to ignore the dry-run property of upstream nodes. This parameter corresponds to the option under "Schedule Configuration > Previous Cycle".

true

Timeout

integer

The timeout setting in the schedule configuration.

1

ImageId

string

The ID of the custom image.

m-bp1h4b5a8ogkbll2f3tr
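Several fields in the table above are expressed in milliseconds: CreateTime, LastEditTime, StartEffectDate, and EndEffectDate are UNIX timestamps, and AutoRerunIntervalMillis is a duration that the console displays in minutes. The sketch below shows one way to convert them in Python. The helper names `ms_to_utc` and `rerun_interval_minutes` are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Reference point for UNIX timestamps.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(timestamp_ms: int) -> datetime:
    """Convert a UNIX timestamp in milliseconds to a timezone-aware UTC datetime."""
    # Adding a timedelta avoids platform limits that can affect far-future values
    # such as the EndEffectDate example.
    return EPOCH + timedelta(milliseconds=timestamp_ms)


def rerun_interval_minutes(interval_ms: int) -> float:
    """Convert AutoRerunIntervalMillis to the minutes shown in the console."""
    return interval_ms / 60_000


print(ms_to_utc(1593879116000).isoformat())  # CreateTime example value
print(rerun_interval_minutes(120000))        # AutoRerunIntervalMillis example value
```

With the example values from the table, CreateTime corresponds to 2020-07-04T16:11:56 UTC and the rerun interval to 2 minutes.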

Examples

Success response

JSON format

{
  "HttpStatusCode": 200,
  "ErrorMessage": "The connection does not exist.",
  "RequestId": "0000-ABCD-EFG****",
  "ErrorCode": "Invalid.Tenant.ConnectionNotExists",
  "Success": true,
  "Data": {
    "File": {
      "CommitStatus": 0,
      "AutoParsing": true,
      "Owner": "7775674356****",
      "CreateTime": 1593879116000,
      "FileType": 10,
      "CurrentVersion": 3,
      "BizId": 1000001,
      "LastEditUser": "424732****",
      "FileName": "ods_user_info_d",
      "ConnectionName": "odps_source",
      "UseType": "NORMAL",
      "FileFolderId": "2735c2****",
      "ParentId": -1,
      "CreateUser": "424732****",
      "IsMaxCompute": true,
      "BusinessId": 1000001,
      "FileDescription": "My first DataWorks file",
      "DeletedStatus": "RECYCLE",
      "LastEditTime": 1593879116000,
      "Content": "SHOW TABLES;",
      "NodeId": 300001,
      "AdvancedSettings": "{\\\"priority\\\":\\\"1\\\",\\\"ENABLE_SPARKSQL_JDBC\\\":false,\\\"FLOW_SKIP_SQL_ANALYZE\\\":false,\\\"queue\\\":\\\"default\\\"}",
      "FileId": 100000001
    },
    "NodeConfiguration": {
      "RerunMode": "ALL_ALLOWED",
      "SchedulerType": "NORMAL",
      "Stop": false,
      "ParaValue": "a=x b=y",
      "StartEffectDate": 936923400000,
      "EndEffectDate": 4155787800000,
      "CycleType": "DAY",
      "DependentNodeIdList": "5,10,15,20",
      "ResourceGroupId": 375827434852437,
      "DependentType": "USER_DEFINE",
      "AutoRerunTimes": 3,
      "AutoRerunIntervalMillis": 120000,
      "CronExpress": "00 05 00 * * ?",
      "InputList": [
        {
          "Input": "project.001_out",
          "ParseType": "MANUAL"
        }
      ],
      "OutputList": [
        {
          "RefTableName": "ods_user_info_d",
          "Output": "dw_project.002_out"
        }
      ],
      "StartImmediately": true,
      "InputParameters": [
        {
          "ParameterName": "input",
          "ValueSource": "project_001.parent_node:outputs"
        }
      ],
      "OutputParameters": [
        {
          "ParameterName": "output",
          "Value": "${bizdate}",
          "Type": "1",
          "Description": "It's a context output parameter."
        }
      ],
      "ApplyScheduleImmediately": "true",
      "IgnoreParentSkipRunningProperty": "true",
      "Timeout": 1,
      "ImageId": "m-bp1h4b5a8ogkbll2f3tr"
    },
    "ResourceDownloadLink": {
      "downloadLink": "http://xx"
    }
  }
}
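A response shaped like the sample above can be read with the standard `json` module. The following sketch, written under the assumption that the response body has already been obtained as text, checks the Success flag and extracts a few fields. `summarize_file` is a hypothetical helper, and the embedded sample is a trimmed copy of the response above rather than real service output.

```python
import json

# Trimmed copy of the sample response; only the fields read below are kept.
SAMPLE_RESPONSE = """
{
  "Success": true,
  "RequestId": "0000-ABCD-EFG****",
  "Data": {
    "File": {"FileId": 100000001, "FileName": "ods_user_info_d", "CommitStatus": 0},
    "NodeConfiguration": {
      "DependentType": "USER_DEFINE",
      "DependentNodeIdList": "5,10,15,20",
      "ApplyScheduleImmediately": "true",
      "OutputList": [{"RefTableName": "ods_user_info_d", "Output": "dw_project.002_out"}]
    }
  }
}
"""


def summarize_file(response: dict) -> dict:
    """Pull a few commonly needed fields out of a GetFile response (hypothetical helper)."""
    if not response.get("Success"):
        raise RuntimeError(
            f"GetFile failed: {response.get('ErrorCode')}: {response.get('ErrorMessage')}"
        )
    file_info = response["Data"]["File"]
    node_config = response["Data"].get("NodeConfiguration", {})
    # DependentNodeIdList is a comma-separated string, not a JSON array.
    raw_ids = node_config.get("DependentNodeIdList") or ""
    return {
        "file_id": file_info["FileId"],
        "file_name": file_info["FileName"],
        # CommitStatus 1 means the latest code has been submitted.
        "committed": file_info["CommitStatus"] == 1,
        "dependent_node_ids": [int(part) for part in raw_ids.split(",") if part.strip()],
        # ApplyScheduleImmediately is documented as a string, not a JSON boolean.
        "apply_schedule_immediately": node_config.get("ApplyScheduleImmediately") == "true",
        "outputs": [item["Output"] for item in node_config.get("OutputList", [])],
    }


summary = summarize_file(json.loads(SAMPLE_RESPONSE))
print(summary)
```

Note that some fields that look boolean or numeric, such as ApplyScheduleImmediately and the Type of an output parameter, are documented as strings, so compare them against string values.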

Error codes

HTTP status code | Error code | Error message and description
500 | InternalError.System | An internal system error occurred. Try again later.
500 | InternalError.UserId.Missing | An internal system error occurred. Try again later.
403 | Forbidden.Access | Access is forbidden. Please first activate DataWorks Enterprise Edition or Flagship Edition. No permission, please authorize
429 | Throttling.Api | The request for this resource has exceeded your available limit.
429 | Throttling.System | The DataWorks system is busy. Try again later.
429 | Throttling.User | Your request is too frequent. Try again later.

See Error Codes for a complete list.
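Several of the error codes listed here advise trying again later. One possible client-side reaction is a retry with exponential backoff, sketched below. This is an illustration under stated assumptions, not the behavior of any official SDK: `call_with_retry` is a hypothetical helper, the `call` argument stands for whatever function performs the request, and the sketch assumes failures are surfaced as a response dictionary with the ErrorCode field (a real client may raise exceptions instead). The retryable set contains only the codes whose message says to try again later.

```python
import time
from typing import Callable

# Codes whose listed message says "Try again later".
RETRYABLE_CODES = {
    "InternalError.System",
    "InternalError.UserId.Missing",
    "Throttling.System",
    "Throttling.User",
}


def call_with_retry(
    call: Callable[[], dict],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Invoke `call` and retry retryable failures with exponential backoff."""
    response: dict = {}
    for attempt in range(1, max_attempts + 1):
        response = call()
        if response.get("Success") or response.get("ErrorCode") not in RETRYABLE_CODES:
            return response
        if attempt < max_attempts:
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    return response


# Demonstration with a stubbed call: throttled once, then successful.
stub_responses = iter([
    {"Success": False, "HttpStatusCode": 429, "ErrorCode": "Throttling.User"},
    {"Success": True, "HttpStatusCode": 200},
])
recorded_delays = []
result = call_with_retry(lambda: next(stub_responses), sleep=recorded_delays.append)
print(result["Success"], recorded_delays)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production code the default `time.sleep` applies. Errors such as Forbidden.Access are returned immediately because retrying does not resolve a permission problem.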

Release notes

See Release Notes for a complete list.