Document Center
DataWorks
Writer configuration
DataHub Writer
Db2 Writer
DRDS Writer
FTP Writer
HBase Writer
HBase11xsql Writer
HDFS Writer
Memcache Writer
MongoDB Writer
MySQL Writer
Configure Oracle Writer
OSS Writer
PostgreSQL Writer
Redis Writer
SQL Server Writer
Elasticsearch Writer
LogHub Writer
Open Search Writer
Tablestore Writer
RDBMS Writer
Stream Writer
HybridDB for MySQL Writer
AnalyticDB for PostgreSQL Writer
PolarDB Writer
TSDB Writer
AnalyticDB for MySQL 3.0 Writer
GDB Writer
ClickHouse Writer
ApsaraDB for OceanBase Writer
Hologres Writer
RestAPI Writer
Maxgraph Writer
Vertica Writer
MaxCompute Writer
Hive Writer
Gbase 8a Writer
Kafka Writer