Deploy pre-trained LLMs on serverless infrastructure. No cluster management required.
Billing
Model deployments incur charges from DataWorks and related Alibaba Cloud services:
Service | Charges |
DataWorks fees | Billed per CU-hour when models are running. For more information, see Billing of serverless resource groups. |
Non-DataWorks fees |
|
How it works
DataWorks runs your model as a fully managed service inside a serverless resource group. For secure internal access, the system automatically configures PrivateLink and Private Hosted Zone during deployment:
PrivateLink: Creates an encrypted tunnel between your VPC and the DataWorks resource group, enabling private cross-VPC communication.
Private Hosted Zone: Provides DNS resolution so you can reach your model using a custom domain name from within your VPC.
API calls using the model's domain name travel through the PrivateLink tunnel from your VPC to the model instance running in the DataWorks resource group. You can monitor these resources in the PrivateLink console and the Alibaba Cloud DNS console.
Usage notes
Enable PrivateLink for secure cross-VPC access.
Enable Private Hosted Zone for DNS resolution.
Create a DataWorks workspace with a serverless resource group.
All services must be in the same region as your workspace.
Limitations
Regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Ulanqab), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), US (Silicon Valley), and US (Virginia).
Deployment: Models only (training not supported).
Quotas: 50 services per Alibaba Cloud account per region, 5 services per resource group.
Networking: 3 VPCs maximum per service.
Deploy a model
Access the deployment page
Log on to the DataWorks console and switch to your target region.
Click Model Service in the left navigation pane.
Click Deployment Model, select a model, then click Deployment.
Configure the deployment
Model: The model to deploy. For more information, see Manage model services.
Service Name: A unique identifier for this deployment.
Resource settings
Setting | Description |
Resource Group | Your serverless resource group. View the usage in the console under . |
Switch | A vSwitch attached to your resource group. |
Deployment Specifications | Select a resource specification. |
Number of instances | Multiple instances improve availability. |
Total occupancy | Calculated automatically: Deployment Specifications × Number of instances. Ensure your resource group has enough available CUs for the model service. For more information, see Allocate CU quotas.
|
Click Deploy.
DataWorks automatically creates PrivateLink, Private Hosted Zone, and security groups. Don't delete or modify these. They're removed automatically when you delete the service.
After deployment completes:
A PrivateLink endpoint appears in your default VPC, connected to the DataWorks resource group.
A DNS record is added to Private Hosted Zone, mapping the service domain to the PrivateLink endpoint in your VPC.
Manage models
After a model service is created, you can manage the model status, view service information, manage network and API keys, and adjust resources as needed from the Model Service list.
Manage the model status
A model service starts by default after creation. You can manage its status in the Actions column.
Operation | Model service status change | Resource consumption |
Start | In operation | Consumes serverless resource group resources. |
Stop | Stopped | Does not consume serverless resource group resources. |
Delete | - | Completely releases serverless resource group resources. |
View the model service
The Overview tab displays the configuration information for the current model service.
In the Model Service list, find the target model service and click its name to go to the Overview tab.
You can manage the model service's Basic Information, Resource Allocation, and Invocation Information.
Basic Information: Such as the model service name, service ID, and type.
Resource Allocation: Includes details such as Deployment Specifications and Number of instances.
Invocation Information: To invoke the service and use the model in a node task, click the
icon next to VPC Address Invocation Domain Name to copy and obtain the domain name parameter.
Modify model resources
For a created model service, you can modify the service name, adjust the deployment specifications, and change the number of instances for deploying the model service.
In the Model Service list, click the target model service name to go to the Overview tab.
Click Modification next to Resource Allocation and configure the settings.
ImportantChanging resources causes the service to restart, which affects the operation of the model service.
Manage the model network
The Network Configuration tab displays the VPCs that can currently access the model service through the internal network. On the tab, you can add or manage VPCs that can be used to access the model service.
In the Model Service list, click the target model service name to go to the Overview tab.
Switch to the Network Configuration tab to view the VPCs.
To expand the access range, you can click Add Network to allow more VPCs to access the model service deployed on DataWorks.
When adding a network, you must specify a VPC and a vSwitch. You can access the model service through the VPC after its status changes to Available.
NoteBilling: After adding a VPC for the model service, the system creates a PrivateLink endpoint in the VPC you selected to access the model service, establishing network connectivity with the DataWorks resource group. A DNS record is also added to Private Hosted Zone. This process incurs instance fees, traffic processing fees, and domain name resolution fees. For more information, see Billing of PrivateLink and Billing of Private Hosted Zone.
Limit: You can add a maximum of three VPCs for each model service.
If you no longer want to allow a certain VPC to access the model service, you can click Delete for the target VPC in the model service.
When you delete the VPC from the model service, the PrivateLink endpoint created in that VPC is also removed.
Manage API keys
An API key is an authentication credential provided by the model service to authenticate the caller's identity and permissions. You can manage all API keys for invoking the current model service on the API Key tab.
In the Model Service list, click the target model service name to go to the Overview tab.
Switch to the API Key tab to create, manage, and use API keys.
Add API Keys: After the model service is deployed successfully, DataWorks auto-generates an internal key for platform integrations. If you need to call the service model through the model service
Endpointin other environments, click Add New API Key to create a new API key.It is recommended that you create separate API keys for different use cases.
View API Keys: Click View in the Actions column of the target API Key, then click Copy to obtain the API Key.
Delete API keys: DataWorks API Keys provide Disable and Delete functions.
ImportantIf you need to disable or delete an enabled API key, evaluate the impact in advance. Once an API key is disabled or deleted, all tasks that use the API key to invoke models will fail.
Disable or Delete operation takes approximately
5 minto take effect.
More operations
After deploying the model, you can use the model to develop related tasks.
How invocation works
The following descriptions explain how model service invocation works:
When you deploy a model service in a DataWorks resource group or configure a VPC for it, the system automatically performs the following operations:
Establish a cross-VPC connection. In your VPC (the VPC in your account that can communicate with the DataWorks resource group), the system automatically creates a PrivateLink endpoint and establishes an encrypted communication channel with the PrivateLink endpoint service in the DataWorks resource group VPC.
This operation automatically creates a PrivateLink endpoint in your account. Your account must have the PrivateLink service enabled.
Configure domain name resolution service. The system automatically configures domain name resolution rules in the VPC, so that domain name request traffic within the VPC is automatically forwarded to the DataWorks model service.
This operation automatically deploys the Private Hosted Zone service in your account. Your account must have this service enabled.