Deploy LLMs on Serverless with No Cluster Management - DataWorks

How it works

DataWorks manages the model service in its own resource group VPC. To route traffic from your VPC to the model, the platform automatically sets up two services:

PrivateLink creates a PrivateLink endpoint in your VPC and establishes an encrypted channel to the DataWorks resource group VPC, enabling cross-VPC access without exposing traffic to the public internet. Your account must have the PrivateLink service enabled.
Private Hosted Zone adds domain name resolution rules to your VPC so that requests to the model's domain name are automatically forwarded to the DataWorks model service. Your account must have the Private Hosted Zone service enabled.

Traffic flow: your VPC → PrivateLink connection → DataWorks resource group VPC → model instance.

To inspect the resources the platform creates, go to the PrivateLink console or the Alibaba Cloud DNS console.
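
Besides checking the consoles, you can confirm from a machine inside your VPC that the model's domain name resolves to a private address, which is what the Private Hosted Zone rule and PrivateLink endpoint described above should produce. The following is a minimal sketch, not an official tool. The function names are hypothetical, and it assumes the endpoint is reachable over IPv4 and sits in a standard private address range.

```python
import ipaddress
import socket


def resolve_addresses(hostname: str) -> list[str]:
    """Return the distinct IPv4 addresses that `hostname` resolves to.

    Assumption: the PrivateLink endpoint is an IPv4 address in your VPC.
    """
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def all_private(addresses: list[str]) -> bool:
    """True when at least one address is given and every address is private."""
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )
```

To use it, pass the VPC Address Invocation Domain Name of your model service to `resolve_addresses` and feed the result to `all_private`. A `socket.gaierror` or a public address suggests that the resolution rule is not active in the VPC you are calling from.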

Important

The platform automatically creates and manages the required PrivateLink, Private Hosted Zone, and Security Group resources. Do not manually delete or edit them. The platform removes these resources when the model service is deleted.

Limitations

Supported regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Ulanqab), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), US (Silicon Valley), and US (Virginia).
Deployment only: Model training is not supported.
Quota per region: Up to 50 model services per Alibaba Cloud account per region.
Resource group: Only Serverless resource groups are supported. Each resource group supports up to 5 model services.
VPC limit: Each model service can be associated with up to 3 VPCs.
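
If you plan several model services in advance, the numeric limits above can be checked before you start deploying. The following is a minimal sketch under stated assumptions: `PlannedService` and `find_limit_violations` are hypothetical names, the region and resource group strings are illustrative, and the three constants are copied from the list above.

```python
from collections import Counter
from dataclasses import dataclass, field

# Limits quoted from the Limitations list.
MAX_SERVICES_PER_REGION = 50
MAX_SERVICES_PER_RESOURCE_GROUP = 5
MAX_VPCS_PER_SERVICE = 3


@dataclass
class PlannedService:
    """One model service you intend to deploy (hypothetical planning record)."""
    name: str
    region: str
    resource_group: str
    vpc_ids: list[str] = field(default_factory=list)


def find_limit_violations(services: list[PlannedService]) -> list[str]:
    """Return one message per limit that the plan would exceed."""
    problems = []
    per_region = Counter(s.region for s in services)
    per_group = Counter((s.region, s.resource_group) for s in services)
    for region, count in sorted(per_region.items()):
        if count > MAX_SERVICES_PER_REGION:
            problems.append(f"region {region}: {count} services planned")
    for (region, group), count in sorted(per_group.items()):
        if count > MAX_SERVICES_PER_RESOURCE_GROUP:
            problems.append(f"resource group {group}: {count} services planned")
    for service in services:
        if len(set(service.vpc_ids)) > MAX_VPCS_PER_SERVICE:
            problems.append(f"service {service.name}: too many VPCs")
    return problems
```

An empty result means the plan stays within the three limits. It does not check region support or CU capacity.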

Billing

Running a model service incurs costs from DataWorks and from two dependent services, PrivateLink and Private Hosted Zone. Each additional VPC you add later also incurs PrivateLink and Private Hosted Zone fees.

Fee type	Description
DataWorks fees	A running model service consumes Compute Units (CUs) from your Serverless resource group. Pay-as-you-go resource groups are charged per CU-hour. The total CU consumption equals Deployment Specification × Number of Instances. See Billing of Serverless resource groups.
PrivateLink fees	Instance fees and data transfer fees apply for each VPC connected to the model service. See Billing of PrivateLink.
Private Hosted Zone fees	Domain name resolution charges apply. See Billing of Private Hosted Zone.
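
The DataWorks portion of the bill follows from the formula in the table: total CUs equal Deployment Specification × Number of Instances, and pay-as-you-go resource groups are charged per CU-hour. The following is a rough estimator only. The function names are hypothetical, the deployment specification is assumed to be expressed in CUs per instance, and the CU-hour price is a value you supply from the billing documentation for your region. It ignores PrivateLink and Private Hosted Zone fees.

```python
def total_cus(cus_per_instance: float, instances: int) -> float:
    """Total CU consumption = Deployment Specification x Number of Instances."""
    return cus_per_instance * instances


def cu_hours(cus_per_instance: float, instances: int, running_hours: float) -> float:
    """CU-hours consumed while the service is in the Running state."""
    return total_cus(cus_per_instance, instances) * running_hours


def estimate_dataworks_fee(
    cus_per_instance: float,
    instances: int,
    running_hours: float,
    price_per_cu_hour: float,  # look this up yourself; no price is assumed here
) -> float:
    """Estimated pay-as-you-go DataWorks fee, excluding dependent services."""
    return cu_hours(cus_per_instance, instances, running_hours) * price_per_cu_hour
```

For example, two instances at 8 CUs each running for 24 hours consume 8 × 2 × 24 = 384 CU-hours. A stopped service does not consume resources, so only running hours count.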

Deploy a model service

Prerequisites

Before you begin, ensure that you have:

PrivateLink enabled in your account
Private Hosted Zone enabled in your account
A DataWorks workspace created and associated with a Serverless resource group

Important

Enable PrivateLink and Private Hosted Zone in the same region as your DataWorks workspace. A region mismatch prevents the model service from functioning.

Steps

Log on to the DataWorks console. In the top navigation bar, switch to the target region.
In the left-side navigation pane, click Model Service.
On the Model Service page, click Deploy Model to open the Model List page.
Select the model you want to deploy, then click Deploy to open the Model Deployment configuration page.
Under Basic information, configure the following:
- Model: Confirm the model type. See Supported models.
- Service Name: Enter a name to identify this model service in DataWorks.

Under Resource information, configure the deployment environment using the parameters in the table below.

CU limits vary by resource group type:

Subscription resource groups: Scale up the resource group if the current allocation is insufficient.
Pay-as-you-go resource groups: The default limit is 500 CUs. After the first model deployment, the platform automatically raises this to 2,000 CUs.

Parameter	Description
Resource Group	The Serverless resource group where the model service will run. After deployment, navigate to Resource Group > target group > Details to monitor Serverless resource group usage.
vSwitch	The vSwitch for deployment. Select one associated with the Serverless resource group in the appropriate availability zone.
Deployment Specification	Resource specification per service instance.
Number of Instances	Number of instances to deploy. Multiple instances improve availability.
Total Occupancy	Total CUs consumed, calculated as Deployment Specification × Number of Instances. Make sure the resource group has enough CUs. See Allocate CU quotas to tasks to adjust limits.
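
Before you click Deploy, it can help to check that the Total Occupancy fits within the CU limit of the resource group. The following is a minimal sketch with hypothetical function names. It assumes the deployment specification is given in CUs per instance, that the 500 CU limit applies until a first model has been deployed, and that other tasks may already be using part of the limit (the `cus_in_use` parameter is an illustration, not a platform field).

```python
# Pay-as-you-go limits quoted from the note above.
PAYG_DEFAULT_CU_LIMIT = 500
PAYG_RAISED_CU_LIMIT = 2000


def payg_cu_limit(has_deployed_model_before: bool) -> int:
    """CU limit of a pay-as-you-go resource group, per the note above."""
    if has_deployed_model_before:
        return PAYG_RAISED_CU_LIMIT
    return PAYG_DEFAULT_CU_LIMIT


def fits_in_resource_group(
    cus_per_instance: float,
    instances: int,
    cu_limit: float,
    cus_in_use: float = 0,
) -> bool:
    """True when Total Occupancy plus current usage stays within the limit."""
    total_occupancy = cus_per_instance * instances
    return cus_in_use + total_occupancy <= cu_limit
```

For a subscription resource group, pass its purchased CU allocation as `cu_limit` instead of calling `payg_cu_limit`.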

Click Deploy.

After deployment, the platform creates a PrivateLink endpoint in the default VPC associated with the DataWorks resource group and adds a resolution record to Private Hosted Zone, connecting the model's private domain name to the DataWorks resource group VPC.

Manage services

Manage service state

A model service starts automatically after creation. Use the Actions column in the Model Service list to change its state.

Action	Resulting state	Resource consumption
Start	Running	Consumes CUs from the Serverless resource group
Stop	Stopped	Does not consume resources
Delete	—	Permanently releases all associated resources

View service details

In the Model Service list, click the Service Name of the target service.
On the Overview tab, review the following sections:
- Basic Information: Service Name, Service ID, and Model type.
- Resource Allocation: Deployment Specification and Number of Instances.
- Invocation Information: Click the icon next to VPC Address Invocation Domain Name to copy the domain name for use in node tasks.
These three identifiers serve different purposes. Use the Service Name to identify the service in DataWorks. Use the Service ID for system-level references. Use the VPC Address Invocation Domain Name as the endpoint when invoking the model from a node task.

Modify resources

To rename a model service, change the deployment specification, or adjust the number of instances:

In the Model Service list, click the Service Name of the target service.
In the Resource Allocation section, click Modification.

Important

Modifying resources restarts the service, which temporarily interrupts availability. Plan modifications for low-traffic periods.

On the Modify Resources page, update the settings and confirm.

Network settings

The Network Configuration tab shows all VPCs that can access the model service over a private network.

In the Model Service list, click the Service Name of the target service.
Switch to the Network Configuration tab to view the current VPCs.
To add VPC access, click Add Network and specify a VPC and vSwitch. The VPC becomes accessible once its status changes to Available.

Adding a VPC creates a PrivateLink endpoint and a Private Hosted Zone resolution record in the selected VPC. This incurs instance fees, data transfer fees, and domain name resolution fees. See Billing of PrivateLink and Billing of Private Hosted Zone for details.

Each model service supports a maximum of three VPCs.
To remove VPC access, click Delete for the target VPC. Removing a VPC also deletes the PrivateLink endpoint created in that VPC.

API keys

API keys authenticate callers to the model service. Manage all API keys on the API Key tab.

After deployment, the platform provides a built-in API key for calls from other DataWorks modules. To call the model service from external environments using the service endpoint, create additional API keys.

In the Model Service list, click the Service Name of the target service.
Switch to the API Key tab.

Create an API key

Click Add New API Key.

Tip: Create a separate API key for each application. This limits the impact if a key is compromised and makes it easier to rotate keys independently.
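
One way to keep per-application keys separate on the client side is to store each key in its own environment variable and never print it in full. The following is an illustrative sketch only: the `MODEL_SERVICE_API_KEY_` prefix and all function names are hypothetical conventions, not something DataWorks defines.

```python
import os
from typing import Mapping


def env_var_for(app_name: str) -> str:
    """Map an application name to the (hypothetical) variable holding its key."""
    normalized = "".join(ch if ch.isalnum() else "_" for ch in app_name).upper()
    return f"MODEL_SERVICE_API_KEY_{normalized}"


def load_api_key(app_name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the API key for one application, failing loudly if it is unset."""
    name = env_var_for(app_name)
    key = environ.get(name, "").strip()
    if not key:
        raise KeyError(f"No API key configured: set {name}")
    return key


def mask(key: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the key can appear in logs."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

With this layout, rotating the key of one application means updating a single variable, and the other applications keep working with their own keys.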

View an API key

In the Actions column for the target API key, click View, then click Copy.

Disable or delete an API key

Click Disable or Delete for the target API key.

Important

Once an API key is disabled or deleted, all tasks that use it lose access to the model service. The change takes approximately 5 minutes to propagate. Assess the impact before proceeding.

What's next

After the model service is running, use the model service in your tasks.