This topic describes what you must take note of when you deploy services in the shared resource group.

Background information

You can deploy services in the shared resource group without maintaining the underlying resource pool. The underlying resources are billed on a pay-as-you-go basis. When you deploy a model as a service, you can configure the specifications of service instances in one of two ways: specify the number of CPU cores for each service instance, or specify an Elastic Compute Service (ECS) instance type. The former method is the same as the one used to configure instance specifications when you deploy regular services. This topic covers the following points that you must take note of when you deploy services in the shared resource group:
  • Supported ECS instance types and their specifications. For more information, see Supported ECS instance types.
  • The scenarios in which you need to connect Elastic Algorithm Service (EAS) to the network of your client and the field settings. For more information, see Network connection.
  • The procedure used to collect the logs of services that are deployed in the shared resource group and deliver the logs to your Log Service Logstore. For more information, see Log delivery.
  • The troubleshooting method to use if your online service unexpectedly exits due to code errors. For more information about how to use the core file for troubleshooting, see Core file configuration.

Supported ECS instance types

The following table describes the ECS instance types that are supported for service deployment.
Instance type Specifications (Instance family)
ecs.c5.6xlarge 24 cores and 48 GB of memory (c5, compute-optimized instance family)
ecs.c6.2xlarge 8 cores and 16 GB of memory (c6, compute-optimized instance family)
ecs.c6.4xlarge 16 cores and 32 GB of memory (c6, compute-optimized instance family)
ecs.c6.6xlarge 24 cores and 48 GB of memory (c6, compute-optimized instance family)
ecs.c6.8xlarge 32 cores and 64 GB of memory (c6, compute-optimized instance family)
ecs.g5.6xlarge 24 cores and 96 GB of memory (g5, general-purpose instance family)
ecs.g6.2xlarge 8 cores and 32 GB of memory (g6, general-purpose instance family)
ecs.g6.4xlarge 16 cores and 64 GB of memory (g6, general-purpose instance family)
ecs.g6.6xlarge 24 cores and 96 GB of memory (g6, general-purpose instance family)
ecs.g6.8xlarge 32 cores and 128 GB of memory (g6, general-purpose instance family)
ecs.gn5-c28g1.7xlarge 28 cores, 112 GB of memory, and 1 NVIDIA Tesla P100 GPU (gn5, GPU-accelerated compute-optimized instance family)
ecs.gn5-c4g1.xlarge 4 cores, 30 GB of memory, and 1 NVIDIA Tesla P100 GPU (gn5, GPU-accelerated compute-optimized instance family)
ecs.gn5-c8g1.2xlarge 8 cores, 60 GB of memory, and 1 NVIDIA Tesla P100 GPU (gn5, GPU-accelerated compute-optimized instance family)
ecs.gn5-c8g1.4xlarge 16 cores, 120 GB of memory, and 2 NVIDIA Tesla P100 GPUs (gn5, GPU-accelerated compute-optimized instance family)
ecs.gn5i-c4g1.xlarge 4 cores, 16 GB of memory, and 1 NVIDIA Tesla P4 GPU (gn5i, GPU-accelerated compute-optimized instance family)
ecs.gn5i-c8g1.2xlarge 8 cores, 32 GB of memory, and 1 NVIDIA Tesla P4 GPU (gn5i, GPU-accelerated compute-optimized instance family)
ecs.gn6i-c16g1.4xlarge 16 cores, 62 GB of memory, and 1 NVIDIA Tesla T4 GPU (gn6i, GPU-accelerated compute-optimized instance family)
ecs.gn6i-c24g1.12xlarge 48 cores, 186 GB of memory, and 2 NVIDIA Tesla T4 GPUs (gn6i, GPU-accelerated compute-optimized instance family)
ecs.gn6i-c24g1.6xlarge 24 cores, 93 GB of memory, and 1 NVIDIA Tesla T4 GPU (gn6i, GPU-accelerated compute-optimized instance family)
ecs.gn6i-c4g1.xlarge 4 cores, 15 GB of memory, and 1 NVIDIA Tesla T4 GPU (gn6i, GPU-accelerated compute-optimized instance family)
ecs.gn6i-c8g1.2xlarge 8 cores, 31 GB of memory, and 1 NVIDIA Tesla T4 GPU (gn6i, GPU-accelerated compute-optimized instance family)
ecs.gn6v-c8g1.2xlarge 8 cores, 32 GB of memory, and 1 NVIDIA Tesla V100 GPU (gn6v, GPU-accelerated compute-optimized instance family)
ecs.r6.2xlarge 8 cores and 64 GB of memory (r6, memory-optimized instance family)
ecs.r6.4xlarge 16 cores and 128 GB of memory (r6, memory-optimized instance family)
ecs.r6.6xlarge 24 cores and 192 GB of memory (r6, memory-optimized instance family)
ecs.r6.8xlarge 32 cores and 256 GB of memory (r6, memory-optimized instance family)
ecs.g7.2xlarge 8 cores and 32 GB of memory (g7, general-purpose instance family)
ecs.g7.4xlarge 16 cores and 64 GB of memory (g7, general-purpose instance family)
ecs.g7.6xlarge 24 cores and 96 GB of memory (g7, general-purpose instance family)
ecs.g7.8xlarge 32 cores and 128 GB of memory (g7, general-purpose instance family)
ecs.c7.2xlarge 8 cores and 16 GB of memory (c7, compute-optimized instance family)
ecs.c7.4xlarge 16 cores and 32 GB of memory (c7, compute-optimized instance family)
ecs.c7.6xlarge 24 cores and 48 GB of memory (c7, compute-optimized instance family)
ecs.c7.8xlarge 32 cores and 64 GB of memory (c7, compute-optimized instance family)
ecs.r7.2xlarge 8 cores and 64 GB of memory (r7, memory-optimized instance family)
ecs.r7.4xlarge 16 cores and 128 GB of memory (r7, memory-optimized instance family)
ecs.r7.6xlarge 24 cores and 192 GB of memory (r7, memory-optimized instance family)
ecs.r7.8xlarge 32 cores and 256 GB of memory (r7, memory-optimized instance family)
ecs.gn7-c12g1.3xlarge 12 cores, 95 GB of memory, and 1 NVIDIA Tesla A100 GPU (gn7, GPU-accelerated compute-optimized instance family)
ecs.g7.16xlarge 64 cores and 256 GB of memory (g7, general-purpose instance family)
ecs.c7.16xlarge 64 cores and 128 GB of memory (c7, compute-optimized instance family)
ecs.r7.16xlarge 64 cores and 512 GB of memory (r7, memory-optimized instance family)
ecs.gn7i-c8g1.2xlarge 8 cores, 30 GB of memory, and 1 NVIDIA Tesla A10 GPU (gn7i, GPU-accelerated compute-optimized instance family)
ecs.gn7i-c16g1.4xlarge 16 cores, 60 GB of memory, and 1 NVIDIA Tesla A10 GPU (gn7i, GPU-accelerated compute-optimized instance family)
ecs.gn7i-c32g1.8xlarge 32 cores, 188 GB of memory, and 1 NVIDIA Tesla A10 GPU (gn7i, GPU-accelerated compute-optimized instance family)
ecs.gn6e-c12g1.3xlarge 12 cores, 92 GB of memory, and 1 NVIDIA Tesla V100 GPU (gn6e, GPU-accelerated compute-optimized instance family)
ecs.g6.xlarge 4 cores and 16 GB of memory (g6, general-purpose instance family)
ecs.c6.xlarge 4 cores and 8 GB of memory (c6, compute-optimized instance family)
ecs.r6.xlarge 4 cores and 32 GB of memory (r6, memory-optimized instance family)
ecs.g6.large 2 cores and 8 GB of memory (g6, general-purpose instance family)
ecs.c6.large 2 cores and 4 GB of memory (c6, compute-optimized instance family)
ecs.r6.large 2 cores and 16 GB of memory (r6, memory-optimized instance family)
To specify an ECS instance type for deploying a model as a service, add the cloud.computing.instance_type field to the service configuration file. The following code provides an example:
{
  "name": "tf_serving_test",
  "model_path": "http://eas-data.oss-cn-shanghai.aliyuncs.com/models/model.tar.gz",
  "processor": "tensorflow_gpu_1.12",
  "cloud":{
      "computing":{
          "instance_type":"ecs.gn6i-c24g1.6xlarge"
      }
  },
  "metadata": {
    "instance": 1,
    "cuda": "9.0",
    "memory": 7000,
    "gpu": 1,
    "cpu": 4
  }
}
Note You can also run the eascmd create service.json command to deploy a model as a service. For more information about how to use the EASCMD client, see Run commands to use the EASCMD client.
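For comparison, the other method described in the background information specifies the number of CPU cores for each service instance directly in the metadata fields instead of an ECS instance type. The following sketch contrasts the two forms as Python dictionaries that mirror the JSON service configuration files; the service name and field values are placeholders:

```python
# Two ways to size service instances in the shared resource group,
# written as Python dictionaries that mirror the JSON service
# configuration files. Names and values are placeholders.

# Method 1: specify resources per instance with metadata.cpu and
# metadata.memory (the same method used for regular services).
config_by_resources = {
    "name": "demo_service",
    "processor": "pmml",
    "metadata": {"instance": 1, "cpu": 4, "memory": 8000},
}

# Method 2: specify an ECS instance type with
# cloud.computing.instance_type, as in the example above.
config_by_instance_type = {
    "name": "demo_service",
    "processor": "pmml",
    "cloud": {"computing": {"instance_type": "ecs.c6.xlarge"}},
    "metadata": {"instance": 1},
}

print(config_by_instance_type["cloud"]["computing"]["instance_type"])
```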

Network connection

In the following two scenarios, you need to connect EAS to the network of your client:
  • Access to EAS

    You want to access EAS from your virtual private cloud (VPC) in direct connection mode.

  • Access from EAS

    You want to use EAS to access a service in your VPC, such as Redis.

You configure the network connection by using an elastic network interface (ENI) together with the vSwitch ID and security group ID of your client. When a service instance in EAS starts, an ENI is created in the specified vSwitch and associated with the service instance. You can then use the ENI to establish connections between the service instance and the network of your client.
Note After you configure the network connection by specifying a vSwitch ID, access from and to the VPC to which the vSwitch belongs is allowed by default. If the CIDR block of the VPC is 10.0.0.0/8, the block overlaps the CIDR block of EAS and service instances cannot be connected. To prevent this issue, you must explicitly specify a subnet of the VPC CIDR block to or from which access is allowed.
To configure the network connection, you can set the fields with the cloud.networking prefix.
  • The following table describes the fields with the cloud.networking prefix.
    Field Description
    cloud.networking.security_group_id The ID of the security group to which the ECS instance that hosts your client belongs.
    cloud.networking.vswitch_id The ID of the vSwitch to which the ECS instance that hosts your client belongs. An ENI is created in this vSwitch. Make sure that the vSwitch has enough available IP addresses. Otherwise, service instances cannot be created during service deployment in EAS.
    Note By default, access from and to the VPC to which the specified vSwitch belongs is allowed. If the CIDR block of the VPC is 10.0.0.0/8, the block overlaps the CIDR block of EAS and service instances cannot be connected. To prevent this issue, use the destination_cidrs field to explicitly specify a subnet of the VPC CIDR block to or from which access is allowed. For example, to allow access only from or to the CIDR block of the specified vSwitch (10.1.1.0/24), set the destination_cidrs field to 10.1.1.0/24.
    cloud.networking.destination_cidrs The subnet of the CIDR block of the VPC to or from which you allow access. The value must be explicitly specified.
    cloud.networking.default_route The default network egress. Valid values: eth0 and eth1. Default value: eth0. After you connect EAS to the VPC of your client, each service instance has two network interface controllers (NICs): eth0, the primary NIC in the VPC of EAS, and eth1, the secondary NIC in the VPC of your client. By default, traffic flows through eth0. However, the VPC of EAS does not allow Internet access. Typically, if you want to enable Internet access for EAS, you can configure a NAT gateway for the VPC of your client and set the cloud.networking.default_route field to eth1. Internet-bound traffic then flows to the VPC of your client through eth1 and reaches the Internet through the NAT gateway.
  • The following code provides an example on how to set the fields:
    {
      "model_path": "http://eas-data.oss-cn-shanghai.aliyuncs.com/models/lr_xingke4.pmml",
      "name": "test_pmml",
      "processor": "pmml",
      "metadata": {
        "instance": 1,
        "cpu": 3,
        "memory": 2000
      },
      "cloud": {
        "networking": {
          "security_group_id": "sg-2vce4xxvy5hn1hmjx7yh",
          "vswitch_id": "vsw-2vcbbihcy3cg8fjdpdvdl",
          "destination_cidrs": "10.2.0.0/16"
        }
      }
    }
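The notes above warn against setting destination_cidrs to the entire 10.0.0.0/8 block. As a quick sanity check, you can verify with the Python standard library that a candidate value is a proper subnet of the VPC CIDR block; the helper below is an illustrative sketch, not part of EAS:

```python
import ipaddress

# Illustrative check based on the notes above: destination_cidrs should
# name a proper subnet of the VPC CIDR block (10.0.0.0/8 in the example)
# rather than the whole overlapping block.
def is_proper_subnet(destination_cidr: str, vpc_cidr: str = "10.0.0.0/8") -> bool:
    dest = ipaddress.ip_network(destination_cidr)
    vpc = ipaddress.ip_network(vpc_cidr)
    return dest.subnet_of(vpc) and dest.prefixlen > vpc.prefixlen

print(is_proper_subnet("10.1.1.0/24"))  # True: a narrower subnet of the VPC block
print(is_proper_subnet("10.0.0.0/8"))   # False: the entire VPC block
```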

Log delivery

You can deliver the logs of the services deployed in the shared resource group to your Log Service Logstore by performing the following steps:

  1. Create a Logstore.
    You can create a project and a Logstore with custom names in Log Service. Alternatively, you can select an existing Logstore. For more information, see Getting Started.
  2. Create a machine group.
    1. On the Machine Groups page of the Log Service console, you can create a machine group by specifying a custom ID.
      Create Machine Group panel
      Note The custom ID dedicated to EAS is eas-log-group-{Region ID}. For example, the custom ID of the machine group for EAS in the China (Zhangjiakou) region is eas-log-group-cn-zhangjiakou.
    2. After a service is deployed, go to the Machine Group Settings page of the machine group. In the Machine Group Status section, you can view the heartbeat status of service instances. A value of OK indicates that the machine group is normal.
      Machine Group Status section
  3. Configure a Logtail.
    After the machine group is created, create a Logtail configuration to collect logs from containers. If your service logs are stored in local files, you can use a Logtail configuration of the Docker File - Container type to collect the logs.
      Import Data section
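As a small illustration of the naming rule in the note above, the custom ID of the EAS machine group can be derived from the region ID; the helper function below is hypothetical:

```python
# Builds the custom machine group ID dedicated to EAS, following the
# eas-log-group-{Region ID} pattern described in the note above.
def eas_machine_group_id(region_id: str) -> str:
    return f"eas-log-group-{region_id}"

print(eas_machine_group_id("cn-zhangjiakou"))  # eas-log-group-cn-zhangjiakou
```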

Core file configuration

Your online service may unexpectedly exit due to code errors. To troubleshoot the issue, you can analyze the core file that is generated when the service exits. Services deployed in the shared resource group run in serverless mode. Therefore, to obtain the core file, you must mount external storage, such as Apsara File Storage NAS (NAS), and set the core_pattern field to a path in the mount directory in which you want to store the core file. To do so, add the runtime.core_pattern field to the service configuration file when you deploy the service. The following code provides an example:
{
  "model_path": "http://eas-data.oss-cn-shanghai.aliyuncs.com/models/lr_xingke4.pmml",
  "name": "test_pmml",
  "processor": "pmml",
  "runtime": {
    "core_pattern": "/corefiles/%h-%t-%e-%p.core"
  },
  "metadata": {
    "instance": 1,
    "cpu": 3
  }
}
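For reference, %h, %t, %e, and %p in the core_pattern template above are standard Linux core_pattern specifiers for the hostname, dump timestamp, executable name, and process ID. The kernel of the service instance performs the substitution; the sketch below only illustrates the resulting file name, with placeholder values:

```python
# Illustrative expansion of a core_pattern template. %h, %t, %e, and %p
# are the standard Linux specifiers for hostname, dump timestamp,
# executable name, and process ID; in practice the kernel performs this
# substitution when the core file is written.
def expand_core_pattern(pattern: str, host: str, ts: int, exe: str, pid: int) -> str:
    return (pattern.replace("%h", host)
                   .replace("%t", str(ts))
                   .replace("%e", exe)
                   .replace("%p", str(pid)))

print(expand_core_pattern("/corefiles/%h-%t-%e-%p.core",
                          "instance-0", 1700000000, "my_service", 42))
# /corefiles/instance-0-1700000000-my_service-42.core
```

You can then retrieve the core file from the mounted NAS directory and analyze it with a debugger such as gdb.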