Create a PAI EAS service using public resources - Platform For AI

Public resources are ideal for testing environments or for services with fluctuating traffic, allowing you to minimize costs. However, the availability of public resources is not guaranteed. To further reduce costs, you can use spot instances. To avoid deployment delays caused by a shortage of a specific instance type, you can configure multiple instance types. This topic describes how to use public resources to deploy model services.

Billing

You are billed for your actual usage of public resources. For more information, see Billing of Elastic Algorithm Service (EAS).

Start of billing

When you use public resources to deploy model services, billing starts when resources are provisioned for a service instance.
Platform for AI (PAI) provides a complimentary 30 GiB system disk for each machine node that uses public resources. You can expand the system disk on a pay-as-you-go basis. Billing for the expanded capacity starts after the disk is successfully created.

End of billing

On the Inference Service tab of the Elastic Algorithm Service (EAS) page, find the target service and click Stop in the Actions column to stop the model service and its billing.

Important

Stop idle model services promptly to avoid unnecessary charges.
Before stopping a service, ensure it is no longer needed to prevent business disruptions.
If an instance cannot be created because public resources are insufficient, the system automatically retries when resources become available. Stop or delete such services if you no longer need them.
To check whether a failure is caused by insufficient resources, click the service name to open the service details page and check the status of the service instance.

Core concepts

Spot instances

A spot instance is a preemptible instance that you purchase by setting a maximum price (your bid). It is a cost-effective way to obtain compute resources.

Benefits
- Cost-effectiveness: The main benefit of spot instances is their low price. The price fluctuates in real time based on supply and demand, and is typically much lower than the price of standard instances.
- Price tiers: Spot instances are available with or without a protection period. The prices from lowest to highest are as follows: price for a spot instance without a protection period < price for a spot instance with a protection period < price for a standard instance.
Acquisition conditions
- A spot instance can be acquired only when the inventory is sufficient and your bid price is not lower than the current market price.
Release conditions depend on the protection period settings of the spot instance.
- With a one-hour protection period: The instance runs for at least one hour. During this protection period, the instance is not released. After the protection period ends, the instance may be automatically released.
- Without a protection period: Continuous use of the instance is not guaranteed. The instance may be automatically released at any time due to changes in inventory or market prices.
Billing method
- Spot instances are billed on a pay-as-you-go basis, with charges calculated based on the real-time market price.
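
The acquisition and release conditions above can be sketched as two small predicates. This is an illustrative model only, not EAS code: the function names and the minute-based clock are assumptions made for the example, and the real release decision is made by the platform.

```python
PROTECTION_PERIOD_MINUTES = 60  # the one-hour protection period described above


def can_acquire_spot(bid_price: float, market_price: float, inventory_sufficient: bool) -> bool:
    """A spot instance is acquired only when stock is sufficient and
    the bid is not lower than the current market price."""
    return inventory_sufficient and bid_price >= market_price


def may_be_released(minutes_running: float, protection_enabled: bool) -> bool:
    """Return True if the instance is eligible for automatic release right now."""
    if protection_enabled and minutes_running < PROTECTION_PERIOD_MINUTES:
        return False  # still inside the protection period
    return True  # no protection period, or the protection period has ended
```

Note that `may_be_released` only says a release is possible. Whether it actually happens depends on inventory and market price changes.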

Multiple instance types

If you specify only one instance type when you deploy a service, deployment may be significantly delayed when that type is low on inventory. To address this issue, EAS allows you to select multiple instance types during deployment, which reduces delays caused by a shortage of any single type.

Instance usage order
When you create or update a service, you can specify multiple instance types, such as spot instances and standard instances. During deployment, the system attempts to use these instances in the order you specify. If the bid for a spot instance fails or an instance type is out of stock, the system automatically tries the next available type in the list.
Resource release and reallocation
If a configured spot instance is reclaimed due to changes in inventory or market price, EAS automatically provisions a new instance by using the highest-priority available resource from your configuration to ensure service continuity.
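
The ordering behavior described above can be illustrated with a short simulation. This is a sketch under stated assumptions, not the EAS scheduler: `pick_instance` and the `market` dictionary (stock and spot prices per instance type) are hypothetical, and the prices are made up. The candidate entries use the same `type` and `spot_price_limit` keys as the JSON configuration in this topic.

```python
from typing import Optional


def pick_instance(candidates: list[dict], market: dict[str, dict]) -> Optional[dict]:
    """Return the first candidate that can be provisioned, honoring list order."""
    for candidate in candidates:
        state = market.get(candidate["type"])
        if state is None or not state["in_stock"]:
            continue  # instance type is out of stock: try the next one
        limit = candidate.get("spot_price_limit")
        if limit is not None and limit < state["spot_price"]:
            continue  # spot bid is below the market price: try the next one
        return candidate
    return None  # nothing in the list is currently available


# Hypothetical market state: the spot bid of 1 loses against a price of 1.5.
candidates = [
    {"type": "ecs.c8i.2xlarge", "spot_price_limit": 1},  # spot, tried first
    {"type": "ecs.c8i.xlarge"},                          # standard fallback
]
market = {
    "ecs.c8i.2xlarge": {"in_stock": True, "spot_price": 1.5},
    "ecs.c8i.xlarge": {"in_stock": True, "spot_price": 0.4},
}
chosen = pick_instance(candidates, market)
```

In this scenario `chosen` is the standard `ecs.c8i.xlarge` entry. Running the same function again after a spot instance is reclaimed mirrors the reallocation step: the highest-priority entry that is available at that moment is selected.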

System disk

PAI provides a complimentary 30 GiB system disk for each machine node that uses public resources. If you need more capacity, you are billed for additional capacity on a pay-as-you-go basis. For more information about billing, see Billing of Elastic Algorithm Service (EAS).

Important

The maximum size of a system disk is 2,000 GiB. If the specified size exceeds this limit, the model service fails to deploy.
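
As a rough illustration of the two limits above, the following sketch checks a requested disk size before deployment and computes the capacity that would be billed. It assumes that the billable amount is simply the requested size minus the complimentary 30 GiB. The function name is hypothetical, and actual charges are determined by the EAS billing rules.

```python
FREE_DISK_GIB = 30    # complimentary system disk per machine node
MAX_DISK_GIB = 2000   # requests above this limit cause deployment to fail


def billable_disk_gib(requested_gib: int) -> int:
    """Return the additional capacity (GiB) billed on a pay-as-you-go basis."""
    if requested_gib > MAX_DISK_GIB:
        raise ValueError(
            f"system disk of {requested_gib} GiB exceeds the {MAX_DISK_GIB} GiB limit"
        )
    # Only capacity beyond the complimentary allowance is billable.
    return max(0, requested_gib - FREE_DISK_GIB)
```

For example, under this assumption a 40 GiB system disk would incur charges for 10 GiB.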

Procedure

Console configuration

This section uses custom deployment as an example.

1. Log on to the PAI console. Select a region at the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).
   - Create a service: On the Inference Service tab, click Deploy Service and select Custom Model Deployment > Custom Deployment.
   - Update a service: On the Inference Service tab, find the target service in the service list and click Update in the Actions column.
2. In the Resource Information section, select Public Resources for Resource Type. Then, click Resource Specification and select the desired instance type from the list.
3. (Optional) Enable the spot instance feature. Turn on the Bidding switch, set a bid price, and select a protection period for the spot instance.
   Note
   - The Bidding switch is available only for instance types that support spot instances.
   - When you use spot instances, also configure standard instances so that the deployment does not fail if your bid for a spot instance is unsuccessful.
4. (Optional) Configure multiple instance types. Click Add to configure additional instance types.
5. Configure the system disk size.

JSON configuration

After you configure the parameters in the console, you can view the generated JSON configuration in the Service Configuration section. You can also directly edit the parameters in the JSON file.

The following code block shows an example of the JSON parameters for resource deployment:

{
    "metadata": {
        "name": "test",
        "instance": 1,
        "workspace_id": "your-workspace-id",
        "disk": "40Gi"
    },
    "cloud": {
        "computing": {
            "instances": [
                {
                    "type": "ecs.c8i.2xlarge",
                    "spot_price_limit": 1
                },
                {
                    "type": "ecs.c8i.xlarge"
                }
            ],
            "disable_spot_protection_period": true
        }
    },
    "containers": [
        {
            "image": "eas-registry-vpc.cn-hangzhou.cr.aliyuncs.com/pai-eas/python-inference:py39-ubuntu2004",
            "script": "python app.py",
            "port": 8000
        }
    ]
}

Parameter: Description

metadata.instance: The number of service instances to create. In this example, the value 1 creates a single instance.
  Note: EAS supports single-node and multi-node distributed inference.
  - Single-node inference: One instance is deployed on a single machine instance.
  - Multi-node distributed inference: One instance is deployed on multiple machine instances.
metadata.disk: The size of the system disk. A complimentary 30 GiB system disk is provided for each machine node that uses public resources. If you require a larger capacity, you are billed for the additional storage on a pay-as-you-go basis. The maximum size is 2,000 GiB.
cloud.computing.instances: A prioritized list of allowed instance types. You can configure multiple instance types. If a bid for an instance type fails or inventory is insufficient, the system tries the next instance type in the configuration, in order, to create the service.
  - type: The instance type.
  - spot_price_limit (Optional): When this parameter is configured, the instance is a spot instance. The parameter value is the maximum pay-as-you-go price in USD. When this parameter is not configured, the instance is a standard pay-as-you-go instance.
cloud.computing.disable_spot_protection_period: Valid values:
  - false (Default): A one-hour protection period is provided after the spot instance is created. During the protection period, the instance is not released even if the market price exceeds your bid.
  - true: The protection period is disabled. A spot instance without a protection period is typically priced about 10% lower than an instance with a protection period.
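
If you generate the configuration programmatically, the resource-related fields can be assembled as in the following sketch. It uses only the keys shown in the example above. The helper names `build_service_config` and `has_standard_fallback` are hypothetical and are not part of any EAS SDK, and the `containers` section is omitted for brevity.

```python
import json


def build_service_config(name, workspace_id, instances, disk_gib=30,
                         replicas=1, disable_spot_protection_period=False):
    """Assemble the metadata and cloud sections of a service configuration."""
    return {
        "metadata": {
            "name": name,
            "instance": replicas,        # number of service instances
            "workspace_id": workspace_id,
            "disk": f"{disk_gib}Gi",     # system disk size, for example "40Gi"
        },
        "cloud": {
            "computing": {
                "instances": instances,  # tried in the order listed
                "disable_spot_protection_period": disable_spot_protection_period,
            }
        },
    }


def has_standard_fallback(instances):
    """True if at least one entry is a standard (non-spot) instance type."""
    return any("spot_price_limit" not in item for item in instances)


instances = [
    {"type": "ecs.c8i.2xlarge", "spot_price_limit": 1},  # spot instance
    {"type": "ecs.c8i.xlarge"},                          # standard instance
]
config = build_service_config(
    "test", "your-workspace-id", instances,
    disk_gib=40, disable_spot_protection_period=True,
)
rendered = json.dumps(config, indent=4)  # JSON text for the configuration editor
```

Checking `has_standard_fallback(instances)` before deployment is one way to follow the recommendation to pair spot instances with a standard instance type.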

FAQ

Insufficient public resources

When you deploy large, popular models, the public resources may be insufficient. In this case, you can use the following solutions:

- Switch regions. Resource availability varies by region, so switching to a different region may help you find available public resources.
  Important
  You can switch to the Ulanqab region to use Lingjun Spot Resources. No whitelist approval is required. Note that these are spot resources and can be preempted. Set your bid price carefully.
- Use an EAS resource group. Some instance types are not available as public resources. You can go to EAS Subscription Dedicated Resources to purchase dedicated EAS resources.
  Important
  Billing for pay-as-you-go dedicated resources starts immediately after the purchase, regardless of whether they are used to deploy a service. Promptly delete unused dedicated resources to avoid unnecessary charges.
