Platform For AI: Canary Release

Last Updated: Aug 28, 2025

Service groups are primarily used to manage multiple sub-services that handle business traffic, providing a unified traffic entry point. The system distributes incoming traffic to individual services based on traffic allocation policies, making them suitable for various business scenarios such as canary releases, elastic scaling, and heterogeneous resource scheduling. This document describes how to create service groups, view traffic entry points, and modify traffic allocation policies.

Scenarios

  • Canary release

    In a canary release, both production and canary services are added to the same group, with the canary service receiving a smaller portion of traffic. When a new version is released, it is first deployed to the canary service for observation. In case of issues, roll back the canary service or stop the canary service and redirect traffic back to the production service. If everything runs smoothly, update the production service entirely, and then scale down the canary service to zero or retain a small amount of traffic.

  • Auto scaling of pay-as-you-go and subscription resource groups

    In the same group, subscription services are deployed in dedicated resource groups with a fixed number of instances to support basic needs. Pay-as-you-go services are deployed in public resource groups, offering on-demand scaling to reduce costs.

  • Use of heterogeneous hardware resources

    In GPU acceleration scenarios, after service deployment, certain GPU types may experience downtime or insufficient inventory in some regions, preventing normal scaling of services. You can dynamically create services with different GPU types within the same service group, adapting to various CUDA environments. This allows multiple services to use heterogeneous resources to support the same business scenario. Since the traffic entry point for the service group remains unchanged, the frontend is unaware of these changes.

Create a service group

When you create a service, you can specify the service group to which the service belongs.

Note

If the specified service group does not exist, the system automatically creates the service group. If the specified service group exists, the system adds the new service to the service group. After all services in a service group are deleted, the service group is automatically deleted.
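The lifecycle rule in this note can be pictured with a small in-memory model. This is only a toy sketch for illustration: the GroupRegistry class and its methods are hypothetical names and are not part of EAS, the PAI console, or EASCMD.

```python
class GroupRegistry:
    """Toy in-memory model of the service group lifecycle (not the EAS API)."""

    def __init__(self):
        self.groups = {}  # group name -> set of service names

    def create_service(self, service, group):
        # A missing group is created implicitly; an existing group gains a member.
        self.groups.setdefault(group, set()).add(service)

    def delete_service(self, service, group):
        members = self.groups.get(group, set())
        members.discard(service)
        # Deleting the last service in a group removes the group itself.
        if group in self.groups and not members:
            del self.groups[group]


registry = GroupRegistry()
registry.create_service("pmml_prod", "pmml")   # creates the pmml group
registry.create_service("pmml_grey", "pmml")   # joins the existing group
registry.delete_service("pmml_grey", "pmml")   # group still has pmml_prod
registry.delete_service("pmml_prod", "pmml")   # group is removed
print(registry.groups)
```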

The following example shows how to create a service group named pmml and add the pmml_prod and pmml_grey services to the service group.

PAI Console

  1. Log on to the PAI console. Select a region at the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).

  2. On the Elastic Algorithm Service (EAS) page, click the Canary Release tab. On the tab that appears, click Create Group and Service.

  3. On the Custom Deployment page, configure the parameters and click Deploy.

    Parameters:

    • Service Name: Follow the on-screen instructions to specify a valid service name. Example: pmml_prod.

    • Group: the service group to which the service belongs. In this example, New Group is used, and the new group name is set to pmml.

    For information about other parameters, see Deploy a model service in the PAI console.

Repeat Steps 2 and 3 to create a service named pmml_grey that belongs to the pmml service group.

After you create the services, click pmml on the Canary Release tab to go to the group details page and view the services that belong to the group.

Important

Newly added services do not receive traffic by default. To route traffic to a new service, change its traffic settings as described in Modify traffic distribution policies.

EASCMD Client

  1. Prepare a service configuration file named service.json.

    • Configuration file for the pmml_prod service:

      {
        "name":"pmml_prod",
        "model_path":"http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/lr_xxxx.pmml",
        "processor":"pmml",
        "metadata":{
          "cpu":1,
          "instance":4,
          "group":"pmml"
        }
      }

    • Configuration file for the pmml_grey service:

      {
        "name":"pmml_grey",
        "model_path":"http://examplebucket.oss-cn-shanghai.aliyuncs.com/models/lr_xxxx.pmml",
        "processor":"pmml",
        "metadata":{
          "cpu":1,
          "instance":1,
          "group":"pmml"
        }
      }

    The group parameter specifies the name of the service group to which the service belongs. For information about other parameters, see Details of other parameters.

  2. Create two services and a service group.

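    Log on to the EASCMD client and run the create command for each service configuration file. The service group is created together with the first service that specifies it. For information about how to log on to the EASCMD client, see Download the EASCMD client and complete identity authentication. Sample command:

    $ eascmd create service.json

  3. View information about the services and service group.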

    Run the following ls command to view information about the services and service group:

    $ eascmd ls

    The following information is returned:

    [RequestId]: 716BEBFC-E8A4-51FD-A3F7-56376B167923
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
    |        SERVICENAME        | INSTANCE | CPU | MEMORY |      CREATETIME      |      UPDATETIME      | STATUS  | WEIGHT | TRAFFICSTATE |       SERVICEGROUP        |
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
    | pmml_prod                 |        4 |   1 | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running |     80 | grouping     | pmml                      |
    | pmml_grey                 |        1 |   1 | 1000M  | 2022-06-05T14:31:38Z | 2022-06-05T14:31:38Z | Running |     20 | grouping     | pmml                      |
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+

    Parameters:

    • pmml is displayed in the SERVICEGROUP column. This indicates that the two services belong to the pmml service group.

    • grouping is displayed in the TRAFFICSTATE column. This indicates that both services receive traffic. The traffic distribution between the services is 80% and 20%, which is calculated based on the number of service instances.
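If you want to check these columns from a script, the table can be parsed as plain text. The following is a minimal sketch that assumes the layout shown above (pipe-separated cells between +---+ border lines). The parse_ls_table helper is a hypothetical name, the sample string is an abridged copy of the output above with some columns omitted, and the text layout of the command output may differ between client versions.

```python
def parse_ls_table(text):
    """Turn the ASCII table printed by `eascmd ls` into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the RequestId line and the +---+ borders
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Abridged sample: same rows as above, fewer columns.
sample = """
[RequestId]: 716BEBFC-E8A4-51FD-A3F7-56376B167923
+-------------+----------+--------+--------------+--------------+
| SERVICENAME | INSTANCE | WEIGHT | TRAFFICSTATE | SERVICEGROUP |
+-------------+----------+--------+--------------+--------------+
| pmml_prod   |        4 |     80 | grouping     | pmml         |
| pmml_grey   |        1 |     20 | grouping     | pmml         |
+-------------+----------+--------+--------------+--------------+
"""

for row in parse_ls_table(sample):
    if row["SERVICEGROUP"] == "pmml":
        print(row["SERVICENAME"], row["TRAFFICSTATE"], row["WEIGHT"])
```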

View data ingresses

A service group has a centralized data ingress. Each service in the service group has a separate data ingress. The data ingresses are in the following formats:

Data ingress of a service group

<endpoint>/api/predict/<group_name>

Example:

http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com/api/predict/pmml

On the Canary Release tab, view the service group traffic ingress. Traffic to this ingress will be allocated to different services based on the policy. Services within the service group can be created or deleted, but the entry address remains unchanged, allowing for online debugging.

Data ingress of a service

<endpoint>/api/predict/<group_name>.<service_name>

Example:

http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com/api/predict/pmml.pmml_prod

On the Inference Service tab, view the traffic ingress for a single service. This ingress is associated with the specific lifecycle of the service, ensuring that traffic consistently flows into the designated service. Once a service is deleted, the ingress will be destroyed. After completing traffic redirection within the group, you will still need to use this ingress address to access the service and conduct online debugging.
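The two address formats above can be assembled with simple string formatting. This is a small sketch only: the function names are hypothetical, and the endpoint is the masked example value from this page, kept as-is. It builds the addresses and does not send any request.

```python
def group_ingress(endpoint, group_name):
    """Address that distributes traffic across the services in a group."""
    return f"{endpoint.rstrip('/')}/api/predict/{group_name}"


def service_ingress(endpoint, group_name, service_name):
    """Address that always reaches one specific service in a group."""
    return f"{endpoint.rstrip('/')}/api/predict/{group_name}.{service_name}"


# Masked example endpoint copied from this page.
endpoint = "http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com"
print(group_ingress(endpoint, "pmml"))
print(service_ingress(endpoint, "pmml", "pmml_prod"))
```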

Modify traffic distribution policies

Elastic Algorithm Service (EAS) currently supports two traffic allocation methods:

  • Instance-based allocation: Traffic is dynamically distributed based on the number of inference instances for each service. Example: If Service A has 1 instance and Service B has 3 instances, then Service A receives 25% of the traffic and Service B receives 75%.

  • Custom weight-based allocation: Traffic is allocated according to the weights assigned to each service. Example: If Service A has a weight of 100 and Service B has a weight of 400, then Service A receives 20% of the traffic and Service B receives 80%.

Important

When a service disables the traffic allocation feature, it no longer participates in group-based traffic distribution but can still be accessed and invoked individually. This applies to both allocation methods.
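The arithmetic behind both allocation methods is the same: each participating service receives its value divided by the sum of the values of all participating services. The sketch below only illustrates that calculation. The traffic_shares function and the dictionary layout are hypothetical, and the state values grouping and standalone are borrowed from the TRAFFICSTATE column that EASCMD prints. It does not describe how EAS routes requests internally.

```python
def traffic_shares(services, key):
    """Return each service's percentage of group traffic.

    services: list of dicts with "name", "state", and the field named by key.
    key: "instance" for instance-based allocation,
         "weight" for custom weight-based allocation.
    """
    # Only services that participate in group traffic count toward the total.
    total = sum(s[key] for s in services if s["state"] == "grouping")
    shares = {}
    for s in services:
        if s["state"] == "grouping" and total > 0:
            shares[s["name"]] = 100 * s[key] / total
        else:
            shares[s["name"]] = 0.0  # excluded from group distribution
    return shares


# Instance-based: 1 and 3 instances.
print(traffic_shares(
    [{"name": "A", "state": "grouping", "instance": 1},
     {"name": "B", "state": "grouping", "instance": 3}],
    key="instance"))

# Custom weights: 100 and 400.
print(traffic_shares(
    [{"name": "A", "state": "grouping", "weight": 100},
     {"name": "B", "state": "grouping", "weight": 400}],
    key="weight"))

# One service with traffic allocation disabled.
print(traffic_shares(
    [{"name": "pmml_prod", "state": "grouping", "instance": 4},
     {"name": "pmml_grey", "state": "standalone", "instance": 1}],
    key="instance"))
```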

The specific modification methods are as follows:

Note

You can also adjust service traffic weights and traffic status by calling the API. For details, see ReleaseService - Adjust Service Traffic Weights and Traffic Status.

Instance-based allocation

Using the Console

Turn on the traffic allocation switch in the corresponding column to enable the service to handle traffic. Turn it off to disable traffic for the service.

Using EASCMD

Run the following release command to modify traffic distribution policies. For information about how to log on to the EASCMD client, see Download the EASCMD client and complete identity authentication.

$ eascmd release <service_name> -s grouping|standalone

Parameters:

  • <service_name>: the name of the service. Change the value to the name of the service for which you want to modify the traffic distribution policy.

  • grouping|standalone: the status after modification. Valid values: grouping (receives traffic) and standalone (does not receive traffic).

Examples:

  • Run the following command to change the status of the pmml_grey service to standalone. This way, the pmml_grey service does not receive traffic.

    $ eascmd release pmml_grey -s standalone

    The following output is returned:

    Confirmed to release service [pmml_grey] to group traffic [Y/n]yes
    [RequestId]: 40C787DF-8900-5F7A-8A01-30F7D5A8BF3B
    [OK] Service [pmml_grey] has entered the traffic state: standalone

    Run the eascmd ls command to view the status of the service. The following output is returned:

    [RequestId]: 83BE3FBB-8CE2-5008-B435-1938A20B13AA
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
    |        SERVICENAME        | INSTANCE | CPU | MEMORY |      CREATETIME      |      UPDATETIME      | STATUS  | WEIGHT | TRAFFICSTATE |       SERVICEGROUP        |
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
    | pmml_prod                 |        4 |   1 | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running |    100 | grouping     | pmml                      |
    | pmml_grey                 |        1 |   1 | 1000M  | 2022-06-05T14:42:41Z | 2022-06-05T14:42:41Z | Running |      0 | standalone   | pmml                      |
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+

    The TRAFFICSTATE of the pmml_grey service changes to standalone. Its WEIGHT value is 0 and the WEIGHT value of pmml_prod is 100, which indicates that all group traffic is received by the pmml_prod service.

  • Run the following command to change the status of the pmml_grey service to grouping. This allows the pmml_grey service to receive traffic.

    $ eascmd release pmml_grey -s grouping

    The following output is returned:

    Confirmed to release service [pmml_grey] to group traffic [Y/n]yes
    [RequestId]: 40C787DF-8900-5F7A-8A01-30F7D5A8BF3B
    [OK] Service [pmml_grey] has entered the traffic state: grouping

    Run the eascmd ls command to view the status of the service. The following output is returned:

    [RequestId]: 83BE3FBB-8CE2-5008-B435-1938A20B13AA
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
    |        SERVICENAME        | INSTANCE | CPU | MEMORY |      CREATETIME      |      UPDATETIME      | STATUS  | WEIGHT | TRAFFICSTATE |       SERVICEGROUP        |
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
    | pmml_prod                 |        4 |   1 | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running |     80 | grouping     | pmml                      |
    | pmml_grey                 |        1 |   1 | 1000M  | 2022-06-05T14:42:41Z | 2022-06-05T14:42:41Z | Running |     20 | grouping     | pmml                      |
    +---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+

    The TRAFFICSTATE of the pmml_grey service changes to grouping. Its WEIGHT value returns to 20, which indicates that the service again receives 20% of the group traffic.

Custom weight-based allocation

Edit directly in the Traffic Weight column.