Service groups are primarily used to manage multiple sub-services that handle business traffic, providing a unified traffic entry point. The system distributes incoming traffic to individual services based on traffic allocation policies, making them suitable for various business scenarios such as canary releases, elastic scaling, and heterogeneous resource scheduling. This document describes how to create service groups, view traffic entry points, and modify traffic allocation policies.
Scenarios
Canary release
In a canary release, both production and canary services are added to the same group, with the canary service receiving a smaller portion of traffic. When a new version is released, it is first deployed to the canary service for observation. In case of issues, roll back the canary service or stop the canary service and redirect traffic back to the production service. If everything runs smoothly, update the production service entirely, and then scale down the canary service to zero or retain a small amount of traffic.
Auto scaling of pay-as-you-go and subscription resource groups
In the same group, subscription services are deployed in dedicated resource groups with a fixed number of instances to support basic needs. Pay-as-you-go services are deployed in public resource groups, offering on-demand scaling to reduce costs.
Use of heterogeneous hardware resources
In GPU acceleration scenarios, after service deployment, certain GPU types may experience downtime or insufficient inventory in some regions, preventing normal scaling of services. You can dynamically create services with different GPU types within the same service group, adapting to various CUDA environments. This allows multiple services to use heterogeneous resources to support the same business scenario. Since the traffic entry point for the service group remains unchanged, the frontend is unaware of these changes.
Create a service group
When you create a service, you can specify the service group to which the service belongs.
If the specified service group does not exist, the system automatically creates the service group. If the specified service group exists, the system adds the new service to the service group. After all services in a service group are deleted, the service group is automatically deleted.
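The lifecycle rules above can be pictured with a small in-memory model. This is only an illustrative sketch: `GroupRegistry` and its methods are hypothetical names invented for this example and are not part of any EAS API or SDK.

```python
class GroupRegistry:
    """Toy model of the service-group lifecycle (hypothetical, not an EAS API)."""

    def __init__(self):
        # Maps a group name to the set of service names it contains.
        self._groups = {}

    def create_service(self, service, group):
        # Creating a service in a group that does not exist creates the group;
        # otherwise the service is added to the existing group.
        self._groups.setdefault(group, set()).add(service)

    def delete_service(self, service, group):
        members = self._groups.get(group, set())
        members.discard(service)
        # When the last service is deleted, the group disappears as well.
        if not members:
            self._groups.pop(group, None)

    def groups(self):
        return {name: sorted(members) for name, members in self._groups.items()}


registry = GroupRegistry()
registry.create_service("pmml_prod", "pmml")   # group "pmml" is created implicitly
registry.create_service("pmml_grey", "pmml")   # added to the existing group
after_create = registry.groups()

registry.delete_service("pmml_prod", "pmml")
registry.delete_service("pmml_grey", "pmml")   # last member removed, group is gone
after_delete = registry.groups()
```

The model only mirrors the three rules stated above; it says nothing about how the platform implements them.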
The following example shows how to create a service group named pmml and add the pmml_prod and pmml_grey services to the service group.
PAI Console
1. Log on to the PAI console. Select a region at the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).
2. On the Elastic Algorithm Service (EAS) page, click the Canary Release tab. On the tab that appears, click Create Group and Service.
3. On the Custom Deployment page, configure the parameters and click Deploy.
   Parameters:
   Service Name: Follow the on-screen instructions to specify a valid service name. Example: pmml_prod.
   Group: the service group to which the service belongs. In this example, New Group is used, and the new group name is set to pmml.
   For information about other parameters, see Deploy a model service in the PAI console.
4. Repeat Steps 2 and 3 to create a service named pmml_grey that belongs to the pmml service group.
5. After you create the services, click pmml on the Canary Release tab to go to the group details page and view the services that belong to the group.
Newly added services do not receive traffic by default. To route traffic to them, adjust the traffic distribution policy as described in Modify traffic distribution policies.
EASCMD Client
1. Prepare a service configuration file named service.json.
   The group parameter specifies the service group to which the service belongs, that is, the name of the service group to create or join. For information about other parameters, see Details of other parameters.
2. Create two services and a service group.
   Log on to the EASCMD client and run the create command to create two services and a service group. For information about how to log on to the EASCMD client, see Download the EASCMD client and complete identity authentication. Sample code:
   $ eascmd create service.json
3. View information about the services and service group.
   Run the following ls command to view information about the services and service group:
   $ eascmd ls
   The following information is returned:
[RequestId]: 716BEBFC-E8A4-51FD-A3F7-56376B167923
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| SERVICENAME               | INSTANCE | CPU | MEMORY | CREATETIME           | UPDATETIME           | STATUS  | WEIGHT | TRAFFICSTATE | SERVICEGROUP              |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| pmml_prod                 | 4        | 1   | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running | 80     | grouping     | pmml                      |
| pmml_grey                 | 1        | 1   | 1000M  | 2022-06-05T14:31:38Z | 2022-06-05T14:31:38Z | Running | 20     | grouping     | pmml                      |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
Parameters:
pmml is displayed in the SERVICEGROUP column. This indicates that the two services belong to the pmml service group.
grouping is displayed in the TRAFFICSTATE column. This indicates that both services receive traffic. The traffic distribution between the services is 80% and 20%, which is calculated based on the number of service instances.
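If you want to check the group membership and weights programmatically, the table printed by eascmd ls can be parsed with a few lines of Python. The following is a minimal sketch that assumes the output layout matches the sample above (pipe-separated cells, `+---+` border lines, one header row). The function name `parse_ls_table` is hypothetical, and the sample table is abbreviated to the columns used here.

```python
def parse_ls_table(text):
    """Parse an ASCII table in the style of the `eascmd ls` sample into dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the [RequestId] line and the +---+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Abbreviated copy of the sample output shown above (assumed layout).
SAMPLE = """\
[RequestId]: 716BEBFC-E8A4-51FD-A3F7-56376B167923
+-------------+----------+--------+--------------+--------------+
| SERVICENAME | INSTANCE | WEIGHT | TRAFFICSTATE | SERVICEGROUP |
+-------------+----------+--------+--------------+--------------+
| pmml_prod   | 4        | 80     | grouping     | pmml         |
| pmml_grey   | 1        | 20     | grouping     | pmml         |
+-------------+----------+--------+--------------+--------------+
"""

rows = parse_ls_table(SAMPLE)
# Weights of the services in the pmml group that currently receive group traffic.
weights = {
    row["SERVICENAME"]: int(row["WEIGHT"])
    for row in rows
    if row["SERVICEGROUP"] == "pmml" and row["TRAFFICSTATE"] == "grouping"
}
```

The real client output may differ between versions, so treat the column names as something to verify against your own output.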
View data ingresses
A service group has a centralized data ingress. Each service in the service group has a separate data ingress. The data ingresses are in the following formats:
Data ingress of a service group
<endpoint>/api/predict/<group_name>
Example:
http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com/api/predict/pmml
On the Canary Release tab, you can view the traffic ingress of the service group. Traffic sent to this ingress is distributed to the services in the group based on the traffic distribution policy. Services can be created in or deleted from the service group without changing the ingress address, which also allows for online debugging.
Data ingress of a service
<endpoint>/api/predict/<group_name>.<service_name>
Example:
http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com/api/predict/pmml.pmml_prod
On the Inference Service tab, you can view the traffic ingress of an individual service. This ingress is bound to the lifecycle of the service and always routes traffic to that service. After the service is deleted, the ingress is destroyed. Even after traffic redirection within the group is complete, you still use this ingress address to access the service directly and debug it online.
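The two address formats above can be assembled with simple string formatting. This is a small sketch: the helper names `group_ingress` and `service_ingress` are hypothetical, and the endpoint is the masked example value from this page, not a working address.

```python
def group_ingress(endpoint, group_name):
    """Build <endpoint>/api/predict/<group_name>."""
    return f"{endpoint.rstrip('/')}/api/predict/{group_name}"


def service_ingress(endpoint, group_name, service_name):
    """Build <endpoint>/api/predict/<group_name>.<service_name>."""
    return f"{endpoint.rstrip('/')}/api/predict/{group_name}.{service_name}"


# Masked example endpoint copied from this page.
ENDPOINT = "http://182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com"

group_url = group_ingress(ENDPOINT, "pmml")
service_url = service_ingress(ENDPOINT, "pmml", "pmml_prod")
print(group_url)
print(service_url)
```

Clients that should follow the group's traffic policy would call the group address, while debugging a specific service would use the service address.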

Modify traffic distribution policies
Elastic Algorithm Service (EAS) currently supports two traffic allocation methods:
Instance-based allocation: Traffic is dynamically distributed in proportion to the number of inference instances of each service. Example: If Service A has 1 instance and Service B has 3 instances, then Service A receives 25% of the traffic and Service B receives 75%.
Custom weight-based allocation: Traffic is allocated in proportion to the weights assigned to each service. Example: If Service A has a weight of 100 and Service B has a weight of 400, then Service A receives 20% of the traffic and Service B receives 80%.
When traffic allocation is disabled for a service, the service no longer participates in group-based traffic distribution but can still be accessed and invoked individually through its own ingress. This applies to both allocation methods.
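The arithmetic behind both allocation methods is the same proportional split, applied either to instance counts or to custom weights. The sketch below reproduces the examples above. The function `traffic_shares` and the dictionary keys are hypothetical names chosen for illustration, and the code models only the percentages, not how EAS routes individual requests.

```python
def traffic_shares(services, mode):
    """Return the percentage of group traffic per service.

    services: list of dicts with keys "name", "instances", "weight", "state".
    mode: "instance" (split by instance count) or "weight" (split by custom weight).
    Services whose state is not "grouping" are excluded and get a share of 0.
    """
    if mode not in ("instance", "weight"):
        raise ValueError(f"unknown mode: {mode!r}")
    key = "instances" if mode == "instance" else "weight"
    active = [s for s in services if s["state"] == "grouping"]
    total = sum(s[key] for s in active)
    shares = {s["name"]: 0.0 for s in services}
    if total > 0:
        for s in active:
            shares[s["name"]] = 100.0 * s[key] / total
    return shares


services = [
    {"name": "A", "instances": 1, "weight": 100, "state": "grouping"},
    {"name": "B", "instances": 3, "weight": 400, "state": "grouping"},
]
by_instance = traffic_shares(services, "instance")  # A: 25%, B: 75%
by_weight = traffic_shares(services, "weight")      # A: 20%, B: 80%

# With traffic allocation disabled for A, B receives all group traffic.
services[0]["state"] = "standalone"
a_disabled = traffic_shares(services, "instance")
```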
You can adjust service traffic weights and traffic status through the API. For details, see ReleaseService - Adjust Service Traffic Weights and Traffic Status.
You can also modify the policy in the console or with EASCMD, as follows:
Instance-based allocation
Using the Console
Turn on the traffic allocation switch in the corresponding column to enable the service to handle traffic. Turn it off to disable traffic for the service.

Using EASCMD
Run the following release command to modify the traffic distribution policy. For information about how to log on to the EASCMD client, see Download the EASCMD client and complete identity authentication.
$ eascmd release <service_name> -s grouping|standalone
Parameters:
<service_name>: the name of the service whose traffic distribution policy you want to modify.
grouping|standalone: the traffic state after the modification. Valid values: grouping (receives traffic) and standalone (does not receive traffic).
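When the release command is driven from a script, it helps to validate the state value before the client is invoked. The following is a minimal sketch that only builds and prints the command line in the form shown above. The helper `build_release_command` is a hypothetical name, and the sketch does not execute the client.

```python
import shlex

# The two traffic states documented for `eascmd release ... -s`.
VALID_STATES = ("grouping", "standalone")


def build_release_command(service_name, state):
    """Return the argv list for `eascmd release <service_name> -s <state>`."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}, got {state!r}")
    if not service_name:
        raise ValueError("service_name must not be empty")
    return ["eascmd", "release", service_name, "-s", state]


cmd = build_release_command("pmml_grey", "standalone")
command_line = shlex.join(cmd)
print(command_line)
```

To actually run the command, the list could be passed to `subprocess.run`. Note that the sample outputs on this page show the client asking for a [Y/n] confirmation, so an unattended script would need to handle that prompt.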
Examples:
Run the following command to change the status of the pmml_grey service to standalone. This way, the pmml_grey service does not receive traffic.
$ eascmd release pmml_grey -s standalone
The following output is returned:
Confirmed to release service [pmml_grey] to group traffic [Y/n]yes
[RequestId]: 40C787DF-8900-5F7A-8A01-30F7D5A8BF3B
[OK] Service [pmml_grey] has entered the traffic state: standalone
Run the eascmd ls command to view the status of the service. The following output is returned:
[RequestId]: 83BE3FBB-8CE2-5008-B435-1938A20B13AA
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| SERVICENAME               | INSTANCE | CPU | MEMORY | CREATETIME           | UPDATETIME           | STATUS  | WEIGHT | TRAFFICSTATE | SERVICEGROUP              |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| pmml_prod                 | 4        | 1   | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running | 100    | grouping     | pmml                      |
| pmml_grey                 | 1        | 1   | 1000M  | 2022-06-05T14:42:41Z | 2022-06-05T14:42:41Z | Running | 0      | standalone   | pmml                      |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
The TRAFFICSTATE of the pmml_grey service changes to standalone. The value of the WEIGHT parameter is 0, which indicates that all traffic is received by the pmml_prod service.
Run the following command to change the status of the pmml_grey service to grouping. This allows the pmml_grey service to receive traffic.
$ eascmd release pmml_grey -s grouping
The following output is returned:
Confirmed to release service [pmml_grey] to group traffic [Y/n]yes
[RequestId]: 40C787DF-8900-5F7A-8A01-30F7D5A8BF3B
[OK] Service [pmml_grey] has entered the traffic state: grouping
Run the eascmd ls command to view the status of the service. The following output is returned:
[RequestId]: 83BE3FBB-8CE2-5008-B435-1938A20B13AA
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| SERVICENAME               | INSTANCE | CPU | MEMORY | CREATETIME           | UPDATETIME           | STATUS  | WEIGHT | TRAFFICSTATE | SERVICEGROUP              |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
| pmml_prod                 | 4        | 1   | 1000M  | 2022-06-05T14:30:49Z | 2022-06-05T14:30:49Z | Running | 80     | grouping     | pmml                      |
| pmml_grey                 | 1        | 1   | 1000M  | 2022-06-05T14:42:41Z | 2022-06-05T14:42:41Z | Running | 20     | grouping     | pmml                      |
+---------------------------+----------+-----+--------+----------------------+----------------------+---------+--------+--------------+---------------------------+
The TRAFFICSTATE of the pmml_grey service changes to grouping. The percentage of traffic that is received by the service is 20%.
Custom weight-based allocation
Edit the weight of each service directly in the Traffic Weight column.
