Services or scripts may stop running because of program exceptions, instance restarts, or power outages. If they do not resume promptly, online business may suffer losses. This topic describes how to use the ecs-tool-servicekeepalive plugin of Cloud Assistant to keep services alive.
Implementation
Based on systemd in Linux, the ecs-tool-servicekeepalive plugin uses a periodic monitoring mechanism to ensure that interrupted services or scripts can quickly resume, thereby ensuring service reliability and continuity. To use the plugin to keep a service alive, enter a command that can start the service or its program. Then, the plugin automatically generates the systemd service configuration based on the startup command that you enter, without the need for manual configuration. After the configuration is generated, systemd starts the service and configures the service to start on system startup.
systemd is a Linux component that can be used to manage services. For example, systemd can start a service on instance startup or restart a service after an unexpected stop. For more information, see systemd documentation.
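This topic does not show the unit file that the plugin generates. As a rough illustration of the systemd mechanism described above, the following sketch prints a hand-written unit that restarts a command when it exits and starts it on boot. The function name render_keepalive_unit, the Description text, and the Restart and RestartSec values are assumptions made for illustration, not the plugin's actual output.

```shell
#!/bin/bash
# Print a minimal systemd unit for a start command (illustrative only;
# not the file that ecs-tool-servicekeepalive actually writes).
# Usage: render_keepalive_unit '<cmd>' [user] [group]
render_keepalive_unit() {
    local cmd=$1 user=${2:-root} group=${3:-root}
    cat <<EOF
[Unit]
Description=Keepalive wrapper for: $cmd

[Service]
ExecStart=$cmd
User=$user
Group=$group
# Restart the process whenever it exits, after a short delay (assumed values).
Restart=always
RestartSec=1

[Install]
# Start on system startup as part of the multi-user target.
WantedBy=multi-user.target
EOF
}
```

For example, render_keepalive_unit '/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh' prints a unit that you can compare with the configuration that the plugin creates in /etc/systemd/system.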
Procedure
Enable keepalive for a service
After you deploy services on an Elastic Compute Service (ECS) instance, start the ecs-tool-servicekeepalive plugin of Cloud Assistant as the root user.
Start a service as the root user
sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "start,'<cmd>'"
Replace <cmd> with the actual command that can start a service. Examples:
Shell program:
/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh
Python program:
python /home/root/main.py
Start a service as a specific user
sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "start,execstart='<cmd>',user=<user_name>,group=<group_name>"
Replace <cmd> with the actual command that can start a service. Examples:
Shell program:
/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh
Python program:
python /home/root/main.py
Replace <user_name> with the actual name of the user that starts the service. To view existing users, run the following command:
cut -d: -f1 /etc/passwd
Replace <group_name> with the group name of the user that starts the service. To view existing user groups, run the following command:
cut -d: -f1 /etc/group
You must specify the absolute path of the script or program file.
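The two --params formats above differ only in whether a user and group are given, so a small helper can assemble the value and catch relative paths before the plugin is called. The following is a minimal sketch. The function name build_start_params is hypothetical, and the path check is a simple heuristic (every word that contains a slash must start with one), not a rule enforced by the plugin.

```shell
#!/bin/bash
# Build the --params value for "start" (hypothetical helper).
# Usage: build_start_params '<cmd>' [user group]
build_start_params() {
    local cmd=$1 user=${2:-} group=${3:-} word
    local -a words
    # Split the command on whitespace without triggering glob expansion.
    read -ra words <<< "$cmd"
    for word in "${words[@]}"; do
        case $word in
            /*) ;;  # absolute path: accepted
            */*) echo "not an absolute path: $word" >&2; return 1 ;;
        esac
    done
    if [ -n "$user" ] && [ -n "$group" ]; then
        printf "start,execstart='%s',user=%s,group=%s\n" "$cmd" "$user" "$group"
    else
        printf "start,'%s'\n" "$cmd"
    fi
}

# Possible use (not executed here):
# sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive \
#     --params "$(build_start_params '/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh')"
```

The helper only prints the parameter string, so you can inspect it before you pass it to acs-plugin-manager.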
If you cannot enable keepalive for a service, we recommend that you perform the following operations to prevent business exceptions due to multiple running processes for the service: Query the keepalive status of the service, disable keepalive for the service, resolve the issue, restart the service, and then re-enable keepalive for the service.
Query the keepalive status of a service
Run the following command to check whether keepalive is enabled for a service:
sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "status"
The following command output indicates that keepalive is enabled for the service:
service_name execstart user group status
ecs_keepalive_1744262359.service /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log active (running) since Thu 2025-04-10 13
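If you want to check the status from a script, the output can be filtered with awk. The following sketch assumes that the output has the layout shown above: one header line, then one row per service whose first field is the service name. The function names list_keepalive_services and service_is_running are hypothetical, and the real output may differ between plugin versions.

```shell
#!/bin/bash
# Read "status" output on stdin and print the service_name of every row.
list_keepalive_services() {
    awk '$1 ~ /\.service$/ { print $1 }'
}

# Read "status" output on stdin; succeed only if the named service
# has a row that reports "active (running)".
service_is_running() {
    awk -v name="$1" '
        $1 == name && index($0, "active (running)") { found = 1 }
        END { exit !found }'
}

# Possible use (not executed here):
# sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "status" \
#     | list_keepalive_services
```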
Disable keepalive for a service
Run the following command to disable keepalive for a service:
sudo acs-plugin-manager --exec --local --plugin ecs-tool-servicekeepalive --params "stop <service_name>"
Replace <service_name> with the actual service name. You can obtain the service name in the service_name column of the command output in Query the keepalive status of a service.
When you disable keepalive for a service, the service process is terminated, the service no longer automatically starts on system startup, and the service configuration generated by the ecs-tool-servicekeepalive plugin is also deleted.
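Because the service name is generated by the plugin, you may need to look it up by the startup command before you disable keepalive. The following sketch finds the matching row in the status output and prints, without running, the disable command. The function names find_service_by_cmd and print_stop_command are hypothetical, and the row layout is assumed to match the status output shown in this topic.

```shell
#!/bin/bash
# Read "status" output on stdin and print the service_name of the first
# row that contains the given text (for example, part of the startup command).
find_service_by_cmd() {
    awk -v needle="$1" '$1 ~ /\.service$/ && index($0, needle) { print $1; exit }'
}

# Print (do not run) the command that disables keepalive for a service name.
print_stop_command() {
    printf 'sudo acs-plugin-manager --exec --local --plugin ecs-tool-servicekeepalive --params "stop %s"\n' "$1"
}
```

Printing the command first lets you confirm that the right service is selected, because disabling keepalive also terminates the service process and deletes its configuration.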
Example
Prepare a script for a test service.
In the script, the user working directory is /home/ecs-user. Replace the working directory with the actual one.
# In the /home/ecs-user directory, create the keepalive-simple folder and create the test-keepalive.sh script in the folder.
sudo mkdir -p /home/ecs-user/keepalive-simple && \
sudo tee /home/ecs-user/keepalive-simple/test-keepalive.sh > /dev/null << 'EOF'
#!/bin/bash
# Generate a log message every second in the specified log file.
while true
do
    echo "$(date '+%Y-%m-%d %H:%M:%S') progress is alive" >> "$1"
    sleep 1
done
EOF
# Grant executable permissions on the script.
sudo chmod +x /home/ecs-user/keepalive-simple/test-keepalive.sh
(Optional) Run the following command to query the status of the service:
ps aux | grep test-keepalive.sh
The following command output indicates that the service is not started:
ecs-user 2207 0.0 0.0 221528 916 pts/0 S+ 11:34 0:00 grep --color=auto test-keepalive.sh
Enable keepalive for the service.
sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "start,'/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log'"
The following command output indicates that keepalive is enabled for the service:
Created symlink /etc/systemd/system/multi-user.target.wants/ecs_keepalive_1744256544.service → /etc/systemd/system/ecs_keepalive_1744256544.service.
Start systemd service for "/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log" success
Query the keepalive status and the status of the service.
Run the following command to query the keepalive status of the service:
sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "status"
The following command output indicates that keepalive is enabled for the service and the service is running:
service_name execstart user group status
ecs_keepalive_1744256544.service /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log active (running) since Thu 2025-04-10 11
(Optional) Run the following command to query the status of the service:
ps aux | grep test-keepalive.sh
The following command output indicates that the service is running:
root 3144 0.0 0.0 222200 3420 ? Ss 11:42 0:00 /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log
ecs-user 6841 0.0 0.0 221660 968 pts/0 S+ 11:49 0:00 grep --color=auto test-keepalive.sh
(Optional) Verify the keepalive effect.
Manually stop the service process to verify that the process is automatically restarted.
Method 1: Restart the ECS instance
Restart the instance in the ECS console to simulate an unexpected restart.
Method 2: Terminate the service process
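Run the following command to terminate the test-keepalive.sh process. Replace <PID> with the process ID (PID) that you obtained by running the ps command. The date command prints the time at which the process is terminated. Because the service process runs as the root user, kill is run with sudo.
date && sudo kill -9 <PID>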
(Optional) Run the following command to query the status of the service:
ps aux | grep test-keepalive.sh
The following command output indicates that the service is running:
root 33061 0.0 0.0 222200 3504 ? Ss 13:19 0:00 /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log
ecs-user 34558 0.0 0.0 221660 2556 pts/0 S+ 13:23 0:00 grep --color=auto test-keepalive.sh
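The test script appends one timestamped line per second to its log file, so the largest gap between consecutive timestamps gives a rough estimate of how long the service was down. The following sketch computes that gap. The function name max_log_gap is hypothetical, and the sketch assumes GNU date and log lines that start with a timestamp in the YYYY-MM-DD HH:MM:SS format.

```shell
#!/bin/bash
# Read log lines on stdin and print the largest gap, in seconds,
# between consecutive timestamps. Lines that cannot be parsed are skipped.
max_log_gap() {
    local prev="" max=0 day time rest now gap
    while read -r day time rest; do
        # Convert "YYYY-MM-DD HH:MM:SS" to seconds since the epoch.
        now=$(date -d "$day $time" +%s 2>/dev/null) || continue
        if [ -n "$prev" ]; then
            gap=$((now - prev))
            if [ "$gap" -gt "$max" ]; then
                max=$gap
            fi
        fi
        prev=$now
    done
    echo "$max"
}

# Possible use (not executed here):
# tail -n 300 /home/ecs-user/keepalive-simple/test-keepalive.log | max_log_gap
```

A result of 1 means that no interruption is visible in the lines that were checked. A larger value approximates the time between the termination and the restart. The function calls date once per line, so limit the input with tail for long logs.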
Run the following command to disable keepalive for the service:
sudo acs-plugin-manager --exec --local --plugin ecs-tool-servicekeepalive --params "stop ecs_keepalive_1744256544.service"
The following command output indicates that keepalive is disabled for the service:
service check ok, file:ecs_keepalive_1744256544.service is valid
Removed /etc/systemd/system/multi-user.target.wants/ecs_keepalive_1744256544.service.
stop service ok, service:ecs_keepalive_1744256544.service is stopped and removed
References
For more information about Cloud Assistant, see Overview.