Elastic Compute Service: Use a Cloud Assistant plugin to keep services alive

Last Updated: Apr 16, 2025

Services or scripts may stop running because of program exceptions, instance restarts, or power outages. If they do not resume promptly, your online business may suffer losses. This topic describes how to use the ecs-tool-servicekeepalive plugin of Cloud Assistant to keep services alive.

Implementation

The ecs-tool-servicekeepalive plugin is built on systemd in Linux and uses a periodic monitoring mechanism to ensure that interrupted services or scripts resume quickly, which keeps services reliable and continuous. To keep a service alive, you provide a command that starts the service or its program. The plugin then generates the systemd service configuration from that startup command, so no manual configuration is required. After the configuration is generated, systemd starts the service and configures it to start on system startup.

Note

systemd is a Linux component that manages services. For example, systemd can start a service on instance startup or restart a service after an unexpected stop. For more information, see the systemd documentation.
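
This topic does not show the configuration file that the plugin generates. To illustrate the two systemd behaviors mentioned in the note (restart after an unexpected stop, start on instance startup), the following sketch writes a hand-written unit file to a temporary directory and prints it. The unit name demo_keepalive.service and the Restart= and RestartSec= values are assumptions chosen for illustration; the file that the plugin actually generates may differ.

```shell
#!/bin/bash
# Illustrative only: a hand-written unit that shows the systemd directives
# behind restart-on-exit and start-on-boot. This is NOT the plugin's output.

unit_dir="$(mktemp -d)"                       # scratch directory, not /etc/systemd/system
unit_file="$unit_dir/demo_keepalive.service"  # hypothetical unit name

cat > "$unit_file" << 'EOF'
[Unit]
Description=Demo keepalive service (illustrative)

[Service]
# Command that starts the program, given with absolute paths.
ExecStart=/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log
# Restart the process whenever it exits, after a short delay (assumed values).
Restart=always
RestartSec=5

[Install]
# Start the service on boot as part of the multi-user target.
WantedBy=multi-user.target
EOF

cat "$unit_file"
```

With a unit like this, Restart= covers the unexpected-stop case and the [Install] section covers start on system startup once the unit is enabled.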

Procedure

Enable keepalive for a service

After you deploy services on an Elastic Compute Service (ECS) instance, run the ecs-tool-servicekeepalive plugin of Cloud Assistant with root privileges. The service itself can be started as the root user or as a specific user.

Start a service as the root user

sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "start,'<cmd>'"

Replace <cmd> with the actual command that can start a service. Examples:

  • Shell program: /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh

  • Python program: python /home/root/main.py

Start a service as a specific user

sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "start,execstart='<cmd>',user=<user_name>,group=<group_name>"
  • Replace <cmd> with the actual command that can start a service. Examples:

    • Shell program: /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh

    • Python program: python /home/root/main.py

  • Replace <user_name> with the actual name of the user that starts the service. Run the cut -d: -f1 /etc/passwd command to view existing users.

  • Replace <group_name> with the group name of the user that starts the service. Run the cut -d: -f1 /etc/group command to view existing user groups.

Warning
  • You must specify the absolute path of the script or program file.

  • If you cannot enable keepalive for a service, perform the following operations in sequence to prevent business exceptions caused by multiple running processes of the service: query the keepalive status of the service, disable keepalive for the service, resolve the issue, restart the service, and then re-enable keepalive for the service.
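
The requirements above (absolute path, existing user, existing group) can be checked before the plugin is called. The following is a minimal sketch of such a pre-check. The function name check_keepalive_args is hypothetical and not part of the plugin; the user and group lookups reuse the cut commands shown earlier in this section.

```shell
#!/bin/bash
# Hypothetical pre-check helper; not part of the plugin.
# Usage: check_keepalive_args <program_path> <user_name> <group_name>
check_keepalive_args() {
    local path="$1" user="$2" group="$3"

    # The script or program file must be given as an absolute path.
    case "$path" in
        /*) ;;
        *) echo "not an absolute path: $path" >&2; return 1 ;;
    esac

    # The file must exist on the instance.
    [ -e "$path" ] || { echo "file not found: $path" >&2; return 1; }

    # The user must appear in /etc/passwd and the group in /etc/group.
    cut -d: -f1 /etc/passwd | grep -qxF -- "$user" \
        || { echo "unknown user: $user" >&2; return 1; }
    cut -d: -f1 /etc/group | grep -qxF -- "$group" \
        || { echo "unknown group: $group" >&2; return 1; }
}
```

For example, check_keepalive_args /home/ecs-user/keepalive-simple/test-keepalive.sh ecs-user ecs-user returns 0 only if all checks pass, so it can guard the start command with &&.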

Query the keepalive status of a service

Run the following command to check whether keepalive is enabled for a service:

sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "status"

The following command output indicates that keepalive is enabled for the service:

service_name                   execstart            user  group status              
ecs_keepalive_1744262359.service /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log              active (running) since Thu 2025-04-10 13
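
Because the stop command needs the service name, it can be convenient to extract that name from the status output. The following sketch assumes that the service name is the first whitespace-separated field and follows the ecs_keepalive_<digits>.service pattern seen in the sample output; the function name list_keepalive_services is hypothetical.

```shell
#!/bin/bash
# Print the keepalive service names found in "status" output read from stdin.
# Assumption: the name is the first field and matches ecs_keepalive_<digits>.service.
list_keepalive_services() {
    awk '$1 ~ /^ecs_keepalive_[0-9]+\.service$/ { print $1 }'
}

# Try it on the sample output from this topic.
sample='service_name                   execstart            user  group status
ecs_keepalive_1744262359.service /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log              active (running) since Thu 2025-04-10 13'

printf '%s\n' "$sample" | list_keepalive_services
# prints: ecs_keepalive_1744262359.service
```

If the plugin writes this table to standard output, the status command can be piped into list_keepalive_services directly.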

Disable keepalive for a service

Run the following command to disable keepalive for a service:

sudo acs-plugin-manager --exec --local --plugin ecs-tool-servicekeepalive --params "stop <service_name>"

Replace <service_name> with the actual service name. You can obtain the service name from the service_name column of the command output described in Query the keepalive status of a service.

Note

When you disable keepalive for a service, the service process is terminated, the service no longer automatically starts on system startup, and the service configuration generated by the ecs-tool-servicekeepalive plugin is also deleted.
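
To confirm that the generated configuration is gone after keepalive is disabled, you can check that neither the unit file nor its startup symlink remains. The following is a minimal sketch; the function name keepalive_unit_removed is hypothetical, and the default paths are taken from the sample output in the Example section of this topic, so they are an assumption about where the plugin stores its files.

```shell
#!/bin/bash
# Return 0 if no unit file or startup symlink remains for the given service.
# Usage: keepalive_unit_removed <service_name> [systemd_dir]
# The optional second argument exists only so that the check can be tried
# against a scratch directory instead of /etc/systemd/system.
keepalive_unit_removed() {
    local service="$1" dir="${2:-/etc/systemd/system}"
    local unit="$dir/$service"
    local link="$dir/multi-user.target.wants/$service"

    # -e misses dangling symlinks, so test -L as well.
    [ ! -e "$unit" ] && [ ! -L "$unit" ] && \
    [ ! -e "$link" ] && [ ! -L "$link" ]
}
```

For example, keepalive_unit_removed ecs_keepalive_1744256544.service && echo "configuration removed" prints the message only when both files are absent.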

Example

  1. Prepare a script for a test service.

    The script uses /home/ecs-user as the working directory. Replace it with your actual working directory.

    # In the /home/ecs-user directory, create the keepalive-simple folder and create the test-keepalive.sh script in the folder.
    sudo mkdir -p /home/ecs-user/keepalive-simple && \
    sudo tee /home/ecs-user/keepalive-simple/test-keepalive.sh > /dev/null << 'EOF'
    #!/bin/bash
    # Generate a log message every second in the specified log file.
    while true
    do
        echo "$(date '+%Y-%m-%d %H:%M:%S') progress is alive" >> "$1"
        sleep 1
    done
    EOF
    # Grant executable permissions on the script.
    sudo chmod +x /home/ecs-user/keepalive-simple/test-keepalive.sh
  2. (Optional) Run the following command to query the status of the service:

    ps aux | grep test-keepalive.sh

    The following command output indicates that the service is not started:

    ecs-user    2207  0.0  0.0 221528   916 pts/0    S+   11:34   0:00 grep --color=auto test-keepalive.sh
  3. Enable keepalive for the service.

    sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "start,'/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log'"

    The following command output indicates that keepalive is enabled for the service:

    Created symlink /etc/systemd/system/multi-user.target.wants/ecs_keepalive_1744256544.service → /etc/systemd/system/ecs_keepalive_1744256544.service.
    Start systemd service for "/bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log" success
  4. Query the keepalive status and the status of the service.

    1. Run the following command to query the keepalive status of the service:

      sudo acs-plugin-manager --exec --plugin ecs-tool-servicekeepalive --params "status"

      The following command output indicates that keepalive is enabled for the service and the service is running:

      service_name                   execstart            user  group status              
      ecs_keepalive_1744256544.service /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log              active (running) since Thu 2025-04-10 11
    2. (Optional) Run the following command to query the status of the service:

      ps aux | grep test-keepalive.sh

      The following command output indicates that the service is running:

      root        3144  0.0  0.0 222200  3420 ?        Ss   11:42   0:00 /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log
      ecs-user    6841  0.0  0.0 221660   968 pts/0    S+   11:49   0:00 grep --color=auto test-keepalive.sh
  5. (Optional) Verify the keepalive effect.

    1. Manually stop the service process to verify that the process is automatically restarted.

      Method 1: Restart the ECS instance

      Restart the instance in the ECS console to simulate an unexpected restart.

      Method 2: Terminate the service process

      Run the following command to terminate the test-keepalive.sh process. Replace <PID> with the process ID (PID) that you obtain by running the ps command.

      date && sudo kill -9 <PID>
    2. (Optional) Run the following command to query the status of the service:

      ps aux | grep test-keepalive.sh

      The following command output indicates that the service has been restarted and is running under a new PID:

      root       33061  0.0  0.0 222200  3504 ?        Ss   13:19   0:00 /bin/bash /home/ecs-user/keepalive-simple/test-keepalive.sh /home/ecs-user/keepalive-simple/test-keepalive.log
      ecs-user   34558  0.0  0.0 221660  2556 pts/0    S+   13:23   0:00 grep --color=auto test-keepalive.sh
  6. Run the following command to disable keepalive for the service:

    sudo acs-plugin-manager --exec --local --plugin ecs-tool-servicekeepalive --params "stop ecs_keepalive_1744256544.service"

    The following command output indicates that keepalive is disabled for the service:

    service check ok, file:ecs_keepalive_1744256544.service is valid
    Removed /etc/systemd/system/multi-user.target.wants/ecs_keepalive_1744256544.service.
    stop service ok, service:ecs_keepalive_1744256544.service is stopped and removed
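
In the verification step of the example above, the ps output shows that a process exists. Because the test script appends one line to its log file every second, another way to confirm that the service is working is to check that the log keeps growing. The following is a minimal sketch; the function name log_is_growing and the default interval of 3 seconds are assumptions for illustration.

```shell
#!/bin/bash
# Return 0 if the log file gains at least one line within the given interval.
# Usage: log_is_growing <log_file> [seconds]
log_is_growing() {
    local log="$1" interval="${2:-3}" before after

    before=$(wc -l < "$log") || return 1   # line count now
    sleep "$interval"
    after=$(wc -l < "$log") || return 1    # line count after waiting

    [ "$after" -gt "$before" ]
}
```

For example, run log_is_growing /home/ecs-user/keepalive-simple/test-keepalive.log && echo "service is writing logs" after you terminate the process. Allow a few seconds for the restart before you run the check.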

References

For more information about Cloud Assistant, see Overview.