This topic describes two methods of installing the xdragon_hardware_detect_plugin monitoring plug-in. Both methods apply to ECS Bare Metal Instances that are equipped with local disks.

Prerequisites

You can install the monitoring plug-in only on an ECS Bare Metal Instance that meets the following requirements:

  • The instance is located in the China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), or China (Zhangjiakou-Beijing Winter Olympics) region.
  • The instance runs a Linux operating system.
  • If you plan to batch install the monitoring plug-in on multiple instances by using Operation Orchestration Service (OOS) and to select the target ECS Bare Metal Instances by tag, the tag must already be bound to those instances. For more information, see Add a tag to resources.
  • If you plan to install the monitoring plug-in manually, the Cloud Assistant client must be installed on the instance. For more information, see Configure the cloud assistant client.
    Note ECS instances created from public images after December 1, 2017 are pre-installed with the Cloud Assistant client.
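
The requirements above can be checked on an instance before installation. The following is a minimal sketch, not an official tool: the function names are hypothetical, the region IDs are an assumed mapping of the region names listed above, and the presence of aliyun_installer on the PATH is used as a stand-in for "the Cloud Assistant client is installed".

```shell
#!/bin/bash
# ASSUMPTION: these region IDs correspond to the region names listed above.
SUPPORTED_REGIONS="cn-beijing cn-shanghai cn-hangzhou cn-shenzhen cn-zhangjiakou"

# Return 0 if the given region ID is in the supported list.
region_supported() {
  local region="$1" r
  for r in $SUPPORTED_REGIONS; do
    [ "$r" = "$region" ] && return 0
  done
  return 1
}

# Print one line per failed prerequisite; return non-zero if any check fails.
check_prerequisites() {
  local region="$1" rc=0
  if [ "$(uname -s)" != "Linux" ]; then
    echo "FAIL: operating system is not Linux"; rc=1
  fi
  if ! region_supported "$region"; then
    echo "FAIL: region '$region' is not in the supported list"; rc=1
  fi
  # ASSUMPTION: aliyun_installer on the PATH implies the Cloud Assistant client is present.
  if ! command -v aliyun_installer >/dev/null 2>&1; then
    echo "FAIL: aliyun_installer not found"; rc=1
  fi
  [ "$rc" -eq 0 ] && echo "OK: all prerequisites met"
  return "$rc"
}

# Example, run on the instance (the metadata URL is the one used by the sample script in this topic):
#   check_prerequisites "$(wget -q --timeout=1 -t 1 -O - 'http://100.100.100.200/latest/meta-data/region-id')"
```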

Background information

If you use an ECS Bare Metal Instance that is equipped with local disks, use the xdragon_hardware_detect_plugin plug-in to monitor and diagnose the health of those disks.

The xdragon_hardware_detect_plugin plug-in periodically checks the health status of local disks on ECS Bare Metal Instances. If an exception occurs on a local disk, the plug-in automatically reports the exception as a system event of the local disk. The corresponding system event code is SystemMaintenance.ReInitErrorDisk. For more information, see Overview of system events on ECS instances equipped with local disks.

Method 1: Batch install the monitoring plug-in on multiple instances by using OOS

OOS can automatically install the xdragon_hardware_detect_plugin monitoring plug-in by using a public template.

  1. Log on to the OOS console.
  2. In the top navigation bar, select a region.
  3. In the Public Template section, enter ACS-InstallXDragonAndCloudMonitor to search for the template that is used to install the monitoring plug-in. Click Create Execution.
    Batch install the xdragon_hardware_detect_plugin monitoring plug-in on multiple instances - Create Execution
  4. In the Create Execution pane, complete the following settings:
    1. On the Basic Information tab, set Execution Mode to Automatic, and retain the default values for other parameters. Click Next: Parameter Settings.
    2. On the Parameter Settings tab, set the following parameters as needed and retain the default values for other parameters. After completing the settings, click Next: Preview.
      • targets: Choose Specify Instance Tags or Select Instances Manually to select one or more ECS Bare Metal Instances on which to install the monitoring plug-in.
      • action: Select install to install the plug-in.
        Note This operations template can be used to install, update, or uninstall the monitoring plug-in. Set the action parameter based on your needs.
      • rateControl: Select Concurrency-based Control, select % as the unit for Concurrency, and set the value to 100%.
    3. After confirming the preceding settings, click Create Execution.
    After the execution is created, go to the Executions page to view the results:
    • If Success is displayed in the Execution Status column of an O&M task, the task succeeded.
    • If Failed is displayed in the Execution Status column of an O&M task, click Details in the Actions column and then click Execution Logs. Use the log information to analyze and adjust the execution.
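
For repeated rollouts, the console settings above could in principle be expressed as a single command-line call. The sketch below only prints such a command for review and does not run it. The template name and the parameter names (targets, action, rateControl) come from the steps above. The JSON value shapes for targets and rateControl, the use of the aliyun CLI with the OOS StartExecution operation, and the function names are assumptions that should be checked against the template's parameter definitions before use.

```shell
#!/bin/bash
# Build the Parameters JSON for the template.
# ASSUMPTION: the value shapes for "targets" and "rateControl" are illustrative only.
build_oos_parameters() {
  local tag_key="$1" tag_value="$2" action="${3:-install}"
  printf '{"targets":{"Type":"Tags","Tags":[{"Key":"%s","Value":"%s"}]},' "$tag_key" "$tag_value"
  printf '"action":"%s",' "$action"
  printf '"rateControl":{"Mode":"Concurrency","MaxErrors":0,"Concurrency":"100%%"}}\n'
}

# Print (do not execute) a candidate CLI command so that it can be reviewed first.
print_oos_command() {
  local region="$1" tag_key="$2" tag_value="$3"
  printf "aliyun oos StartExecution --RegionId %s --TemplateName %s --Mode Automatic --Parameters '%s'\n" \
    "$region" "ACS-InstallXDragonAndCloudMonitor" \
    "$(build_oos_parameters "$tag_key" "$tag_value")"
}

# Example with a hypothetical tag:
#   print_oos_command cn-hangzhou env prod
```

Printing instead of executing keeps the sketch side-effect free; the console flow described above remains the documented procedure.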

Method 2: Install the monitoring plug-in manually

You can perform the following steps to download and install the xdragon_hardware_detect_plugin monitoring plug-in by using the aliyun_installer tool provided by the Cloud Assistant client:

  1. Connect to an ECS Bare Metal Instance as the root user.
  2. Optional: Run the aliyun_installer -h command to view the help information of the tool.
  3. Run the aliyun_installer command to install the xdragon_hardware_detect_plugin monitoring plug-in.
    [root@EcsHost ~]# aliyun_installer -i xdragon_hardware_detect_plugin -e 1.0.0
  4. Save and run the shell script that is used to install the special version of the CloudMonitor agent. For more information about the shell script, see the Sample script section in this topic.
    [root@EcsHost ~]# bash <nameOfTheScript>.sh
    Note You must install the xdragon_hardware_detect_plugin monitoring plug-in before you install the CloudMonitor agent. If you have installed the CloudMonitor agent first, run the /usr/local/cloudmonitor/CmsGoAgent.linux-amd64 restart command to restart the CloudMonitor agent.
  5. Run the smartctl -V command to check whether the smartctl monitoring and analysis tool for local disks has been installed on the target instance.
    If the version number of the smartctl tool is shown in the command output, the tool has been installed on the instance.
  6. Optional: If the version number of the smartctl tool is not shown in the command output, install the tool by using the following methods:
    • For CentOS systems:
      [root@EcsHost ~]# yum install smartmontools
    • For Ubuntu systems:
      [root@EcsHost ~]# apt update && apt install smartmontools
    • For installation methods on other Linux distributions, see the smartmontools documentation.
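
Steps 5 and 6 can be combined into one check-then-install helper. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the distribution is detected from the ID field of /etc/os-release, the -y flag is added so that the package managers do not prompt, and only the two distributions named above are covered.

```shell
#!/bin/bash
# Map a distribution ID (ID field of /etc/os-release) to an install command.
smartmontools_install_cmd() {
  case "$1" in
    centos) echo "yum install -y smartmontools" ;;
    ubuntu) echo "apt update && apt install -y smartmontools" ;;
    *)      return 1 ;;   # unknown distribution: the caller decides what to do
  esac
}

# Install smartmontools only if smartctl is missing.
ensure_smartctl() {
  if command -v smartctl >/dev/null 2>&1; then
    smartctl -V | head -n 1      # first line of the version banner
    return 0
  fi
  local distro cmd
  distro="$(. /etc/os-release && echo "$ID")"
  if cmd="$(smartmontools_install_cmd "$distro")"; then
    echo "smartctl not found; running: $cmd"
    sh -c "$cmd"
  else
    echo "smartctl not found; no install command known for '$distro'" >&2
    return 1
  fi
}

# Example, run as root on the instance:
#   ensure_smartctl
```
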
Sample script: the following shell script manually installs the special version of the CloudMonitor agent:
#!/bin/bash
echo "installing cms agent"

if [ -z "${CMS_HOME}" ]; then
  CMS_HOME="/usr/local/cloudmonitor"
  # Use /opt on CoreOS
  grep -qi coreos /etc/os-release && CMS_HOME="/opt/cloudmonitor"
fi

if [ `uname -m` = "x86_64" ]; then
    ARCH="amd64"
else
    ARCH="386"
fi

VERSION="2.1.57"
ELF_NAME=CmsGoAgent.linux-${ARCH}
DOWNLOAD_PATH="cms-go-agent/${VERSION}/${ELF_NAME}"
DEST_UPDATE_FILE="$CMS_HOME/${ELF_NAME}"

current_cms_version="0"
# xdragon instances are always x86; read the version of an already installed agent, if any
if [ -f "$DEST_UPDATE_FILE" ]; then
    current_cms_version="$("$DEST_UPDATE_FILE" version)"
fi

# Quote the variable so that an empty version string does not break the test
if [ "$current_cms_version" = "$VERSION" ]; then
    echo "CmsGoAgent already installed"
    echo "Installation success."
    exit 0
fi

if [ -z "${REGION_ID}" ]; then
  REGION_ID="$(wget -q --timeout=1 -t 1 -O - 'http://100.100.100.200/latest/meta-data/region-id')"
fi


if [ -d $CMS_HOME ] ; then
  if [ -f $CMS_HOME/wrapper/bin/cloudmonitor.sh ] ; then
    $CMS_HOME/wrapper/bin/cloudmonitor.sh remove;
    rm -rf $CMS_HOME;
  elif [ -f $DEST_UPDATE_FILE ]; then
    $DEST_UPDATE_FILE stop
    #$DEST_UPDATE_FILE uninstall
    ps aux | grep -v grep | grep $ELF_NAME
  fi
fi

download()
{
  if [ -z "${REGION_ID}" ]; then
    echo "networkType is classic"
    OSS_URL="http://cms-agent-cn-hangzhou.oss-cn-hangzhou-internal.aliyuncs.com/$DOWNLOAD_PATH"
  else
    echo "networkType is vpc, REGION_ID: $REGION_ID"
    if [[ "$REGION_ID" = "cn-shenzhen-finance-1" ]]; then
      OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
      CMS_PROXY="szcmsproxy.aliyun.com:3128"
    elif [[ "$REGION_ID" = "cn-shanghai-finance-1" ]]; then
      OSS_URL="http://cms-agent-$REGION_ID.oss-$REGION_ID-pub-internal.aliyuncs.com/$DOWNLOAD_PATH"
    elif [[ "$REGION_ID" = "ap-south-1" ]]; then
      OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
      CMS_PROXY="cmsproxy-ap-south-1.aliyuncs.com:8080"
    elif [ "$REGION_ID" = "ap-southeast-3" -o "$REGION_ID" = "me-east-1" -o "$REGION_ID" = "cn-chengdu" ]; then
      OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
    else
      OSS_URL="http://cms-agent-$REGION_ID.oss-$REGION_ID-internal.aliyuncs.com/$DOWNLOAD_PATH"
    fi
  fi
  echo download from "$OSS_URL"
  wget -q -e "http_proxy=$CMS_PROXY" "$OSS_URL" -O "$DEST_UPDATE_FILE" -t 3 --connect-timeout=2
  if [ $? -ne 0 ]; then
    echo "download fail, retry..."
    CMS_PROXY="vpc-opencmsproxy.aliyun.com:8080";
    OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
    wget -q -e "http_proxy=$CMS_PROXY" "$OSS_URL" -O "$DEST_UPDATE_FILE" -t 3 --connect-timeout=2
  fi
  if [ $? -ne 0 ]; then
    echo "download fail, retry..."
    CMS_PROXY="opencmsproxy.aliyun.com:8080";
    OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
    wget -q -e "http_proxy=$CMS_PROXY" "$OSS_URL" -O "$DEST_UPDATE_FILE" -t 3 --connect-timeout=2
  fi
}

mkdir -p $CMS_HOME && \
chown -R root:root $CMS_HOME && \
download && \
chmod a+x $DEST_UPDATE_FILE
$DEST_UPDATE_FILE check
RC=$?
if [ ${RC} -ne 0 ]; then
    echo CmsGoAgent install failed, your platform is not supported
    exit ${RC}
fi

$DEST_UPDATE_FILE install >/dev/null 2>&1 || true
$DEST_UPDATE_FILE start
ps aux | grep -v grep | grep $ELF_NAME

ACT_VERSION=`$DEST_UPDATE_FILE version`
if [ -n "$ACT_VERSION" ]; then
    echo CmsGoAgent v$ACT_VERSION installed
else
    echo CmsGoAgent install failed
    exit 1
fi
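
The download function in the sample script tries a primary URL and then falls back to other proxy and URL combinations. As an illustration of that pattern only, the same fallback can be written as a loop over candidates. In this sketch the function names, the "proxy|url" candidate format, and the FETCH override hook (used to exercise the loop without network access) are all hypothetical; the wget options are the ones used in the sample script.

```shell
#!/bin/bash
# Default fetcher: same wget options as the sample script.
default_fetch() {
  wget -q -e "http_proxy=$1" "$2" -O "$3" -t 3 --connect-timeout=2
}

# Try each "proxy|url" candidate in order until one download succeeds.
# Usage: fetch_with_fallback DEST "proxy|url" ["proxy|url" ...]
fetch_with_fallback() {
  local dest="$1"; shift
  local candidate proxy url
  for candidate in "$@"; do
    proxy="${candidate%%|*}"   # text before the first "|" (may be empty)
    url="${candidate#*|}"      # text after the first "|"
    echo "download from $url (proxy: ${proxy:-none})" >&2
    if "${FETCH:-default_fetch}" "$proxy" "$url" "$dest"; then
      return 0
    fi
    echo "download fail, retry..." >&2
  done
  return 1
}

# Example with the fallback endpoints from the sample script:
#   fetch_with_fallback "$DEST_UPDATE_FILE" \
#     "vpc-opencmsproxy.aliyun.com:8080|http://cms-download.aliyun.com/$DOWNLOAD_PATH" \
#     "opencmsproxy.aliyun.com:8080|http://cms-download.aliyun.com/$DOWNLOAD_PATH"
```

A loop makes the order of attempts explicit and returns a clear status to the caller, whereas the sample script relies on the exit status that carries over between consecutive if blocks.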

Result

After the xdragon_hardware_detect_plugin monitoring plug-in is installed, its files are located in the /usr/local/xdragon_hwqc directory. To update the plug-in, run the aliyun_installer -d xdragon_hardware_detect_plugin command. To uninstall it, run the aliyun_installer -u xdragon_hardware_detect_plugin command.
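
The result can also be checked from a script. The following is a minimal sketch with hypothetical function names; treating "the directory exists and is not empty" as proof of installation is an assumption, and the process check mirrors the ps and grep pipeline used in the sample script.

```shell
#!/bin/bash
# Return 0 if the plug-in directory exists and contains at least one entry.
# ASSUMPTION: a non-empty directory is taken to mean the plug-in is installed.
plugin_installed() {
  local dir="${1:-/usr/local/xdragon_hwqc}"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Return 0 if a CloudMonitor agent process is running.
agent_running() {
  ps aux | grep -v grep | grep -q "CmsGoAgent"
}

# Example, run on the instance:
#   plugin_installed && echo "plug-in files present"
#   agent_running && echo "CloudMonitor agent running"
```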

What to do next

You can call an ECS API operation to isolate damaged local disks. When damaged local disks are isolated, the corresponding ECS Bare Metal Instances are not migrated to different physical machines. For more information, see Overview of system events on ECS instances equipped with local disks.