This topic describes two methods to install the xdragon_hardware_detect_plugin monitoring plug-in. These methods are applicable to ECS bare metal instances that are equipped with local disks.
Prerequisites
You can install the monitoring plug-in on an ECS bare metal instance only when the instance meets the following requirements:
- The instance is located in the China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), or China (Zhangjiakou-Beijing Winter Olympics) region.
- The instance runs a Linux operating system.
- If you use Operation Orchestration Service (OOS) to install the monitoring plug-in on multiple instances at a time and select the ECS bare metal instances by tag, the tag must already be bound to the instances. For more information, see Create or bind a tag.
- To manually install the monitoring plug-in, make sure that the Cloud Assistant client is installed on the instance. For more information, see Install the Cloud Assistant client.
Note: ECS instances created from public images after December 1, 2017 have the Cloud Assistant client pre-installed.
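The prerequisites above can be checked from a shell on the instance. The following is a minimal sketch, not an official tool. The function names are hypothetical, the region IDs are assumed to correspond to the regions listed above, and the check for aliyun_installer assumes that the Cloud Assistant client puts that tool on the PATH. The metadata URL in the usage comment is the one queried by the installation script in this topic.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of any Alibaba Cloud tool.

# Region IDs assumed to correspond to the regions listed in the prerequisites.
SUPPORTED_REGIONS="cn-beijing cn-shanghai cn-hangzhou cn-shenzhen cn-zhangjiakou"

# Succeeds when the given region ID is in the supported list.
region_supported() {
    local region="$1" r
    for r in $SUPPORTED_REGIONS; do
        [ "$r" = "$region" ] && return 0
    done
    return 1
}

# Prints one line per prerequisite and returns non-zero if any check fails.
# Arguments: kernel name (for example, the output of uname -s), region ID.
check_prerequisites() {
    local kernel="$1" region="$2" rc=0
    if [ "$kernel" = "Linux" ]; then
        echo "os: ok"
    else
        echo "os: not Linux"
        rc=1
    fi
    if region_supported "$region"; then
        echo "region: ok"
    else
        echo "region: unsupported or unknown"
        rc=1
    fi
    # Assumption: aliyun_installer is on the PATH when Cloud Assistant is installed.
    if command -v aliyun_installer >/dev/null 2>&1; then
        echo "cloud assistant: aliyun_installer found"
    else
        echo "cloud assistant: aliyun_installer not on PATH"
        rc=1
    fi
    return "$rc"
}

# Typical call on an instance (reads the region from the metadata service):
# check_prerequisites "$(uname -s)" \
#     "$(wget -q --timeout=1 -t 1 -O - 'http://100.100.100.200/latest/meta-data/region-id')"
```

The checks take the kernel name and region as arguments rather than reading them directly, so the logic can be exercised on any machine before it is run on an instance.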
Background information
If you are using an ECS bare metal instance equipped with a local disk, you must monitor and diagnose the health status of the local disk by using the xdragon_hardware_detect_plugin plug-in.
The xdragon_hardware_detect_plugin plug-in checks the health status of local disks on ECS bare metal instances on a regular basis. If an exception occurs on a local disk, the plug-in automatically reports the exception as a system event of the local disk. The corresponding system event code is SystemMaintenance.ReInitErrorDisk. For more information, see O&M scenarios and system events for instances equipped with local disks.
Install the monitoring plug-in on multiple instances at a time by using OOS
OOS can automatically install the xdragon_hardware_detect_plugin monitoring plug-in by using a public template.
Manually install the monitoring plug-in
You can perform the following operations to download and install the xdragon_hardware_detect_plugin monitoring plug-in by using the aliyun_installer tool provided by the Cloud Assistant client:
#!/bin/bash
echo "installing cms agent"

# Default install directory; CoreOS uses /opt instead of /usr/local.
if [ -z "${CMS_HOME}" ]; then
    CMS_HOME="/usr/local/cloudmonitor"
    if grep -Eqi coreos /etc/os-release 2>/dev/null; then
        CMS_HOME="/opt/cloudmonitor"
    fi
fi

if [ "$(uname -m)" = "x86_64" ]; then
    ARCH="amd64"
else
    ARCH="386"
fi

VERSION="2.1.57"
ELF_NAME="CmsGoAgent.linux-${ARCH}"
DOWNLOAD_PATH="cms-go-agent/${VERSION}/${ELF_NAME}"
DEST_UPDATE_FILE="$CMS_HOME/${ELF_NAME}"

# Skip the installation if the target version is already present.
# SHENLONG is always x86 arch, so this resolves to the amd64 binary.
current_cms_version="0"
if [ -f "$DEST_UPDATE_FILE" ]; then
    current_cms_version="$("$DEST_UPDATE_FILE" version)"
fi
if [ "$current_cms_version" = "$VERSION" ]; then
    echo "CmsGoAgent already installed"
    echo "Installation success."
    exit 0
fi

# Look up the region from the instance metadata service if it is not set.
if [ -z "${REGION_ID}" ]; then
    REGION_ID="$(wget -q --timeout=1 -t 1 -O - 'http://100.100.100.200/latest/meta-data/region-id')"
fi

# Remove an agent installed with the wrapper script, or stop an existing CmsGoAgent.
if [ -d "$CMS_HOME" ]; then
    if [ -f "$CMS_HOME/wrapper/bin/cloudmonitor.sh" ]; then
        "$CMS_HOME/wrapper/bin/cloudmonitor.sh" remove
        rm -rf "$CMS_HOME"
    elif [ -f "$DEST_UPDATE_FILE" ]; then
        "$DEST_UPDATE_FILE" stop
        #$DEST_UPDATE_FILE uninstall
        ps aux | grep -v grep | grep "$ELF_NAME"
    fi
fi
download()
{
    # Pick the download URL (and proxy, where one is needed) for the region.
    if [ -z "${REGION_ID}" ]; then
        echo "networkType is classic"
        OSS_URL="http://cms-agent-cn-hangzhou.oss-cn-hangzhou-internal.aliyuncs.com/$DOWNLOAD_PATH"
    else
        echo "networkType is vpc, REGION_ID: $REGION_ID"
        if [[ "$REGION_ID" = "cn-shenzhen-finance-1" ]]; then
            OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
            CMS_PROXY="szcmsproxy.aliyun.com:3128"
        elif [[ "$REGION_ID" = "cn-shanghai-finance-1" ]]; then
            OSS_URL="http://cms-agent-$REGION_ID.oss-$REGION_ID-pub-internal.aliyuncs.com/$DOWNLOAD_PATH"
        elif [[ "$REGION_ID" = "ap-south-1" ]]; then
            OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
            CMS_PROXY="cmsproxy-ap-south-1.aliyuncs.com:8080"
        elif [[ "$REGION_ID" = "ap-southeast-3" || "$REGION_ID" = "me-east-1" || "$REGION_ID" = "cn-chengdu" ]]; then
            OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
        else
            OSS_URL="http://cms-agent-$REGION_ID.oss-$REGION_ID-internal.aliyuncs.com/$DOWNLOAD_PATH"
        fi
    fi

    echo "download from $OSS_URL"
    wget -q -e "http_proxy=$CMS_PROXY" "$OSS_URL" -O "$DEST_UPDATE_FILE" -t 3 --connect-timeout=2

    # First fallback: public download URL through the VPC proxy.
    if [ $? != 0 ]; then
        echo "download fail, retry..."
        CMS_PROXY="vpc-opencmsproxy.aliyun.com:8080"
        OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
        wget -q -e "http_proxy=$CMS_PROXY" "$OSS_URL" -O "$DEST_UPDATE_FILE" -t 3 --connect-timeout=2
    fi

    # Second fallback. A skipped "if" block leaves $? at 0, so this runs
    # only when the most recent wget failed.
    if [ $? != 0 ]; then
        echo "download fail, retry..."
        CMS_PROXY="opencmsproxy.aliyun.com:8080"
        OSS_URL="http://cms-download.aliyun.com/$DOWNLOAD_PATH"
        wget -q -e "http_proxy=$CMS_PROXY" "$OSS_URL" -O "$DEST_UPDATE_FILE" -t 3 --connect-timeout=2
    fi
}
mkdir -p "$CMS_HOME" && \
    chown -R root:root "$CMS_HOME" && \
    download && \
    chmod a+x "$DEST_UPDATE_FILE"

# The check subcommand fails if the platform is not supported.
"$DEST_UPDATE_FILE" check
RC=$?
if [ ${RC} -ne 0 ]; then
    echo "CmsGoAgent install failed, your platform is not supported"
    exit ${RC}
fi

"$DEST_UPDATE_FILE" install >/dev/null 2>&1 || true
"$DEST_UPDATE_FILE" start
ps aux | grep -v grep | grep "$ELF_NAME"

# Confirm the installation by asking the binary for its version.
ACT_VERSION="$("$DEST_UPDATE_FILE" version)"
if [ -n "$ACT_VERSION" ]; then
    echo "CmsGoAgent v$ACT_VERSION installed"
else
    echo "CmsGoAgent install failed"
    exit 1
fi
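The retry logic in the download function of the script above amounts to trying a list of proxy and URL pairs in order until one succeeds. A minimal sketch of that pattern follows. The names try_sources, fetch_with_wget, and FETCH are hypothetical and do not appear in the original script. The fetch command is injectable so that the loop can be exercised without network access.

```shell
#!/bin/bash
# Sketch only: try_sources, fetch_with_wget, and FETCH are hypothetical names.

# Default fetcher, using the same wget options as the installation script.
# Arguments: proxy (may be empty), URL, destination file.
fetch_with_wget() {
    wget -q -e "http_proxy=$1" "$2" -O "$3" -t 3 --connect-timeout=2
}
FETCH="${FETCH:-fetch_with_wget}"

# Tries each "proxy|url" pair in order and stops at the first success.
# Usage: try_sources DEST "proxy1|url1" "proxy2|url2" ...
try_sources() {
    local dest="$1" pair proxy url
    shift
    for pair in "$@"; do
        proxy="${pair%%|*}"   # text before the first "|"
        url="${pair#*|}"      # text after the first "|"
        echo "download from $url"
        if "$FETCH" "$proxy" "$url" "$dest"; then
            return 0
        fi
        echo "download fail"
    done
    return 1
}

# Equivalent of the fallback chain in the installation script:
# try_sources "$DEST_UPDATE_FILE" \
#     "$CMS_PROXY|$OSS_URL" \
#     "vpc-opencmsproxy.aliyun.com:8080|http://cms-download.aliyun.com/$DOWNLOAD_PATH" \
#     "opencmsproxy.aliyun.com:8080|http://cms-download.aliyun.com/$DOWNLOAD_PATH"
```

Compared with chained checks of $?, the loop makes the order of the sources explicit and returns a clear status to the caller.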
Result
You can run the aliyun_installer -d xdragon_hardware_detect_plugin command to update the plug-in, or run the aliyun_installer -u xdragon_hardware_detect_plugin command to uninstall it.
What to do next
You can call an ECS API operation to isolate damaged local disks. When damaged local disks are isolated, the corresponding ECS bare metal instances are not migrated to a different physical machine. For more information, see Overview of system events on ECS instances equipped with local disks.