This topic describes how to use Alibaba Cloud Managed Service for Prometheus to monitor a Simple Network Management Protocol (SNMP) system.

Prerequisites

A Prometheus instance for Container Service is created. For more information, see Create a Prometheus instance to monitor an ACK cluster.

Limit

You can install the component only for Prometheus for Container Service instances.

What is SNMP?

Components of an SNMP system

An SNMP system consists of the following components: network management system (NMS), agent, managed object, and management information base (MIB). The following list describes these components and the related concepts:
  • Network management system (NMS): The NMS is the manager side of SNMP. It queries or modifies information on SNMP agents and receives information that the agents actively push. Managed Service for Prometheus provides the SNMP exporter, which takes the querying role of the NMS and reads information from SNMP agents.
  • SNMP agent: An SNMP agent runs on a managed device. The agent collects and reports device information to the NMS.
  • Management information base (MIB): An MIB is a database that lists various managed objects that a managed device can provide. Each managed object has a unique object identifier (OID).
  • Device: Devices can be switches, routers (including soft routers), firewalls, uninterruptible power supplies (UPSs), and access points (APs) that support SNMP.
  • Managed object: A device contains at least one managed object. A managed object can be the device itself, a hardware component (such as a network port), or a set of parameters.
  • Object identifier (OID): An OID identifies a specific managed object. An OID is a sequence of numbers separated by periods. For example, 1.3.6.1.2.1.1 is the OID of the system group. OIDs form a tree. Standard objects are located under fixed branches such as 1.3.6.1.2.1. Vendor-specific objects are located under 1.3.6.1.4.1, followed by the private enterprise number (PEN) that the Internet Assigned Numbers Authority (IANA) assigns to the vendor. The branches below the PEN are defined by the vendor for its own products.
  • Module: To cover devices of different types and vendors, the SNMP exporter provides about a dozen modules, such as if_mib for network devices, ddwrt for soft routers, and paloalto_fw for firewalls. if_mib is the most commonly used module.
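
The OID tree described above can be illustrated with a small Python sketch. It is not part of the SNMP exporter. The helper names (parse_oid, is_under, enterprise_number) are hypothetical and chosen only for this example. The two subtree roots are the standard mib-2 and enterprises branches.

```python
from typing import Optional, Tuple

Oid = Tuple[int, ...]

# Standard subtree roots.
MIB_2: Oid = (1, 3, 6, 1, 2, 1)        # iso.org.dod.internet.mgmt.mib-2
ENTERPRISES: Oid = (1, 3, 6, 1, 4, 1)  # iso.org.dod.internet.private.enterprises


def parse_oid(text: str) -> Oid:
    """Convert a dotted string such as '1.3.6.1.2.1.1' to a tuple of integers.

    A leading dot, as printed by some tools, is accepted.
    """
    return tuple(int(part) for part in text.strip().lstrip(".").split("."))


def is_under(oid: Oid, root: Oid) -> bool:
    """Return True if oid lies in the subtree rooted at root."""
    return oid[:len(root)] == root


def enterprise_number(oid: Oid) -> Optional[int]:
    """Return the PEN of a vendor-specific OID, or None for other OIDs."""
    if is_under(oid, ENTERPRISES) and len(oid) > len(ENTERPRISES):
        return oid[len(ENTERPRISES)]
    return None


system_group = parse_oid("1.3.6.1.2.1.1")
print(is_under(system_group, MIB_2))        # standard object under mib-2
print(enterprise_number(system_group))      # not a vendor-specific OID
print(enterprise_number(parse_oid("1.3.6.1.4.1.9.2.1")))  # PEN 9 is registered to Cisco
```

Representing an OID as a tuple makes subtree checks a simple prefix comparison and avoids false matches that plain string prefixes can produce (for example, 1.3.6.1.2.1 against 1.3.6.1.2.10).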

SNMP exporter

Each SNMP OID identifies one kind of status data, much as a metric name does in Managed Service for Prometheus. The SNMP exporter queries the specified OIDs from SNMP agents and converts the returned data to readable Managed Service for Prometheus metrics. The SNMP exporter provides comprehensive conversion configurations by default. In most cases, you can convert OIDs to readable metrics without additional configurations.
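
The following Python sketch illustrates the idea of mapping an OID and its value to a readable metric line. It is a simplified illustration, not the exporter's actual implementation. The OID_NAMES table and the helper names are hypothetical, and the single ifIndex label is an assumption made to keep the example short.

```python
from typing import Dict, Tuple

# Hypothetical lookup table: column OID -> metric name (OIDs from the IF-MIB).
OID_NAMES: Dict[str, str] = {
    "1.3.6.1.2.1.2.2.1.7": "ifAdminStatus",
    "1.3.6.1.2.1.31.1.1.1.6": "ifHCInOctets",
    "1.3.6.1.2.1.31.1.1.1.10": "ifHCOutOctets",
}


def split_instance(oid: str) -> Tuple[str, str]:
    """Split 'column.index' into the known column OID and the row index."""
    for column in OID_NAMES:
        if oid.startswith(column + "."):
            return column, oid[len(column) + 1:]
    raise KeyError(f"no mapping for OID {oid}")


def to_metric_line(oid: str, value: int) -> str:
    """Render one SNMP value as a line in the Prometheus text format."""
    column, index = split_instance(oid)
    return f'{OID_NAMES[column]}{{ifIndex="{index}"}} {value}'


# The trailing .3 is the interface index of the row.
print(to_metric_line("1.3.6.1.2.1.31.1.1.1.6.3", 1024))
```

The last component of an interface OID is the row index, so the sketch turns it into a label instead of a separate metric name. This keeps one metric per column and one time series per interface.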

SNMP metric monitoring models

SNMP metric collection

SNMP can help O&M engineers manage the network in a simple and efficient manner. SNMP helps the engineers collect information about the bandwidth usage of different devices on the network, troubleshoot problems, and quickly identify network performance trends. SNMP collects different data from devices of different vendors. The default configurations of the SNMP exporter are compatible with the OID mappings of mainstream vendors and their network devices and can meet the requirements of most scenarios. For more information, see Prometheus open source documentation. Alibaba Cloud Managed Service for Prometheus supports the metric data collection of the if_mib module.

The following table lists SNMP metrics. A Cisco 16-port switch is used as an example.

Metric | Description | OID
ifAdminStatus | The administrative status of the interface. | 1.3.6.1.2.1.2.2.1.7
ifHCOutOctets | The total number of bytes sent by the interface. | 1.3.6.1.2.1.31.1.1.1.10
ifInBroadcastPkts | The number of broadcast packets received by the interface. | 1.3.6.1.2.1.31.1.1.1.3
ifInErrors | The number of errors that occur in inbound traffic. | 1.3.6.1.2.1.2.2.1.14
ifSpeed | The transmission rate of the interface. Unit: bits per second. | 1.3.6.1.2.1.2.2.1.5
ifMtu | The maximum transmission unit (MTU). | 1.3.6.1.2.1.2.2.1.4
ifOutDiscards | The total number of dropped packets in the outbound traffic of the interface. | 1.3.6.1.2.1.2.2.1.19
ifHCInOctets | The total number of bytes received by the interface. | 1.3.6.1.2.1.31.1.1.1.6
ifHighSpeed | The bandwidth of the interface. Unit: Mbit/s. | 1.3.6.1.2.1.31.1.1.1.15
ifInDiscards | The total number of dropped packets in the inbound traffic of the interface. | 1.3.6.1.2.1.2.2.1.13
ifInMulticastPkts | The number of multicast packets received by the interface. | 1.3.6.1.2.1.31.1.1.1.2
ifInUnknownProtos | The total number of unknown protocol packets received by the interface. | 1.3.6.1.2.1.2.2.1.15
ifOutMulticastPkts | The number of multicast packets sent by the interface. | 1.3.6.1.2.1.31.1.1.1.4
sysUpTime | The time that has elapsed since the system was last re-initialized. | N/A
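
Metrics such as ifHCInOctets and ifHCOutOctets are cumulative counters, so traffic rates are derived from the difference between two samples. The following Python sketch shows that arithmetic under stated assumptions: the function names are hypothetical, the counters are 64-bit, and at most one wraparound occurs between the two samples. A device restart that resets the counter is not distinguished from a wraparound in this sketch.

```python
COUNTER64_MODULUS = 2 ** 64  # ifHC* counters are 64-bit


def counter_delta(previous: int, current: int,
                  modulus: int = COUNTER64_MODULUS) -> int:
    """Difference between two counter readings, allowing for one wraparound."""
    return (current - previous) % modulus


def bits_per_second(previous_octets: int, current_octets: int,
                    interval_seconds: float) -> float:
    """Average throughput between two octet-counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter_delta(previous_octets, current_octets) * 8 / interval_seconds


# Two samples of ifHCInOctets taken 30 seconds apart (the default scrape interval).
print(bits_per_second(1_000_000, 4_750_000, 30))
```

In practice this calculation is usually done on the server side with a rate function over the stored samples. The sketch only makes the underlying arithmetic explicit.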

SNMP monitoring dashboards

By default, Managed Service for Prometheus provides the SNMP Status and SNMP Interface Detail dashboards to visualize monitoring data such as network traffic data in if_mib scenarios.

SNMP Status

The dashboard displays the overall status of the device. You can view monitoring data, such as the uptime, current inbound or outbound traffic, total inbound and outbound traffic, real-time traffic of each port, and traffic change trends.

SNMP Interface Detail

The dashboard displays the working details of each port. You can view monitoring data, such as the port status, whether the port is connected, port rate, MTU configurations, and the rate or packet changes related to unicast, multicast, and broadcast traffic.

Important: Before you use the SNMP Interface Detail dashboard, you must add the required data source in the Variable section.

SNMP alert rules

You can configure the following alert items for SNMP based on the metrics:

  • The interface throughput reaches 80% of the interface speed.
  • The number of dropped packets or errors in outbound traffic exceeds the threshold. The number of dropped packets or errors in inbound traffic exceeds the threshold.
  • The outbound queue length exceeds the threshold.
  • The number of interfaces is changed.
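
The throughput alert item in the list above can be sketched as follows. This Python example is only an illustration of the threshold logic, not the alert rule that Managed Service for Prometheus creates. The function names are hypothetical, and the sketch assumes that the interface speed is taken from ifHighSpeed, which is reported in Mbit/s.

```python
def utilization(bits_per_second: float, if_high_speed_mbps: int) -> float:
    """Fraction of the interface capacity in use (1.0 means fully used)."""
    if if_high_speed_mbps <= 0:
        raise ValueError("if_high_speed_mbps must be positive")
    return bits_per_second / (if_high_speed_mbps * 1_000_000)


def breaches_threshold(in_bps: float, out_bps: float,
                       if_high_speed_mbps: int,
                       threshold: float = 0.8) -> bool:
    """Return True if the busier direction reaches the utilization threshold."""
    busiest = max(in_bps, out_bps)
    return utilization(busiest, if_high_speed_mbps) >= threshold


# A 1 Gbit/s port that receives 850 Mbit/s and sends 100 Mbit/s.
print(breaches_threshold(850e6, 100e6, 1000))
```

The sketch evaluates each direction separately and alerts on the busier one, because a full-duplex port can saturate in one direction while the other is idle.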

Use Managed Service for Prometheus to monitor an SNMP system

Procedure

Procedure 1: Integration center of a Prometheus instance

  1. Log on to the ARMS console.
  2. In the left-side navigation pane, choose Prometheus Service > Prometheus Instances.
  3. Click the name of the Prometheus instance that you want to manage to go to the Integration Center page.

Procedure 2: Integration Center in the left-side navigation pane of the ARMS console

  1. Log on to the ARMS console.
  2. In the left-side navigation pane, click Integration Center. In the Components section, find SNMP and click Add. In the panel that appears, integrate SNMP as prompted.

Step 1: Integrate an SNMP exporter

This section describes how to integrate an SNMP exporter in the integration center of a Prometheus instance.

  1. Install or add the SNMP component.
    • If this is the first time that you install the SNMP component, perform the following operation.
      In the Not Installed section of the Integration Center page, find SNMP and click Install.
      Note: You can click the card to view the common SNMP metrics and dashboard thumbnails in the panel that appears. Due to the complexity of OIDs and MIBs, the metrics listed are for reference only. After you install the SNMP component, you can view the actual metrics.
    • If you have installed the SNMP component, you need to add the component again.

      In the Installed section of the Integration Center page, find SNMP and click Add.

  2. On the Configuration tab in the STEP2 section, configure the parameters and click OK. The following table describes the parameters.
    Parameter | Description
    Instance name | The name of the SNMP exporter. The name must meet the following requirements:
      • The name can contain only lowercase letters, digits, and hyphens (-), and cannot start or end with a hyphen (-).
      • The name must be unique.
      Note: If you do not specify this parameter, the system uses the default name, which consists of the exporter type and a numeric suffix.
    SNMP device IP ADDR | The IP address of the SNMP device that you want to monitor.
    Metrics path | The HTTP path of the SNMP metric from which monitoring data is collected. Default value: /snmp.
    Metrics scrape interval (seconds) | The interval at which Managed Service for Prometheus collects SNMP monitoring data. Default value: 30.
    Note: You can view the monitoring metrics on the Metrics tab in the STEP2 section.
    After you click OK, the system adds a deployment named snmp-exporter-snmp-test-1 to the arms-prom namespace of your ACK cluster and automatically configures data collection jobs. You can view the data collection jobs on the Targets tab of the Service Discovery page. To view the data collection jobs, perform the following steps:
    1. Log on to the ARMS console.
    2. In the left-side navigation pane, choose Prometheus Service > Prometheus Instances.
    3. Click the name of the Prometheus instance. In the left-side navigation pane, click Service Discovery. On the Targets tab, view the data collection jobs.

    You can also click the SNMP component in the Installed section of the Integration Center page. In the panel that appears, you can view information such as targets, metrics, dashboards, service discovery configurations, and exporters. For more information, see Integration center.
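
The parameters in this step can be checked before you enter them in the console. The following Python sketch validates an instance name against the naming rules above and builds the URL from which SNMP data would be scraped. It rests on assumptions: the query parameters target and module and the port 9116 follow the open source SNMP exporter, the managed component may use different values, and the host name snmp-exporter and the function names are hypothetical.

```python
import ipaddress
import re
from urllib.parse import urlencode

# Lowercase letters, digits, and hyphens; no leading or trailing hyphen.
NAME_PATTERN = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def is_valid_instance_name(name: str) -> bool:
    """Check a name against the naming rules of the Instance name parameter."""
    return NAME_PATTERN.fullmatch(name) is not None


def build_probe_url(exporter_host: str, device_ip: str,
                    metrics_path: str = "/snmp", module: str = "if_mib",
                    exporter_port: int = 9116) -> str:
    """Build the scrape URL for one SNMP device (assumed open source layout)."""
    ipaddress.ip_address(device_ip)  # raises ValueError for a malformed address
    if not metrics_path.startswith("/"):
        raise ValueError("metrics path must start with '/'")
    query = urlencode({"target": device_ip, "module": module})
    return f"http://{exporter_host}:{exporter_port}{metrics_path}?{query}"


print(is_valid_instance_name("snmp-test-1"))
# 192.0.2.10 is a documentation-only address used as a placeholder.
print(build_probe_url("snmp-exporter", "192.0.2.10"))
```

The device address is passed as a query parameter because the exporter runs in the cluster and probes the device on behalf of Prometheus. The scrape target itself is the exporter, not the device.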

Step 2: View SNMP dashboard data

By default, Managed Service for Prometheus integrates the SNMP dashboards with Grafana. You do not need to install Grafana.

On the Integration Center page, click the SNMP component in the Installed section. In the panel that appears, click the Dashboards tab to view the thumbnails and hyperlinks of SNMP dashboards. Click a hyperlink to go to the Grafana page and view the dashboard.

Step 3: Configure SNMP alerting

On the Integration Center page, click the SNMP component in the Installed section. In the panel that appears, click the Alerts tab to view all SNMP alert rules configured in Managed Service for Prometheus.

When you install the SNMP component in the integration center, an alert group named snmp_exporter is created and has alert rules added by default. You only need to manually modify the threshold values and enable the alert rules. You can also create alert rules based on your business requirements. For more information, see Create an alert rule for a Prometheus instance.

FAQ: What do I do if the SNMP exporter fails to collect metric data?

Metric data may fail to be collected if a network error occurs between the SNMP exporter and the device, or if the device does not provide the requested metric. If metrics fail to be collected, perform the following operations to troubleshoot the failure.

  1. Check the status of Prometheus targets.
    1. Log on to the ARMS console.
    2. In the left-side navigation pane, choose Prometheus Service > Prometheus Instances.
    3. Click the Prometheus instance. In the left-side navigation pane, click Service Discovery. On the Service Discovery page, click the Targets tab and click a target to view the status in the State column.
    • If the target is displayed on the Unhealthy tab, check the status of the snmp-exporter pod.
    • If the target is running as expected, go to the next step.
  2. View the logs of the snmp-exporter pod and check whether errors are reported in the logs.
    • You can troubleshoot network errors in the logs as prompted.
    • If only one SNMP metric cannot be collected and no error is reported in the logs, the metric may not exist in the device. You can use SNMPwalk for troubleshooting.
      1. You can use SNMPwalk to obtain the same data that the SNMP exporter collects. Some Linux distributions do not provide SNMPwalk by default. In this case, install the package that provides it, such as net-snmp-utils.
      2. On a machine that can connect to the SNMP device, use SNMPwalk to obtain the raw data.
        snmpwalk -v 2c -c public snmp_dev_ip OID
        # Replace the snmp_dev_ip and OID fields with the IP address of the device and the OID of the missing metric.
        # -v indicates the SNMP version.
        # -c indicates the community string. Default value: public.
        # You can use snmp_dev_ip:port to specify a non-default port. Default port: 161.

        If SNMPwalk fails to obtain the metric data, contact the technical support of the device vendor to check whether the device provides the metric.
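
If you need to check many OIDs, the SNMPwalk output can be inspected with a script. The following Python sketch parses output lines and reports whether a metric appears to be missing. It is a minimal illustration under assumptions: the sample lines follow the usual "name = TYPE: value" layout of the net-snmp tools and are not captured from a real device, the exact output depends on the tool version and options, and the names Varbind, parse_snmpwalk, and metric_is_missing are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Matches lines such as 'IF-MIB::ifHCInOctets.1 = Counter64: 123456'.
# The 'TYPE: ' part is optional because error lines do not contain it.
LINE = re.compile(r"^(?P<name>\S+) = (?:(?P<type>[A-Za-z0-9-]+): )?(?P<value>.*)$")


class Varbind(NamedTuple):
    name: str
    type: Optional[str]
    value: str


def parse_snmpwalk(output: str) -> List[Varbind]:
    """Parse SNMPwalk output into rows; lines that do not match are skipped."""
    rows = []
    for line in output.splitlines():
        match = LINE.match(line)
        if match:
            rows.append(Varbind(match["name"], match["type"], match["value"]))
    return rows


def metric_is_missing(rows: List[Varbind]) -> bool:
    """Return True if the walk returned nothing or only 'No Such ...' answers."""
    return not rows or all(row.value.startswith("No Such") for row in rows)


sample = (
    "IF-MIB::ifHCInOctets.1 = Counter64: 123456\n"
    "IF-MIB::ifHCInOctets.2 = Counter64: 0"
)
print(metric_is_missing(parse_snmpwalk(sample)))
```

A result of True from metric_is_missing suggests that the device does not expose the OID, which matches the case in which you contact the vendor.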