Perform risk control on abnormal behaviors of a system

Last Updated: May 14, 2020

Background

A user system can show abnormal metrics in several situations: the CPU utilization of the O&M system spikes, the system is flooded with illegal content, or some users repeatedly exploit promotional offers. Taking preventive measures and raising real-time alerts on abnormal metrics through Machine Learning Platform for AI (PAI) can greatly reduce the system's exposure to these risks.

Business pain points

User systems lack real-time, effective means of monitoring their metrics, which limits their intelligent defense capability.

Solution

PAI provides a set of classification algorithms for metric monitoring. The solution frames abnormal-metric detection as a binary classification problem, trains a monitoring model, and deploys the model to an online system where it can be called in real time. This enables near-line risk control.

  1. Required knowledge: knowledge of the classic algorithms in machine learning, especially feature engineering and binary classification algorithms.

  2. Development cycle: one to two days.

  3. Required data: one thousand labeled data items, including abnormal data and normal data.

Data

The following experiment uses a system-level monitoring log with 22,544 data items, of which 9,711 are abnormal data items.

Data:

Parameter       Description
protocol_type   The protocol used for network connection. Valid values: TCP, ICMP, and UDP.
service         The service protocol. Valid values: HTTP, Finger, POP, Private, and SMTP.
flage           Valid values: SF, RSTO, and REJ.
a2-a38          Different system metrics.
class           The label field. “normal” indicates a normal sample, and “anomaly” indicates an abnormal sample.

Procedure

Log on to PAI Studio at https://pai.data.aliyun.com/console

The data and experiment environment for this solution are built into the corresponding template on the homepage.

Open the experiment:

1. Data source

The data source is the data described in the “Data” section.

2. Feature engineering

The One-Hot Encoding component converts string-type features, such as protocol_type and service, into numeric features. One-hot encoding is one of the most common ways to encode categorical data in machine learning.
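To make the idea concrete, here is a minimal sketch of one-hot encoding in plain Python. It is not the PAI component itself: the function name `one_hot_encode`, the sample rows, and the generated column names (for example `protocol_type_TCP`) are assumptions chosen for illustration.

```python
def one_hot_encode(rows, column):
    """Replace one string-valued column with a 0/1 column per distinct value."""
    # Collect the distinct categories seen in the data, in a stable order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category.
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical sample rows; the metric values are made up.
rows = [
    {"protocol_type": "TCP", "a2": 491},
    {"protocol_type": "UDP", "a2": 146},
    {"protocol_type": "ICMP", "a2": 0},
]
encoded = one_hot_encode(rows, "protocol_type")
```

Each input row keeps its numeric fields and gains one indicator column per protocol, exactly one of which is set to 1.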

The Normalization component scales every feature into the range of 0 to 1, which removes the effect of differing units and magnitudes across features. The following figure shows the normalized data.
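Min-max scaling is one common way to map a feature onto the range of 0 to 1. The sketch below assumes that formula; whether the Normalization component uses exactly this method is an assumption, and the function name `min_max_normalize` and the sample values are illustrative only.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no information; map it to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical metric column with a wide value range.
normalized = min_max_normalize([0, 50, 200])
```

After scaling, a metric measured in the hundreds and a metric measured in fractions contribute on the same scale during training.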

Use the SQL Script component to convert the target column to numeric values: samples labeled “anomaly” become 1, and samples labeled “normal” become 0.

  select (case class when 'anomaly' then 1 else 0 end) as class from ${t1};

3. Model training

Binary logistic regression, a classic machine learning algorithm, is effective for training a monitoring model from normal and abnormal samples.
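For readers who want to see what such a model does, the following is a minimal sketch of binary logistic regression trained with batch gradient descent in plain Python. It does not reproduce the PAI component or its parameters: the function names, the learning rate, the epoch count, and the toy data are all assumptions for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    """Estimated probability that sample x is abnormal (label 1)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic_regression(X, y, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, label in zip(X, y):
            error = predict_proba(weights, bias, x) - label
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        # Move against the average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical normalized features: label 0 is normal, label 1 is abnormal.
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.8, 0.9], [0.9, 0.7], [0.85, 0.95]]
y = [0, 0, 0, 1, 1, 1]
weights, bias = train_logistic_regression(X, y)
```

Calling `predict_proba(weights, bias, x)` on a new normalized sample returns a score between 0 and 1 that can be compared against a threshold to raise an alert.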

4. Model evaluation

PAI provides the Binary Classification Evaluation component, which evaluates model performance using metrics such as AUC, KS, and F1 score. The model in this experiment reaches a prediction accuracy of more than 90%.
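The three metrics named above can be computed from true labels and predicted scores. The sketch below uses their standard definitions in plain Python; it is not the component's implementation, and the function names, the 0.5 threshold for F1, and the sample labels and scores are assumptions for illustration.

```python
def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count as half)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def ks(labels, scores):
    """Largest gap between true-positive rate and false-positive rate over all thresholds."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    for threshold in sorted(set(scores)):
        tpr = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold) / n_pos
        fpr = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold) / n_neg
        best = max(best, tpr - fpr)
    return best


def f1_score(labels, scores, threshold=0.5):
    """Harmonic mean of precision and recall at a fixed score threshold."""
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    fn = sum(1 for l, s in zip(labels, scores) if l == 1 and s < threshold)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (1 = abnormal) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
```

AUC and KS summarize how well the scores separate the two classes across all thresholds, while F1 describes performance at the single threshold used to trigger alerts.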

Summary

PAI provides comprehensive functions such as feature encoding, model training, and model evaluation, allowing you to create a metric anomaly monitoring model by extracting and labeling the features of abnormal behaviors of the target system.