You can modify and enable the YML configuration of a shipper to complete a data collection task. This topic describes how to prepare a YML configuration file for a shipper and describes the parameters in the YML configuration file.

Prerequisites

An Alibaba Cloud Elasticsearch cluster is created, and the Auto Indexing feature is enabled for the cluster. For more information about how to create an Elasticsearch cluster, see Create an Alibaba Cloud Elasticsearch cluster.

For security purposes, Alibaba Cloud Elasticsearch disables the Auto Indexing feature by default. However, Beats depends on this feature. If you select Elasticsearch for Output when you install a shipper, you must enable the Auto Indexing feature. For more information, see Configure the YML file.

Note Open source Beats provides multiple modules, but Alibaba Cloud Beats does not support configuring these modules separately. To use a module, configure it in the YML configuration file of the corresponding shipper. For example, to enable the system module for a Metricbeat shipper, add the following configuration to metricbeat.yml:
metricbeat.modules:
- module: system
  metricsets: ["diskio","network"]
  diskio.include_devices: []
  period: 1s

Filebeat configuration

You can specify filebeat.inputs in filebeat.yml to determine how Filebeat locates and processes input data sources. The following example shows a simple input configuration.
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /opt/test/logs/t1.log
    - /opt/test/logs/t2/*
  fields:
    alilogtype: usercenter_serverlog
Notice
  • If you specify Output when you install a shipper, do not configure the output part in Shipper YML Configuration. If you configure it again, the system reports a shipper installation error.
  • Each input data source starts with a hyphen (-). You can use multiple hyphens to specify multiple input data sources.
Parameter Description
type The input type. Examples of valid values: stdin, redis, tcp, and syslog. Default value: log.
paths The paths of the logs you want to monitor. You can specify a file or a directory, such as /log/nginx.log or /log/*. The specified file or directory is mapped to Docker.
Notice If you specify a directory, the path must contain an asterisk (*) for wildcard matching, such as /*. You must also make sure that all files in the specified directory are of the same type as the files from which you want to collect data.
enabled Specifies whether the configuration takes effect. Valid values:
  • true: The configuration takes effect.
  • false: The configuration does not take effect.
fields Optional. Custom fields that are added to each output log. Indent each field by two spaces below this parameter. For example, enter alilogtype: usercenter_serverlog to add this field to each output log and identify the type of the log source. If logs are shipped to Logstash, they can be classified and processed based on this field.

For more information, see Log input in the open source Filebeat documentation.
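As noted above, each input starts with a hyphen, so one filebeat.yml can define several inputs. The following is a minimal sketch, assuming two hypothetical log locations and hypothetical alilogtype values. Adjust them to your environment.

```yaml
filebeat.inputs:
# First input: application logs, tagged so that downstream processing can tell them apart.
- type: log
  enabled: true
  paths:
    - /opt/test/logs/app/*.log          # hypothetical path
  fields:
    alilogtype: app_serverlog           # hypothetical value
# Second input: defined but inactive. Set enabled to true to activate it.
- type: log
  enabled: false
  paths:
    - /opt/test/logs/nginx/access.log   # hypothetical path
  fields:
    alilogtype: nginx_accesslog         # hypothetical value
```

Giving each input its own alilogtype value keeps the sources distinguishable after the logs are merged into one output.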

Metricbeat configuration

Metricbeat is a lightweight shipper that collects system and service statistics. You can specify metricbeat.modules in metricbeat.yml to configure a module. The following example shows a module configuration.
metricbeat.modules:
- module: system
  metricsets: ["diskio","network"]
  enabled: true
  hosts: ["http://XX.XX.XX.XX/"]
  period: 10s
  fields:
    dc: west
  tags: ["tag"]
Notice If you specify Output when you install a shipper, do not configure the output part in Shipper YML Configuration. If you configure it again, the system reports a shipper installation error.
Parameter Description
module The name of the module you want to run. For more information about supported modules, see Modules.
metricsets The metricsets you want to execute. For more information about metricsets, see Modules.
enabled Specifies whether the configuration takes effect. The value true indicates that the configuration takes effect. The value false indicates that the configuration does not take effect.
period Specifies the frequency at which the metricsets are executed. If the system is inaccessible, Metricbeat returns an error for each period.
hosts The hosts from which you want to obtain information. This parameter is optional.
fields The fields that are sent with the metricset event. This parameter is optional.
tags The tags that are sent with the metricset event. This parameter is optional.

For more information, see open source Metricbeat documentation.
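Because metricbeat.modules is a list, the same module can appear more than once with different metricsets and periods. The sketch below assumes that you want frequent CPU and memory samples and less frequent file system samples. The period values and the dc field are illustrative, not recommendations.

```yaml
metricbeat.modules:
# Fast-changing metrics, collected every 10 seconds.
- module: system
  metricsets: ["cpu", "memory"]
  enabled: true
  period: 10s
  fields:
    dc: west          # illustrative custom field
# Slow-changing metrics, collected once a minute.
- module: system
  metricsets: ["filesystem"]
  enabled: true
  period: 1m
```

Splitting metricsets by period in this way reduces the volume of events for metrics that rarely change.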

Heartbeat configuration

Heartbeat is a lightweight shipper that you can install on a remote server to periodically check the status of your services and determine whether they are available. Heartbeat differs from Metricbeat: Metricbeat checks whether your services are running, whereas Heartbeat checks whether they are available.

You can specify heartbeat.monitors in heartbeat.yml to specify the services you want to monitor.
Note You can configure only the services that you want to monitor for Heartbeat. To ensure the availability of Heartbeat, we recommend that you deploy at least two Elastic Compute Service (ECS) instances.
heartbeat.monitors:
- type: http
  name: ecs_monitor
  enabled: true
  urls: ["http://localhost:9200"]
  schedule: '@every 5s'
  fields:
    dc: west
Notice If you specify Output when you install a shipper, do not configure the output part in Shipper YML Configuration. If you configure it again, the system reports a shipper installation error.
Parameter Description
type The monitor type. Valid values: icmp, tcp, and http.
name The monitor name. In the exported fields, this value appears under the monitor field as the job name, and the value of type appears as the job type.
enabled Specifies whether the configuration takes effect. The value true indicates that the configuration takes effect. The value false indicates that the configuration does not take effect.
urls The servers to which you want to connect. This parameter is optional.
schedule The task schedule. If you set this parameter to @every 5s, the system runs the task every 5 seconds from the time Heartbeat is started. If you set this parameter to */5 * * * * * *, the system runs the task every 5 seconds.
fields Optional. Custom fields that are added to the output as additional information.

For more information, see open source Heartbeat documentation.
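The tcp and icmp monitor types follow the same pattern as the http example. The following is a minimal sketch, assuming the hosts option that open source Heartbeat uses for these monitor types (it is not described in the table above) and hypothetical monitor names and targets. It also shows the cron-like form of schedule.

```yaml
heartbeat.monitors:
# TCP check: succeeds if a connection to the port can be opened.
- type: tcp
  name: es_port_monitor            # hypothetical name
  enabled: true
  hosts: ["localhost:9200"]        # hypothetical target
  schedule: '*/5 * * * * * *'      # cron-like form: every 5 seconds
# ICMP check: pings the host.
- type: icmp
  name: ecs_ping_monitor           # hypothetical name
  enabled: true
  hosts: ["localhost"]             # hypothetical target
  schedule: '@every 10s'           # every 10 seconds, counted from Heartbeat startup
```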

Auditbeat configuration

Auditbeat is a lightweight service that collects audit logs from the Linux audit framework and monitors file integrity. Auditbeat combines relevant messages into an event to generate structured data for analytics and can be seamlessly integrated with Logstash, Elasticsearch, and Kibana.
Notice Auditbeat is based on the Linux audit framework and requires an OS kernel version of 3.14 or later. The auditd service must be stopped before you use Auditbeat. You can run the service auditd status command to query the status of the service.

When you configure an Auditbeat shipper, you can specify auditbeat.modules in auditbeat.yml. auditbeat.yml consists of two parts: module and output. If you want to enable a module, you must add specific parameters to auditbeat.yml. In the following configuration example, the auditd and file_integrity modules are used.

auditbeat.modules:
- module: auditd
  audit_rules: |
    -w /etc/passwd -p wa -k identity
    -a always,exit -F arch=b32 -S open,creat,truncate,ftruncate,openat,open_by_handle_at -F exit=-EPERM -k access
- module: file_integrity
  paths:
  - /bin
  - /usr/bin
  - /sbin
  - /usr/sbin
  - /etc
Notice If you specify Output when you install a shipper, do not configure the output part in Shipper YML Configuration. If you configure it again, the system reports a shipper installation error.

For more information about auditbeat.yml configuration, see open source Auditbeat documentation. For more information about module configuration, see Modules.
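The access rule in the example above matches only 32-bit system calls (arch=b32). On a 64-bit host, you would typically add a matching arch=b64 rule. The sketch below is modeled on the sample rules in the open source Auditbeat reference configuration. Treat it as a starting point and verify the rules against your kernel before you enable them.

```yaml
auditbeat.modules:
- module: auditd
  audit_rules: |
    # Watch /etc/passwd for writes and attribute changes.
    -w /etc/passwd -p wa -k identity
    # Record file access calls that fail with EPERM, for 32-bit and 64-bit processes.
    -a always,exit -F arch=b32 -S open,creat,truncate,ftruncate,openat,open_by_handle_at -F exit=-EPERM -k access
    -a always,exit -F arch=b64 -S open,creat,truncate,ftruncate,openat,open_by_handle_at -F exit=-EPERM -k access
```

Both rules share the access key (-k access), so the resulting events can be filtered together.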