If you want to view and analyze Apache log data, you can use Filebeat to collect the data. Then, use Alibaba Cloud Logstash to filter the data and transfer the processed data to an Alibaba Cloud Elasticsearch cluster for analytics. This topic describes how to use Filebeat to collect Apache log data.

Procedure

  1. Step 1: Make preparations
  2. Step 2: Configure and install a Filebeat shipper
  3. Step 3: Configure a Logstash pipeline to filter and synchronize data
  4. Step 4: View the collected data

Step 1: Make preparations

  1. Create an Elasticsearch cluster and a Logstash cluster that are of the same version and are deployed in the same virtual private cloud (VPC).
  2. Enable the Auto Indexing feature for the Elasticsearch cluster.
    For security purposes, Alibaba Cloud Elasticsearch disables the Auto Indexing feature by default. However, Beats depends on this feature. If you select Elasticsearch for Output when you create a shipper, you must enable the Auto Indexing feature. For more information, see Configure the YML file.
  3. Create an Alibaba Cloud Elastic Compute Service (ECS) instance in the same VPC as the Elasticsearch cluster and Logstash cluster.
    For more information, see Create an instance by using the wizard.
    Important
    • Beats supports only Alibaba Cloud Linux, Red Hat Enterprise Linux (RHEL), and CentOS.
    • Alibaba Cloud Filebeat can be used to collect logs only from an ECS instance that resides in the same region and is deployed in the same VPC as an Alibaba Cloud Elasticsearch cluster and an Alibaba Cloud Logstash cluster. Alibaba Cloud Filebeat cannot be used to collect logs from a source that is deployed on the Internet.
  4. Install HTTP Daemon (HTTPd) on the ECS instance.
    To facilitate the analysis and visualization of Apache log data, we recommend that you define JSON as the format of the log data in the httpd.conf file. For more information, see Step 1: Install and configure Apache HTTP Server. In this example, the following configurations are used (a sample log line that is produced by this format is shown after this list):
    LogFormat "{\"@timestamp\":\"%{%Y-%m-%dT%H:%M:%S%z}t\",\"client_ip\":\"%{X-Forwarded-For}i\",\"direct_ip\":\"%a\",\"request_time\":%T,\"status\":%>s,\"url\":\"%U%q\",\"method\":\"%m\",\"http_host\":\"%{Host}i\",\"server_ip\":\"%A\",\"http_referer\":\"%{Referer}i\",\"http_user_agent\":\"%{User-agent}i\",\"body_bytes_sent\":\"%B\",\"total_bytes_sent\":\"%O\"}"  access_log_json
    # Change the original CustomLog configuration to CustomLog "logs/access_log" access_log_json.
  5. Install Cloud Assistant and Docker on the ECS instance.
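With the preceding LogFormat configuration, httpd writes each request to logs/access_log as a single JSON object. The following line is a sketch of what an entry might look like. All field values are hypothetical and depend on your environment:

  {"@timestamp":"2023-10-08T16:33:15+0800","client_ip":"-","direct_ip":"192.0.2.10","request_time":0,"status":200,"url":"/index.html","method":"GET","http_host":"198.51.100.20","server_ip":"198.51.100.20","http_referer":"-","http_user_agent":"curl/7.61.1","body_bytes_sent":"3630","total_bytes_sent":"3854"}

If the entries in the access log are in this format, the json filter plug-in that is configured in Step 3 can parse the log data.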

Step 2: Configure and install a Filebeat shipper

  1. Log on to the Elasticsearch console.
  2. Navigate to the Beats Data Shippers page.
    1. In the top navigation bar, select a region.
    2. In the left-side navigation pane, click Beats Data Shippers.
    3. Optional: If this is the first time you go to the Beats Data Shippers page, read the message that appears and click OK to authorize the system to create a service-linked role for your account.
      Important Beats depends on the service-linked role and the rules specified for the role to collect data from various data sources. Do not delete the service-linked role. Otherwise, Beats cannot work as expected. For more information, see Overview of the Elasticsearch service-linked role.
  3. In the Create Shipper section, move the pointer over Filebeat and click ECS Logs.
  4. Configure and install a shipper.
    For more information, see Collect the logs of an ECS instance and Prepare a YML configuration file for a shipper. The following notes describe the configurations that are used in this example.
    Note
    • You must select Logstash for Output and select the ID of your Logstash cluster. Because the output is the Logstash cluster, you do not need to specify an output in Shipper YML Configuration.
    • You must set Filebeat Log File Path to the path that stores the data source. In addition, you must enable log collection and configure the same path in the log input section of Shipper YML Configuration. A minimal sketch of such an input configuration is shown after this procedure.
  5. Click Next.
  6. In the Install Shipper step, select the ECS instance on which you want to install the shipper.
    Note The selected ECS instance must meet the preceding prerequisites.
  7. Enable the shipper and check whether the shipper is installed.
    1. Click Enable.
      Then, the Enable Shipper message appears.
    2. Click Back to Beats Shippers. In the Manage Shippers section of the Beats Data Shippers page, view the installed shipper.
    3. After the state of the shipper changes to Enabled 1/1, click View Instances in the Actions column.
    4. In the View Instances panel, check whether the shipper is installed on the ECS instance. If the value of Installed Shippers is Heartbeat Normal, the shipper is installed.
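The log input section in Shipper YML Configuration might resemble the following minimal sketch. The path /var/log/httpd/access_log is an assumption based on the default access log location of httpd on CentOS; replace it with the actual path of your Apache access log. No output section is required because Logstash is selected for Output:

  filebeat.inputs:
    # Collect the Apache access log that httpd writes in the JSON format.
    - type: log
      enabled: true
      paths:
        - /var/log/httpd/access_log

Depending on the Filebeat version of the shipper, the input section may be named filebeat.inputs or filebeat.prospectors. Follow the template that is provided in the console.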

Step 3: Configure a Logstash pipeline to filter and synchronize data

  1. In the left-side navigation pane of the Alibaba Cloud Elasticsearch console, click Logstash Clusters.
  2. On the page that appears, find your Logstash cluster and click Manage Pipelines in the Actions column.
  3. On the Pipelines page, click Create Pipeline.
  4. Configure a pipeline.
    For more information, see Use configuration files to manage pipelines. The following configurations are used in this example:
    input {
      beats {
        port => 8000
      }
    }
    filter {
      json {
        source => "message"
        remove_field => ["@version", "prospector", "beat", "source", "input", "offset", "fields", "host", "message"]
      }
    }
    output {
      elasticsearch {
        hosts => ["http://es-cn-mp91cbxsm00******.elasticsearch.aliyuncs.com:9200"]
        user => "elastic"
        password => "<your_password>"
        index => "<your_index>"
      }
    }
    The parameters in the preceding configuration are described as follows:
    • input: Receives the data that is collected by the shipper. In this example, the beats input plug-in listens on port 8000.
    • filter: Filters the collected data. The json plug-in is used to decode the message field. The remove_field setting specifies the fields that are removed after the message field is parsed.
      Note The configurations in the filter part apply only to the current testing scenario. You can configure the filter part based on your business requirements. For information about supported filter plug-ins, see Filter plugins.
    • output: Transfers data to your Elasticsearch cluster. The following parameters are involved:
      • hosts: Set this parameter to the endpoint of your Elasticsearch cluster. You can obtain the endpoint on the Basic Information page of the cluster. For more information, see View the basic information of a cluster.
      • <your_password>: Replace <your_password> with the password that is used to access your Elasticsearch cluster.
      • <your_index>: Replace <your_index> with the name of the index to which the data is transferred.
    A sketch of a document that this pipeline might write to the index is shown following this procedure.
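Assuming the JSON log format that is configured in Step 1, a document that this pipeline writes to the destination index might resemble the following sketch. The field values are hypothetical, and the fields that are listed in remove_field, such as message, host, and beat, do not appear because they are removed by the filter. Actual documents may also contain extra metadata fields, such as tags, that are added by the beats input:

  {
    "@timestamp": "2023-10-08T16:33:15+0800",
    "client_ip": "-",
    "direct_ip": "192.0.2.10",
    "request_time": 0,
    "status": 200,
    "url": "/index.html",
    "method": "GET",
    "http_host": "198.51.100.20",
    "server_ip": "198.51.100.20",
    "http_referer": "-",
    "http_user_agent": "curl/7.61.1",
    "body_bytes_sent": "3630",
    "total_bytes_sent": "3854"
  }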

Step 4: View the collected data

  1. Log on to the Kibana console of your Elasticsearch cluster and go to the homepage of the Kibana console as prompted.
    For more information about how to log on to the Kibana console, see Log on to the Kibana console.
    Note In this example, an Elasticsearch V6.7.0 cluster is used. Operations on clusters of other versions may differ. The actual operations in the console prevail.
  2. In the left-side navigation pane of the page that appears, click Dev Tools.
  3. On the Console tab of the page that appears, run the following command to view the collected data:
    GET <your_index>/_search
    Note Replace <your_index> with the index name that you configured in the output part of the Logstash pipeline. A sample query that filters the collected data on a specific field is provided at the end of this topic.
  4. In the left-side navigation pane, click Discover. On the page that appears, specify a period in the upper-right corner. Then, view the details of the collected data within the specified period.
    Note Before you view the collected data, make sure that an index pattern is created for the index specified by <your_index>. To create an index pattern in the Kibana console, click Management in the left-side navigation pane. On the page that appears, click Index Patterns in the Kibana section and then click Create index pattern. Follow the instructions to create the index pattern.
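If you want to check a specific field on the Console tab of the Dev Tools page, you can run a query similar to the following sketch. The example assumes the JSON log format that is configured in Step 1, in which the HTTP status code is written to the numeric status field. Replace <your_index> with your index name and adjust the field name and value based on your log format:

  GET <your_index>/_search
  {
    "query": {
      "term": { "status": 404 }
    }
  }

The query returns the log entries of requests that returned an HTTP 404 status code.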