Logstash is an open source data collection engine with real-time pipelining capabilities. It was originally used to write log data to the Elastic Stack. As the open source community has grown, Logstash has evolved into a tool that can dynamically unify data from disparate sources and normalize the data into destinations of your choice.

For example, AnalyticDB for MySQL can be accessed over Java Database Connectivity (JDBC). You can use the logstash-output-jdbc plug-in of Logstash to import log data to AnalyticDB for MySQL for subsequent analysis. However, JDBC writes records one at a time. If you use JDBC to write large amounts of log data to AnalyticDB for MySQL Data Warehouse Edition (V3.0), the system does not provide high write performance and consumes large amounts of CPU resources. To address this issue, AnalyticDB for MySQL provides the optimized logstash-output-analyticdb plug-in, which is dedicated to writing log data to AnalyticDB for MySQL in batches.

The logstash-output-analyticdb plug-in provides five times the write performance of the logstash-output-jdbc plug-in at lower CPU utilization.
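
To illustrate the difference, the following sketch shows what batching means at the SQL level. This is a simplified illustration; the exact statements that the plug-in generates may differ:

-- Single-record writes (JDBC style): one statement and one network round trip per record
INSERT INTO log (host, timestamp, message) VALUES (?, ?, ?);
INSERT INTO log (host, timestamp, message) VALUES (?, ?, ?);

-- Batch write: many records in one multi-value statement and one round trip
INSERT INTO log (host, timestamp, message) VALUES (?, ?, ?), (?, ?, ?), (?, ?, ?);

Fewer round trips and statement executions are what reduce CPU consumption and improve write throughput.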

Install Logstash

For information about how to install Logstash, see Installing Logstash.
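
If the logstash-output-analyticdb plug-in is not bundled with your Logstash installation, you can install it by using the standard plug-in manager. The following command is a sketch and assumes that the plug-in is reachable from your environment:

bin/logstash-plugin install logstash-output-analyticdb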

Configure Logstash

Create a configuration file named logstash-analyticdb.conf in the config directory. You can also specify a custom name for the file. logstash-analyticdb.conf contains the following content:

input {
    stdin { }
}
output {
    analyticdb {
        driver_class => "com.mysql.jdbc.Driver"
        connection_string => "jdbc:mysql://HOSTNAME:PORT/DATABASE?user=USER&password=PASSWORD"
        statement => [ "INSERT INTO log (host, timestamp, message) VALUES(?, ?, ?)", "host", "@timestamp", "message" ]
        commit_size => 4194304
    }
}

Parameters:

  • connection_string: the URL that is used to connect to AnalyticDB for MySQL.
  • statement: an array in which the first element is the INSERT statement and the remaining elements are the names of the event fields whose values are bound, in order, to the question-mark (?) placeholders in the statement.

Other parameters:

  • max_flush_exceptions: the maximum number of retries that are allowed if an exception occurs during a data write. Default value: 100.
  • skip_exception: specifies whether to skip exceptions. Default value: false. By default, if a data import task still fails after the maximum number of retries specified by max_flush_exceptions is reached, an exception is thrown and the data import is terminated. If you set this parameter to true and all the retries fail, the exception is skipped and written to a log, and the import continues.
  • flush_size: the maximum number of data records that can be buffered at a time. This parameter is used together with the commit_size parameter.
  • commit_size: the maximum amount of data that can be buffered at a time. This parameter is used together with the flush_size parameter. Data write tasks are submitted when the upper limits are reached. A configuration sketch that sets these parameters follows this list.
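
The following output block is a sketch that shows how these optional parameters might be set together. The values are illustrative assumptions, not recommendations:

output {
    analyticdb {
        driver_class => "com.mysql.jdbc.Driver"
        connection_string => "jdbc:mysql://HOSTNAME:PORT/DATABASE?user=USER&password=PASSWORD"
        statement => [ "INSERT INTO log (host, timestamp, message) VALUES(?, ?, ?)", "host", "@timestamp", "message" ]
        commit_size => 4194304          # submit a write when the buffered data reaches this size (4194304 = 4 MB, assuming bytes)
        flush_size => 1000              # or when this many records are buffered (illustrative value)
        max_flush_exceptions => 100     # retry a failing write at most 100 times (the default)
        skip_exception => true          # after all retries fail, log the exception and continue instead of terminating
    }
}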

The configuration example is provided only for reference. You must configure the logstash-analyticdb.conf file based on your business needs. For more information about the configurations related to AnalyticDB for MySQL, visit GitHub. For more information about configurations and rules of Logstash, see the Logstash documentation.

After you set the preceding parameters, the configuration is complete.

Start a task

Run the following command in the installation directory of Logstash to start the task:

bin/logstash -f config/logstash-analyticdb.conf
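
Because the input in the example configuration reads from stdin, you can pipe a test record into the task to verify that data is written to AnalyticDB for MySQL. This is a sketch; the message text is arbitrary:

echo "test message" | bin/logstash -f config/logstash-analyticdb.conf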

Precautions

Before you write data to AnalyticDB for MySQL, we recommend that you run the following command to upgrade Logstash to the latest version:

bin/logstash-plugin update logstash-output-analyticdb
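
To confirm the installed version after the upgrade, you can list the plug-ins and filter for the plug-in name. This check is optional:

bin/logstash-plugin list --verbose | grep analyticdb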