The Secret of Performance Improvement of Logtail's Multi-line Log Collection by 7 Times

By Yitao

1. Background

In the field of log analysis, Logtail, a widely used log collection tool, can significantly enhance overall efficiency with any performance improvements. Recently, during performance testing of Logtail, the author noticed that when the regular expression at the beginning of the line is enabled to process multi-line logs, the collection performance deteriorates. What exactly causes this phenomenon? Next, let's explore the secrets behind the performance improvement of Logtail's multi-line log collection.

2. Analysis

To understand this phenomenon, you must first understand how Logtail works when processing multi-line logs. Logtail's multi-line log merging feature aggregates scattered multi-line data into complete events based on a specific log format. The following process shows how this feature works:

The user specifies the regular expression at the beginning of the line.
Logtail uses the regular expression to match the beginning of each log line.
If a line does not match, Logtail continues to wait until it finds the regular expression at the beginning of a matching line.

For example, assume that we have the following log format. Generally, we specify cnt.* as the regular expression at the beginning of the line. Logtail uses this regular expression to match each line and merges these single-line logs into a complete multi-line log.

cnt:13472391,thread:2,log:Exception in thread "main" java.lang.NullPointerExceptionat  com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitle
    at com.example.myproject.Book.getTitle
    at com.example.myproject.Book.getTitle
    at com.example.myproject.Book.getTitle

Logtail uses the boost::regex_match function to perform the regular expression match. This regular function performs a full match on the input log buffer based on the input regular expression reg. For example, in the preceding log, cnt.* matches all 1072 characters in the first line.

bool BoostRegexMatch(const char* buffer, size_t size, const boost::regex& reg, string& exception) {
    // ...
    if (boost::regex_match(buffer, buffer + size, reg))
        return true;
    // ...
}

The author wrote the following test code and found that as the length of the log that is not related to the regular expression at the beginning of the line increases (the part matched by .*), the length of boost::regex_match also grows linearly.

static void BM_Regex_Match(int batchSize) {
    std::string buffer = "cnt:";
    std::string regStr = "cnt.*";
    boost::regex reg(regStr);
    std::ofstream outFile("BM_Regex_Match.txt", std::ios::trunc);
    outFile.close();

    for (int i = 0; i < 1000; i++) {
        std::ofstream outFile("BM_Regex_Match.txt", std::ios::app);
        buffer += "a";

        int count = 0;
        uint64_t durationTime = 0;
        for (int i = 0; i < batchSize; i++) {
            count++;
            uint64_t startTime = GetCurrentTimeInMicroSeconds();
            if (!boost::regex_match(buffer, reg)) {
                std::cout << "error" << std::endl;
            }
            durationTime += GetCurrentTimeInMicroSeconds() - startTime;
        }
        outFile << i << '\t' << "durationTime: " << durationTime << std::endl;
        outFile << i << '\t' << "process: " << formatSize(buffer.size() * (uint64_t)count * 1000000 / durationTime)
                << std::endl;
        outFile.close();
    }
}

int main(int argc, char** argv) {
    logtail::Logger::Instance().InitGlobalLoggers();
    std::cout << "BM_Regex_Match" << std::endl;
    BM_Regex_Match(10000);
    return 0;
}

At this time, we need to note that when we use the regular expression at the beginning of the line, we usually only need to match the beginning part of a single-line log. For example, in the cnt log, we don't need the entire .* part, as matching this part consumes unnecessary performance. This effect is particularly noticeable when the log is very long.

In fact, the Boost library provides the boost::regex_search function. Setting the appropriate flags (such as boost::match_continuous) can achieve the requirement of matching only prefixes. This is exactly what is needed for matching the regular expression at the beginning of the line. Let's take a look at how to use boost::regex_search:

bool BoostRegexSearch(const char* buffer, size_t size, const boost::regex& reg, string& exception) {
    // ...
    if (boost::regex_search(buffer, buffer + size, what, reg, boost::match_continuous)) {
        return true;
    }
    // ...
}

In Logtail, due to the existing implementations of the regular expression at the beginning of the line, the user's regular expressions have .* suffixes. We can automatically remove the .* and add ^ at the beginning of the regular expression to improve matching efficiency.

Like boost::regex_match, the author also tested boost::regex_search based on the length of the log. It can be found that the time consumed by boost::regex_search remains relatively stable and does not increase with the length of the log.

static void BM_Regex_Search(int batchSize) {
    std::string buffer = "cnt:";
    std::string regStr = "^cnt";
    boost::regex reg(regStr);
    std::ofstream outFile("BM_Regex_Search.txt", std::ios::trunc);
    outFile.close();

    for (int i = 0; i < 1000; i++) {
        std::ofstream outFile("BM_Regex_Search.txt", std::ios::app);
        buffer += "a";

        int count = 0;
        uint64_t durationTime = 0;
        for (int i = 0; i < batchSize; i++) {
           count++;
           uint64_t startTime = GetCurrentTimeInMicroSeconds();
           if (!boost::regex_search(buffer, reg)) {
               std::cout << "error" << std::endl;
            }
            durationTime += GetCurrentTimeInMicroSeconds() - startTime;
        }

        outFile << i << '\t' << "durationTime: " << durationTime << std::endl;
        outFile << i << '\t' << "process: " << formatSize(buffer.size() * (uint64_t)count * 1000000 / durationTime)
                << std::endl;
        outFile.close();
    }
}

int main(int argc, char** argv) {
    std::cout << "BM_Regex_Search" << std::endl;
    BM_Regex_Search(10000);
    return 0;
}

3. Performance Testing

After making these adjustments, the author conducted a performance comparison test between the original and improved versions of Logtail. The test results showed a significant performance improvement. The test environment is as follows:

• The same ACK cluster with identical machine specifications (ecs.c7.4xlarge, 16 vCPU, 32 GiB, Compute-optimized type c7).

• PL3 ESSD of 2048GB.

• Logtail startup parameters are kept at default.

• The program used for printing logs is consistent.

• The same collection configuration, with only the regular expression cnt.* configured at the beginning of the line.

• Generate the same log in real time. The log feature is that the length of the first line is relatively long.

cnt:13472391,thread:2,log:Exception in thread "main" java.lang.NullPointerExceptionat  com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitle
    at com.example.myproject.Book.getTitle
    at com.example.myproject.Book.getTitle
at com.example.myproject.Book.getTitle

• The size of the first collection is 10485760, and the startup parameter is modified to "max_fix_pos_bytes" : 10485760 to prevent the log from being written a lot during the first read.

• Run eight pods at the same time to output logs to a local file.

The test found that the log collection speed of the optimized Logtail 2.1.1 version reaches 633 MB/s. However, the Logtail 1.8.7 version, which is before the optimization of the regular expression at the beginning of the line, is less than 90 MB/s, which is increased by 7 times.

Metric	Original	After Multi-line Optimization
iLogtail Collection Rate (MB/s)	90MB/s	633MB/s

4. Summary

We have observed that when the log content is very long, the collection performance of Logtail can be significantly improved several times by a simple one-line code modification. Theoretically, the longer the parts of the log that are irrelevant to the regular expression at the beginning of the line are, the more significant the performance improvement brought by this optimization will be.

Community

The Secret of Performance Improvement of Logtail's Multi-line Log Collection by 7 Times

1. Background

2. Analysis

3. Performance Testing

4. Summary

Read previous post:

Read next post:

Alibaba Cloud Native Community

You may also like

Comments

Alibaba Cloud Native Community

Related Products

Cloud Parallel File Storage

Managed Service for Prometheus

Storage Capacity Unit

Hybrid Cloud Storage