Merge pull request #18 from Dutzu/master
Added log aggregation blog post
This commit is contained in:
commit
f111974d8a
195
_posts/2016-01-11-Log-aggregation
Normal file
195
_posts/2016-01-11-Log-aggregation
Normal file
|
@ -0,0 +1,195 @@
|
||||||
|
---
|
||||||
|
title: Log aggregation
|
||||||
|
subtite: A short introduction
|
||||||
|
description: Introduction to log aggregation using Fluentd, Elasticsearch and Kibana
|
||||||
|
category: howto
|
||||||
|
tags: [fluentd, log, ELK, EFK]
|
||||||
|
author: Doru Mihai
|
||||||
|
author_email: doru.mihai@haufe-lexware.com
|
||||||
|
---
|
||||||
|
|
||||||
|
With an increasing number of systems decoupled and scattered throughout the landscape it becomes increasingly difficult to track and trace events across all systems.
|
||||||
|
Log aggregation solutions provides a series of benefits to distributed systems.
|
||||||
|
|
||||||
|
The problems it tackles are:
|
||||||
|
- Centralized, aggregated view over all log events
|
||||||
|
- Normalization of log schema
|
||||||
|
- Automated processing of log messages
|
||||||
|
- Support for a great number of event sources and outputs
|
||||||
|
|
||||||
|
One of the most prolific open source solutions on the market is the [ELK stack](https://www.elastic.co/videos/introduction-to-the-elk-stack) created by Elastic.
|
||||||
|
<center>
|
||||||
|
<img src="/images/logaggregation-elk.png" width=50% align="middle"/>
|
||||||
|
</center>
|
||||||
|
ELK stands for Elasticsearch – Logstash – Kibana and they are respectively their Search engine, Log Shipper and Visualization frontend solutions.
|
||||||
|
Elasticsearch becomes the nexus for gathering and storing the log data and it is not exclusive to Logstash.
|
||||||
|
|
||||||
|
Another very good data collection solution on the market is Fluentd, and it also supports Elasticsearch (amongst others) as the destination for it’s gathered data. So using the same data repository and frontend solutions, this becomes the EFK stack and if you do a bit of searching you will discover many people have chosen to substitute Elastic's logstash with FluentD and we will talk about why that is in a minute.
|
||||||
|
<center>
|
||||||
|
<img src="/images/logaggregation-efk.png" width=20% align="middle"/>
|
||||||
|
</center>
|
||||||
|
|
||||||
|
|
||||||
|
# Logstash vs FluentD
|
||||||
|
Both of them are very capable, have [hundreds](https://www.elastic.co/guide/en/logstash/current/input-plugins.html) and [hundreds](http://www.fluentd.org/plugins) of plugins available and are being maintained actively by corporation backed support.
|
||||||
|
### Technology - Fluentd wins
|
||||||
|
The big elephant in the room is that Logstash is written in JRuby and FluentD is [written in Ruby with performance sensitive parts in C](http://www.fluentd.org/faqs). As a result the overhead of running a JVM for the log shipper translates in large memory consumption, especially when you compare it to the footprint of Fluentd. The only advantage that Logstash can still invoke is the good parallelism support that the JVM brings and very good [Grok](https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html) support.
|
||||||
|
|
||||||
|
The only downside for Fluentd was the lack of support for Windows, but even that has been [solved](https://github.com/fluent/fluentd/pull/674) and [grok](https://github.com/kiyoto/fluent-plugin-grok-parser) support is also available for Fluentd and you can even re-use the grok libraries you had used/built, even [Logstash grok patterns](https://github.com/elastic/logstash/tree/v1.4.2/patterns).
|
||||||
|
### Shippers - Fluentd wins
|
||||||
|
They both however offer the option of deploying lightweight components that will only read and send the log messages to a fully fledged instance that will do the necessary processing. These are called log forwarders and both have lightweight forwarders written in Go. As of this writing Elastic has released a replacement for it's [logstash-forwarder](https://github.com/elastic/logstash-forwarder)(formerly called Lumberjack) and it is built on top of it's new Data shipper platform [Beats](https://www.elastic.co/products/beats), and it is called [Filebeat](https://github.com/elastic/beats/tree/master/filebeat).
|
||||||
|
|
||||||
|
This new Logstash forwarder allows for a TLS secured communication with the log shipper, something that the old one was not capable of but it is still lacking a very valuable feature that fluentd offers, and that is buffering.
|
||||||
|
### Resiliency - Fluentd wins
|
||||||
|
As mentioned previously fluentd offers buffering, something that you get "for free" and coupled with the active client-side load balancing you get a very competent solution without a large footprint.
|
||||||
|
|
||||||
|
On the other side, logstash doesn't have buffering and only has an in-memory queue of messages that is fixed in length (20 messages) so in case messages can't get through, they are lost. To alleviate this weakness the common practice is to set up an external queue (like [Redis](http://www.logstash.net/docs/1.3.2//tutorials/getting-started-centralized)) for persistence of the messages in case something goes wrong at either end. They are [working on it](https://github.com/elastic/logstash/issues/2605) though, so in the future we might see an improvement in this area.
|
||||||
|
|
||||||
|
Fluentd offers in-memory or file based buffering coupled with [active-active and active-standby load balancing and even weighted load balancing](http://docs.fluentd.org/articles/high-availability) and last but not least it also offers [at-most-once and at-least-once](http://docs.fluentd.org/articles/out_forward#requireackresponse) semantics.
|
||||||
|
|
||||||
|
# Additional considerations
|
||||||
|
|
||||||
|
Logstash benefits from a more chiselled, mature implementation due to the fact that the core and a lot of the essential plugins are maintained by Elastic, and some may argue that it's easier to deploy a JRE and the logstash jar and be done with it while others would consider it overkill to have a JVM running for such a small task. Plus the need to deploy and maintain a separate queueing
|
||||||
|
|
||||||
|
Fluentd provides just the core and a couple of input/output plugins and filters and the rest of the large number of plugins available are community driven and so you are exposed to the risk of potential version incompatibilities and lack of documentation and support.
|
||||||
|
|
||||||
|
I have personally seen that there is a bit of chaos since each plugin creator will define his own set of configuration input variables and there isn't a sense of consistency when you look at different plugins. You will encounter variables that are optional and have different default values, variables that are not properly documented but you can deduct their usage from the examples that the author offers and virtually all known naming conventions will appear in your config file.
|
||||||
|
|
||||||
|
# What next?
|
||||||
|
|
||||||
|
Well, as you can probably already tell, I have chosen to go with fluentd, and as such it became quickly apparent that I need to integrate it with Elasticsearch and Kibana to have a complete solution, and that wasn't a smooth ride due to 2 issues:
|
||||||
|
- Timestamps were sent to Elasticsearch without milliseconds
|
||||||
|
- All field values were by default analyzed fields
|
||||||
|
|
||||||
|
For communicating with Elasticsearch I used the plugin [fluent-plugin-elasticsearch](https://github.com/uken/fluent-plugin-elasticsearch) as presented in one of their very helpful [use case tutorials](http://docs.fluentd.org/articles/free-alternative-to-splunk-by-fluentd).
|
||||||
|
|
||||||
|
This plugin allows fluentd to impersonate logstash by just enabling the setting `logstash-format` in the configuration file. I snooped arround a bit and found that basically the only difference is that the plugin will make sure that the message sent has a timestamp field named `@timestamp`.
|
||||||
|
|
||||||
|
And here we arrive at our first problem....
|
||||||
|
|
||||||
|
### Timestamp fix
|
||||||
|
This is a pain because if you want to properly visualize a set of log messages gathered from multiple systems, in sequence, to be able to see exactly what step followed the other.....well, you see the problem.
|
||||||
|
|
||||||
|
Let's take a look at what fluentd sends to Elasticsearch. Here is a sample log file with 2 log messages:
|
||||||
|
|
||||||
|
~~~java
|
||||||
|
2015-11-12 06:34:01,471 [ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===
|
||||||
|
2015-11-12 06:34:01,473 [ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1
|
||||||
|
~~~
|
||||||
|
|
||||||
|
A message sent to Elasticsearch from fluentd would contain these values:
|
||||||
|
|
||||||
|
*-this isn't the exact message, this is the result of the stdout output plugin-*
|
||||||
|
~~~ruby
|
||||||
|
2015-11-12 06:34:01 -0800 tag.common: {"message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===","time_as_string":"2015-11-12 06:34:01 -0800"}
|
||||||
|
|
||||||
|
2015-11-12 06:34:01 -0800 tag.common: {"message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1\n","time_as_string":"2015-11-12 06:34:01 -0800"}
|
||||||
|
~~~
|
||||||
|
|
||||||
|
I added the `time_as_string` field in there just so you can see the literal string that is sent as the time value.
|
||||||
|
|
||||||
|
This is a known issue and initially it was the fault of fluentd for not supporting that level of granularity, but is had been [fixed](https://github.com/fluent/fluentd/issues/461). Sadly, the fix has not made it's way to the Elasticsearch plugin and so, [alternatives have appeared](https://github.com/shivaken/fluent-plugin-better-timestamp).
|
||||||
|
|
||||||
|
The fix basically involves manually formatting the `@timestamp` field to have the format `YYYY-MM-ddThh:mm:ss.SSSZ`. So you can either bring on the previously mentioned `fluent-plugin-better-timestamp` into your log processing pipeline to act as a filter that fixes your timestamps OR you can build it yourself.
|
||||||
|
|
||||||
|
In order to build it yourself you only need the `record_transformer` filter that is part of the core of plugins that fluentd comes with and that I anyway would recommend you use for enriching your messages with things like the source hostname for example.
|
||||||
|
|
||||||
|
Next you need to parse the timestamp of your logs into separate date, time and millisecond components (which is basically what the better-timestamp plugin asks you to do, to some extent), and then to create a filter that would match all the messages you will send to Elasticsearch and to create the `@timestamp` value by appending the 3 components. This makes use of the fact that fluentd also allows you to run ruby code within your record_transformer filters to accommodate for more special log manipulation tasks.
|
||||||
|
|
||||||
|
~~~xml
|
||||||
|
<filter tag.**>
|
||||||
|
type record_transformer
|
||||||
|
enable_ruby true
|
||||||
|
<record>
|
||||||
|
@timestamp ${date_string + "T" + time_string + "." + msec + "Z"}
|
||||||
|
</record>
|
||||||
|
</filter>
|
||||||
|
~~~
|
||||||
|
|
||||||
|
The result is that the above sample will come out like this:
|
||||||
|
~~~ruby
|
||||||
|
2015-12-12 05:26:15 -0800 akai.common: {"date_string":"2015-11-12","time_string":"06:34:01","msec":"471","message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===","@timestamp":"2015-11-12T06:34:01.471Z"}
|
||||||
|
2015-12-12 05:26:15 -0800 akai.common: {"date_string":"2015-11-12","time_string":"06:34:01","msec":"473","message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1\n","@timestamp":"2015-11-12T06:34:01.473Z"}
|
||||||
|
~~~
|
||||||
|
|
||||||
|
*__Note__: you can use the same record_transformer filter to remove the 3 separate time components after creating the `@timestamp` field via the `remove_keys` option.*
|
||||||
|
|
||||||
|
### Do not analyse
|
||||||
|
|
||||||
|
There are 2 reasons why you shouldn't want your fields to be analyzed in this scenario:
|
||||||
|
- It will potentially increase the storage requirements
|
||||||
|
- It will make it impossible to do proper analysis and visualization on your data if you have field values that contain hyphens, dots or others.
|
||||||
|
|
||||||
|
Ok, so first, why does it increase the storage requirements?
|
||||||
|
|
||||||
|
Well, while researching what would be the proper hardware sizing requirements for setting up our production EFK installation I stumbled upon [this](http://peter.mistermoo.com/2015/01/05/hardware-sizing-or-how-many-servers-do-i-really-need/) post that goes in detail about what and why and how big can the problem become.
|
||||||
|
|
||||||
|
Worst case scenario, you could be using up to **40% more** disk space than you really need. Pretty bad huh?
|
||||||
|
|
||||||
|
And the second issue which would become apparent much quicker than the first is that when you will try to make use of Kibana to visualize your data you will encounter the issue that fields that contain hyphens for example will appear split and duplicate when used in visualizations.
|
||||||
|
|
||||||
|
For instance, by using the record_transformer I would send the hostname and also a statically specified field called `sourceProject`, to be able to group together messages that came from different identical instances of a project application.
|
||||||
|
|
||||||
|
Using this example configuration I tried to create a pie chart showing the number of messages per project for a dashboard. Here is what I got.
|
||||||
|
|
||||||
|
~~~ruby
|
||||||
|
<filter tag.**>
|
||||||
|
type record_transformer
|
||||||
|
enable_ruby true
|
||||||
|
<record>
|
||||||
|
@timestamp ${date_string + "T" + time_string + "." + msec + "Z"}
|
||||||
|
sourceProject Test-Analyzed-Field
|
||||||
|
</record>
|
||||||
|
</filter>
|
||||||
|
~~~
|
||||||
|
|
||||||
|
Sample output from stdout:
|
||||||
|
~~~ruby
|
||||||
|
2015-12-12 06:01:35 -0800 clear: {"date_string":"2015-10-15","time_string":"06:37:32","msec":"415","message":"[amelJettyClient(0xdc64419)-706] jetty:test/test INFO totallyAnonymousContent: http://whyAreYouReadingThis?:)/history/3374425?limit=1","@timestamp":"2015-10-15T06:37:32.415Z","sourceProject":"Test-Analyzed-Field"}
|
||||||
|
~~~
|
||||||
|
|
||||||
|
And here is the result of trying to use it in a visualization:
|
||||||
|
<center>
|
||||||
|
<img src="/images/logaggregation-analyzed-field.png" width=35% align="middle"/>
|
||||||
|
</center>
|
||||||
|
|
||||||
|
I should mention, what you are seeing is the result of 6 messages that all have the field sourceProject set to the value "Test-Analyzed-Field".
|
||||||
|
|
||||||
|
Sadly, once you put some data into Elasticsearch, indices are automatically created (by the fluent-plugin-elasticsearch) and mappings along with them and once a field is mapped as being analyzed [it cannot be changed](https://www.elastic.co/blog/changing-mapping-with-zero-downtime).
|
||||||
|
|
||||||
|
Curiously this did not happen when using Logstash, which made me look into how they are handling this problem. Then I discovered the issue was discussed also in the context of the fluent-plugin-elasticsearch and [the solution was posted there](https://github.com/uken/fluent-plugin-elasticsearch/issues/33) along with the request to include it in future versions of the plugin.
|
||||||
|
|
||||||
|
And the solution is: When Elasticsearch creates a new index, it will rely on the existence of a template to create that index. Logstash comes with a template of its own that it uses to tell Elasticsearch to create not analyzed copies of the fields it sends to it so that users can benefit from the analyzed fields for searching and the not analyzed fields when doing visualizations. And that template can be found [here](https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/lib/logstash/outputs/elasticsearch/elasticsearch-template.json).
|
||||||
|
|
||||||
|
And what you basically need to do is to do a curl put with that json content to ES and then all the indices created that are prefixed with `logstash-*` will use that template. Be aware that with the fluent-plugin-elasticsearch you can specify your own index prefix so make sure to adjust the template to match your prefix:
|
||||||
|
|
||||||
|
~~~bash
|
||||||
|
curl -XPUT localhost:9200/_template/template_doru -d '{
|
||||||
|
"template" : "logstash-*",
|
||||||
|
"settings" : {....
|
||||||
|
}'
|
||||||
|
~~~
|
||||||
|
|
||||||
|
The main thing to note in the whole template is this section:
|
||||||
|
~~~json
|
||||||
|
"string_fields" : {
|
||||||
|
"match" : "*",
|
||||||
|
"match_mapping_type" : "string",
|
||||||
|
"mapping" : {
|
||||||
|
"type" : "string", "index" : "analyzed", "omit_norms" : true,
|
||||||
|
"fielddata" : { "format" : "disabled" },
|
||||||
|
"fields" : {
|
||||||
|
"raw" : {"type": "string", "index" : "not_analyzed", "doc_values" : true, "ignore_above" : 256}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
~~~
|
||||||
|
|
||||||
|
This tells Elasticsearch that for any field of type string that it receives it should create a mapping of type string that is analyzed + another field that adds a `.raw` suffix that will not be analyzed.
|
||||||
|
|
||||||
|
The `not_analyzed` suffixed field is the one you can safely use in visualizations, but do keep in mind that this creates the scenario mentioned before where you can have up to 40% inflation in storage requirements because you will have both analyzed and not_analyzed fields in store.
|
||||||
|
|
||||||
|
# Have fun
|
||||||
|
So, now you know what we went through here at Haufe and what problems we faced and how we can overcome them.
|
||||||
|
|
||||||
|
If you want to give it a try you can take a look at our docker templates on github, there you will find a template for an EFK setup + a shipper that can transfer messages securely to the EFK solution and you can have it up and running in a matter of minutes: https://github.com/Haufe-Lexware/docker-templates/tree/master/logaggregation
|
220
_posts/2016-01-18-Fluentd-log-parsing
Normal file
220
_posts/2016-01-18-Fluentd-log-parsing
Normal file
|
@ -0,0 +1,220 @@
|
||||||
|
---
|
||||||
|
title: Fluentd log parsing
|
||||||
|
subtite: Approaches to log parsing
|
||||||
|
description: Description of a couple of approaches to designing your fluentd configuration.
|
||||||
|
category: howto
|
||||||
|
tags: [fluentd, logs]
|
||||||
|
author: Doru Mihai
|
||||||
|
author_email: doru.mihai@haufe-lexware.com
|
||||||
|
---
|
||||||
|
|
||||||
|
# Approaches to log parsing
|
||||||
|
|
||||||
|
When you will start to deploy your log shippers to more and more systems you will encounter the issue of adapting your solution to be able to parse whatever log format and source each system is using. Luckily, fluentd has a lot of plugins and you can approach a problem of parsing a log file in different ways.
|
||||||
|
|
||||||
|
|
||||||
|
The main reason you may want to parse a log file and not just pass along the contents is that when you have multi-line log messages that you would want to transfer as a single element rather than split up in an incoherent sequence.
|
||||||
|
|
||||||
|
|
||||||
|
Another reason would be log files that contain multiple log formats that you would want to parse into a common data structure for easy processing.
|
||||||
|
Below I will enumerate a couple of strategies that can be applied for parsing logs.
|
||||||
|
|
||||||
|
And last but not least, there is the case that you have multiple log sources (perhaps each using a different technology) and you want to parse them and aggregate all information to a common data structure for coherent analysis and visualization of the data.
|
||||||
|
|
||||||
|
## One Regex to rule them all
|
||||||
|
The simplest approach is to just parse all messages using the common denominator. This will lead to a very black-box type approach to your messages deferring any parsing efforts to a later time or to another component further downstream.
|
||||||
|
|
||||||
|
|
||||||
|
In the case of a typical log file a configuration can be something like this (but not necessarily):
|
||||||
|
|
||||||
|
~~~xml
|
||||||
|
<source>
|
||||||
|
type tail
|
||||||
|
path /var/log/test.log
|
||||||
|
read_from_head true
|
||||||
|
tag test.unprocessed
|
||||||
|
format multiline
|
||||||
|
format_firstline /\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}/
|
||||||
|
#we go with the most generic pattern where we know a message will have
|
||||||
|
#a timestamp in front of it, the rest is just stored in the field 'message'
|
||||||
|
format1 /(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}) (?<message>(.|\s)*)/
|
||||||
|
</source>
|
||||||
|
~~~
|
||||||
|
|
||||||
|
You will notice we still do a bit of parsing, the minimal level would be to just have a multiline format to split the log contents into separate messages and then to push the contents on.
|
||||||
|
|
||||||
|
The reason we do not just put everything into a single field with a greedy regex pattern is to have the correct timestamp pushed showing the time of the log and not the time when the log message was read by the log shipper, along with the rest of the message.
|
||||||
|
If more pieces are common to all messages, it can be included in the regex for separate extraction, if it is of interest of course.
|
||||||
|
|
||||||
|
## Divide & Conquer
|
||||||
|
As the name would suggest, this approach suggests that you should try to create an internal routing that would allow you to precisely target log messages based on their content later on downstream.
|
||||||
|
An example of this is shown in the configuration below:
|
||||||
|
|
||||||
|
~~~ruby
|
||||||
|
#Sample input:
|
||||||
|
#2015-10-15 08:19:05,190 [testThread] INFO testClass - Queue: update.testEntity; method: updateTestEntity; Object: testEntity; Key: 154696614; MessageID: ID:test1-37782-1444827636952-1:1:2:25:1; CorrelationID: f583ed1c-5352-4916-8252-47298732516e; started processing
|
||||||
|
#2015-10-15 06:44:01,727 [ ajp-apr-127.0.0.1-8009-exec-2] LogInterceptor INFO user-agent: check_http/v2.1.1 (monitoring-plugins 2.1.1)
|
||||||
|
#connection: close
|
||||||
|
#host: test.testing.com
|
||||||
|
#content-length: 0
|
||||||
|
#X-Forwarded-For: 8.8.8.8
|
||||||
|
#2015-10-15 08:21:04,716 [ ttt-grp-127.0.0.1-8119-test-11] LogInterceptor INFO HTTP/1.1 200 OK
|
||||||
|
|
||||||
|
<source>
|
||||||
|
type tail
|
||||||
|
path /test/test.log
|
||||||
|
tag log.unprocessed
|
||||||
|
read_from_head true
|
||||||
|
|
||||||
|
format multiline
|
||||||
|
format_firstline /\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}/
|
||||||
|
format1 /(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}) (?<message>(.|\s)*)/
|
||||||
|
</source>
|
||||||
|
|
||||||
|
<match log.unprocessed.**>
|
||||||
|
type rewrite_tag_filter
|
||||||
|
rewriterule1 message \bCorrelationID\b correlation
|
||||||
|
rewriterule2 message .* clear
|
||||||
|
</match>
|
||||||
|
<match clear>
|
||||||
|
type null
|
||||||
|
</match>
|
||||||
|
|
||||||
|
<filter correlation>
|
||||||
|
type parser
|
||||||
|
key_name message
|
||||||
|
format / * (.*method:) (?<method>[^;]*) *(.*Object:) (?<object>[^;]*) *(.*Key:) (?<objectkey>[^;]*) *(.*MessageID:) (?<messageID>[^;]*) *(.*CorrelationID:) (?<correlationID>[^;]*).*/
|
||||||
|
reserve_data yes
|
||||||
|
</filter>
|
||||||
|
|
||||||
|
<match correlation>
|
||||||
|
type stdout
|
||||||
|
</match>
|
||||||
|
~~~
|
||||||
|
|
||||||
|
This approach is useful when we have multiline log messages within our logfile and the messages themselves have different formats for the content. Still, the important thing to note is that all log messages are prefixed by a standard timestamp, this is key to succesfully splitting messages correctly.
|
||||||
|
|
||||||
|
|
||||||
|
The break-down of the approach with the configuration shown is that all entries in the log are first parsed into individual events to be processed. The key separator here is the timestamp and it is marked by the *format_firstline* key/value pair as a regex pattern.
|
||||||
|
Fluentd will continue to read logfile lines and keep them in a buffer until a line is reached that starts with text that matches the regex pattern specified in the *format_firstline* field. After detecting a new log message, the one already in the buffer is packaged and sent to the parser defined by the regex pattern stored in the format<n> fields.
|
||||||
|
|
||||||
|
|
||||||
|
Looking at the example, all our log messages (single or multiline) will take the form:
|
||||||
|
~~~json
|
||||||
|
{ "time":"2015-10-15 08:21:04,716", "message":"[ ttt-grp-127.0.0.1-8119-test-11] LogInterceptor INFO HTTP/1.1 200 OK" }
|
||||||
|
~~~
|
||||||
|
|
||||||
|
Being tagged with log.unprocessed all the messages will be caught by the *rewrite_tag_filter* match tag and it is at this point that we can pinpoint what type of contents each message has and we can re-tag them for individual processing.
|
||||||
|
|
||||||
|
This module is key to the whole mechanism as the *rewrite_tag_filter* takes the role of a router. You can use this module to redirect messages to different processing modules or even outputs depending on the rules you define in it.
|
||||||
|
|
||||||
|
## Shooting fish in a barrel
|
||||||
|
You can use *fluent-plugin-multi-format-parser* to try to match each line read from the log file with a specific regex pattern (format).
|
||||||
|
This approach probably comes with performance drawbacks because fluentd will try to match using each regex pattern sequentially until one matches.
|
||||||
|
An example of this approach can be seen below:
|
||||||
|
|
||||||
|
~~~ruby
|
||||||
|
<source>
|
||||||
|
type tail
|
||||||
|
path /var/log/aka/test.log
|
||||||
|
read_from_head true
|
||||||
|
keep_time_key true
|
||||||
|
tag akai.unprocessed
|
||||||
|
format multi_format
|
||||||
|
# 2015-10-15 08:19:05,190 [externalPerson]] INFO externalPersonToSapSystem - Queue: aka.update.externalPerson; method: ; Object: externalPerson; Key: ; MessageID: ID:test1-37782-1444827636952-1:1:2:25:1; CorrelationID: f583ed1c-5352-4916-8252-47698732506e; received
|
||||||
|
<pattern>
|
||||||
|
format /(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}) \[(?<thread>.*)\] (?<loglevel>[A-Z]*) * (.*method:) (?<method>[^;]*) *(.*Object:) (?<object>[^;]*) *(.*Key:) (?<objectkey>[^;]*) *(.*MessageID:) (?<messageID>[^;]*) *(.*CorrelationID:) (?<correlationID>[^;]*); (?<status>.*)/
|
||||||
|
</pattern>
|
||||||
|
# 2015-10-13 12:30:18,475 [ajp-apr-127.0.0.1-8009-exec-14] LogInterceptor INFO Content-Type: text/xml; charset=UTF-8
|
||||||
|
# Authorization: Basic UFJPRE9NT1xyZXN0VGVzdFVzZXI6e3tjc2Vydi5wYXNzd29yZH19
|
||||||
|
# breadcrumbId: ID-haufe-prodomo-stage-51837-1444690926044-1-1731
|
||||||
|
# checkoutId: 0
|
||||||
|
# Content-Encoding: gzip
|
||||||
|
# CS-Cache-Minutes: 0
|
||||||
|
# CS-Cache-Time: 2015-10-13 12:30:13
|
||||||
|
# CS-Client-IP: 172.16.2.51
|
||||||
|
# CS-Inner-Duration: 207.6 ms
|
||||||
|
# CS-Outer-Duration: 413.1 ms
|
||||||
|
# CS-Project: PRODOMO
|
||||||
|
# CS-UserID: 190844
|
||||||
|
# CS-UserName: restTestUser
|
||||||
|
# Expires: Thu, 19 Nov 1981 08:52:00 GMT
|
||||||
|
# Server: Apache
|
||||||
|
# SSL_CLIENT_S_DN_OU: Aka-Testuser
|
||||||
|
# User-Agent: check_http/v2.1.1 (monitoring-plugins 2.1.1)
|
||||||
|
# Vary: Accept-Encoding
|
||||||
|
# workplace: 0
|
||||||
|
# X-Forwarded-For: 213.164.69.219
|
||||||
|
# X-Powered-By: PHP/5.3.21 ZendServer/5.0
|
||||||
|
# Content-Length: 2883
|
||||||
|
# Connection: close
|
||||||
|
<pattern>
|
||||||
|
format multiline
|
||||||
|
format_firstline /\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}/
|
||||||
|
format1 /(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}) \[(?<thread>.*)\] (?<class>\w*) * (?<loglevel>[A-Z]*) (?<message>.*)/
|
||||||
|
</pattern>
|
||||||
|
|
||||||
|
#Greedy default format
|
||||||
|
<pattern>
|
||||||
|
format /(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}) (?<message>(.|\s)*)/
|
||||||
|
</pattern>
|
||||||
|
</source>
|
||||||
|
~~~
|
||||||
|
|
||||||
|
When choosing this path there are multiple issues you need to be aware of:
|
||||||
|
* The pattern matching is done sequentially and the first pattern that matches the message is used to parse it and the message is passed along
|
||||||
|
* You need to make sure the most specific patterns are higher in the list and the more generic ones lower
|
||||||
|
* Make sure to create a very generic pattern to use as a default at the end of the list of patterns.
|
||||||
|
* Performance will probably decrease due to the trial&error approach to finding the matching regex
|
||||||
|
|
||||||
|
|
||||||
|
The biggest issue with this approach is that it is very very hard to handle multi-line log messages if there are significantly different log syntaxes in the log.
|
||||||
|
|
||||||
|
|
||||||
|
__Warning:__ Be aware that the multiline parser continues to store log messages in a buffer until it matches another firstline token and when it does it then it packages and emits the multiline log it collected.
|
||||||
|
This approach is useful when you have good control and know-how about the format of your log source.
|
||||||
|
|
||||||
|
## Order & Chaos
|
||||||
|
Introducing Grok!
|
||||||
|
|
||||||
|
|
||||||
|
Slowly but surely getting all your different syntaxes, for which you will have to define different regular expressions, will make your config file look very messy, filled with regex-es that are longer and longer, and just relying on the multiple format lines to split it up doesn't bring that much readability nor does it help with maintainability. Reusability is something that we cannot even discuss in the case of pure regex formatters.
|
||||||
|
|
||||||
|
|
||||||
|
Grok allows you to define a library of regexes that can be reused and referenced via identifiers. It is structured as a list of key-value pairs and can also contain named capture groups.
|
||||||
|
An example of such a library can be seen below. (Note this is just a snippet and does not contain all the minor expressions that are referenced from within the ones enumerated below)
|
||||||
|
|
||||||
|
~~~ruby
|
||||||
|
###
|
||||||
|
# AKA-I
|
||||||
|
###
|
||||||
|
# Queue: aka.update.externalPerson; method: updateExternalPerson; Object: externalPerson; Key: ; MessageID: ID:test1-37782-1444827636952-1:1:2:25:1; CorrelationID: f583ed1c-5352-4916-8252-47698732506e; received
|
||||||
|
AKA_AKAI_CORRELATIONLOG %{AKA_AKAI_QUEUELABEL} %{AKA_AKAI_QUEUE:queue}; %{AKA_AKAI_METHODLABEL} %{AKA_AKAI_METHOD:method}; %{AKA_AKAI_OBJECTLABEL} %{AKA_AKAI_OBJECT:object}; %{AKA_AKAI_KEYLABEL} %{AKA_AKAI_KEY:key}; %{AKA_AKAI_MESSAGEIDLABEL} %{AKA_AKAI_MESSAGEID:messageId}; %{AKA_AKAI_CORRELATIONLABEL} %{AKA_AKAI_CORRELATIONID:correlationId}; %{WORD:message}
|
||||||
|
#CorrelationId log message from AKAI
|
||||||
|
#2015-10-15 08:19:05,190 [externalPerson]] INFO externalPersonToSapSystem - Queue: aka.update.externalPerson; method: updateExternalPerson; Object: externalPerson; Key: ; MessageID: ID:test1-37782-1444827636952-1:1:2:25:1; CorrelationID: f583ed1c-5352-4916-8252-47698732506e; received
|
||||||
|
AKA_AKAI_CORRELATIONID %{AKAIDATESTAMP:time} %{AKA_THREAD:threadName} %{LOGLEVEL:logLevel} * %{JAVACLASS:className} %{AKA_AKAI_CORRELATIONLOG}
|
||||||
|
#Multiline generic log pattern
|
||||||
|
# For detecting that a new log message has been read we will use AKAIDATESTAMP as the pattern and then match with a greedy pattern
|
||||||
|
AKA_AKAI_GREEDY %{AKAIDATESTAMP:time} %{AKA_THREAD:threadName} %{JAVACLASS:class} * %{LOGLEVEL:loglevel} %{AKA_GREEDYMULTILINE:message}
|
||||||
|
# Variation since some log messages have loglevel before classname or vice-versa
|
||||||
|
AKA_AKAI_GREEDY2 %{AKAIDATESTAMP:time} %{AKA_THREAD:threadName} * %{LOGLEVEL:loglevel} %{JAVACLASS:class} %{AKA_GREEDYMULTILINE:message}
|
||||||
|
###
|
||||||
|
# ARGO Specific
|
||||||
|
###
|
||||||
|
AKA_ARGO_DATESTAMP %{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME}
|
||||||
|
#17-Nov-2015 07:53:38.786 INFO [www.lexware-akademie.de-startStop-1] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory /var/www/html/www.lexware-akademie.de/ROOT has finished in 44,796 ms
|
||||||
|
AKA_ARGO_LOG %{AKA_ARGO_DATESTAMP:time} %{LOGLEVEL:logLevel} %{AKA_THREAD:threadName} %{JAVACLASS:className} %{AKA_GREEDYMULTILINE:message}
|
||||||
|
#2015-11-17 07:53:51.606 RVW INFO AbstractApplicationContext:515 - Refreshing Root WebApplicationContext: startup date [Tue Nov 17 07:53:51 CET 2015]; root of context hierarchy
|
||||||
|
AKA_ARGO_LOG2 %{AKAIDATESTAMP2:time} %{WORD:argoComponent} *%{LOGLEVEL:logLevel} * %{AKA_CLASSLINENUMBER:className} %{AKA_GREEDYMULTILINE:message}
|
||||||
|
#[GC (Allocation Failure) [ParNew: 39296K->4351K(39296K), 0.0070888 secs] 147064K->114083K(172160K), 0.0071458 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
|
||||||
|
#[CMS-concurrent-abortable-preclean: 0.088/0.233 secs] [Times: user=0.37 sys=0.02, real=0.23 secs]
|
||||||
|
AKA_ARGO_SOURCE (GC|CMS)
|
||||||
|
AKA_ARGO_GC \[%{AKA_ARGO_SOURCE:source} %{AKA_GREEDYMULTILINE:message}
|
||||||
|
~~~
|
||||||
|
|
||||||
|
|
||||||
|
To use Grok you will need to install the *fluent-plugin-grok-parser* and then you can use grok patterns with any of the other techniques previously described with regex: Multiline, Multi-format.
|
||||||
|
|
||||||
|
# Go Get'em!
|
||||||
|
|
||||||
|
Now you should have a pretty good idea of how you can approach different log formats and how you can structure your config file using a couple of plugins from the hundreds of plugins available.
|
BIN
images/logaggregation-analyzed-field.png
Normal file
BIN
images/logaggregation-analyzed-field.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 24 KiB |
BIN
images/logaggregation-efk.png
Normal file
BIN
images/logaggregation-efk.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 84 KiB |
BIN
images/logaggregation-elk.png
Normal file
BIN
images/logaggregation-elk.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 126 KiB |
Loading…
Reference in a new issue