---
layout: post
title: Better Log Parsing with Fluentd
subtitle: Description of a couple of approaches to designing your fluentd configuration.
category: howto
tags: [devops, logging]
author: doru_mihai
author_email: doru.mihai@haufe-lexware.com
header-img: "images/bg-post.jpg"
---

When you start to deploy your log shippers to more and more systems, you will encounter the issue of adapting your solution to parse whatever log format and source each system is using. Luckily, fluentd has a lot of plugins, and you can approach the problem of parsing a log file in different ways.

The main reason you may want to parse a log file, and not just pass along its contents, is that multi-line log messages should be transferred as a single element rather than split up into an incoherent sequence. Another reason would be log files that contain multiple log formats that you want to parse into a common data structure for easy processing. And last but not least, there is the case where you have multiple log sources (perhaps each using a different technology) and you want to parse them and aggregate all the information into a common data structure for coherent analysis and visualization of the data.

Below I will enumerate a couple of strategies that can be applied for parsing logs.

## One Regex to rule them all

The simplest approach is to parse all messages using the common denominator. This leads to a very black-box approach to your messages, deferring any parsing effort to a later time or to another component further downstream.

In the case of a typical log file a configuration can be something like this (but not necessarily):

~~~
<source>
  type tail
  path /var/log/test.log
  read_from_head true
  tag test.unprocessed
  format multiline
  format_firstline /\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}/
  #we go with the most generic pattern where we know a message will have
  #a timestamp in front of it, the rest is just stored in the field 'message'
  format1 /(?<message>.*)/
</source>
~~~
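
To make the grouping behaviour concrete, here is a hypothetical log4j-style excerpt (the content is made up purely for illustration). The `format_firstline` pattern above only matches lines that begin with a timestamp, so the stack trace lines following the ERROR entry are folded into the same event instead of being emitted as separate records:

~~~
2016-04-15 10:32:01,223 INFO  [main] Application started
2016-04-15 10:32:02,514 ERROR [worker-1] Unhandled exception while processing request
java.lang.NullPointerException
    at com.example.Foo.bar(Foo.java:42)
    at com.example.Main.main(Main.java:13)
2016-04-15 10:32:03,118 INFO  [worker-1] Recovered, continuing
~~~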
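
Because everything ends up in a single `message` field, the actual structuring has to happen later. As a minimal sketch of what "further downstream" could look like, assuming the fluent-plugin-parser filter plugin is installed and that your entries happen to follow a `timestamp level text` layout (the field names `logtime`, `level` and `body` are just illustrative choices, not anything mandated by fluentd):

~~~
<filter test.unprocessed>
  #parse the raw 'message' field produced by the tail source above
  type parser
  key_name message
  #keep the original fields alongside the newly extracted ones
  reserve_data true
  #split the record into timestamp, level and the remaining (possibly multi-line) body
  format /^(?<logtime>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3})\s+(?<level>\w+)\s+(?<body>[\s\S]*)$/
</filter>
~~~

If the parsing is instead delegated to another component downstream, this filter can be dropped entirely; the point of the approach is precisely that the decision can be postponed.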