diff --git a/_data/authors.yml b/_data/authors.yml index ec3605c..a27ef38 100644 --- a/_data/authors.yml +++ b/_data/authors.yml @@ -1,89 +1,7 @@ # Author details. -holger_reinhardt: - name: Holger Reinhardt - email: holger.reinhardt@haufe-lexware.com - twitter: hlgr360 - github: hlgr360 - linkedin: hrreinhardt -martin_danielsson: - name: Martin Danielsson - email: martin.danielsson@haufe-lexware.com - twitter: donmartin76 - github: donmartin76 - linkedin: martindanielsson -marco_seifried: - name: Marco Seifried - email: marco.seifried@haufe-lexware.com - twitter: marcoseifried - github: marc00s - linkedin: marcoseifried -thomas_schuering: - name: Thomas Schüring - email: thomas.schuering@haufe-lexware.com - github: thomsch98 - linkedin: thomas-schuering-205a8780 -rainer_zehnle: - name: Rainer Zehnle - email: rainer.zehnle@haufe-lexware.com - github: Kodrafo - linkedin: rainer-zehnle-09a537107 - twitter: RainerZehnle -doru_mihai: - name: Doru Mihai - email: doru.mihai@haufe-lexware.com - github: Dutzu - linkedin: doru-mihai-32090112 - twitter: dcmihai -eike_hirsch: - name: Eike Hirsch - email: eike.hirsch@haufe-lexware.com - twitter: stagzta -axel_schulz: - name: Axel Schulz - email: axel.schulz@semigator.de - github: axelschulz - linkedin: luckyguy -carol_biro: - name: Carol Biro - email: carol.biro@haufe-lexware.com - github: birocarol - linkedin : carol-biro-5b0a5342 -frederik_michel: - name: Frederik Michel - email: frederik.michel@haufe-lexware.com - github: FrederikMichel - twitter: frederik_michel -tora_onaca: - name: Teodora Onaca - email: teodora.onaca@haufe-lexware.com - github: toraonaca - twitter: toraonaca -eric_schmieder: - name: Eric Schmieder - email: eric.schmieder@haufe-lexware.com - github: EricAtHaufe - twitter: EricAtHaufe -scott_speights: - name: Scott Speights - email: scott.speights@haufe-lexware.com - github: SSpeights - twitter: ScottSpeights -esmaeil_sarabadani: - name: Esmaeil Sarabadani - email: esmaeil.sarabadani@haufe-lexware.com - twitter: esmaeils -daniel_wehrle: - name: Daniel Wehrle - email: daniel.wehrle@haufe-lexware.com - github: DanielHWe -anja_kienzler: - name: Anja Kienzler - email: anja.kienzler@haufe-lexware.com -filip_fiat: - name: Filip Fiat - email: filip.fiat@haufe-lexware.com -daniel_bryant: - name: Daniel Bryant - email: daniel.bryant@opencredo.com - github: danielbryantuk - twitter: danielbryantuk +jeremy: + name: Jeremy Kidwell + email: j.kidwell@bham.ac.uk + twitter: kidwellj + github: kidwellj + linkedin: kidwellj \ No newline at end of file diff --git a/_posts/2015-11-11-Hello-World.md b/_posts/2015-11-11-Hello-World.md deleted file mode 100644 index 38b14cd..0000000 --- a/_posts/2015-11-11-Hello-World.md +++ /dev/null @@ -1,23 +0,0 @@ ---- -layout: post -title: We are live or How to start a developer blog -subtitle: The 'Hello World' Post -category: general -tags: [cto, culture] -author: holger_reinhardt -author_email: holger.reinhardt@haufe-lexware.com -header-img: "images/bg-post.jpg" ---- - -So how does one start a developer blog? It is pretty intimidating to look at a blank editor (BTW, I use [Mou](http://25.io/mou/) to write this post ;)) and think about some witty content, some heading which would rope you in and make you want to read what we have to say. But why should you? And who are we anyhow? - -So let's start with first things first. **Welcome to our Haufe Developer Blog**. -We - that is our development and design community at Haufe. And we are having a problem.
No, it is not that we are sitting in Freiburg on the edge of the beautiful Black Forest in Germany. It is neither that we are a software company with close to 300 million EUR in yearly revenue that you have probably never heard of (you might have heard of Lexware though). - -No, our problem is that we are actually doing quite some cool stuff (and planning even more in the future) and no one in the developer community knows about it. When I joined Haufe-Lexware as CTO back in March of this year, the first thing I did was search for Haufe on my usual (developer) channels: Github (nope), Twitter (nope), SlideShare (nope). Well, you see - I think that is a problem. If a tree falls in the forest but no one sees it - did the tree fall in the forest? How are you ever going to find out about us and get so excited that you want to join us - right now! (And yes - we do have plenty of dev openings if you are interested.) - -During the 'Meet the new guy' meeting with my team I drew a triangle with the three areas I would like to focus on first: Architecture, Technology and Developer Culture. I figured developing Developer Culture was the easiest - and boy, was I naive (and wrong). Fast forward 6 months and I think that developer culture is the number one factor which will determine whether we as a team and as a company will succeed or fail. No matter what technology or architecture we use, it is the culture which permeates every decision we make at work day in and day out. It is culture which influences whether we go left or right, whether we speak up or remain silent. Without the right kind of culture, everything else is just a band-aid. - -You see, I am a pretty opinionated guy. I can probably talk for hours about Microservices, APIs, Docker and so on. But if you ask me today what I think my biggest lever to effect lasting change will be, then shaping and influencing the direction of our dev culture will be my answer. Technology and architecture are just manifestations of that culture. Sure, they need to be aligned, but **culture eats strategy for breakfast**. And how we share our stories, how we talk about what we learned and what worked and what did not, are important first steps of this cultural journey. - -I would like to invite you to listen in to our stories and hopefully find a few nuggets which are helpful for your own journey. Hopefully we can return a bit of knowledge to the community, inasmuch as we are learning from the stories of so many other great teams out there who share their struggles, triumphs and learnings. So here is the beginning of our story ... diff --git a/_posts/2015-11-17-oscon.md b/_posts/2015-11-17-oscon.md deleted file mode 100644 index d5f30c7..0000000 --- a/_posts/2015-11-17-oscon.md +++ /dev/null @@ -1,28 +0,0 @@ ---- -layout: post -title: OSCON Europe 2015 -subtitle: Notes from OSCON Europe 2015 -category: conference -tags: [open-source] -author: marco_seifried -author_email: marco.seifried@haufe-lexware.com -header-img: "images/bg-post.jpg" ---- - -This is a personal and opinionated summary of my attendance of the [OSCON](http://conferences.oreilly.com/oscon/open-source-eu-2015) conference this year. - -I was looking forward to this conference and was hoping to learn what is hot and trendy in the open source world right now. To some extent I got that. I was very impressed by the talks by Sam Aaron - live coding with [Sonic Pi](http://sonic-pi.net/). Very interesting approach, live coding to make music with a Raspberry Pi.
Plus, Sam is a very enthusiastic character who makes talking about technical stuff fun. Do you know why he does that? When he was asked about his work in a pub over a beer and said he's a developer, he got reactions he wasn't happy with - now he can say he is a DJ, which is way cooler ;-) -(That's only part of the story. He also does it to teach kids about coding as well as music; he works together with schools and institutions in England.) - -Another inspiring session was the [Inner Source](http://www.infoq.com/news/2015/10/innersource-at-paypal) one by PayPal: Let's apply open source practices to your own organization internally. Have others (outside your project, product core team etc.) participate - while having trusted committers (the recommendation is 10% of your engineers) to keep direction and control. This might be an approach for us internally, to share code and knowledge. Also, to avoid finger pointing: We all can participate and can identify ourselves with the code. - -Also on my top list of sessions: [Growth Hacking](http://conferences.oreilly.com/oscon/open-source-eu-2015/public/schedule/detail/46945) by David Arnoux. Again, partly because David is someone who can talk and present and is passionate about what he does (and that's something I missed in other talks). Growth hacking is a modern approach to marketing, focused on growth; everything else is secondary. It uses unconventional approaches to achieve that. An example is Airbnb, which used to piggyback on Craigslist (without them knowing), which was way more popular at the beginning. - -[Writing code that lasts](http://de.slideshare.net/rdohms/writing-code-that-lasts-or-writing-code-you-wont-hate-tomorrow-54396256) as a session topic is not something that attracts me. It's another session about how to write better code, some low-level coding guidelines we all agree on and way too often ignore. But for lack of better alternatives on the conference schedule, I went - and was surprised. Again, Rafael is a guy who knows how to engage with people, which helped a lot ;-) -One of his rules: Do not use *else*. Let your code handle one thing and focus on that (a small sketch of this rule follows at the end of this post). Also, focus on the major use case first and don't try to anticipate every little possibility up front. A bit like the microservice approach (do one thing and do it well), but on a smaller scale. -All in all a worthwhile session. - -Apart from that I was excited about day 3, the tutorial day. I booked myself into the Go workshop in the morning and Kubernetes in the afternoon. -Well, Go was OK, but very junior level and basically the same material you can already find in tutorials on the web. Kubernetes might have been interesting, but it was assumed you had a Google Cloud account with a credit card attached - which I didn't have and didn't want just for the sake of a tutorial. The instructor therefore lost me after 10 minutes and I was behind from the start... - -Overall I enjoyed my time at OSCON. It's always good to meet up with others and get inspired. But in total the quality of the sessions differed a lot and the tutorials, as stated, were disappointing.
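To make the *else* rule concrete, here is a tiny sketch of my own (not from Rafael's talk): guard clauses handle the special cases first and return early, so the main path reads top to bottom without nesting.

```python
def describe_discount(customer):
    # Guard clauses: deal with the special cases first and bail out early ...
    if customer is None:
        return "no customer"
    if customer.get("is_vip"):
        return "20% off"
    # ... so the major use case reads straight through, with no else in sight.
    return "standard pricing"

print(describe_discount({"is_vip": True}))  # -> 20% off
print(describe_discount({}))                # -> standard pricing
```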
diff --git a/_posts/2015-11-19-api-journey.md b/_posts/2015-11-19-api-journey.md deleted file mode 100644 index 2c28473..0000000 --- a/_posts/2015-11-19-api-journey.md +++ /dev/null @@ -1,25 +0,0 @@ ---- -layout: post -title: The beginnings of our API Journey -subtitle: Intro to our API Style Guide -category: api -tags: [api] -author: holger_reinhardt -author_email: holger.reinhardt@haufe-lexware.com -header-img: "images/bg-post.jpg" ---- - -Before joining [HaufeDev](http://twitter.com/HaufeDev) I was fortunate to work in the [API Academy](http://apiacademy.co) consultancy with some of the smartest guys in the API field. So it was quite predictable that I would advocate for APIs as one of the cornerstones of our technology strategy. - -Fast forward a few months and we open sourced the initial release of our [API style guide](http://haufe-lexware.github.io/resources/). It is a comprehensive collection covering a wide range of API design resources and best practices. Credit for compiling this incredible resource has to go to our very own [Rainer Zehnle](https://github.com/Kodrafo), who probably cursed me a hundred times over for having to use Markdown to write it. - -But this was just the starting point. In parallel we started with formalized API Design Reviews to create the necessary awareness in the development teams. After a couple of those reviews we are now revising and extending our guide to reflect the lessons we have learnt. - -The design reviews in turn triggered discussions on the various tradeoffs when designing APIs. One of the most compelling discussions was about [the right use of schema to enable evolvable interfaces](https://github.com/Haufe-Lexware/api-style-guide/blob/master/message-schema.md). In that section we discuss how [Postel's Law](https://en.wikipedia.org/wiki/Robustness_principle) (or the Robustness Principle) can guide us towards robust and evolvable interfaces, but also how our default approach to message schemas can instead result in tightly coupled and brittle interfaces. - -Another new section was triggered by our Service Platform teams asking for [feedback on the error response of our Login API](https://github.com/Haufe-Lexware/api-style-guide/blob/master/error-handling.md#error-response-format). - -While we are not claiming that our API design guidance and best practices are foolproof, having this document gives us an incredible leg up on having the right kind of conversations with engineering. And step by step we will be improving it. - -This is also one of the reasons why we open sourced our API guide from the start - we have gained so much knowledge from the community that we hope we can give something back. We would love to hear your feedback or get pull requests if you are willing to contribute. This is the genius of GitHub - making it a journey and a conversation with the larger engineering community out there, and not just a point release. :) - diff --git a/_posts/2015-12-07-devopscon-2015.md b/_posts/2015-12-07-devopscon-2015.md deleted file mode 100644 index e9e98b0..0000000 --- a/_posts/2015-12-07-devopscon-2015.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -layout: post -title: Impressions from DevOpsCon 2015 -subtitle: Notes from DevOpsCon 2015 -category: conference -tags: [docker, devops] -author: rainer_zehnle -author_email: rainer.zehnle@haufe-lexware.com -header-img: "images/bg-post.jpg" ---- - -Elias Weingaertner, Helmut Strasser and I attended [DevOpsCon](http://devopsconference.de/de/) in Munich from 23rd - 25th November 2015.
-It was an impressive conference with a lot of new information and also excellent food :-). - -In the following I want to focus on my personal highlights. - -## Docker Basis Workshop -I joined the workshop **Der Docker Basis Workshop** by [Peter Rossbach](http://www.bee42.com/). Until now I had managed to stay away from Docker because there are other colleagues in our company who have more enthusiasm for tools like that. A **Basis Workshop** offered a good way to get familiar with Docker. The workshop itself focused on pure Docker. Peter introduced the intention and basic structure of the Docker environment and the relationship between Docker images, containers, the daemon, the registry etc. Peter created his slides with Markdown and shipped them using containers. This guy really is a Docker evangelist and is convinced of the stuff he presents. For most of the workshop we worked in the terminal on a virtual machine running Docker and learned about the different commands. It wasn't that easy for me because the workshop was clearly designed for people who are familiar with Linux. I struggled, for example, with creating a simple Dockerfile with vi (I don't know how anybody can work with this editor). - -One of the reasons why I joined the workshop was to watch Peter present Docker and to judge whether it would be a good fit for an in-house workshop. I'm sure this would work out great. I'm also sure that it's a good idea to meet with Peter to review our own Docker journey and to get feedback from him. - -## Microservices and DevOps Journey at Wix.com -Aviran Mordo from [Wix.com](http://de.wix.com/) presented how Wix.com separated their existing monolithic application into different microservices. This was the session that I enjoyed the most. Aviran explained how, as a first step, they broke up the existing application into just two services. They learned a lot about database separation, independent deployment etc. They also learned that it's not a good idea to do too much at the same time. I loved to hear what he marked as **YAGNI** (You ain't gonna need it). It allowed them to focus on business value and get the job done. Wix.com did not implement API versioning, distributed logging and some other stuff we are talking about. Aviran emphasized more than once that they strictly focused on tasks that had to be done and cut away the "nice-to-have" things. Nevertheless it took a year to split the monolith into two services! After that they had more experience and got faster. When the need for distributed logging arose, they took care of it. After three years Wix.com now has 140 microservices. For me it was an eye-opener that it is absolutely OK to start with a small set of requirements in favor of getting the job done and learning. Every journey begins with a single step! - -## Spreadshirt's way to continuous delivery -Samuel Ferraz-Leite from [Spreadshirt.com](http://www.spreadshirt.de) presented their way to continuous delivery. They started with a matrix organisation. The teams were separated into - -* Frontend DEV -* Backend DEV -* QA -* Operations - -The QA team was located in a different building from the DEV team. The Ops team as well. This setup led to a monolithic app and architecture that didn't scale. Symptoms were ticket ping-pong or phrases like "the feature lies with QA". The deployment of DEV was different from QA. Ops deployed manually. The cycles to even deploy a feature took days. Samuel quoted [Conway's law](https://en.wikipedia.org/wiki/Conway%27s_law).
They got results matching their organizational structure. They reorganized. They created teams with a product owner, DEV and QA. Ops was not included in the first step. Each team got full responsibility for a service/product. They also got the authority to decide. One outcome was the end of ticket ping-pong, and the whole team felt responsible for product quality. They also started the construction of a microservice architecture and started to reduce technical debt. After the first successful reorganization they integrated Ops into each team. This resulted in excellent telemetry and monitoring capabilities, infrastructure as code (Puppet) and continuous delivery (Rundeck). Team-overlapping topics like Puppet are addressed in so-called **FriendsOf** groups. Product owners and the whole management foster these groups. Additionally they have weekly standups with representatives of each team. - -I was really amazed by the power of organizational restructuring. Of course I know Conway's law. But the fact that it influences the outcome of a whole company so heavily made me thoughtful. I mulled it over for our own company. - -* How is our company structured regarding the product teams? How do we set up the teams? -* What about ourselves, organized as one architect team? Isn't that an antipattern? -* What about SAP, SSMP? CorpDev and SBC? BTS and H2/H3? - -## Conclusion -It was a good conference and I especially appreciated learning from the experiences of other companies - where they failed and where they succeeded. I hope that in three years we can look back and share our own successful way with others. diff --git a/_posts/2015-12-08-DockerCon-EU-2015.md b/_posts/2015-12-08-DockerCon-EU-2015.md deleted file mode 100644 index 07778fb..0000000 --- a/_posts/2015-12-08-DockerCon-EU-2015.md +++ /dev/null @@ -1,56 +0,0 @@ ---- -layout: post -title: Impressions from DockerCon 2015 - Part 1 -subtitle: Insights, Outlooks and Inbetweens -category: conference -tags: [docker, security] -author: thomas_schuering -author_email: thomas.schuering@haufe-lexware.com -header-img: "images/bg-post.jpg" ---- - -Once upon a time in a ~~galaxy~~ container far, far away ... We, a bunch of ~~rebels~~ Haufe employees, entered the halls of container wisdom: DockerCon EU 2015 in Barcelona, Spain. Hailing from different departments and locations (Freiburg AND Timisoara, CTO's, ICT, DevOps, ...), the common goal was to learn about the current state of Docker, the technology behind it and its evolving ecosystem. - -The unexpectedly high-level catering (at least on the first day and at the DockerCon party in the Maritime Museum) called for more activity than just moving from one session to the next, but we had to bear that burden (poor us!). - -We met with a couple of companies (Rancher Labs, Sysdig, Zalando and some more) to get a feeling for what is already available on the market and how mature the solutions feel. - -The recordings of the [Day 1 General Session](http://blog.docker.com/2015/11/dockercon-eu-2015-day-1-general-session/) and [Day 2 General Session](http://blog.docker.com/2015/11/dockercon-eu-2015-day-2-general-session/) contain the major news and most important things.
Here's what I found most important (in no specific order, and it might be intermixed with other sessions or presentations, but you know that I'm deeply interested in security and stuff ;-)): - -- Docker delivers "Enterprise" features and is getting more mature: - - Security - - Docker Content Trust is getting stronger through hardware signing - Since Docker 1.8, strong cryptographic guarantees over the content of a Docker image can be established by using signing procedures. From their [Blog](https://blog.docker.com/2015/08/content-trust-docker-1-8/), here is an excerpt: - [...] Docker Engine uses the publisher's public key to verify that the image you are about to run is exactly what the publisher created, has not been tampered with and is up to date. [...]. - At DockerCon EU 2015, support for hardware signing via YubiKey was announced, which strengthens the signing process even more. There's an elaborate [article](https://blog.docker.com/2015/11/docker-content-trust-yubikey/) available. - Security scans for images: Project Nautilus - Trusting an image is good and well, but what if the binaries (or packages) used in an image are vulnerable? Usually, distribution maintainers provide information and updates for packages. But what about a Dockerfile that installs its artifacts without using a package manager? Project Nautilus takes care of this situation not only by checking "all" of the vulnerability databases, but also by scanning the files of an image. There's not much public information (link :-)) available yet, but it's a promising approach. - No more exceptions from isolation: User namespaces (available in Docker 1.10 "dev") - Almost everything in Docker is isolated / abstracted from the host OS. One crucial exception was still present: user IDs were used "as is". For example, this would allow the root user of a container to modify a read-only mounted file (via the "volume" command) that is owned by the root user of the host. In the future, this will not be possible anymore, because the user IDs will be "mapped" and the root ID "0" inside the container will effectively be treated as "xxxx + 0" outside the container. - - Seccomp - Seccomp is a filter technology to restrict the set of system calls available to processes. You could imagine a container being corrupted, and the Docker Engine would simply not allow a process inside the container to execute "unwanted" system calls. Setting the system (host) time? Modifying swap params? Such calls (and more) can be "eliminated". - - Security made easy - This is mentioned in the recording of the [Day 1 General Session](http://blog.docker.com/2015/11/dockercon-eu-2015-day-1-general-session/). Basically it says: if security is hard to do, nobody will do it ... - Docker tries to ease the "security pain", and I ask you to look/search for that small mention in the session. Maybe you agree with the points mentioned there :-) - -- Does it (Docker) scale? - - Live [scale testing](https://blog.docker.com/2015/11/scale-testing-docker-swarm-30000-containers/) wasn't something I was really looking forward to seeing, but it was impressive anyway. - -- Managing containers - - I DO like Rancher, but I'd love to have something even more powerful ... and there comes "Docker UCP (Universal Control Plane) Beta". UCP and the Docker Trusted Registry are two of the "commercial" products I've seen from Docker. Hopefully, the basic tools are staying on the "Force's light side" of free open source - at least for developers and private users.
- Using Docker in production - - At Haufe, we sometimes seem to be lagging behind new technologies. Some of the presentations at DockerCon put a pretty strong emphasis on being careful and preserving successful processes (esp. dev, ops and security). A quick compilation of presentation links and topics will follow. - -In the meantime, have a look at the [great overview](https://github.com/docker-saigon/dockercon-eu-2015) of what happened during both days, with links to most of the presentations, slides and videos. - -## Things "in between" -... were quite interesting, too. We met with some guys from Zalando (yes, the guys who're screaming a lot in their adverts), who explained how they are using a home-brewed (available on Git) facade (or tool, if you like) to ease the pain of running a custom PaaS on AWS. The project [STUPS](https://stups.io/) uses a plethora of Docker containers and can be found on its own web page and on [GitHub](https://github.com/zalando-stups). - -(To be continued :-)) - diff --git a/_posts/2015-12-11-apidays-paris.md b/_posts/2015-12-11-apidays-paris.md deleted file mode 100644 index 0a0fc68..0000000 --- a/_posts/2015-12-11-apidays-paris.md +++ /dev/null @@ -1,100 +0,0 @@ ---- -layout: post -title: APIdays Paris - From Philosophy to Technology and back again -subtitle: A biased report from APIdays Global in Paris -category: conference -tags: [api] -author: martin_danielsson -author_email: martin.danielsson@haufe-lexware.com -header-img: "images/bg-post.jpg" ---- - -Having just recently come home again from the [APIdays](http://www.apidays.io) conference in Paris (Dec 8-9th 2015), memories are still quite fresh. It was a crowded event, the first day hosting a whopping 800 API enthusiasts, ranging from the geekiest of the geeks to a fair number of suited business people, showing that talking about APIs is no longer something only the most avant-garde of companies, the most high-tech of the tech companies, spend their time on. *Au contraire* (we were in Paris after all), APIs are mainstream, and they contribute to the advancing of collaboration and automation of the (digital) world as such. - -{:.center} -![Eiffel Tower - Paris]({{ site.url }}/images/2015-12-11-paris-eiffeltower.jpg){:style="margin:auto"} -(Image by Martin Danielsson, [CC BY 4.0 License](https://creativecommons.org/licenses/by/4.0/)) - -This was also one of the reasons the topic of APIdays was chosen as it was: **Automating IT, Business and the whole society with APIs**. The partly non-techy twist of the subtitle to APIdays was also reflected in the sessions: split into roughly three (or four) categories, you had a choice between real tech stuff, business-related sessions and quite a few workshops. In addition to that, the opening and ending keynotes were kept in a more philosophical tone, featuring (in the opening keynote) [Christian Fauré](http://www.christian-faure.net/) and renowned French philosopher [Bernard Stiegler](https://en.wikipedia.org/wiki/Bernard_Stiegler) (in the ending keynote), presenting their takes on digital automation, collaboration and its effects on society, with respect to APIs. Even [Steven Willmott](http://twitter.com/njyx) pulled off a rather non-techy, and even non-businessy, talk - rather unusual for the CEO of one of the big players in the API space ([3scale](http://www.3scale.net)).
### API Philosophy - -In their talks, both Fauré and Stiegler spoke about the effects of automation on society, but with quite contradictory underlying sentiments, even if the message - in the end - seems similar. But more on that later. - -Fauré's topic was "Automation in the 21st century" and the fear of many people that software, robots and automated processes will replace humans in tasks which were previously accomplished manually - the simple fear of becoming superfluous. This is what he calls *Opposition* to automation in society, and it is our main task to instead encourage a culture of *Composition* in order to leverage the good and focus on the capabilities to be creative (and yes, he included a nod to [Peter Drucker](https://en.wikipedia.org/wiki/Peter_Drucker)'s "Culture eats strategy for breakfast" quote). This is where he sees the realm of APIs: as an area of creativity. Composing APIs to create value in ways we have not thought of before. - -> Designing an API is an act of creativity. ->
> Christian Fauré ([@ChristianFaure](https://twitter.com/ChristianFaure)) - -This act of composition is creative work, just as designing an API is an act of creativity. Good APIs take time to design; APIs which encourage creative use of them even more so. Fauré also stresses that even with enhanced tooling (and we're only just seeing the first big wave of API management and development tools), the actual designing of the API is still where the main work lies - or, at least, the main leverage. - -> API management solutions have great benefits, but you still cannot switch your brain off! ->
> Christian Fauré ([@ChristianFaure](https://twitter.com/ChristianFaure)) - -For Fauré, the breeding ground for such creativity lies in the "hacking culture": try things out, be creative, and use APIs as a means to bring your ideas into reality faster and more simply. - -Steven Willmott's ([@njyx](http://twitter.com/njyx)) main message in the session ([slides](http://www.slideshare.net/3scale/apis-and-the-creation-of-wealth-in-the-digital-economy-apidays-paris-2015-keynote)) following Christian Fauré's gave the idea of enabling creativity a new spin, but still pointed in a similar direction: as of now, APIs are still a technical topic. You need to be a developer to be able to really leverage the functionality (see also [Twilio's](http://www.twilio.com) billboard ad campaign, e.g. [here](https://twitter.com/ctava1/status/608451693110550529)). Steven thinks the "next big thing" in the API space will be enabling business users to interact more easily with APIs, without needing fundamental engineering skills. Or, as he put it: - -> I want to buy my flowers from a florist, not from an engineer! ->
> Steven Willmott ([@njyx](http://twitter.com/njyx)) - -The last but not least session of APIdays was given by [Bernard Stiegler](https://en.wikipedia.org/wiki/Bernard_Stiegler); drawing a lot from his book "Automatic Society" ([*La Société Automatique*](http://www.amazon.fr/La-Soci%C3%A9t%C3%A9-automatique-Lavenir-travail/dp/2213685657), not yet available in English), he also talked about the need to create new jobs out of automation. His claim is that a closed system, in which automation does not generate value and new opportunities, is doomed to self-destruction by *entropy*. Only a living system, allowing for biological processes (read: life, or life-like organisms), can survive. This is a main reason he sees automation not only as something positive, but also as highly problematic: automating to free up time only makes sense if the free time is used in a sensible way. And no, Facebook is not, according to Stiegler. The search for opportunities to create *disentropy* (as the opposite of entropy) is what humankind has to pursue, albeit the road there is not clear. - -### API Technology - -This blog post may until now have given the impression I attended a philosophy conference, which was of course not the case. It did set an interesting frame for the conference though, opening up a new view on a topic which has tended to be extremely techy until now. - -Many of the more technical talks dealt with the usual suspects, [Microservices and DevOps](https://haufe-lexware.github.io/microservices-devopscon/), as integral parts of the API economy and architecture style. Some were very enthusiastic; some, such as [Ori Pekelman](http://platform.sh), have had enough of it, tooting the same horn as our own Elias, saying it's nothing new and that he can't stand seeing "unicorns farting containers into microservice heaven" anymore. He had a quite drastic slide accompanying that one (yes, it's an actual quote), but I wasn't quick enough with the camera. - -To return to more serious topics: *Hypermedia* was an astonishingly big topic at the conference. Not that it's not a seriously good idea, but adoption now seems to be finding its way into real-world scenarios, with practical and working specifications popping up which are being adopted at an increasing rate. Hypermedia is leaving the realm of a research topic (see the picture below on [HATEOAS](https://en.wikipedia.org/wiki/HATEOAS) - bless you!) and is actually being used. - -{:.center} -![HATEOAS - Bless you!]({{ site.url }}/images/2015-12-11-hateoas.jpg){:style="margin:auto"} -(Courtesy of [CommitStrip](http://www.commitstrip.com/en/2015/12/03/apiception/)) - -Many people are perhaps scared of the seemingly opaque topic, but there are a lot of really good use cases for hypermedia. Jason Harmon of PayPal/Braintree ([@jharmn](http://twitter.com/jharmn)) pointed to some of the most prominent ones in his talk: - -* Paging links inside result sets (*first*, *last*, *previous*, *next*) -* Actions and permissions on actions: if an action is contained within the result, it's allowed; otherwise it isn't -* Self links for caching and refreshing purposes (you know where the result came from) - -Adopting hypermedia techniques for these use cases can help do the heavy lifting of e.g. paging for all clients at once, as opposed to forcing each client to find its own pattern of implementation.
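To illustrate, here is a hedged sketch of what such paging links might look like in a HAL-style response; the `/orders` resource and its fields are hypothetical, invented for this example, but the `_links` / `_embedded` structure follows the HAL specification linked below.

```json
{
  "_links": {
    "self":  { "href": "/orders?page=3" },
    "first": { "href": "/orders?page=1" },
    "prev":  { "href": "/orders?page=2" },
    "next":  { "href": "/orders?page=4" }
  },
  "_embedded": {
    "orders": [
      { "id": 123, "status": "shipped" }
    ]
  }
}
```

A client only ever follows the `next` link it is handed, so the server can change its paging scheme without breaking anyone.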
The adoption of hypermedia techniques is also due to the existence of (more or less) pragmatic specifications, such as - -* [HAL](http://stateless.co/hal_specification.html) (actually, [Mike Kelly](http://stateless.co) also attended APIdays) -* [JSON-LD](http://json-ld.org) ([Elf Pavlik](https://twitter.com/elfpavlik) also attended APIdays) -* [Collection+JSON](http://amundsen.com/media-types/collection) ([Mike Amundsen](http://amundsen.com)) -* [SIREN](https://github.com/kevinswiber/siren) (by [Kevin Swiber](https://github.com/kevinswiber)) - -But, to reiterate the theme of "no actual news": - -> Hypermedia is actually already in Fielding's dissertation on REST, if you read until the end. ->
> Benjamin Young ([@BigBlueHat](http://twitter.com/BigBlueHat)), organizer of [RESTFest](http://www.restfest.org) - -To keep this blog post from getting exceedingly long (I bet nobody's reading this far anyway), I'll just mention a couple of the more interesting topics I had the pleasure to check out in one or more sessions: - -* [RDF and SPARQL](http://www.w3.org/TR/rdf-sparql-query/) seem to be getting adopted more and more; new interesting techniques to offload work to clients make scaling easier (support only simpler queries, not the full SPARQL language, and let clients assemble results): Ruben Verborgh ([@rubenverborgh](https://twitter.com/rubenverborgh)) - [Slides](http://www.slideshare.net/RubenVerborgh/hypermedia-apis-that-make-sense). -* [GraphQL](https://facebook.github.io/graphql/) looks very promising in terms of providing a very flexible querying language which claims to be backend-agnostic (I have to check that out in more detail, despite it being by Facebook): [Slides](http://www.slideshare.net/yann_s/introduction-to-graphql-at-api-days) - -### API Hackday - -Despite being tempted by a packed agenda of talks on the second day, I chose to participate in the "mini RESTFest" which was organized at the conference venue. Darrel Miller ([@darrel_miller](http://twitter.com/darrel_miller)) of Microsoft (yes, that Microsoft) and Benjamin Young ([@BigBlueHat](http://twitter.com/BigBlueHat)) did a great job in organizing and taming the different opinions which gathered in the hackday space on the second floor of the [*Tapis Rouge*](http://www.tapisrouge.fr/). - -The scene setting was, in short, the following: starting with an RFC-style definition of a "Conference Talk Proposal" media type conceived by Darrel, what can we do with that? - -I *think* Darrel had a master plan of creating something quite lightweight, a transfer media type for conference sessions corresponding to iCal or vCard, but boy, did discussions come up on this. We had [Elf Pavlik](https://twitter.com/elfpavlik) taking part, bringing a lot of ideas into play regarding Hypermedia and JSON-LD. Additionally, [Paul Chavard](https://github.com/tchak) from Captain Train participated in the lively discussion. Darrel explicitly did not want to *boil the ocean* by adopting some larger-scale specification like JSON-LD, but wanted something lean and well specified to make adoption of the media type simple. After a good while, we *sort of* agreed on something in between... - -In the end, we did finish a couple of presentable things, such as a translator of the format into JSON-LD (guess who implemented that?) and a cool Jekyll template for displaying the proposals on a static website (by Shelby Switzer, [@switzerly](https://twitter.com/switzerly)). My own contribution was to create a [JSON schema](http://json-schema.org/) matching the media type and to implement an HTML form using [Jeremy Dorn](https://github.com/jdorn)'s quite cool [JSON Editor component](https://github.com/jdorn/json-editor). - -The results (and possibly also further proceedings with this) can be viewed in [RESTFest's GitHub repository](https://github.com/RESTFest/2015-apidays-conference-talk-api); some of the results are still in the branches. - -### Conclusion - -I had a good time at APIdays; the sessions were of good quality overall, and the audience was fantastic. I had a lot of opportunities to meet people I knew and, even more importantly, people I did not yet know. I would definitely recommend going there again.
{:.center} -![APIdays]({{ site.url }}/images/2015-12-11-apidays-logo.png){:style="margin:auto"} -[APIdays](http://www.apidays.io) diff --git a/_posts/2015-12-15-dockercon_eu_2015.md b/_posts/2015-12-15-dockercon_eu_2015.md deleted file mode 100644 index f4cabca..0000000 --- a/_posts/2015-12-15-dockercon_eu_2015.md +++ /dev/null @@ -1,55 +0,0 @@ ---- -layout: post -title: Impressions from DockerCon 2015 - Part 2 -subtitle: Highlights and picks from DockerCon 2015 -category: conference -tags: [docker] -author: Peter Frey -author_email: peter.frey@haufe-lexware.com -header-img: "images/bg-post.jpg" ---- - -Some weeks have already passed since the last [DockerCon 2015](http://europe-2015.dockercon.com/), and I want to share some ideas and thoughts that I took away from it. The conference took place on November 16th and 17th this year in Barcelona and was attended by over 1000 participants from Europe and beyond. This estimate is based on the size of the large [forum auditorium](http://www.ccib.es/spaces/forum-auditorium), which can hold up to 3,140 people and was filled for the three plenary sessions. - -First of all, some background, although [Docker](https://www.docker.com/what-docker) as a technology or hype or platform - however you conceive it - is by now a well-known term, and a large number of articles have already been published on it in the last two years. Docker was [initially released in 2013](https://en.wikipedia.org/wiki/Docker_\(software\)#History), so it is relatively new. My first experience with Docker was last year, in spring 2014, when I was asked to do a prototype implementation for the Haufe editorial system (internally known as HRS). Docker was new, and it was new to me, and I struggled with a lot of dos and don'ts when transforming an environment - even the small part chosen for the prototype - that had grown over the years around heavy data centricity. - -So I was excited to visit DockerCon and see how Docker continues to evolve into a very flexible, lightweight virtualization platform. The Docker universe has indeed made big steps under the hood, with the tooling around it, and also with a growing number of third-party adopters improving many aspects of what Docker is and wants to be. Docker may and will revolutionize the way we build and deploy software in the future. And the future starts now, in the projects we drive forward. - -### Virtualization and Docker - -The past waves of virtualization are now commodity; virtualization has reached IT operations and is no longer the domain of development, as it was years ago when we started with VMware for development and testing. It is the basis for today's deployments. Virtualization has many aspects and flavours, but one thing they have in common: it is rather heavyweight to build up a virtualization platform, and using it causes some performance reduction in comparison to deploying software artifacts directly to physical machines - which was still done for exactly this reason: maximum throughput and optimal performance for the business. But with virtualization we gain flexibility, being able to move a virtualized computing unit across the hardware below it, especially from an older system to a newer one, without having to rebuild, repackage or deploy anything. And there is already a big, well-known industry behind virtualization infrastructure and technology. - -So what is new with Docker? First of all, Docker is *very lightweight*.
It fits well in modern Unix environments, as it builds upon kernel features like cgroups, LXC and more to separate the runtime environment of the application components from the base OS, drivers and hardware below. But Docker is not Linux-only; there is also movement in the non-Linux part of our world to implement Docker and Docker-related services. Importantly: Docker is not about VMs, it is about containers. Docker as a technology and platform promises a radical shift in perspective. But as I am no authority in this domain, I just refer to a recent article on why Docker is [the biggest disruption in Linux virtualization](http://www.nextplatform.com/2015/11/06/linux-containers-will-disrupt-virtualization-incumbents/). - -### Docker fundamentals - -There was one session that made a deep impression on me. It was the session titled ["Cgroups, namespaces and beyond: what are containers made from"](http://de.slideshare.net/Docker/cgroups-namespaces-and-beyond-what-are-containers-made-from) by Jerome Petazzoni. Jerome showed how Docker builds on and evolves from Linux features like cgroups, namespaces and LXC (Linux containers). Whereas early Docker releases were based on LXC, it now uses its own abstraction from the underlying OS and Linux kernel, called libcontainer. In an impressive demo he showed how containers can be built from out-of-the-box Linux features. The main message I took from this presentation: Docker introduces no overhead in comparison to direct deployment on a Linux system, as the mechanisms used by Docker are inherent to the system - they are in place and in effect even when one uses Linux without Docker. Docker is lightweight, really, and has nearly no runtime overhead, so it should be the natural way to deploy and use non-OS software components on Linux. - -### Docker and Security - -When I started in 2014, one message from IT was: Docker is insecure and not ready for production use, we cannot support it. Indeed there are a couple of security issues related to Docker, especially if the application to deploy depends on NFS, needed to share configuration data and provide a central pool of storage to be accessed by a multi-node system (as HRS is, for reasons of scaling and load balancing). In a Docker container, you are root, and this also implies root access to underlying services, such as NFS-mounted volumes. This is still true today, unfortunately. You will find discussions of this in various groups on the internet, for example in ["NFS Security in Docker"](https://groups.google.com/forum/#!topic/docker-user/baFYhFZp0Uw) and many more. - -But there are big advances with Docker that may make it into the next planned versions. One of them, which I yearn to have, is called user namespace mapping. It was announced at DockerCon in more than one presentation, but I remember it from "Understanding Docker Security", presented by two members of the Docker team, Nathan McCauley and Diogo Monica. The reason why it is not yet final is that it requires further improvements and testing, and so it is currently only available in the experimental branch of Docker. -The announcement can be read here: ["User namespaces have arrived in Docker"](http://integratedcode.us/2015/10/13/user-namespaces-have-arrived-in-docker). The concept of user namespaces in Linux itself is described in the [Linux man pages](http://man7.org/linux/man-pages/man7/user_namespaces.7.html) and is supported by a few up-to-date Linux kernels.
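For illustration, here is a sketch of how the experimental 1.10 builds expose this (an assumption on my part that this is what will ship; check the linked project page for the current state). Remapping is switched on at the daemon level:

```
# Start the Docker daemon with user namespace remapping enabled.
# "default" creates and uses a "dockremap" user whose subordinate
# ID ranges (/etc/subuid, /etc/subgid) back the mapping.
docker daemon --userns-remap=default

# root (uid 0) inside any container is then mapped to an unprivileged
# host uid, so a container's root can no longer touch root-owned files
# on read-only volumes or NFS mounts as the host's root user.
```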
So user namespace support is something for the hopefully near future. See also the known restrictions section in the [GitHub project 'Experimental: User namespace support'](https://github.com/docker/docker/blob/master/experimental/userns.md). - -Another advance in container security is the [Notary project and Docker Content Trust](https://github.com/docker/notary). It was briefly presented at DockerCon, and I would have to dive deeper into this topic to say more about it. Also interesting is the news about support for hardware-based security. To promote it, every participant in one of the general sessions got a YubiKey 4 Nano device, and its use for two-factor authentication with code in a Docker repository was demonstrated in the session. The announcement can be found in ["Yubico Launches YubiKey 4 and Touch-to-Sign Functionality at DockerCon Europe 2015"](http://www.marketwired.com/press-release/yubico-launches-yubikey-4-and-touch-to-sign-functionality-at-dockercon-europe-2015-2073790.htm). -More technical information on it can be read in the blog article [Docker Content Trust](https://blog.docker.com/2015/08/content-trust-docker-1-8/). -See also the [InfoQ article](http://www.infoq.com/news/2015/11/docker-security-containers) and the [presentation](https://blog.docker.com/2015/05/understanding-docker-security-and-best-practices/) from May 2015. - -### Stateless vs. Persistence - -One thing that struck me last year, when I worked on my Docker prototype implementation, was that Docker is perfect for stateless services. But trouble lies ahead, as in real-world projects many services tend to be stateful, with more or less heavy dependencies on configuration and data. This has to be handled with care when constructing Docker containers - and I indeed ran into problems with that in my experiments. - -I had hoped to hear more on this topic, as I guess I am probably not the only one who has run into issues while constructing Docker containers. -Advances in Docker volumes were indeed mentioned. Here I would point to the session "Persistent, stateful services with docker clusters, namespaces and docker volume magic" by Michael Neale. - -### Use Cases and Messages -A contrast to the large number of rather technology-focused sessions was the one held by Ian Miell - author of 'Docker in Practice' - on ["Cultural Revolution - How to Manage the Change Docker brings"](http://de.slideshare.net/Docker/cultural-revolution-how-to-mange-the-change-docker-brings). - -A use case presentation was "Continuous Integration with Jenkins, Docker and Compose", held by Sandro Cirulli, Platform Tech Lead at Oxford University Press (OUP). He presented the DevOps workflow used at OUP for building and deploying two websites providing resources for digitally underrepresented languages. The infrastructure runs on Docker containers, with Jenkins used to rebuild the Docker images for the API (based on a Python Flask application) and Docker Compose to orchestrate the containers. The CI workflow and a demo of how continuous integration was achieved were given in the presentation. It is available on [SlideShare](http://de.slideshare.net/Docker/continuous-integration-with-jenkins-docker-and-compose), too. - -One big message hovered over the whole conference: Docker is evolving ... as an open source project that is based not only on a core team but also heavily on many contributors making it grow and become a success story.
Worth mentioning here is the presentation ["The Missing Piece: when Docker networking unleashing soft architecture 2.0"](http://de.slideshare.net/Docker/the-missing-piece-when-docker-networking-unleashing-soft-architecture-v15), as well as "Intro to the Docker Project: Engine, Networking, Swarm, Distribution", which raised some expectations that were unfortunately not met by the speaker. - -### Session Overview -An overview of the sessions held at DockerCon 2015 in Barcelona can be found [here](https://github.com/ngtuna/dockercon-eu-2015/blob/master/README.md), together with many links to the announcements made, presentations for most sessions on SlideShare, and links to YouTube videos of the general sessions, of which I recommend viewing the one for the [day 2 closing general session](https://www.youtube.com/watch?v=ZBcMy-_xuYk), with a couple of demonstrations of what can be done using Docker. It is entertaining and amazing. - diff --git a/_posts/2015-12-17-letsencrypt.md b/_posts/2015-12-17-letsencrypt.md deleted file mode 100644 index 8b99171..0000000 --- a/_posts/2015-12-17-letsencrypt.md +++ /dev/null @@ -1,178 +0,0 @@ ---- -layout: post -title: Using 'Let's Encrypt' Certificates with Azure -subtitle: Create free valid SSL certificates in 20 minutes. -category: howto -tags: [security, cloud] -author: martin_danielsson -author_email: martin.danielsson@haufe-lexware.com -header-img: "images/bg-post.jpg" ---- - -[Let's Encrypt](https://letsencrypt.org/) is a new Certificate Authority which has a couple of benefits almost unheard of before: It's free, automated and open. This means you can actually use Let's Encrypt to create real SSL certificates which will be accepted as valid by web browsers and others. - -This post will describe how you can leverage a simple [Azure](http://azure.com) Ubuntu VM to create a certificate for any domain you can control yourself (i.e. create a CNAME entry for). This how-to really starts with Adam and Eve, so if you already have an Ubuntu machine you can play with, and which is reachable from the public internet, you can skip the provisioning part on Azure. - -Please note that Let's Encrypt in no way depends on the VM being provisioned on Azure. It's just what I currently have at hand (as I have an MSDN subscription). - -### Prerequisites - -You will need the following things, which this how-to does not provide you with: - -* A valid Azure account and some credit to play with (depending on how fast you are, you will need something between 10 cents and a euro/dollar). -* Know-how on how to create a CNAME (DNS entry) for your machine (I usually delegate this to our friendly IT department, they know that by heart). -* You need to know which DNS name you want for your certificate. I will assume `myserver.contoso.com` throughout this blog post. - -### Provision an Ubuntu VM on Azure - -To start things off, open up the [Azure Portal](https://portal.azure.com) using your favorite web browser and log in so that you have access to the Azure portal. Then click *Virtual machines (Classic)*, then *Add +*. - -{:.center} -![New VM]({{ site.url }}/images/letsencrypt-1-new-vm.png){:style="margin:auto"} - -Then, search for `ubuntu` and select *Ubuntu Server 14.04 LTS* (I think you can choose a regular Ubuntu, too, but this one definitely works). - -{:.center} -![Select Ubuntu]({{ site.url }}/images/letsencrypt-2-select-ubuntu.png){:style="margin:auto"} - -Specify the correct settings for the VM.
I chose the following specs for the VM: - -* Hostname `letsencrypt` (Azure will pick a name for you starting with `letsencrypt`) -* User name `azure` (or whatever you want) -* Choose *Password* and enter a valid password -* Standard A1 (1.75 GB of RAM, 1 core, completely sufficient) -* Add two endpoints: http (tunnel port 80) and https (tunnel port 443). See image below. -* Leave the rest of the settings at the defaults - -{:.center} -![VM Settings]({{ site.url }}/images/letsencrypt-3-vm-settings.png){:style="margin:auto"} - -When you're done and all your settings have been confirmed (*OK*), click the *Create* button to provision your VM. - -**Note**: You will be charged for running the VM on Azure. This is the only cost, though, that you will generate when creating the certificate. - -This will take some time (around 5 minutes), but after that, you will find the information on your machine in the following way: - -{:.center} -![Azure VM Provisioned]({{ site.url }}/images/letsencrypt-4-azure-name.png){:style="margin:auto"} - -The automatically created DNS entry for your machine is displayed there, and this is the name you can use to connect to the machine using your favorite SSH tool (`ssh` if you're on Linux or Mac OS X, e.g. PuTTY if you're on Windows). - -### Set the CNAME to your VM - -Now that you have a running Ubuntu machine we can play with, make sure the desired host name resolves to the Ubuntu VM DNS name from the above picture. Pinging `myserver.contoso.com` must resolve to your Ubuntu machine. - -If you don't know how to do this, contact your IT department or somebody else who knows how to do it. This is highly dependent on your DNS provider, so it is left out here. - -### Setting up your Ubuntu machine for Let's Encrypt - -Now, using an SSH client, log into your machine (using the user name and password you provided when provisioning it). I will assume that your user is allowed to `sudo`, which is the case if you provisioned the Ubuntu machine according to the above. - -First, install a `git` client: - -``` -azure@letsencrypt:~$ sudo apt-get install git -``` - -Then, clone the `letsencrypt` GitHub repository into the `azure` user's home directory: - -``` -azure@letsencrypt:~$ git clone https://github.com/letsencrypt/letsencrypt -``` - -Get into that directory, and call the `letsencrypt-auto` script using the `certonly` parameter. This means Let's Encrypt will just create a certificate, but not install it onto some machine. Out of the box, Let's Encrypt is able to automatically create and install a certificate onto a web server (currently, Apache is supported, nginx support is on its way), but that requires the web server to run on the very same machine. But as I said, we'll just create a certificate here: - -``` -azure@letsencrypt:~$ cd letsencrypt/ -azure@letsencrypt:~$ ./letsencrypt-auto certonly -``` - -This will install quite a few additional packages onto your machine, which is also in part why I prefer to do this on a separate machine. The installation process and creation of the Let's Encrypt environment takes a couple of minutes. Don't get nervous, it will work. - -Using only the default values, you will end up with a 2048-bit certificate valid for 3 months. If you issue `./letsencrypt-auto --help all` you will see extensive documentation of the various command line parameters. The most useful one would presumably be `--rsa-key-size`, which you can use to e.g. create a 4096-bit certificate.
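As a sketch (assuming the client's flags have not changed since writing), the options can also be passed up front, with `--standalone` selecting the built-in web server used for domain validation:

```
azure@letsencrypt:~$ ./letsencrypt-auto certonly --standalone \
    -d myserver.contoso.com --rsa-key-size 4096
```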
### Using Let's Encrypt - -In the first step, Let's Encrypt will ask for an administration email address; this is the email address which will be used if problems occur (which normally doesn't happen). You will only have to provide this address once; subsequent calls to `letsencrypt-auto` will not ask for it (it's stored in `/etc/letsencrypt`). - -After that, you will have to accept the license terms: - -{:.center} -![License Terms]({{ site.url }}/images/letsencrypt-5-terms.png){:style="margin:auto"} - -In the next step, enter the domain name(s) you want to create the certificate for: - -{:.center} -![Domain Name]({{ site.url }}/images/letsencrypt-6-domain-name.png){:style="margin:auto"} - -Usually, you will create one certificate per domain you use. An exception would be, for example, creating a certificate which is valid for both `www.contoso.com` and `contoso.com`, if your web server answers to both. In this case, we will just provide `myserver.contoso.com` (this might be a web service or similar). - -If everything works out, Let's Encrypt will have created the certificate files for you in the `/etc/letsencrypt/live` folder. If you run into trouble, see the section on common problems below. - -### Getting the certificates to a different machine - -In order to get the certificates off the Ubuntu VM, issue the following commands (first, we'll go `root`): - -``` -azure@letsencrypt:~$ sudo bash -root@letsencrypt:~# cd /etc/letsencrypt/live -root@letsencrypt:/etc/letsencrypt/live# ll -total 20 -drwx------ 5 root root 4096 Dec 16 13:50 ./ -drwxr-xr-x 8 root root 4096 Dec 15 14:38 ../ -drwxr-xr-x 2 root root 4096 Dec 16 13:50 myserver.contoso.com/ -root@letsencrypt:/etc/letsencrypt/live# ll myserver.contoso.com -total 8 -drwxr-xr-x 2 root root 4096 Dec 16 13:50 ./ -drwx------ 5 root root 4096 Dec 16 13:50 ../ -lrwxrwxrwx 1 root root 43 Dec 16 13:50 cert.pem -> ../../archive/myserver.contoso.com/cert1.pem -lrwxrwxrwx 1 root root 44 Dec 16 13:50 chain.pem -> ../../archive/myserver.contoso.com/chain1.pem -lrwxrwxrwx 1 root root 48 Dec 16 13:50 fullchain.pem -> ../../archive/myserver.contoso.com/fullchain1.pem -lrwxrwxrwx 1 root root 46 Dec 16 13:50 privkey.pem -> ../../archive/myserver.contoso.com/privkey1.pem -``` - -You should see the four files belonging to the certificate inside the `/etc/letsencrypt/live/myserver.contoso.com` folder. We will tar these up and make sure you can access them (securely) from the outside: - -``` -root@letsencrypt:/etc/letsencrypt/live# tar cfvzh ~azure/keys_contoso_com.tgz myserver.contoso.com/* -myserver.contoso.com/cert.pem -myserver.contoso.com/chain.pem -myserver.contoso.com/fullchain.pem -myserver.contoso.com/privkey.pem -root@letsencrypt:/etc/letsencrypt/live# chown azure:azure ~azure/keys_contoso_com.tgz -root@letsencrypt:/etc/letsencrypt/live# exit -azure@letsencrypt:~$ -``` - -Now you'll have a file called `keys_contoso_com.tgz` in the home directory of the `azure` user. Pick your favorite tool to get the file off the machine, e.g. WinSCP on Windows or `scp` on Linux or Mac OS X machines. - -### Backing up `/etc/letsencrypt` - -If you plan to re-use the settings of Let's Encrypt, please also back up the entire `/etc/letsencrypt` folder and store it in a safe place. - -### Converting to other formats - -In some cases, you can just use the PEM certificate files (e.g. for nginx or Apache). In other cases, you will need to convert these certificate files into a different format, like `PFX`.
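For example, to bundle the key and certificate chain into a single `PFX` file, the standard `openssl` tooling can be used (a sketch; the bundle file name is my own choice, and you will be prompted for an export password):

```
azure@letsencrypt:~$ openssl pkcs12 -export \
    -inkey privkey.pem -in cert.pem -certfile chain.pem \
    -out myserver_contoso_com.pfx
```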
For more information on conversions, please see the following website: [https://www.sslshopper.com/ssl-converter.html](https://www.sslshopper.com/ssl-converter.html).

### Using the certificates

Now you're all set. You can use the certificates on the actual machine you want to use them on. Before you do, make sure the CNAME is set to the correct machine (your web server, web service,...). Depending on the TTL of the DNS entry, this may take some time; your DNS administrator will be able to tell you how long.

**Side note**: I have successfully used certificates created this way with the Azure API Management service to get nicer looking names for my API end points (e.g. `api.contoso.com` instead of `api-983bbc2.azure-api.net` or similar) and for the developer portal (e.g. `https://portal.contoso.com`).

### VM disposal

After you have finished all steps, make sure you power off your virtual machine (using the Azure Portal). In case you want to re-use it for other certificates, just power it off but keep its storage, so that you can power it back on again. This still generates some running cost, but it is almost negligible (a couple of cents per month).

If you want to get rid of all running costs for the VM, delete the VM altogether, including the storage (the Azure Portal will ask whether you want to do this automatically).

### Common Problems

* **Let's Encrypt cannot connect**: Let's Encrypt starts its own little web server which is used to verify that the CNAME actually belongs to the machine. If port 80 and/or 443 are already occupied, this will obviously not work. Likewise, if ports 80 and 443 are not reachable from the public internet (did you forget to specify the endpoints?), Let's Encrypt will also fail.
* **Domain name blacklisted**: If you try to create a certificate for a domain name whose top level domain belongs to one of the larger providers, chances are the request will be rejected (`Name is blacklisted`). This also applies to any machine names directly on Azure (`*.cloudapp.net`). You will need your own domain for this to work.

### Disclaimer

At the time of writing, Let's Encrypt is in public *Beta*, which means I would not recommend using these certificates for production. For testing SSL related things, they may very well be useful anyhow.

Additionally, by default the certificates are only valid for three months. If you need to renew a certificate, you should consider either getting a paid certificate valid for a longer period of time, or actually installing Let's Encrypt on your actual web server. On that machine, you could create a `cron` job to renew the certificate every two months.

diff --git a/_posts/2015-12-8-microservices-devopscon.md b/_posts/2015-12-8-microservices-devopscon.md
deleted file mode 100644
index 812e774..0000000
--- a/_posts/2015-12-8-microservices-devopscon.md
+++ /dev/null
@@ -1,53 +0,0 @@
---
layout: post
title: DevOpsCon 2015 - Is it really about the tools?
subtitle: My opinionated findings from DevOpsCon 2015 in Munich
category: conference
tags: [devops, microservice]
author: Elias Weingaertner
author_email: elias.weingaertner@haufe-lexware.com
header-img: "images/bg-post.jpg"
---

Two weeks ago, I attended the DevOps Conference (DevOpsCon) in Munich. As expected, it turned out to be the Mecca for Docker fans and Microservice enthusiasts. While I really enjoyed the conference, I drew two somewhat controversial conclusions that are open for debate:
1. *Microservices are dinosaurs:* When people spoke about Microservices at the conference, they often were very convinced that Microservices are a new and bleeding edge concept. I disagree. In my opinion, Microservices are a new name for concepts that have been partially known for forty years.

2. *DevOps is not about technology:* Listening to many talks, I got the impression DevOps is all about containers. People somehow suggested that you just need to get Docker + Docker-Compose + Consul up and running to transform into a DevOps shop. Or CoreOS+Rocket. Or Whizzwatz plus Watzwitz. I disagree. Introducing DevOps concepts to your organization is mainly about getting your organizational structure right.

### What is new about Microservices?

To be honest, I really like the design principles of Microservices. A microservice does one particular job (*separation of concerns*). It is cleanly separated from other services (*isolation*), up to a degree where a microservice is responsible for its own data persistence and hence is integrated with a dedicated database.

The rest of the microservice story is told quickly. Divide your system into individual functional units that are available via REST APIs. Label them microservices. Build applications by composing functionality from different microservices. As of 2015, wrapping individual microservices into containers using a Docker-based toolchain makes it super easy to run your microservice-based ecosystem at almost any infrastructure provider, no matter if it is Amazon, Azure, Rackspace, DigitalOcean or your personal hosting company that has installed Docker on their machines. I totally agree that Docker and related technologies help a lot in making Microservices actually work. I also think that wrapping functional units into containers is an exciting pattern for building service landscapes these days. But is it a novel concept? Not at all.

In fact, many concepts that are nicely blended in today's microservices cocktail are more like dinosaurs that have escaped from the Jurassic Park of computer science - not necessarily a bad thing, remembering that dinosaurs are some of the most powerful and interesting creatures that have ever lived on earth. But which concepts am I thinking of?

First of all, micro kernels! The basic idea of micro kernels was to design a modular operating system, in which basic functionalities like device drivers, file system implementations and applications are implemented as services that operate on top of a thin operating system kernel providing basic primitives like scheduling, inter-process communication, device isolation and hardware I/O. In essence, the micro kernel is a general execution context, and not more. All high level operating system functionality, no matter if it is a VFAT driver or a window manager, would operate on top of the micro kernel. And guess what: the operating system works simply because all services on top of the microkernel are cleverly interacting with each other, using an API delivered by the microkernel. The idea of micro kernels was first introduced in 1970 by Brinch Hansen[1], with a lot of research having been carried out in this domain since then. Replace the micro kernel with a container run-time of choice (CoreOS, Docker, Docker plus Docker Compose) - and it becomes clear that Docker can be seen as a microkernel infrastructure for distributed applications, of course at higher abstraction levels.
Another fundamental cornerstone of Microservices as they are considered today are REST APIs. Computer scientists have discussed APIs for decades as well. For example, modern operating systems (OS) like Windows or Linux do a great job in maintaining long-standing APIs that enable loose coupling between software and the OS. While we don't even notice it anymore, this is the reason why we can download pre-compiled software binaries or "Apps" to a computer or a smartphone, install them, and run them. One of the reasons this works like a charm are standardization efforts like POSIX[2] that were carried out long before people even thought about Linux containers.

In the distributed systems domain, we have had a lot of discussions about how to do evolvable interface design over the past 20 years, mostly connected to technologies like Corba, Java RMI, XML-RPC or newer stuff like Apache Thrift, Protocol Buffers and now REST. At their core, the discussions have always tackled the same questions: How can we best version interfaces? Should we version at all? Or simply keep the old interfaces? In the OS domain, Microsoft is a good example: Windows still allows unmodified Win32 software from the mid nineties to be executed on today's versions of Windows - in 2015.

At DevOpsCon, I voiced this opinion during the Microservice Lifecycle workshop given by Viktor Farcic. Many people agreed and also said that we're constantly re-inventing the wheel and struggling with the same questions. We had a nice discussion about how the modern REST and Microservice world is related to SOAP. And this was in fact the motivation to write this article.

### DevOps is not about technology ###

First of all, I am not the first person to make this claim. In fact, at the conference a number of people reported that they needed to adapt their organization's structure to effectively work with Microservices and DevOps concepts.

Many speakers at the conference quoted the famous statement by Melvin Conway from 1967 that is commonly referred to as Conway's Law.

> Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations
>
> Melvin Conway, 1967

As Rainer Zehnle also mentioned, this led me to the assumption that effectively doing Microservices and DevOps somehow doesn't work well in matrix-based organizations. Effectively, matrix-based organizations are often monoliths in which a lot of projects are tightly coupled, due to shared responsibilities of project teams and individuals.

As already mentioned by Rainer in his blog post, I was really impressed by how the folks at Spreadshirt - thank you, Samuel, for sharing this! - restructured their once matrix-based organization that produced a huge monolith into a company that is able to effectively develop a microservice-based enterprise architecture. I hope that success stories like this are not only shared among software architects and developers in the future, as a faster time to market for software artifacts does not only make a developer happy, but also the manager who carries the wallet.

### Conclusion ###

I took a lot from the conference - and I have constantly asked myself afterwards whether we are ready yet for DevOps and Microservices as an organization. Are we? Probably not yet, although we're certainly on the right track. And we're in good company: from many talks at the coffee table I got the feeling that many companies in the German IT industry are in the same phase of transition as we are. How do we get more agile? How do we do microservices? Should we have a central release engineering team? Or leave that to DevOps? I am excited to see which answers we will find at Haufe. We'll keep you updated. Promised.

[1] Per Brinch Hansen. 1970. The nucleus of a multiprogramming system. Commun. ACM 13, 4 (April 1970), 238-241. DOI=[http://dx.doi.org/10.1145/362258.362278](http://dx.doi.org/10.1145/362258.362278)

[2] [http://www.opengroup.org/austin/papers/posix_faq.html](http://www.opengroup.org/austin/papers/posix_faq.html)

diff --git a/_posts/2016-01-11-log-aggregation.md b/_posts/2016-01-11-log-aggregation.md
deleted file mode 100644
index 6f28912..0000000
--- a/_posts/2016-01-11-log-aggregation.md
+++ /dev/null
@@ -1,205 +0,0 @@
---
layout: post
title: Log Aggregation with Fluentd, Elasticsearch and Kibana
subtitle: Introduction to log aggregation using Fluentd, Elasticsearch and Kibana
category: howto
tags: [devops, docker, logging]
author: doru_mihai
author_email: doru.mihai@haufe-lexware.com
header-img: "images/bg-post.jpg"
---

With an increasing number of systems decoupled and scattered throughout the landscape, it becomes increasingly difficult to track and trace events across all systems. Log aggregation solutions provide a series of benefits to distributed systems.

The problems they tackle are:

- Centralized, aggregated view over all log events
- Normalization of log schema
- Automated processing of log messages
- Support for a great number of event sources and outputs

One of the most prolific open source solutions on the market is the [ELK stack](https://www.elastic.co/videos/introduction-to-the-elk-stack) created by Elastic.

{:.center}
![Log aggregation Elk]({{ site.url }}/images/logaggregation-elk.png){:style="margin:auto; width:70%"}

ELK stands for Elasticsearch – Logstash – Kibana: respectively the search engine, log shipper and visualization frontend. Elasticsearch is the nexus for gathering and storing the log data, and it is not exclusive to Logstash.
Another very good data collection solution on the market is Fluentd, which also supports Elasticsearch (amongst others) as the destination for its gathered data. Using the same data repository and frontend, this becomes the EFK stack, and if you do a bit of searching you will discover that many people have chosen to substitute Elastic's Logstash with Fluentd; we will talk about why that is in a minute.

{:.center}
![Log aggregation Efk]({{ site.url }}/images/logaggregation-efk.png){:style="margin:auto; width:40%"}

# Logstash vs Fluentd

Both of them are very capable, have [hundreds](https://www.elastic.co/guide/en/logstash/current/input-plugins.html) and [hundreds](http://www.fluentd.org/plugins) of plugins available and are being actively maintained with corporation-backed support.

### Technology - Fluentd wins

The big elephant in the room is that Logstash is written in JRuby, while Fluentd is [written in Ruby with performance sensitive parts in C](http://www.fluentd.org/faqs). As a result, the overhead of running a JVM for the log shipper translates into large memory consumption, especially when you compare it to the footprint of Fluentd. The only advantage that Logstash can still claim is the good parallelism support the JVM brings, along with very good [Grok](https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html) support.

The only downside of Fluentd used to be the lack of support for Windows, but even that has been [solved](https://github.com/fluent/fluentd/pull/674), and [grok](https://github.com/kiyoto/fluent-plugin-grok-parser) support is also available for Fluentd; you can even re-use the grok libraries you had used or built before, including [Logstash grok patterns](https://github.com/elastic/logstash/tree/v1.4.2/patterns).

### Shippers - Fluentd wins

Both offer the option of deploying lightweight components that only read and send the log messages to a fully fledged instance that does the necessary processing. These are called log forwarders, and both projects have lightweight forwarders written in Go. As of this writing, Elastic has released a replacement for its [logstash-forwarder](https://github.com/elastic/logstash-forwarder) (formerly called Lumberjack), built on top of its new data shipper platform [Beats](https://www.elastic.co/products/beats): it is called [Filebeat](https://github.com/elastic/beats/tree/master/filebeat).

This new Logstash forwarder allows for TLS-secured communication with the log shipper, something the old one was not capable of, but it is still lacking a very valuable feature that Fluentd offers: buffering.

### Resiliency - Fluentd wins

As mentioned previously, Fluentd offers buffering, something you get "for free", and coupled with active client-side load balancing you get a very competent solution without a large footprint.

Logstash, on the other hand, doesn't have buffering; it only has a fixed-length in-memory queue (20 messages), so in case messages can't get through, they are lost. To alleviate this weakness, the common practice is to set up an external queue (like [Redis](http://www.logstash.net/docs/1.3.2//tutorials/getting-started-centralized)) for persistence of the messages in case something goes wrong at either end. They are [working on it](https://github.com/elastic/logstash/issues/2605) though, so in the future we might see an improvement in this area.
Fluentd offers in-memory or file based buffering coupled with [active-active and active-standby load balancing and even weighted load balancing](http://docs.fluentd.org/articles/high-availability), and last but not least it also offers [at-most-once and at-least-once](http://docs.fluentd.org/articles/out_forward#requireackresponse) semantics.

# Additional considerations

Logstash benefits from a more chiselled, mature implementation due to the fact that the core and a lot of the essential plugins are maintained by Elastic. Some may argue that it's easier to deploy a JRE and the Logstash jar and be done with it, while others would consider it overkill to have a JVM running for such a small task - plus the need to deploy and maintain a separate queueing solution.

Fluentd provides just the core and a couple of input/output plugins and filters; the rest of the large number of available plugins are community driven, so you are exposed to the risk of potential version incompatibilities, lack of documentation and missing support.

I have personally seen that there is a bit of chaos, since each plugin creator defines his own set of configuration input variables and there isn't a sense of consistency when you look at different plugins. You will encounter variables that are optional and have different default values, variables that are not properly documented but whose usage you can deduce from the examples the author offers, and virtually all known naming conventions will appear in your config file.

# What next?

Well, as you can probably already tell, I have chosen to go with Fluentd, and as such it quickly became apparent that I needed to integrate it with Elasticsearch and Kibana to have a complete solution. That wasn't a smooth ride, due to two issues:

- Timestamps were sent to Elasticsearch without milliseconds
- All field values were analyzed fields by default

For communicating with Elasticsearch I used the plugin [fluent-plugin-elasticsearch](https://github.com/uken/fluent-plugin-elasticsearch), as presented in one of their very helpful [use case tutorials](http://docs.fluentd.org/articles/free-alternative-to-splunk-by-fluentd).

This plugin allows Fluentd to impersonate Logstash by just enabling the setting `logstash_format` in the configuration file. I snooped around a bit and found that basically the only difference is that the plugin makes sure the message sent has a timestamp field named `@timestamp`.
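To make this concrete, here is a minimal sketch of such an output section; it is not from the original setup. The `host`, `port` and `logstash_format` options are documented by fluent-plugin-elasticsearch, and the buffer settings are standard Fluentd buffered-output parameters (the buffer path is a hypothetical example):

~~~
<match **>
  type elasticsearch
  host localhost
  port 9200
  # send Logstash-style events, including the @timestamp field
  logstash_format true
  # file based buffering, so queued messages survive a shipper restart
  buffer_type file
  buffer_path /var/log/td-agent/buffer/es
  flush_interval 10s
</match>
~~~
{: .language-xml}

With `logstash_format` enabled, events land in daily `logstash-YYYY.MM.DD` indices by default, which is what Kibana expects out of the box.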
And here we arrive at our first problem....

### Timestamp fix

This is a pain because if you want to properly visualize a set of log messages gathered from multiple systems, in sequence, to be able to see exactly which step followed which.....well, you see the problem.

Let's take a look at what Fluentd sends to Elasticsearch. Here is a sample log file with two log messages:

~~~
2015-11-12 06:34:01,471 [ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===
2015-11-12 06:34:01,473 [ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1
~~~
{: .language-java}

A message sent to Elasticsearch from Fluentd would contain these values:

*-this isn't the exact message, this is the result of the stdout output plugin-*

~~~
2015-11-12 06:34:01 -0800 tag.common: {"message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===","time_as_string":"2015-11-12 06:34:01 -0800"}

2015-11-12 06:34:01 -0800 tag.common: {"message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1\n","time_as_string":"2015-11-12 06:34:01 -0800"}
~~~
{: .language-java}

I added the `time_as_string` field in there just so you can see the literal string that is sent as the time value.

This is a known issue; initially it was the fault of Fluentd for not supporting that level of granularity, but it has been [fixed](https://github.com/fluent/fluentd/issues/461). Sadly, the fix has not made its way into the Elasticsearch plugin, and so [alternatives have appeared](https://github.com/shivaken/fluent-plugin-better-timestamp).

The fix basically involves manually formatting the `@timestamp` field to have the format `YYYY-MM-ddThh:mm:ss.SSSZ`. So you can either bring the previously mentioned `fluent-plugin-better-timestamp` into your log processing pipeline to act as a filter that fixes your timestamps, OR you can build it yourself.

In order to build it yourself you only need the `record_transformer` filter, which is part of the core plugins that Fluentd comes with and which I would recommend you use anyway for enriching your messages with things like the source hostname.

Next you need to parse the timestamp of your logs into separate date, time and millisecond components (which is basically what the better-timestamp plugin asks you to do, to some extent), and then create a filter that matches all the messages you will send to Elasticsearch and creates the `@timestamp` value by appending the three components. This makes use of the fact that Fluentd also allows you to run Ruby code within your record_transformer filters to accommodate more special log manipulation tasks.

~~~
<filter **>
  type record_transformer
  enable_ruby true
  <record>
    @timestamp ${date_string + "T" + time_string + "." + msec + "Z"}
  </record>
</filter>
~~~
{: .language-xml}
The result is that the above sample will come out like this:

~~~
2015-12-12 05:26:15 -0800 akai.common: {"date_string":"2015-11-12","time_string":"06:34:01","msec":"471","message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO ==== Request ===","@timestamp":"2015-11-12T06:34:01.471Z"}
2015-12-12 05:26:15 -0800 akai.common: {"date_string":"2015-11-12","time_string":"06:34:01","msec":"473","message":"[ ajp-apr-127.0.0.1-8009-exec-3] LogInterceptor INFO GET /monitor/broker/ HTTP/1.1\n","@timestamp":"2015-11-12T06:34:01.473Z"}
~~~
{: .language-java}

*__Note__: you can use the same record_transformer filter to remove the three separate time components after creating the `@timestamp` field, via the `remove_keys` option.*
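A sketch of how that could look, again not taken from the original post: `remove_keys` is applied after the `<record>` section has been evaluated, so the helper fields are still available when `@timestamp` is built.

~~~
<filter **>
  type record_transformer
  enable_ruby true
  # drop the helper fields once @timestamp has been assembled
  remove_keys date_string,time_string,msec
  <record>
    @timestamp ${date_string + "T" + time_string + "." + msec + "Z"}
  </record>
</filter>
~~~
{: .language-xml}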
### Do not analyse

There are two reasons why you don't want your fields to be analyzed in this scenario:

- It will potentially increase the storage requirements
- It will make it impossible to do proper analysis and visualization on your data if you have field values that contain hyphens, dots or other special characters

Ok, so first, why does it increase the storage requirements?

Well, while researching the proper hardware sizing for our production EFK installation, I stumbled upon [this](http://peter.mistermoo.com/2015/01/05/hardware-sizing-or-how-many-servers-do-i-really-need/) post that goes into detail about what the problem is, why it occurs and how big it can become.

Worst case scenario, you could be using up to **40% more** disk space than you really need. Pretty bad, huh?

The second issue, which becomes apparent much quicker than the first, is that when you try to use Kibana to visualize your data, fields that contain hyphens, for example, will appear split and duplicated when used in visualizations.

For instance, using the record_transformer I would send the hostname and also a statically specified field called `sourceProject`, to be able to group together messages that came from different identical instances of a project application.

Using this example configuration I tried to create a pie chart showing the number of messages per project for a dashboard. Here is what I got.

~~~
<filter **>
  type record_transformer
  enable_ruby true
  <record>
    @timestamp ${date_string + "T" + time_string + "." + msec + "Z"}
    sourceProject Test-Analyzed-Field
  </record>
</filter>
~~~
{: .language-xml}

Sample output from stdout:

~~~
2015-12-12 06:01:35 -0800 clear: {"date_string":"2015-10-15","time_string":"06:37:32","msec":"415","message":"[amelJettyClient(0xdc64419)-706] jetty:test/test INFO totallyAnonymousContent: http://whyAreYouReadingThis?:)/history/3374425?limit=1","@timestamp":"2015-10-15T06:37:32.415Z","sourceProject":"Test-Analyzed-Field"}
~~~
{: .language-java}

And here is the result of trying to use it in a visualization:

{:.center}
![Log aggregation analyzed]({{ site.url }}/images/logaggregation-analyzed-field.png){:style="margin:auto; width:35%"}

I should mention that what you are seeing is the result of six messages that all have the field `sourceProject` set to the value "Test-Analyzed-Field".

Sadly, once you put some data into Elasticsearch, indices are automatically created (by the fluent-plugin-elasticsearch) and mappings along with them, and once a field is mapped as being analyzed [it cannot be changed](https://www.elastic.co/blog/changing-mapping-with-zero-downtime).

Curiously, this did not happen when using Logstash, which made me look into how they are handling this problem. I then discovered that the issue had also been discussed in the context of the fluent-plugin-elasticsearch, and [the solution was posted there](https://github.com/uken/fluent-plugin-elasticsearch/issues/33) along with the request to include it in future versions of the plugin.

And the solution is: when Elasticsearch creates a new index, it will rely on the existence of a template to create that index. Logstash comes with a template of its own that it uses to tell Elasticsearch to create not analyzed copies of the fields it sends to it, so that users can benefit from the analyzed fields for searching and the not analyzed fields when doing visualizations. That template can be found [here](https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/lib/logstash/outputs/elasticsearch/elasticsearch-template.json).

What you basically need to do is a curl PUT with that JSON content to Elasticsearch, after which all newly created indices prefixed with `logstash-*` will use that template. Be aware that with the fluent-plugin-elasticsearch you can specify your own index prefix, so make sure to adjust the template to match your prefix:

~~~
curl -XPUT localhost:9200/_template/template_doru -d '{
  "template" : "logstash-*",
  "settings" : {....
}'
~~~
{: .language-bash}

The main thing to note in the whole template is this section:

~~~
"string_fields" : {
  "match" : "*",
  "match_mapping_type" : "string",
  "mapping" : {
    "type" : "string", "index" : "analyzed", "omit_norms" : true,
    "fielddata" : { "format" : "disabled" },
    "fields" : {
      "raw" : {"type": "string", "index" : "not_analyzed", "doc_values" : true, "ignore_above" : 256}
    }
  }
}
~~~
{: .language-json}

This tells Elasticsearch that for any field of type string it receives, it should create a mapping of type string that is analyzed, plus a nested field with a `.raw` suffix that will not be analyzed.

The `.raw` (not analyzed) field is the one you can safely use in visualizations, but do keep in mind that this creates the scenario mentioned before, where you can have up to 40% inflation in storage requirements, because you will have both analyzed and not analyzed fields in store.

# Have fun

So, now you know what we went through here at [HaufeDev](http://haufe-lexware.github.io/), what problems we faced and how we overcame them.

If you want to give it a try, take a look at [our docker templates on github](https://github.com/Haufe-Lexware/docker-templates); there you will find a [logaggregation template](https://github.com/Haufe-Lexware/docker-templates/tree/master/logaggregation) for an EFK setup plus a shipper that can transfer messages securely to the EFK solution, and you can have it up and running in a matter of minutes.

diff --git a/_posts/2016-01-18-fluentd-log-parsing.md b/_posts/2016-01-18-fluentd-log-parsing.md
deleted file mode 100644
index 11f27f5..0000000
--- a/_posts/2016-01-18-fluentd-log-parsing.md
+++ /dev/null
@@ -1,218 +0,0 @@
---
layout: post
title: Better Log Parsing with Fluentd
subtitle: Description of a couple of approaches to designing your fluentd configuration.
category: howto
tags: [devops, logging]
author: doru_mihai
author_email: doru.mihai@haufe-lexware.com
header-img: "images/bg-post.jpg"
---

When you start to deploy your log shippers to more and more systems, you will encounter the issue of adapting your solution to parse whatever log format and source each system is using. Luckily, Fluentd has a lot of plugins, and you can approach the problem of parsing a log file in different ways.

The main reason you may want to parse a log file, rather than just pass along its contents, is that multi-line log messages are better transferred as a single element than split up into an incoherent sequence.

Another reason would be log files that contain multiple log formats that you want to parse into a common data structure for easy processing.

And last but not least, there is the case where you have multiple log sources (perhaps each using a different technology) and you want to parse them and aggregate all information into a common data structure for coherent analysis and visualization of the data.

Below I will enumerate a couple of strategies that can be applied for parsing logs.

## One Regex to rule them all

The simplest approach is to just parse all messages using the common denominator. This leads to a very black-box type of approach to your messages, deferring any parsing efforts to a later time or to another component further downstream.

In the case of a typical log file a configuration can be something like this (but not necessarily):

~~~
<source>
  type tail
  path /var/log/test.log
  read_from_head true
  tag test.unprocessed
  format multiline
  format_firstline /\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2},\d{3}/
  #we go with the most generic pattern where we know a message will have
  #a timestamp in front of it, the rest is just stored in the field 'message'
  format1 /(?