The Log Pipeline You Can't Have
I've been working on the logging system for some of our products, and it brought to mind an article I read years ago: Avery Pennarun's The log/event processing pipeline you can't have.
At Pennarun's workplace, application logs are written directly to the
kernel ring buffer via /dev/kmsg so they survive crashes. A log uploader
ships them upstream as plain text files, and only then are they parsed,
structured, and processed. A few of our services take a similar approach,
using write-ahead logs to prevent data loss.
In a world of containers - apart from the single log file to be shipped - we
pretty much have Pennarun's tooling already. Kubernetes writes container logs
under /var/log/pods; Docker keeps them in /var/lib/docker/containers/. For our
simpler projects, we just ship a single static binary, no containers needed.
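For the Kubernetes case, the kubelet lays files out as /var/log/pods/&lt;namespace&gt;_&lt;pod&gt;_&lt;uid&gt;/&lt;container&gt;/&lt;restart&gt;.log, so a shipper's discovery step can be little more than a glob. A minimal sketch (the function name and the `root` parameter are mine, for illustration):

```python
import glob
import os


def find_container_logs(root: str = "/var/log/pods") -> list[str]:
    """Enumerate Kubernetes container log files under the kubelet's
    directory layout: <root>/<ns>_<pod>_<uid>/<container>/<restart>.log
    """
    return sorted(glob.glob(os.path.join(root, "*", "*", "*.log")))
```

A real shipper does more (tracking offsets, handling rotation), but the point stands: the raw files are just sitting there on disk, ready to be forwarded as-is.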
For systemd services you get Pennarun's pipeline with almost no effort. Your
process writes to stdout, journald captures it to a durable on-disk store, and
a shipper like journalbeat or promtail forwards the raw entries to a
central system where they are parsed and structured. The only thing you lose
compared to /dev/kmsg is kernel-panic survivability - journald runs in
userspace - but for typical server workloads that rarely matters.
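Concretely, that pipeline takes almost no unit-file configuration. A minimal sketch for a hypothetical service (`myapp` is a placeholder; the Standard* lines are systemd's defaults anyway, shown here only for clarity):

```ini
[Unit]
Description=My app (example)

[Service]
ExecStart=/usr/local/bin/myapp
# These are the defaults on modern systemd, made explicit:
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
```

Once it's running, `journalctl -u myapp -o json` shows the structured records a shipper like promtail would forward upstream.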
As always, I appreciate any feedback. If you want to reach out, I'm
@neuralsandwich on Twitter and most other places.