Logging for fun and profit
Once you have more than a few servers and applications running, handling all those logs is going to become a problem. Not in the sense of managing rotation – syslog (or the app itself) will deal with that – but rather in making use of the data; otherwise what’s the point of logging in the first place?
So why log stuff? There are probably 3 main use cases depending on where the logs are coming from:
- System logs – the likes of /var/log/secure, which store output from the OS that is useful after some event: checking login attempts or figuring out why a process was killed by the OOM killer.
- 3rd party application logs – such as Apache or MongoDB. These are useful when you’re setting things up or tracking down a specific problem. Is anyone still using the Apache access log for traffic analytics (file downloads maybe, but not real users)?
- Your application logs – when stuff is happening in your own code. For example we log each request and response we send and receive for our website monitoring service.
The first 2 cases are fairly obvious, but it’s the last case – logging what your own code is doing – that has surprised us with how useful it is, particularly for providing technical support to customers. It’s easy to understand why you might want to log errors in your code, and even verbose debug output when you’re testing, but over the last few weeks we’ve significantly increased our log output specifically to help with customer support requests.
We stream all our logs to Papertrail and give our engineers access so they can see what is happening now and in the past without needing access to production servers. Our logs are well formatted to allow us to do granular searches on whole accounts or specific items within an account. For example you can monitor multiple websites on a single account and we identify each website with a token that can be searched.
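As a rough sketch of what searchable log lines could look like, here is one way to attach an account and website identifier to every message using Python's standard logging module. This is not the format we described above: the field names (account, website), the example tokens and the logger name are all made up for illustration.

```python
import logging


def format_fields(**fields):
    """Render fields as space-separated key=value pairs in a stable order."""
    return " ".join(f"{key}={value}" for key, value in sorted(fields.items()))


class ContextAdapter(logging.LoggerAdapter):
    """Prefix every message with the identifiers stored in self.extra."""

    def process(self, msg, kwargs):
        return f"{format_fields(**self.extra)} {msg}", kwargs


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    # Hypothetical identifiers; a real system would use its own tokens.
    log = ContextAdapter(
        logging.getLogger("monitor"),
        {"account": "acc_123", "website": "web_456"},
    )
    log.info("check started url=https://example.com/")
```

Because every line carries the same key=value prefix, searching for website=web_456 finds one monitored site, and searching for account=acc_123 finds everything on the account.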
A common support request is debugging why we’re getting a certain response from a URL. With our logging we can see all the requests going out and the full response we get back (minus the content, which can be quite large and isn’t useful to log because we already store it and can look it up separately). All we have to do is search on the ID, and we get the logs from all our remote monitoring nodes via a simple web UI.
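To make the idea concrete, here is a minimal sketch of logging a response without its content and then pulling out the lines for one website. The function names, field names and tokens are hypothetical, and the filter is only a local stand-in for the search a hosted log service gives you; it does not use Papertrail's API.

```python
import json
import logging

log = logging.getLogger("monitor.checks")  # hypothetical logger name


def response_log_line(website, node, url, status, headers, body):
    """Build one log line describing a response, omitting the body itself."""
    return (
        f"website={website} node={node} url={url} status={status} "
        f"body_bytes={len(body)} headers={json.dumps(headers, sort_keys=True)}"
    )


def lines_for_website(lines, website):
    """Local stand-in for a log search: keep lines tagged with one token."""
    needle = f"website={website} "
    return [line for line in lines if needle in line]


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    log.info(
        response_log_line(
            "web_456",
            "node-a",
            "https://example.com/",
            503,
            {"Content-Type": "text/html"},
            b"<html>...</html>",
        )
    )
```

Only the size of the body is recorded, so the log stays small while still showing the status and headers each node saw.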
This is interesting because you probably think of logging as something that is only useful to sysadmins or developers. In fact, if you structure the logs well and/or train support engineers to read the output, it will reduce the number of cases that have to involve product engineering or operations teams, and customers will get answers to their questions faster.
It seems obvious but until you actually help a customer without needing to log into a server, it’s difficult to appreciate how useful it really is!