Introducing Alert Costs

When an engineer is working on a complex task, the worst thing you can do is expose them to random alerts. Here is how to mitigate that with Alert Costs.



Diversity is Good Business. Here is Why

We wanted to address this now, while our company culture was in its formative years. Here is why diversity is good business, and what we've done so far.


Website Monitoring with Nagios

Why not use someone else’s global infrastructure to automatically perform your web checks? Here is how you can do website monitoring with Nagios.


How Spotify and GOV.UK handle on call, and more

Francesc Zacarias, SRE engineer at Spotify, and Bob Walker, Head of Web Operations at GDS GOV.UK, spoke about their on-call approach.



How to Monitor Redis

What follows is our very own checklist of best practices for Redis, including key metrics and alerts we use to monitor Redis at Server Density.


How to Write Service Status Updates

The lowly service status update is an essential piece of communication. Here's how we write status updates that send the right message to our customers.


Launching Sparklines for iOS

Unlike computers, most humans aren’t great at assimilating raw data. To understand, remember, and respond emotionally to data, we need context.



HumanOps: Making Operations Human

HumanOps starts from a basic conviction, namely that technology affects the wellbeing of humans just as humans affect the reliable operation of technology.

