How GOV.UK Reduced their Incidents and Alerts


By David Mytton,
CEO & Founder of Server Density.

Published on the 25th October, 2016.

Did you watch last week’s HumanOps video—the one with Spotify? How about the one with Barclays?

Keep reading, gentle reader; this is not some Friends episode potboiler joke. We just can’t help getting pumped up about all the amazing HumanOps work that’s happening out there. Independent third-party events are now taking place around the world (San Francisco and Poznan most recently).

So we decided to host another one closer to home in London.

The event will take place at the Facebook HQ (get your invite). And for those of you who are not around London in November, fear not. We’ll fill you in right here at the Server Density blog.

In the meantime, let’s take a look at the recent GOV.UK HumanOps talk. GOV.UK is the UK government’s digital portal. Millions of people access GOV.UK every single day whenever they need to interact with the UK government.

Bob Walker, Head of Web Operations, spoke about their recent efforts to reduce their incidents and alerts (a core tenet of HumanOps). What follows are the key takeaways from his talk. You can also watch the entire video, or download the talk in PDF format and read it in your own time (see right below the article).

GOV.UK does HumanOps

After extensive rationalisation, GOV.UK have reached a stage where only 6 types of incidents can alert (wake them up) out of hours. The rest can wait until the next morning.
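To make that policy concrete, here is a minimal sketch of out-of-hours alert routing. It is not GOV.UK's actual configuration: the talk summary does not name the six alert types, so the type names below are placeholders, and the working hours, the in-hours behaviour, and the function names are all assumptions made for illustration.

```python
from datetime import datetime, time

# Placeholder names: the article does not list the six pageable alert types.
OUT_OF_HOURS_PAGEABLE = frozenset(
    {"type_1", "type_2", "type_3", "type_4", "type_5", "type_6"}
)

# Assumed working hours (Monday to Friday, 09:00 to 18:00); not from the talk.
WORK_START = time(9, 0)
WORK_END = time(18, 0)


def is_in_hours(now: datetime) -> bool:
    """Return True on weekdays between WORK_START and WORK_END."""
    return now.weekday() < 5 and WORK_START <= now.time() < WORK_END


def route_alert(alert_type: str, now: datetime) -> str:
    """Decide what to do with an alert raised at `now`."""
    if is_in_hours(now):
        # Assumption: during the working day the team sees every alert.
        return "notify_team"
    if alert_type in OUT_OF_HOURS_PAGEABLE:
        # Only the short allow-list is permitted to wake someone up.
        return "page_on_call"
    # Everything else waits until the next morning.
    return "hold_until_morning"
```

The point of keeping the allow-list explicit and short is that adding a seventh pageable type becomes a deliberate decision rather than a default.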

GOV.UK mirrors their website across disparate geographical locations and operates a managed CDN at the front. As a result, even if parts of their infrastructure fail, most of their website should remain available.

Once issues are resolved, GOV.UK carries out incident reviews (their own flavour of postmortems). In reiterating the importance of blameless postmortems, Bob said:

Every Wednesday at 11:00 AM they test their paging system. The exercise not only tests their monitoring system but also ensures people have configured their phones to receive alerts!
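A recurring test like this is easy to schedule. The sketch below works out when the next Wednesday 11:00 test falls; the function name is hypothetical, and it uses naive local datetimes, so time zones and daylight saving are deliberately ignored.

```python
from datetime import datetime, timedelta

TEST_WEEKDAY = 2  # Monday is 0, so Wednesday is 2
TEST_HOUR = 11    # 11:00 AM, as described in the talk


def next_paging_test(now: datetime) -> datetime:
    """Return the first Wednesday 11:00 strictly after `now`."""
    days_ahead = (TEST_WEEKDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=TEST_HOUR, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # This week's slot has already passed (or is right now): use next week's.
        candidate += timedelta(days=7)
    return candidate
```

A scheduler could call `next_paging_test(datetime.now())` to decide when to send the test page, then check that everyone on the rota acknowledged it.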

Want to find out more? Watch Bob Walker’s talk. And if you want the full transcript, go ahead and use the download link right below this post.

See you at a HumanOps event!

Take the GOV.UK talk with you

Want to read at your own pace? Download the talk in PDF format.
