Friday 6 June 2008

Rack Mount 1, Technician 0

Failure is coming to get you, but we're getting better at predicting the scenarios and coding for them.  We think a lot about servers dying, lost network connectivity, power cuts, and how to respond to critical bugs.  These are all essentially unexpected technical events, but there is a whole other category at play in real life - human error.

Imagine this server has your data on it...

What will your customers see while that gets put back together?  How are you going to get the data back?

Despite the comedy value of that clip, this is exactly the sort of thing that happens in real life - people make mistakes.  And even when maintenance like this is less clownishly executed, it still needs to happen - so you need to decide what effect you're going to let planned maintenance events have on your revenue stream.

