Jul 21

The Paper Process

I’m sorry, I can’t help you because the system is down

How many times have you heard this apology? Often, the apology is “The system is down again.” In some places the system is always down around certain times and you get the same apology every day.

Why are we surprised when systems are unavailable? There will be bugs, mistakes and scheduled downtime. We should not be surprised that the system is down; we should expect it.

Some time ago, an IT architect asked me during a job interview if it was possible to build reliable systems out of unreliable parts. My response:

The only way to build a reliable system is to build it out of unreliable parts (or systems).

If you want to build a reliable system, you have to be aware that all its subsystems are unreliable. This allows you to take appropriate measures, like building in redundancy.
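To put a rough number on what redundancy buys, here is a minimal sketch. It assumes the redundant parts fail independently of each other, which is rarely quite true in practice (shared power, shared network, shared bugs), so treat the result as an upper bound. The function name `combined_availability` and the 90% figure are illustrative assumptions, not taken from any real system.

```python
def combined_availability(part_availability: float, copies: int) -> float:
    """Chance that at least one of `copies` redundant parts is up.

    Assumption: parts fail independently, so the chance that all of
    them are down at once is (1 - part_availability) ** copies.
    """
    if not 0.0 <= part_availability <= 1.0:
        raise ValueError("part_availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - part_availability) ** copies


if __name__ == "__main__":
    # Hypothetical part that is available 90% of the time.
    for copies in (1, 2, 3):
        print(copies, round(combined_availability(0.9, copies), 6))
```

Under these assumptions, each extra copy of a 90% part removes roughly nine tenths of the remaining downtime. The result never reaches 100%, which is why a fallback outside the IT systems is still worth having.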

We need a paper process

A smart manager I once worked with insisted that every business-critical automated process also have a backup “Paper Process”. The Paper Process defined how the work would be done when (not if) IT systems were down, using nothing more complicated than pen and paper (and a battery-powered calculator if really needed). When systems were unavailable, everybody knew what they had to do to keep the business ticking over as if nothing had happened. They also knew how to catch up when the systems were available again. In this case it was clear who the bottlenecks were, so everyone else subordinated their own work to the bottlenecks by entering the backlog of data on paper into the system. Did I mention that this department used less automation than other similar departments, yet had a better track record of delivering on time, was more efficient and brought in more money?

Defining an alternative Paper Process was relatively easy because we really understood the real business requirements of our customer. Since then I have used this as a test of my understanding of the real requirements: could I implement them with nothing more than paper and pencil? If you have real requirements (and not a solution in disguise), it’s easy to define several different implementations.

Since that project I have learned more about processes. If I had to do this project again, I would do it the other way round: implement the Paper Process first, then ask what, if anything, we would need if the Paper Process broke down. Maybe a whiteboard is enough.