systems thinking « Thinking for a Change

London, the final frontier

The Space Game

Portia and I have been working on a Real Options game, with the help of Vera. Last Friday, we held our first tryout.

We set up shop in the friendly environment of the Royal Festival Hall with our game props: a galaxy game board, space ships, planets, sweets, maps, stories, news items, stuff to play with… The usual motley, colourful items that signal to participants that this is safe, ‘just’ a game.

Portia told the story behind the game: participants had to fulfill a mission to preserve peace across the galaxy. We explained some of the rules. Participants had to ask us more questions to discover what this simulation was all about.

We played the game in several rounds. In each round, the teams had to plan their move and then execute their chosen move on the game board. Along the way we introduced real options concepts. The players were really ‘in’ the game, fully absorbed. Near the end, they discovered an important concept.

I can’t tell you what, you’ll have to play the game yourself!

After the game, we held a retrospective with the players. This was the most important part of the evening for us, because the first rule of game and session development is:

Tryout, feedback, improve, repeat

If you want to create a great game, session or performance, iteration is essential. You gather your ideas, create a structure, bring in all the props… and then the real work begins. You get valuable feedback from the participants and from your own observations, and you improve the game. And then you do it again. And each time the performance improves.

Yes, release often, iteration, feedback and simplicity are useful for game design too.

And courage… We were a bit nervous. Would the game work? Would the concepts be clear? Would the participants like it? The participants did have fun and learned something. We learned a lot. There’s a lot to improve.

Come out and play

If you want to have fun and learn more about real options you can play the game at Agile North on 26th of April and at XP Days France on 5-6 May. See you there!

A big T-H-A-N-K Y-O-U to David, Daniel, Maria, Sharmila, Matt, Chris and Henry for being such great players and for the excellent feedback.

Improving voter throughput pt. 2

Flow? Traffic jams.

As we saw in the first part of this story, we tried to create flow to keep the constraint resource (the voting booths) utilized at maximum capacity. The process diagram looks quite clean and clear. But look at how it looks when we lay out the process steps on the physical layout of the room:
Voting physical layout

See how the paths of people entering and leaving the booths criss-cross? Because the reader in the box where voters insert their token (step 6) is quite slow (and because this reader has a usability problem we’ll see later), people start to queue in front of the token reader. Not so much flowing as a traffic jam.

Because people come back to the token reader after leaving the booth, they also go back in the direction of the entrance. Our natural instinct is to leave a ‘room’ the same way we entered it. The result is that some people forget to take back their ID (step 8) and most people try to leave via the entrance, trying to force their way through the queue that’s forming there.

The designers of these polling rooms should read pattern 131 “The flow through rooms” from Christopher Alexander’s A Pattern Language: “The movement between rooms is as important as the rooms themselves; and its arrangement has as much effect on social interaction in the rooms, as the interiors of the rooms”. Move that token reader and the returning of the ID to the extreme left and near the exit. The paths of people entering and leaving polling booths will intersect less. After inserting the token, people will more likely move to step 8, because it is nearer than any other potential activity. Because the exit is now in sight (and the entrance isn’t), they will naturally exit the room via the exit.

The usability (or lack thereof) of cards

The voting computers record votes on credit-card-like cards with a magnetic stripe. To operate the voting computers, voters insert the cards horizontally in the computers, magnetic stripe at the bottom, much as they would do at an ATM. I don’t think anybody had any problem with that.

After voting, the cards are inserted in a sealed box with a card reader. The reader verifies the validity of the card and increments a counter. The reader is placed horizontally, not vertically like an ATM mounted in a wall.

How would you insert the card?

Below is a picture of both sides of the card. How would you insert the card in the reader? Like the card on the left or the card on the right? There are two more permutations, but these two are the most used ones. Take a moment to decide. Don’t look at the solution below 🙂

Voting cards
No peeking!

The correct answer is… the card on the left. The option chosen by the large majority of users is the card on the right, despite the fact that there is a drawing on the box depicting the image on the left. Horizontal card readers like this one are used frequently at payment terminals, but in that case you have to insert the card with the image facing yourself, just like the card on the right. Card users know these conventions:

In a vertical (ATM style) reader: magnetic stripe below, image above
In a horizontal (payment terminal) reader: magnetic stripe rear, image front

This card reader defies convention (just after the voting computer has conformed to the convention) and users are confused.

Stop and go at the slow reader

The card reader is quite slow. If you try to insert the next card too quickly after the previous one, the card gets stuck. The first instinct of the voters is to push harder. This doesn’t work. The only thing that works is to take the card back and wait until the reader is ready. How do you know the reader is ready? After a while, we knew that the reader emitted a faint “whirr-click” when it had processed one card and was ready for another card. The best solution would be to have a faster reader. If that’s not possible, give the user some indication when the reader is ready, for example with a small red and green light. Neither of these options was available to us, so we had to spend some time regulating access to this mini-bottleneck.

Keep it flowing

Most of the work throughout the day is about the same in size: verify and register one person or a group of two or three people and let them vote. Sometimes, the work is larger and takes longer. For example, when someone acts as a proxy, their documents have to be verified and filed before they are allowed to vote. In these cases, the person is immediately taken out of the line at step 2 and put into a waiting buffer while verifications are performed. Meanwhile, the other (standard) voters get processed as usual. As soon as the verification is performed, the voter is returned to the head of the queue and their (now standard) request is treated. Except for step 2, everybody only deals with small, standard tasks.

Being interruptible, so you don’t have to be

Of course, someone has to verify those proxy documents. Someone has to answer questions or help people who have problems. All of those interrupt-driven tasks are performed by the team leader. The team leader doesn’t perform any of the tasks in the process and is therefore always available to be interrupted. If a new interrupt arrives when the team leader is already servicing an interrupt, an idle team member handles the interrupt or the interrupt is queued for the team leader.

An interruptible team member with plenty of slack. Every team should have (at least) one!
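The dispatch rule described above can be written down as a small sketch. The function name `dispatch`, its parameters and the sample names are hypothetical, chosen only for illustration; the rule itself (team leader first, then an idle team member, otherwise queue for the team leader) is the one from the text.

```python
from collections import deque


def dispatch(interrupt, leader_busy, idle_members, leader_queue):
    """Decide who handles an interrupt.

    Returns "team leader", the name of an idle team member,
    or "queued" when the interrupt has to wait for the team leader.
    """
    if not leader_busy:
        return "team leader"
    if idle_members:
        # Take the first idle member; the caller's list is updated in place.
        return idle_members.pop(0)
    leader_queue.append(interrupt)
    return "queued"


waiting = deque()  # interrupts waiting for the team leader
print(dispatch("proxy documents", False, ["Ann"], waiting))
print(dispatch("question about the rules", True, ["Ann"], waiting))
print(dispatch("lost invitation", True, [], waiting))
```

The people working in the process never appear in this rule unless they are idle, which is the point: they are only interrupted when it costs nothing.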

Are we having fun yet?

Are you the kind of person that thinks this is a fun way to spend a Sunday morning? Are you the kind of person who thinks “Hmmm… How could I make this go faster? How could I avoid these errors? How could I make this more fun?” at work, in the checkout queue in the supermarket or when you’re mowing the lawn? If you are or would like to be like that, join Rob Westgeest and me at the “I’m not a bottleneck! I’m a free man!” simulation at the Agile Business Conference on 2 and 3 October 2007 in London. Or join us at the XP Days Benelux 2007 or any other fine agile event in your neighbourhood.

Improving voter throughput pt. 1

Poll Day

Today, Belgians voted for the Federal parliament. This year, for the first time, I was chosen as part of a five-person team to man the polling station. So, I spent most of Sunday thinking about and implementing process improvement. This Lean and Theory of Constraints stuff is addictive!

Step 1: What is the Goal?

We want to reduce the waiting time, the time people spend in the queue before being able to cast their vote. The queue length is easy to see, which gives us Visual Control.

Step 2: Where is the bottleneck?

Most polling stations use computers; some use paper ballots. The computer-based voting process has the following steps:

1. Voter presents their ID card and their invitation to vote.
2. Team verifies that the voter has come to the correct polling station and that the invitation matches the ID.
3. Team verifies that the voter is on the register and registers their presence.
4. Voter receives their voting token, a card with a magnetic stripe.
5. Voter chooses their candidates for the lower House and the Senate (× 5 polling booths).
6. Voter inserts their voting token (the magnetic stripe contains their choice) into a sealed box. The token is validated and counted during insertion.
7. A second team member verifies again that the voter is on the register and registers their presence.
8. Voter receives back their ID card and their invitation, now stamped to indicate that they have voted.
9. Voter leaves.

It was quickly apparent that the voting booths (thick red lines) were the bottleneck: all the booths could be full while everybody on the team still had idle periods, even when there was a queue. We had a ‘bottleneck in waiting’ in step 3 (thin red line): when a lot of people showed up after a slow period (and the booths were empty), voters had to wait for step 3 to get to the booth.
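One way to make this reasoning explicit is to compare the hourly capacity of each step. The sketch below is only an illustration: the times per voter are invented (no measurements were taken on the day), and the names `steps`, `capacity_per_hour` and `rank_by_capacity` are hypothetical. Only the five booths come from the process description.

```python
# Invented figures: (seconds per voter, number of parallel stations) per step.
steps = {
    "2 check invitation and ID": (20, 1),
    "3 check register": (30, 1),
    "4 hand out token": (5, 1),
    "5 vote in booth": (180, 5),
    "6 insert token": (10, 1),
}


def capacity_per_hour(seconds_per_voter, stations):
    """Voters per hour a step can handle when it is never starved."""
    return 3600 / seconds_per_voter * stations


def rank_by_capacity(steps):
    """Step names ordered from lowest to highest capacity."""
    return sorted(steps, key=lambda name: capacity_per_hour(*steps[name]))


ranking = rank_by_capacity(steps)
print("bottleneck:", ranking[0])
print("bottleneck in waiting:", ranking[1])
```

With these made-up numbers the booths come out lowest and step 3 second lowest, which matches what we observed; with different numbers the ranking could of course change.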

Step 3: exploit the constraint

So, if the booths are the bottleneck, we will improve throughput by exploiting the booths: try to keep the booths filled at all times. As soon as someone left a booth, the next person in line received a voting token and could enter the booth. During busy periods (of course, voters don’t arrive evenly spaced out over the day), when there was a queue, we achieved very high utilization of the constraint resource.

Step 4: subordinate to the constraint

No problem for the team members behind the bottleneck. They could work at a sustainable pace, as people came out of the booths.

The three people performing steps 2-3-4 worked at the rhythm of the booths. A one-person buffer was created before the bottleneck, so that there was always someone ready to enter a vacated booth. Another one-person buffer was created before step 3, the one task that might starve the bottleneck.

The voting computers are not very dependable. They regularly break down, leading to long queues and in a few cases to delays in voting. Another subordination is to have plenty of people on site or on call who can fix the computers. One of the five computers broke down twice during the day. Whenever that happened, queues lengthened, reinforcing our assessment that the voting booths/computers were the bottleneck. These breakdowns accidentally “lowered the water level to expose the rocks”. Luckily, in both cases, the machines were ‘fixed’ (or just rebooted?) quickly.

Fixing quickly is nice; avoiding problems is better. What are the root causes of these breakdowns and installation errors? In one polling station the computers didn’t work, because “the cables were connected incorrectly”. Is there a way to detect this kind of problem quickly (Jidoka) or to avoid wrong connections (Poka-Yoke)?

One-family flow

There are two one-person buffers, before the two bottlenecks. The rest is flow, following the “flow where you can, pull where you must” adage. We found a way to reduce the load on the minor bottleneck by increasing the first buffer to one family. The electoral register is sorted by address. Therefore, it is very easy to check off multiple people living at the same address. Thus, the person performing step 2 takes on a bit more work (not being the bottleneck, there’s capacity to spare): look ahead in the queue so that people living at the same address are presented together for verification. Thus, the buffer before step 3 is family-sized, the buffer before the booths is still one person. With this improvement of step 3 in place, we could fill the booths up more quickly after a lull, and thus bottleneck utilization went up again.
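The look-ahead at step 2 amounts to batching the queue by address. A minimal sketch, assuming the person at step 2 can see the whole queue as a list of (name, address) pairs; the function name `batch_by_address` and the sample data are hypothetical.

```python
def batch_by_address(queue):
    """Group voters who share an address.

    Batches come out in the order in which each address first
    appears in the queue, so nobody's household is pushed back.
    """
    batches = {}  # dicts keep insertion order in Python 3.7+
    for name, address in queue:
        batches.setdefault(address, []).append(name)
    return list(batches.values())


queue = [
    ("An", "Kerkstraat 1"),
    ("Bart", "Dorpsplein 4"),
    ("Cis", "Kerkstraat 1"),
]
for batch in batch_by_address(queue):
    print(batch)
```

In practice the look-ahead only covers the few people standing near the front, who usually arrived together anyway, so nobody really jumps the queue.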

Step 5: elevate the constraint

Most people use the voting computers without problem. I found them to be a little slow. However, some people have trouble using the computers. This is probably the one time in the year that they use a computer. The display tries to mimic the paper and pen-based voting process, but the UNDO and CONFIRM buttons seemed to confuse people: “Why should we confirm our choice? We’ve already made our choice with the pen”. The mixed metaphor confuses: it’s like paper-based voting (mark your choice with the electronic pen), with which voters are familiar, combined with the undo-ability of an electronic form, with which many voters are not familiar. Of course, we couldn’t do any detailed user interaction studies, what with the confidentiality of the vote 🙂

The voting rules were another cause of confusion and questions. You can vote for a party and primary and subordinate candidates. Which options can you combine, which combinations are illegal? Paper-based voting has the same problem. There were explanations in most media before the vote, but I believe a reminder in the polling station would be useful.

Of course, we could always install more booths. That would be expensive and wouldn’t give us much: most of the time there were no queues or queues with only a few people waiting. Waiting time was never more than 5 minutes between entering the polling station and entering the polling booth. I can’t give you exact numbers (only “most”, “some”, “a few”…), because it’s hard to be participant and observer of a process at the same time.

Step 6: And again!

People arrive randomly at the polling station. Very busy moments alternated with idle time. There was no predictable “takt time”. We made good use of this “feature”. During “slow moments”:

team members took breaks. Because the two sub-groups in the team (before and after the bottleneck) made sure that everybody could perform all the tasks of their sub-group, this was never a problem. The other team members could still perform all the work, albeit slightly slower.
we performed lower-priority “book-keeping” and administration chores.
we looked at the process and discussed improvements. And so we kept on improving.

We have applied the Theory of Constraints and a few Lean techniques. Are we satisfied now? No. Lean people are never satisfied! Next time we’ll see more ways to improve voter throughput.

Agile Open: Day Two

Agile Open. Day Two.

We planned to start day two with a re-planning session: look at the plan we made yesterday and adapt where necessary, based on new information. Raphael then asked whether we shouldn’t lean more towards the “open space” idea, instead of having planned sessions. We decided to have a planned session in one track (the XP Game) and an unplanned session in the other track.

That meant I got to play an XP Game for the first time, after having hosted so many runs. I discovered I’m crap at inflating and putting knots in balloons. Halfway through the game I had to leave for the next session. This gave my team the opportunity to learn what you do when 1/4 of your team leaves: you reduce your velocity to 3/4 of what you produced in previous releases.

Metrics and Throughput Accounting

I proposed this session to get some input. Throughput Accounting has basically three important variables, expressed in monetary terms for convenience:

Throughput = fresh money coming in from sales
Operating Expense = money going out to keep the company going. Once spent, the money is gone (wages, energy, rent…)
Investment = money that must be put in to be able to generate value. This is the trickiest category to explain.

Of course, time is also involved. The longer the time it takes to generate throughput, the higher the investment will be. To emphasize the fact, I keep time as a separate variable. All these variables are pretty easy to measure at the company level. We want to align the work we do with an improvement in these company metrics: increase throughput or decrease time, investment or operating expense.

The goal of the session was to find some metrics or indicators at an individual (IT) project level. We brainstormed some potential metrics for each of the 4 throughput accounting variables.

The throughput accounting variables and formulas are very simple. The only problem is that all the variables are interrelated. If you change one component of I, it’s going to have an effect on t, T and OE. And vice versa. You can’t really create a mathematical model of a project, but you can apply systems thinking. The advantage of methodologies with short iterations or releases is that you shorten the feedback loops, thus making it easier to see the result of your action and react in time.
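The formulas themselves are not written out above. In the throughput accounting literature they are usually given as net profit = T − OE and return on investment = (T − OE) / I. A minimal sketch, with hypothetical function names and invented figures that do not come from any real project:

```python
def net_profit(throughput, operating_expense):
    """Net profit over a period: money coming in minus money spent to keep going."""
    return throughput - operating_expense


def return_on_investment(throughput, operating_expense, investment):
    """Net profit for the period relative to the money tied up in the system."""
    return net_profit(throughput, operating_expense) / investment


# Invented yearly figures, purely for illustration.
T, OE, I = 120_000, 90_000, 150_000
print("net profit:", net_profit(T, OE))
print("return on investment:", return_on_investment(T, OE, I))
```

Time does not appear as a separate term in these two formulas; as noted above, a longer time to generate throughput shows up as a higher I.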

We didn’t come to a conclusion. I’ll have to do some more thinking about it. Expect some throughput accounting posts in the near future…

Agile Open: Thinking for a Change

Thinking for a change

Marc Evers and I hosted a “Thinking for a Change” session. Not the whole session, just the “Current Reality Tree” to discover root causes of problems participants brought to the session.

Team 1: the build keeps failing

On the left, the first group. They analyzed a problem where a team had a failing build for days on end and didn’t do anything about it.

You have to wonder why they put the automated build and tests in place. What was their goal?

Team 2: balance between work and life

The second team, on the right, analyzed the root causes of a situation where the balance between work and life wasn’t right. This was due, amongst other things, to long travel times between home and workplace. We had a similar situation during the session at SPA 2006.

You can use the thinking tools for other things than technical problems.


Team 3: getting two teams to work together

The third team analyzed a situation where development and operations teams didn’t work well together.

This situation was quite similar to the one I present to explain the technique, except that my example talks about one team and this situation was about the whole company. I managed to resolve the problem for one team. It will take a bit more thinking and effort to solve it for the whole company.

Finding the root of all evil

The teams found some potential root causes for their problems, but they needed a bit more time than the 90 minutes we spent today. I hope the three “customers” got a bit more insight into their situation. I find the Current Reality Tree a very useful tool to concentrate on a problem. All too often, we jump directly to a solution, before we really understand the problem. The CRT steps force me to go slowly and study the problem in depth, before I can start to think about solutions, with the Future Reality Tree.

I use the Thinking Processes every day, together with the other people who are involved in the situation. I don’t need to explain the technique, we just do it. All it takes is a piece of paper, a pen and a few people who want to solve a problem.