An oxygen canister was dented. Tests showed it wouldn't drain properly. Had time and spare parts permitted, staff would have replaced the canister. Instead, they resorted to a workaround, using the internal heater, which seemed to function perfectly. The rush to stay on schedule blinded everyone to the risk -- until the canister exploded on Apollo 13.
Workarounds affect all sorts of operation and maintenance procedures. They usually play an important role in process startups and commissioning.
The workaround has become synonymous with good engineering practice -- but isn't. In an ideal world, an engineering team should address a problem as soon as it surfaces, not rely on a workaround.
What makes a workaround so dangerous? Complacency, for one thing. Complacency leads to re-using old, perhaps untried or obsolete, standard operating procedures (SOPs), and adapting others where they don't exactly apply. Over-familiarity prompts people to relax around inherently lethal equipment.
Exhaustion also can make workarounds dangerous. Long hours wear down even your best people. You don't want dead-tired troops on their own writing a workaround for a process problem; someone with unreddened eyes should look it over and ask, if necessary, "You want to do what?"
That brings up my next cause: poor management. Recently, an engineer signed -- without reading -- an operational change request (OCR) I put in that involved a workaround. He told me he didn't need to read it because he trusted me. I scolded him, "Don't trust anybody." Journalists have the saying, "If your mother says she loves you, check it out." Remember, humans are only 98% accurate. That leaves a lot of room for doubt.
Communication is the reason why management exists! Part of the problem with most SOPs and OCRs is that they don't present information in a simple manner. Flow charts beat checklists. Graphics are better than text. These forms often are poorly understood and poorly followed -- here management could help by writing clear instructions.
In "What Went Wrong," Trevor Kletz describes a mix-up involving pump maintenance (p. 29, 5th edition). Several identical pumps were in a row. The pump repair checklist ended with pump removal. It turned out that one pump was missing but this only was discovered when a lead operator walking by spotted the omission. A refurbished pump had not been re-installed, commissioned and checked out -- actually, no repair work had been done. Flow charts are better for conveying simple instructions.
Another problem -- what I call "silo-ing" -- is a natural extension of poor management. I've noticed that when resources are stretched and time is tight, an ad hoc team forms. Information flow usually gets restricted; management may not have a clear picture of what's going on. This can lead to safety problems if the ad hoc team exists too long. The best way to avoid problems is to embed management on these teams when they are formed and to involve the ad hoc team members in the SOPs and reliability improvements.
That brings me to the last topic. How do you manage your troubleshooting resources? First, these engineers must recognize a workaround as a problem. Disarming an interlock is a clear example of a workaround. However, some workarounds are less obvious. Hoses are a workaround if you think about it. If operators hammer a pipe flat to unplug flow, that's a workaround even if pen never touched paper.
One useful tool in assessing risk is layers of protection analysis (LOPA). It's overly conservative but simple, and should be applied to every workaround. Unfortunately, its value isn't always appreciated (see "Overcome Skepticism About LOPA").
Now, let's consider some questions your reliability team should be asking about workarounds. Of course, start with: "Why is a workaround necessary?" Then, check: "Have the operators (or mechanics) done more than walk down the process steps?" You want them to rehearse the operation. Here's my favorite: "Did the design engineer write a rough procedure, as a learning tool, so the design would reflect equipment operation?" With new equipment ask: "Has the vendor or designer signed off?"
If something new appears, either from LOPA or hands-on experience, consider a what-if hazard and operability review (HAZOP). For serious risks, or for new processes, conduct a full HAZOP. Eventually, you may want to borrow from Six Sigma methods to select process improvements needed to eliminate these workarounds.
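To illustrate why LOPA is simple enough to apply to every workaround, here is a minimal sketch of its core arithmetic: the initiating event frequency is multiplied by the probability of failure on demand (PFD) of each independent protection layer, and the result is compared with a tolerable frequency. Every number below is hypothetical, as are the function names; a real study would take its frequencies, PFDs and risk target from site data and company criteria.

```python
from math import prod

def mitigated_frequency(initiating_per_year, layer_pfds):
    """Initiating event frequency times the PFD of each credited layer."""
    return initiating_per_year * prod(layer_pfds)

def lopa_screen(initiating_per_year, layer_pfds, tolerable_per_year):
    """Return (mitigated frequency, True if it meets the tolerable target)."""
    freq = mitigated_frequency(initiating_per_year, layer_pfds)
    return freq, freq <= tolerable_per_year

# Hypothetical scenario: an upset expected once in 10 years,
# with a tolerable frequency of 1e-4 per year.
INITIATING = 0.1
TOLERABLE = 1e-4

# Normal operation: an interlock and a relief valve are both credited,
# each with an assumed PFD of 0.01.
normal = lopa_screen(INITIATING, [0.01, 0.01], TOLERABLE)

# Workaround: the interlock is bypassed, so only the relief valve counts.
bypassed = lopa_screen(INITIATING, [0.01], TOLERABLE)

print("normal operation acceptable:", normal[1])
print("with interlock bypassed acceptable:", bypassed[1])
```

In this made-up case, removing a single layer raises the mitigated frequency a hundredfold and pushes the scenario past the target, which is the kind of result that should trigger the fuller review described above.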
Workarounds always should come with an enforced expiration date.
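One way to enforce an expiration date is to log every workaround in a register and check it routinely. The sketch below is only an illustration under that assumption; the record fields, the sample entries and the dates are all hypothetical, and a real plant would likely tie this to its management-of-change system.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Workaround:
    description: str
    owner: str
    expires: date  # date by which the permanent fix must be in place

    def is_expired(self, today: date) -> bool:
        return today > self.expires

def overdue(register, today):
    """Return the workarounds that have outlived their expiration date."""
    return [w for w in register if w.is_expired(today)]

# Hypothetical register entries
register = [
    Workaround("Temporary hose around plugged transfer line",
               "shift lead", date(2024, 3, 1)),
    Workaround("Level interlock bypassed during commissioning",
               "process engineer", date(2024, 6, 30)),
]

for w in overdue(register, date(2024, 4, 15)):
    print(f"EXPIRED: {w.description} (owner: {w.owner})")
```

A check like this only flags overdue items; enforcement still depends on someone with authority acting on the list.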
DIRK WILLARD is a Chemical Processing Contributing Editor. You can e-mail him at firstname.lastname@example.org