Accident investigation is a regulatory requirement under OSHA 29 CFR 1910.119(m). Such an investigation requires careful analysis of evidence to arrive at the probable cause(s) of the event and to develop appropriate safeguards to prevent its recurrence. Ideally, investigators would have ample time to analyze all evidence. In reality, plant personnel (and management) almost always feel tacit pressure to resume operations as soon as practicable. Although this pressure is understandable from a business economics view, a hastily performed accident investigation could fail to catch the real culprit, the probable cause(s) of the accident. Critical evidence could be inadvertently destroyed, leading the investigators to rely on invalid assumptions. The same accident could then recur.
Getting a plant up and running as soon as practicable while still performing an effective investigation depends upon analyzing the accident and developing safeguards carefully and efficiently. Safety professionals, plant engineers, operators, maintenance professionals and management play a vital collective role in developing a framework for efficient accident investigations.
At the strategic level, engineering and administrative controls as well as safety culture jointly contribute to an efficient and reliable accident investigation. The key to ensuring efficient accident investigation is preparedness, i.e., having systems in place to deal with accidents. To improve your preparedness and, thus, the efficiency of your accident investigations, pay particular attention to five factors:
• Fault-tolerant systems;
• Data management to aid investigations/troubleshooting;
• An in-plant accident investigation team (AIT);
• Key spare components on standby; and
• Safety culture in middle and top management.
Briefly put, a fault-tolerant design focuses on preventing an accident or minimizing its impact.
At the design stage, process hazard analysis is helpful in identifying hazards and then minimizing their occurrence and impact. One important tool to address these hazards is fault-tolerant design. Simply put, it uses systems to blunt the adverse impact of an accident. Some examples of fault-tolerant systems include double-wall pipes or sumps with annular space monitoring, dikes, redundant instrumentation (e.g., dual level transmitters on storage tanks), equipment separation, building locations in safe zones, and plant siting away from sensitive areas such as aquifers, rivers, lakes, parks, populated areas or wetlands.
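To make the redundant-instrumentation example concrete, here is a minimal Python sketch of how two level transmitters on the same tank might be cross-checked. The function name, the 2% tolerance and the choice to fall back on the higher reading when the transmitters disagree are illustrative assumptions, not requirements from any standard; a real plant would set these through its own hazard analysis.

```python
from dataclasses import dataclass


@dataclass
class LevelCheck:
    level_pct: float      # level value to act on, in % of span
    deviation_pct: float  # absolute difference between the two transmitters
    mismatch: bool        # True when the transmitters disagree beyond tolerance


def compare_level_transmitters(lt_a: float, lt_b: float,
                               tolerance_pct: float = 2.0) -> LevelCheck:
    """Cross-check two redundant level readings (both in % of span).

    The 2% default tolerance is a hypothetical value for illustration.
    """
    deviation = abs(lt_a - lt_b)
    mismatch = deviation > tolerance_pct
    if mismatch:
        # Assumption: for overfill protection, trust the higher reading
        # until the discrepancy has been investigated.
        level = max(lt_a, lt_b)
    else:
        # Readings agree: use their average.
        level = (lt_a + lt_b) / 2
    return LevelCheck(level, deviation, mismatch)


if __name__ == "__main__":
    print(compare_level_transmitters(50.0, 51.0))
    print(compare_level_transmitters(50.0, 60.0))
```

A flagged mismatch is useful twice: it can prompt maintenance before an event, and, if logged, it gives investigators a record of when the instruments began to disagree.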
Fault-tolerant systems help you improve the efficiency of an accident investigation by decreasing the impact of an accident; this, in turn, reduces the tacit pressure to wrap up the investigation as quickly as possible.
In a broad sense, accident investigation is a vital step for determining measures to take and systems to install to minimize an event’s recurrence and impact in the future. In addition to fault-tolerant designs, multiple layers of protection help thwart accidents. Such layers of protection typically include alarm systems, control systems, relief valves, interlocks, safety instrumented systems, operator training and testing, alarm rationalization, and emergency response systems.
The reliability of an accident investigation depends upon the team having access to critical data. Such data could be destroyed during an accident. So, preserving these data is key.
What are critical data? This depends on the process and equipment. For example, on a distillation column, critical data may include pressures, differential pressures, temperatures, rate of rise of pressures, levels in the bottom and in the reflux drum, reflux flow, relief valve set points and maintenance records, and feed composition. For equipment such as compressors, turbines, transformers and flares, vendors can provide insights about safety critical data.
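Of the items listed for a distillation column, the rate of rise of pressure is a derived quantity rather than a direct measurement. The sketch below shows one simple way to compute it from time-stamped pressure samples. The function name, the sample format and the per-minute basis are assumptions made for illustration; a data historian may already offer an equivalent calculation.

```python
from datetime import datetime


def pressure_rate_of_rise(samples):
    """Return (timestamp, rate) pairs from time-ordered (timestamp, pressure) samples.

    The rate is the pressure change per minute between consecutive samples,
    in whatever pressure unit the samples use.
    """
    rates = []
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        minutes = (t1 - t0).total_seconds() / 60.0
        if minutes <= 0:
            # Skip duplicate or out-of-order timestamps instead of dividing by zero.
            continue
        rates.append((t1, (p1 - p0) / minutes))
    return rates


if __name__ == "__main__":
    # Hypothetical column-top pressure readings.
    readings = [
        (datetime(2024, 1, 1, 10, 0, 0), 100.0),
        (datetime(2024, 1, 1, 10, 0, 30), 101.0),
        (datetime(2024, 1, 1, 10, 1, 30), 104.0),
    ]
    for stamp, rate in pressure_rate_of_rise(readings):
        print(stamp.isoformat(), rate)
```

The calculation only works if every sample carries a reliable timestamp, which is one reason time-stamping matters for the data set as a whole.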
It’s also important to preserve records of operations, operator logs, instrument calibrations and maintenance history. Develop appropriate and easy-to-use database systems or data historians for these records.
After identifying the safety critical data, put in place systems to ensure the data set will not be destroyed during an accident. Consider the following steps:
• Protecting data transmission from the field to the control room, as well as the data management systems themselves (data historians), from potential hazards such as fires, heavy rains, flooding, dropped objects and electromagnetic or radio frequency interference.
• Arranging safety critical data in a format that’s easy to use for accident investigations and troubleshooting.
• Time-stamping the data.
• Taking measures to ensure data security from cyber attacks.
• Using modular systems that facilitate system expansion.
• Performing system upgrades (to avoid obsolescence of the data management systems).
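Two of the steps above, time-stamping and protecting data integrity, can be illustrated with a small Python sketch. It stamps each record in UTC and chains records together with SHA-256 hashes so that later alteration of a stored value can be detected. Hash chaining is one possible technique chosen here for illustration, not something the steps above prescribe, and the function names and record fields are hypothetical. It detects tampering only; it does not replace backups, physical protection or network security.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body):
    # Hash a canonical JSON form so the same record always yields the same digest.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log, tag, value, timestamp=None):
    """Append a time-stamped record that is linked to the previous record's hash."""
    stamp = timestamp or datetime.now(timezone.utc)
    body = {
        "tag": tag,
        "value": value,
        "time_utc": stamp.isoformat(),
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    record = {**body, "hash": _digest(body)}
    log.append(record)
    return record


def verify_log(log):
    """Return True if no record has been altered, removed or reordered."""
    prev_hash = GENESIS_HASH
    for record in log:
        body = {k: record[k] for k in ("tag", "value", "time_utc", "prev_hash")}
        if record["prev_hash"] != prev_hash or record["hash"] != _digest(body):
            return False
        prev_hash = record["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_record(log, "PT-101", 101.3)  # hypothetical instrument tags
    append_record(log, "LT-201", 47.5)
    print(verify_log(log))   # chain is intact
    log[0]["value"] = 99.0   # simulate an after-the-fact edit
    print(verify_log(log))   # the edit is detected
```

Recording timestamps in UTC is a deliberate choice in this sketch: it avoids ambiguity from daylight saving changes when investigators later line up events from different systems.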