The recently released report of the Independent Safety Review Panel that investigated the safety culture and practices at BP’s U.S. refineries in response to the deadly March 2005 explosion at Texas City, Texas, provides somber reading (see "Panel blasts BP's safety practices").
Portions of this article are excerpted from “Guidelines for Safe and Reliable Instrumented Protective Systems,” published this year by AIChE’s Center for Chemical Process Safety, New York.
The panel stressed that the failings identified undoubtedly occur elsewhere in the industry. “The panel's findings present a landmark opportunity for the boards of directors and executives of oil and chemical companies throughout the world to re-examine their own safety cultures and ask whether they are sufficiently investing in the people, procedures, and equipment that will make their workplaces safe from catastrophic accidents,” says Carolyn W. Merritt, chairwoman and CEO of the U.S. Chemical Safety Board. “This is an opportunity for review and reform on a worldwide scale.”
Finding the balance
Balancing safety and production goals can be a tenuous, delicate and complex act. It’s undeniable that safety and production are compatible. It’s indisputable that investments in safety yield long-term benefits. However, these benefits aren’t as obvious as those of production investments, nor do they arrive as quickly; production investments generally have a high certainty of providing a measurable positive effect within a short time frame. For protection and safety, many of the benefits are less tangible.
When successful, the instrumented protective system (IPS) is blamed for a process outage; when an IPS fails, it’s blamed for the incident. Within the system, individual instrumented protective functions (IPFs) act to prevent specific hazardous events. When an IPF successfully operates as required, it should be given credit for the event avoided, including potential fatalities, injuries, environmental releases, equipment damage and financial losses. Also, the IPF should be credited when its fault-tolerant design allows equipment failure to be detected without causing disruptions to the process operation.
Figure 1 summarizes the decision-making process, illustrating how available resources must be allocated across safety and production goals. Decision-makers often have defensive filters that affect the receipt and interpretation of information.
Figure 1. This overview illustrates the complexity of the decision-making process. Adapted from Ref. 2.
Today’s business climate puts pressure on personnel in a variety of forms, such as production forecasts, budget cuts, resource reductions or colleague retirement.
In the absence of a strong safety culture, production and budget pressure can result in a culture of denial in which the decision-maker’s defensive filter refuses to acknowledge any evidence that doesn’t support production or budget plans. Risk assessment can become skewed, with credible safety recommendations and concerns being dismissed without appropriate consideration. Erroneous assumptions concerning equipment and procedure robustness lead to complacency and an acceptance of increased risk. Often, this is done in the absence of dependable documentation, information and data, or a rigorous mechanical-integrity program.
Good engineering practices should be applied in preventing process safety incidents. Benchmark internal practices against those of market sector peers or other process-industries companies. Conduct periodic gap analysis to determine if existing equipment is designed, maintained, inspected, tested and operated in a safe manner. Based on observed performance and benchmarking information, develop and implement action plans for improvement.
A series of catastrophic chemical incidents occurred during the 1970s and 1980s. These incidents are so legendary that they are often referenced by city only: Flixborough (1974), Seveso (1976), Mexico City (1984) and Bhopal (1984). They were catastrophes that awakened the world to chemical industry risk. Reference 3 summarizes these incidents.
Europe pioneered process safety regulations in the 1970s in direct response to the impact on the communities of Flixborough and Seveso. Nearly a decade later, the tragedy caused by the Mexico City explosion and the Bhopal chemical release resulted in process safety regulations being issued in the United States and many other countries. Industrial societies responded worldwide by publishing numerous codes, standards and practices on a variety of process safety topics.
Despite these incidents occurring more than three decades ago, similar errors and root causes still exist today. Trevor Kletz has recounted numerous cases where an incident occurs and is repeated just a few years later. Kletz finds that organizations have poor memory due to many factors such as insufficient failure investigation, inadequate communication and distribution of investigation findings, lack of information retention and little training concerning previous events.
A safety culture doesn’t rely on balance sheet improvements to justify IPSs. It understands that the potential for incidents is an inherent part of the process design and that, without focused effort, incidents invariably occur.
Benchmarking current status
A company should understand how its internal practices compare with recognized and generally accepted good engineering practices. This so-called benchmarking establishes the company’s position with regard to industrial and market peers. This often is the most painful part of the continuous improvement process, as it tends to shed light on the shortfalls and inadequacies of the protective management system as a whole.
Some firms likely will find that their design and management philosophy is out of alignment with various aspects of good engineering practices. When unacceptable risk is identified, you should establish an action plan with short-term and long-term measures to sufficiently reduce the risk. It is well proven that investments to improve safety and reliability of the process operation yield long-term economic returns.
An increasing number of companies are finding themselves operating within a regulatory framework that often doesn’t provide prescriptive requirements. Instead, the requirements are a moving target based on the somewhat fuzzy concept of good engineering practice. Keeping up and complying with the basic requirements concerning process safety and IPSs can seem taxing enough. How does a company move forward with continuous improvement when it seems that even the immediate goals are a moving target? To some, it may seem challenging enough just to maintain the status quo, let alone to embrace more changes.
Continuous improvement operates on the principle that finding failures and errors is the beginning of a learning process. Minimizing their recurrence requires an understanding of how they developed.
Consider continuous improvement as well as the application of good engineering practice as an ongoing process rather than an endpoint. Periodically analyze work processes, information, resources and skills to identify weaknesses that limit performance and to recommend necessary improvements. Information systems, whether computerized or manual, should provide personnel with up-to-date details in a format that’s easy to understand. Information should be revision controlled, yet accessible.
Many factors — such as the economy, market trends and technology, along with legal and political issues — affect IPF performance expectations and the designs used to attain them. A strong safety culture expects ownership and accountability for safe and reliable performance of the process equipment over its life. Management should support periodic evaluation of the existing equipment to determine that it’s designed, maintained, inspected, tested and operating in a safe manner.
Changes in operability, functionality, reliability or maintainability expectations may require implementing different design or management practices. Proof-test, failure-investigation, alarm, trip and audit reports provide valuable insight into personnel and management system performance. Operating excellence mandates identifying and resolving the root causes of unacceptable process reliability and equipment performance. Improving equipment mechanical integrity requires a culture that values maintenance.
Various management system activities can identify a need for improvement. Periodically examine overall information available to identify, trend and correct systematic problems. Perform a gap analysis to compare observed to expected IPS performance. The gap analysis should determine that:
- equipment is operating according to design intent;
- safety, operating, maintenance and emergency procedures are appropriate for competency and risk-reduction expectations;
- hazard and risk analysis or management of change (MOC) recommendations are addressed in a timely manner; and
- training of personnel is adequate for current work expectations.
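The comparison of observed to expected IPS performance can be illustrated with a small script. The following is only a minimal sketch under stated assumptions: the record fields, the IPF tags, the sample counts and the use of a proof-test failure fraction as the performance measure are all hypothetical choices made for illustration, not part of any cited guideline.

```python
from dataclasses import dataclass


@dataclass
class ProofTestRecord:
    function_tag: str               # hypothetical IPF identifier
    tests_performed: int            # proof tests in the review period
    tests_failed: int               # tests where the IPF did not act as required
    target_failure_fraction: float  # assumed expectation from the design basis


def find_gaps(records):
    """Return (tag, observed, target) for each IPF whose observed
    proof-test failure fraction exceeds its assumed target."""
    gaps = []
    for r in records:
        if r.tests_performed == 0:
            # No test data: observed performance cannot be compared here.
            # A real review would likely flag this case separately.
            continue
        observed = r.tests_failed / r.tests_performed
        if observed > r.target_failure_fraction:
            gaps.append((r.function_tag, observed, r.target_failure_fraction))
    return gaps


# Illustrative data only; these numbers are invented for the example.
records = [
    ProofTestRecord("IPF-101", 20, 1, 0.10),
    ProofTestRecord("IPF-102", 10, 3, 0.10),
    ProofTestRecord("IPF-103", 0, 0, 0.10),
]
gaps = find_gaps(records)
for tag, observed, target in gaps:
    print(f"{tag}: observed {observed:.2f} exceeds target {target:.2f}")
```

A function that never gets tested, such as the third record here, produces no gap in this sketch even though the missing data may itself point to a maintenance or inspection shortfall.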
You may identify significant issues during the analysis. Management system failures often are reflected in multiple performance metrics. Watch out for systemic problems such as poor adherence to policies, procedures, and practices or insufficient inspection and preventive maintenance. If IPS equipment isn’t maintained, it’s likely that other equipment also is suffering from inadequate maintenance. The cumulative maintenance deviations, whether intentional or unintentional, may cause a breakdown of multiple protection layers.
Checking IPS requirements and performance frequently demands team effort. Some organizations establish a formal structure in which particular personnel participate as site representatives on a core team. The core team evaluates changes in the good engineering practices and recommends modifications to internal practices.
Whenever work processes are modified, a shift in emphasis often leads to changes in the way team members perceive the process, its associated risks, various protection layers and IPSs. This shift may result in recommendations for additional risk reduction or IPSs. These recommendations and other continuous-improvement efforts complete the lifecycle, moving the process toward safer and more reliable operation.
Determining the path forward
The key aspect of continuous improvement is charting the course to achieve it. Over time, various options will be presented to upgrade hardware, software or human interface systems. Review each proposed change using a MOC process to identify how the change affects other functions or systems. Address areas for improvement with an action plan, which typically prioritizes recommendations based on consequence severity and risk gap.
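One way to rank recommendations by consequence severity and risk gap is sketched below. This is a hedged illustration, not a prescribed method: the 1-to-5 severity scale, the risk gap expressed as a whole number, the tie-breaking order and the sample recommendations are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    title: str
    severity: int  # assumed scale: 1 (minor) to 5 (catastrophic)
    risk_gap: int  # assumed measure of how much risk reduction is still needed


def prioritize(recommendations):
    """Order recommendations by highest severity first, then by largest
    risk gap. This ordering is one possible choice, not a standard."""
    return sorted(
        recommendations,
        key=lambda r: (r.severity, r.risk_gap),
        reverse=True,
    )


# Illustrative entries only; titles and scores are invented.
plan = prioritize([
    Recommendation("Update operating procedure", 2, 1),
    Recommendation("Add independent high-level trip", 5, 2),
    Recommendation("Replace obsolete logic solver", 5, 1),
    Recommendation("Shorten proof-test interval", 3, 2),
])
for rank, rec in enumerate(plan, start=1):
    print(rank, rec.title)
```

Sorting on severity before risk gap means a catastrophic consequence with a small gap outranks a moderate consequence with a large one; a site could reasonably choose a different weighting.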
Action plans should define objectives, milestones and timelines. Periodically reassess action plans to determine whether there’s a need to accelerate the schedule or broaden the objectives. For example, you may decide to speed up a planned IPS upgrade in response to a manufacturer withdrawing support for critical equipment. To be successful, action plans should be communicated to affected personnel so they understand and commit to them.
Implementing upgrades aimed at improving long-term operational effectiveness takes time to complete, depending on the complexity and degree of change involved. As the IPS is changed, operating plans and targets should consider any additional risk borne by the process during the transition. Once the design basis changes are underway, review the operating and mechanical integrity basis and implement all needed revisions.
There are many barriers to improvement, including:
- poor data integrity and quality;
- inadequate information availability and consistency;
- lack of broad understanding of facts and procedures;
- deficient or missing internal practices and procedures;
- poorly understood compliance expectations;
- inadequate revision control or notification of changes; and
- lack of comprehensive training on data, information, procedures and practices.
To overcome these barriers, you must provide personnel with more than just another initiative or mandate for change. Continuous improvement must be part of an organization’s culture, beginning at the highest management level and continuing to the front-line operator. A continuous improvement culture requires that all staff understand the importance of following approved practices and procedures. People should feel that safe and reliable operation is an institutional value and that they won’t lose their jobs or be held back for speaking out. Front-line personnel must believe that continuous improvement is supported by all levels of management. Everyone should understand that employment is conditional on safe work performance.
To succeed, personnel must be aware of the potential risk and be committed to do what’s necessary to maintain and continuously improve operational and mechanical integrity. The path forward encompasses many detailed tasks, but generally includes the following:
- assign responsibility and hold personnel accountable;
- audit to ensure practices and procedures are followed;
- question norms and reduce risk further when practical;
- integrate business and process safety goals;
- track performance, address bad actors and celebrate success; and
- learn and remember.
A corporate responsibility
“Corporate leadership at the highest level is accountable for the safe operation of facilities that use hazardous chemicals. Safety culture is created at the top, and when it fails there, it fails workers far down the line,” states Merritt.
An organization’s culture is ultimately driven by what management indicates is important — what’s measured and what’s rewarded. Sustaining safe operation demands a recognition that the direct costs of an incident represent the tip of an iceberg. Hidden from view are the indirect costs and long-term business damages resulting from unsafe operations. When the true cost of an incident is understood, it becomes very clear that being cost effective involves much more than simply today’s budget. Success requires that personnel believe that investment in reducing risk further is encouraged and rewarded.
Market leaders recognize that this investment provides benefits that far outweigh its costs. Operating excellence seeks to prevent incidents because it is good for business and it is the right thing to do.
1. Merritt, C.W., “Statement on the Release of the BP Refineries Independent Safety Review Panel Report,” U.S. Chemical Safety Board, Washington (Jan. 2007).
2. Reason, J., “Human Error,” Cambridge Univ. Press, Cambridge, U.K. (1990).
3. Mannan, S., “Lees’ Loss Prevention in the Process Industries,” Vols. 1-3, Elsevier Butterworth-Heinemann, Oxford, U.K. (2005).
4. Kletz, T., “Lessons from Disaster: How Organizations Have No Memory and Accidents Recur,” Institution of Chem. Engs., Rugby, U.K. (1993).
Angela Summers is president of SIS-TECH Solutions, LP, Houston, Texas. She recently completed the “Guidelines for Safe and Reliable Instrumented Protective Systems” for AIChE’s Center for Chemical Process Safety. She was the recipient of the 2005 ISA Albert F. Perry Award and is a 2007 inductee into the Process Automation Hall of Fame.