The situation at today’s processing facilities differs markedly from that of five or ten years ago. Operators are younger and less experienced, and there are fewer of them. Yet, those small teams are expected to shoulder more responsibility than ever before. Maintaining safety remains a prime consideration, with alarms as one of the most important lines of defense.
“Management of Alarm Systems for the Process Industries,” standard 18.2 of the International Society of Automation (ISA), defines recognized and generally accepted good engineering practices for alarm management. Many plants have striven for decades to reach these goals, only to fall into the trap of repeating the same ineffective processes over and over without obtaining the desired results. By recognizing and avoiding nine common misconceptions about alarm management strategy, chemical companies can close the gap between their desired alarm functionality and actual practice in their facilities.
Let’s look at each of these misconceptions, what they lead to, and how to properly proceed.
1. We just need to reduce our alarms. Alarm floods are the bane of any operations team and have become even more of a stumbling point as today’s operational groups have gotten smaller. When tens, hundreds or even thousands of alarms flood in a short time — whether during an emergency or a less-critical aberration — operator attention is drawn in many directions at once, making isolating, defining and solving potentially safety-critical problems difficult.
As a result, alarm management teams often adopt a “zero alarm strategy,” i.e., one that aims to configure as few alarms as possible.
While optimized alarm systems will result in fewer configured alarms, thinking in terms of quantity rather than quality is a mistake. The goal of effective alarm management is to identify quality alarms and keep them in service while improving or eliminating nuisance alarms. At the heart of this strategy is one key rule: an alarm is a quality alarm only if it conforms to all five of the following keywords and definitions:
Figure 1. The system includes a number of interlocks.
• abnormal — not planned or expected, a surprise to the operator;
• actionable — operator response to the alarm is required and possible;
• consequential — lack of or incorrect/insufficient action likely will lead to an undesirable result;
• unique — only one alarm sounds to announce an abnormal deviation; and
• relevant — understandable to the operator and pertinent to the current operating state.
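The five-keyword rule above can be captured as a simple checklist during rationalization. The following is only a minimal sketch: the `AlarmReview` record, its field names and the example tag are hypothetical and not part of ISA 18.2 or any vendor tool.

```python
from dataclasses import dataclass

# The five keywords from the list above.
CRITERIA = ("abnormal", "actionable", "consequential", "unique", "relevant")


@dataclass(frozen=True)
class AlarmReview:
    """Team's answers for one candidate alarm (hypothetical record layout)."""
    tag: str
    abnormal: bool       # not planned or expected
    actionable: bool     # operator response is required and possible
    consequential: bool  # inaction likely leads to an undesirable result
    unique: bool         # no other alarm announces the same deviation
    relevant: bool       # understandable and pertinent to the current state


def failed_criteria(review: AlarmReview) -> list[str]:
    """Return the keywords this candidate alarm does not satisfy."""
    return [name for name in CRITERIA if not getattr(review, name)]


def is_quality_alarm(review: AlarmReview) -> bool:
    """A candidate qualifies only if it meets all five keywords."""
    return not failed_criteria(review)


# Hypothetical example: a "standby pump stopped" status configured as an alarm.
standby_stopped = AlarmReview(
    tag="P-101B STOPPED",
    abnormal=False, actionable=False, consequential=False,
    unique=True, relevant=True,
)
```

Here `failed_criteria(standby_stopped)` lists the three keywords the status fails, which gives the team a recorded reason for moving it off the alarm summary.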
2. We should alarm everything just to be safe. Alarms differ from status information. Most pieces of equipment in the plant only have a few statuses that conform to all five keywords and, therefore, require an alarm. Identifying, evaluating and then deliberately ignoring an inconsequential alarm squanders an operator’s precious attention; so, elimination of these alarms is important.
Status information might draw more attention when configured as an alarm but doing so clutters the operators’ alarm interface. This not only distracts operators but also conditions them to ignore alarms, creating a dangerous situation when a true alarm needs attention.
A common example of status information inappropriately appearing as an alarm is an inactive pump. If two pumps are installed in parallel, with only one expected to operate, alarming the stopped state guarantees a standing alarm at all times. The better alarm to configure in this situation is a failure or command-disagree alarm for the pump, indicating it is stopped when commanded to run or vice versa.
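A command-disagree alarm of this kind compares what the pump was told to do with what its feedback reports. The sketch below is illustrative only; the function name and the grace period that allows the motor time to respond are assumptions, not settings taken from any particular control system.

```python
def command_disagree(commanded_run: bool, running: bool,
                     seconds_since_command: float,
                     grace_period_s: float = 10.0) -> bool:
    """Return True when run feedback contradicts the command.

    grace_period_s is an assumed delay that gives the pump time to
    respond before a mismatch is treated as a failure.
    """
    if seconds_since_command < grace_period_s:
        return False
    return commanded_run != running
```

With this logic the standby pump, which is stopped and commanded to stop, never raises an alarm, while a pump that stays stopped after a run command does.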
Another common situation involving alarming normal events is where there is on-off control action, such as an automatic start-stop on a sump pump. As the sump level rises, the pump is turned on. When the pump action successfully reduces the level to the desired point, the pump is stopped. These normal control actions often are alarmed — but only serve to disturb the operator. A proper alarm in this situation would be one that is set above the auto-start level, at a point that indicates the pump failed to start or is malfunctioning.
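The sump example can be sketched as follows. All level values are hypothetical; the only point carried over from the text is that the alarm limit sits above the auto-start level, so normal start-stop cycling never reaches it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SumpSettings:
    """Hypothetical sump settings, in percent of level span."""
    pump_stop_pct: float = 30.0
    pump_start_pct: float = 70.0
    high_alarm_pct: float = 85.0

    def __post_init__(self):
        # The alarm must sit above the auto-start level to be meaningful.
        if not (self.pump_stop_pct < self.pump_start_pct < self.high_alarm_pct):
            raise ValueError("expected stop < start < high alarm")


def pump_should_run(level_pct: float, currently_running: bool,
                    s: SumpSettings) -> bool:
    """Normal on-off control with hysteresis; this is not alarmed."""
    if level_pct >= s.pump_start_pct:
        return True
    if level_pct <= s.pump_stop_pct:
        return False
    return currently_running


def sump_high_alarm(level_pct: float, s: SumpSettings) -> bool:
    """Alarm only when the level climbs past the point the pump should hold."""
    return level_pct >= s.high_alarm_pct
```

A level of 72% starts the pump without any alarm; only a level that keeps rising to 85%, which suggests the pump failed to start or is malfunctioning, reaches the operator.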
Status information must be kept off the alarm summary. Instead, the operator should get statuses via indications on the graphical interface associated with the equipment in question.
3. Multiple alarms draw more attention to problems. Nobody wants a safety incident to occur; so, teams configuring alarms look for ways to ensure operators are immediately notified when something is wrong. It seems intuitive, then, to create multiple alarms for the most-severe equipment aberrations. After all, multiple alarms popping up are much harder for operators to ignore, intentionally or accidentally.
However, having multiple alarms for a single event creates its own set of problems. As alarms flood in, operators quickly can become confused as to which they must address first, delaying responses. Moreover, even when operators identify the source of the problem and begin to take action, they waste valuable time silencing the other alarms.
A better strategy is to create a single alarm for each event. This not only will present the operator with an alarm but also will provide clear severity information to help in understanding the importance, simplifying prioritization.
For example, where a reactor has multiple temperature indications, with each having a high temperature alarm, then multiple alarms will sound during a process upset where only one really is needed. The best option here is to configure a maximum temperature and have an alarm on that value. When the alarm sounds, the operator can consult the reactor graphic to see where the high temperature exists and can respond appropriately.
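A maximum-temperature alarm of this kind might be sketched as below. The tag names, readings and limit are invented for illustration; in practice the calculation would normally live in the control system rather than in a script.

```python
def max_temperature_alarm(temps_c: dict[str, float],
                          limit_c: float) -> tuple[bool, str, float]:
    """Evaluate one alarm on the hottest of several reactor temperatures.

    Returns (alarm_active, hottest_tag, hottest_value) so the graphic can
    point the operator to the location of the high reading.
    """
    if not temps_c:
        raise ValueError("at least one temperature reading is required")
    hottest_tag = max(temps_c, key=temps_c.get)
    hottest_value = temps_c[hottest_tag]
    return hottest_value > limit_c, hottest_tag, hottest_value


# Hypothetical readings: two points exceed the limit, yet one alarm results.
readings = {"TI-101": 182.0, "TI-102": 196.5, "TI-103": 191.0}
active, tag, value = max_temperature_alarm(readings, limit_c=190.0)
```

With individual high alarms on each indication, these readings would annunciate twice; the single alarm on the maximum annunciates once and identifies TI-102 as the hottest point.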
State-based or dynamic alarming also should be employed to ensure that multiple alarms do not annunciate when an upset condition occurs.
4. We don’t need dynamic alarming. The plant environment is not static. Plant activities and environments change from day to day; alarm management must reflect that fluctuation. Even in the best operating facilities, plants go through many different operating states; each state often will require a unique alarm configuration to avoid nuisance alarms.
These state changes complicate alarm management. Alarms, by definition, identify abnormalities in plant and equipment operation. However, what is normal and abnormal often varies with operating state. As a result, to be effective, alarm management also must adapt to the state of the plant.
Dynamic management enables alarm configuration changes based on logic defining the operating state and process conditions. Alarm systems configured with dynamic management facilitate smooth transitions from one operating state to another using state determination logic.
Figure 2. Readings of key operating variables determine the state of the process.
The ideal alarm management system will integrate seamlessly with the distributed control system (DCS) to make dynamic management easier. When the alarm management system and DCS work in tandem, operators will have clear, instantaneous visibility of alarm status right from their consoles, regardless of operating state. Such a scenario dramatically reduces the risk of operator error during high-stress operating states, such as startup and shutdown.
The best strategy is to incorporate dynamic management from the earliest stages of developing an alarm management strategy. Dynamic management can be effectively handled during an alarm rationalization process without significantly increasing time and budget.
Let’s consider the application of dynamic alarming to the distillation system shown in Figure 1. Here, the alarm rationalization team identified three basic operating states and developed logic to determine when the system is operating in each state. The logic is based on reading key operating variables from the control system and then applying those readings in a logic structure (Figure 2).
This logic prompted a number of alarm changes (see Table 1) to optimize the alarm configuration for each state and improve the operator’s experience.
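The general pattern of state determination followed by state-specific alarm settings can be sketched as follows. The article's actual logic (Figure 2) and alarm changes (Table 1) are not reproduced here, so every state name, variable, threshold and alarm setting below is an invented placeholder.

```python
from enum import Enum
from typing import Optional


class ColumnState(Enum):
    """Three hypothetical operating states for a distillation column."""
    SHUTDOWN = "shutdown"
    STARTUP = "startup"
    RUNNING = "running"


def determine_state(feed_flow: float, reboiler_steam_flow: float,
                    min_feed: float = 5.0,
                    min_steam: float = 1.0) -> ColumnState:
    """Infer the operating state from key readings (assumed thresholds)."""
    feeding = feed_flow >= min_feed
    heating = reboiler_steam_flow >= min_steam
    if feeding and heating:
        return ColumnState.RUNNING
    if not feeding and not heating:
        return ColumnState.SHUTDOWN
    return ColumnState.STARTUP


# Alarm limit per state; None means the alarm is suppressed in that state.
ALARM_SETTINGS: dict[ColumnState, dict[str, Optional[float]]] = {
    ColumnState.SHUTDOWN: {"FEED_FLOW_LOW": None, "TRAY_TEMP_LOW": None,
                           "COLUMN_PRESS_HIGH": 2.0},
    ColumnState.STARTUP:  {"FEED_FLOW_LOW": None, "TRAY_TEMP_LOW": None,
                           "COLUMN_PRESS_HIGH": 2.5},
    ColumnState.RUNNING:  {"FEED_FLOW_LOW": 8.0, "TRAY_TEMP_LOW": 95.0,
                           "COLUMN_PRESS_HIGH": 2.5},
}


def active_limits(state: ColumnState) -> dict[str, float]:
    """Return only the alarms that remain in service for this state."""
    return {name: limit for name, limit in ALARM_SETTINGS[state].items()
            if limit is not None}
```

In this sketch a column with no feed and no steam is classed as shut down, so its low-flow and low-temperature alarms, which would otherwise stand continuously, are taken out of service while the high-pressure alarm stays active.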
5. Our tag and alarm descriptions are perfectly clear. Alarms only are useful if operators can quickly understand what they mean. Few plants still have the luxury of a deep bench of veteran operators; even the plants that do have highly experienced operators will need to bring in new personnel at some point.
Even to a seasoned operator, an alarm description such as HDR PNL 17LP3n-1B-C likely will mean very little. Those abbreviations may capture a wide array of information but, if operators can’t decipher their meaning, the alarm is not useful. Simply avoiding abbreviations is not the answer — names so long that pertinent information runs off the screen still have little value in a crisis.
A better strategy is to develop a standardized naming convention in partnership with the operations team. Each name should be short, use abbreviations and terminology the operators understand, and be taught easily to new operators.
Creating a standardized convention helps ensure that operators, even if unfamiliar with the alarm at first glance, quickly will be able to determine its meaning based on experience. This type of fast comprehension can save precious minutes.
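One way to keep a naming convention honest is to check descriptions against the vocabulary agreed with operations. The sketch below assumes a hypothetical approved-term list, a made-up equipment-number pattern and an assumed display width; a real convention would be defined by the site.

```python
import re

# Hypothetical vocabulary agreed with the operations team.
APPROVED_TERMS = {"RX", "TEMP", "HI", "LO", "LVL", "PUMP", "FEED", "FLOW"}

# Assumed pattern for equipment numbers such as "2" or "101B".
EQUIPMENT_NUMBER = re.compile(r"\d+[A-Z]?")

# Assumed number of characters visible on the alarm summary.
MAX_CHARS = 24


def unknown_terms(description: str) -> list[str]:
    """Return the words that are neither approved terms nor equipment numbers."""
    return [word for word in description.split()
            if word not in APPROVED_TERMS
            and not EQUIPMENT_NUMBER.fullmatch(word)]


def fits_display(description: str, max_chars: int = MAX_CHARS) -> bool:
    """Check that the whole description is visible without running off screen."""
    return len(description) <= max_chars
```

Under these assumptions, "RX 2 TEMP HI" passes both checks, while every word of the description quoted earlier, "HDR PNL 17LP3n-1B-C", is flagged as unknown to the agreed vocabulary.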
6. Triggering interlocks with alarms saves effort. Often, DCS configuration teams will try to save effort by tying interlocks to their associated alarms. For example, it may seem easier to configure a single high-level parameter so that a reading above 95% both raises the alarm and trips the interlock.
Alarms and interlocks exist for different reasons. So, enforcing both with the same parameter generally is not recommended; ISA 18.2 and its technical reports discourage the practice. When the two are tied together, the alarm management team cannot change alarms without changing interlocks, which creates new risks for the plant.
Moreover, suppressing alarms tied to interlocks can disable the interlock. This creates a potential safety and security loophole.
The better strategy is to spend a bit more time to configure the interlock separately from the alarm. Many modern control systems provide separate parameters for that purpose.
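The separation can be illustrated with a small sketch in which the alarm and the interlock each have their own parameter. The class, the 90% alarm limit and the suppression flag are assumptions for illustration; only the 95% figure comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class LevelProtection:
    """Hypothetical high-level protection with independent parameters."""
    alarm_limit_pct: float = 90.0     # assumed operator alarm limit
    interlock_trip_pct: float = 95.0  # interlock trip point
    alarm_suppressed: bool = False

    def alarm_active(self, level_pct: float) -> bool:
        """The alarm honors suppression and its own limit."""
        return (not self.alarm_suppressed
                and level_pct >= self.alarm_limit_pct)

    def interlock_tripped(self, level_pct: float) -> bool:
        """The interlock ignores alarm suppression and alarm limit changes."""
        return level_pct >= self.interlock_trip_pct
```

Because the two functions read different parameters, suppressing the alarm or moving its limit during rationalization leaves the interlock trip point untouched.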
7. We only need to rationalize bad actors. Setting up a successful alarm-management strategy takes time; the more equipment a plant has, the more time it likely will take. Often, alarm management practices review only the bad actors, for example the top ten or twenty most-frequent alarms. Just reviewing bad actor devices does not meet ISA 18.2 guidelines; the standard advises rationalization of all alarms.
While a bad actor review might produce a quick win in reducing alarm rates, this methodology does little or nothing to prevent alarm floods — and it does not ensure an optimum overall alarm configuration. Instead, considering alarms as a system rather than individually is essential.
Only a thorough rationalization, including all alarms and dynamic alarming, will produce a result that provides optimum alarm configuration with a satisfactory experience for the operator while conforming to the recommended ISA metrics in terms of average and peak alarm rates, percent of time in flood, and other factors.
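The rate metrics mentioned above can be estimated from an alarm journal. The sketch below uses fixed 10-minute windows and a default flood threshold of more than 10 alarms per window, a commonly cited figure that is treated here as an assumption to be confirmed against the standard and the site's alarm philosophy. The function name and input format are hypothetical.

```python
from collections import Counter


def alarm_rate_metrics(alarm_times_s: list[float], period_s: int,
                       window_s: int = 600,
                       flood_threshold: int = 10) -> dict[str, float]:
    """Summarize alarm load for one operator position.

    alarm_times_s holds annunciation times in seconds from the start of
    the review period, covering all operating states.
    """
    if period_s <= 0 or period_s % window_s != 0:
        raise ValueError("period_s must be a positive multiple of window_s")
    n_windows = period_s // window_s
    counts = Counter(int(t // window_s) for t in alarm_times_s
                     if 0 <= t < period_s)
    per_window = [counts.get(i, 0) for i in range(n_windows)]
    flooded = sum(1 for c in per_window if c > flood_threshold)
    return {
        "average_per_window": sum(per_window) / n_windows,
        "peak_per_window": float(max(per_window)),
        "percent_windows_in_flood": 100.0 * flooded / n_windows,
    }
```

A bad-actor review can lower the average without changing the peak or the share of windows in flood, which is why all three figures are worth tracking together.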
Figure 3. This can provide enterprise-level teams with better visibility of plant performance.
8. The best alarm strategies are the ones that satisfy management. Whether because the facility recently had a safety incident or because metrics are identifying problems, the call for improving alarms often comes from upper management. Because the team must demonstrate success to management, it is easy to think of management as the primary audience for alarm reform. Seeing alarm management as a top-down edict fosters a “check the box” mentality in which the team does only the minimum without considering the true benefits.
However, while management will evaluate the results of alarm management, the true audience is the operators who rely on the alarms to help them safely and efficiently perform their jobs. Developing an alarm strategy that satisfies ISA 18.2 guidelines means designing based on solid principles and on the feedback of the board operators who must deal with each alarm. Therefore, an alarm configuration that empowers operators to control the process effectively and safely should be one of the ultimate goals of the program.
Involving experienced operators trusted by their peers in alarm strategy design from the very first sessions will help the team create an alarm system that suits the way they work, while increasing the probability of acceptance from the remainder of the operators.
9. Metrics can guide our entire alarm strategy. Modern alarm-management software provides enterprise-level, web-based alarm and event reporting tools. Engineers can use these tools to create customized reports; dashboards, like those that can be created in Emerson’s AgileOps software, can afford improved visibility of plant performance from anywhere in the world (Figure 3).
Metrics are important. Not only do they help enterprise-level personnel guide overall business and plant strategy but also, when properly configured, they help plant personnel gain more visibility into the safety, efficiency and effectiveness of their own facility. However, metrics never must be used as a basis to compromise an alarm management process.
For example, excluding startup and shutdown alarms from alarm reports sent to management does not comply with ISA 18.2 guidelines. The plant must be able to meet metrics for all states, not just the run state. Also, excluding these alarms may make alarm system performance look deceptively good and, consequently, could result in a lack of funding for a thorough rationalization and improvements that are truly needed.
Successful alarm management is not about metrics but instead requires providing only quality alarms that support the operators’ efforts to monitor and control the plant efficiently and safely. A well-conceived and well-run alarm rationalization effort that includes dynamic alarming very often will result in a reduction in the alarm count and conformance to metrics — but metrics should not drive the process.
Table 1. Optimizing alarms based on the state of the column enhances operator experience.
Recognizing the misconceptions that lead to poor alarm management strategies enables teams to better leverage their alarm management system and tools to promote safer and more efficient operation. The most-effective teams combine the right tools and strategy to dramatically improve the way they operate.
Modern alarm-management tools provide a wide range of features such as native DCS integration, dynamic alarm management, and intuitive dashboards. When coupled with a clear knowledge of the potential pitfalls of alarm management, these tools not only help teams configure alarms correctly from the very first moments of operation but also bring alarm strategies in line with ISA 18.2. This improves visibility and guides operators through the riskiest and rarest moments as well as everyday operations.
DARWIN LOGEROT is a Houston-based principal operator performance consultant for Emerson.