Do operators sing the praises of your plant’s alarm system? No? Well, do they at least agree that generated alarms represent real abnormal situations requiring a response and that the automation/control system presents alarms in a timely, accurate and reliable way? No again? Well, why not? Aren’t operators the primary customers of your alarm system? Perhaps it’s time for an alarm remediation project.
Operators’ hopes and expectations for a plant’s alarm system include:
- minimal nuisance alarms;
- alarm annunciation and displays unencumbered with notifications (i.e., routine messages not representing an abnormal situation requiring a response);
- clear prioritization of alarms to indicate what’s most important during alarm floods;
- no information overload or unnecessary multiple/redundant alarms during abnormal situations;
- availability of sufficient information in easy-to-access and easy-to-understand form as to what the alarm is, where it’s occurring, and any other relevant supporting material; and
- guidance about the expected response.
These expectations accord with universally accepted principles that characterize alarms: an alarm should represent an abnormal situation that requires a response [1–3], and alarm systems should aim to alert, inform and guide operators to help them deal with abnormal situations.
If helping operators do their jobs isn’t incentive enough to implement an effective alarm system or remediate an existing one, consider the perspectives of quality control groups and plant management.
If you were a quality assurance representative, regulatory inspector or a plant manager and saw that a plant was generating more than 1,000 alarms/month, wouldn’t that raise a red flag about whether the plant was “in control”? Wouldn’t you be particularly interested in any alarms pertaining to safety, environmental compliance or product quality due to their possible regulatory/liability implications? Wouldn’t the number of alarms suggest high plant variability and resulting consequences involving queues, increased average cycle time, reduced product yield and plant throughput, and extra runs needed to evaluate new ideas in plant trials?
If a high number of nuisance alarms is a major factor, that raises other issues. It suggests that alarms weren’t appropriately rationalized by the process design team or weren’t configured properly into the control system. This, in turn, might cause some operators to lose respect for the alarm system and start ignoring alarm annunciations, possibly then missing real abnormal-situation alerts.
The current situation
It’s not unusual for a manufacturing facility to average more than 1,000 alarms/month (with some plants averaging far more). Many operators are frustrated by both the number of nuisance alarms they receive and the total number of alarms. Quality control personnel can be frustrated with a large number of abnormal situation incidents that must be formally investigated before product is forward-processed or released for sale. Plant management can be frustrated with an unacceptably high level of process variability caused by frequent abnormal situations.
A number of factors contribute to the current situation:
Increase in configured alarms. Plant personnel continue to add alarms to automation systems. The number of configured alarms per operator has risen dramatically during the past few decades (Figure 1). Some of this is due to the increasing implementation of smart sensors and valves. These devices communicate a large amount of information to the host process control computer, so automation engineers are tempted to configure alarms on much of this additional information. Always ask the question: Does this really represent an abnormal process upset requiring a response from the operator? If the answer is no, it’s not an alarm and shouldn’t be configured as one. Use the same logic for alarm remediation projects for existing systems.
Figure 1. Low cost and ease of implementing alarms in digital control systems have contributed to steep growth. Data reproduced with permission of PAS.
Low alarm implementation cost. Configuring an alarm has dropped dramatically in cost as the use of digital control systems has proliferated. Gone are the days when a new alarm required wiring and hardware additions to panel boards or other relay devices and several days or more to schedule the work. Now it often can be done in minutes by one person with a few simple keystrokes on a computer engineering console.
More actual alarms per operator. The rise in configured alarms per operator normally correlates to a subsequent increase in actual alarms per operator. Sometimes, the frequency of alarms exceeds what an operator can reasonably be expected to handle, as suggested by EEMUA.
Lack of sufficient guidance on how to respond to so many different alarm conditions generally exacerbates the problem.
Standard Operating Procedures (SOP) and operator manuals may exist but may not be quickly accessible when a major abnormal situation occurs on the plant floor.
Sometimes, basic documentation is missing. For instance, some project design teams fail to document both the rationale for selecting attribute values for an individual alarm and the expected operator response to abnormal situations. Again, always remember that an effective alarm should alert, inform and provide guidance to the expected response.
Multiple alarms per incident. An abnormal situation sometimes can generate many alarms, a so-called alarm flood — often confusing operators and generating information overload, rather than providing a single more intelligent conclusion for operators about what’s occurring.
Distribution of alarm logic in many parts of the system. Part of the problem with overall alarm management is that alarm functionality resides throughout the automation system (Figure 2). For example, field devices, on-line analytical systems and Safety Instrumented Systems (SIS) as well as programmable logic controllers (PLC), distributed control systems (DCS) and interfaced third-party systems (e.g., expert systems) can generate alarms. ISA Standard S-95 offers some perspective on this, defining several different levels of automation.
Figure 2. Many levels of plant automation can generate alarms.
It’s a challenge to create an environment that consolidates alarm information such that a single user interface and alarm acknowledge/response paradigm exists.
However, having alarms displayed in one place in a consistent format will help operators more effectively react to abnormal situations; having alarm records in a single database will greatly facilitate mining data for their information and knowledge content and generating batch reports.
Use of the alarm system for notification messages. At many plants, operator information overload is made worse by having the alarm system handle routine notifications (information messages). In fact, this has been blamed as a contributing factor to the severity of certain well-publicized plant disasters. Using the alarm system for routine notifications sometimes is pursued as a convenience for the automation engineers and sometimes represents a limitation of a vendor’s automation software. Regardless, notifications shouldn’t appear as alarms to operators.
Nuisance alarms. Operators at many plants are frustrated by a large number of alarms that don’t represent abnormal situations or don’t require a response. In cases where operators receive frequent nuisance alarms, they may lose respect for the alarm system and then sometimes miss real abnormal situations while responding to (or ignoring) nuisances.
Batch processing adds complexity
Plants that rely upon batch operations face additional challenges in configuring and managing alarms. These include:
- organization of processes as a sequence of steps/phases, often with transients involved in moving from step to step;
- the non-steady-state nature of batch processes, with time-varying set points and alarm limits;
- increased use of notifications;
- need for a different time reference, i.e., relative time rather than calendar time;
- necessity to query, sort, and report alarms by lot number and other batch parameters; and
- adherence to the ISA S-88 Batch Standard.
Let’s look at each of these particular challenges in a bit more detail:
A sequence of steps. While all chemical processes have multiple steps/phases (even continuous processes have start-up and shut-down steps), batch processes typically are characterized by a relatively short total cycle time (e.g., days), numerous steps, and significant activity (automated or manual) involved in transitioning between steps.
For instance, an overall batch sequence of unit operations might include: preparation of raw materials (thawing, milling, screening, putting into supply tanks, etc.), chemical reaction, filtration, chromatography purification and then crystallization.
In addition, a sequence of steps might take place within a single unit operation. As an example, a chemical-reaction unit operation might consist of: automated cleaning of the vessel, raw material filling (including weighing operations), heat-up, reaction, cool-down and finally harvesting.
Most alarms are relevant to one or more of the batch steps but not others and, therefore, need to be configured as a function of process step/phase. In addition, whenever possible, the alarm record tag should note the applicable process step or phase to facilitate obtaining relevant historized information specific to a step.
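As a rough illustration of configuring an alarm as a function of process phase, the sketch below enables a high-temperature alarm only during the phases listed for it and stamps each alarm record with the phase in which it occurred. The tag name, limit and phase names are hypothetical and aren't taken from any particular vendor's system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmConfig:
    tag: str                  # hypothetical alarm tag name
    high_limit: float         # alarm trips when the PV rises above this value
    active_phases: frozenset  # batch phases in which the alarm is meaningful


def evaluate(config, phase, pv):
    """Return an alarm record stamped with the phase, or None if no alarm applies."""
    if phase not in config.active_phases:
        return None  # alarm is not relevant to this phase
    if pv <= config.high_limit:
        return None  # PV is within the limit
    return {"tag": config.tag, "phase": phase, "pv": pv}


reactor_temp_high = AlarmConfig("TI-101.HI", 85.0, frozenset({"reaction"}))

print(evaluate(reactor_temp_high, "cleaning", 90.0))  # None: alarm not active during cleaning
print(evaluate(reactor_temp_high, "reaction", 90.0))  # record tagged with phase "reaction"
```

Carrying the phase in the record itself is what later allows historized alarms to be pulled up by step.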
Non-steady-state operations. In contrast to continuous processes, batch processes typically have few, if any, steady-state characteristics. For example, control of the chemical reaction step of the process described above might involve monitoring or control of time-varying reactant feed rates, temperature, pH, etc.
The control of time-varying processes can result in nuisance alarms if alarms aren’t appropriately configured, such as immediately following a set-point step change. Some plants pursue ramping of set points and use of “deviation from set point” alarm tolerances (rather than absolute values) as one technique to help avoid nuisance alarms.
Sometimes alarm limits must be calculated time-varying values. So check whether vendor software can accommodate this need.
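A minimal sketch of the ramping and deviation-from-set-point technique follows. It assumes a linear ramp and a symmetric tolerance; the function names, temperatures and ramp rate are illustrative only.

```python
def ramped_setpoint(start, target, ramp_rate, t):
    """Set point at elapsed time t when moving linearly from start to target."""
    span = target - start
    if span == 0 or ramp_rate <= 0:
        return target
    travelled = min(abs(span), ramp_rate * t)  # never overshoot the target
    return start + travelled * (1 if span > 0 else -1)


def deviation_alarm(pv, setpoint, tolerance):
    """True when the PV is further from the set point than the tolerance allows."""
    return abs(pv - setpoint) > tolerance


# Heating from 25 to 65 at 2 degrees per minute, checked 5 minutes into the ramp.
sp_now = ramped_setpoint(25.0, 65.0, 2.0, 5.0)
print(sp_now)                               # 35.0
print(deviation_alarm(34.0, sp_now, 3.0))   # False: PV is tracking the ramp
print(deviation_alarm(34.0, 65.0, 3.0))     # True: what a step change would have reported
```

The last two lines show the point of the technique: the same PV that tracks the ramp comfortably would have tripped a nuisance alarm had the set point jumped straight to its final value.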
A very popular feature, developed, implemented and published by one manufacturing company, recognizes the difficulty operators have in remembering what the typical values of time-varying process parameters should be at particular points in time.
Therefore, its plant’s trend plots of current production runs include a backdrop showing the calculated time-varying normal range of successful historical runs (Figure 3). The process variable (PV) of a current run crossing over these historical backdrop lines can be the basis of an alarm (i.e., an indication that the current run is not normal).
Figure 3. Comparing a current batch run to the historical range of satisfactory runs can identify an abnormal situation.
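The idea behind such a backdrop can be sketched as follows. The published feature's actual calculation isn't described here, so this example simply assumes that the normal range at each batch-relative time point is the minimum and maximum seen across successful historical runs sampled on the same time grid; the data values are invented.

```python
def historical_envelope(runs):
    """Per-time-point (low, high) band from runs sampled on the same relative-time grid."""
    return [(min(values), max(values)) for values in zip(*runs)]


def excursions(current, envelope):
    """Indices of time points where the current run falls outside the band."""
    return [
        i
        for i, (pv, (low, high)) in enumerate(zip(current, envelope))
        if pv < low or pv > high
    ]


# Hypothetical pH profiles from three successful runs, four samples each.
good_runs = [
    [7.0, 6.8, 6.5, 6.4],
    [7.1, 6.9, 6.6, 6.3],
    [6.9, 6.7, 6.4, 6.2],
]
envelope = historical_envelope(good_runs)
current_run = [7.0, 6.8, 6.1, 6.3]

print(excursions(current_run, envelope))  # [2]: third sample is below the historical band
```

A statistical band (for example, mean plus or minus a multiple of the standard deviation) could be substituted for the min/max band without changing the rest of the logic.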
Notifications. Batch processes often require the generation of information messages because they typically have a large number of sequential steps and the transition from one step to the next may involve prompts to operators to pursue specific manual operations.
Notifications aren’t alarms and, vendor software permitting, shouldn’t be configured to the alarm system. Instead, this information should be stored and displayed in a separate part of the Human Machine Interface (HMI). If this can’t be done with a particular automation system, then every effort should be made to display the information so that operators can distinguish notifications from alarms (for instance, using different colors or audio signals).
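One way to picture the separation is a simple routing step that applies the definition of an alarm (abnormal situation and response required) before anything reaches the alarm list. This is only a sketch; the message texts and the two destination lists are hypothetical stand-ins for whatever the HMI actually provides.

```python
from enum import Enum


class MessageKind(Enum):
    ALARM = "alarm"
    NOTIFICATION = "notification"


def classify(abnormal, response_required):
    """Only abnormal situations that require a response qualify as alarms."""
    if abnormal and response_required:
        return MessageKind.ALARM
    return MessageKind.NOTIFICATION


def route(messages):
    """Split (text, abnormal, response_required) tuples into alarm list and message log."""
    alarm_list, message_log = [], []
    for text, abnormal, response_required in messages:
        if classify(abnormal, response_required) is MessageKind.ALARM:
            alarm_list.append(text)
        else:
            message_log.append(text)
    return alarm_list, message_log


alarms, notifications = route([
    ("Reactor temperature high", True, True),
    ("Heat-up phase complete", False, False),
    ("Add raw material B and confirm", False, True),  # routine operator prompt
])
print(alarms)
print(notifications)
```

Note that the operator prompt requires a response but isn't abnormal, so it stays out of the alarm list.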
Use batch time rather than calendar time. Those who work with batch processes tend to think of their processes in terms of relative batch time, so that “Time = 0” indicates the start of the batch or batch step.
However, almost all automation systems collect, display and store process data, including alarm records, in calendar time. This causes significant inefficiency when analyzing such data, as users must constantly translate calendar times to batch-relative times to provide the appropriate context in analyzing the significance of an alarm.
While the value of displaying batch data in relative time has been known for decades, some vendors only recently have been providing such functionality in batch historian products. So, consider insisting upon this capability in the functional requirements for new systems and adding utilities to provide this functionality in existing systems.
Note the use of relative time in Figure 3.
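For existing systems that store only calendar time, a small utility along the following lines can supply the batch-relative view. It assumes the batch (or batch step) start time is available from the batch record; the timestamps and the "T+hh:mm" display format are illustrative choices.

```python
from datetime import datetime


def batch_relative(alarm_time, batch_start):
    """Elapsed time between the start of the batch (or step) and the alarm."""
    return alarm_time - batch_start


def format_relative(delta):
    """Render an elapsed time as T+hh:mm (assumes the alarm follows the batch start)."""
    total_minutes = int(delta.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"T+{hours:02d}:{minutes:02d}"


batch_start = datetime(2024, 3, 1, 22, 15)  # hypothetical batch start
alarm_time = datetime(2024, 3, 2, 1, 45)    # alarm logged in calendar time

print(format_relative(batch_relative(alarm_time, batch_start)))  # T+03:30
```

With this conversion, an alarm logged after midnight on the following calendar day reads simply as three and a half hours into the batch.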
Data query efficiency. Batch manufacturing professionals often desire to analyze process data based on lot number. This allows them to compare one particular lot to another, facilitates the generation of batch reports and enables them to efficiently access information relating to specific abnormal situations that may have occurred during a batch. Unfortunately, many commercial automation systems don’t include lot number in their process data or alarm record tags.
At the start of a project’s life, engineers and automation personnel clearly need to consider who’s likely to access and use alarm records and for what purposes. For instance, there’s often value in generating batch reports at the end of a run. Quality control groups may want to know about any and all product-quality-related alarms.
Other reports might highlight environmental excursions (i.e., potential violations of environmental permits) or safety incidents. Such reports can help plant personnel quickly focus on that part of plant operations they are responsible for and determine which abnormal events during a batch need to be investigated further.
So, if possible, tag alarm records with the batch lot number and alarm category/class, such as safety, environmental or product quality. If the tag can’t accommodate these details, make sure that software utilities such as dictionaries, etc., exist within the data historian to align alarm records with their appropriate lot number and category/class.
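The sketch below shows both approaches in miniature: alarm records that carry lot number and category directly, and a lookup dictionary that supplies the category when the record itself can't. The record layout, tag names, lot numbers and dictionary are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlarmRecord:
    tag: str
    lot: str
    category: Optional[str] = None  # e.g. "safety", "environmental", "quality"


# Fallback dictionary for records whose tag cannot carry a category.
CATEGORY_BY_TAG = {"PH-205.LO": "quality"}


def category_of(record):
    """Category from the record itself, else from the lookup dictionary."""
    return record.category or CATEGORY_BY_TAG.get(record.tag, "uncategorized")


def lot_report(records, lot, category=None):
    """Alarm records for one lot, optionally restricted to one category."""
    return [
        r for r in records
        if r.lot == lot and (category is None or category_of(r) == category)
    ]


records = [
    AlarmRecord("TI-101.HI", "LOT-0042", "safety"),
    AlarmRecord("PH-205.LO", "LOT-0042"),
    AlarmRecord("PH-205.LO", "LOT-0043"),
]

print(lot_report(records, "LOT-0042", "quality"))  # only the pH alarm from lot 0042
```

A quality control group could then request only the product-quality alarms for the lot under review.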
ISA’s S-88 Batch Standard. This standard was developed in the 1990s to address, in part, four basic problems that industry was encountering:
- lack of a universal model for batch control;
- difficulty in communicating batch processing requirements;
- obstacles in integrating solutions from different vendors; and
- trouble in configuring batch control solutions.
These problems led to expensive batch control systems that often didn’t meet the needs of users and were difficult to maintain.
The ISA S-88 Standard defines terms and provides a common framework for discussion of batch operations. It establishes some common models (e.g., procedural, physical and state) for understanding equipment and the sequences involved in batch operations.
One of many features of this standard is the separation of equipment logic from product recipe logic. Historically, when the code to run equipment and the code that defines a product recipe are in the same device (e.g., a PLC), the two different sets of code eventually can become indistinguishable and in some cases inseparable. So, changes to the product recipe or to process equipment can require excessive effort in software modification; documentation is then often difficult. This makes recipes resource-intensive and hard to maintain. Therefore, ISA S-88 provides a structure that separates recipes for making a product from the code specific to equipment in which the product is made. The value of this is especially apparent at batch manufacturing plants that use the same equipment to make several different products.
S-88 also provides guidelines on how to recover from abnormal events (which, of course, are typically associated with alarms). These guidelines include definition of procedural commands such as start, hold, pause, resume, restart, stop, abort and reset.
In addition, S-88 defines various states of a process (e.g., running, holding, pausing, aborting, etc.). This adds another dimension to the applicability and, therefore, configuration of a particular alarm — namely, the enabling of an alarm specific not only to a batch process step or phase but also to the process state. Managing states and transitions between states is a very important batch automation topic in that, for many processes, the amount of logic associated with these aspects of a process greatly exceeds that for the normal running of the process.
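To illustrate the extra dimension that process state adds, the following sketch keys alarm enabling on the combination of phase and state rather than on phase alone. The alarm names and the particular combinations chosen are hypothetical; the state names are among those the standard defines.

```python
# Each alarm lists the (phase, state) combinations in which it is enabled.
ENABLED_IN = {
    "AGITATOR-STOPPED": {("reaction", "running")},
    "TEMP-HIGH": {("reaction", "running"), ("reaction", "holding")},
}


def is_enabled(alarm, phase, state):
    """True if the alarm should be active for this phase and process state."""
    return (phase, state) in ENABLED_IN.get(alarm, set())


# While the reaction phase is holding, a stopped agitator is assumed here to be
# expected, but an overheating vessel still needs a response.
print(is_enabled("AGITATOR-STOPPED", "reaction", "holding"))  # False
print(is_enabled("TEMP-HIGH", "reaction", "holding"))         # True
```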
Vendor software off-the-shelf functionality
Most automation vendors’ roots are in the continuous process industries; they only recently have been addressing the unique needs of batch processes, including regulatory requirements for validation (such as electronic records and signatures).
Keep this in mind when starting a new automation project. The documented system functional requirements that typically are pursued near the beginning of a project may influence vendor selection or the need for customization of the system.
Often when a project engineering team has assumed that a vendor’s automation system will provide all the alarm (and other process control) functionality that will be needed, the end result has frustrated both automation engineers and users when it was discovered too late in the project timeline that the system didn’t provide all the desired functionality. Other project engineering teams have recognized alarm management limitations in commercial off-the-shelf software systems and, therefore, sometimes have chosen to interface third-party products (alarm loggers, paging systems, expert systems, etc.) with their core DCS/PLC/historian automation systems.
Other project teams have worked with automation vendors to customize their standard product offerings.
If your plant is struggling with huge numbers of alarms, take heart. Consider the total number of alarms that exist on an automobile. Although cars are complicated systems involving many different components, electrical circuits, frequent load changes and even critical safety operations, only a few alarms usually appear on most automobiles: e.g., low fuel level, unfastened seat belt, unauthorized entry, and engine maintenance needed. Each of these alarms represents an abnormal situation requiring a user response and presents information in a format that’s timely, accurate and easy to understand. A single alarm (not multiple alarms) warns of each type of abnormal situation.
This isn’t to say that a manufacturing plant should have very few configured alarms — but it does suggest that adherence to alarm management best practices can result in far fewer alarms than would otherwise exist. Indeed, it’s common for alarm remediation projects in industry to reduce total alarms by over 70%.
Take the crucial steps
The key considerations in achieving effective alarm systems include defining objectives early in a project’s life (i.e., in a plant’s alarm philosophy or a system’s functional requirements), adhering to the definition of an alarm, and implementing alarm-management best practices.
The following checklist can help plant project teams in avoiding some of the pitfalls in designing and implementing alarm management systems and in working more effectively with vendors.
- Identify desired alarm functionality when defining alarm management philosophy and system functional requirements — near the beginning of a project when there’s still time for customization (e.g., outputting alarm alerts to pagers). Don’t assume that vendor off-the-shelf products will provide all desired alarm functionality.
- Define in the early stage of a project what use will be made of alarm records. This can influence the determination of alarm attributes (e.g., categories and priorities) when designing/rationalizing individual alarms. It also can help identify the need for specific data mining and reporting software utilities.
- With minimal exceptions, configure alarms only for abnormal situations requiring a response. System permitting, use other areas of the HMI for informational/notification messages or use color and other graphical means to distinguish between alarms and such messages.
- When designing or rationalizing alarms, determine and document expected operator response. Consider coding the expected response on-line. Remember that an alarm’s purpose is to alert, inform and guide.
- Include tools in historians (or third-party products such as alarm loggers) to allow for the querying, sorting, charting and batch reporting of alarm records. For example, a Pareto chart could be used to depict individual alarm frequency, thereby helping engineers identify and focus on the situations generating the most alarms.
- Implement a monitoring program and pursue continuous improvement (including reduction of nuisance alarms).
- Maintain the mindset that nuisance alarms can be a major frustration for operators, may contribute to their missing real abnormal situations, and, if excessive, may call into question the qualification/validation of the system.
- Recognize that responsibility for alarm management typically is shared among process engineers (who should specify what alarms are required), automation engineers (who should implement the requested alarms) and operators (who are the primary customers). Assigning alarm management responsibility to only one or two of the above groups often is a formula for failure.
- Become familiar with ISA’s S-88 (for batch processes).
- Keep an eye out for ISA’s S-18 Standard on “Management of Alarm Systems for the Process Industries.” It likely will be published later this year. This standard builds upon the best guidance available in recent years, namely EEMUA’s Publication 191. It will define expectations for most key alarm-management activities during an automation system’s life cycle. Major sections will cover: philosophy/specifications; alarm rationalization/design; implementation/training; change management; and monitoring/assessment.
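The Pareto analysis mentioned in the checklist above needs little more than a frequency count. The sketch below ranks alarm tags by how often they occurred and adds a cumulative percentage, which is the data a Pareto chart plots. The tag names and counts are invented for illustration.

```python
from collections import Counter


def pareto(alarm_tags):
    """Rows of (tag, count, cumulative percent), most frequent alarm first."""
    counts = Counter(alarm_tags).most_common()
    total = sum(n for _, n in counts)
    rows, running = [], 0
    for tag, n in counts:
        running += n
        rows.append((tag, n, round(100 * running / total, 1)))
    return rows


# Hypothetical alarm log for one month: one tag per alarm occurrence.
log = ["LI-200.LO"] * 6 + ["TI-101.HI"] * 3 + ["PI-310.HI"] * 1

for row in pareto(log):
    print(row)
# ('LI-200.LO', 6, 60.0)
# ('TI-101.HI', 3, 90.0)
# ('PI-310.HI', 1, 100.0)
```

In this made-up log, two tags account for 90% of all alarms, which is where a remediation effort would start.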
When working with vendors, keep in mind that they should be able to provide:
- configuration of all alarm attributes (rather than having them hard coded).
- change of alarm attributes as a function of batch process state, step/phase and process time/condition — with appropriate access security and audit trail functionality when manually making such changes on-line, as required for certain regulated industries (e.g., pharmaceutical and nuclear).
- several user-definable fields (such as lot number and batch step/phase) in alarm record tags.
- alarm configuration using all appropriate information in the automation system (for example, “if-then-else” rules). This often can lead to more intelligent on-line conclusions about a process versus just comparing a current value to a set point within an individual control loop.
- view of alarm information in relative time (i.e., from the beginning of the batch or batch step) in addition to calendar time.
- software utilities (such as Pareto charts) to help users analyze alarm information.
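As a small example of the “if-then-else” style of alarm configuration listed above, the sketch below combines pump status and valve position with a flow reading so that the operator receives one conclusion instead of a bare limit violation. The signals, limit and message wording are hypothetical.

```python
def low_flow_alarm(flow, low_limit, pump_running, valve_open):
    """Return a single alarm message, or None if nothing abnormal is indicated."""
    if not pump_running:
        return None  # no flow is expected, so low flow is not abnormal
    if not valve_open:
        return "Pump running against a closed valve"
    if flow < low_limit:
        return "Low flow with pump running and valve open: check for blockage or leak"
    return None


print(low_flow_alarm(0.0, 5.0, pump_running=False, valve_open=False))  # None
print(low_flow_alarm(0.0, 5.0, pump_running=True, valve_open=False))
print(low_flow_alarm(2.0, 5.0, pump_running=True, valve_open=True))
```

A plain low-flow limit check would have alarmed in all three cases, including the first, where the pump is simply off.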
Joseph S. Alford, Ph.D., P.E., C.A.P., is a consultant in Zionsville, IN. He recently retired after a long career as an engineering advisor for a major pharmaceutical company.
References
1. “Alarm Systems: A Guide to Design, Management and Procurement,” Publication 191, 2nd ed., Engineering Equipment and Materials Users’ Association (EEMUA), London, U.K. (2007).
2. Hollifield, B. and E. Habibi, “The Alarm Management Handbook: A Comprehensive Guide,” PAS, Houston (2006).
3. Alford, J., R. Stankovich and J. Kindervater, “Alarm Management: Regulatory Expectations and Selected Best Practices,” p. 25, CEP (April 2005).
4. “Batch Control, Part 1: Models and Terminology,” ANSI/ISA-88.01, ISA, Research Triangle Park, N.C. (1995).