Make the Most of Historical Process Data

The past can provide critical insights for the future.

April 27, 2009

People today expect that they can quickly check their investment portfolios over the last quarter, year or multiple years with a few mouse clicks. When you visit Amazon.com, your recent purchase and viewing history is available. This typifies our growing appetite for historical information in our personal world. Yet, many processes lack the same basic historization and analysis tools that we find beneficial in our personal lives. If data historization is desirable and adds value to your 401(k) investments, why wouldn’t you demand it for your process data?

Although the idea of data historization is straightforward and historians already have demonstrated their value across industry, the road to successful implementation can be treacherous. One of the early speed bumps that you’ll encounter is justification of implementation and maintenance costs. How can you assess the value of data to which you don’t currently have access? Predicting or quantifying in advance the discoveries of patterns and events from these data may not be possible, but experience has shown that they almost always occur.

Sometimes these discoveries can be dramatic, resulting in significant savings that far outweigh the cost required to implement data historization. Analysis of past “mistakes” may help with early detection of current process issues, allowing you to prevent lost or scrapped batches and material. The projected value of the solution should include expected savings due to increased production, reduced waste and less downtime. Take advantage of demo or trial versions of historization software you’re considering; this may lead to early recognition of data patterns and events before the full solution has been implemented.

Value in History

The primary reason to implement a data history solution is to gain a deeper understanding of your data so you can reduce waste, improve efficiency and save money. In batch processes, manufacturers strive to replicate the perfect or “golden” batch.
With historical batch data and the proper analysis tools, you can determine what contributes to the golden batch. Initially, you may be able to pinpoint batches that were highly successful, although you may not know what led to each success. Analysis of successful batches versus unsuccessful ones can result in identification of key process measurement profiles; successful batches may share a common profile for one or more related process measurements. As new batches begin, tracking these measurements against historically created profiles can boost the percentage of successful batches.

A data historian and the right tools and resources, coupled with continuous data collection during uptime and downtime, allow analysis that can provide insights about production downtimes, enabling you to increase your runtime, product output and profits. A few third-party software products target this task; they leverage existing historians or implement their own data historization specifically for downtime analysis.

Historical data also offer benefits for diagnostics and predictive maintenance of equipment such as pumps and valves. They can allow you to follow the degradation of a part over time so that preventive maintenance can occur when needed. They can help prevent unexpected failures due to broken parts, premature wear or other unexpected mechanical problems. For example, you can track the torque level of a valve actuator over time to see variances from the norm.

Sometimes, regulatory or industry requirements mandate implementation of a data historian. In such cases, their strict stipulations may narrow your solution choices. Regulatory requirements for your process, particularly in industries such as pharmaceuticals, may demand that you provide data history and may also mandate extensive certification of your data historization and presentation tools, which means thoroughly testing and documenting the accuracy of data retrieval, storage and playback.
This may lock you into current software versions because making any changes to the system, including upgrading software to the latest version, may require that you recertify upgraded components. Due to these strict requirements, some companies implement “uncertified” installations that can be frequently upgraded and modified, with the understanding that data contained in the system are also uncertified and therefore can’t be used when making process or production decisions.

Requirements for continuous emissions monitoring systems (CEMS) have steadily expanded to more geographical areas and industries. CEMS involves collection, storage and reporting of data related to emissions such as NOx, SOx and CO₂. In some of the stricter reporting areas, data must be electronically reported daily to the supervising district. In many cases there’s oversight and validation of the implemented solution, requirements for long-term and accurate storage of emissions data, rules and requirements for calculating and reporting data, and heavy financial fines for incomplete, missing or inaccurate reporting. Unfortunately, the complex and varying rules make purchasing an “out of the box” CEMS solution difficult.

Selecting a Solution

Although you may be able to build a custom data-archiving solution in-house, you may quickly find that it can’t deliver more than basic required functionality without a significant investment of time, money and resources. Fortunately, over the past few decades, companies and products that focus on process data historization have emerged. Their product maturity and breadth of features make custom solutions difficult, if not impossible, to justify.
These products support efficient data collection for hundreds of data sources, including programmable logic controllers (PLCs), distributed control systems (DCSs), supervisory control and data acquisition (SCADA) systems and OPC servers. They provide efficient storage and backup of historized data, offer value-added functionality such as aggregation, and serve data to clients through a rich set of tools including trends, displays and reports. Scalability makes a core set of functionality available to all customers regardless of the number of archived data points. The number of archived data points and the client products used typically determine price.

Even though your initial data historization project may involve a single data source, you should consider sources that may be added in the future. You currently may own some type of data historization solution through your existing DCS or SCADA system; these solutions typically concentrate on the historization of their own data, though. As your solution expands, the ability to support the collection of process data from multiple data sources will become increasingly important. Different DCSs, PLCs and SCADA systems may be deployed across your company, so selecting a historization solution that will support all of them is crucial.

Data collection is performed by a “data interface” or “data collection” program for each specific data source. These data interfaces usually are written using software drivers from the data source vendor to poll the source for new data values. The new values, along with the collection time, are sent to the data archive for historization. An increasing number of vendors are making their process data available through OPC, so the “OPC data collection” interface has become the most popular method for process data collection.

Data storage typically involves either proprietary data archives or relational databases. The current leaders in process data historization rely on their own proprietary data archives.
Relational databases generally don’t deliver the features or performance of custom or proprietary data archives that have been designed specifically for the storage of time-stamped process data.

Many data historians will allow you to store a subset of the total number of data values scanned for each measurement point. This “data compression” stores a new value only when it moves outside a prescribed deadband around the last stored value. Data compression is one of the least understood mechanisms of data history, yet its intelligent implementation can lead to significant performance improvements.

For example, you may want to historize the state of a pump (running/stopped) by scanning the current pump state every second. If the state changes once per hour, storing the one-second pump state values is redundant and unnecessary. By storing only the changed value, you can reproduce the same history with far fewer values and less storage space. This also speeds retrieval of the pump state values because fewer values were stored. For continuous readings such as temperatures and pressures, storing data within an allowed deadband can prevent storage of instrument noise or of data with more precision than the instrument you are reading can provide.

Accessing Your Historized Data

The tools and applications for accessing your historical data are as important as the data themselves. The ease of use, reliability and features of the tools will drive the acceptance and popularity of your historization solution with its users. The majority of users will access your data through three methods: graphical data displays, ad-hoc analysis and reports.

Graphical displays present process data through a collection of historical and real-time objects such as trends, values and bar charts, along with static and dynamic process equipment and pictures. One benefit of a graphical display application with history is the ability to take a real-time HMI display and replay the display over time.
Although web browsers aren’t always the best framework for viewing graphical content, many applications have moved to web-based solutions because they don’t require installations and upgrades for each client machine. Newer technologies are allowing companies to produce products that merge the benefits of web applications and desktop applications, providing rich graphical displays with little or no installation.

Microsoft Excel is the most popular tool for process data analysis and reporting. Most historians allow you to easily retrieve the results of simple and complex data queries directly into Excel. Some tools will perform complex aggregation or filtering of data before they are retrieved, while others retrieve raw or sampled historical data and rely on aggregation and filtering within the spreadsheet. Although Excel can be used for report creation, its popularity stems from its capabilities as an ad-hoc analysis tool. Applications that concentrate on report generation, such as SQL Server Reporting Services, are far more powerful than standard spreadsheets for that task.

It’s important that your historical data don’t exist within a black box that’s only accessible through the vendor’s applications. This means the vendor should permit external access through its own programming interfaces and should support more open standards such as OPC and OLEDB at reasonable cost. After all, these are your data, and the best solution may involve combining applications from various vendors. So, open access to historical data can be crucial.

Make History Your Future

Access to process history is a must-have in today’s competitive landscape. The maturity of current historization products makes successful implementation an attainable goal. It’s important to chart your long-term objectives for data historization, ensuring that the chosen solution is robust and can handle your various sources of process data.
Open standards and a variety of third-party products allow you to customize your solution to maximize user satisfaction and payback.

Tips and Traps

Underappreciated issues can undermine an implementation. Pay particular attention to four areas:

1. Scan rates. A historian can’t store a value that wasn’t scanned. Data collection nodes for a historian are responsible for retrieving new measurement values. These nodes typically poll measurements at a configured interval. You may have the ability to scan groups of points at different intervals. The scan rate must allow the historian to accurately reproduce the measurement waveform. A common mistake is setting the scan interval too long (that is, the scan rate too low), increasing the likelihood that the collection node won’t scan important values. For example, if the collection node is scanning a voltage measurement every 5 seconds, a 3-second voltage spike may not be scanned or recorded in the historian. So, during configuration, it’s imperative to involve people familiar with the measurement behavior. Someone who understands the process and the measurements that are being collected would know that a 3-second voltage spike could be an issue. When selecting scan rates, you need to ask the question “What is the minimum period in which a significant value change can occur?” Too often people implement blanket settings that aren’t appropriate for all measurements. For example, scanning temperatures once per minute would be fine for a large vessel where significant changes (0.5°) take tens of minutes, but it wouldn’t be good for an exhaust air temperature that can change 10° in less than 5 seconds, because you will miss significant events. If you don’t know the appropriate interval, check with a person who does.

2. Data retrieval. Most process data historians record a set of measurement values along with a status and time stamp. These values may be stored at irregular intervals, depending on the data collection and filtering settings for the measurement point.
Reports and analysis tools may offer a standard set of retrieval functions; the data consumer must understand these functions. Users like to create reports by retrieving evenly spaced samples of the recorded values because these calls result in a known number of returned values. Yet, sampled retrieval calls often will miss important recorded events that occur between the evenly spaced samples, resulting in inaccurate reports.

3. Aggregates. Data historians may offer a multitude of aggregate functions that return the results of calculations performed on the historized data. Many of these aggregates have similar names or names that may be confused with common language usage. So understanding the details of each function is imperative. For example, a “total” may return an integration or a summation; an average may be time-weighted or a mean of the recorded values. If possible, retrieve the historized data, compute your own calculation result, and compare it to the historian’s aggregate function result to verify the expected operation.

4. What time is it? Time-stamped process data in a historian present many opportunities for errors and confusion.

• Collection nodes may time-stamp the scanned data based on the current time of the data source (e.g., PLC, DCS), which may differ from the current time of the historian or other data sources stored in the same historian.

• Users may be in time zones different from that of the data historian computer. So a request for “the value at 8:30” may return different values for different users. Can users request data according to their time zone or must they use the time zone of the historian?

• Clock changes for daylight saving time result in one 23-hr day and one 25-hr day. What happens to a daily report during these two days?

• Some “events” occur within the confines of continuous time and require additional processing and correlation tools.
Such events have a defined start and end time, and include traditional batch and sub-batch operations, equipment startups, environmental emission violations, and equipment failure or downtime. Tracking these events requires storage of the event start and end times along with the ability to correlate the historical process data within the event time. Traditional absolute-time-based tools may not work well when performing event-based analysis. Event analysis demands tools that provide event querying (when did the events occur?), correlation of the process data within the event, removal of absolute time and “time-into-event” analysis.

Dane Overfield is product development lead at Exele Information Systems, Inc., East Rochester, N.Y. E-mail him at [email protected].
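
As a rough illustration of the aggregate trap in item 3 of Tips and Traps, the following Python sketch computes an arithmetic mean and a time-weighted average from the same recorded values. The sample data, the function names and the assumption that each value is held constant until the next recorded value (step interpolation) are illustrative only and aren’t taken from any particular historian; a real product may interpolate differently.

```python
from datetime import datetime, timedelta


def arithmetic_mean(samples):
    """Plain mean of the recorded values, ignoring how long each was held."""
    return sum(value for _, value in samples) / len(samples)


def time_weighted_average(samples, end_time):
    """Average weighted by how long each value was held.

    Assumes step interpolation: a value stays in effect until the next
    recorded value, and the last value is held until end_time.
    """
    # Each sample's "stop" time is the next sample's timestamp.
    stop_times = [timestamp for timestamp, _ in samples[1:]] + [end_time]
    weighted_sum = 0.0
    for (start, value), stop in zip(samples, stop_times):
        weighted_sum += value * (stop - start).total_seconds()
    duration = (end_time - samples[0][0]).total_seconds()
    return weighted_sum / duration


# Hypothetical one-hour window; values were stored only when they changed.
start = datetime(2009, 4, 27, 8, 0, 0)
end = start + timedelta(hours=1)
recorded = [
    (start, 0.0),                             # 0% for 50 minutes
    (start + timedelta(minutes=50), 100.0),   # 100% for 5 minutes
    (start + timedelta(minutes=55), 0.0),     # 0% for the last 5 minutes
]

mean_value = arithmetic_mean(recorded)
twa_value = time_weighted_average(recorded, end)
print(f"mean of recorded values: {mean_value:.2f}")
print(f"time-weighted average:   {twa_value:.2f}")
```

With these hypothetical values the plain mean is about 33.3 while the time-weighted average is about 8.3, because the 100% reading lasted only 5 of the 60 minutes. Comparing a hand calculation like this against the historian’s own aggregate is one way to confirm which definition it uses.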