Even if you get it right at the beginning, it won't last. This is where characterization is important. Knowing the exact characteristics of the data values allows conversion and aggregation routines to bring consistency to the information.
Any data quality initiative must therefore be capable of unit-of-measure conversion and of aggregation over different times and scopes, so that data reported at differing levels in the organization or for different materials can be combined.
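As a minimal sketch of what such a conversion and aggregation routine might look like, the Python below converts volume readings to a common base unit and then totals them by day and material. The unit names, the choice of cubic metres as the base, the reading layout and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict

# Conversion factors to an assumed base unit (cubic metres).
# "bbl" is the 42 US gallon barrel.
TO_M3 = {"m3": 1.0, "L": 0.001, "bbl": 0.158987294928}


def to_base(value, unit):
    """Convert a volume reading to cubic metres."""
    return value * TO_M3[unit]


def aggregate_daily(readings):
    """Sum readings per (day, material) after converting to the base unit.

    Each reading is a tuple: (day, material, value, unit).
    """
    totals = defaultdict(float)
    for day, material, value, unit in readings:
        totals[(day, material)] += to_base(value, unit)
    return dict(totals)


# Hypothetical readings reported in different units by different groups.
readings = [
    ("day-1", "crude", 100.0, "bbl"),
    ("day-1", "crude", 2000.0, "L"),
    ("day-1", "diesel", 5.0, "m3"),
]
daily = aggregate_daily(readings)
```

The same pattern extends to other scopes (shift, unit, site) by changing the grouping key.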
Data quality toolkit
With consistent information comes the ability to apply analysis methodologies, such as balancing, measurement conciliation, batch tracking and data reconciliation (Table 1).
Table 1. Each method has its merits.
Balancing is the application of simple volume, mass or energy balances to envelopes around sections of the facility (Figure 3). The objective is to identify areas where data mismatches occur. Typically, the analysis can be exception based or can balance by commodity or owner. Balancing will show errors across a plant area, but won't show the exact location of the erroneous measurement. Further investigation and drill-down are usually necessary to pinpoint the problem.
Figure 3. A typical material and energy balance begins with a process flow diagram.
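An exception-based balance of this kind can be sketched in a few lines of Python. The envelope names, the flow values and the 2% tolerance are hypothetical; a real check would take its tolerance from the known measurement accuracies.

```python
def envelope_imbalance(inflows, outflows, inventory_change=0.0):
    """Return inputs minus outputs minus accumulation for one envelope."""
    return sum(inflows) - sum(outflows) - inventory_change


def flag_exceptions(envelopes, tolerance_fraction=0.02):
    """Return names of envelopes whose imbalance exceeds a fraction of total inflow."""
    flagged = []
    for name, (inflows, outflows, d_inv) in envelopes.items():
        imbalance = envelope_imbalance(inflows, outflows, d_inv)
        throughput = sum(inflows)
        if throughput > 0 and abs(imbalance) / throughput > tolerance_fraction:
            flagged.append(name)
    return flagged


# Hypothetical envelopes: (inflows, outflows, inventory change), same units.
envelopes = {
    "unit_A": ([100.0, 50.0], [148.0], 1.0),  # imbalance of 1.0 on 150.0 in
    "unit_B": ([148.0], [120.0, 20.0], 0.0),  # imbalance of 8.0 on 148.0 in
}
exceptions = flag_exceptions(envelopes)
```

Here only `unit_B` is flagged. As the text notes, the check says which envelope is out of balance, not which meter inside it is wrong.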
Conciliation is a process of mediating between two measurements of the same quantity based on predefined accuracies. The rules for this mediation can be defined by a standards organization or by business rules and agreements. Rules can be as simple as "if meter A is different from meter B, use meter B's value." Alternatively, more complex adjustments can be made using a statistical comparison of the meter accuracies. The adjustments made on each side of the conciliation process can be tracked and analyzed to provide statistical information on the relative meter performance.
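Both kinds of rule can be illustrated with a short Python sketch. The statistical version shown here is an inverse-variance weighted average, which is one common way to combine two readings of known accuracy and is offered as an assumption, not as the method any particular standard prescribes. The `tolerance` parameter and the sample readings are also hypothetical.

```python
def conciliate_simple(meter_a, meter_b, tolerance=0.0):
    """Business rule: if the meters differ by more than tolerance, use meter B."""
    if abs(meter_a - meter_b) > tolerance:
        return meter_b
    return meter_a


def conciliate_weighted(meter_a, sigma_a, meter_b, sigma_b):
    """Inverse-variance weighted estimate plus the adjustment made to each meter.

    sigma_a and sigma_b are the assumed standard deviations of the two meters.
    """
    w_a = 1.0 / sigma_a ** 2
    w_b = 1.0 / sigma_b ** 2
    value = (w_a * meter_a + w_b * meter_b) / (w_a + w_b)
    return value, value - meter_a, value - meter_b


# Meter A reads 100.0 (sigma 1.0); meter B reads 103.0 (sigma 2.0).
value, adj_a, adj_b = conciliate_weighted(100.0, 1.0, 103.0, 2.0)
```

The more accurate meter is adjusted less. Logging `adj_a` and `adj_b` over time gives the relative meter performance statistics mentioned above.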
In one example, a reactor thermocouple was measured against another thermocouple and a calculation of the adiabatic flame temperature. Any deviation was a sign of either mechanical failure of one of the thermocouples or poor installation. Mechanical failure meant an increase in resistance with a corresponding creeping rise in reported temperature.
Batch tracking involves following batches of material along the supply chain from inventory locations through transportation systems to distribution. At each location, compositions, physical qualities and quantities are calculated for comparison to previously measured qualities and quantities. This process can be applied to any supply chain configuration: continuous flow or discrete transactions, with or without predefined batches. It requires no data input beyond that normally captured to run the business: measurements, lab analyses, inventories and movements.
The benefits are significant. Batch tracking provides predictions of qualities and quantities at any time and for any location in the supply chain. This enables continuous validation of material balances and laboratory analyses. As with any method, there are caveats: the methods and accuracies of the measurements being compared must agree.
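A single step of such tracking, a receipt blended into a tank heel and then checked against a lab result, might be sketched as follows. The sketch assumes the quality blends linearly by quantity, which holds for properties such as sulfur content by mass but not for every property. The function names, quantities and tolerance are hypothetical.

```python
def blend(heel_qty, heel_quality, receipt_qty, receipt_quality):
    """Predict quantity and quantity-weighted quality after a receipt into a tank."""
    total = heel_qty + receipt_qty
    quality = (heel_qty * heel_quality + receipt_qty * receipt_quality) / total
    return total, quality


def check_against_lab(predicted, measured, tolerance):
    """Return True when the lab result agrees with the prediction within tolerance."""
    return abs(predicted - measured) <= tolerance


# 1000 units at 0.50% sulfur in the tank, 500 units at 0.20% sulfur received.
qty, sulfur = blend(1000.0, 0.50, 500.0, 0.20)

# A later lab sample reports 0.41%; the assumed lab tolerance is 0.02.
ok = check_against_lab(sulfur, 0.41, tolerance=0.02)
```

Repeating this calculation at every movement yields a predicted quality and quantity for each location, so a lab result or inventory reading that disagrees with the prediction stands out immediately.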
Data reconciliation is the application of statistical or heuristic methods to generate a complete and consistent mass balance across an entire process flow sheet. Every measurement of inventory and flow is adjusted, subject to applied constraints, to produce a completely balanced result. The adjustment that had to be made to each measurement to achieve a balance is compared to that measurement's accuracy tolerance. This relative measure shows which values were statistically in error and need to be examined. This is a powerful analysis tool because it pinpoints specific measurements in error, provides estimates for unknown flows and provides accuracy statistics on measurements and balances.
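To make the idea concrete, the Python sketch below reconciles three flows around a single node using weighted least squares, one of the statistical methods commonly used for this purpose. It handles only one linear balance constraint; a full flow sheet has many constraints and needs a matrix formulation. The flow values and standard deviations are hypothetical.

```python
def reconcile(measured, sigmas, coeffs):
    """Weighted least-squares reconciliation under one linear balance constraint.

    Minimizes sum(((x_i - y_i) / sigma_i) ** 2) subject to sum(a_i * x_i) = 0,
    where y_i are the measurements and a_i the balance coefficients.
    Returns the reconciled values and each adjustment divided by its sigma.
    """
    residual = sum(a * y for a, y in zip(coeffs, measured))
    denom = sum(a * a * s * s for a, s in zip(coeffs, sigmas))
    reconciled = [
        y - a * s * s * residual / denom
        for y, s, a in zip(measured, sigmas, coeffs)
    ]
    relative = [(x - y) / s for x, y, s in zip(reconciled, measured, sigmas)]
    return reconciled, relative


# Balance: feed - product_1 - product_2 = 0
measured = [100.0, 60.0, 35.0]  # raw readings; they miss closure by 5.0
sigmas = [1.0, 1.0, 2.0]        # assumed standard deviation of each meter
coeffs = [1.0, -1.0, -1.0]      # +1 for flows in, -1 for flows out
reconciled, relative = reconcile(measured, sigmas, coeffs)
```

The least accurate meter absorbs most of the imbalance. A relative adjustment that is large compared with 1 suggests the reading deserves a closer look; where to set that threshold is a judgment this sketch leaves open.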
Although data reconciliation is a powerful tool, it isn't often fully employed. This is because the data collection and cleanup required to make it effective aren't easily put in place. It isn't a tool that can be implemented in isolation; it needs a data quality strategy behind it.
Is it worth gambling?
Would you bet your business on your data? Whether you like it or not, you're doing that every day. Applying a coordinated data integration strategy with some data analysis tools will put you on a continuous improvement path to high inherent raw data quality. Validated, audited, consistent information with a single version of the truth is the key foundation of successful business decisions.
Andrew Nelson is product manager at Matrikon in Edmonton, Alberta, Canada. E-mail him at firstname.lastname@example.org.