This series of articles aims to improve your use of test data. Here, in this very space, before your very eyes, without smoke or mirrors, Dr. Gooddata will help you develop objective numerical tests for good data. We will discuss the philosophy of measurement uncertainty analysis and its methods. We will deal with example problems, as well as issues you pose.
The goal is to have fun as we stumble through the corridors of measurement uncertainty, beginning with first principles and progressing to thorny issues whose solutions have eluded brilliant men for ages.
Let’s begin with some obvious questions. What are good data and what are bad data?
It always is important to secure good data. Each time a test or experiment is run, the person reviewing the data makes a decision about whether it is good. How does the typical reviewer decide whether the test data are good? Some common definitions for good data are:
• data that were expected or that prove the reviewer’s point.
• nominal. How do you quantify “nominal”? Does the term have any meaning?
• spot-on or astonishingly close. The data are close to what, and how close?
How about bad data? Try these common definitions on for size. Bad data are:
• data that were not expected or that disprove the reviewer’s thesis. (I thought we ran experiments or tests because we did not know all the answers. Plus, there usually is a lot to be learned from these data.)
• data secured from bad instruments, poor calibrations or incompetent instrumentation engineers. This usually means the experimenter doesn’t like the test result and is seeking to transfer the blame.
Perhaps the instrument is key? Does good simply mean the instrument has been calibrated in the test environment and properly ranged, and that its accuracy has been evaluated with test results compared to a baseline? Those points surely make sense, but they don’t suffice. We need a definition that is strictly objective and more general.
An acceptable definition
What, then, are good data? To understand the correct answer we first must consider why tests or experiments are conducted. That’s right, we need to think about what we are going to do before we do it!
We run tests to obtain knowledge about some physical process. Remember, the scientific method (theorize, experiment or test, analyze and then conclude) needs test data from which valid conclusions can be drawn. So, we also need a definition for a valid conclusion.
A valid conclusion is one that is unambiguously supported by test data. Ha! We are right back to defining the quality of our test data, this time as data that unambiguously support a test result! Seems like circular reasoning, doesn’t it?
Think of it this way: For a conclusion to be unambiguous, the differences or levels observed in test data must exceed those that might be due to measurement error alone.
This leads to Dr. Gooddata’s definition for good data: Good data are test data whose errors do not influence valid test conclusions. Good data do not have measurement errors so large that those errors alone could yield the test result observed. Conversely, bad data have errors large enough that they could account for the test result seen. Seems simple, doesn’t it? However, we admit that we need to compare numbers. (Note that both good and bad data are defined in the context of their use, not by the predisposition or prejudices of the experimenter.)
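This decision rule can be sketched in a few lines of code. This is a minimal illustration, not a complete uncertainty analysis: the function name and the numbers in the example are invented, and the single `measurement_uncertainty` value stands in for the limits we still need to define below.

```python
# Sketch of the good-data test: data are "good" only when the observed
# difference exceeds what measurement error alone could produce.
# All values are hypothetical and share the same engineering units.

def data_are_good(observed_difference, measurement_uncertainty):
    """Return True if the observed effect cannot be explained by
    measurement error alone."""
    return abs(observed_difference) > abs(measurement_uncertainty)

print(data_are_good(2.0, 0.5))  # large effect, small error band: good data
print(data_are_good(0.3, 0.5))  # effect lies within the error band: bad data
```

The point of the comparison is exactly the sentence above: numbers against numbers, with no room for the experimenter’s predisposition.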
With these simple definitions for good and bad data, we still must define the limits to which the ever-present measurement error will influence these data.
What is measurement error? The most common definition is the difference between the measured value and the true value. “I know what I’ve got,” says the novice experimenter, “but what is this true value?”
The true value is the value of some physical standard or a value certified by a national standards laboratory. Physical standard true values include the frequency of the atomic clock and the temperature of the triple point of water. They are true by definition. Also by definition, the test result certified by a national standards laboratory is true. That is, everyone agrees to that reference, or true, value.
Always remember that we never get the true value in a test measurement process.
Measurement error always alters our test result; therefore, it does not yield the true value.
If measurement error is the difference between what we get in our test and the true value, how do we know whether our conclusions are valid if we don’t know the true value? This is a very important question. If the measurement error is sufficiently small so that it doesn’t influence the decision process that is based on the test results, we can obtain valid conclusions. Or, stated another way, we will be OK as long as the measurement error doesn’t actually affect the decision.
Note my deliberate emphasis on the decision. Making decisions is why we conduct tests; there is no other reason.
We can use uncertainty analysis to estimate how far away from our test result we expect the true value to be. This analysis is essential to reach valid conclusions and make correct decisions. You must get up to speed on uncertainty analysis if you care about these two things.
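As a picture of what that estimate looks like in practice, here is a minimal sketch of an uncertainty band around a measured result. The numbers are invented for illustration, and the single `uncertainty` value is assumed to be the total uncertainty in the same units as the result.

```python
# Hedged sketch: we never know the true value, but uncertainty analysis
# bounds how far from the measured result we expect it to lie.
# Both numbers below are hypothetical.

measured_result = 101.3  # hypothetical test result
uncertainty = 0.8        # hypothetical total uncertainty, same units

lower = measured_result - uncertainty
upper = measured_result + uncertainty
print(f"True value expected within [{lower:.1f}, {upper:.1f}]")
```

If the decision the test supports would come out the same anywhere inside that band, the measurement error does not affect the decision, and the data are good by the definition above.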
What kind of measurement error can influence our decisions? There really are three types. However, measurement uncertainty analysis only recognizes two. Can you guess what that third type is?
The first is random error, which sometimes is called precision error. Random error makes test results less precise.
The second is systematic error, also known as bias error, stemming from the measurement process.
Both of these error types cause uncertainty in a test result and therefore are involved in a measurement uncertainty analysis.
The third error type is blunders. That’s right, blunders. These errors should never be part of your experiment or test. Measurement uncertainty analysis starts with the assumption that there has been adequate test engineering to eliminate blunders. If not, all bets are off. So, let’s assume from here on that blunders have been eliminated by competent engineering practices.
A question of magnitude
It is not enough to know what kinds of errors enter into a measurement uncertainty analysis; we need to estimate the magnitude of their effects. What information is available that will help us assign values to the scale of random and systematic errors?
Let’s begin with the easy one. How can we estimate the magnitude of the effect of random errors? We first need a definition for random error. The best definition that Dr. Gooddata has encountered is random error causes observable scatter in test results. Put another way, we can see in our test results the effects of all the random errors that are operating in our test or experiment.
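Since random error shows up as observable scatter, one common way to put a number on it is the sample standard deviation of repeated readings. This is a sketch under assumed data; the readings below are invented for illustration, not results from any real test.

```python
# Sketch: estimating random-error scatter from repeated readings of the
# same quantity, using the sample standard deviation (n - 1 divisor).
import statistics

readings = [100.2, 99.8, 100.5, 99.9, 100.1, 100.3]  # hypothetical repeats

s = statistics.stdev(readings)  # sample standard deviation
print(f"Estimated random-error scatter: {s:.3f}")
```

A number like this, rather than an adjective like “nominal,” is what the comparison in the good-data test needs.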
How is this random error estimated? I’m sure you have some fine ideas and I’d like to know them. E-mail them to me. Think in two frames of reference: pretest and posttest. We want to be able to estimate the magnitude of the random errors before running a test so if they are too large we can do something to improve test results. And, we also want to get a handle on them when we have actual data. Should the estimating methods differ?
I’m looking forward to your thoughts rolling in. After we discuss estimating the magnitude of random errors, we’ll turn to assessing the size of systematic errors.
Remember, use numbers, not adjectives.
Ronald H. Dieck is principal of Ron Dieck Associates Inc., Palm Beach Gardens, Fla. E-mail him at firstname.lastname@example.org.
This series of articles will focus on the importance of securing good data and how to characterize data. It will discuss the standard methods for numerically estimating the quality of test data. Readers are encouraged to e-mail Dr. Gooddata to ask for guidance about measurement uncertainty issues at email@example.com.