In the first article in this series (Feb., p. 26), we briefly discussed the three major types of errors that affect the quality of test data: precision (random) errors, bias (systematic) errors and blunders. We assumed that blunders could be eliminated from experiments by good engineering practice. We then defined random error sources as those that cause scatter in the test result and categorized all other error sources as systematic. Our next job is to estimate the magnitude of the effects of those random errors, which requires a numerical way to describe them.
The most common and useful way to describe the magnitude and effect of random errors is through the use of standard deviation, $S_X$. The standard deviation for a sample of data is calculated as:
$$S_X = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}} \tag{1}$$
where $S_X$ is the standard deviation of the sample of data, $X_i$ is the ith value of X in the sample, $\bar{X}$ is the average X and N is the number of data or X values in the sample.
The standard deviation can describe the scatter in one error source or, if calculated from the test data, the scatter in the test result, that is, the combined effect of all random error sources for an experiment. This standard deviation is easy to calculate after a test. (In a future article, we will discuss how to calculate the random error contribution to measurement uncertainty from data available before a test.)
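For readers who like to check the arithmetic, here is a minimal Python sketch of the sample standard deviation, assuming the usual N - 1 divisor. The function name `sample_std` and the readings are invented for illustration; they are not data from this series.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation with N - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations of each value from the sample average.
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical readings, invented for illustration only.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s_x = sample_std(readings)

# The standard library's statistics.stdev uses the same N - 1 formula.
assert math.isclose(s_x, statistics.stdev(readings))
print(round(s_x, 3))
```

In practice, calling `statistics.stdev` directly is probably the simpler choice; the hand-written version is shown only to make each term of the formula visible.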
We usually also want to describe the scatter in the average result. However, we have only one average result. How can we describe the scatter in a single average? Try calculating the standard deviation with N = 1 and see what you get.
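If you would rather let a computer try the N = 1 case, the short Python sketch below does so with the standard library. The single reading is hypothetical, and the `outcome` variable is just an illustrative name.

```python
import statistics

single_reading = [10.0]  # hypothetical lone data point

try:
    statistics.stdev(single_reading)
    outcome = "a value was returned"
except statistics.StatisticsError:
    # With N = 1 the divisor N - 1 is zero, so no estimate exists.
    outcome = "undefined for N = 1"

print(outcome)
```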
The standard deviation of the average, $S_{\bar{X}}$ (even though we have only one average), can be estimated from the standard deviation of the data using an expression that is an outgrowth of the central limit theorem in statistics. (If you want to sound really knowledgeable, say that to your engineering friends.) The equation is:
$$S_{\bar{X}} = \frac{S_X}{\sqrt{N}} \tag{2}$$
where $S_X$ is the standard deviation of the data sample (just as before) and N is the number of values in the calculated average. Note that $S_{\bar{X}}$ is often called the standard error of the mean or the random standard uncertainty.
The N in Equation 2 is not always the same as the N used in Equation 1. We could estimate the standard deviation from “tons” of historical data, with many data points associated with the result. However, our experiment might be expensive to run, so we might have only, say, five trials. In this case, the N in Equation 1 would be “tons” and the N in Equation 2 would be five. We will explain this in more detail later.
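The distinction between the two N values can be sketched in a few lines of Python. Everything numeric here is assumed for illustration: a standard deviation of 1.5 taken to come from a large historical data set, and five trials in the average. The function name `std_of_average` is likewise hypothetical.

```python
import math


def std_of_average(s_x, n_in_average):
    """Standard deviation of the average: S_X divided by sqrt(N)."""
    if n_in_average < 1:
        raise ValueError("the average must contain at least one value")
    return s_x / math.sqrt(n_in_average)


# Hypothetical numbers: S_X estimated from a large historical data set,
# while the test at hand averages only five trials.
s_x_historical = 1.5
n_trials = 5

s_xbar = std_of_average(s_x_historical, n_trials)
print(round(s_xbar, 3))
```

Note that the size of the historical data set never enters this function; only the number of values in the average does.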
Now that we know how to calculate $S_X$ and $S_{\bar{X}}$, how are they used to numerically describe the scatter in test data and the scatter expected in replicate averages? We need a multiplier (not actually a fudge factor, as you might be thinking) that, when used with either $S_X$ or $S_{\bar{X}}$, will yield a band or interval into which a given percentage of the data or replicate averages will fall. How do we do that? We need a confidence interval.
The confidence interval
The 95% confidence interval is by far the most common one in uncertainty analysis. We will use that interval throughout our discussions unless it is clearly pointed out that an alternative interval is preferred.
The confidence interval needed (no matter the percentage) is computed by the use of the factor t. Your local statisticians know it as Student’s t; it was developed by William S. Gosset, an Englishman who worked for the Guinness brewery in Dublin. The brewers he worked for and for whom he developed the statistic would not allow him to publish under his real name because they didn’t want to look like scientists: They wanted to keep their reputation as master brewers. So, Gosset published under the pseudonym “Student.”
Student’s t, when multiplied by $S_X$, will yield an interval about the average into which approximately 95% of the data should fall. It is not exact, but it is useful.
However, we are really interested in the expected scatter in the average. For this, Student’s t is exact. Student’s t times $S_{\bar{X}}$ will yield an interval about the test data average into which we can expect the true average to fall 95% of the time. We can estimate the standard error of the mean, or standard deviation of the average, from only one average and the standard deviation of the data. That makes it easy. Note here that we are also assuming, or have demonstrated, that the data are normally distributed; that is, they conform to a Gaussian (normal) distribution.
We see, then, that the proper expression for the interval about the average into which the true average will fall 95% of the time is $\pm t_{95} S_{\bar{X}}$. This is the proper expression for uncertainty when only random sources contribute to the error. It covers the expected scatter in an average due to random error sources. It is the random component of the uncertainty.
Where do we get this Student’s t? We select the proper Student’s t from tables in any standard statistics text. The two inputs needed are the confidence level of choice and the degrees of freedom. Degrees of freedom? We will discuss degrees of freedom in more detail in a future article. For now, suffice it to say that for most calculations of standard deviation, the degrees of freedom amount to one less than the number of data points in the sample. In other words, 10 data points translate to nine degrees of freedom.
Student’s t, for 95% confidence, runs from 12.7 with one degree of freedom to about 2.0 for 30 or more degrees of freedom. (At infinite degrees of freedom, Student’s t is about 1.96; at 29 degrees of freedom, it is about 2.04. Therefore, we approximate it as 2.0 for degrees of freedom over 29.)
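A table lookup of this kind is easy to sketch in Python. The values below are the two-sided 95% Student’s t values as commonly printed in statistics texts, rounded to three decimals. The names `T95_TABLE` and `t95` are hypothetical, and the cutoff of 2.0 for 30 or more degrees of freedom follows the approximation described above rather than the exact tabulated values.

```python
# Two-sided 95% Student's t values keyed by degrees of freedom (1 to 29),
# as commonly printed in statistics texts (rounded to three decimals).
T95_TABLE = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131,
    16: 2.120, 17: 2.110, 18: 2.101, 19: 2.093, 20: 2.086,
    21: 2.080, 22: 2.074, 23: 2.069, 24: 2.064, 25: 2.060,
    26: 2.056, 27: 2.052, 28: 2.048, 29: 2.045,
}


def t95(degrees_of_freedom):
    """Look up Student's t for 95% confidence (integer degrees of freedom)."""
    if degrees_of_freedom < 1:
        raise ValueError("need at least one degree of freedom")
    if degrees_of_freedom >= 30:
        # Approximation used in the text for 30 or more degrees of freedom.
        return 2.0
    return T95_TABLE[degrees_of_freedom]


# Ten data points give nine degrees of freedom.
print(t95(10 - 1))
```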
Now we need to discuss the interval, obtained from our test data, within which the true average, µ, will lie with 95% confidence if there is no systematic error. Can you guess what that is? Using the standard deviation of the average and Student’s t, the interval is:
$$(\bar{X} - t_{95} S_{\bar{X}}) \le \mu \le (\bar{X} + t_{95} S_{\bar{X}}) \tag{3}$$
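As a rough illustration, the interval can be computed from raw readings in a few lines of Python. This sketch assumes the same N is used for the standard deviation and for the average, that the caller supplies the Student’s t value from a table, and that the data are normally distributed. The function name `mean_confidence_interval` and the readings are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(values, t95):
    """Return (low, high) bounds of average +/- t95 * S_X / sqrt(N)."""
    n = len(values)
    average = statistics.mean(values)
    s_x = statistics.stdev(values)      # sample standard deviation (N - 1)
    s_xbar = s_x / math.sqrt(n)         # standard deviation of the average
    half_width = t95 * s_xbar
    return average - half_width, average + half_width


# Hypothetical readings, invented for illustration only.
readings = [99.2, 100.4, 100.1, 99.7, 100.9, 99.5, 100.3, 99.9]

# Eight points give seven degrees of freedom, for which t95 is 2.365.
low, high = mean_confidence_interval(readings, t95=2.365)
print(low, high)
```

Passing t in as an argument, rather than computing it, keeps the sketch free of third-party libraries and mirrors the table lookup a reader would do by hand.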
Test data always show the effects of a random error source. Not only do the data points vary around the average, but, if the experiment is repeated, the average will change. How much can we expect the average to vary? The answer is $\pm t_{95} S_{\bar{X}}$, which looks very much like the interval above, and it is! Experiments often are too expensive to repeat; Equation 3 allows us to use the scatter of our data to estimate the expected scatter in replicate averages. Let’s see how this operates. Consider the two sets of temperature readings shown in the table. There are eight data points in each set, so we are dealing with seven degrees of freedom and a resulting $t_{95}$ of 2.365.
For data Set A, the interval that contains the true average 95% of the time is 0.0 ± (2.365)(0.87) = 0.0 ± 2.06. What is the 95% confidence interval that contains the true average for data Set B? We will provide the answer next time. Until then, it is also worth pondering: Which of the above two data sets yields a more accurate average? Be careful, this is a trick question and you might be surprised at the answer.
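The Set A arithmetic can be checked with a few lines of Python. The sketch takes the three numbers quoted above (an average of 0.0, a Student’s t of 2.365 and the factor 0.87, read here as the standard deviation of the average, which is an assumption) and leaves Set B for you. The variable names are illustrative only.

```python
# Numbers quoted for data Set A.
average_a = 0.0
t95 = 2.365       # seven degrees of freedom
s_xbar_a = 0.87   # assumed to be the standard deviation of the average

half_width = t95 * s_xbar_a
interval_a = (average_a - half_width, average_a + half_width)

print(round(half_width, 2))
```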
Until then, remember: Use numbers, not adjectives.
Ronald H. Dieck is principal of Ron Dieck Associates Inc., Palm Beach Gardens, Fla. E-mail him at email@example.com.