Reliability of Data

In a previous entry I showed that the basic concepts of quality control, which depends upon the laws of probability (statistics), are surprisingly simple. All that we are trying to do is measure lengths of lines. The equations used to calculate the mean and the standard deviation each describe a single line, so no matter how many samples are tested, the calculations reduce to just those two lines, which are independent of each other. While “n” data points occupy “n” dimensions, the mean and standard deviation occupy only two. We can use the standard deviation as the ruler to measure the lengths of interest.

What makes things difficult is the fuzziness of those lines. In quality control the first thing we want to determine is the distance from the measured length (sample mean) to some desired length. To do that we use a ruler in which the standard deviation is set to be one. For convenience, and because the variance (the square of the standard deviation) is defined as the second moment around the mean, the targeted mean is subtracted from the data points so that the resulting length of the data vector is reduced to the difference between the sample mean and the target. That length is then divided by the standard deviation. The resulting length is then measured not in inches or millimeters but rather in units of the standard deviation ruler. As an example, assume that 100 was the target value, the measured mean was 85 and the standard deviation was 10. We are not interested in what the actual measured mean is, but rather how close it is to the target, based upon the standard deviation ruler:

1. (100-85)/10 = a distance of 1.5 SD units. In some cases the measurement is not from the desired target, but to upper and lower limits.
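Example 1 can be sketched in a few lines of Python. The function name `distance_in_sd_units` and its argument names are assumptions made up for this illustration; only the numbers come from the example.

```python
def distance_in_sd_units(target, mean, sd):
    """Return the distance from the measured mean to the target, in SD units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (target - mean) / sd


# Values from example 1: target 100, measured mean 85, standard deviation 10.
example_distance = distance_in_sd_units(target=100, mean=85, sd=10)
print(example_distance)  # 1.5
```

The same function could measure the distance to an upper or lower limit by passing that limit in place of the target.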

However, the mean value is fuzzy and the standard deviation may or may not be fuzzy. The data generated in calculating the mean make up a random variable, $X = (x_1, x_2, \ldots, x_n)$, in vector space. How fuzzy the mean is depends upon the length of the SD and the type of distribution. While there are many distributions, if the SD is not fuzzy, what is called the normal distribution is often used. Because of the uncertainty in the mean, the distribution function tells us the chances of the mean actually being somewhere else. In example 1, with only the mean being fuzzy and using the normal distribution, we can say that there is about a 6.68% chance that the true mean of the data is at or beyond the desired mean, 1.5 SD units away.
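The 6.68% figure quoted for example 1 can be checked with the Python standard library alone. This is a minimal sketch that assumes a known (non-fuzzy) SD and a one-sided normal tail; the helper name `normal_upper_tail` is an assumption introduced for illustration.

```python
import math


def normal_upper_tail(z):
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Example 1: the target lies 1.5 SD units from the measured mean.
z = (100 - 85) / 10
print(f"{normal_upper_tail(z):.2%}")  # 6.68%
```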

Unfortunately, the SD often is fuzzy too and is thus also a random variable. The square of the SD is called the variance, and it has its own distribution function, called the chi-squared distribution. While the normal distribution is independent of the number of data points defining the random variable, the form of the chi-squared distribution depends upon the degrees of freedom. The chi-squared distribution with one degree of freedom is the distribution of the square of a standard normal variable. The chi-squared distribution may be used to determine whether a measured standard deviation is really the same as a known one (comparing two measured standard deviations uses the related “F” distribution).
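The link between the normal distribution and the chi-squared distribution with one degree of freedom can be illustrated numerically. The sketch below assumes a standard normal variable Z and uses the identity P(Z² ≤ c) = P(−√c ≤ Z ≤ √c); the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chi_squared_1df_cdf(c):
    """P(Z**2 <= c) for a standard normal Z, i.e. chi-squared with 1 df."""
    if c <= 0:
        return 0.0
    root = math.sqrt(c)
    return normal_cdf(root) - normal_cdf(-root)


# The familiar two-sided 95% normal limit of 1.96 squares to about 3.84,
# which is the 95% point of chi-squared with one degree of freedom.
print(round(chi_squared_1df_cdf(1.96 ** 2), 3))  # 0.95
```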

How the fuzziness or uncertainty is handled will be covered later. Although the mathematics gets more complex, especially when multivariate sets of data must be considered, the goal is still to simply measure lengths with a specific ruler.



Means and Standard Deviations as Lengths

When we talk about quality control we hear about distributions, such as the Poisson, hypergeometric, binomial, normal, “t”, chi-squared and “F”. How complicated! And we are told to worry about things being independent, and are inundated with words like variance, mean, median, mode, standard deviation, whether the data are homoscedastic or heteroscedastic (whether the standard deviation is constant or not), confidence limits, and such things as Type I error, Type II error, null hypothesis, etc. It cannot be denied that all of these have their place. However, to get to the basics, all we are really trying to do is measure lengths. Statistics is really simply analytic geometry or linear algebra, depending on one’s outlook. Let’s look at the mean and standard deviation.

Mean (one type of average). We are told that it is the first moment around the origin.

Mathematically it is the integral $\int x f(x)\,dx$ between some limits, where $f(x)$ is the probability density of some distribution. Yet it is still a length.

Consider a set of “n” data points, $X = (x_1, x_2, \ldots, x_n)$. Then visualize a graph of n dimensions with a single location, X, representing those data. Also visualize a line in that n-dimensional space that is equidistant from each axis, i.e. it goes through the origin and $(1, 1, \ldots, 1)$. Drop a perpendicular from X to that equidistant line, and call the point where it lands $M = (\mu, \mu, \ldots, \mu)$. Divide every coordinate by the square root of n, the number of data points, to introduce the number of tests into our considerations.

The line ($\delta$) from X to M would be the vector $(x_1 - \mu, x_2 - \mu, \ldots, x_n - \mu)$, while the line ($\mu$) from the origin to M would be the vector $(\mu, \mu, \ldots, \mu)$. Since the two lines are perpendicular, their scalar (or inner or dot) product would be zero:

$$(\mu, \mu, \ldots, \mu) \cdot \frac{(x_1 - \mu, x_2 - \mu, \ldots, x_n - \mu)}{\sqrt{n}} = 0$$

$$x_1 + x_2 + \cdots + x_n - n\mu = 0$$

$\mu = (x_1 + x_2 + \cdots + x_n)/n$, which is identical to the form for the mean.

That is, the length of the line µ from the origin to M is equal in value to the mean of the data points.
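The construction above can be tried out numerically. The sketch below uses made-up test results and hypothetical function names; it projects the data point X onto the equidistant line, checks that the two lines are perpendicular, and measures the length from the origin to M after dividing by the square root of n.

```python
import math


def project_onto_equidistant_line(data):
    """Return M, the foot of the perpendicular from the data point X
    onto the line through the origin and (1, 1, ..., 1)."""
    ones = [1.0] * len(data)
    # Standard orthogonal projection: (X . 1) / (1 . 1) times the ones vector.
    scale = sum(x * o for x, o in zip(data, ones)) / sum(o * o for o in ones)
    return [scale * o for o in ones]


def scaled_length(vector):
    """Length of a vector after every coordinate is divided by sqrt(n)."""
    n = len(vector)
    return math.sqrt(sum(v * v for v in vector) / n)


data = [82.0, 91.0, 78.0, 89.0]  # made-up test results
m = project_onto_equidistant_line(data)
delta = [x - mu for x, mu in zip(data, m)]

# Perpendicularity: the dot product of M and (X - M) is zero.
dot = sum(a * b for a, b in zip(m, delta))
print(abs(dot) < 1e-9)   # True
print(scaled_length(m))  # 85.0
```

The projection never calls a mean function, yet the scaled length of M comes out equal to the arithmetic mean of the four values.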

Standard Deviation. The length of the line, $\delta$, from X to M is the square root of $\frac{1}{n}(x_1^2 + x_2^2 + \cdots + x_n^2 - n\mu^2)$. Here $\frac{1}{n}(x_1^2 + x_2^2 + \cdots + x_n^2)$ is the square of the length of the line from the origin to the data, X, while $\frac{1}{n}(n\mu^2)$ is the square of the length from the origin to the point M. Because the lines $\mu$ and $\delta$ are perpendicular, the Pythagorean theorem gives the square of $\delta$ as the difference of the two.

$$\delta = \left(\frac{1}{n}\left(x_1^2 + x_2^2 + \cdots + x_n^2 - n\mu^2\right)\right)^{0.5}$$

Thus the equation for the length of the line $\delta$ is identical to one of the equations used for calculating standard deviations, where the standard deviation is not a random variable. If the sample standard deviation (s) is a random variable, 1/n would be replaced with 1/(n-1).
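As a rough check of this length, the sketch below computes δ two ways on made-up data and compares the results with Python's `statistics.pstdev` (the 1/n form) and `statistics.stdev` (the 1/(n-1) form). The data values and variable names are assumptions for illustration only.

```python
import math
import statistics

data = [82.0, 91.0, 78.0, 89.0]  # made-up test results
n = len(data)
mu = sum(data) / n

# Squared lengths after dividing every coordinate by sqrt(n).
origin_to_x_sq = sum(x * x for x in data) / n
origin_to_m_sq = (n * mu * mu) / n

# Pythagoras: the lines mu and delta are perpendicular.
delta = math.sqrt(origin_to_x_sq - origin_to_m_sq)

# Direct length of the vector from M to X, for comparison.
delta_direct = math.sqrt(sum((x - mu) ** 2 for x in data) / n)

print(math.isclose(delta, delta_direct))             # True
print(math.isclose(delta, statistics.pstdev(data)))  # True

# With 1/(n-1) in place of 1/n we get the sample standard deviation s.
s = math.sqrt(sum((x - mu) ** 2 for x in data) / (n - 1))
print(math.isclose(s, statistics.stdev(data)))       # True
```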

Rulers. To measure lengths we need a ruler. We use miles in the United States, kilometers are used in Canada, and in Russia the verst may be used. In statistics the ruler used is the length “δ” if the standard deviation is known, or “s” if the standard deviation is a random variable.

The many terms mentioned above and the sophistication of the mathematics are important in establishing the reliability of the data. Still, basically we are only measuring lengths.


Chip Seals

A seal coat has a number of functions; however, one of the most important is to waterproof the pavement, protecting it from water damage and oxidation. If pavements were sealed early in their life, e.g. within a year, they would last a lot longer. Chip seals are used especially on highways.

Chip Seal Emulsion. The emulsified asphalts used for chip seals are specially designed to break very fast on contact with aggregate. Emulsions can be either anionic (basic) or cationic (acidic), although the cationic are very popular. With asphalts from some crude oils the amount of emulsifier required for anionic chip seal emulsions is very small, approaching zero, as a result of naphthenic acids in the asphalt, which serve as emulsifiers when neutralized with caustic soda.

Special Seal Emulsion. There is a product called PASS that has the ability to re-seal cracks and regenerate pavements.

Where to Use. A chip seal does an excellent job as a seal. While it can be used in cities, in my opinion a slurry seal would be better, unless it is a cape seal, in which a slurry is placed over the chips. The disadvantage of use in cities is that the chips can spread over lawns, into driveways, etc.

Mix Design. It is very important that a mix design is done; otherwise there can be failures.

Problems. One of the causes of failure is dirty aggregate. Chip seal emulsions are designed to break immediately on a surface, so when the emulsion hits dust it breaks on the dust and not on the surface of the aggregate. An emulsion type called High Float is more tolerant of dust. Not enough emulsion can cause loss of chips, while too much emulsion can cause bleeding. Also, when chip seals are used in cities, loss of chips can occur along the centerline, where there can be less asphalt as a result of less overlap of the spray. For rural roads this isn’t a big problem as there is not that much turning stress on the aggregate; in the city, however, there can be turning traffic out of driveways. Also, there is another important problem: it is difficult to skate on chips.

It isn’t a good career move for a director of public works to place a chip seal on streets in expensive neighborhoods, especially if chips end up on the lawns, sidewalks and driveways.

Robert L. Dunning, chemistdunning@gmail.com, www.petroleumsciences.com


Using Local Materials

Roads are absolutely necessary for economies to succeed. Yet in these perilous economic times, funds are not available to build them to luxury standards. However, there is technology available that allows the construction of very usable roads using materials already in place.

Asphalt Emulsion Stabilized Bases. For soils with a plasticity index of 6 or less, asphalt emulsions could be considered for base stabilization. This technology has been around for decades. I published a paper in the 1965 Proc. Asphalt Paving Technology on asphalt emulsion stabilized bases which included a mix design. We found that one inch of stabilized base could replace about 1 ¼ inches of crushed aggregate base. For roads with low truck traffic a chip seal might be used as a wearing surface. A word of caution: the same care must be taken for compaction as with soils, and in calculating the maximum density the liquid would be the sum of the emulsion and added water. There are some sophisticated emulsion formulations in which the emulsion “sets” and kicks out water; however, they are not available everywhere.
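The thickness equivalence and the liquid-content caution above amount to simple arithmetic, sketched below. The 1.25 ratio is the approximate figure quoted above; the function names, the example thickness and the example percentages are made up for illustration and are not design values. A real project would still need a mix design.

```python
# Approximate inches of crushed aggregate base replaced by one inch of
# emulsion stabilized base (the rough figure quoted in the text).
AGGREGATE_PER_STABILIZED = 1.25


def stabilized_thickness(crushed_base_inches):
    """Approximate stabilized-base thickness replacing a crushed aggregate base."""
    if crushed_base_inches < 0:
        raise ValueError("thickness cannot be negative")
    return crushed_base_inches / AGGREGATE_PER_STABILIZED


def total_liquid_percent(emulsion_percent, added_water_percent):
    """Liquid content used in the maximum density calculation:
    the sum of the emulsion and the added water."""
    return emulsion_percent + added_water_percent


# Hypothetical inputs: a 10 inch crushed base, 5% emulsion, 3% added water.
print(stabilized_thickness(10.0))      # 8.0
print(total_liquid_percent(5.0, 3.0))  # 8.0
```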

Another caution. Just because an emulsion is labeled “slow set” does not mean it will necessarily mix with all in-place soils. We were once working with a particular slow set emulsion that was working quite well. On this project we first treated the soil with lime and then stabilized it with the slow set emulsion. To save money, the contractor switched to another slow set emulsion, which didn’t work. In emulsion stabilization the mixed soil should be brown. In this case it came out the same color as it was before mixing, indicating that the emulsion was coagulating and balling up rather than coating.

Emulsion Based Macadams. When I was in Panama many years ago I witnessed the construction of a macadam using CRS-2 asphalt emulsion. The emulsion was manufactured by a company for whom I was doing consulting on asphalt emulsion manufacturing. A typical macadam construction technique was used. First a layer of large stone was placed, followed by a layer of asphalt emulsion. Following that were consecutive layers of aggregate and emulsion, with each aggregate size ½ the size of the preceding one. The last layer was sand. Since CRS-2 emulsions break as soon as they contact the aggregate, it appeared to work in the tropics.

Lime Stabilization. For soils too plastic for using asphalt emulsions, lime stabilization might be the selection.

Cold In-Place Recycling.  Cold in-place recycling is being used in the United States especially in place of hauling new aggregate base. For low truck traffic the wearing surface could be a single or double chip seal. For heavier traffic, however, hot mix should be used.

This short piece was to suggest that there may be lower cost options for constructing roads in rural areas. For any question contact me at chemistdunning@gmail.com in English or Spanish. (I also had a couple of years of Russian; that was a lifetime ago, but I can still read the Cyrillic alphabet. Although my knowledge of Russian has retreated to the far reaches of my brain, we do have a large Russian population here, so we can accept Russian inquiries.)

Robert L. Dunning. www.petroleumsciences.com