MATH 2441

Probability and Statistics for Biological Sciences


Point Estimators


As mentioned several times already in these notes, there are two broad types of statistical inferences that are commonly made about population parameters:

we can estimate the value of the population parameter (estimation)
we can decide whether we have adequate evidence to support a statement about the value of that population parameter (hypothesis testing)

Both of these approaches make use of the data we have from a random sample.

There are two different approaches to estimation:

a point estimator is a formula or expression producing a single value estimate of the population parameter. Some people refer to point estimates as a "best guess" value. The term "guess" is a bit pessimistic, but it does give you the sense that there is a degree of uncertainty in relying on point estimates.
an interval estimator (giving a confidence interval estimate) is a formula or expression that produces a numerical interval which has a specified probability of capturing the true value of the population parameter. Again, this element of uncertainty appears, but in the case of confidence interval estimates, the degree of uncertainty is displayed more explicitly.

In practice, confidence interval estimates are used more commonly by far than point estimates. Nevertheless, since point estimates are used in certain important ways in statistics, and carry with them some important concepts and terms, we need to look at them briefly in this course.

The term estimator refers to the formula or expression used to calculate the estimate; the estimate is the actual numerical value obtained for the population parameter in a particular problem. When we speak generically, it is conventional to represent the population parameter being estimated by the symbol θ, the Greek letter 'theta', and to represent the estimator by the symbol θ̂, the same Greek letter, but with a caret on top.

You've already seen a number of potential point estimators:



quantity                              population parameter (θ)    point estimator (θ̂)

mean                                  μ                           x̄
variance                              σ²                          s²
standard deviation                    σ                           s
proportion                            π                           p
correlation coefficient               ρ                           r
linear regression                     β0, β1                      b0, b1
difference between two means          μ2 - μ1                     x̄2 - x̄1
difference between two proportions    π2 - π1                     p2 - p1


(You haven't seen the last two entries in this table yet -- they would be used when we want to compare two populations, a very important type of problem in statistics. The other entries should be quite familiar to you by now.)
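As a rough illustration of how the single-sample estimators in this table are computed, here is a minimal Python sketch. The data values and the names `x`, `y`, `outcomes`, and `point_estimates` are invented for this example, and the regression estimates b0 and b1 are assumed to be the usual least-squares values.

```python
import math
import statistics

# Hypothetical sample data, invented purely for illustration.
x = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9]          # e.g. leaf length
y = [10.2, 12.9, 11.8, 14.1, 13.6, 12.0]    # e.g. leaf mass, paired with x
outcomes = [1, 0, 1, 1, 0, 1, 1, 1]         # 1 = "success", 0 = "failure"


def point_estimates(x, y, outcomes):
    """Compute several point estimates from sample data."""
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)

    # Sums of squares and cross-products about the sample means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx              # least-squares slope, estimates beta_1
    b0 = y_bar - b1 * x_bar     # least-squares intercept, estimates beta_0

    return {
        "x_bar": x_bar,                       # estimates mu
        "s2": statistics.variance(x),         # estimates sigma^2 (n - 1 denominator)
        "s": statistics.stdev(x),             # estimates sigma
        "p": sum(outcomes) / len(outcomes),   # estimates pi
        "r": sxy / math.sqrt(sxx * syy),      # estimates rho
        "b0": b0,
        "b1": b1,
    }


estimates = point_estimates(x, y, outcomes)
for name, value in estimates.items():
    print(f"{name:>5} = {value:.4f}")
```

Each value printed is a single number computed from one sample, which is exactly what makes it a point estimate rather than an interval estimate.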

Notice that all of the point estimators, θ̂, listed in this table are necessarily random variables, because they are statistics for random samples. Since such random variables have distributions with some width (that is, different random samples will generally give different values of θ̂), we realize that the determination of a value of θ̂ does not guarantee us that we have the exact value of the corresponding population parameter, θ. In fact, in general, we can write

θ̂ = θ + error                                                   (PE - 1)

where 'error' stands for the difference between the observed value of θ̂ and the actual "true" value of θ. This emphasizes one very serious danger in reporting point estimates of population parameters -- readers may forget that there is an unstated and perhaps large error, and so may mistakenly attribute greater accuracy to the estimate than is really warranted. The consequences of unwittingly using erroneous results can often be worse than not having any result to use at all! One reason interval estimates are favored much more than point estimates in statistical inference is that they make the presence of potential estimation error much more explicit.
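The following Python sketch may help make the 'error' term concrete. It is only a simulation under stated assumptions: the population is taken to be normal with a mean (`MU`) and standard deviation (`SIGMA`) that we pretend to know, which is never the case in a real problem, and the function name `sample_mean_errors` is made up for this example.

```python
import random
import statistics

MU, SIGMA = 50.0, 10.0      # assumed "true" population mean and standard deviation
rng = random.Random(2441)   # fixed seed so the run is reproducible


def sample_mean_errors(n, repeats):
    """Draw `repeats` random samples of size n; return (sample mean - MU) for each."""
    errors = []
    for _ in range(repeats):
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        errors.append(statistics.mean(sample) - MU)
    return errors


errors = sample_mean_errors(n=25, repeats=5)
for e in errors:
    # Each line is one point estimate together with its (normally unknown) error.
    print(f"estimate = {MU + e:6.2f}   error = {e:+.2f}")
```

Every sample gives a different estimate and hence a different error. In practice we see only one of these estimates and have no way of knowing its error.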

Nevertheless, there are some instances in which the convenience of point estimates outweighs their deficiencies. Two examples are:

    1. interval estimates of certain fundamental physical constants would be very difficult or inconvenient to work with in calculations. Thus, although quantities such as the gravitational acceleration, g; Avogadro's number, N; and so forth are numbers which are experimentally determined and thus subject to sampling errors of one sort or another, we normally use point estimates of them rather than interval estimates. Of course, many of these fundamental constants have been estimated with high precision so that errors in their estimates are not significant for many applications. Also, as we saw last term in MATH 1441, there are ways to "estimate" the effect of uncertainties in such numbers on calculations after the fact.
    2. as you'll see shortly, when we carry out various procedures of statistical inference focussing on one population parameter of greatest interest, the formulas that result may involve the values of other population parameters. In such situations, we can usually obtain adequately accurate results by using point estimates for the parameters of secondary interest in order to derive formulas for an interval estimate of the parameter of greatest interest.

      For example, in deriving formulas for interval estimates of the population mean, μ , we require the value of the population standard deviation, σ . Since μ is unknown, it is very unlikely that we'll know the value of σ (though in some instances we might). Rather than backing up one more step and determining an interval estimate for σ , it is more usual to use the available value of s as a point estimate of σ in the formula for the interval estimate of μ .

There may be many potential point estimators for a given population parameter. Obviously we'd like to use the one which has the smallest error term in (PE - 1). Unfortunately, we have no way of calculating what the error is for a particular estimator (if we could calculate the error exactly, we could compensate for it exactly and there would be no further need for statistical analysis!), and so there is no direct way to decide which is the best estimator to use in a specific situation. However, statisticians have developed some criteria which they find useful in deciding which estimator might be the more advantageous in specific circumstances. Very briefly, three of these are:

1. An unbiased estimator is generally considered superior to a biased one. An estimator, θ̂, is unbiased if its mean value (or expected value) equals the actual value of the parameter being estimated:

E[θ̂] = θ                                             (PE - 2)

Although particular values of θ̂ obtained from actual random samples are unlikely to be exactly equal to θ, the thought here is that for unbiased estimators, θ̂ will not systematically underestimate or systematically overestimate θ.

The sample mean, x̄, is an unbiased estimator of the population mean, μ, and the sample variance, s², is an unbiased estimator of the population variance, σ². The denominator 'n - 1' in the formula for s² is chosen to make this true. Although the sample standard deviation, s, is not an unbiased estimator of the population standard deviation, σ, the degree of bias is generally considered to be small enough that it doesn't override the other advantages of using s. Under fairly general conditions, the sample proportion is an unbiased estimator of the population proportion. When the population distribution is continuous and symmetric (for example, populations which are approximately normally distributed), the sample median and symmetrically trimmed sample means are also unbiased estimators of the population mean.
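The role of the 'n - 1' denominator can be checked roughly by simulation. The Python sketch below assumes a normal population with a known standard deviation (`SIGMA`, chosen arbitrarily here) and averages three quantities over many small samples: the variance with denominator n - 1, the variance with denominator n, and the sample standard deviation s.

```python
import random

SIGMA = 2.0                 # assumed true population standard deviation (sigma^2 = 4)
N, REPEATS = 5, 20000       # small samples make any bias easier to see
rng = random.Random(1999)   # fixed seed so the run is reproducible

sum_s2 = sum_v_n = sum_s = 0.0
for _ in range(REPEATS):
    sample = [rng.gauss(0.0, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    ss = sum((x - x_bar) ** 2 for x in sample)   # sum of squared deviations
    s2 = ss / (N - 1)                            # sample variance, n - 1 denominator
    sum_s2 += s2
    sum_v_n += ss / N                            # same sum, but divided by n
    sum_s += s2 ** 0.5                           # sample standard deviation

mean_s2 = sum_s2 / REPEATS
mean_v_n = sum_v_n / REPEATS
mean_s = sum_s / REPEATS

print(f"true variance               : {SIGMA ** 2:.3f}")
print(f"average of s^2 (n - 1)      : {mean_s2:.3f}")
print(f"average with denominator n  : {mean_v_n:.3f}")
print(f"true standard deviation     : {SIGMA:.3f}")
print(f"average of s                : {mean_s:.3f}")
```

In a run like this, the average of s² should land close to the true variance, the version with denominator n should fall noticeably below it, and the average of s should fall somewhat below σ.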

2. Estimators whose sampling distributions have smaller variances are considered superior. The idea here is that different random samples will give different values of the estimator, consistent with the sampling distribution. The variance of the sampling distribution is a measure of how spread out these values may be. A smaller variance means that any given sample is more likely to give a value nearer to the actual value of the population parameter.

For example, for a normally distributed population, the variance of the sampling distribution of the sample mean is smaller than the variance of the sampling distribution of the sample median, indicating that the sample mean is preferred over the sample median as an estimator of the population mean. Some references speak of the sample mean being a more efficient estimator than the sample median in this case. In fact, for a normally distributed population, it can be proven mathematically that the sampling distribution of the mean has the smallest possible variance among unbiased estimators, making the sample mean the most efficient possible unbiased estimator of the population mean.

Incidentally, because of its being a measure of the scale of error in a point estimator, the standard deviation of a sampling distribution is often referred to as the standard error of the estimate.
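A simulation along the following lines can illustrate the efficiency comparison between the sample mean and the sample median. This is only a sketch under assumptions: the population is taken to be standard normal, and the sample size and number of repetitions are arbitrary choices.

```python
import random
import statistics

MU, SIGMA = 0.0, 1.0      # assumed normal population
N, REPEATS = 25, 5000     # sample size and number of simulated samples
rng = random.Random(42)   # fixed seed so the run is reproducible

means, medians = [], []
for _ in range(REPEATS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    means.append(sum(sample) / N)
    medians.append(statistics.median(sample))

# Variances of the two simulated sampling distributions.
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

print(f"variance of sample means   : {var_mean:.4f}")
print(f"variance of sample medians : {var_median:.4f}")
# The square roots are the simulated standard errors of the two estimators.
print(f"standard error of mean     : {var_mean ** 0.5:.4f}")
print(f"standard error of median   : {var_median ** 0.5:.4f}")
```

For a normal population the sample means should be the less spread out of the two, which is what it means for the sample mean to be the more efficient estimator here.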

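3. An estimator is said to be consistent if its values concentrate more and more closely around the true value of the parameter as the sample size increases. For an unbiased estimator, this happens when the variance of its sampling distribution shrinks toward zero with increasing sample size. This is a good property because it means that if you make the effort to collect data from a larger random sample, you should end up with a more accurate estimate of the population parameter.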

For example, the sample mean is a consistent estimator of the population mean because

Var(x̄) = σ² / n

that is, the variance of the sampling distribution is inversely proportional to the sample size.
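The inverse dependence on sample size can be checked numerically. The Python sketch below assumes a normal population with an arbitrarily chosen standard deviation (`SIGMA`) and compares the simulated variance of the sample mean with σ²/n for a few sample sizes; the function name `variance_of_sample_mean` is made up for this example.

```python
import random
import statistics

SIGMA = 3.0               # assumed true population standard deviation
REPEATS = 3000            # simulated samples per sample size
rng = random.Random(7)    # fixed seed so the run is reproducible


def variance_of_sample_mean(n):
    """Simulate REPEATS samples of size n; return the variance of their means."""
    means = [
        sum(rng.gauss(0.0, SIGMA) for _ in range(n)) / n
        for _ in range(REPEATS)
    ]
    return statistics.pvariance(means)


results = {n: variance_of_sample_mean(n) for n in (10, 40, 160)}
for n, observed in results.items():
    print(f"n = {n:3d}   simulated = {observed:.4f}   sigma^2 / n = {SIGMA ** 2 / n:.4f}")
```

Each fourfold increase in n should cut the variance to roughly one quarter, so the standard error is roughly halved.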


How well a specific estimator satisfies each of these criteria depends very much on the details of the population distribution. Evaluating specific estimators in relation to these criteria often involves mathematical techniques that are beyond the scope of this course.




Copyright 1999 [David W. Sabo]. All rights reserved.
Revised: March 22, 2003