Central Limit Theorem
Can we assume that our sample mean is normally distributed? Alongside the whole process of sample selection and subject measurement, the Central Limit Theorem [14] is often used to answer this question. Formally, what it says (and does not say) is shown below:
Central Limit Theorem
Draw a simple random sample of size n from any population whatsoever with mean |
It's not surprising that when sampling from a normal population the sample means will be normally distributed. It's far more useful to know that, whatever the underlying distribution is (provided it has a finite standard deviation), the sample means will be approximately normally distributed, as long as n is large enough. How large an n is required? It depends on the underlying distribution: the more skewed or heavy-tailed the population, the larger the n needed. A common rule of thumb is n of at least 30.
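To see this for a decidedly non-normal population, here is a minimal simulation sketch. It is an illustration only, not part of the original notes: the exponential population (mean 1, standard deviation 1), the sample sizes, the 5000 repetitions, and the helper names sample_means and skewness are all assumptions chosen for the example. For each n it reports the average of the sample means, their standard deviation next to the theoretical value 1/sqrt(n), and their skewness, which should shrink toward 0 (the value for a normal distribution) as n grows.

```python
import math
import random
import statistics


def sample_means(n, reps, rng):
    """Return `reps` sample means, each computed from n exponential(rate=1) draws."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]


def skewness(xs):
    """Moment-based skewness; near 0 for a symmetric distribution like the normal."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])


rng = random.Random(0)  # fixed seed (an arbitrary choice) so runs are repeatable

# Hypothetical sample sizes; 30 is the rule-of-thumb value from the text.
results = {n: sample_means(n, 5000, rng) for n in (1, 5, 30, 100)}

print("n  mean_of_means  sd_of_means  theoretical_sd  skewness")
for n, means in results.items():
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3),
          round(1 / math.sqrt(n), 3),
          round(skewness(means), 3))
```

With n = 1 the "sample means" are just raw exponential draws, so that row shows the strongly skewed population itself. The later rows show the distribution of the mean tightening and becoming more symmetric. Swapping in a more heavily skewed population would be one way to see why 30 is only a rule of thumb.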
However, the magic of this theorem cannot save us from an ill-conceived sampling methodology. That is, if we draw a simple random sample, then we can trust that the CLT will hold. Say we didn't draw a simple random sample; are we in trouble? We're not in great danger if the data can plausibly be thought of as observations taken at random from a population. If the data are representative, we're probably OK. However, there is no way to rescue a study built on data collected haphazardly. Such data will have unknown biases, and no fancy formula can fix badly produced data. GIGO (garbage in, garbage out).
So, assuming the data at hand are representative, let's move on to confidence intervals. So far, our estimation methods have produced only point estimates. Confidence intervals are even more useful because they also convey how precise an estimate is.
14 Also see Daniel, page 130.