pass play? When variability is large, it is simply more difficult to regard a measure of central
tendency as a dependable guide to representative performance.
The same difficulty arises in detecting the effects of an experimental treatment. This task is very much
like distinguishing two or more radio signals in the presence of static. In this analogy, the effects of
the experimental variable (treatment) represent the radio signals, and the variability is the static
(noise). If the radio signal is strong relative to the static, it is easily detected; but if the radio signal is
weak relative to the static, it may be lost in a barrage of noise.
In short, two factors are commonly involved in assessing the effects of an experimental
variable: a measure of centrality, such as the mean, median, or proportion; and a measure of
variability, such as the standard deviation. Broadly speaking, the investigator exercises little control
over the measure of centrality. If the effect of the treatment is large, the differences in measures of
central tendency will generally be large. In contrast, control over variability is possible. Indeed, much
of this text focuses, directly or indirectly, on procedures for reducing variability—for example,
selecting a reliable dependent variable, providing uniform instructions and standardized experimental
procedures, and controlling obtrusive and extraneous experimental stimuli. We wish to limit the
extent of this unsystematic variability for much the same reasons that a radio operator wishes to limit
static or noise—to permit better detection of a treatment effect in the one case and a radio signal in
the other. The lower the unsystematic variability (random error), the more sensitive is our statistical
test to treatment effects.
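The signal-and-static idea can be made concrete with a small calculation. The Python sketch below is an illustration only. It assumes an independent-samples t statistic with pooled variance as the "statistical test," and all four score lists are hypothetical numbers invented for this example. Both comparisons have the same 2-point difference between group means (the signal); only the spread of scores within each group (the static) differs.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(control, treated):
    """Independent-samples t with pooled variance (assumes equal variances)."""
    n_c, n_t = len(control), len(treated)
    pooled_var = (
        (n_c - 1) * variance(control) + (n_t - 1) * variance(treated)
    ) / (n_c + n_t - 2)
    standard_error = sqrt(pooled_var * (1 / n_c + 1 / n_t))
    return (mean(treated) - mean(control)) / standard_error


# Hypothetical scores: low unsystematic variability within each group.
low_noise_control = [9, 10, 10, 10, 11]
low_noise_treated = [11, 12, 12, 12, 13]

# Hypothetical scores: same group means, much larger variability.
high_noise_control = [4, 7, 10, 13, 16]
high_noise_treated = [6, 9, 12, 15, 18]

t_low = t_statistic(low_noise_control, low_noise_treated)
t_high = t_statistic(high_noise_control, high_noise_treated)

print(f"Low variability:  t = {t_low:.2f}")   # about 4.47
print(f"High variability: t = {t_high:.2f}")  # about 0.67
```

With the treatment effect held constant, the test statistic falls from about 4.47 to about 0.67 as the within-group variability grows. The signal is unchanged, but it is much harder to detect against the added noise.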
Tables and Graphs
Raw scores, measures of central tendency, and measures of variability are often presented in tables or
graphs. Tables and graphs provide a user-friendly way of summarizing information and revealing
patterns in the data. Let’s take a hypothetical set of data and play with it.
One group of 30 children was observed on the playground after watching a TV program without
violence, and another group of 30 children was observed on the playground after watching a TV
program with violence. In both cases, observers counted the number of aggressive behaviors. The
data were as follows:
Program with no violence: 5, 2, 0, 4, 0, 1, 2, 1, 3, 6, 5, 1, 4, 2, 3, 2, 2, 2, 5, 3, 4, 2, 2, 3, 4, 3, 7, 3, 6, 3, 3
Program with violence: 5, 3, 1, 4, 2, 0, 5, 3, 4, 2, 6, 1, 4, 1, 5, 3, 7, 2, 4, 2, 3, 5, 4, 6, 3, 4, 4, 5, 6, 5
Take a look at the raw scores. Do you see any difference in the number of aggressive behaviors between
the groups? If you are like us, you find it difficult to tell.
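Tallying the scores makes the comparison easier than scanning the raw lists. The following Python sketch is one possible way to do this, not a prescribed procedure. The function name `frequency_table` and the asterisk display are our own choices for illustration. The scores are entered exactly as printed above, which gives 31 values for the first list rather than 30; the list is left unchanged here.

```python
from collections import Counter
from statistics import mean, stdev

# Scores copied from the text as printed (the first list has 31 values).
no_violence = [5, 2, 0, 4, 0, 1, 2, 1, 3, 6, 5, 1, 4, 2, 3, 2, 2, 2, 5, 3,
               4, 2, 2, 3, 4, 3, 7, 3, 6, 3, 3]
violence = [5, 3, 1, 4, 2, 0, 5, 3, 4, 2, 6, 1, 4, 1, 5, 3, 7, 2, 4, 2,
            3, 5, 4, 6, 3, 4, 4, 5, 6, 5]


def frequency_table(scores):
    """Return {score: count} for every score from 0 up to the largest observed."""
    counts = Counter(scores)
    return {value: counts.get(value, 0) for value in range(max(scores) + 1)}


for label, scores in (("No violence", no_violence), ("Violence", violence)):
    # Summary line: group size, mean, and sample standard deviation.
    print(f"{label}: n = {len(scores)}, mean = {mean(scores):.2f}, "
          f"SD = {stdev(scores):.2f}")
    # One row per score value, with one asterisk per child.
    for value, count in frequency_table(scores).items():
        print(f"  {value}: {'*' * count}")
```

For the scores as entered, the means come out to 3.00 for the program without violence and 3.63 for the program with violence. The rows of asterisks show where each group's scores cluster, which is hard to see in the raw lists.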