Data can be collected from a sample to answer questions about the population.
That is to say, information from samples allows us to make inferences
(draw conclusions) about the population.
Samples and variation
A feature of samples is that a second sample collected from the same
population will differ from the first, a third will differ again, and so on.
These differences between samples reflect the variation that exists within a population.
Statisticians are interested in this difference because it can be used
to infer something about a population. The variation in a property or
characteristic that exists between units of a population can be measured.
The measurements of these properties for each unit are called variables.
By using samples, these variables can be compared within a smaller group
rather than across the entire population. The results of this comparison
can then be generalised to the entire population. Although it is often useful
to remember that units or members of a population are more similar than
they are different, it is the differences that interest statisticians.
You might now be starting to realise that it is the variation in a population
that influences how that population is sampled.
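As a rough illustration of the idea that repeated samples from the same population differ, the following sketch draws several simple random samples and compares their means with the population mean. The population here is entirely hypothetical (10,000 simulated measurements), and the names `draw_sample_means`, `population` and `sample_means` are assumptions made for this example only.

```python
import random
import statistics


def draw_sample_means(population, sample_size, n_samples, seed=0):
    """Draw several simple random samples and return the mean of each."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        # Sample without replacement: each unit appears at most once per sample.
        sample = rng.sample(population, sample_size)
        means.append(statistics.mean(sample))
    return means


# Hypothetical population: 10,000 units with one measured variable each.
# The values are simulated and do not come from any real data set.
_population_rng = random.Random(42)
population = [_population_rng.gauss(170.0, 10.0) for _ in range(10_000)]

population_mean = statistics.mean(population)
sample_means = draw_sample_means(population, sample_size=50, n_samples=5)

print(f"population mean: {population_mean:.2f}")
for i, mean in enumerate(sample_means, start=1):
    print(f"sample {i} mean:   {mean:.2f}")
```

Each sample mean should land near the population mean without matching it or the other sample means exactly. That spread between samples is the variation described above.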
The relationship between population, unit and variable
Figure 1 - The relationship between population, unit and variable
You might notice that the smiley faces have more similarities than they
have differences. However, a question that might be asked is: “Why
are some smiley faces more likely to be attacked by predators than others?”
To answer this question you might begin by collecting data from records
of deaths. In these records, the colour of the smiley face is one of the
variables recorded.
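One way to picture units and variables is as records with named fields. The sketch below uses a handful of invented death records (the field names `colour` and `cause`, and every value, are hypothetical) and tallies the colour variable for the faces recorded as taken by predators.

```python
from collections import Counter

# Hypothetical records: each dict is one unit, each key is one variable.
death_records = [
    {"id": 1, "colour": "yellow", "cause": "predator"},
    {"id": 2, "colour": "yellow", "cause": "predator"},
    {"id": 3, "colour": "blue", "cause": "predator"},
    {"id": 4, "colour": "yellow", "cause": "predator"},
    {"id": 5, "colour": "blue", "cause": "other"},
    {"id": 6, "colour": "green", "cause": "predator"},
]


def tally(records, variable):
    """Count how often each value of a variable occurs across the records."""
    return Counter(record[variable] for record in records)


# Keep only the units whose recorded cause of death is a predator.
predator_deaths = [r for r in death_records if r["cause"] == "predator"]
colour_counts = tally(predator_deaths, "colour")

for colour, count in colour_counts.most_common():
    print(f"{colour}: {count}")
```

Counts like these are only a starting point. To say that one colour is more likely to be attacked, the counts would need to be compared with how common each colour is in the whole population.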