Graphic displays like charts and graphs are made up of elements
such as labels, lines, markers and symbols. Any pattern displayed
by these elements may indicate interesting data features. One of
the first questions that you can ask is, "What am I being asked
to do in this graph?" The graph should do at least one (though
preferably more) of the following:
provide information (if the graph is being used just to read
numbers, then a table would be better);
show a trend or a norm;
encourage a comparison of groups or an examination of the relationship
between two variables.
If the graph doesn't do one or more of these things, then you need
to ask why the graphic is there at all.
A graphic display is of value when it is used to show a particular
type of pattern, which includes:
comparing groups,
looking at the relationship between two variables,
showing trends in a series of observations, or
showing the distribution of data.
When can you use text to present data?
If you only have a couple of numbers, then you have to question
whether you in fact need a graph. If you do decide to use a graphic
display in such cases, you will find that you end up filling the
space with what has been called ‘chart junk’ [6] or ‘data noise’ [7].
You might have noticed such graphic displays in magazines. The chart
below is such an example.
(Wainer, 1983, p.138)
Chart junk, such as that in the graphic display above, tends to
make the pattern more difficult to read and interpret.
With such a small set of data, you are better off using a table
or text to present your data. By the way, take another look at the
graph above. Did you notice that no data are provided for U.S. or
Japanese productivity? No comparison is in fact possible from
the graph! We are left to trust the author and designer implicitly.
Let’s look at some basic rules to bear in mind when creating
a graphic display.
Rule 1 of good graphic displays: Keep the display free
of ‘chart junk’
Chart junk tends to obscure the message that the authors are trying
to present in the graphic because our ability to observe the pattern
in the data is being swamped by extra, unnecessary information.
Some examples of chart junk and good practice follow.
i. Excessive use of gridlines on a graph.
Often using excessive gridlines is counterproductive. Look at the
following example.
Note how patterns in fact become easier to see when the unnecessary
gridlines are removed.
ii. The use of unnecessary and distracting colour, shading
or designs.
Background infill and pictures that have little relevance to the
numbers being presented can make interpretation difficult, as the
following example illustrates.
Put simply, the number of variables in your data should indicate
to you the number of dimensions in your graphic display. However,
with one variable the convention is to use an area like a bar or
a circle to show the data. This is why bar charts are the best way
of showing these data: only the height or the width of the bars
(not both) should change with the frequency or quantity of the
data.
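To see why only one dimension of a bar should change, consider what
happens when both do. The following is a minimal sketch using made-up
values; the function name `drawn_area` and the two quantities are
assumptions for illustration and are not taken from any chart
discussed here. It compares the ratio of drawn areas when only the
height follows the data with the ratio when the width follows it too.

```python
def drawn_area(value, base_width=1.0, scale_width=False):
    """Area of a bar drawn for `value`.

    Height is always proportional to the value. If scale_width is True,
    the width grows with the value as well (the practice to avoid).
    """
    height = value
    width = base_width * value if scale_width else base_width
    return height * width


# Hypothetical quantities: the second is twice the first.
small, large = 10.0, 20.0

ratio_height_only = drawn_area(large) / drawn_area(small)
ratio_both = (drawn_area(large, scale_width=True)
              / drawn_area(small, scale_width=True))

print(ratio_height_only)  # 2.0
print(ratio_both)         # 4.0
```

When both dimensions change, a quantity that is twice as large is
drawn with four times the area, so the picture overstates the
difference.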
iii. The use of three-dimensional depth
It is commonly thought that adding 3D effects to a
chart or graph adds value to the display. Unfortunately, it mostly
adds confusion. Spreadsheet packages like Microsoft Excel have made
it easy for people to add extra dimensions to their display without
much thought. The extra dimension adds nothing to the data
and is just something extra that readers have to interpret. The
extra depth is not only distracting, it might lead the reader to
misinterpret the real information because it is more difficult to
compare columns and relate them to the values on the vertical axis,
as is the case in the example below.
The number of dimensions shown in a graph or chart should not exceed
the number of dimensions in the data [11].
Adding depth to a bar chart just makes the pattern more difficult
to read. However, there are far worse examples.
iv. Chart junk and line graphs
Chart junk has affected the graphic display below [8].
Use of a ribbon rather than a line makes it more difficult to
read values accurately off the chart and to make accurate comparisons
between two sets of data.
(Griffiths et al., 1998)
Notice how much easier the graph is to read when lines are used
instead of ribbons.
The ribbon graph might look more 'professional' than the conventional
line graph in the examples above, but it makes the pattern of
change in monthly temperature more difficult to read. The line
graph allows a more accurate reading of the temperature for each
month and makes it easier to compare temperature changes
between the two towns. Additionally, the added depth
on the ribbon chart makes it more difficult to tell which town
has the higher average temperature for any given month. Clearly,
chart junk makes it more difficult for readers to interpret
charts and graphs accurately.
v. Chart junk and pie charts
The addition of values to a pie chart helps to avoid estimation
error, as in the following example.
However, too much added detail can make the graph ‘busy’, i.e. harder
to read, as in this case.
Adding a third dimension also introduces the possibility of the
chart or graph telling a lie, that is, implying something that is
not supported by the data. In the chart below, the hardware and
home sectors have the same value ($882 million) but don’t appear
to because of the distortion at the front of the hardware sector.
(This problem could be reduced by adding data labels that show
the exact value of each sector.)
Often a useful device is to explode a pie chart sector to emphasise
it, as shown below.
You should always guard against creeping 3-dimensionality in your
own work and examine its presence critically in the work of others,
because usually all it does is make it more difficult to read the
pattern in the graphic display. In the three-dimensional pie chart
above, how accurately do the proportions of the chart represent
the proportions or percentages? Making the pie chart three-dimensional
distorts the circle into an ellipse, making it more difficult for
the reader to interpret since the sector angles are no longer proportional
to the percentages. Note how much easier it is to read the size
of the segments on the 2-dimensional pie chart.
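The effect of flattening the circle on the sector angles can be
worked through with a little trigonometry. The following is a rough
sketch under stated assumptions: it models only the squashing of the
circle into an ellipse (vertical distances multiplied by a factor
`squash`) and ignores the added depth at the front of the pie. The
function name `apparent_angle` and the squash factor of 0.5 are
assumptions chosen for illustration, and the sketch applies to
sectors smaller than the full circle.

```python
import math


def apparent_angle(start_deg, end_deg, squash):
    """Angle (in degrees) a sector appears to span once the circle
    has been flattened vertically by the factor `squash`."""

    def project(deg):
        # Direction of the sector edge after flattening the y-axis.
        rad = math.radians(deg)
        return math.atan2(squash * math.sin(rad), math.cos(rad))

    span = math.degrees(project(end_deg) - project(start_deg))
    return span % 360


# Two sectors that each represent 25% of the data (90 degrees).
front = apparent_angle(225, 315, squash=0.5)  # at the front of the pie
side = apparent_angle(-45, 45, squash=0.5)    # at the side of the pie

print(round(front, 2))  # 126.87
print(round(side, 2))   # 53.13
```

Under these assumptions, two sectors with identical percentages
appear to span very different angles, depending only on where they
sit on the tilted pie.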
Rule 2 of good graphic displays: Ensure the scales used
clearly represent the data
i. Ask whether the magnitude of the scale used serves to
hide any changes in the data. [9]
In this example, the scale was used to hide the growth in private
schools in the 1950s.
(Wainer, 1983, p.138)
This type of plot is called a 'stacked bar chart'. Its use here
effectively hides any information about the trends in the number
of private schools. The graphical display was used to show that
the number of private elementary schools in the US had remained
more or less static over 40 years as the number of public elementary
schools declined over the same period. Let’s look at what
happens when we choose a different scale and use a line graph. A
different pattern emerges, one that might invite some explanation.
(Wainer, 1983, p.138)
ii. Ensure that the same scale is used all the way along
the axis.
If you need to introduce a break in the scale so that all data
can be shown on the one scale, make sure that you show the break
clearly – usually at the beginning or end of the data.
A change in scale mid-axis is very misleading and can have a big
influence on your perceptions. For example, the figure below is
a plot of doctors' incomes over time. If you do not examine it too
closely, it appears that physicians’ incomes grow linearly,
with a slight tapering off towards the mid-70s.
[Figure: Physicians' Income ($) and Professional & Technical Workers' Income ($), plotted over time]
However, look at the horizontal scale and note that the plot started
off showing physicians' income every 8 years and ended up showing
it yearly. Let’s look at what happens if the plot is redrawn
with a scale that does not change.
(Wainer, 1983, p.142)
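The distortion caused by a change of scale can be put into numbers.
The following is a minimal sketch, assuming a made-up list of
observation years that starts at 8-year intervals and ends at yearly
intervals; the exact years in the original plot are not reproduced
here, and the function names are hypothetical. It compares where each
observation sits on the axis when points are simply spaced evenly
with where it sits when distance along the axis matches elapsed time.

```python
def evenly_spaced(years):
    """Positions (0 to 1) when every observation gets the same gap."""
    steps = len(years) - 1
    return [i / steps for i in range(len(years))]


def proportional(years):
    """Positions (0 to 1) when axis distance matches elapsed time."""
    first, last = years[0], years[-1]
    return [(year - first) / (last - first) for year in years]


# Hypothetical observation years: every 8 years at first, then yearly.
years = [1939, 1947, 1955, 1963, 1971, 1972, 1973, 1974, 1975, 1976]

even = evenly_spaced(years)
true = proportional(years)

# Share of the axis taken up by the final five years (1971 to 1976).
share_even = 1 - even[4]
share_true = 1 - true[4]

print(round(share_even, 3))  # 0.556
print(round(share_true, 3))  # 0.135
```

With these illustrative years, the last five years fill more than
half of the evenly spaced axis but less than a seventh of the axis
drawn to a constant scale, which is why the shape of the curve
changes so much.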
When looking at these data, you might also ask yourself about the
validity of the calculated incomes. For example, are all the incomes
based on 1976 dollars so that they can be compared? You would realise
that a 1939 dollar would have more purchasing power than a 1976 dollar.
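Restating incomes in the dollars of a single base year is usually
done with a price index. The following is a minimal sketch of that
arithmetic; the function name is hypothetical, and the index values
and income are made-up round numbers chosen only to show the
calculation, not real figures for any year.

```python
def to_constant_dollars(nominal, index_then, index_base):
    """Restate a nominal amount in base-year dollars using a price index."""
    return nominal * index_base / index_then


# Made-up index values for illustration only: prices in the base year
# are assumed to be four times what they were in the earlier year.
INDEX_EARLY = 100.0
INDEX_BASE = 400.0

# A hypothetical income of $5,000 earned in the earlier year.
adjusted = to_constant_dollars(5000.0, INDEX_EARLY, INDEX_BASE)
print(adjusted)  # 20000.0
```

Only after every income has been converted in this way do the
heights of the points on the graph compare like with like.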
iii. Ensure that all the lines drawn on the graph start from a
common base.
The graphical displays below are plots of the US import of red
meat taken from the Handbook of Agricultural Charts published by
the US Department of Agriculture [10].
Graph A
Graph B
(Wainer, 1983, p.143)
Graphs A and B show the same data, but in B, where the data start
from a common base, comparisons between the three sets of data are
easier to make. In graph A, the import quantities for each
type of meat start from different bases. This makes it difficult
to compare fluctuations. In graph B, however, the standard of
comparison is the time axis, which makes comparison of the
different quantities easier.
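The problem with different bases can be shown with a few numbers.
The following is a minimal sketch, assuming that a graph like A draws
each series on top of the one below it; the quantities are made up
and the function name `stacked_tops` is hypothetical. Only the first
series changes between the two time points.

```python
from itertools import accumulate


def stacked_tops(series_at_time):
    """Height of each line when every series is drawn on top of
    the one before it."""
    return list(accumulate(series_at_time))


# Hypothetical quantities for three series at two points in time.
# Only the first series changes; the other two stay the same.
before = [10, 5, 3]
after = [14, 5, 3]

print(stacked_tops(before))  # [10, 15, 18]
print(stacked_tops(after))   # [14, 19, 22]
```

On the stacked layout all three lines rise, even though two of the
series did not change. Drawn from a common base, the lines would sit
at 14, 5 and 3, and only the first would move.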
Let’s finish this section by looking at an example.
SCENARIO
What is wrong with this graph? This graph was based
on a "snapshot of contemporary Australian women" that
was developed as part of a research project for the
Sydney Morning Herald [10] based on women and the 1996
Australian Census. The chart is designed to show the
percentage of each group that earns a specific amount
per week.
You might also wonder why this type of graphic was selected and
what you were expected to interpret from the graph. Are you supposed
to interpret that people who do not have children have more time
to get an education and work on their job and so they earn more
money? The title is ‘What Women Earn’, and yet much of the data are
related to couples' income.
SCENARIO
The following graph was adapted from a guide to writing
research papers [11].
It represents the data that were collected from a study
in which teachers were asked to identify underachievers
and achievers in their classes and the grade point average
of the students was compared to find out if teachers
were good judges of student achievement. The authors
claim that they have applied both MLA and APA documentation
styles to their handbook. Can you identify any limitations
of this graph?