1.3. Time-series plots

We start off by considering a plot most often seen in engineering applications: the time-series plot. The time-series plot is a univariate plot: it shows only one variable. It is a 2-dimensional plot in which one axis, the time-axis, shows graduations at an appropriate scale (seconds, minutes, weeks, quarters, years), while the other axis shows the numeric values. Usually, the time-axis is displayed horizontally, but this is not a requirement: some interesting analysis can be done with time running vertically.

Many statistical packages call this a line plot, as it can be used generally to display any sort of sequence, whether it is along time or some other ordering. The time-series plot is an excellent way to visualize long sequences of data. It tells a visual story along the sequence axis, and the human brain is remarkably good at absorbing this high density of data, locating patterns such as sinusoids, spikes, and outliers, and separating noise from signal.

Here are some tips for effective time-series plots:

The software should support horizontal and vertical zooming. Once zoomed in, you should be able to scroll up, down, left, and right.
Always label the x-axis appropriately with (time) units that make sense.

This plot, found on the Internet, shows a computer’s CPU temperature over time. There are several problems with the plot, but the key issue here is the x-axis. This plot is probably the result of poor default settings in the software, but as you will start to realize, bad defaults are very common in most software packages. They waste your time when you have to repeatedly modify the charts, especially when you are just starting to explore the data. Good software will sensibly label the time-based axis for you.
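
As a rough illustration of what sensible time-axis defaults might look like, the sketch below picks a label format from the span of the data and then labels evenly spaced ticks. The function names and the span thresholds are hypothetical choices made for this example; they are not taken from any particular plotting package.

```python
from datetime import datetime, timedelta


def tick_label_format(start, end):
    """Choose an strftime pattern suited to the plotted time span.

    The thresholds are illustrative assumptions, not a standard.
    """
    span = end - start
    if span <= timedelta(hours=1):
        return "%H:%M:%S"      # short spans: show seconds
    if span <= timedelta(days=2):
        return "%H:%M"         # up to two days: hours and minutes
    if span <= timedelta(days=90):
        return "%Y-%m-%d"      # weeks to months: calendar dates
    if span <= timedelta(days=3 * 365):
        return "%Y-%m"         # up to about three years: year and month
    return "%Y"                # longer spans: years only


def tick_labels(start, end, n_ticks=5):
    """Return labels for n_ticks evenly spaced points between start and end."""
    if n_ticks < 2:
        raise ValueError("n_ticks must be at least 2")
    fmt = tick_label_format(start, end)
    step = (end - start) / (n_ticks - 1)
    return [(start + i * step).strftime(fmt) for i in range(n_ticks)]


# A 40-minute window, e.g. a short (hypothetical) CPU temperature log.
labels = tick_labels(datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 12, 40))
print(labels)
```

For a 40-minute window the labels are clock times; for longer spans the same function falls back to dates, months, or years.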

When plotting more than one trajectory (a vector of values) against time, it is helpful if the lines do not cross or jumble too much. This allows you to clearly see the relationships between the variables. A second y-axis on the right-hand side is helpful when plotting two trajectories, but when plotting three or more trajectories that are in the same numeric range, it is better to use several parallel axes.

[Figure: three correlated variables (improved version)]

For the same data as in the previous tip, a much improved visualization technique is to represent each sequence with a sparkline.
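
To give a feel for the idea, here is a minimal text-only sketch that draws sparklines from Unicode block characters. The function name `sparkline` and the three example series are made up for illustration; real sparklines are usually drawn as small line graphics rather than as characters.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest


def sparkline(values):
    """Map each value to one of eight block characters, scaled to the series' own range."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant series has no variation to show; draw a flat line.
        return BLOCKS[0] * len(values)
    scale = (len(BLOCKS) - 1) / span
    return "".join(BLOCKS[int(round((v - lo) * scale))] for v in values)


# Hypothetical data: three variables recorded at the same eight time points.
series = {
    "Temperature": [20, 21, 23, 26, 25, 22, 21, 20],
    "Pressure": [101, 101, 102, 104, 104, 102, 101, 101],
    "Flow": [5, 5, 6, 8, 7, 6, 5, 5],
}

# One row per variable, stacked so the sequences can be compared by eye.
for name, values in series.items():
    print(f"{name:<12}{sparkline(values)}")
```

Each series is scaled to its own minimum and maximum, so the rows show the shape of each sequence rather than its absolute level.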

Sparklines are small graphics that carry a high density of information. The human eye is easily capable of absorbing about 100 dots or points per linear centimeter and around 10,000 points per square centimeter. These sparklines convey the same amount of information as the previous plots and are easy to read on hand-held devices, such as the cellphones and tablets that are common in chemical plants and other engineering facilities. Read more about them from this hyperlink.
When plotting money values over time (e.g. sales of your product over the past 10 years), adjust for inflation by dividing by the consumer price index or another appropriate factor. Distortions due to the time value of money can be very misleading, as this example of retail sales shows. Canadians can use the inflation calculator from the Bank of Canada. For most countries you can almost certainly find something similar from the country’s central bank or a government office.
If you ever ask yourself, “Why are we being shown so little?” then request more data from before and after the time period or sequence shown. A typical example is stock-price data (see example figure of Apple’s stock). There are numerous graphical “lies” in magazines and reports where the plot shows a drastic change in trend, but in the context of prior data, that trend is a small aberration. Again, this brings into play the brain’s remarkable power to discern signal from noise, but to do this, our brains require context. If the extra context is not provided, ask for it or look for it.
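
The inflation adjustment mentioned in the tip on money values can be sketched as follows. Each nominal amount is rescaled by the ratio of the price index in a chosen base year to the index in the year of the amount. The function name `to_real_values` and all the numbers are hypothetical, chosen only to keep the arithmetic easy to follow; they are not actual CPI figures.

```python
def to_real_values(nominal, price_index, base_year):
    """Convert nominal amounts to base-year money.

    nominal:     dict mapping year -> amount in that year's money
    price_index: dict mapping year -> price index (e.g. CPI) for that year
    base_year:   the year whose money the results are expressed in
    """
    base = price_index[base_year]
    return {
        year: amount * base / price_index[year]
        for year, amount in nominal.items()
    }


# Made-up sales figures and a made-up price index, for illustration only.
nominal_sales = {2000: 100.0, 2010: 150.0}
price_index = {2000: 80.0, 2010: 120.0}

real_sales = to_real_values(nominal_sales, price_index, base_year=2010)
print(real_sales)
```

In this made-up example, sales grew by 50% in nominal terms, but the price index rose by the same 50%, so sales expressed in 2010 money were flat.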