Lattice Graphics: Annotation, Themes, and Scales

Deepayan Sarkar

Datasets for illustration

  • carData::Anscombe : U.S. states plus Washington, D.C., in 1970

    • education : Per-capita education expenditures (USD)

    • income : Per-capita income (USD)

    • young : Proportion under 18 (per 1000)

    • urban : Proportion urban (per 1000)

   education income young urban
ME       189   2824 350.7   508
NH       169   3259 345.9   564
VT       230   3072 348.5   322
MA       168   3835 335.3   846
RI       180   3549 327.1   871
CT       193   4256 341.0   774

Datasets for illustration

  • lattice::USMortality : U.S. mortality rates by cause of death, sex, and rural/urban status
'data.frame':   40 obs. of  5 variables:
 $ Status: Factor w/ 2 levels "Rural","Urban": 2 1 2 1 2 1 2 1 2 1 ...
 $ Sex   : Factor w/ 2 levels "Female","Male": 2 2 1 1 2 2 1 1 2 2 ...
 $ Cause : Factor w/ 10 levels "Alzheimers","Cancer",..: 6 6 6 6 2 2 2 2 7 7 ...
 $ Rate  : num  210 243 132 155 196 ...
 $ SE    : num  0.2 0.6 0.2 0.4 0.2 0.5 0.2 0.4 0.1 0.3 ...

Datasets for illustration

  • MASS::Cars93 : Data from cars on sale in the USA in 1993
'data.frame':   93 obs. of  27 variables:
 $ Manufacturer      : Factor w/ 32 levels "Acura","Audi",..: 1 1 2 2 3 4 4 4 4 5 ...
 $ Model             : Factor w/ 93 levels "100","190E","240",..: 49 56 9 1 6 24 54 74 73 35 ...
 $ Type              : Factor w/ 6 levels "Compact","Large",..: 4 3 1 3 3 3 2 2 3 2 ...
 $ Min.Price         : num  12.9 29.2 25.9 30.8 23.7 14.2 19.9 22.6 26.3 33 ...
 $ Price             : num  15.9 33.9 29.1 37.7 30 15.7 20.8 23.7 26.3 34.7 ...
 $ Max.Price         : num  18.8 38.7 32.3 44.6 36.2 17.3 21.7 24.9 26.3 36.3 ...
 $ MPG.city          : int  25 18 20 19 22 22 19 16 19 16 ...
 $ MPG.highway       : int  31 25 26 26 30 31 28 25 27 25 ...
 $ AirBags           : Factor w/ 3 levels "Driver & Passenger",..: 3 1 2 1 2 2 2 2 2 2 ...
 $ DriveTrain        : Factor w/ 3 levels "4WD","Front",..: 2 2 2 2 3 2 2 3 2 2 ...
 $ Cylinders         : Factor w/ 6 levels "3","4","5","6",..: 2 4 4 4 2 2 4 4 4 5 ...
 $ EngineSize        : num  1.8 3.2 2.8 2.8 3.5 2.2 3.8 5.7 3.8 4.9 ...
 $ Horsepower        : int  140 200 172 172 208 110 170 180 170 200 ...
 $ RPM               : int  6300 5500 5500 5500 5700 5200 4800 4000 4800 4100 ...
 $ Rev.per.mile      : int  2890 2335 2280 2535 2545 2565 1570 1320 1690 1510 ...
 $ Man.trans.avail   : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 1 1 1 1 1 ...
 $ Fuel.tank.capacity: num  13.2 18 16.9 21.1 21.1 16.4 18 23 18.8 18 ...
 $ Passengers        : int  5 5 5 6 4 6 6 6 5 6 ...
 $ Length            : int  177 195 180 193 186 189 200 216 198 206 ...
 $ Wheelbase         : int  102 115 102 106 109 105 111 116 108 114 ...
 $ Width             : int  68 71 67 70 69 69 74 78 73 73 ...
 $ Turn.circle       : int  37 38 37 37 39 41 42 45 41 43 ...
 $ Rear.seat.room    : num  26.5 30 28 31 27 28 30.5 30.5 26.5 35 ...
 $ Luggage.room      : int  11 15 14 17 13 16 17 21 14 18 ...
 $ Weight            : int  2705 3560 3375 3405 3640 2880 3470 4105 3495 3620 ...
 $ Origin            : Factor w/ 2 levels "USA","non-USA": 2 2 2 2 2 1 1 1 1 1 ...
 $ Make              : Factor w/ 93 levels "Acura Integra",..: 1 2 4 3 5 6 7 9 8 10 ...

Fuel efficiency by number of cylinders

[Figure: fuel efficiency by number of cylinders]

Fuel efficiency by number of cylinders and weight

[Figure: fuel efficiency by number of cylinders and weight]

Fuel efficiency by number of cylinders and weight

  • This plot can be improved in a number of ways

  • Most importantly, there is no legend by default; one can be added using auto.key = TRUE

  • To make a version of the plot for presentation, we would usually want to add

    • Nice descriptive labels

    • Units of variables plotted

    • Reference grids and possibly other relevant reference objects
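A minimal sketch of these improvements. The code behind the figures is not shown in the slides, so the choice of variables (MPG.city against Weight, grouped by Cylinders) and the label wording are assumptions made for illustration.

```r
library(lattice)
data(Cars93, package = "MASS")   # MASS is a recommended package shipped with R

# Assumed variables: city mileage vs. weight, grouped by number of cylinders
p <- xyplot(MPG.city ~ Weight, data = Cars93,
            groups = Cylinders,
            auto.key = TRUE,     # default legend built from the levels of 'groups'
            grid = TRUE,         # reference grid behind the points
            xlab = "Weight (lb)",
            ylab = "City fuel efficiency (miles per US gallon)")
```

A lattice call returns a "trellis" object; typing p at the console, or calling print(p), draws it.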

Fuel efficiency by number of cylinders and weight

[Figure: fuel efficiency by number of cylinders and weight]

Legends in lattice graphics

  • Two general-purpose arguments: key and legend (see help(xyplot))

    • key allows structured legends with columns of text, lines, points, and rectangles.

    • legend allows arbitrary grid objects to be used as legends

    • Both require detailed specification by the user (not discussed in detail here)

  • More useful argument: auto.key = TRUE

    • Uses groups argument and display type to construct a legend using key

    • Allows limited customization by specifying as a list: auto.key = list(...)

    • See help(simpleKey) and help(xyplot) for details

Legends in lattice graphics

  • The most useful components when specifying auto.key = list(...) are:

    • space : location of legend, usually "left", "right", "top", "bottom"

    • columns : number of columns into which to arrange the legend

    • title : a title for the legend

    • text : labels to replace default levels of groups
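The four components listed above can be combined in a single call. This is a sketch only: the grouping variable (Origin) and the replacement labels are chosen for illustration and are not taken from the slides.

```r
library(lattice)
data(Cars93, package = "MASS")

p <- xyplot(MPG.city ~ Weight, data = Cars93,
            groups = Origin,
            auto.key = list(space = "right",            # place the legend to the right
                            columns = 1,                # stack entries in one column
                            title = "Origin",           # legend title
                            text = c("US", "Non-US")))  # replaces levels "USA", "non-USA"
```

The labels in text are matched to the levels of the grouping factor by position, so they must be given in the same order as levels(Cars93$Origin).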

Using auto.key

[Figures: two example plots with legends added using auto.key]

Modifying graphical parameters

[Figure: plot with graphical parameters modified through optional arguments]

Modifying graphical parameters

  • Some graphical parameters can be modified through optional arguments

  • Unfortunately, this does not change the corresponding legend

  • This happens because

    • When it is rendered, a lattice display uses a theme consisting of graphical parameter settings

    • The panel display and the legend are actually created by completely different functions

    • The only common information they have access to is the theme

  • To change graphical parameters in the display and legend together, we need to change the theme itself

  • The good news is that this is very easy to do:

    • We can change the global theme used for all subsequent plots

    • We can temporarily change settings for a specific plot using par.settings
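The contrast described above can be sketched as two calls that differ only in where the plotting symbol is specified. The variables are assumptions for illustration; superpose.symbol is the theme component lattice uses for grouped points.

```r
library(lattice)
data(Cars93, package = "MASS")

# pch passed as an optional argument: per the text above, this reaches the
# panel function but not the legend, which is built from the theme
p_args <- xyplot(MPG.city ~ Weight, data = Cars93, groups = Origin,
                 auto.key = TRUE, pch = 16)

# pch changed in the theme for this plot only: panel and legend both read
# their symbols from the same place
p_theme <- xyplot(MPG.city ~ Weight, data = Cars93, groups = Origin,
                  auto.key = TRUE,
                  par.settings = list(superpose.symbol = list(pch = 16)))
```

The par.settings list is stored in the returned object and applied only while that object is being drawn, so the global theme is left as it was.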

Modifying graphical parameters

[Figure: plot with graphical parameters modified through the theme]

Global themes

  • There are a few global themes defined in lattice (see help(trellis.device))

  • Themes, as well as individual theme components, can be set globally using trellis.par.set()

  • latticeExtra defines additional themes: see ?theEconomist.theme and ?ggplot2like

  • latticeExtra also defines a custom.theme() function to construct new themes
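As a sketch using only lattice itself, standard.theme() returns one of the built-in themes as a plain list. Choosing the "pdf" theme with color = FALSE is just an example, and the plotted variables are assumptions.

```r
library(lattice)
data(Cars93, package = "MASS")

# A built-in theme as a list of settings; color = FALSE selects the
# black-and-white variant
bw_theme <- standard.theme("pdf", color = FALSE)

# Here the theme is applied to a single plot through par.settings.
# Calling trellis.par.set(theme = bw_theme) instead would make it the
# theme for subsequent plots on the current device.
p <- xyplot(MPG.city ~ Weight, data = Cars93, groups = Origin,
            auto.key = TRUE, par.settings = bw_theme)
```

Themes from latticeExtra, such as the list returned by theEconomist.theme(), can be passed the same way when that package is installed.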

Global themes

[Figures: four example plots drawn with different global themes; the last resembles a default ggplot2 plot]

Global themes and global settings

  • The last plot looks somewhat like a default ggplot2 plot, but not completely

  • This is because certain other (non-graphical) settings are also different

  • Many of these can be customized through a global “options” setting

    • The main interface is through lattice.options()

    • Options can also be modified temporarily, for a single plot, through the optional argument lattice.options

    • The latter is preferred unless you want to change the settings globally
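A sketch of the per-plot route. The option changed here, axis.padding, is a real lattice option, but picking it (and the value 0) is only an illustration and is not taken from the slides.

```r
library(lattice)
data(Cars93, package = "MASS")

# Current global value: a list with components 'numeric' and 'factor'
default_padding <- lattice.getOption("axis.padding")

# Remove the padding around numeric axes for this plot only
p <- xyplot(MPG.city ~ Weight, data = Cars93, grid = TRUE,
            lattice.options = list(
                axis.padding = list(numeric = 0,
                                    factor = default_padding$factor)))

# The global setting is untouched
unchanged <- identical(lattice.getOption("axis.padding"), default_padding)
```

The global equivalent would be lattice.options(axis.padding = ...), which affects every later plot until it is reset.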

Global themes and global settings

[Figures: two example plots with modified global settings]

Themes and legends in other high-level plots

[Figures: four example plots of group-wise summary data; the last is a Cleveland dot plot]

Displaying tables: bar charts vs dot plots

  • The last few plots are typical visualizations of cross-tabulated (group-wise summary) data

  • The previous plot is known as a Cleveland dot plot

  • Recommended by Cleveland because

    • Bar charts encode data by both position and length, which is redundant

    • Position is a better encoding of a quantity than length (Cleveland and McGill 1984; Heer and Bostock 2010)

  • Cleveland also recommends reordering categories by outcome when there is no inherent ordering

  • This is accomplished by the reorder() function
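A sketch of reordering in a dot plot. It assumes the mortality data shown earlier is the USMortality data frame that ships with lattice (its columns match the structure printed above); the layout choices are illustrative.

```r
library(lattice)   # USMortality is a dataset in lattice

# reorder() returns the same factor with its levels sorted by a summary
# of a second variable (by default the mean) within each level
cause_by_rate <- with(USMortality, reorder(Cause, Rate))

# Causes of death appear in order of increasing mean rate rather than
# alphabetically
p <- dotplot(reorder(Cause, Rate) ~ Rate | Sex, data = USMortality,
             groups = Status,
             auto.key = list(columns = 2))
```

A different summary can be supplied through the FUN argument, for example reorder(Cause, Rate, FUN = median).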

Finer control of scales

  • So far we have used the default scales / axes, but we may want to customize these as well

  • This is achieved using the scales argument, which does three things

    • Controls how the data ranges of individual panels are combined

    • Determines whether an axis is log-transformed

    • Determines how the axis is annotated (with tick marks and labels)
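The three roles can be seen together in one scales specification. This is a sketch with assumed variables and illustrative values, not the code behind the figures that follow.

```r
library(lattice)
data(Cars93, package = "MASS")

p <- xyplot(MPG.city ~ Weight | Origin, data = Cars93,
            scales = list(
                x = list(log = 2),             # log-transform the x-axis (base 2)
                y = list(relation = "free",    # each panel gets its own y-range
                         tick.number = 4)))    # suggested number of tick marks
```

Components given at the top level of scales apply to both axes, while those inside x or y apply to that axis only. The relation component can be "same" (the default), "free", or "sliced".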

Finer control of scales: examples

[Figures: three example plots with customized scales]

Anscombe data: model education expenditure

[Figure: plot of all pairs of variables in the Anscombe data]

Anscombe data: model education expenditure

  • We do not necessarily want to see all pairs, only response vs predictors

  • lattice supports this by allowing multiple terms to be separated by + in the formula

  • By default all terms are plotted in the same panel (superposed as groups)

  • Can be split into different panels using outer = TRUE

  • Default labels usually need further customization
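A sketch of the multiple-term formula with outer = TRUE. To stay self-contained it uses only the six rows of the Anscombe data printed earlier (the full data set is carData::Anscombe); the strip and axis labels are illustrative wording.

```r
library(lattice)

# First six rows of the Anscombe data, as printed above
anscombe6 <- data.frame(
    education = c(189, 169, 230, 168, 180, 193),
    income    = c(2824, 3259, 3072, 3835, 3549, 4256),
    young     = c(350.7, 345.9, 348.5, 335.3, 327.1, 341.0),
    urban     = c(508, 564, 322, 846, 871, 774),
    row.names = c("ME", "NH", "VT", "MA", "RI", "CT"))

p <- xyplot(education ~ income + young + urban, data = anscombe6,
            outer = TRUE,                               # one panel per predictor
            scales = list(x = list(relation = "free")), # predictors differ in range
            strip = strip.custom(factor.levels =
                c("Income (USD)", "Under 18 (per 1000)", "Urban (per 1000)")),
            xlab = NULL,
            ylab = "Education expenditure per capita (USD)")
```

With outer = FALSE (the default) the three predictors would be superposed in one panel as groups, which is rarely useful when they are on such different scales.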

Scatter plot with multiple terms

[Figures: five example scatter plots using multiple terms in the formula]

Exercises

  • High-level lattice functions are S3 generic functions

  • The formula methods are the primary interface, but some specialized methods are also available

  • One such useful method is xyplot() for time-series objects

  • Visualize yearly number of sunspots using xyplot(sunspot.year)

  • Add the optional argument aspect = "xy". Does this make it easier to see some features of the time series?

  • Add the optional argument cut = 4. What does this do? Does it improve the visualization?

Exercises

  • Another useful class of methods is the barchart() and dotplot() methods for tables (array, matrix, etc.)

  • Use these methods to recreate the following plots for the VADeaths data set (see ?dotplot.table)

[Figure: plots of the VADeaths data to be recreated]

References

Cleveland, William S., and Robert McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” Journal of the American Statistical Association 79:531–54.

Heer, Jeffrey, and Michael Bostock. 2010. “Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 203–12. ACM.