Lecture 12: Lab 2 discussion and introduction to the grammar of graphics

Outline:
  • Go over Lab 2 with ggplot2 extensions
  • Review ggplot2 structure
  • Work with sequence alignments in R
Preliminaries:  Download and save the national cancer registry data from

Links:
Some nice examples of ggplot2: http://www.cookbook-r.com/Graphs/
The ggplot2 docs linked by picture: http://docs.ggplot2.org/current/

Using ggplot2 to better understand the data of Lab 2

Overview: The ggplot2 package has become the de facto standard for graphics in R.

Main ggplot2 functions:
  • qplot (render all at once) or ggplot (render piece by piece)
  • print(): render the plot to the screen
  • ggsave(): render the plot to a file on disk
  • save() and load(): write the plot object, together with its data, to disk and read it back
  • summary(): briefly describe the structure of the plot object
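
The functions above can be sketched together in one short session. This is a minimal sketch, not the Lab 2 code: the built-in mtcars data set stands in for the lab data, and the file names under tempdir() are hypothetical.

```r
library(ggplot2)

# Assumption: mtcars is a stand-in for the Lab 2 data.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

print(p)     # render to the current graphics device (the screen when interactive)
summary(p)   # text description of the plot: data, mapping, layers

# Render to disk; ggsave() picks the device from the file extension.
out_pdf <- file.path(tempdir(), "scatter.pdf")   # hypothetical file name
ggsave(out_pdf, plot = p, width = 5, height = 4)

# Save the plot object itself (it carries its data), then restore it.
out_rda <- file.path(tempdir(), "scatter.RData") # hypothetical file name
save(p, file = out_rda)
rm(p)
load(out_rda)   # p exists again and can be printed or modified
```

Saving the object rather than the image keeps the plot editable: after load(), more layers can still be added to p.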
Organizational structure:
  • geom: point, line, bar, boxplot
  • aesthetics:  size, color, shape
  • scales: log, reciprocal, square root
  • coord: linear, log, polar
  • facets: 2D grid of plots
  • layers: third dimension of overlays
  • stats: 1D and 2D binning, group means, quantile regression, contouring
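
As an illustration of how these pieces combine, the sketch below uses one component from most of the categories above in a single plot. The mtcars data set and the particular choices (linear fit, log x axis, am by vs grid) are assumptions made for the example only.

```r
library(ggplot2)

# Assumption: mtcars is a stand-in data set; the component choices are illustrative.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl)), size = 2) +           # geom + aesthetics
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # stat: linear fit per panel
  scale_x_log10() +                                          # scale: log-transformed x
  coord_cartesian(ylim = c(10, 35)) +                        # coord: zoom without dropping data
  facet_grid(am ~ vs)                                        # facets: 2D grid of panels

print(p)
```

Each "+" adds one component, so any line can be removed or swapped without touching the others.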
Creating a plot with qplot:
  1. Give columns that you wish to display
  2. Give the data parameter indicating the data.frame these columns came from.
  3. List aesthetic parameter mappings (e.g., color, shape)
  4. List geom which indicates the type of plot.
  5. List stats -- transformations to apply to the data before the geometry is drawn
  6. Position adjustments (to reduce effect of overlap)
  7. List other parameters
Can add layers and facets to this plot later.
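
A minimal qplot sketch of the steps above, assuming the built-in mtcars data set in place of the lab data. Note that in recent ggplot2 releases qplot() is deprecated (it still runs, with a warning), and its stat and position arguments are no longer supported, so steps 5 and 6 are left out here.

```r
library(ggplot2)

# Assumption: mtcars stands in for the Lab 2 data.
q <- qplot(wt, mpg,                    # 1. columns to display
           data = mtcars,              # 2. data.frame the columns come from
           colour = factor(cyl),       # 3. aesthetic mapping
           geom = "point",             # 4. type of plot
           main = "Weight vs. mileage" # 7. other parameters
)

# Facets (and further layers) can be added afterwards.
q <- q + facet_grid(. ~ am)
print(q)
```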

Creating a plot with ggplot:
  1. The parameters for ggplot are the data.frame and a call to aes() that sets the aesthetic mappings.
  2. The initial call doesn't display anything --- you have to add layers.
  3. ggplot allows more control for building plots from the ground up.
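
The piece-by-piece style might look like the sketch below, again with mtcars as an assumed stand-in data set. In current ggplot2 versions, printing the bare ggplot object draws only an empty panel with axes; nothing from the data appears until a layer is added.

```r
library(ggplot2)

# Assumption: mtcars stands in for the Lab 2 data.
# Step 1: data and aesthetics only; there are no layers yet.
base <- ggplot(mtcars, aes(x = wt, y = mpg))

# Step 2: add layers one at a time.
p <- base + geom_point()
p <- p + geom_smooth(method = "lm", formula = y ~ x)

print(p)
```

Because base is an ordinary R object, the same starting point can be reused to build several different plots.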

Layer:
  1. geom
  2. geom parameters
  3. stats
  4. stats parameters
  5. data
  6. mapping
  7. position
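
The components listed above map onto the arguments of ggplot2's layer() function, which geom_*() and stat_*() call internally. The sketch below is an assumption-laden illustration: mtcars is a stand-in data set, and in current ggplot2 the geom and stat parameters share a single params list rather than being passed separately.

```r
library(ggplot2)

# Assumption: mtcars stands in for the Lab 2 data.
p <- ggplot() +
  layer(
    geom = "point",                         # 1. geom
    params = list(size = 2, alpha = 0.6),   # 2. geom parameters
    stat = "identity",                      # 3. stat (no transformation)
    data = mtcars,                          # 5. data
    mapping = aes(x = wt, y = mpg),         # 6. mapping
    position = "jitter"                     # 7. position adjustment
  ) +
  layer(
    geom = "line",
    stat = "smooth",                        # 3. stat: fitted curve
    params = list(method = "lm",            # 4. stat parameters
                  formula = y ~ x, se = FALSE),
    data = mtcars,
    mapping = aes(x = wt, y = mpg),
    position = "identity"
  )

print(p)
```

In everyday use the shortcuts geom_point() and geom_smooth() are more convenient; spelling out layer() is mainly useful for seeing what each shortcut fills in.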