R statistics

Measures of central tendency:
  • Average or mean: mean, colMeans, rowMeans
  • Median: median
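
A minimal sketch of the functions above, using made-up data (the vector x and the matrix m are illustrative assumptions, not taken from any real data set). It shows how a single outlier pulls the mean but leaves the median almost unchanged:

```r
# Hypothetical data: five measurements, one of them a large outlier.
x <- c(2, 4, 4, 5, 100)

mean(x)     # arithmetic mean: 23
median(x)   # middle value of the sorted data: 4

# A small matrix to illustrate the column-wise and row-wise versions.
m <- matrix(1:6, nrow = 2)   # filled column by column

colMeans(m)   # 1.5 3.5 5.5
rowMeans(m)   # 3 4
```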

Measures of dispersion:
  • Variance (sample):  var
  • Standard deviation (sample): sd
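  • Median absolute deviation (sample): mad (by default scaled by the constant 1.4826 so that it is comparable to the standard deviation for normally distributed data)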
  • Interquartile range: IQR
  • Range: range (returns the minimum and maximum; use diff(range(x)) for the width of the range)
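
The dispersion functions can be tried out as follows. This is only a sketch on a small made-up vector x (an assumption for illustration); the values in the comments were worked out by hand for this particular vector.

```r
# Hypothetical data: eight observations with mean 5.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

var(x)   # sample variance (divides by n - 1): 32/7, about 4.571
sd(x)    # sample standard deviation: sqrt(var(x)), about 2.138

mad(x)                 # scaled by 1.4826 by default: 0.7413
mad(x, constant = 1)   # unscaled median absolute deviation: 0.5

IQR(x)           # 75th minus 25th percentile: 1.5
range(x)         # minimum and maximum: 2 9
diff(range(x))   # width of the range: 7
```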

Other useful measures:
  • Totals: sum, colSums, rowSums
  • Extrema:  max, min, range
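
A short sketch of the totals and extrema functions on an illustrative matrix m (made up for this example). The last lines show the na.rm argument, which these functions accept for skipping missing values:

```r
# Hypothetical 2 x 3 matrix, filled column by column.
m <- matrix(1:6, nrow = 2)

sum(m)       # total of all entries: 21
colSums(m)   # 3 7 11
rowSums(m)   # 9 12

max(m)     # 6
min(m)     # 1
range(m)   # 1 6

# With missing values the result is NA unless na.rm = TRUE is given.
v <- c(1, NA, 3)
sum(v)                 # NA
sum(v, na.rm = TRUE)   # 4
```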

Characterization of the distribution of values:
  • Histograms: hist
  • Boxplot: boxplot
  • Quantiles:  quantile
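
The three functions above can be explored without opening a graphics window by passing plot = FALSE to hist and boxplot, which then return the numbers behind the plot. The sketch below assumes a small made-up vector x and hand-picked bin edges; dropping plot = FALSE draws the usual figures.

```r
# Hypothetical data.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Histogram with explicit bin edges; bins are right-closed by default.
h <- hist(x, breaks = c(0, 2, 4, 6, 8, 10), plot = FALSE)
h$counts   # observations per bin: 1 3 2 1 1

# Boxplot statistics: lower whisker, lower hinge, median,
# upper hinge, upper whisker.
b <- boxplot(x, plot = FALSE)
b$stats
b$out      # points flagged as outliers (none for this vector)

# Quantiles: the default gives min, quartiles and max.
quantile(x)                        # 2, 4, 4.5, 5.5, 9
quantile(x, probs = c(0.1, 0.9))   # 3.4 and 7.6
```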

Probability distributions:
  • Get a list of the distributions supported by R: help(Distributions)
  • Each distribution usually comes as a family of four functions distinguished by a one-letter prefix (e.g., dnorm, pnorm, qnorm, and rnorm for the normal distribution). The d form gives the height of the probability density (or mass) function, the p form gives the cumulative distribution function, the q form gives quantiles (working backwards from probabilities to values), and the r form draws random values from the distribution.
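
As an illustration of the four forms, here is a sketch using the standard normal distribution (mean 0, standard deviation 1, which are the defaults of these functions). The seed value is an arbitrary choice that only makes the random draws repeatable.

```r
# d: height of the density curve at a point.
dnorm(0)        # 1 / sqrt(2 * pi), about 0.3989

# p: cumulative probability P(X <= q).
pnorm(0)        # 0.5
pnorm(1.96)     # about 0.975

# q: quantile, the inverse of p (probability -> value).
qnorm(0.975)    # about 1.96
qnorm(pnorm(1.5))   # recovers 1.5

# r: random draws; set.seed makes them reproducible.
set.seed(42)
draws <- rnorm(3)
draws
```

The same prefixes apply to the other families listed under help(Distributions), for example dbinom, pbinom, qbinom, and rbinom for the binomial distribution.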

Variable relationships:
  • Linearly related: Vector variables x and y are linearly related if y[i] = m * x[i] + b for every index i. (When plotted against each other as ordered pairs, the points fall on a line.)
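  • Covariance measures how two variables vary together. Its sign gives the direction of the linear relationship, but its magnitude depends on the units of x and y:  cov(x, y)
  • Correlation is a normalized measure of how linearly related two variables are. Its values lie between -1 and 1. A value of zero indicates no linear relationship (a nonlinear relationship may still exist):  cor(x, y)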
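
The points above can be checked with a small sketch. The vectors x, y, and z are made up for illustration: y is an exact linear function of x, while z depends on x through a symmetric, purely nonlinear curve.

```r
x <- 1:10
y <- 2 * x + 3       # exactly linear in x (m = 2, b = 3)
z <- (x - 5.5)^2     # depends on x, but not linearly

cov(x, y)            # 2 * var(x), about 18.33
cov(x, 100 * y)      # rescaling y rescales the covariance
cor(x, y)            # perfect positive linear relationship: 1
cor(x, 100 * y)      # correlation is unaffected by the rescaling: 1

# Correlation near zero despite z being fully determined by x.
cor(x, z)            # essentially 0
```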