Goals:  Understand the basic structure of DNA and the transcription/translation process in order to begin understanding how to view genetic data.
 Be able to do some very basic statistics and data analysis in R.
Outline:  Structure of DNA and genes (with a visit to NCBI website)
 Continue to work on R (data structures and combining)
 Basic statistical indicators: mean, median, maximum, minimum, standard deviation, median/mean absolute deviation
Summary of R statistics functions covered in class: R statistics
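The indicators listed in the outline can be tried on a small vector. This is a minimal sketch: the vector x is made up for illustration, and the mean absolute deviation is computed by hand on the assumption that no dedicated base R function is being used for it.

```r
# Made-up values; the single large value (40) shows how the indicators differ.
x <- c(2, 4, 4, 5, 7, 9, 40)

x_mean   <- mean(x)      # arithmetic mean, pulled upward by the large value
x_median <- median(x)    # middle value of the sorted data: 5
x_max    <- max(x)
x_min    <- min(x)
x_sd     <- sd(x)        # sample standard deviation (n - 1 denominator)

# Median absolute deviation. By default mad() multiplies by 1.4826 so that it
# is comparable to the standard deviation for normally distributed data.
x_mad_scaled <- mad(x)
x_mad_raw    <- mad(x, constant = 1)

# Mean absolute deviation around the mean, computed by hand.
x_mean_abs_dev <- mean(abs(x - mean(x)))

print(c(mean = x_mean, median = x_median, sd = x_sd,
        mad = x_mad_scaled, mean_abs_dev = x_mean_abs_dev))
```

Comparing sd(x) with mad(x) on data like this shows how much a single extreme value affects each measure.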
Biology:  Structure of DNA: each strand is always read from its 5' end. In the example, the left strand reads ACTG and the right (complementary) strand reads CAGT. DNA synthesis always proceeds 5' to 3'.
 Gene substructure (codons):
 Other important terms:
 Allele: a version of a gene on one or the other of the chromosomes
 Open reading frame: the region of the DNA between a start codon and a stop codon
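The strand-reading rule and the idea of codons can be checked in R. This is a rough sketch using only base R string functions. The helper names reverse_complement and split_codons are hypothetical, and the sequence ATGGCCTAA is a made-up example, not taken from a real gene.

```r
# Hypothetical helper: read the complementary strand from its own 5' end.
reverse_complement <- function(dna) {
  complemented <- chartr("ACGT", "TGCA", dna)   # A<->T and C<->G
  bases <- strsplit(complemented, "")[[1]]      # split into single letters
  paste(rev(bases), collapse = "")              # reverse the order
}

left_strand  <- "ACTG"
right_strand <- reverse_complement(left_strand)
print(right_strand)   # "CAGT"

# Hypothetical helper: cut a sequence into consecutive three-letter codons.
# Assumes the sequence has at least three bases; trailing bases that do not
# fill a whole codon are dropped.
split_codons <- function(dna) {
  starts <- seq(1, nchar(dna) - 2, by = 3)
  substring(dna, starts, starts + 2)
}

codons <- split_codons("ATGGCCTAA")
print(codons)         # "ATG" "GCC" "TAA"
```

In the made-up sequence, ATG is a start codon and TAA is a stop codon, so the three codons together form a very short open reading frame.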
Exercise: Look up the TP53 gene (tumor protein p53 gene) in GenBank.
Example 1: A time series (airmiles)  Plot as a time series
 Plot by pulling out the vectors
 Calculate the mean and median. Why are they so different?
 Plot the histogram
 Plot boxplot
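One possible way to work through Example 1 is sketched below, using the built-in airmiles dataset. The variable names years and miles are chosen here for illustration.

```r
# airmiles is a built-in yearly time series (class "ts").
plot(airmiles)                        # plot directly as a time series

# Pull out the time axis and the values as plain numeric vectors.
years <- as.numeric(time(airmiles))
miles <- as.numeric(airmiles)
plot(years, miles, type = "b", xlab = "Year", ylab = "Passenger miles")

# Compare the two measures of centre.
miles_mean   <- mean(miles)
miles_median <- median(miles)
print(c(mean = miles_mean, median = miles_median))

hist(miles)                           # distribution of the yearly values
boxplot(miles)                        # five-number summary as a picture
```

The histogram is a good place to start when answering the question about the mean and median: look at where most of the years fall compared with the few largest values.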
Example 2: A time series (sunspot.month)  Plot as a time series
 Calculate mean and median
 Plot boxplot
 Plot histogram
 Reshape into a two dimensional array
 Sum rows to get sunspots by year
 Sum columns to get sunspots by month
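The steps of Example 2 can be sketched as follows. The length of sunspot.month differs between R versions, so this sketch assumes the series starts in January and keeps only complete years before reshaping. The variable names are chosen for illustration.

```r
plot(sunspot.month)                   # plot as a time series

spots <- as.numeric(sunspot.month)    # plain numeric vector of monthly values
spots_mean   <- mean(spots)
spots_median <- median(spots)
boxplot(spots)
hist(spots)

# Keep only whole years so the data fill a matrix with 12 columns exactly.
n_years  <- length(spots) %/% 12
complete <- spots[seq_len(n_years * 12)]

# matrix() fills column by column unless byrow = TRUE is given; with
# byrow = TRUE each row is one year and each column is one calendar month.
by_year_month <- matrix(complete, ncol = 12, byrow = TRUE)
first_year <- start(sunspot.month)[1]
rownames(by_year_month) <- first_year + seq_len(n_years) - 1
colnames(by_year_month) <- month.abb

yearly_totals  <- rowSums(by_year_month)   # sunspots by year
monthly_totals <- colSums(by_year_month)   # sunspots by calendar month

plot(as.numeric(names(yearly_totals)), yearly_totals, type = "l",
     xlab = "Year", ylab = "Sum of monthly sunspot numbers")
```

Leaving out byrow = TRUE would still produce a matrix of the same shape, but the rows would no longer correspond to years, so it is worth checking a few entries against the original series.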
Example 3: A data frame (faithful)  Plot
 Extract columns
 Calculate basic statistics
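Example 3 might look like the sketch below. It uses the built-in faithful data frame, which has the columns eruptions and waiting. The choice of which statistics to print is an assumption made for illustration.

```r
plot(faithful)                        # scatterplot of the two columns

# Two equivalent ways to extract a column as a numeric vector.
eruptions <- faithful$eruptions
waiting   <- faithful[["waiting"]]

summary(faithful)                     # min, quartiles, median, mean per column

# Apply one statistic to every column at once.
col_means   <- sapply(faithful, mean)
col_medians <- sapply(faithful, median)
col_sds     <- sapply(faithful, sd)
col_mads    <- sapply(faithful, mad)

print(rbind(mean = col_means, median = col_medians,
            sd = col_sds, mad = col_mads))
```

Using sapply keeps the column names attached to the results, which makes the printed table easier to read than calling each function on each vector separately.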
