Lecture 2: Getting started with some basics

Outline:
  1. Account setup with classque and VDI
  2. Introduction of focus paper for NextGen sequencing
  3. Genetics terminology
  4. Brief start with R
Preliminaries:

The Big C (focus paper for Next Gen sequencing and the $50 genome):
http://www.nature.com/nature/journal/vaop/ncurrent/full/nature12912.html
Discovery and saturation analysis of cancer genes across 21 tumour types
Lawrence et al. Nature (2014) published online Jan 5, 2014.

We will use this paper as a focus for exploring next-generation sequencing and how low-cost sequencing is likely to change the way diseases are studied and treated.


Biology review:
  • Proteins
    • Molecules
    • Consist of chains of amino acids (20 standard ones in the universal genetic code; 22 if selenocysteine and pyrrolysine are counted)
    • Perform most of the functions of living organisms (the main operational programming components, so to speak)
    • Produced in factories (ribosomes) using RNA as a template
    • Have complex 3D structures and substructures which influence what they can do
  • Chromosome
    • Single piece of DNA  
    • Humans have 23 pairs 
    • Human chromosomes are located in the cell nucleus
  • DNA: 
    • Molecule that encodes genetic instructions 
    • Building blocks are A, T, C, G (nucleotides)
    • Consists of two strands with matched bases (A-T, C-G); the pairing keeps the molecule stable.
    • Each strand of the DNA can contain both sense and antisense sequences. A sense sequence matches the mRNA, except that the mRNA has U's where the DNA has T's. Either strand can be transcribed, depending on the gene. (More details later)
  • Transcription:
    • Process by which an RNA template is produced from a "gene" embedded in a DNA strand
    • An enzyme called RNA polymerase does the copying.
    • The RNA that a particular gene produces may or may not code for a protein:
      • Codes for a protein: called mRNA, or messenger RNA
      • Does not code for a protein: microRNA, lincRNA, ribosomal RNA, etc.
  • Translation:
    • Process by which cellular ribosomes (factories) create proteins from templates (mRNA).
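
The base pairing, transcription, and translation steps in the review above can be sketched in a few lines of code. This is a minimal illustration, not how real tools work: it uses Python rather than R, the function names and the toy sequence are made up for the example, and it ignores introns, strand direction subtleties, and everything else a real gene involves. The only biology it relies on is the A-T/C-G pairing, the T-to-U substitution, and the standard genetic code.

```python
from itertools import product

# Watson-Crick pairing for DNA, as in the review (A-T, C-G).
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Standard genetic code. Codons are enumerated with bases in the order
# U, C, A, G (third base varying fastest); "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def antisense_strand(sense):
    """Return the strand that pairs with `sense`, written 5'->3'."""
    return "".join(DNA_PAIR[base] for base in reversed(sense))


def transcribe(sense):
    """Return the mRNA: the sense sequence with U in place of T."""
    return sense.replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence, chosen only to keep the example short.
sense = "ATGGCCTGGTAA"
mrna = transcribe(sense)
print("sense DNA:", sense)
print("antisense:", antisense_strand(sense))
print("mRNA:     ", mrna)
print("protein:  ", translate(mrna))
```

The protein is printed with one-letter amino acid codes (M for methionine, and so on). Building the codon table from two strings is just a compact way to avoid typing out all 64 entries.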

Space Opera Saga II:


Introduction to R:  Here are some R basics

Data topics:
  • Table or spreadsheet as the data model
  • Basic plotting and bar charts
  • Percentages versus incidence rates versus counts
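
The difference between counts, percentages, and incidence rates is easiest to see on a small table. The sketch below is written in Python purely for illustration (the class itself works in R), and every number in it is made up: the group labels, case counts, and population sizes are hypothetical, as are the helper names `summarize` and `text_barplot`. It treats the table as a list of rows, adds a percent-of-total column and a rate-per-100,000 column, and draws a crude text bar chart for each.

```python
# Hypothetical counts and population sizes; NOT real data.
table = [
    {"group": "A", "cases": 120, "population": 400_000},
    {"group": "B", "cases": 60, "population": 100_000},
    {"group": "C", "cases": 20, "population": 500_000},
]


def summarize(rows, per=100_000):
    """Return new rows with percent-of-total and rate-per-`per` columns."""
    total = sum(row["cases"] for row in rows)
    return [
        {
            **row,
            "percent": 100 * row["cases"] / total,
            "rate": per * row["cases"] / row["population"],
        }
        for row in rows
    ]


def text_barplot(rows, column, width=30):
    """Return one line per row, with a bar scaled to the largest value."""
    largest = max(row[column] for row in rows)
    lines = []
    for row in rows:
        bar = "#" * round(width * row[column] / largest)
        lines.append(f"{row['group']} {bar} {row[column]:g}")
    return lines


summary = summarize(table)
for column in ("cases", "percent", "rate"):
    print(column)
    print("\n".join(text_barplot(summary, column)))
```

With these made-up numbers, group A has the most cases (60% of the total), but group B has the highest rate (60 per 100,000 versus 30 for group A), because its population is much smaller. Counts and percentages describe how the cases are split up; rates describe how common the disease is within each group.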
Resources:
Look at some real data:

In-class exercise:
  • Download the data by race and ethnicity
  • Read it into R
  • Plot it using barplots.