Laboratory 5: Integrating pathway information

In this lab you will combine microarray analysis and pathway information for your gene. Perform the project setup as usual. Your project should be in its own directory that contains a project script as well as the knitr files that produce HTML and PDF reports of the lab. The directory should include all of the data files used in the project as well as a report as a Word document.

Part 1: Pathway information. In this part of the lab, you will extract genes from a major pathway that your gene is involved in and produce a gene list.
  • Find the major KEGG pathways that your gene is involved in (available on the gene page of NCBI). List those as part of your assignment report, along with the pathway categories and KEGG pathway numbers.
  • Verify that your gene actually appears in the pathway.
  • Download the genes (from NCBI) that are part of the pathway.
  • Read the resulting file into R as a data frame and parse out the gene names and aliases for Part 3. Save the list of genes in a file.
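The last step of Part 1 can be sketched in R as follows. This is a minimal sketch, not a required solution: it assumes the NCBI download is tab-delimited with columns named Symbol and Aliases, and that aliases are separated by commas. Check the header of your own file before relying on these names. The gene names (GENEA and so on) and the output file name are hypothetical placeholders so the example runs on its own.

```r
# Toy stand-in for the downloaded pathway file. The column names and the
# comma-separated alias format are assumptions; inspect your real file.
pathway_file <- tempfile(fileext = ".txt")
writeLines(c(
  "GeneID\tSymbol\tAliases",
  "101\tGENEA\tGA1, GA2",
  "102\tGENEB\t",
  "103\tGENEC\tGC1"
), pathway_file)

# Read the file as a data frame, keeping text columns as character.
pathway <- read.delim(pathway_file, stringsAsFactors = FALSE, quote = "")

# Split each alias field on commas (with optional trailing spaces).
alias_list <- strsplit(pathway$Aliases, ",\\s*")

# Pool symbols and aliases, then drop missing, empty, and duplicate names.
pathway_genes <- unique(c(pathway$Symbol, unlist(alias_list)))
pathway_genes <- pathway_genes[!is.na(pathway_genes) & nzchar(pathway_genes)]

# Save the list, one name per line, for use in Part 3.
gene_list_file <- file.path(tempdir(), "pathway_genes.txt")
writeLines(pathway_genes, gene_list_file)
```

In your project, replace the toy file with the path to your download and write the gene list into the project directory instead of a temporary one.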
Part 2: Microarray profiles. In this part of the lab you will select a GEO series (from Lab 4) and a pair of conditions that show interesting differential expression. You will have two lists of samples (GSMs) from the series (e.g., the list from condition A and the list from condition B).
  • Use the limma package to find a list of differentially expressed genes between conditions A and B (in R).
  • Also find the 250 most highly expressed genes in each condition and across the two conditions combined (in R).
  • Find the intersection of the top genes from condition A and the top genes from condition B. Which of these also appear in the top overall?
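The steps of Part 2 might look like the sketch below. It runs on a simulated expression matrix so that it is self-contained; the probe names, sample names, and group sizes are hypothetical. Two assumptions are worth stating: "top expressed" is taken to mean highest mean expression across the samples of a condition, and the 0.05 cutoff on the adjusted p-value is an illustrative choice, not one set by the lab. The limma portion runs only if the package is installed.

```r
set.seed(1)
# Simulated log2 expression: 1000 hypothetical probes x 6 samples.
expr <- matrix(rnorm(6000, mean = 8, sd = 1), nrow = 1000,
               dimnames = list(paste0("probe", 1:1000), paste0("GSM", 1:6)))
cond_a <- c("GSM1", "GSM2", "GSM3")   # samples in condition A
cond_b <- c("GSM4", "GSM5", "GSM6")   # samples in condition B

# Names of the n rows with the highest mean expression (assumed definition).
top_expressed <- function(m, n = 250) {
  means <- rowMeans(m)
  names(sort(means, decreasing = TRUE))[seq_len(min(n, length(means)))]
}

top_a   <- top_expressed(expr[, cond_a])
top_b   <- top_expressed(expr[, cond_b])
top_all <- top_expressed(expr[, c(cond_a, cond_b)])

# Genes in the top list of both conditions, and those also in the overall top.
top_both <- intersect(top_a, top_b)
top_both_and_overall <- intersect(top_both, top_all)

# Differential expression between B and A with limma, if it is available.
if (requireNamespace("limma", quietly = TRUE)) {
  group <- factor(c(rep("A", 3), rep("B", 3)), levels = c("A", "B"))
  design <- model.matrix(~ 0 + group)
  colnames(design) <- levels(group)
  fit <- limma::lmFit(expr[, c(cond_a, cond_b)], design)
  contrast <- limma::makeContrasts(BvsA = B - A, levels = design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contrast))
  de_table <- limma::topTable(fit2, coef = "BvsA", number = Inf,
                              adjust.method = "BH")
  # Illustrative cutoff; random data like this will usually give no hits.
  de_genes <- rownames(de_table)[de_table$adj.P.Val < 0.05]
}
```

With real data, expr would be the expression matrix of your GEO series and cond_a and cond_b would be your two GSM lists. The row names here are probe identifiers, so they would still need to be mapped to gene symbols before comparison with the pathway list.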
Part 3: Pathway expression. Find the intersection of the gene list from Part 1 with each of the lists found in Part 2. List the genes in each intersection and save them to a file.
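One way to organize Part 3 is to keep the Part 2 lists in a named list and intersect each one with the pathway genes. The sketch below uses hypothetical gene names and output file names, and assumes every list already uses the same identifier type (for example, gene symbols in the same letter case).

```r
# Hypothetical inputs; in the lab these come from Parts 1 and 2.
pathway_genes <- c("GENEA", "GENEB", "GENEC", "GA1")
part2_lists <- list(
  differential = c("GENEB", "GENEX", "GA1"),
  top_a        = c("GENEA", "GENEB", "GENEY"),
  top_b        = c("GENEZ"),
  top_all      = c("GENEA", "GENEC")
)

# Intersect the pathway genes with each Part 2 list.
overlaps <- lapply(part2_lists, function(genes) intersect(pathway_genes, genes))

# Write each intersection to its own file, one gene per line.
out_dir <- tempdir()
for (name in names(overlaps)) {
  writeLines(overlaps[[name]],
             file.path(out_dir, paste0("pathway_", name, ".txt")))
}
```

An empty intersection (top_b in this toy data) produces an empty file, which is itself a result worth noting in the report.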