An intensive course on Microarray Data Analysis (3 credits), Finnish Society of Biostatistics  and Åbo Akademi, 3.-5.9.07.

 

The course book, Amaratunga & Cabrera: Exploration and Analysis of DNA Microarray and Protein Array Data (Wiley, 2004), has a website here.

 

Lectures are held in the Wikgren auditorium, Biocity (first corridor to the left after the main entrance). Lecture contents with an approximate schedule:

 

Monday

 

These lecture slides contain material for sections 1.1 onwards (see below). We look at those mainly in the afternoon session.

 

10.15-11.00 1.0 Genome basics

 

The pages 'DNA from the beginning' provide down-to-earth explanations of the issues connected to heredity and DNA. The site 'DNA tutorial' provides a somewhat more condensed explanation of DNA and its companions in the cell. An excellent online biology textbook is available here (Kimball's biology pages).

 

You may have stumbled upon the phrase 'Central dogma of molecular biology', which has to do with the flow of information in cells. Although a wider understanding of classical genetics and of biomolecules in general is beneficial, we restrict ourselves to the aspects most crucial for understanding the uses and limitations of microarrays, which correspond to the main elements of the Central dogma. Below is a collection of links to material available on the Web that will be considered during the first lecture.

 

DNA, Genes, Transcription (animation), mRNA, RNA, Translation (animation), Ribosomes, Proteins, Regulation

 

11.15-12.00 1.1 Microarrays

 

There are different types of microarrays for different purposes. This course is restricted to the study of DNA microarrays, more specifically to cDNA arrays and to Affymetrix-type oligonucleotide arrays. Two generic link collections covering a wide variety of resources for microarray experiments and analyses are available at this and this spot. A short guide to microarray data analysis is freely available from the Finnish CSC web pages, outlining the primary steps in the process of making sense of microarray experiments. Note, however, that this guide has a limited scope regarding statistical model-based methods.

 

To give a flavor of a cDNA array experiment, here's an animation. The people behind this animation have made comparable real yeast microarray data and presentations available here. Another animation shows an experiment with an oligonucleotide array.

 

Microarray studies serve various purposes in human health applications: drug target identification, toxicogenomics, pharmacogenomics, health care

 

Sources of variation in microarray experiments, a paper by Gary Churchill, is highly recommended. The discussion is exemplified through 2-channel experiments (cDNA arrays); however, most issues are relevant for single-channel experiments as well.

 

 

12-12.30 Lunch break (just enough time for grabbing a sandwich or two...)

 

12.30-13.15 1.2 Microarray image processing

 

Nice tutorials about microarray analyses are available at the MD Anderson Cancer Center, including one explaining vital issues in image processing for cDNA arrays (many of the issues also apply to oligonucleotide arrays). During the afternoon lectures we will take a look at the dChip software, which is quite widely used for analyzing oligonucleotide array data, as it can perform a wide range of tasks from processing the raw images to gene clustering.

 

13.15-13.30 Break

 

13.30-14.15 Microarray image processing continued

 

14.15-14.30 Break

 

14.30-16.00 1.3 Pre-processing microarray data

 

 

Tuesday

 

These lecture slides contain material for the Tuesday sessions. Inference, modeling and experimental designs will mainly be covered in the morning sessions, whereas data mining and related issues are discussed in the afternoon lectures. The tentative Tuesday schedule is as follows:

 

10.15-11.00 Lecture I

11.00-11.15 Break

11.15-12.00 Lecture II

12.00-12.30 Lunch break

12.30-13.15 Lecture III

13.15-13.30 Break

13.30-14.15 Lecture IV

14.15-14.30 Break

14.30-16.00 Lecture V

 

In addition to the lecture slides, we will consider these papers as much as time permits:

 

Using Bayesian networks to study genetic networks,

by Nir Friedman and co-workers. This was among the earliest attempts to use Bayesian networks (BN) to study microarray expression data. It looked very promising, but later work has better reflected the actual challenge. For example, the paper

Reconstructing gene-regulatory networks,

by Geier et al., BMC Systems Biology 2007, shows how difficult it is to discover the underlying causal relations between genes, even with the best conceivable data sets that could currently be available.

 

The Gene Ontology (GO) is a controlled vocabulary for describing functions and components in cellular processes. It is an effort towards bringing some order to the jungle of genes and gene products for the model organisms. Tests for gene set enrichment compare lists of differentially expressed genes and non-differentially expressed genes to find which terms in the GO are over- or under-represented amongst the differentially expressed genes. The papers by Lewin and Grieve, BMC Bioinformatics 2006, and Al-Shahrour et al., Bioinformatics 2004, discuss statistical approaches for this purpose.
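To make the idea of an over-representation test concrete, here is a minimal sketch in Python of the classical one-sided hypergeometric test for a single GO term. It is a textbook illustration only and is not claimed to be the method of the papers cited above; the function name and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_genes, n_annotated, n_de, n_de_annotated):
    """One-sided p-value for over-representation of a single GO term.

    n_genes        -- total number of genes on the array
    n_annotated    -- genes on the array annotated with the GO term
    n_de           -- genes called differentially expressed (DE)
    n_de_annotated -- DE genes annotated with the GO term

    Returns P(X >= n_de_annotated), where X is the number of annotated
    genes among n_de genes drawn at random without replacement.
    """
    upper = min(n_annotated, n_de)
    total = comb(n_genes, n_de)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(n_annotated, k) * comb(n_genes - n_annotated, n_de - k)
        for k in range(n_de_annotated, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 genes on the array, 50 carry the term,
# 100 genes are DE, and 12 of those carry the term.
p = hypergeom_enrichment_pvalue(1000, 50, 100, 12)
print(f"p-value for over-representation: {p:.4g}")
```

In practice one such test is run for every GO term, so the resulting p-values need a multiple-testing correction, and the dependence between terms induced by the GO hierarchy should be kept in mind.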

 

Several Bayesian model-based approaches have been developed for clustering genes using microarray expression data. Examples include Medvedovic and Sivaganesan, Bioinformatics 2002; Medvedovic et al., Bioinformatics 2004; and Lewin et al. (2007), Fully Bayesian mixture model for differential gene expression: simulations and model checks.

 

Mike West's research group has developed a framework for Bayesian modelling and analysis of sparse latent factor and factor-regression models, with a number of developments related to applications in analysis of large-scale gene expression data sets. 

 

Wednesday

 

9.15-17.00 Practicals in the ICT computer lab A2065. The exact schedule will be announced at the beginning of the practicals.

 


 

 Updated by Jukka Corander August 26th, 2007.