Mark Reimers

Dept. of Biostatistics (cross-appointed to the Virginia Institute for Psychiatric and Behavioral Genetics, the Massey Cancer Center, and the Center for the Study of Biological Complexity), at Virginia Commonwealth University

Today’s sophisticated biotechnologies and electronics enable researchers to gather data in quantities unimagined ten years ago. These data acquisition technologies are changing the nature of research in biology and are poised to revolutionize medical diagnosis and treatment. At the same time the infrastructure of knowledge is changing: a great deal of relevant information is stored in online databases, which may aid interpretation of experimental and clinical data.

Biostatisticians must accept the challenge of analyzing and integrating these new data sets. The first challenge is to extract a clear signal from the technologies; there are many confounding factors, such as technical or physiological artifacts, which distort the signals. Then we may test hypotheses about biological organization or mechanisms against the data. Usually we are testing hypotheses of a common form for many specific items, such as genes or brain regions; these may be simple hypotheses (e.g. which gene expressions are changed) or more complex (e.g. which measures are correlated). Finally we must take advantage of previous efforts, usually in the form of databases, to constrain and aid our analysis.

We who analyze such data are like the prisoners in Plato’s Cave: with our measures we perceive only a shadow of the reality, and we must infer the reality from the data using our imagination and logic. In my opinion the best analytic approaches combine statistical subtlety with knowledge of the processes under study.

I work on genomic data: gene expression, genotype and epigenetic measures, and on neuroscience data, particularly global methods such as fMRI.

Current Research

Current and Recent Teaching

Research in Neuroscience

Multivariate analysis of neural data

Recent advances in technology have enabled functional neuroscientists to obtain data at high resolution over large areas of the brain or over many cells. This highly multivariate data is our best hope of understanding the dynamic interaction of different brain regions or of different cells within a region. I work on methods to try to identify underlying patterns in the shifting activations.

Research in neural genomics

We have recently completed a large study of gene expression in the developing human brain. I am currently involved in a large RNA-Seq study of gene expression in the cortex of psychiatric patients compared to normal individuals.

Research in Genomics

Microarray normalization

The starting point of every analysis is getting the measures right. There are many (usually unnoticed) technical differences each time a technician prepares an array. These technical differences impact probe measures differently, because probes have different thermodynamics and other characteristics that affect hybridization kinetics. I have developed two approaches: a method of regression on technical variables per array in order to estimate (hence eliminate) the technical differences; and a method of inferring the correlation structure of the errors. Graduate student Tobias Guennel and I have shown that these methods can give a S/N ratio on Nimblegen CGH data up to three times better than conventional approaches.
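As a rough illustration of the first idea (regressing probe measures on a technical variable within one array and removing the fitted trend), here is a minimal sketch. It is not the method described above: it assumes a single hypothetical covariate (probe GC content) and a straight-line effect, and the function name and toy numbers are invented for the example.

```python
from statistics import mean

def remove_technical_trend(log_intensity, covariate):
    """Fit log_intensity ~ a + b * covariate by least squares for one array,
    then subtract the fitted slope term so the array mean is preserved."""
    x_bar = mean(covariate)
    y_bar = mean(log_intensity)
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    if sxx == 0:
        # Covariate is constant on this array: nothing to estimate or remove.
        return list(log_intensity)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(covariate, log_intensity))
    slope = sxy / sxx
    return [y - slope * (x - x_bar)
            for x, y in zip(covariate, log_intensity)]

# Toy data (assumed, not real measurements): a flat true signal of 8.0
# distorted by a technical effect that is linear in probe GC content.
gc = [0.30, 0.40, 0.50, 0.60, 0.70]
observed = [8.0 + 2.0 * (g - 0.5) for g in gc]
corrected = remove_technical_trend(observed, gc)
```

On real arrays the biological signal may itself be correlated with the technical variable, so a per-array fit of this kind would normally use several covariates and some safeguard against removing genuine signal.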

Epigenomics

'Epigenomics' describes the study of epigenetic processes on a genome-wide scale. Such processes include DNA methylation and chromatin modification. Some of my work is fairly technical, such as improving the methodology for the HELP isoschizomer assay. More interesting projects include the study of methylation during differentiation of hematopoietic cells, and integrated analysis of methylation and gene expression.

Analysis of genome-scale data using pathway categories

A well-done microarray experiment may yield a list of hundreds or thousands of modulated genes. It is hard to organize such a list into meaningful knowledge. One approach is to group genes by biological function or pathway, and graduate student Philip Yates and I have developed robust methods for this. The opposite problem occurs in psychiatric genetics: too few variants achieve genome-wide statistical significance. Here too I have developed methods that organize gene variants by pathway, which helps to identify pathways in which combinations of gene variants seem to increase risk significantly.
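For readers unfamiliar with pathway-based analysis, the simplest version of the idea is an over-representation test: given a list of modulated genes, ask whether a pathway contains more of them than chance would predict. The sketch below uses the standard hypergeometric tail probability. It is a generic textbook test, not the robust methods mentioned above, and the function name and counts are illustrative assumptions.

```python
from math import comb

def hypergeom_enrichment_p(n_genes, n_in_pathway, n_selected, n_overlap):
    """Probability of seeing at least n_overlap pathway genes among
    n_selected genes drawn without replacement from n_genes, of which
    n_in_pathway belong to the pathway."""
    total = comb(n_genes, n_selected)
    upper = min(n_in_pathway, n_selected)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_genes - n_in_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 10 genes measured, 5 in the pathway, 5 called
# modulated, and all 5 modulated genes fall in the pathway.
p_value = hypergeom_enrichment_p(10, 5, 5, 5)
```

This test treats genes as independent, which expression data rarely satisfy, and in practice the p-values would also need adjusting for the number of pathways examined.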

See my Opinionated Guide to Microarray Data Analysis

My Collaborators

Selected Publications

Readings for the 'Enlightened Brain' course

A humanist ideas web site

My Personal Web Site
