Fitting Large-Scale Spatial Models with Applications to Microarray Data Analysis
Stephan R. Sain, (CU-Denver), ssain@math.cudenver.edu, and
Reinhard Furrer, (NCAR), furrer@ucar.edu
Abstract
Many problems in the environmental and biological sciences involve the analysis of large quantities of data. For example, a single microarray can include over 400K individual observations. Further, the data in these problems are often subject to various types of structure and, in particular, spatial dependence. Traditional model fitting often fails on data sets of this size because it is difficult both to specify and to compute with the full covariance matrix. We propose using a very general type of mixed model that has a random spatial component. Recognizing that spatial covariance matrices often exhibit a large number of zero or near-zero entries, we use covariance tapering to force these entries to zero. Then, exploiting the sparse nature of such matrices and a new computational approach for computing the Cholesky decomposition, we use backfitting to estimate the fixed and random model parameters. Results will be demonstrated on an experiment using microarrays to build a profile of differentially expressed genes relating to cerebral vascular malformations, an important cause of hemorrhagic stroke and seizures.
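As a rough illustration of the tapering idea, the following sketch multiplies a covariance function elementwise by a compactly supported correlation function, so that entries for distant pairs become exactly zero while the matrix stays positive definite. Everything specific here is an assumption made for illustration and is not taken from the abstract: the exponential covariance, the spherical taper, the one-dimensional grid of locations, the parameter values, and all function names. The plain dense Cholesky routine is only a stand-in; it is not the new computational approach mentioned above, which would exploit the zeros rather than loop over them.

```python
import math


def exp_cov(d, sill=1.0, range_=0.2):
    """Exponential covariance at distance d (assumed model, for illustration)."""
    return sill * math.exp(-d / range_)


def spherical_taper(d, theta):
    """Spherical correlation function: exactly zero for d >= theta."""
    if d >= theta:
        return 0.0
    r = d / theta
    return 1.0 - 1.5 * r + 0.5 * r ** 3


def tapered_cov(locs, theta, sill=1.0, range_=0.2):
    """Elementwise product of the covariance and the taper."""
    n = len(locs)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = abs(locs[i] - locs[j])
            C[i][j] = exp_cov(d, sill, range_) * spherical_taper(d, theta)
    return C


def cholesky(a):
    """Dense lower-triangular Cholesky factor (stand-in for a sparse solver)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


# 50 equally spaced locations on [0, 1]; taper range theta = 0.1 (hypothetical values)
n = 50
locs = [i / (n - 1) for i in range(n)]
C = tapered_cov(locs, theta=0.1)
zero_frac = sum(v == 0.0 for row in C for v in row) / n ** 2
L = cholesky(C)
print(f"fraction of exact zeros after tapering: {zero_frac:.3f}")
```

On a regular one-dimensional grid the tapered matrix is banded, and the Cholesky factor keeps the same bandwidth. For irregular locations in two or more dimensions the sparsity pattern is less regular, and the amount of fill-in in the factor depends on how the rows and columns are ordered.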