Background: Data-analytic approaches to Affymetrix® microarray data include: (a) a covariate model, in which the observed signal is an estimated linear function of the perfect match (PM) and mismatch (MM) signals; (b) a difference model (PM-MM); and (c) a PM-only model, in which the MM data are not used. Methods: By decomposing the correlations among the variables in the statistical model and making certain assumptions, we derive theoretically the statistical model that reflects the actual gene expression level under a variety of conditions expected in microarray data. Results and conclusion: When modeling non-systematic variation, the covariate model provides maximum flexibility and often reflects the actual gene expression levels better than the difference model. However, the PM-only model demonstrates superior power in an overwhelming majority of realistic situations, which provides theoretical support for the current trend toward PM-only models in microarray data analyses. © 2005 Adis Data Information BV. All rights reserved.
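The relationship among the three models can be illustrated with a minimal sketch. It assumes, purely for illustration, that the covariate model's linear function takes the form PM - beta * MM with a single coefficient beta; the function names, the coefficient name, and the intensity values are hypothetical and are not taken from the paper. Under that assumption, the difference model corresponds to fixing beta = 1 and the PM-only model to fixing beta = 0, which is one way to read the statement that the covariate model offers the most flexibility.

```python
def covariate_signal(pm, mm, beta):
    """Signal as a linear function of PM and MM (assumed form: PM - beta * MM)."""
    return [p - beta * m for p, m in zip(pm, mm)]


def difference_signal(pm, mm):
    """Difference model: PM - MM for each probe pair."""
    return [p - m for p, m in zip(pm, mm)]


def pm_only_signal(pm, mm):
    """PM-only model: the MM values are ignored."""
    return list(pm)


# Made-up probe intensities, for illustration only.
pm = [120.0, 155.0, 175.0, 230.0]
mm = [40.0, 65.0, 50.0, 90.0]

# With beta fixed at 1 the assumed covariate form reduces to the difference model,
# and with beta fixed at 0 it reduces to the PM-only model.
as_difference = covariate_signal(pm, mm, beta=1.0)
as_pm_only = covariate_signal(pm, mm, beta=0.0)
```

In practice the coefficient would be estimated from the data rather than fixed; the sketch only shows how the two simpler models sit inside the assumed covariate form.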