-
How to capitalize on a priori contrasts in linear (mixed) models: A tutorial
Authors:
Daniel J. Schad,
Shravan Vasishth,
Sven Hohenstein,
Reinhold Kliegl
Abstract:
Factorial experiments in research on memory, language, and other areas are often analyzed using analysis of variance (ANOVA). However, for effects with more than one numerator degree of freedom, e.g., for experimental factors with more than two levels, the ANOVA omnibus F-test is not informative about the source of a main effect or interaction. Because researchers typically have specific hypotheses about which condition means differ from each other, a priori contrasts (i.e., comparisons planned before the sample means are known) between specific conditions or combinations of conditions are the appropriate way to represent such hypotheses in the statistical model. Many researchers have pointed out that contrasts should be "tested instead of, rather than as a supplement to, the ordinary `omnibus' F test" (Hays, 1973, p. 601). In this tutorial, we explain the mathematics underlying different kinds of contrasts (i.e., treatment, sum, repeated, polynomial, custom, nested, interaction contrasts), discuss their properties, and demonstrate how they are applied in the R System for Statistical Computing (R Core Team, 2018). In this context, we explain the generalized inverse, which is needed to compute the coefficients for contrasts that test hypotheses not covered by the default set of contrasts. A detailed understanding of contrast coding is crucial for successful and correct specification in linear models (including linear mixed models). Contrasts defined a priori yield far more useful confirmatory tests of experimental hypotheses than the standard omnibus F-test. Reproducible code is available from https://osf.io/7ukf6/.
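The role of the generalized inverse can be illustrated with a minimal sketch. It is written in Python with NumPy rather than in R, which the tutorial itself uses, and the three-level factor, the hypothesis weights, and the condition means are assumptions chosen for illustration only. Each row of the hypothesis matrix states one comparison of condition means; its generalized inverse gives the contrast matrix whose regression coefficients estimate exactly those comparisons.

```python
import numpy as np

# Hypothetical three-level factor. Each row holds the weights that one
# hypothesis places on the condition means (mu1, mu2, mu3).
hypothesis_matrix = np.array([
    [1 / 3, 1 / 3, 1 / 3],   # intercept: grand mean
    [-1.0, 1.0, 0.0],        # level 2 minus level 1
    [0.0, -1.0, 1.0],        # level 3 minus level 2
])

# The generalized (Moore-Penrose) inverse turns hypothesis weights into
# contrast coefficients; one row per factor level, one column per predictor.
contrast_matrix = np.linalg.pinv(hypothesis_matrix)

# Made-up condition means and a noise-free data set with two
# observations per condition (assumption for illustration).
condition_means = np.array([10.0, 20.0, 40.0])
levels = np.array([0, 0, 1, 1, 2, 2])
y = condition_means[levels]

# Design matrix: every observation receives the contrast row of its level.
design = contrast_matrix[levels]
beta, *_ = np.linalg.lstsq(design, y, rcond=None)

# The coefficients equal the hypothesis weights applied to the means.
print(np.round(contrast_matrix, 3))
print(np.round(beta, 3))
```

The last two columns of `contrast_matrix` are the coefficients of a successive-differences (repeated) contrast, and the fitted coefficients reproduce the grand mean and the two differences between neighbouring levels.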
Submitted 17 July, 2019; v1 submitted 27 July, 2018;
originally announced July 2018.
-
A Semiparametric Model for Bayesian Reader Identification
Authors:
Ahmed Abdelwahab,
Reinhold Kliegl,
Niels Landwehr
Abstract:
We study the problem of identifying individuals based on their characteristic gaze patterns during reading of arbitrary text. The motivation for this problem is an unobtrusive biometric setting in which a user is observed during access to a document, but no specific challenge protocol requiring the user's time and attention is carried out. Existing models of individual differences in gaze control during reading are either based on simple aggregate features of eye movements, or rely on parametric density models to describe, for instance, saccade amplitudes or word fixation durations. We develop flexible semiparametric models of eye movements during reading in which densities are inferred under a Gaussian process prior centered at a parametric distribution family that is expected to approximate the true distribution well. An empirical study on reading data from 251 individuals shows significant improvements over the state of the art.
Submitted 18 July, 2016;
originally announced July 2016.
-
On the Ambiguity of Interaction and Nonlinear Main Effects in a Regime of Dependent Covariates
Authors:
Hannes Matuschek,
Reinhold Kliegl
Abstract:
The analysis of large experimental datasets frequently reveals significant interactions that are difficult to interpret within the theoretical framework guiding the research. Some of these interactions actually arise from the presence of unspecified nonlinear main effects and statistically dependent covariates in the statistical model. Importantly, such nonlinear main effects may be compatible (or, at least, not incompatible) with the current theoretical framework. In the present literature this issue has only been studied in terms of correlated (linearly dependent) covariates. Here we generalize to nonlinear main effects (i.e., main effects of arbitrary shape) and dependent covariates. We propose a novel nonparametric method to test for ambiguous interactions where present parametric methods fail. We illustrate the method with a set of simulations and with reanalyses (a) of effects of parental education on their children's educational expectations and (b) of effects of word properties on fixation locations during reading of natural sentences, specifically of effects of length and morphological complexity of the word to be fixated next. The resolution of such ambiguities facilitates theoretical progress.
Submitted 24 July, 2017; v1 submitted 9 December, 2015;
originally announced December 2015.
-
The cave of Shadows. Addressing the human factor with generalized additive mixed models
Authors:
Harald Baayen,
Shravan Vasishth,
Douglas Bates,
Reinhold Kliegl
Abstract:
Generalized additive mixed models are introduced as an extension of the generalized linear mixed model which makes it possible to deal with temporal autocorrelational structure in experimental data. This autocorrelational structure is likely to be a consequence of learning, fatigue, or the ebb and flow of attention within an experiment (the `human factor'). Unlike molecules or plots of barley, subjects in psycholinguistic experiments are intelligent beings that depend for their survival on constant adaptation to their environment, including the environment of an experiment. Three data sets illustrate that the human factor may interact with predictors of interest, both factorial and metric. We also show that, especially within the framework of the generalized additive model, in the nonlinear world, fitting maximally complex models that take every possible contingency into account is ill-advised as a modeling strategy. Alternative modeling strategies are discussed for both confirmatory and exploratory data analysis.
Submitted 14 November, 2016; v1 submitted 10 November, 2015;
originally announced November 2015.
-
Balancing Type I Error and Power in Linear Mixed Models
Authors:
Hannes Matuschek,
Reinhold Kliegl,
Shravan Vasishth,
Harald Baayen,
Douglas Bates
Abstract:
Linear mixed-effects models (LMMs) have increasingly replaced mixed-model analyses of variance for statistical inference in factorial psycholinguistic experiments. Although LMMs have many advantages over ANOVA, setting them up for data analysis requires some care, just as ANOVAs do. One simple option, when numerically possible, is to fit the full variance-covariance structure of random effects (the maximal model; Barr et al. 2013), presumably to keep Type I error down to the nominal alpha in the presence of random effects. Although it is true that fitting a model with only random intercepts may lead to higher Type I error, fitting a maximal model also has a cost: it can lead to a significant loss of power. We demonstrate this with simulations and suggest that for typical psychological and psycholinguistic data, higher power is achieved without inflating the Type I error rate if a model selection criterion is used to select a random effect structure that is supported by the data.
Submitted 2 January, 2017; v1 submitted 5 November, 2015;
originally announced November 2015.
-
Parsimonious Mixed Models
Authors:
Douglas Bates,
Reinhold Kliegl,
Shravan Vasishth,
Harald Baayen
Abstract:
The analysis of experimental data with mixed-effects models requires decisions about the specification of the appropriate random-effects structure. Recently, Barr, Levy, Scheepers, and Tily (2013) recommended fitting `maximal' models with all possible random effect components included. Estimation of maximal models, however, may not converge. We show that failure to converge is typically not due to a suboptimal estimation algorithm, but is a consequence of attempting to fit a model that is too complex to be properly supported by the data, irrespective of whether estimation is based on maximum likelihood or on Bayesian hierarchical modeling with uninformative or weakly informative priors. Importantly, even under convergence, overparameterization may lead to uninterpretable models. We provide diagnostic tools for detecting overparameterization and guiding model simplification.
Submitted 26 May, 2018; v1 submitted 16 June, 2015;
originally announced June 2015.
-
Reconstruction of eye movements during blinks
Authors:
M. S. Baptista,
C. Bohn,
R. Kliegl,
R. Engbert,
J. Kurths
Abstract:
In eye movement research in reading, the amount of data plays a crucial role for the validation of results. Blinks, during which readers close their eyes, pose a methodological problem for the analysis of eye movements in reading. The blink rate increases with increasing reading time, resulting in high data losses, especially for older adults or reading-impaired subjects. We present a method, based on the symbolic sequence dynamics of the eye movements, that reconstructs the horizontal position of the eyes while the reader blinks. The method makes use of the observation that the movements of the eyes before closing or after opening contain information about the eye movements during blinks. Test results indicate that our reconstruction method is superior to methods that use simpler interpolation approaches. In addition, analyses of the reconstructed data show no significant deviation from the usual behavior observed in readers.
Submitted 13 March, 2008; v1 submitted 15 February, 2008;
originally announced February 2008.
-
Scaling of Horizontal and Vertical Fixational Eye Movements
Authors:
**-Rong Liang,
Shay Moshel,
Ari Z. Zivotofsky,
Avi Caspi,
Ralf Engbert,
Reinhold Kliegl,
Shlomo Havlin
Abstract:
Eye movements during fixation of a stationary target prevent the adaptation of the photoreceptors to continuous illumination and inhibit fading of the image. These random, involuntary, small movements are restricted at long time scales so as to keep the target at the center of the field of view. Here we use detrended fluctuation analysis (DFA) to study the properties of fixational eye movements at different time scales. Results show different scaling behavior between horizontal and vertical movements. When the small ballistic movements, i.e. micro-saccades, are removed, the scaling exponents in both directions become similar. Our findings suggest that micro-saccades enhance the persistence at short time scales mostly in the horizontal component and much less in the vertical component. This difference may be due to the need to continuously move the eyes in the horizontal plane in order to match the stereoscopic image for different viewing distances.
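For readers unfamiliar with DFA, the following is a minimal sketch of first-order detrended fluctuation analysis in Python with NumPy. It is a generic textbook version, not the authors' implementation, and the function name `dfa_exponent`, the window sizes, and the synthetic test signals are assumptions chosen for illustration. The series is integrated, cut into windows, a linear trend is removed in each window, and the scaling exponent is read off as the slope of the log fluctuation against the log window size.

```python
import numpy as np

def dfa_exponent(signal, window_sizes):
    """Estimate the DFA scaling exponent of a 1-D series (linear detrending)."""
    x = np.asarray(signal, dtype=float)
    # Integrate the mean-centred series to obtain the profile.
    profile = np.cumsum(x - x.mean())
    fluctuations = []
    for n in window_sizes:
        n_windows = len(profile) // n
        # Non-overlapping windows of length n; leftover samples are dropped.
        segments = profile[: n_windows * n].reshape(n_windows, n)
        t = np.arange(n)
        # Fit a straight line to every window at once (one column per window).
        coeffs = np.polyfit(t, segments.T, deg=1)
        trends = np.outer(t, coeffs[0]) + coeffs[1]
        residuals = segments.T - trends
        # Root-mean-square fluctuation around the local trends.
        fluctuations.append(np.sqrt(np.mean(residuals ** 2)))
    # Slope of log F(n) versus log n is the scaling exponent.
    slope, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), deg=1)
    return slope

# Synthetic signals (assumption for illustration, not eye-movement data).
rng = np.random.default_rng(0)
white_noise = rng.standard_normal(10_000)
random_walk = np.cumsum(rng.standard_normal(10_000))
sizes = [16, 32, 64, 128, 256]

alpha_white = dfa_exponent(white_noise, sizes)
alpha_walk = dfa_exponent(random_walk, sizes)
print(alpha_white, alpha_walk)
```

Uncorrelated noise should give an exponent near 0.5 and a random walk an exponent near 1.5; values above 0.5 for an increment series indicate persistence at the time scales covered by the chosen windows.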
Submitted 24 October, 2004;
originally announced October 2004.