Multiple Output Regression with Latent Noise
Authors:
Jussi Gillberg,
Pekka Marttinen,
Matti Pirinen,
Antti J. Kangas,
Pasi Soininen,
Mehreen Ali,
Aki S. Havulinna,
Marjo-Riitta Järvelin,
Mika Ala-Korpela,
Samuel Kaski
Abstract:
In high-dimensional data, structured noise caused by observed and unobserved factors affecting multiple target variables simultaneously poses a serious challenge for modeling by masking the often weak signal. Therefore, (1) explaining away the structured noise in multiple-output regression is of paramount importance. Additionally, (2) assumptions about the correlation structure of the regression weights are needed. We note that both can be formulated in a natural way in a latent variable model, in which both the interesting signal and the noise are mediated through the same latent factors. Under this assumption, the signal model borrows strength from the noise model by encouraging similar effects on correlated targets. We introduce a hyperparameter for the \emph{latent signal-to-noise ratio}, which turns out to be important for modeling weak signals, and an ordered infinite-dimensional shrinkage prior that resolves the rotational unidentifiability in reduced-rank regression models. Simulations and prediction experiments with metabolite, gene expression, fMRI measurement, and macroeconomic time series data show that our model equals or exceeds state-of-the-art performance and, in particular, outperforms the standard approach of assuming independent noise and signal models.
Submitted 3 February, 2016; v1 submitted 27 October, 2014;
originally announced October 2014.
Bayesian Information Sharing Between Noise And Regression Models Improves Prediction of Weak Effects
Authors:
Jussi Gillberg,
Pekka Marttinen,
Matti Pirinen,
Antti J Kangas,
Pasi Soininen,
Marjo-Riitta Järvelin,
Mika Ala-Korpela,
Samuel Kaski
Abstract:
We consider the prediction of weak effects in a multiple-output regression setup, where covariates are expected to explain only a small fraction, less than $\approx 1\%$, of the variance of the target variables. To facilitate the prediction of the weak effects, we constrain our model structure by introducing a novel Bayesian approach of sharing information between the regression model and the noise model. Further reduction of the effective number of parameters is achieved by introducing an infinite shrinkage prior and group sparsity in the context of Bayesian reduced-rank regression, and by using the Bayesian infinite factor model as a flexible low-rank noise model. In our experiments, the model incorporating these novelties outperformed alternatives in genomic prediction of rich phenotype data. In particular, the information sharing between the noise and regression models led to significant improvement in prediction accuracy.
Submitted 16 October, 2013;
originally announced October 2013.