Showing 1–32 of 32 results for author: Ruppert, D

  1. arXiv:2311.01985  [pdf, other]

    q-fin.CP

    Maximizing Portfolio Predictability with Machine Learning

    Authors: Michael Pinelis, David Ruppert

    Abstract: We construct the maximally predictable portfolio (MPP) of stocks using machine learning. Solving for the optimal constrained weights in the multi-asset MPP gives portfolios with a high monthly coefficient of determination, given the sample covariance matrix of predicted return errors from a machine learning model. Various models for the covariance matrix are tested. The MPPs of S&P 500 index const…

    Submitted 3 November, 2023; originally announced November 2023.

  2. arXiv:2310.19340  [pdf, other]

    astro-ph.IM stat.AP

    Splines 'n Lines: Rest-frame galaxy spectral energy distributions via Bayesian functional data analysis

    Authors: David Kent, Tamás Budavári, Thomas J. Loredo, David Ruppert

    Abstract: Survey-based measurements of the spectral energy distributions (SEDs) of galaxies have flux density estimates on badly misaligned grids in rest-frame wavelength. The shift to rest frame wavelength also causes estimated SEDs to have differing support. For many galaxies, there are sizeable wavelength regions with missing data. Finally, dim galaxies dominate typical samples and have noisy SED measure…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 20 pages, 7 figures

  3. arXiv:2205.09800  [pdf, other]

    math.ST

    Smoothness-Penalized Deconvolution (SPeD) of a Density Estimate

    Authors: David Kent, David Ruppert

    Abstract: This paper addresses the deconvolution problem of estimating a square-integrable probability density from observations contaminated with additive measurement errors having a known density. The estimator begins with a density estimate of the contaminated observations and minimizes a reconstruction error penalized by an integrated squared $m$-th derivative. Theory for deconvolution has mainly focuse…

    Submitted 10 April, 2023; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: Revisions: added new theorem in Section 6; added list of assumptions; other, more minor revisions throughout

  4. arXiv:2109.08308  [pdf, other]

    stat.ME stat.AP stat.CO

    Adaptive Ridge-Penalized Functional Local Linear Regression

    Authors: Wentian Huang, David Ruppert

    Abstract: We introduce an original method of multidimensional ridge penalization in functional local linear regressions. The nonparametric regression of functional data is extended from its multivariate counterpart, and is known to be sensitive to the choice of $J$, where $J$ is the dimension of the projection subspace of the data. Under the multivariate setting, a roughness penalty is helpful for variance redu…

    Submitted 16 September, 2021; originally announced September 2021.

  5. arXiv:2104.04190  [pdf, other]

    stat.ME

    Measurement Errors in Semiparametric Generalized Regression Models

    Authors: Mohammad W. Hattab, David Ruppert

    Abstract: Regression models that ignore measurement error in predictors may produce highly biased estimates leading to erroneous inferences. It is well known that it is extremely difficult to take measurement error into account in Gaussian nonparametric regression. This problem becomes tremendously more difficult when considering other families such as logistic regression, Poisson and negative-binomial. For…

    Submitted 31 January, 2023; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: 24 pages and 10 figures

  6. arXiv:2104.00645  [pdf, other]

    stat.ME

    Bayesian Functional Principal Components Analysis via Variational Message Passing

    Authors: Tui H. Nolan, Jeff Goldsmith, David Ruppert

    Abstract: Functional principal components analysis is a popular tool for inference on functional data. Standard approaches rely on an eigendecomposition of a smoothed covariance surface in order to extract the orthonormal functions representing the major modes of variation. This approach can be a computationally intensive procedure, especially in the presence of large datasets with irregular observations. I…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: 43 pages, 5 figures, 1 table

  7. arXiv:2006.00952  [pdf, other]

    math.ST stat.ME

    Bootstrap inference for quantile-based modal regression

    Authors: Tao Zhang, Kengo Kato, David Ruppert

    Abstract: In this paper, we develop uniform inference methods for the conditional mode based on quantile regression. Specifically, we propose to estimate the conditional mode by minimizing the derivative of the estimated conditional quantile function defined by smoothing the linear quantile regression estimator, and develop two bootstrap methods, a novel pivotal bootstrap and the nonparametric bootstrap, fo…

    Submitted 12 April, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: 78 pages

  8. arXiv:2003.00656  [pdf, other]

    q-fin.PM q-fin.GN q-fin.PR q-fin.RM q-fin.ST

    Machine Learning Portfolio Allocation

    Authors: Michael Pinelis, David Ruppert

    Abstract: We find economically and statistically significant gains when using machine learning for portfolio allocation between the market index and risk-free asset. Optimal portfolio rules for time-varying expected returns and volatility are implemented with two Random Forest models. One model is employed in forecasting the sign probabilities of the excess return with payout yields. The second is used to c…

    Submitted 3 November, 2021; v1 submitted 1 March, 2020; originally announced March 2020.

  9. arXiv:1908.03968  [pdf, ps, other]

    stat.ME

    Finite Sample Hypothesis Tests for Stacked Estimating Equations

    Authors: Eli S. Kravitz, Raymond J. Carroll, David Ruppert

    Abstract: Suppose there are two unknown parameters, each parameter is the solution to an estimating equation, and the estimating equation of one parameter depends on the other parameter. The parameters can be jointly estimated by "stacking" their estimating equations and solving for both parameters simultaneously. Asymptotic confidence intervals are readily available for stacked estimating equations. We int…

    Submitted 11 August, 2019; originally announced August 2019.

    Comments: preprint. arXiv admin note: text overlap with arXiv:1908.03967

  10. arXiv:1908.03967  [pdf, ps, other]

    stat.ME

    Sample Splitting as an M-Estimator with Application to Physical Activity Scoring

    Authors: Eli S. Kravitz, Raymond J. Carroll, David Ruppert

    Abstract: Sample splitting is widely used in statistical applications, including classically in classification and more recently for inference post model selection. Motivated by problems in the study of diet, physical activity, and health, we consider a new application of sample splitting. Physical activity researchers wanted to create a scoring system to quickly assess physical activity levels. A score is…

    Submitted 11 August, 2019; originally announced August 2019.

    Comments: preprint. arXiv admin note: text overlap with arXiv:1908.03968

  11. arXiv:1907.07309  [pdf, other]

    stat.ME stat.CO

    Optimal Sampling for Generalized Linear Models under Measurement Constraints

    Authors: Tao Zhang, Yang Ning, David Ruppert

    Abstract: Under "measurement constraints," responses are expensive to measure and initially unavailable on most records in the dataset, but the covariates are available for the entire dataset. Our goal is to sample a relatively small portion of the dataset where the expensive responses will be measured and the resultant sampling estimator is statistically efficient. Measurement constraints require the sa…

    Submitted 25 March, 2020; v1 submitted 16 July, 2019; originally announced July 2019.

    Comments: 52 pages, 11 figures

  12. arXiv:1906.09473  [pdf, other]

    stat.ME stat.AP

    Density Estimation on a Network

    Authors: Yang Liu, David Ruppert

    Abstract: This paper develops a novel approach to density estimation on a network. We formulate nonparametric density estimation on a network as a nonparametric regression problem by binning. Nonparametric regression using local polynomial kernel-weighted least squares has been studied rigorously, and its asymptotic properties make it superior to kernel estimators such as the Nadaraya-Watson estimator. Whe…

    Submitted 4 August, 2020; v1 submitted 22 June, 2019; originally announced June 2019.

    Comments: 38 pages, 13 figures

  13. arXiv:1906.00538  [pdf, other]

    stat.ME stat.AP

    Copula-based functional Bayes classification with principal components and partial least squares

    Authors: Wentian Huang, David Ruppert

    Abstract: We present a new functional Bayes classifier that uses principal component (PC) or partial least squares (PLS) scores from the common covariance function, that is, the covariance function marginalized over groups. When the groups have different covariance functions, the PC or PLS scores need not be independent or even uncorrelated. We use copulas to model the dependence. Our method is semiparametr…

    Submitted 16 September, 2021; v1 submitted 2 June, 2019; originally announced June 2019.

  14. arXiv:1812.01786  [pdf, other]

    stat.ME

    Density Deconvolution with Additive Measurement Errors using Quadratic Programming

    Authors: Ran Yang, Daniel Apley, Jeremy Staum, David Ruppert

    Abstract: Distribution estimation for noisy data via density deconvolution is a notoriously difficult problem for typical noise distributions like Gaussian. We develop a density deconvolution estimator based on quadratic programming (QP) that can achieve better estimation than kernel density deconvolution methods. The QP approach appears to have a more favorable regularization tradeoff between oversmoothing…

    Submitted 4 December, 2018; originally announced December 2018.

  15. Dynamic Shrinkage Processes

    Authors: Daniel R. Kowal, David S. Matteson, David Ruppert

    Abstract: We propose a novel class of dynamic shrinkage processes for Bayesian time series and regression analysis. Building upon a global-local framework of prior construction, in which continuous scale mixtures of Gaussian distributions are employed for both desirable shrinkage properties and computational tractability, we model dependence among the local scale parameters. The resulting processes inherit…

    Submitted 23 February, 2018; v1 submitted 3 July, 2017; originally announced July 2017.

  16. arXiv:1703.02736  [pdf, ps, other]

    math.ST stat.ME

    Profile Estimation for Partial Functional Partially Linear Single-Index Model

    Authors: Qingguo Tang, Linglong Kong, David Ruppert, Rohana J. Karunamuni

    Abstract: This paper studies a \textit{partial functional partially linear single-index model} that consists of a functional linear component as well as a linear single-index component. This model generalizes many well-known existing models and is suitable for more complicated data structures. However, its estimation inherits the difficulties and complexities from both components and makes it a challenging…

    Submitted 8 March, 2017; originally announced March 2017.

  17. arXiv:1606.03775  [pdf, ps, other]

    stat.ME

    Additive Function-on-Function Regression

    Authors: Janet S. Kim, Ana-Maria Staicu, Arnab Maity, Raymond J. Carroll, David Ruppert

    Abstract: We study additive function-on-function regression where the mean response at a particular time point depends on the time point itself as well as the entire covariate trajectory. We develop a computationally efficient estimation methodology based on a novel combination of spline bases with an eigenbasis to represent the trivariate kernel function. We discuss prediction of a new response trajectory,…

    Submitted 14 December, 2016; v1 submitted 12 June, 2016; originally announced June 2016.

    Comments: 26 pages, 4 figures

  18. Functional Autoregression for Sparsely Sampled Data

    Authors: Daniel R. Kowal, David S. Matteson, David Ruppert

    Abstract: We develop a hierarchical Gaussian process model for forecasting and inference of functional time series data. Unlike existing methods, our approach is especially suited for sparsely or irregularly sampled curves and for curves sampled with non-negligible measurement error. The latent process is dynamically modeled as a functional autoregression (FAR) with Gaussian process innovations. We propose…

    Submitted 19 October, 2016; v1 submitted 9 March, 2016; originally announced March 2016.

  19. arXiv:1511.01609  [pdf, other]

    stat.ME stat.CO

    Linear Non-Gaussian Component Analysis via Maximum Likelihood

    Authors: Benjamin B. Risk, David S. Matteson, David Ruppert

    Abstract: Independent component analysis (ICA) is popular in many applications, including cognitive neuroscience and signal processing. Due to computational constraints, principal component analysis is used for dimension reduction prior to ICA (PCA+ICA), which could remove important information. The problem is that interesting independent components (ICs) could be mixed in several principal components that…

    Submitted 1 October, 2017; v1 submitted 5 November, 2015; originally announced November 2015.

  20. Simultaneously modelling far-infrared dust emission and its relation to CO emission in star forming galaxies

    Authors: Rahul Shetty, Julia Roman-Duval, Sacha Hony, Diane Cormier, Ralf S. Klessen, Lukas K. Konstandin, Thomas Loredo, Eric W. Pellegrini, David Ruppert

    Abstract: We present a method to simultaneously model the dust far-infrared spectral energy distribution (SED) and the total infrared $-$ carbon monoxide (CO) integrated intensity $(S_{\rm IR}-I_{\rm CO})$ relationship. The modelling employs a hierarchical Bayesian (HB) technique to estimate the dust surface density, temperature ($T_{\rm eff}$), and spectral index at each pixel from the observed far-infrare…

    Submitted 19 April, 2016; v1 submitted 2 September, 2015; originally announced September 2015.

    Comments: 17 pages, 14 figures, Updated to match MNRAS accepted version

  21. A Bayesian Multivariate Functional Dynamic Linear Model

    Authors: Daniel R. Kowal, David S. Matteson, David Ruppert

    Abstract: We present a Bayesian approach for modeling multivariate, dependent functional data. To account for the three dominant structural features in the data--functional, time dependent, and multivariate components--we extend hierarchical dynamic linear models for multivariate time series to the functional data setting. We also develop Bayesian spline theory in a more general constrained optimization fra…

    Submitted 5 August, 2015; v1 submitted 3 November, 2014; originally announced November 2014.

  22. arXiv:1405.1792  [pdf, ps, other]

    stat.ME stat.CO

    RAPTT: An Exact Two-Sample Test in High Dimensions Using Random Projections

    Authors: Radhendushka Srivastava, Ping Li, David Ruppert

    Abstract: In high dimensions, the classical Hotelling's $T^2$ test tends to have low power or becomes undefined due to singularity of the sample covariance matrix. In this paper, this problem is overcome by projecting the data matrix onto lower dimensional subspaces through multiplication by random matrices. We propose RAPTT (RAndom Projection T-Test), an exact test for equality of means of two normal popul…

    Submitted 7 May, 2014; originally announced May 2014.

  23. Restricted Likelihood Ratio Tests for Linearity in Scalar-on-Function Regression

    Authors: Mathew W. McLean, Giles Hooker, David Ruppert

    Abstract: We propose a procedure for testing the linearity of a scalar-on-function regression relationship. To do so, we use the functional generalized additive model (FGAM), a recently developed extension of the functional linear model. For a functional covariate X(t), the FGAM models the mean response as the integral with respect to t of F{X(t),t} where F is an unknown bivariate function. The FGAM can be…

    Submitted 22 October, 2013; originally announced October 2013.

  24. Fast Covariance Estimation for High-dimensional Functional Data

    Authors: Luo Xiao, David Ruppert, Vadim Zipunnikov, Ciprian Crainiceanu

    Abstract: For smoothing covariance functions, we propose two fast algorithms that scale linearly with the number of observations per function. Most available methods and software cannot smooth covariance matrices of dimension $J \times J$ with $J>500$; the recently introduced sandwich smoother is an exception, but it is not adapted to smooth covariance matrices of large dimensions such as $J \ge 10,000$. Co…

    Submitted 26 February, 2014; v1 submitted 24 June, 2013; originally announced June 2013.

    Comments: 35 pages, 4 figures

    Journal ref: Statistics and Computing, 2016, Vol. 26, No. 1, 409-421

  25. arXiv:1305.3585  [pdf, other]

    stat.ME stat.CO

    Bayesian Functional Generalized Additive Models with Sparsely Observed Covariates

    Authors: Mathew W. McLean, Fabian Scheipl, Giles Hooker, Sonja Greven, David Ruppert

    Abstract: The functional generalized additive model (FGAM) was recently proposed in McLean et al. (2013) as a more flexible alternative to the common functional linear model (FLM) for regressing a scalar on functional covariates. In this paper, we develop a Bayesian version of FGAM for the case of Gaussian errors with identity link function. Our approach allows the functional covariates to be sparsely obser…

    Submitted 26 May, 2017; v1 submitted 15 May, 2013; originally announced May 2013.

    Comments: substantial updates based on referee comments

  26. arXiv:1301.4954  [pdf, ps, other]

    math.ST stat.ME

    Optimal Prediction in an Additive Functional Model

    Authors: Xiao Wang, David Ruppert

    Abstract: The functional generalized additive model (FGAM) provides a more flexible nonlinear functional regression model than the well-studied functional linear regression model. This paper restricts attention to the FGAM with identity link and additive errors, which we will call the additive functional model, a generalization of the functional linear model. This paper studies the minimax rate of convergen…

    Submitted 21 January, 2013; originally announced January 2013.

  27. arXiv:1206.4569  [pdf, ps, other]

    astro-ph.HE astro-ph.IM stat.AP

    Multilevel Bayesian framework for modeling the production, propagation and detection of ultra-high energy cosmic rays

    Authors: Kunlaya Soiaporn, David Chernoff, Thomas Loredo, David Ruppert, Ira Wasserman

    Abstract: Ultra-high energy cosmic rays (UHECRs) are atomic nuclei with energies over ten million times energies accessible to human-made particle accelerators. Evidence suggests that they originate from relatively nearby extragalactic sources, but the nature of the sources is unknown. We develop a multilevel Bayesian framework for assessing association of UHECRs and candidate source populations, and Markov…

    Submitted 28 November, 2013; v1 submitted 20 June, 2012; originally announced June 2012.

    Comments: Published at http://dx.doi.org/10.1214/13-AOAS654 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS654

    Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 3, 1249-1285

  28. arXiv:1206.3540  [pdf, ps, other]

    astro-ph.HE astro-ph.IM stat.AP

    Guilt by Association: Finding Cosmic Ray Sources Using Hierarchical Bayesian Clustering

    Authors: Kunlaya Soiaporn, David Chernoff, Thomas Loredo, David Ruppert, Ira Wasserman

    Abstract: The Earth is continuously showered by charged cosmic ray particles, naturally produced atomic nuclei moving with velocity close to the speed of light. Among these are ultra high energy cosmic ray particles with energy exceeding $5 \times 10^{19}$ eV, which is ten million times more energetic than the most energetic particles produced at the Large Hadron Collider. Astrophysical questions include: what phenome…

    Submitted 15 June, 2012; originally announced June 2012.

  29. arXiv:1201.0708  [pdf, ps, other]

    math.ST

    Local Asymptotics of P-splines

    Authors: Luo Xiao, Yingxing Li, Tatiyana V. Apanasovich, David Ruppert

    Abstract: This report studies local asymptotics of P-splines with $p$th degree B-splines and an $m$th order difference penalty. Earlier work with $p$ and $m$ restricted is extended to the general case. Asymptotically, penalized splines are kernel estimators with equivalent kernels depending on $m$, but not on $p$. A central limit theorem provides simple expressions for the asymptotic mean and variance. Provi…

    Submitted 10 June, 2012; v1 submitted 3 January, 2012; originally announced January 2012.

    Comments: 34 pages, 1 figure

  30. arXiv:1011.4916  [pdf, ps, other]

    stat.ME math.ST

    Fast Bivariate Penalized Splines: the Sandwich Smoother

    Authors: Luo Xiao, Yingxing Li, David Ruppert

    Abstract: We propose a fast penalized spline method for bivariate smoothing. Univariate P-spline smoothers (Eilers and Marx, 1996) are applied simultaneously along both coordinates. The new smoother has a sandwich form which suggested the name "sandwich smoother" to a referee. The sandwich smoother has a tensor product structure that simplifies an asymptotic analysis and it can be computed quickly. We derive a…

    Submitted 13 July, 2012; v1 submitted 22 November, 2010; originally announced November 2010.

    Comments: 45 pages, 3 figures

    Journal ref: J. R. Statist. Soc. B (2013), 75, Part 3, pp. 577-599

  31. arXiv:0912.1824  [pdf, ps, other]

    math.ST

    Local Asymptotics of P-Spline Smoothing

    Authors: Xiao Wang, Jinglai Shen, David Ruppert

    Abstract: This paper addresses asymptotic properties of general penalized spline estimators with an arbitrary B-spline degree and an arbitrary order difference penalty. The estimator is approximated by a solution of a linear differential equation subject to suitable boundary conditions. It is shown that, in a certain sense, the penalized smoothing corresponds approximately to smoothing by the kernel method.…

    Submitted 9 December, 2009; originally announced December 2009.

  32. Discussion: Conditional growth charts

    Authors: Raymond J. Carroll, David Ruppert

    Abstract: Discussion of Conditional growth charts [math.ST/0702634]

    Submitted 22 February, 2007; originally announced February 2007.

    Comments: Published at http://dx.doi.org/10.1214/009053606000000641 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS0102B

    Journal ref: Annals of Statistics 2006, Vol. 34, No. 5, 2098-2104