Search | arXiv e-print repository

arXiv:2203.02057 [pdf, other]

Interpretable Latent Variables in Deep State Space Models

Authors: Haoxuan Wu, David S. Matteson, Martin T. Wells

Abstract: We introduce a new version of deep state-space models (DSSMs) that combines a recurrent neural network with a state-space framework to forecast time series data. The model estimates the observed series as functions of latent variables that evolve non-linearly through time. Due to the complexity and non-linearity inherent in DSSMs, previous works on DSSMs typically produced latent variables that ar… ▽ More We introduce a new version of deep state-space models (DSSMs) that combines a recurrent neural network with a state-space framework to forecast time series data. The model estimates the observed series as functions of latent variables that evolve non-linearly through time. Due to the complexity and non-linearity inherent in DSSMs, previous works on DSSMs typically produced latent variables that are very difficult to interpret. Our paper focus on producing interpretable latent parameters with two key modifications. First, we simplify the predictive decoder by restricting the response variables to be a linear transformation of the latent variables plus some noise. Second, we utilize shrinkage priors on the latent variables to reduce redundancy and improve robustness. These changes make the latent variables much easier to understand and allow us to interpret the resulting latent variables as random effects in a linear mixed model. We show through two public benchmark datasets the resulting model improves forecasting performances. △ Less

Submitted 19 May, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

arXiv:2107.02283 [pdf, other]

Clustering Structure of Microstructure Measures

Authors: Liao Zhu, Ningning Sun, Martin T. Wells

Abstract: This paper builds the clustering model of measures of market microstructure features which are popular in predicting stock returns. In a 10-second time-frequency, we study the clustering structure of different measures to find out the best ones for predicting. In this way, we can predict more accurately with a limited number of predictors, which removes the noise and makes the model more interpret… ▽ More This paper builds the clustering model of measures of market microstructure features which are popular in predicting stock returns. In a 10-second time-frequency, we study the clustering structure of different measures to find out the best ones for predicting. In this way, we can predict more accurately with a limited number of predictors, which removes the noise and makes the model more interpretable. △ Less

Submitted 25 December, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

arXiv:2106.07103 [pdf, other]

A News-based Machine Learning Model for Adaptive Asset Pricing

Authors: Liao Zhu, Haoxuan Wu, Martin T. Wells

Abstract: The paper proposes a new asset pricing model -- the News Embedding UMAP Selection (NEUS) model, to explain and predict the stock returns based on the financial news. Using a combination of various machine learning algorithms, we first derive a company embedding vector for each basis asset from the financial news. Then we obtain a collection of the basis assets based on their company embedding. Aft… ▽ More The paper proposes a new asset pricing model -- the News Embedding UMAP Selection (NEUS) model, to explain and predict the stock returns based on the financial news. Using a combination of various machine learning algorithms, we first derive a company embedding vector for each basis asset from the financial news. Then we obtain a collection of the basis assets based on their company embedding. After that for each stock, we select the basis assets to explain and predict the stock return with high-dimensional statistical methods. The new model is shown to have a significantly better fitting and prediction power than the Fama-French 5-factor model. △ Less

Submitted 13 June, 2021; originally announced June 2021.

arXiv:2008.10183 [pdf, other]

HALO: Learning to Prune Neural Networks with Shrinkage

Authors: Skyler Seto, Martin T. Wells, Wenyu Zhang

Abstract: Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data, however this performance is closely tied to model size. Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity inducing penalty, and (3) training a binary mask jointly with the weights of the netw… ▽ More Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data, however this performance is closely tied to model size. Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity inducing penalty, and (3) training a binary mask jointly with the weights of the network. We study different sparsity inducing penalties from the perspective of Bayesian hierarchical models and present a novel penalty called Hierarchical Adaptive Lasso (HALO) which learns to adaptively sparsify weights of a given network via trainable parameters. When used to train over-parametrized networks, our penalty yields small subnetworks with high accuracy without fine-tuning. Empirically, on image recognition tasks, we find that HALO is able to learn highly sparse network (only 5% of the parameters) with significant gains in performance over state-of-the-art magnitude pruning methods at the same level of sparsity. Code is available at https://github.com/skyler120/sparsity-halo. △ Less

Submitted 27 February, 2021; v1 submitted 24 August, 2020; originally announced August 2020.

Comments: Accepted at SDM 2021

arXiv:2005.12415 [pdf, other]

Robust Matrix Completion with Mixed Data Types

Authors: Daqian Sun, Martin T. Wells

Abstract: We consider the matrix completion problem of recovering a structured low rank matrix with partially observed entries with mixed data types. Vast majority of the solutions have proposed computationally feasible estimators with strong statistical guarantees for the case where the underlying distribution of data in the matrix is continuous. A few recent approaches have extended using similar ideas th… ▽ More We consider the matrix completion problem of recovering a structured low rank matrix with partially observed entries with mixed data types. Vast majority of the solutions have proposed computationally feasible estimators with strong statistical guarantees for the case where the underlying distribution of data in the matrix is continuous. A few recent approaches have extended using similar ideas these estimators to the case where the underlying distributions belongs to the exponential family. Most of these approaches assume that there is only one underlying distribution and the low rank constraint is regularized by the matrix Schatten Norm. We propose a computationally feasible statistical approach with strong recovery guarantees along with an algorithmic framework suited for parallelization to recover a low rank matrix with partially observed entries for mixed data types in one step. We also provide extensive simulation evidence that corroborate our theoretical results. △ Less

Submitted 25 May, 2020; originally announced May 2020.

Comments: 35 pages

arXiv:1906.11333 [pdf, other]

Fairness criteria through the lens of directed acyclic graphical models

Authors: Benjamin R. Baer, Daniel E. Gilbert, Martin T. Wells

Abstract: A substantial portion of the literature on fairness in algorithms proposes, analyzes, and operationalizes simple formulaic criteria for assessing fairness. Two of these criteria, Equalized Odds and Calibration by Group, have gained significant attention for their simplicity and intuitive appeal, but also for their incompatibility. This chapter provides a perspective on the meaning and consequences… ▽ More A substantial portion of the literature on fairness in algorithms proposes, analyzes, and operationalizes simple formulaic criteria for assessing fairness. Two of these criteria, Equalized Odds and Calibration by Group, have gained significant attention for their simplicity and intuitive appeal, but also for their incompatibility. This chapter provides a perspective on the meaning and consequences of these and other fairness criteria using graphical models which reveals Equalized Odds and related criteria to be ultimately misleading. An assessment of various graphical models suggests that fairness criteria should ultimately be case-specific and sensitive to the nature of the information the algorithm processes. △ Less

Submitted 26 June, 2019; originally announced June 2019.

arXiv:1611.07115 [pdf, other]

Tree Space Prototypes: Another Look at Making Tree Ensembles Interpretable

Authors: Sarah Tan, Matvey Soloviev, Giles Hooker, Martin T. Wells

Abstract: Ensembles of decision trees perform well on many problems, but are not interpretable. In contrast to existing approaches in interpretability that focus on explaining relationships between features and predictions, we propose an alternative approach to interpret tree ensemble classifiers by surfacing representative points for each class -- prototypes. We introduce a new distance for Gradient Booste… ▽ More Ensembles of decision trees perform well on many problems, but are not interpretable. In contrast to existing approaches in interpretability that focus on explaining relationships between features and predictions, we propose an alternative approach to interpret tree ensemble classifiers by surfacing representative points for each class -- prototypes. We introduce a new distance for Gradient Boosted Tree models, and propose new, adaptive prototype selection methods with theoretical guarantees, with the flexibility to choose a different number of prototypes in each class. We demonstrate our methods on random forests and gradient boosted trees, showing that the prototypes can perform as well as or even better than the original tree ensemble when used as a nearest-prototype classifier. In a user study, humans were better at predicting the output of a tree ensemble classifier when using prototypes than when using Shapley values, a popular feature attribution method. Hence, prototypes present a viable alternative to feature-based explanations for tree ensembles. △ Less

Submitted 25 August, 2020; v1 submitted 21 November, 2016; originally announced November 2016.

Comments: Camera-ready version for ACM-IMS FODS 2020. A short version was presented at NIPS 2016 Workshop on Interpretable Machine Learning for Complex Systems

Showing 1–7 of 7 results for author: Wells, M T