Search | arXiv e-print repository

arXiv:2406.19531 [pdf, other]

Forward and Backward State Abstractions for Off-policy Evaluation

Authors: Meiling Hao, Pingfan Su, Liyuan Hu, Zoltan Szabo, Qingyuan Zhao, Chengchun Shi

Abstract: Off-policy evaluation (OPE) is crucial for evaluating a target policy's impact offline before its deployment. However, achieving accurate OPE in large state spaces remains challenging. This paper studies state abstractions, originally designed for policy learning, in the context of OPE. Our contributions are three-fold: (i) We define a set of irrelevance conditions central to learning state abstractions for OPE. (ii) We derive sufficient conditions for achieving irrelevance in Q-functions and marginalized importance sampling ratios, the latter obtained by constructing a time-reversed Markov decision process (MDP) based on the observed MDP. (iii) We propose a novel two-step procedure that sequentially projects the original state space into a smaller space, which substantially simplifies the sample complexity of OPE arising from high cardinality.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 42 pages, 5 figures

ACM Class: G.3; I.2.6; G.1.2

arXiv:2307.05234 [pdf, ps, other]

CR-Lasso: Robust cellwise regularized sparse regression

Authors: Peng Su, Garth Tarr, Samuel Muller, Suojin Wang

Abstract: Cellwise contamination remains a challenging problem for data scientists, particularly in research fields that require the selection of sparse features. Traditional robust methods may be neither feasible nor efficient in dealing with such contaminated datasets. We propose CR-Lasso, a robust Lasso-type cellwise regularization procedure that performs feature selection in the presence of cellwise outliers by minimising a regression loss and cell deviation measure simultaneously. To evaluate the approach, we conduct empirical studies comparing its selection and prediction performance with several sparse regression methods. We show that CR-Lasso is competitive under the settings considered. We illustrate the effectiveness of the proposed method on real data through an analysis of a bone mineral density dataset.

Submitted 1 March, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

arXiv:2305.06651 [pdf]

Robust Inference for Causal Mediation Analysis of Recurrent Event Data

Authors: Yan-Lin Chen, Yan-Hong Chen, Pei-Fang Su, Huang-Tz Ou, An-Shun Tai

Abstract: Recurrent events, including cardiovascular events, are commonly observed in biomedical studies. Understanding the effects of various treatments on recurrent events and investigating the underlying mediation mechanisms by which treatments may reduce the frequency of recurrent events are crucial. Although causal inference methods for recurrent event data have been proposed, they cannot be used to assess mediation. This study proposed a novel methodology of causal mediation analysis that accommodates recurrent outcomes of interest in a given individual. A formal definition of causal estimands (direct and indirect effects) within a counterfactual framework is given, and empirical expressions for these effects are identified. To estimate these effects, a semiparametric estimator with triple robustness against model misspecification was developed. The proposed methodology was demonstrated in a real-world application. The method was applied to measure the effects of two diabetes drugs on the recurrence of cardiovascular disease and to examine the mediating role of kidney function in this process.

Submitted 11 May, 2023; originally announced May 2023.

Comments: In preparation for journal submission

arXiv:2207.03466 [pdf, other]

Data-Driven optimal shrinkage of singular values under high-dimensional noise with separable covariance structure with application

Authors: Pei-Chun Su, Hau-Tieng Wu

Abstract: We develop a data-driven optimal shrinkage algorithm for matrix denoising in the presence of high-dimensional noise with a separable covariance structure; that is, the noise is colored and dependent across samples. The algorithm, coined extended OptShrink (eOptShrink), depends on the asymptotic behavior of singular values and singular vectors of the random matrix associated with the noisy data. Based on the developed theory, including the sticking property of non-outlier singular values and delocalization of the non-outlier singular vectors associated with weak signals with a convergence rate, and the spectral behavior of outlier singular values and vectors, we develop three estimators, each of which is of independent interest. First, we design a novel rank estimator, based on which we provide an estimator for the spectral distribution of the pure noise matrix, and hence the optimal shrinker called eOptShrink. In this algorithm we do not need to estimate the separable covariance structure of the noise. A theoretical guarantee of these estimators with a convergence rate is given. On the application side, in addition to a series of numerical simulations with a comparison with various state-of-the-art optimal shrinkage algorithms, we apply eOptShrink to extract maternal and fetal electrocardiograms from the single channel trans-abdominal maternal electrocardiogram.

Submitted 11 May, 2024; v1 submitted 7 July, 2022; originally announced July 2022.

Comments: arXiv admin note: text overlap with arXiv:1905.13060 by other authors

arXiv:2110.12406 [pdf, ps, other]

Robust Variable Selection under Cellwise Contamination

Authors: Peng Su, Garth Tarr, Samuel Muller

Abstract: Cellwise outliers are widespread in data and traditional robust methods may fail when applied to datasets under such contamination. We propose a variable selection procedure that uses a pairwise robust estimator to obtain an initial empirical covariance matrix among the response and potentially many predictors. Then we replace the primary design matrix and the response vector with their robust counterparts based on the estimated covariance matrix. Finally, we adopt the adaptive Lasso to obtain variable selection results. The proposed approach is robust to cellwise outliers in regular and high dimensional settings and empirical results show good performance in comparison with recently proposed alternative robust approaches, particularly in the challenging setting when contamination rates are high but the magnitude of outliers is moderate. Real data applications demonstrate the practical utility of the proposed method.

Submitted 4 September, 2023; v1 submitted 24 October, 2021; originally announced October 2021.

Comments: 17 pages, 4 figures

arXiv:2009.04450 [pdf, other]

Map-Adaptive Goal-Based Trajectory Prediction

Authors: Lingyao Zhang, Po-Hsun Su, Jerrick Hoang, Galen Clark Haynes, Micol Marchetti-Bowick

Abstract: We present a new method for multi-modal, long-term vehicle trajectory prediction. Our approach relies on using lane centerlines captured in rich maps of the environment to generate a set of proposed goal paths for each vehicle. Using these paths -- which are generated at run time and therefore dynamically adapt to the scene -- as spatial anchors, we predict a set of goal-based trajectories along with a categorical distribution over the goals. This approach allows us to directly model the goal-directed behavior of traffic actors, which unlocks the potential for more accurate long-term prediction. Our experimental results on both a large-scale internal driving dataset and on the public nuScenes dataset show that our model outperforms state-of-the-art approaches for vehicle trajectory prediction over a 6-second horizon. We also empirically demonstrate that our model is better able to generalize to road scenes from a completely new city than existing methods.

Submitted 13 November, 2020; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: Published at CoRL 2020

Journal ref: Conference on Robot Learning (CoRL) 2020

arXiv:1904.09525 [pdf, other]

Recovery of the fetal electrocardiogram for morphological analysis from two trans-abdominal channels via optimal shrinkage

Authors: Pei-Chun Su, Stephen Miller, Salim Idriss, Piers Barker, Hau-Tieng Wu

Abstract: We propose a novel algorithm to recover the fetal electrocardiogram (ECG) for both fetal heart rate analysis and morphological analysis of its waveform from two or three trans-abdominal maternal ECG channels. We design an algorithm based on optimal shrinkage and the nonlocal Euclidean median under the wave-shape manifold model. For the fetal heart rate analysis, the algorithm is evaluated on a publicly available database, the 2013 PhysioNet/Computing in Cardiology Challenge, set A. For the morphological analysis, we propose to simulate semi-real databases by mixing the MIT-BIH Normal Sinus Rhythm Database and MITDB Arrhythmia Database. For the fetal R peak detection, the proposed algorithm outperforms all algorithms under comparison. For the morphological analysis, the algorithm provides an encouraging result in recovery of the fetal ECG waveform, including PR, QT and ST intervals, even when the fetus has arrhythmia. To the best of our knowledge, this is the first work focusing on recovering the fetal ECG for morphological analysis from two or three channels with an algorithm potentially applicable for continuous fetal electrocardiographic monitoring, which creates the potential for long-term monitoring.

Submitted 8 August, 2019; v1 submitted 20 April, 2019; originally announced April 2019.

Comments: 25 pages, 6 figures

arXiv:1904.09204 [pdf, other]

Optimal Recovery of Precision Matrix for Mahalanobis Distance from High Dimensional Noisy Observations in Manifold Learning

Authors: Matan Gavish, Ronen Talmon, Pei-Chun Su, Hau-Tieng Wu

Abstract: Motivated by establishing theoretical foundations for various manifold learning algorithms, we study the problem of estimating the Mahalanobis distance (MD), and the associated precision matrix, from high-dimensional noisy data. By relying on recent transformative results in covariance matrix estimation, we demonstrate the sensitivity of MD and the associated precision matrix to measurement noise, determining the exact asymptotic signal-to-noise ratio at which MD fails, and quantifying its performance otherwise. In addition, for an appropriate loss function, we propose an asymptotically optimal shrinker, which is shown to be beneficial over the classical implementation of the MD, both analytically and in simulations. The result is extended to the manifold setup, where the nonlinear interaction between curvature and high-dimensional noise is taken care of. The developed solution is applied to study a multiscale reduction problem in the dynamical system analysis.

Submitted 9 September, 2021; v1 submitted 19 April, 2019; originally announced April 2019.

arXiv:1802.03753 [pdf, other]

Sample Efficient Deep Reinforcement Learning for Dialogue Systems with Large Action Spaces

Authors: Gellért Weisz, Paweł Budzianowski, Pei-Hao Su, Milica Gašić

Abstract: In spoken dialogue systems, we aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans. A part of this effort is the policy optimisation task, which attempts to find a policy describing how to respond to humans, in the form of a function taking the current state of the dialogue and returning the response of the system. In this paper, we investigate deep reinforcement learning approaches to solve this problem. Particular attention is given to actor-critic methods, off-policy reinforcement learning with experience replay, and various methods aimed at reducing the bias and variance of estimators. When combined, these methods result in the previously proposed ACER algorithm that gave competitive results in gaming environments. These environments, however, are fully observable and have a relatively small action set, so in this paper we examine the application of ACER to dialogue policy optimisation. We show that this method beats the current state-of-the-art in deep learning approaches for spoken dialogue systems. This not only leads to a more sample efficient algorithm that can train faster, but also allows us to apply the algorithm in more difficult environments than before. We thus experiment with learning in a very large action space, which has two orders of magnitude more actions than previously considered. We find that ACER trains significantly faster than the current state-of-the-art.

Submitted 11 February, 2018; originally announced February 2018.

arXiv:1711.11023 [pdf, other]

A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management

Authors: Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić

Abstract: Dialogue assistants are rapidly becoming an indispensable daily aid. To avoid the significant effort needed to hand-craft the required dialogue flow, the Dialogue Management (DM) module can be cast as a continuous Markov Decision Process (MDP) and trained through Reinforcement Learning (RL). Several RL models have been investigated over recent years. However, the lack of a common benchmarking framework makes it difficult to perform a fair comparison between different models and their capability to generalise to different environments. Therefore, this paper proposes a set of challenging simulated environments for dialogue model development and evaluation. To provide some baselines, we investigate a number of representative parametric algorithms, namely the deep reinforcement learning algorithms DQN, A2C and Natural Actor-Critic, and compare them to a non-parametric model, GP-SARSA. Both the environments and policy models are implemented using the publicly available PyDial toolkit and released on-line, in order to establish a testbed framework for further experiments and to facilitate experimental reproducibility.

Submitted 6 April, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

Comments: Accepted at the Deep Reinforcement Learning Symposium, 31st Conference on Neural Information Processing Systems (NIPS 2017). Paper updated with minor changes.

arXiv:1707.06299 [pdf, other]

Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning

Authors: Stefan Ultes, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Lina Rojas-Barahona, Pei-Hao Su, Tsung-Hsien Wen, Milica Gašić, Steve Young

Abstract: Reinforcement learning is widely used for dialogue policy optimization where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length. In this work, we propose a structured method for finding a good balance between these components by searching for the optimal reward component weighting. To render this search feasible, we use multi-objective reinforcement learning to significantly reduce the number of training dialogues required. We apply our proposed method to find optimized component weights for six domains and compare them to a default baseline.

Submitted 19 July, 2017; originally announced July 2017.

Comments: Accepted at SIGDial 2017

arXiv:1705.04524 [pdf, other]

Long-term Blood Pressure Prediction with Deep Recurrent Neural Networks

Authors: Peng Su, Xiao-Rong Ding, Yuan-Ting Zhang, Jing Liu, Fen Miao, Ni Zhao

Abstract: Existing methods for arterial blood pressure (BP) estimation directly map the input physiological signals to output BP values without explicitly modeling the underlying temporal dependencies in BP dynamics. As a result, these models suffer from accuracy decay over a long time and thus require frequent calibration. In this work, we address this issue by formulating BP estimation as a sequence prediction problem in which both the input and target are temporal sequences. We propose a novel deep recurrent neural network (RNN) consisting of multilayered Long Short-Term Memory (LSTM) networks, which are incorporated with (1) a bidirectional structure to access larger-scale context information of input sequence, and (2) residual connections to allow gradients in deep RNN to propagate more effectively. The proposed deep RNN model was tested on a static BP dataset, and it achieved root mean square error (RMSE) of 3.90 and 2.66 mmHg for systolic BP (SBP) and diastolic BP (DBP) prediction respectively, surpassing the accuracy of traditional BP prediction models. On a multi-day BP dataset, the deep RNN achieved RMSE of 3.84, 5.25, 5.80 and 5.81 mmHg for the 1st day, 2nd day, 4th day and 6th month after the 1st day SBP prediction, and 1.80, 4.78, 5.0, 5.21 mmHg for corresponding DBP prediction, respectively, which outperforms all previous models with notable improvement. The experimental results suggest that modeling the temporal dependencies in BP dynamics significantly improves the long-term BP prediction accuracy.

Submitted 14 January, 2018; v1 submitted 12 May, 2017; originally announced May 2017.

Comments: To appear in IEEE BHI 2018

arXiv:1606.03352 [pdf, other]

Conditional Generation and Snapshot Learning in Neural Dialogue Systems

Authors: Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, David Vandyke, Steve Young

Abstract: Recently a variety of LSTM-based conditional language models (LM) have been applied across a range of language generation tasks. In this work we study various model architectures and different ways to represent and aggregate the source information in an end-to-end neural dialogue system framework. A method called snapshot learning is also proposed to facilitate learning from supervised sequential signals by applying a companion cross-entropy objective function to the conditioning vector. The experimental and analytical results demonstrate firstly that competition occurs between the conditioning vector and the LM, and the differing architectures provide different trade-offs between the two. Secondly, the discriminative power and transparency of the conditioning vector are key to providing both model interpretability and better performance. Thirdly, snapshot learning leads to consistent performance improvements independent of which architecture is used.

Submitted 10 June, 2016; originally announced June 2016.

arXiv:1604.04562 [pdf, other]

A Network-based End-to-End Trainable Task-oriented Dialogue System

Authors: Tsung-Hsien Wen, David Vandyke, Nikola Mrksic, Milica Gasic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, Steve Young

Abstract: Teaching machines to accomplish tasks by conversing naturally with humans is challenging. Currently, developing task-oriented dialogue systems requires creating multiple components and typically this involves either a large amount of handcrafting, or acquiring costly labelled datasets to solve a statistical learning problem for each component. In this work we introduce a neural network-based text-in, text-out end-to-end trainable goal-oriented dialogue system along with a new way of collecting dialogue data based on a novel pipe-lined Wizard-of-Oz framework. This approach allows us to develop dialogue systems easily and without making too many assumptions about the task at hand. The results show that the model can converse with human subjects naturally whilst helping them to accomplish tasks in a restaurant search domain.

Submitted 24 April, 2017; v1 submitted 15 April, 2016; originally announced April 2016.

Comments: Published at EACL 2017

Showing 1–14 of 14 results for author: Su, P