Search | arXiv e-print repository

arXiv:2402.19109 [pdf, other]

Confidence and Assurance of Percentiles

Abstract: Confidence interval of mean is often used when quoting statistics. The same rigor is often missing when quoting percentiles and tolerance or percentile intervals. This article derives the expression for confidence in percentiles of a sample population. Confidence intervals of the median are compared to those of the mean for a few sample distributions. The concept of assurance from reliability engineering is then extended to percentiles. The assurance level of sorted samples simply matches the confidence and percentile levels. A numerical method to compute assurance using Brent's optimization method is provided as an open-source Python package.

Submitted 29 February, 2024; originally announced February 2024.

Comments: 5 pages, 4 Figures

arXiv:2311.06212 [pdf, other]

Differentiable VQ-VAE's for Robust White Matter Streamline Encodings

Authors: Andrew Lizarraga, Brandon Taraku, Edouardo Honig, Ying Nian Wu, Shantanu H. Joshi

Abstract: Given the complex geometry of white matter streamlines, autoencoders have been proposed as a dimension-reduction tool to simplify the analysis of streamlines in a low-dimensional latent space. However, despite these recent successes, the majority of encoder architectures only perform dimension reduction on single streamlines as opposed to a full bundle of streamlines. This is a severe limitation of the encoder architecture that completely disregards the global geometric structure of streamlines at the expense of individual fibers. Moreover, the latent space may not be well structured, which casts doubt on its interpretability. In this paper we propose a novel Differentiable Vector Quantized Variational Autoencoder, which is engineered to ingest entire bundles of streamlines as a single data point and provides reliable, trustworthy encodings that can later be used to analyze streamlines in the latent space. Comparisons with several state-of-the-art autoencoders demonstrate superior performance in both encoding and synthesis.

Submitted 18 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

Comments: 5 pages, 4 figures, 1 table

arXiv:2305.16578 [pdf, other]

Computation of Reliability Statistics for Finite Samples of Success-Failure Experiments

Authors: Sanjay M. Joshi

Abstract: Computational methods for statistical measures of reliability, confidence, and assurance are available for infinite population size. If the population size is finite and small compared to the number of samples tested, these computational methods need to be improved for a better representation of reality. This article discusses how to compute reliability, confidence, and assurance statistics for a finite number of samples. Graphs and tables are provided as examples and can be used for low test sample sizes. Two open-source Python libraries are provided for computing reliability, confidence, and assurance with both infinite and finite numbers of samples.

Submitted 25 May, 2023; originally announced May 2023.

Comments: 6 pages, 4 figures, 1 table

arXiv:2305.16536 [pdf, ps, other]

Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression

Authors: Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen, Baharan Mirzasoleiman

Abstract: Contrastive learning (CL) has emerged as a powerful technique for representation learning, with or without label supervision. However, supervised CL is prone to collapsing representations of subclasses within a class by not capturing all their features, and unsupervised CL may suppress harder class-relevant features by focusing on learning easy class-irrelevant features; both significantly compromise representation quality. Yet, there is no theoretical understanding of class collapse or feature suppression at test time. We provide the first unified theoretically rigorous framework to determine which features are learnt by CL. Our analysis indicates that, perhaps surprisingly, bias of (stochastic) gradient descent towards finding simpler solutions is a key factor in collapsing subclass representations and suppressing harder class-relevant features. Moreover, we present increasing embedding dimensionality and improving the quality of data augmentations as two theoretically motivated solutions to feature suppression. We also provide the first theoretical explanation for why employing supervised and unsupervised CL together yields higher-quality representations, even when using commonly-used stochastic gradient methods.

Submitted 28 May, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: to appear at ICML 2023

arXiv:2303.03167 [pdf, other]

Computation of Reliability Statistics for Success-Failure Experiments

Authors: Sanjay M. Joshi

Abstract: Reliability is probability of success in a success-failure experiment. Confidence in reliability estimate improves with increasing number of samples. Assurance sets confidence level same as reliability to create one number for easier communication. Assuming binomial distribution for the samples, closed-form expression exists only for calculating confidence. The Wilson Score interval with continuity correction provides approximate closed-form expression for reliability. Brent's method was found to provide fast and accurate estimate for both reliability and assurance computations. Graphs and tables are provided for several number of samples. Two open-source python libraries are introduced for computing reliability, confidence, and assurance.

Submitted 2 March, 2023; originally announced March 2023.

Comments: 6 pages, 4 figures, 3 tables
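
Illustrative note (not part of the listing): the abstract above relates reliability, confidence, and assurance under a binomial model. The sketch below is a minimal reading of that relationship, assuming the common definition of confidence as one minus the probability of seeing f or fewer failures in n trials at reliability r. The function names are hypothetical and are not taken from the author's libraries, and plain bisection is swapped in for Brent's method so that only the standard library is needed.

```python
from math import comb


def confidence(n, f, r):
    """Confidence that reliability is at least r after f failures in n trials.

    Assumed definition: one minus the binomial probability of observing
    f or fewer failures if the true reliability were exactly r.
    """
    tail = sum(comb(n, k) * (1 - r) ** k * r ** (n - k) for k in range(f + 1))
    return 1.0 - tail


def root_of_decreasing(g, lo=0.0, hi=1.0, iterations=200):
    """Bisection for a decreasing g with g(lo) > 0 > g(hi).

    Stand-in for Brent's method; slower, but dependency-free.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def reliability(n, f, c):
    """Reliability r at which the confidence equals c (needs f < n, 0 < c < 1)."""
    return root_of_decreasing(lambda r: confidence(n, f, r) - c)


def assurance(n, f):
    """Single number a at which confidence and reliability coincide (needs f < n)."""
    return root_of_decreasing(lambda a: confidence(n, f, a) - a)


if __name__ == "__main__":
    n, f = 22, 0  # hypothetical test campaign: 22 samples, no failures
    print(confidence(n, f, 0.9))
    print(reliability(n, f, 0.9))
    print(assurance(n, f))
```

Confidence decreases as the claimed reliability r increases, which is why a bracketing root finder on [0, 1] suffices for both the reliability and the assurance computation.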

arXiv:2210.10769 [pdf, other]

"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts

Authors: Haoran Zhang, Harvineet Singh, Marzyeh Ghassemi, Shalmali Joshi

Abstract: Machine learning models frequently experience performance drops under distribution shifts. The underlying cause of such shifts may be multiple simultaneous factors such as changes in data quality, differences in specific covariate distributions, or changes in the relationship between label and features. When a model does fail during deployment, attributing performance change to these factors is critical for the model developer to identify the root cause and take mitigating actions. In this work, we introduce the problem of attributing performance differences between environments to distribution shifts in the underlying data generating mechanisms. We formulate the problem as a cooperative game where the players are distributions. We define the value of a set of distributions to be the change in model performance when only this set of distributions has changed between environments, and derive an importance weighting method for computing the value of an arbitrary set of distributions. The contribution of each distribution to the total performance change is then quantified as its Shapley value. We demonstrate the correctness and utility of our method on synthetic, semi-synthetic, and real-world case studies, showing its effectiveness in attributing performance changes to a wide range of distribution shifts.

Submitted 6 June, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

Comments: Published in ICML 2023

arXiv:2209.08682 [pdf, other]

doi 10.1145/3514094.3534198

Towards Robust Off-Policy Evaluation via Human Inputs

Authors: Harvineet Singh, Shalmali Joshi, Finale Doshi-Velez, Himabindu Lakkaraju

Abstract: Off-policy Evaluation (OPE) methods are crucial tools for evaluating policies in high-stakes domains such as healthcare, where direct deployment is often infeasible, unethical, or expensive. When deployment environments are expected to undergo changes (that is, dataset shifts), it is important for OPE methods to perform robust evaluation of the policies amidst such changes. Existing approaches consider robustness against a large class of shifts that can arbitrarily change any observable property of the environment. This often results in highly pessimistic estimates of the utilities, thereby invalidating policies that might have been useful in deployment. In this work, we address the aforementioned problem by investigating how domain knowledge can help provide more realistic estimates of the utilities of policies. We leverage human inputs on which aspects of the environments may plausibly change, and adapt the OPE methods to only consider shifts on these aspects. Specifically, we propose a novel framework, Robust OPE (ROPE), which considers shifts on a subset of covariates in the data based on user inputs, and estimates worst-case utility under these shifts. We then develop computationally efficient algorithms for OPE that are robust to the aforementioned shifts for contextual bandits and Markov decision processes. We also theoretically analyze the sample complexity of these algorithms. Extensive experimentation with synthetic and real world datasets from the healthcare domain demonstrates that our approach not only captures realistic dataset shifts accurately, but also results in less pessimistic policy evaluations.

Submitted 18 September, 2022; originally announced September 2022.

Comments: 10 pages, 5 figures, 1 table. Appeared at AIES '22: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society. Expanded version of arXiv:2103.15933

arXiv:2201.08262 [pdf, other]

Generalizing Off-Policy Evaluation From a Causal Perspective For Sequential Decision-Making

Authors: Sonali Parbhoo, Shalmali Joshi, Finale Doshi-Velez

Abstract: Assessing the effects of a policy based on observational data from a different policy is a common problem across several high-stake decision-making domains, and several off-policy evaluation (OPE) techniques have been proposed. However, these methods largely formulate OPE as a problem disassociated from the process used to generate the data (i.e. structural assumptions in the form of a causal graph). We argue that explicitly highlighting this association has important implications on our understanding of the fundamental limits of OPE. First, this implies that current formulation of OPE corresponds to a narrow set of tasks, i.e. a specific causal estimand which is focused on prospective evaluation of policies over populations or sub-populations. Second, we demonstrate how this association motivates natural desiderata to consider a general set of causal estimands, particularly extending the role of OPE for counterfactual off-policy evaluation at the level of individuals of the population. A precise description of the causal estimand highlights which OPE estimands are identifiable from observational data under the stated generative assumptions. For those OPE estimands that are not identifiable, the causal perspective further highlights where more experimental data is necessary, and highlights situations where human expertise can aid identification and estimation. Furthermore, many formalisms of OPE overlook the role of uncertainty entirely in the estimation process. We demonstrate how specifically characterising the causal estimand highlights the different sources of uncertainty and when human expertise can naturally manage this uncertainty. We discuss each of these aspects as actionable desiderata for future OPE research at scale and in-line with practical utility.

Submitted 20 January, 2022; originally announced January 2022.

arXiv:2112.00866 [pdf, other]

Bridge Simulation and Metric Estimation on Lie Groups and Homogeneous Spaces

Authors: Mathias Højgaard Jensen, Lennard Hilgendorf, Sarang Joshi, Stefan Sommer

Abstract: We present schemes for simulating Brownian bridges on complete and connected Lie groups and homogeneous spaces. We use this to construct an estimation scheme for recovering an unknown left- or right-invariant Riemannian metric on the Lie group from samples. We subsequently show how pushing forward the distributions generated by Brownian motions on the group results in distributions on homogeneous spaces that exhibit non-trivial covariance structure. The pushforward measure gives rise to new parametric families of distributions on commonly occurring spaces such as spheres and symmetric positive tensors. We extend the estimation scheme to fit these distributions to homogeneous space-valued data. We demonstrate both the simulation schemes and estimation procedures on Lie groups and homogeneous spaces, including $\mathrm{SPD}(3) = \mathrm{GL}_+(3)/\mathrm{SO}(3)$ and $\mathbb{S}^2 = \mathrm{SO}(3)/\mathrm{SO}(2)$.

Submitted 24 May, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

Comments: arXiv admin note: text overlap with arXiv:2106.03431

arXiv:2109.10957 [pdf, other]

Real Robot Challenge: A Robotics Competition in the Cloud

Authors: Stefan Bauer, Felix Widmaier, Manuel Wüthrich, Annika Buchholz, Sebastian Stark, Anirudh Goyal, Thomas Steinbrenner, Joel Akpo, Shruti Joshi, Vincent Berenz, Vaibhav Agrawal, Niklas Funk, Julen Urain De Jesus, Jan Peters, Joe Watson, Claire Chen, Krishnan Srinivasan, Junwu Zhang, Jeffrey Zhang, Matthew R. Walter, Rishabh Madan, Charles Schaff, Takahiro Maeda, Takuma Yoneda, Denis Yarats , et al. (17 additional authors not shown)

Abstract: Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at MPI for Intelligent Systems and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects.

Submitted 10 June, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

arXiv:2109.06312 [pdf, other]

Learning-to-defer for sequential medical decision-making under uncertainty

Authors: Shalmali Joshi, Sonali Parbhoo, Finale Doshi-Velez

Abstract: Learning-to-defer is a framework to automatically defer decision-making to a human expert when ML-based decisions are deemed unreliable. Existing learning-to-defer frameworks are not designed for sequential settings. That is, they defer at every instance independently, based on immediate predictions, while ignoring the potential long-term impact of these interventions. As a result, existing frameworks are myopic. Further, they do not defer adaptively, which is crucial when human interventions are costly. In this work, we propose Sequential Learning-to-Defer (SLTD), a framework for learning-to-defer to a domain expert in sequential decision-making settings. Contrary to existing literature, we pose the problem of learning-to-defer as model-based reinforcement learning (RL) to i) account for long-term consequences of ML-based actions using RL and ii) adaptively defer based on the dynamics (model-based). Our proposed framework determines whether to defer (at each time step) by quantifying whether a deferral now will improve the value compared to delaying deferral to the next time step. To quantify the improvement, we account for potential future deferrals. As a result, we learn a pre-emptive deferral policy (i.e. a policy that defers early if using the ML-based policy could worsen long-term outcomes). Our deferral policy is adaptive to the non-stationarity in the dynamics. We demonstrate that adaptive deferral via SLTD provides an improved trade-off between long-term outcomes and deferral frequency on synthetic, semi-synthetic, and real-world data with non-stationary dynamics. Finally, we interpret the deferral decision by decomposing the propagated (long-term) uncertainty around the outcome, to justify the deferral decision.

Submitted 5 December, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

arXiv:2108.06553 [pdf]

Dimensionality Reduction and State Space Systems: Forecasting the US Treasury Yields Using Frequentist and Bayesian VARs

Authors: Sudiksha Joshi

Abstract: Using a state-space system, I forecasted the US Treasury yields by employing frequentist and Bayesian methods after first decomposing the yields of varying maturities into their unobserved term structure factors. Then, I exploited the structure of the state-space model to forecast the Treasury yields and compared the forecast performance of each model using mean squared forecast error. Among the frequentist methods, I applied the two-step Diebold-Li, two-step principal components, and one-step Kalman filter approaches. Likewise, I imposed five different priors in Bayesian VARs: diffuse, Minnesota, natural conjugate, independent normal-inverse-Wishart, and stochastic search variable selection priors. After forecasting the Treasury yields for 9 different forecast horizons, I found that the BVAR with Minnesota prior generally minimizes the loss function. I augmented the above BVARs by including macroeconomic variables and constructed impulse response functions with a recursive ordering identification scheme. Finally, I fitted a sign-restricted BVAR with dummy observations.

Submitted 14 August, 2021; originally announced August 2021.

Comments: 21 figures and 16 tables

arXiv:2108.03697 [pdf, other]

Alignment of Tractography Streamlines using Deformation Transfer via Parallel Transport

Authors: Andrew Lizarraga, David Lee, Antoni Kubicki, Ashish Sahib, Elvis Nunez, Katherine Narr, Shantanu H. Joshi

Abstract: We present a geometric framework for aligning white matter fiber tracts. By registering fiber tracts between brains, one expects to see overlap of anatomical structures that often provide meaningful comparisons across subjects. However, the geometry of white matter tracts is highly heterogeneous, and finding direct tract-correspondence across multiple individuals remains a challenging problem. We present a novel deformation metric between tracts that allows one to compare tracts while simultaneously obtaining a registration. To accomplish this, fiber tracts are represented by an intrinsic mean along with the deformation fields represented by tangent vectors from the mean. In this setting, one can determine a parallel transport between tracts and then register corresponding tangent vectors. We present the results of bundle alignment on a population of 43 healthy adult subjects.

Submitted 8 August, 2021; originally announced August 2021.

arXiv:2107.05578 [pdf, other]

doi 10.1016/j.rse.2021.112574

Impact of Scene-Specific Enhancement Spectra on Matched Filter Greenhouse Gas Retrievals from Imaging Spectroscopy

Authors: Markus D. Foote, Philip E. Dennison, Patrick R. Sullivan, Kelly B. O'Neill, Andrew K. Thorpe, David R. Thompson, Daniel H. Cusworth, Riley Duren, Sarang C. Joshi

Abstract: Matched filter (MF) techniques have been widely used for retrieval of greenhouse gas enhancements (enh.) from imaging spectroscopy datasets. While multiple algorithmic techniques and refinements have been proposed, the greenhouse gas target spectrum used for concentration enh. estimation has remained largely unaltered since the introduction of quantitative MF retrievals. The magnitude of retrieved methane and carbon dioxide enh., and thereby integrated mass enh. (IME) and estimated flux of point-source emitters, is heavily dependent on this target spectrum. Current standard use of molecular absorption coefficients to create unit enh. target spectra does not account for absorption by background concentrations of greenhouse gases, solar and sensor geometry, or atmospheric water vapor absorption. We introduce geometric and atmospheric parameters into the generation of scene-specific (SS) unit enh. spectra to provide target spectra that are compatible with all greenhouse gas retrieval MF techniques. For methane plumes, IME resulting from use of standard, generic enh. spectra varied from -22 to +28.7% compared to SS enh. spectra. Due to differences in spectral shape between the generic and SS enh. spectra, differences in methane plume IME were linked to surface spectral characteristics in addition to geometric and atmospheric parameters. IME differences for carbon dioxide plumes, with generic enh. spectra producing integrated mass enh. -76.1 to -48.1% compared to SS enh. spectra. Fluxes calculated from these integrated enh. would vary by the same %s, assuming equivalent wind conditions. Methane and carbon dioxide IME were most sensitive to changes in solar zenith angle and ground elevation. SS target spectra can improve confidence in greenhouse gas retrievals and flux estimates across collections of scenes with diverse geometric and atmospheric conditions.

Submitted 10 August, 2021; v1 submitted 25 June, 2021; originally announced July 2021.

Comments: 13 pages, 5 figures, 3 tables

Journal ref: Remote Sensing of Environment, Volume 264, October 2021, 112574

arXiv:2106.03431 [pdf, other]

Bridge Simulation and Metric Estimation on Lie Groups

Authors: Mathias Højgaard Jensen, Sarang Joshi, Stefan Sommer

Abstract: We present a simulation scheme for simulating Brownian bridges on complete and connected Lie groups. We show how this simulation scheme leads to absolute continuity of the Brownian bridge measure with respect to the guided process measure. This result generalizes the Euclidean result of Delyon and Hu to Lie groups. We present numerical results of the guided process in the Lie group $\mathrm{SO}(3)$. In particular, we apply importance sampling to estimate the metric on $\mathrm{SO}(3)$ using an iterative maximum likelihood method.

Submitted 7 June, 2021; originally announced June 2021.

arXiv:2103.15933 [pdf, other]

Learning Under Adversarial and Interventional Shifts

Authors: Harvineet Singh, Shalmali Joshi, Finale Doshi-Velez, Himabindu Lakkaraju

Abstract: Machine learning models are often trained on data from one distribution and deployed on others. So it becomes important to design models that are robust to distribution shifts. Most of the existing work focuses on optimizing for either adversarial shifts or interventional shifts. Adversarial methods lack expressivity in representing plausible shifts as they consider shifts to joint distributions in the data. Interventional methods allow more expressivity but provide robustness to unbounded shifts, resulting in overly conservative models. In this work, we combine the complementary strengths of the two approaches and propose a new formulation, RISe, for designing robust models against a set of distribution shifts that are at the intersection of adversarial and interventional shifts. We employ the distributionally robust optimization framework to optimize the resulting objective in both supervised and reinforcement learning settings. Extensive experimentation with synthetic and real world datasets from healthcare demonstrates the efficacy of the proposed approach.

Submitted 29 March, 2021; originally announced March 2021.

Comments: 19 pages including 5 pages appendix, 6 figures, 2 tables. Preliminary version presented at Causal Discovery & Causality-Inspired Machine Learning Workshop 2020

arXiv:2011.08753 [pdf, other]

Confounding Feature Acquisition for Causal Effect Estimation

Authors: Shirly Wang, Seung Eun Yi, Shalmali Joshi, Marzyeh Ghassemi

Abstract: Reliable treatment effect estimation from observational data depends on the availability of all confounding information. While much work has targeted treatment effect estimation from observational data, there is relatively little work in the setting of confounding variable missingness, where collecting more information on confounders is often costly or time-consuming. In this work, we frame this challenge as a problem of feature acquisition of confounding features for causal inference. Our goal is to prioritize acquiring values for a fixed and known subset of missing confounders in samples that lead to efficient average treatment effect estimation. We propose two acquisition strategies based on i) covariate balancing (CB), and ii) reducing statistical estimation error on observed factual outcome error (OE). We compare CB and OE on five common causal effect estimation methods, and demonstrate improved sample efficiency of OE over baseline methods under various settings. We also provide visualizations for further analysis on the difference between our proposed methods.

Submitted 17 November, 2020; originally announced November 2020.

arXiv:2010.07093 [pdf, other]

Function Contrastive Learning of Transferable Meta-Representations

Authors: Muhammad Waleed Gondal, Shruti Joshi, Nasim Rahaman, Stefan Bauer, Manuel Wüthrich, Bernhard Schölkopf

Abstract: Meta-learning algorithms adapt quickly to new tasks that are drawn from the same task distribution as the training tasks. The mechanism leading to fast adaptation is the conditioning of a downstream predictive model on the inferred representation of the task's underlying data generative process, or \emph{function}. This \emph{meta-representation}, which is computed from a few observed examples of the underlying function, is learned jointly with the predictive model. In this work, we study the implications of this joint training on the transferability of the meta-representations. Our goal is to learn meta-representations that are robust to noise in the data and facilitate solving a wide range of downstream tasks that share the same underlying functions. To this end, we propose a decoupled encoder-decoder approach to supervised meta-learning, where the encoder is trained with a contrastive objective to find a good representation of the underlying function. In particular, our training scheme is driven by the self-supervision signal indicating whether two sets of examples stem from the same function. Our experiments on a number of synthetic and real-world datasets show that the representations we obtain outperform strong baselines in terms of downstream performance and noise robustness, even when these baselines are trained in an end-to-end manner.

Submitted 22 July, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: ICML 2021

arXiv:2009.11087 [pdf, other]

Probabilistic Machine Learning for Healthcare

Authors: Irene Y. Chen, Shalmali Joshi, Marzyeh Ghassemi, Rajesh Ranganath

Abstract: Machine learning can be used to make sense of healthcare data. Probabilistic machine learning models help provide a complete picture of observed data in healthcare. In this review, we examine how probabilistic machine learning can advance healthcare. We consider challenges in the predictive model building pipeline where probabilistic models can be beneficial including calibration and missing data. Beyond predictive models, we also investigate the utility of probabilistic machine learning models in phenotyping, in generative models for clinical use cases, and in reinforcement learning.

Submitted 23 September, 2020; originally announced September 2020.

Comments: Annual Reviews of Biomedical Data Science 2021

arXiv:2007.09028 [pdf, other]

Sequential Explanations with Mental Model-Based Policies

Authors: Arnold YS Yeung, Shalmali Joshi, Joseph Jay Williams, Frank Rudzicz

Abstract: The act of explaining across two parties is a feedback loop, where one provides information on what needs to be explained and the other provides an explanation relevant to this information. We apply a reinforcement learning framework which emulates this format by providing explanations based on the explainee's current mental model. We conduct novel online human experiments where explanations generated by various explanation methods are selected and presented to participants, using policies which observe participants' mental models, in order to optimize an interpretability proxy. Our results suggest that mental model-based policies (anchored in our proposed state representation) may increase interpretability over multiple sequential explanations, when compared to a random selection baseline. This work provides insight into how to select explanations which increase relevant information for users, and into conducting human-grounded experimentation to understand interpretability.

Submitted 17 July, 2020; originally announced July 2020.

Comments: Accepted into ICML 2020 Workshop on Human Interpretability in Machine Learning (Spotlight)

arXiv:2006.11654 [pdf, other]

Counterfactually Guided Off-policy Transfer in Clinical Settings

Authors: Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi

Abstract: Domain shift, encountered when using a trained model for a new patient population, creates significant challenges for sequential decision making in healthcare since the target domain may be both data-scarce and confounded. In this paper, we propose a method for off-policy transfer by modeling the underlying generative process with a causal mechanism. We use informative priors from the source domain to augment counterfactual trajectories in the target in a principled manner. We demonstrate how this addresses data-scarcity in the presence of unobserved confounding. The causal parametrization of our sampling procedure guarantees that counterfactual quantities can be estimated from scarce observational target data, maintaining intuitive stability properties. Policy learning in the target domain is further regularized via the source policy through KL-divergence. Through evaluation on a simulated sepsis treatment task, our counterfactual policy transfer procedure significantly improves the performance of a learned treatment policy when assumptions of "no-unobserved confounding" are relaxed.

Submitted 16 March, 2022; v1 submitted 20 June, 2020; originally announced June 2020.

Comments: Will be published at the Conference for Health, Inference and Learning (CHIL) 2022, Camera Ready Version

arXiv:2005.00447 [pdf, other]

Image fusion using symmetric skip autoencoder via an Adversarial Regulariser

Authors: Snigdha Bhagat, S. D. Joshi, Brejesh Lall

Abstract: It is a challenging task to extract the best of both worlds by combining the spatial characteristics of a visible image and the spectral content of an infrared image. In this work, we propose a spatially constrained adversarial autoencoder that extracts deep features from the infrared and visible images to obtain a more exhaustive and global representation. In this paper, we propose a residual autoencoder architecture, regularised by a residual adversarial network, to generate a more realistic fused image. The residual module serves as primary building for the encoder, decoder and adversarial network, as an add on the symmetric skip connections perform the functionality of embedding the spatial characteristics directly from the initial layers of encoder structure to the decoder part of the network. The spectral information in the infrared image is incorporated by adding the feature maps over several layers in the encoder part of the fusion structure, which makes inference on both the visual and infrared images separately. In order to efficiently optimize the parameters of the network, we propose an adversarial regulariser network which would perform supervised learning on the fused image and the original visual image.

Submitted 4 June, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

arXiv:2003.11639 [pdf, other]

Memory Organization for Energy-Efficient Learning and Inference in Digital Neuromorphic Accelerators

Authors: Clemens JS Schaefer, Patrick Faley, Emre O Neftci, Siddharth Joshi

Abstract: The energy efficiency of neuromorphic hardware is greatly affected by the energy of storing, accessing, and updating synaptic parameters. Various methods of memory organisation targeting energy-efficient digital accelerators have been investigated in the past, however, they do not completely encapsulate the energy costs at a system level. To address this shortcoming and to account for various overheads, we synthesize the controller and memory for different encoding schemes and extract the energy costs from these synthesized blocks. Additionally, we introduce functional encoding for structured connectivity such as the connectivity in convolutional layers. Functional encoding offers a 58% reduction in the energy to implement a backward pass and weight update in such layers compared to existing index-based solutions. We show that for a 2 layer spiking neural network trained to retain a spatio-temporal pattern, bitmap (PB-BMP) based organization can encode the sparser networks more efficiently. This form of encoding delivers a 1.37x improvement in energy efficiency coming at the cost of a 4% degradation in network retention accuracy as measured by the van Rossum distance.

Submitted 5 March, 2020; originally announced March 2020.

Comments: submitted to ISCAS2020

arXiv:2003.02978 [pdf, other]

doi 10.1109/TGRS.2020.2976888

Fast and Accurate Retrieval of Methane Concentration from Imaging Spectrometer Data Using Sparsity Prior

Authors: Markus D. Foote, Philip E. Dennison, Andrew K. Thorpe, David R. Thompson, Siraput Jongaramrungruang, Christian Frankenberg, Sarang C. Joshi

Abstract: The strong radiative forcing by atmospheric methane has stimulated interest in identifying natural and anthropogenic sources of this potent greenhouse gas. Point sources are important targets for quantification, and anthropogenic targets have potential for emissions reduction. Methane point source plume detection and concentration retrieval have been previously demonstrated using data from the Airborne Visible InfraRed Imaging Spectrometer Next Generation (AVIRIS-NG). Current quantitative methods have tradeoffs between computational requirements and retrieval accuracy, creating obstacles for processing real-time data or large datasets from flight campaigns. We present a new computationally efficient algorithm that applies sparsity and an albedo correction to matched filter retrieval of trace gas concentration-pathlength. The new algorithm was tested using AVIRIS-NG data acquired over several point source plumes in Ahmedabad, India. The algorithm was validated using simulated AVIRIS-NG data including synthetic plumes of known methane concentration. Sparsity and albedo correction together reduced the root mean squared error of retrieved methane concentration-pathlength enhancement by 60.7% compared with a previous robust matched filter method. Background noise was reduced by a factor of 2.64. The new algorithm was able to process the entire 300 flightline 2016 AVIRIS-NG India campaign in just over 8 hours on a desktop computer with GPU acceleration.

Submitted 5 March, 2020; originally announced March 2020.

Comments: 13 pages, 11 figures

Journal ref: IEEE Transactions on Geoscience and Remote Sensing, 2020, pp. 1-13

arXiv:2003.02821 [pdf, other]

What went wrong and when? Instance-wise Feature Importance for Time-series Models

Authors: Sana Tonekaboni, Shalmali Joshi, Kieran Campbell, David Duvenaud, Anna Goldenberg

Abstract: Explanations of time series models are useful for high stakes applications like healthcare but have received little attention in machine learning literature. We propose FIT, a framework that evaluates the importance of observations for a multivariate time-series black-box model by quantifying the shift in the predictive distribution over time. FIT defines the importance of an observation based on its contribution to the distributional shift under a KL-divergence that contrasts the predictive distribution against a counterfactual where the rest of the features are unobserved. We also demonstrate the need to control for time-dependent distribution shifts. We compare with state-of-the-art baselines on simulated and real-world clinical data and demonstrate that our approach is superior in identifying important time points and observations throughout the time series.

Submitted 28 October, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

arXiv:2002.02493 [pdf]

On Ridership and Frequency

Authors: Simon Berrebi, Sanskruti Joshi, Kari E Watkins

Abstract: Even before the start of the COVID-19 pandemic, bus ridership in the United States had attained its lowest level since 1973. If transit agencies hope to reverse this trend, they must understand how their service allocation policies affect ridership. This paper is among the first to model ridership trends on a hyper-local level over time. A Poisson fixed-effects model is developed to evaluate the ridership elasticity to frequency on weekdays using passenger count data from Portland, Miami, Minneapolis/St-Paul, and Atlanta between 2012 and 2018. In every agency, ridership is found to be elastic to frequency when observing the variation between individual route-segments at one point in time. In other words, the most frequent routes are already the most productive in terms of passengers per vehicle-trip. When observing the variation within each route-segment over time, however, ridership is inelastic; each additional vehicle-trip is expected to generate less ridership than the average bus already on the route. In three of the four agencies, the elasticity is a decreasing function of prior frequency, meaning that low-frequency routes are the most sensitive to changes in frequency. This paper can help transit agencies anticipate the marginal effect of shifting service throughout the network. As the quality and availability of passenger count data improve, this paper can serve as the methodological basis to explore the dynamics of bus ridership.

Submitted 9 April, 2021; v1 submitted 6 February, 2020; originally announced February 2020.

arXiv:1910.10769 [pdf, other]

doi 10.1109/TBME.2020.3024826

Learning Multiparametric Biomarkers for Assessing MR-Guided Focused Ultrasound Treatment of Malignant Tumors

Authors: Blake E. Zimmerman, Sara Johnson, Henrik Odéen, Jill Shea, Markus D. Foote, Nicole Winkler, Sarang C. Joshi, Allison Payne

Abstract: Noninvasive MR-guided focused ultrasound (MRgFUS) treatments are promising alternatives to the surgical removal of malignant tumors. A significant challenge is assessing the viability of treated tissue during and immediately after MRgFUS procedures. Current clinical assessment uses the nonperfused volume (NPV) biomarker immediately after treatment from contrast-enhanced MRI. The NPV has variable accuracy, and the use of contrast agent prevents continuing MRgFUS treatment if tumor coverage is inadequate. This work presents a novel, noncontrast, learned multiparametric MR biomarker that can be used during treatment for intratreatment assessment, validated in a VX2 rabbit tumor model. A deep convolutional neural network was trained on noncontrast multiparametric MR images using the NPV biomarker from follow-up MR imaging (3-5 days after MRgFUS treatment) as the accurate label of nonviable tissue. A novel volume-conserving registration algorithm yielded a voxel-wise correlation between treatment and follow-up NPV, providing a rigorous validation of the biomarker. The learned noncontrast multiparametric MR biomarker predicted the follow-up NPV with an average DICE coefficient of 0.71, substantially outperforming the current clinical standard (DICE coefficient = 0.53). Noncontrast multiparametric MR imaging integrated with a deep convolutional neural network provides a more accurate prediction of MRgFUS treatment outcome than current contrast-based techniques.

Submitted 29 September, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

Comments: 11 pages, 12 figures

arXiv:1909.11841 [pdf, ps, other]

doi 10.1007/978-3-319-67675-3_16

Rank Constrained Diffeomorphic Density Motion Estimation for Respiratory Correlated Computed Tomography

Authors: Markus D. Foote, Pouya Sabouri, Amit Sawant, Sarang C. Joshi

Abstract: Motion estimation of organs in a sequence of images is important in numerous medical imaging applications. The focus of this paper is the analysis of 4D Respiratory Correlated Computed Tomography (RCCT) Imaging. It is hypothesized that the quasi-periodic breathing induced motion of organs in the thorax can be represented by deformations spanning a very low dimension subspace of the full infinite dimensional space of diffeomorphic transformations. This paper presents a novel motion estimation algorithm that includes the constraint for low-rank motion between the different phases of the RCCT images. Low-rank deformation solutions are necessary for the efficient statistical analysis and improved treatment planning and delivery. Although the application focus of this paper is RCCT the algorithm is quite general and applicable to various motion estimation problems in medical imaging.

Submitted 25 September, 2019; originally announced September 2019.

Journal ref: In: MFCA 2017. Lecture Notes in Computer Science, vol 10551. Springer, Cham

arXiv:1908.07064 [pdf, other]

Domain-Independent turn-level Dialogue Quality Evaluation via User Satisfaction Estimation

Authors: Praveen Kumar Bodigutla, Longshaokan Wang, Kate Ridgeway, Joshua Levy, Swanand Joshi, Alborz Geramifard, Spyros Matsoukas

Abstract: An automated metric to evaluate dialogue quality is vital for optimizing data driven dialogue management. The common approach of relying on explicit user feedback during a conversation is intrusive and sparse. Current models to estimate user satisfaction use limited feature sets and rely on annotation schemes with low inter-rater reliability, limiting generalizability to conversations spanning multiple domains. To address these gaps, we created a new Response Quality annotation scheme, based on which we developed turn-level User Satisfaction metric. We introduced five new domain-independent feature sets and experimented with six machine learning models to estimate the new satisfaction metric. Using Response Quality annotation scheme, across randomly sampled single and multi-turn conversations from 26 domains, we achieved high inter-annotator agreement (Spearman's rho 0.94). The Response Quality labels were highly correlated (0.76) with explicit turn-level user ratings. Gradient boosting regression achieved best correlation of ~0.79 between predicted and annotated user satisfaction labels. Multi Layer Perceptron and Gradient Boosting regression models generalized to an unseen domain better (linear correlation 0.67) than other models. Finally, our ablation study verified that our novel features significantly improved model performance.

Submitted 19 August, 2019; originally announced August 2019.

Comments: Implications of Deep Learning for Dialog Modeling - Special session at SIGdial 2019

arXiv:1907.09615 [pdf, other]

Towards Realistic Individual Recourse and Actionable Explanations in Black-Box Decision Making Systems

Authors: Shalmali Joshi, Oluwasanmi Koyejo, Warut Vijitbenjaronk, Been Kim, Joydeep Ghosh

Abstract: Machine learning based decision making systems are increasingly affecting humans. An individual can suffer an undesirable outcome under such decision making systems (e.g. denied credit) irrespective of whether the decision is fair or accurate. Individual recourse pertains to the problem of providing an actionable set of changes a person can undertake in order to improve their outcome. We propose a recourse algorithm that models the underlying data distribution or manifold. We then provide a mechanism to generate the smallest set of changes that will improve an individual's outcome. This mechanism can be easily used to provide recourse for any differentiable machine learning based decision making system. Further, the resulting algorithm is shown to be applicable to both supervised classification and causal decision making systems. Our work attempts to fill gaps in existing fairness literature that have primarily focused on discovering and/or algorithmically enforcing fairness constraints on decision making systems. This work also provides an alternative approach to generating counterfactual explanations.

Submitted 22 July, 2019; originally announced July 2019.

arXiv:1905.05134 [pdf, ps, other]

What Clinicians Want: Contextualizing Explainable Machine Learning for Clinical End Use

Authors: Sana Tonekaboni, Shalmali Joshi, Melissa D McCradden, Anna Goldenberg

Abstract: Translating machine learning (ML) models effectively to clinical practice requires establishing clinicians' trust. Explainability, or the ability of an ML model to justify its outcomes and assist clinicians in rationalizing the model prediction, has been generally understood to be critical to establishing trust. However, the field suffers from the lack of concrete definitions for usable explanations in different settings. To identify specific aspects of explainability that may catalyze building trust in ML models, we surveyed clinicians from two distinct acute care specialties (Intensive Care Unit and Emergency Department). We use their feedback to characterize when explainability helps to improve clinicians' trust in ML models. We further identify the classes of explanations that clinicians identified as most relevant and crucial for effective translation to clinical practice. Finally, we discern concrete metrics for rigorous evaluation of clinical explainability methods. By integrating perceptions of explainability between clinicians and ML researchers we hope to facilitate the endorsement and broader adoption and sustained use of ML systems in healthcare.

Submitted 7 August, 2019; v1 submitted 13 May, 2019; originally announced May 2019.

arXiv:1807.07752 [pdf]

doi 10.5120/ijca2018917319

Twitter Sentiment Analysis System

Authors: Shaunak Joshi, Deepali Deshpande

Abstract: Social media is increasingly used by humans to express their feelings and opinions in the form of short text messages. Detecting sentiments in the text has a wide range of applications including identifying anxiety or depression of individuals and measuring well-being or mood of a community. Sentiments can be expressed in many ways that can be seen such as facial expression and gestures, speech and by written text. Sentiment Analysis in text documents is essentially a content-based classification problem involving concepts from the domains of Natural Language Processing as well as Machine Learning. In this paper, sentiment recognition based on textual data and the techniques used in sentiment analysis are discussed.

Submitted 20 July, 2018; originally announced July 2018.

Comments: 5 pages

Journal ref: International Journal of Computer Applications (2018)

arXiv:1806.08867 [pdf, other]

xGEMs: Generating Examplars to Explain Black-Box Models

Authors: Shalmali Joshi, Oluwasanmi Koyejo, Been Kim, Joydeep Ghosh

Abstract: This work proposes xGEMs or manifold guided exemplars, a framework to understand black-box classifier behavior by exploring the landscape of the underlying data manifold as data points cross decision boundaries. To do so, we train an unsupervised implicit generative model -- treated as a proxy to the data manifold. We summarize black-box model behavior quantitatively by perturbing data samples along the manifold. We demonstrate xGEMs' ability to detect and quantify bias in model learning and also for understanding the changes in model behavior as training progresses.

Submitted 22 June, 2018; originally announced June 2018.

arXiv:1608.00704 [pdf, other]

Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization

Authors: Shalmali Joshi, Suriya Gunasekar, David Sontag, Joydeep Ghosh

Abstract: This work proposes a new algorithm for automated and simultaneous phenotyping of multiple co-occurring medical conditions, also referred as comorbidities, using clinical notes from the electronic health records (EHRs). A basic latent factor estimation technique of non-negative matrix factorization (NMF) is augmented with domain specific constraints to obtain sparse latent factors that are anchored to a fixed set of chronic conditions. The proposed anchoring mechanism ensures a one-to-one identifiable and interpretable mapping between the latent factors and the target comorbidities. Qualitative assessment of the empirical results by clinical experts suggests that the proposed model learns clinically interpretable phenotypes while being predictive of 30 day mortality. The proposed method can be readily adapted to any non-negative EHR data across various healthcare institutions.

Submitted 20 September, 2016; v1 submitted 2 August, 2016; originally announced August 2016.

Comments: Presented at 2016 Machine Learning and Healthcare Conference (MLHC 2016), Los Angeles, CA

arXiv:1512.07139 [pdf, other]

Weighted geometric distribution with a new characterisation of geometric distribution

Authors: Deepesh Bhati, Savitri Joshi

Abstract: In this paper, we introduce a new generalization of the geometric distribution, which can also be viewed as a discrete analogue of the weighted exponential distribution introduced by Gupta and Kundu (2009). We study some basic distributional properties such as moments, generating functions, and the hazard function, followed by different methods of estimation of the parameters. A characterization of the geometric distribution is also presented. Finally, we examine the model with real data sets.

Submitted 5 February, 2016; v1 submitted 22 December, 2015; originally announced December 2015.

Comments: 17 pages, 1 figure

arXiv:1503.06675 [pdf, ps, other]

doi 10.1098/rspa.2016.0871

The Fourier Decomposition Method for nonlinear and nonstationary time series analysis

Authors: Pushpendra Singh, Shiv Dutt Joshi, Rakesh Kumar Patney, Kaushik Saha

Abstract: For many decades, there has been a general perception in the literature that Fourier methods are not suitable for the analysis of nonlinear and nonstationary data. In this paper, we propose a Fourier Decomposition Method (FDM) and demonstrate its efficacy for the analysis of nonlinear (i.e. data generated by nonlinear systems) and nonstationary time series. The proposed FDM decomposes any data into a small number of 'Fourier intrinsic band functions' (FIBFs). The FDM presents a generalized Fourier expansion with variable amplitudes and frequencies of a time series by the Fourier method itself. We propose a zero-phase filter bank based multivariate FDM (MFDM) algorithm, derived from the FDM, for the analysis of multivariate nonlinear and nonstationary time series. We also present an algorithm to obtain cutoff frequencies for MFDM. The MFDM algorithm generates a finite number of band-limited multivariate FIBFs (MFIBFs). The MFDM preserves some intrinsic physical properties of the multivariate data, such as scale alignment, trend and instantaneous frequency. The proposed methods produce results in a time-frequency-energy distribution that reveals the intrinsic structures of the data. Simulations have been carried out and comparisons made with the Empirical Mode Decomposition (EMD) methods in the analysis of various simulated as well as real-life time series, and the results show that the proposed methods are powerful tools for analyzing and obtaining the time-frequency-energy representation of any data.

Submitted 31 August, 2015; v1 submitted 26 February, 2015; originally announced March 2015.

Comments: 14 Pages, 18 Figures

Journal ref: Proceedings of the Royal Society of London A; March 2017, Volume 473, issue 2199

arXiv:1302.5556 [pdf]

DCT and Eigenvectors of Covariance of 1st and 2nd order Discrete fractional Brownian motion

Authors: Anubha Gupta, ShivDutt Joshi

Abstract: This paper establishes a connection between the discrete cosine transform (DCT) and 1st and 2nd order discrete-time fractional Brownian motion processes. It is proved that the eigenvectors of the auto-covariance matrix of a 1st and 2nd order discrete-time fractional Brownian motion can be approximated by DCT basis vectors in the asymptotic sense. The perturbation of the eigenvectors from the DCT basis vectors is modeled using the analytic perturbation theory of linear operators.

Submitted 22 February, 2013; originally announced February 2013.

Comments: 27 pages, Submitted to transactions on information theory, January 2013

Showing 1–37 of 37 results for author: Joshi, S