Graph Representation Learning Strategies for Omics Data: A Case Study on Parkinson's Disease

Authors: Elisa Gómez de Lope, Saurabh Deshpande, Ramón Viñas Torné, Pietro Liò, Enrico Glaab, Stéphane P. A. Bordas

Abstract: Omics data analysis is crucial for studying complex diseases, but its high dimensionality and heterogeneity challenge classical statistical and machine learning methods. Graph neural networks have emerged as promising alternatives, yet the optimal strategies for their design and optimization in real-world biomedical challenges remain unclear. This study evaluates various graph representation learning models for case-control classification using high-throughput biological data from Parkinson's disease and control samples. We compare topologies derived from sample similarity networks and molecular interaction networks, including protein-protein and metabolite-metabolite interactions (PPI, MMI). Graph Convolutional Networks (GCNs), Chebyshev spectral graph convolution (ChebyNet), and Graph Attention Networks (GAT) are evaluated alongside advanced architectures like graph transformers and the graph U-net, and simpler models like the multilayer perceptron (MLP). These models are systematically applied to transcriptomics and metabolomics data independently. Our comparative analysis highlights the benefits and limitations of various architectures in extracting patterns from omics data, paving the way for more accurate and interpretable models in biomedical research.

Submitted 20 June, 2024; originally announced June 2024.

Comments: Submitted to Machine Learning in Computational Biology 2024 as an extended abstract, 2 pages + 1 appendix

arXiv:2405.03538 [pdf, other]

Adolescent sports participation and health in early adulthood: An observational study

Authors: Ajinkya H. Kokandakar, Yuzhou Lin, Steven Jin, Jordan Weiss, Amanda R. Rabinowitz, Reuben A. Buford May, Dylan Small, Sameer K. Deshpande

Abstract: We study the impact of teenage sports participation on early-adulthood health using longitudinal data from the National Study of Youth and Religion. We focus on two primary outcomes measured at ages 23--28 -- self-rated health and total score on the PHQ9 Patient Depression Questionnaire -- and control for several potential confounders related to demographics and family socioeconomic status. To probe the possibility that certain types of sports participation may have larger effects on health than others, we conduct a matched observational study at each level within a hierarchy of exposures. Our hierarchy ranges from broadly defined exposures (e.g., participation in any organized after-school activity) to narrow (e.g., participation in collision sports). We deployed an ordered testing approach that exploits the hierarchical relationships between our exposure definitions to perform our analyses while maintaining a fixed family-wise error rate. Compared to teenagers who did not participate in any after-school activities, those who participated in sports had statistically significantly better self-rated and mental health outcomes in early adulthood.

Submitted 6 May, 2024; originally announced May 2024.

Comments: The pre-analysis protocol for this study is available at arXiv:2211.02104
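The abstract above mentions an ordered testing approach that keeps the family-wise error rate fixed. As a rough illustration only, the sketch below shows the simplest procedure of that kind, fixed-sequence testing: hypotheses are tested in a pre-specified order at level alpha, and testing stops at the first non-rejection. This is an assumption about the general idea, not the authors' actual procedure, which exploits the exposure hierarchy in ways the abstract does not spell out. The function name and the p-values are hypothetical.

```python
def fixed_sequence_test(p_values, alpha=0.05):
    """Return indices of hypotheses rejected by fixed-sequence testing.

    p_values must already be in the pre-specified testing order
    (e.g. broadest exposure first). Each hypothesis is tested at the
    full level alpha, which controls the family-wise error rate
    because testing stops at the first failure to reject.
    """
    rejected = []
    for index, p_value in enumerate(p_values):
        if p_value > alpha:
            break  # stop: later hypotheses are never tested
        rejected.append(index)
    return rejected


# Hypothetical p-values, ordered from broad to narrow exposure definitions.
ordered_p_values = [0.01, 0.03, 0.20, 0.001]
rejected = fixed_sequence_test(ordered_p_values)
print(rejected)
```

Note that the fourth hypothesis is not rejected despite its small p-value, because the procedure stopped at the third. That is the price of testing every hypothesis at the full level alpha.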

arXiv:2405.02664 [pdf, other]

MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering

Authors: Roomani Srivastava, Suraj Prasad, Lipika Bhat, Sarvesh Deshpande, Barnali Das, Kshitij Jadhav

Abstract: A major roadblock in the seamless digitization of medical records remains the lack of interoperability of existing records. Extracting relevant medical information required for further treatment planning or even research is a time-consuming, labour-intensive task involving expenditure of valuable time of doctors. In this demo paper we present MedPromptExtract, an automated tool using a combination of semi-supervised learning, large language models, natural language processing and prompt engineering to convert unstructured medical records to structured data which is amenable to further analysis.

Submitted 6 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

Comments: 4 pages, 3 figures, pre-print submitted to CIKM 2024

arXiv:2404.01567 [pdf, ps, other]

PyCPL: The ESO Common Pipeline Library in Python v1.0

Authors: Mrunmayi S. Deshpande, Nuria P. F. Lorente, Anthony Horton, Brent Miszalski, Ralf Palsa, Lars Lundin, Anthony Heng, Aidan Farrell

Abstract: PyCPL provides full access to ESO's Common Pipeline Library (CPL) for astronomical data reduction within a Python environment. Not only does it offer a Python interface to the robust CPL library, but it also lets users and developers fully utilise the rest of the scientific Python ecosystem. We have written a C++ layer to CPL and with pybind11 (a third-party library) created a Pythonic API to CPL. Since CPL has been around for so long, it has been thoroughly tested and understood. In 2003 it was developed in C due to its efficiency and speed of execution. With the community however moving away from C/C++ programming and embracing Python for data processing tasks, there is a need to provide access to the CPL utilities within a Python environment. With the latest release, users can now install PyCPL to run existing CPL recipes (written in C) and access the results from Python. It also provides the ability to create new recipes in Python using the functionality provided by CPL.

Submitted 1 April, 2024; originally announced April 2024.

Comments: This paper was for a poster presented at ADASS XXXIII, poster number P923

arXiv:2403.17394 [pdf, other]

IQMDose3D: a software tool for reconstructing the dose in patient using patient planning CT images and the signals measured by IQM detector

Authors: Aitang Xing, Gary Goozee, Alison Gray, Vaughan Moutrie, Sankar Arumugam, Shrikant Deshpande, Anthony Espinoza, Vasilis Kondilis, Marjorie McDonald, Philip Vial

Abstract: The integral quality monitor (IQM) system compares the signal measured with a large volume chamber mounted to the linear accelerator's head to the signal calculated using the patient DICOM RT plan for patient-specific quality assurance (PSQA). A method was developed to reconstruct the dose in patients using the signal measured by the IQM chamber and patient planning CT images. A software tool named IQMDose3D was implemented to automate this procedure and integrated into the IQM-based PSQA workflow. IQMDose3D enables physicists to evaluate PSQA from a clinical perspective by comparing the delivered plan to the approved clinical plan in terms of the clinical goals and dose-volume histogram (DVH), in addition to the three-dimensional (3D) gamma map and gamma pass rate.

Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted by ICCR 2024 conference

arXiv:2403.17365 [pdf, other]

AutoMRISimQA: an automated system for daily quality control of a 3T MRI simulator

Authors: Aitang Xing, Gary Goozee, Gary Liney, Sankar Arumugam, Shrikant Deshpande, Anthony Espinoza, Alison Gray, Vasilis Kondilis, Doaa Elwadia, Robba Rai, Lois Holloway

Abstract: A software system named AutoMRISimQA was developed to monitor the daily performance of a wide-bore 3T MRI scanner which was designed and dedicated to radiotherapy simulation. The system can monitor the performance of the MRI simulator not only by using image quality indices such as signal-to-noise ratio (SNR), uniformity, ghosting and contrast but also by performing a quick quantitative check of geometric accuracy and of the external lasers. It was implemented into the daily clinical workflow in 2013 and has been used for more than 10 years. It was also seamlessly integrated with QAtrack, allowing continuous monitoring of the consistency of the MRI simulator's performance.

Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted by conference ICCR 2024

arXiv:2403.05749 [pdf, other]

Characterizing Flow Complexity in Transportation Networks using Graph Homology

Authors: Shashank A Deshpande, Hamsa Balakrishnan

Abstract: Series-parallel network topologies generally exhibit simplified dynamical behavior and avoid high combinatorial complexity. A comprehensive analysis of how flow complexity emerges with a graph's deviation from series-parallel topology is therefore of fundamental interest. We introduce the notion of a robust $k$-path on a directed acyclic graph, with increasing values of the length $k$ reflecting increasing deviations. We propose a graph homology with robust $k$-paths as the bases of its chain spaces. In this framework, the topological simplicity of series-parallel graphs translates into a triviality of higher-order chain spaces. We discuss a correspondence between the space of order-three chains and sites within the network that are susceptible to the Braess paradox, a well-known phenomenon in transportation networks. In this manner, we illustrate the utility of the proposed graph homology in systematically studying the complexity of flow networks.

Submitted 8 March, 2024; originally announced March 2024.

Comments: 7 pages, 3 figures, letter

arXiv:2402.16452 [pdf]

Multiscale Experiments and Predictive Modelling for Inverse Design and Failure Mitigation in Additively Manufactured Lattices

Authors: Mattia Utzeri, Marco Sasso, Vikram S. Deshpande, S. Kumar

Abstract: Additive manufacturing (AM) enables the development of high-performance architected cellular materials, emphasizing the growing importance of establishing programmable and predictable energy absorption capabilities. This study evaluates the impact of a precisely tuned fused filament fabrication (FFF) AM process on the energy absorption and failure characteristics of thermoplastic lattice materials through multiscale experiments and predictive modelling. Lattices with four distinct unit cell topologies and three varying relative densities are manufactured, and their in-plane mechanical response under quasi-static compression is measured. Macroscale testing and micro-CT imaging reveal relative density-dependent damage mechanisms and failure modes, prompting the development of a robust predictive modelling framework to capture process-induced performance variation and damage. For lower relative density lattices, an FE model based on the extended Drucker-Prager material model, incorporating Bridgman correction with crazing failure criteria, accurately captures the crushing response. As lattice density increases, interfacial damage along bead-bead interfaces becomes predominant, necessitating the enrichment of the model with a microscale cohesive zone model to capture interfacial debonding. All proposed models are validated, highlighting inter-bead damage as the primary factor limiting energy absorption performance in FFF-printed lattices.
Finally, the predictive modelling introduces an enhancement factor, providing a straightforward approach to assess the influence of the AM process on energy absorption performance, facilitating the inverse design of FFF-printed lattices. This approach enables a critical evaluation of how FFF processes can be improved to achieve the highest attainable performance and mitigate failures in architected cellular materials.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.13961 [pdf, other]

New directions in algebraic statistics: Three challenges from 2023

Authors: Yulia Alexandr, Miles Bakenhus, Mark Curiel, Sameer K. Deshpande, Elizabeth Gross, Yuqi Gu, Max Hill, Joseph Johnson, Bryson Kagy, Vishesh Karwa, Jiayi Li, Hanbaek Lyu, Sonja Petrović, Jose Israel Rodriguez

Abstract: In the last quarter of a century, algebraic statistics has established itself as an expanding field which uses multilinear algebra, commutative algebra, computational algebra, geometry, and combinatorics to tackle problems in mathematical statistics. These developments have found applications in a growing number of areas, including biology, neuroscience, economics, and social sciences. Naturally, new connections continue to be made with other areas of mathematics and statistics. This paper outlines three such connections: to statistical models used in educational testing, to a classification problem for a family of nonparametric regression models, and to phase transition phenomena under uniform sampling of contingency tables. We illustrate the motivating problems, each of which is for algebraic statistics a new direction, and demonstrate an enhancement of related methodologies.

Submitted 21 February, 2024; originally announced February 2024.

Comments: This research was performed while the authors were visiting the Institute for Mathematical and Statistical Innovation (IMSI), which is supported by the National Science Foundation (Grant No. DMS-1929348). We participated in the long program "Algebraic Statistics and Our Changing World"

MSC Class: 62R01

arXiv:2402.10338 [pdf, other]

Athermal granular creep in a quenched sandpile

Authors: Nakul S. Deshpande, Paulo E. Arratia, Douglas J. Jerolmack

Abstract: Creep is a generic descriptor of slow motions -- in the context of materials, it describes quasi-static deformation of a solid when subjected to stresses below the global yield, at which all rigidity collapses and the material flows. Here, we experimentally investigate creep, flow, and the transition between the two states in a granular heap flow. Within the surface flowing layer the dimensionless strain rate diminishes with depth, there is an absence of spatial correlations, and there is no aging dynamics. Beneath this layer, the bulk creeps via localized avalanches of plasticity, and there is significant aging. The transition between fast surface flow and slow bulk creep and aging is observed to be in the vicinity of a critical inertial number of $I = 10^{-5}$. Surprisingly, at the cessation of surface flow and the `quenching' of the pile, creep persists in the absence of the flowing layer; albeit with significant differences for a pile that experiences a long duration of surface flow (strongly annealed) and one where flow during preparation does not last long (weakly annealed). Our results contribute to an emerging view of athermal granular creep, showing similarities across dry and submerged systems. Quenched quiescent heaps that creep indefinitely, however, present a challenge to granular rheology, and open new possibilities for interpreting and casting creep and deformation of soils in nature.

Submitted 15 February, 2024; originally announced February 2024.
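The abstract above places the flow-to-creep transition near an inertial number of $I = 10^{-5}$. As a reading aid, here is a minimal sketch assuming the commonly used definition for dry granular flow, I = (shear rate × grain diameter) / sqrt(pressure / grain density). The paper may use a different or locally evaluated form, and every numeric value below is hypothetical rather than taken from the study.

```python
import math

# Transition value quoted in the abstract.
CRITICAL_INERTIAL_NUMBER = 1e-5


def inertial_number(shear_rate, grain_diameter, pressure, grain_density):
    """Dimensionless inertial number under the assumed standard definition.

    shear_rate     : local strain rate, 1/s
    grain_diameter : m
    pressure       : confining pressure, Pa
    grain_density  : kg/m^3
    """
    return shear_rate * grain_diameter / math.sqrt(pressure / grain_density)


def regime(inertial):
    """Label a state as 'flow' or 'creep' relative to the quoted threshold."""
    return "flow" if inertial >= CRITICAL_INERTIAL_NUMBER else "creep"


# Hypothetical values: slow shear of millimetre-scale grains at 100 Pa.
example = inertial_number(
    shear_rate=1e-3, grain_diameter=1e-3, pressure=100.0, grain_density=2500.0
)
print(example, regime(example))
```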

arXiv:2401.16914 [pdf, other]

Energy-conserving equivariant GNN for elasticity of lattice architected metamaterials

Authors: Ivan Grega, Ilyes Batatia, Gábor Csányi, Sri Karlapati, Vikram S. Deshpande

Abstract: Lattices are architected metamaterials whose properties strongly depend on their geometrical design. The analogy between lattices and graphs enables the use of graph neural networks (GNNs) as a faster surrogate model compared to traditional methods such as finite element modelling. In this work, we generate a large dataset of structure-property relationships for strut-based lattices. The dataset is made available to the community, which can fuel the development of methods anchored in physical principles for the fitting of fourth-order tensors. In addition, we present a higher-order GNN model trained on this dataset. The key features of the model are (i) SE(3) equivariance, and (ii) consistency with the thermodynamic law of conservation of energy. We compare the model to non-equivariant models based on a number of error metrics and demonstrate its benefits in terms of predictive performance and reduced training requirements. Finally, we demonstrate an example application of the model to an architected material design task. The methods which we developed are applicable to fourth-order tensors beyond elasticity, such as the piezo-optical tensor.

Submitted 20 March, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

Comments: International Conference on Learning Representations 2024

arXiv:2312.16994 [pdf]

3D observations discover a new paradigm in rubber elasticity

Authors: Zifan Wang, Shuvrangsu Das, Akshay Joshi, Angkur Jyoti Dipanka Shaikeea, Vikram Sudhir Deshpande

Abstract: The mechanical response of rubbers has been ubiquitously assumed to be only a function of the imposed strain. Using innovative X-ray measurements capturing the three-dimensional spatial volumetric strain fields, we demonstrate that rubbers and indeed many common engineering polymers, undergo significant local volume changes. But remarkably the overall specimen volume remains constant regardless of the imposed loading. This strange behaviour which also leads to apparent negative local bulk moduli is due to the presence of a mobile phase within these materials. Combining X-ray tomographic observations with high-speed radiography to track the motion of the mobile phase we have revised classical thermodynamic frameworks of rubber elasticity. The work opens new avenues to understand not only the mechanical behaviour of rubbers but a large class of widely used engineering polymers.

Submitted 28 December, 2023; originally announced December 2023.

Comments: 10 pages of main text, 4 main figures, 4 extended data figures

arXiv:2311.17375 [pdf, other]

Drone Delivery Optimization

Authors: Saayuj Deshpande, Purushotham Mani

Abstract: This research has addressed three critical challenges inherent in the implementation of drone delivery systems, namely, optimizing battery charging station placement, solving the shortest path problem for drones within their single battery charge travel distance, and efficiently scheduling multiple drones across numerous warehouses and delivery locations with diverse demands. The study has leveraged a 2D grid model with obstacles, providing a practical foundation extendable to a 3D grid for accommodating complex structures. For battery station placement, the Miller-Tucker-Zemlin subtour elimination method has been applied to avoid the formation of charging station clusters. Future research directions involve the integration of these cases into a holistic solution, exploration of three-dimensional space, and the pursuit of bi-level optimization considering the interdependence of battery station placement and shortest path determination. This study contributes to the emerging field of drone delivery systems by addressing key optimization challenges and paving the way for comprehensive, integrated solutions.

Submitted 29 November, 2023; originally announced November 2023.
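The abstract above describes a shortest-path problem on a 2D grid with obstacles, limited by the distance a drone can travel on one battery charge. The abstract does not say which algorithm the authors use, so the sketch below substitutes plain breadth-first search on a 4-connected grid with a step budget. The grid, the function name, and the `max_range` parameter are all hypothetical.

```python
from collections import deque


def shortest_path_length(grid, start, goal, max_range):
    """Fewest grid steps from start to goal, or None if out of range.

    grid      : list of rows; 0 = free cell, 1 = obstacle
    start/goal: (row, col) tuples
    max_range : maximum number of steps on a single battery charge
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (row, col), dist = queue.popleft()
        if (row, col) == goal:
            return dist
        if dist == max_range:
            continue  # battery exhausted: do not expand further
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (row + d_row, col + d_col)
            in_bounds = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if in_bounds and grid[nxt[0]][nxt[1]] == 0 and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical 3x3 map: a wall forces a detour around the right-hand side.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
within_range = shortest_path_length(grid, (0, 0), (2, 0), max_range=6)
out_of_range = shortest_path_length(grid, (0, 0), (2, 0), max_range=5)
print(within_range, out_of_range)
```

Breadth-first search is enough here because every move costs one step. With non-uniform costs such as wind or altitude changes, Dijkstra's algorithm or A* would be the usual replacement.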

arXiv:2309.03812 [pdf, other]

AnthroNet: Conditional Generation of Humans via Anthropometrics

Authors: Francesco Picetti, Shrinath Deshpande, Jonathan Leban, Soroosh Shahtalebi, Jay Patel, Peifeng Jing, Chunpu Wang, Charles Metze III, Cameron Sun, Cera Laidlaw, James Warren, Kathy Huynh, River Page, Jonathan Hogins, Adam Crespi, Sujoy Ganguly, Salehe Erfanian Ebadi

Abstract: We present a novel human body model formulated by an extensive set of anthropocentric measurements, which is capable of generating a wide range of human body shapes and poses. The proposed model enables direct modeling of specific human identities through a deep generative architecture, which can produce humans in any arbitrary pose. It is the first of its kind to have been trained end-to-end using only synthetically generated data, which not only provides highly accurate human mesh representations but also allows for precise anthropometry of the body. Moreover, using a highly diverse animation library, we articulated our synthetic humans' body and hands to maximize the diversity of the learnable priors for model training. Our model was trained on a dataset of $100k$ procedurally-generated posed human meshes and their corresponding anthropometric measurements. Our synthetic data generator can be used to generate millions of unique human identities and poses for non-commercial academic research purposes.

Submitted 7 September, 2023; originally announced September 2023.

Comments: AnthroNet's Unity data generator source code is available at: https://unity-technologies.github.io/AnthroNet/

arXiv:2308.07414 [pdf, other]

Votemandering: Strategies and Fairness in Political Redistricting

Authors: Sanyukta Deshpande, Ian G Ludden, Sheldon H Jacobson

Abstract: Gerrymandering, the deliberate manipulation of electoral district boundaries for political advantage, is a persistent issue in U.S. redistricting cycles. This paper introduces and analyzes a new phenomenon, 'votemandering'- a strategic blend of gerrymandering and targeted political campaigning, devised to gain more seats by circumventing fairness measures. It leverages accurate demographic and socio-political data to influence voter decisions, bolstered by advancements in technology and data analytics, and executes better-informed redistricting. Votemandering is established as a Mixed Integer Program (MIP) that performs fairness-constrained gerrymandering over multiple election rounds, via unit-specific variables for campaigns. To combat votemandering, we present a computationally efficient heuristic for creating and testing district maps that more robustly preserve voter preferences. We analyze the influence of various redistricting constraints and parameters on votemandering efficacy. We explore the interconnectedness of gerrymandering, substantial campaign budgets, and strategic campaigning, illustrating their collective potential to generate biased electoral maps. A Wisconsin State Senate redistricting case study substantiates our findings on real data, demonstrating how major parties can secure additional seats through votemandering. Our findings underscore the practical implications of these manipulations, stressing the need for informed policy and regulation to safeguard democratic processes.

Submitted 15 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

arXiv:2308.03897 [pdf, other]

Hardware Architecture for a Quantum Computer Trusted Execution Environment

Authors: Theodoros Trochatos, Chuanqi Xu, Sanjay Deshpande, Yao Lu, Yongshan Ding, Jakub Szefer

Abstract: The cloud-based environments in which today's and future quantum computers will operate, raise concerns about the security and privacy of user's intellectual property. Quantum circuits submitted to cloud-based quantum computer providers represent sensitive or proprietary algorithms developed by users that need protection. Further, input data is hard-coded into the circuits, and leakage of the circuits can expose users' data. To help protect users' circuits and data from possibly malicious quantum computer cloud providers, this work presented the first hardware architecture for a trusted execution environment for quantum computers. To protect the user's circuits and data, the quantum computer control pulses are obfuscated with decoy control pulses. While digital data can be encrypted, analog control pulses cannot and this paper proposed the novel decoy pulse approach to obfuscate the analog control pulses. The proposed decoy pulses can easily be added to the software by users. Meanwhile, the hardware components of the architecture proposed in this paper take care of eliminating, i.e. attenuating, the decoy pulses inside the superconducting quantum computer's dilution refrigerator before they reach the qubits. The hardware architecture also contains tamper-resistant features to protect the trusted hardware and users' information. The work leverages a new metric of variational distance to analyze the impact and scalability of hardware protection.
The variational distance of the circuits protected with our scheme, compared to unprotected circuits, is in the range of only $0.16$ to $0.26$. This work demonstrates that protection from possibly malicious cloud providers is feasible and all the hardware components needed for the proposed architecture are available today.

Submitted 7 August, 2023; originally announced August 2023.
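The abstract above reports a variational distance of 0.16 to 0.26 between protected and unprotected circuits. The sketch below assumes that this refers to the total variation distance between the measured output distributions of the two runs, i.e. half the sum of absolute probability differences. The paper's exact definition is not given in the abstract and may differ, for example by omitting the factor of one half. The measurement counts are hypothetical.

```python
def total_variation_distance(counts_a, counts_b):
    """Total variation distance between two empirical distributions.

    counts_a, counts_b : dicts mapping a measured bitstring to the
                         number of shots in which it was observed.
    Returns a value in [0, 1]; 0 means identical distributions.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    outcomes = set(counts_a) | set(counts_b)
    return 0.5 * sum(
        abs(counts_a.get(o, 0) / total_a - counts_b.get(o, 0) / total_b)
        for o in outcomes
    )


# Hypothetical shot counts for a two-qubit circuit run with and without decoys.
unprotected = {"00": 500, "11": 500}
protected = {"00": 400, "11": 400, "01": 100, "10": 100}
distance = total_variation_distance(unprotected, protected)
print(round(distance, 3))
```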

arXiv:2308.00162 [pdf, other]

Soft matter physics of the ground beneath our feet

Authors: Anne Voigtländer, Morgane Houssais, Karol A. Bacik, Ian C. Bourg, Justin C. Burton, Karen E. Daniels, Sujit S. Datta, Emanuela Del Gado, Nakul S. Deshpande, Olivier Devauchelle, Behrooz Ferdowsi, Rachel Glade, Lucas Goehring, Ian J. Hewitt, Douglas Jerolmack, Ruben Juanes, Arshad Kudrolli, Ching-Yao Lai, Wei Li, Claire Masteller, Kavinda Nissanka, Allan M. Rubin, Howard A. Stone, Jenny Suckale, Nathalie M. Vriend, et al. (2 additional authors not shown)

Abstract: Inspired by presentations by the authors during a workshop organized at the Princeton Center for Theoretical Science (PCTS) in January 2022, we present a perspective on some of the outstanding questions related to the "physics of the ground beneath our feet." These identified challenges are intrinsically shared with the field of Soft Matter but also have unique aspects when the natural environment is studied.

Submitted 31 July, 2023; originally announced August 2023.

Comments: Perspective Paper, 30 pages, 15 figures

arXiv:2307.01766 [pdf, other]

The Quantum Advantage in Binary Teams and the Coordination Dilemma: Part II

Authors: Shashank A. Deshpande, Ankur A. Kulkarni

Abstract: In our previous work, we have shown that the use of a quantum architecture in decentralised control allows access to a larger space of control strategies beyond what is classically implementable through common randomness, and can lead to an improvement in the cost -- a phenomenon we called the quantum advantage. In the previous part of this two part series, we showed, however, that not all decision problems admit such an advantage. We identified a decision-theoretic property of the cost called the `coordination dilemma' as a necessary condition for the quantum advantage to manifest. In this article, we investigate the impact on the quantum advantage of a scalar parameter that captures the extent of the coordination dilemma. We show that this parameter can be bounded within an open interval for the quantum advantage to exist, and for some classes, we precisely identify this range of values. This range is found to be determined by the information of the agents.

Submitted 4 July, 2023; originally announced July 2023.

Comments: 12 pages, journal

MSC Class: 93E20

arXiv:2307.01762 [pdf, other]

The Quantum Advantage in Binary Teams and the Coordination Dilemma: Part I

Authors: Shashank A. Deshpande, Ankur A. Kulkarni

Abstract: We have shown that entanglement assisted stochastic strategies allow access to strategic measures beyond the classically correlated measures accessible through passive common randomness, and thus attain a quantum advantage in decentralised control. In this two part series of articles, we investigate the decision theoretic origins of the quantum advantage within a broad superstructure of problem classes. Each class in our binary team superstructure corresponds to a parametric family of cost functions with a distinct algebraic structure. In this part, we identify the only problem classes that benefit from quantum strategies. We find that these cost structures admit a special decision-theoretic feature -- `the coordination dilemma'. Our analysis hence reveals some intuition towards the utility of non-local quantum correlations in decentralised control.

Submitted 4 July, 2023; originally announced July 2023.

Comments: 12 pages, 1 figure, journal

MSC Class: 93E20

arXiv:2306.07666 [pdf, other]

doi 10.1007/s00161-023-01237-5

Analysis of hydrogen diffusion in the three stage electro-permeation test

Authors: A. Raina, V. S. Deshpande, E. Martínez-Pañeda, N. A. Fleck

Abstract: The presence of hydrogen traps within a metallic alloy influences the rate of hydrogen diffusion. The electro-permeation (EP) test can be used to assess this: the permeation of hydrogen through a thin metallic sheet is measured by suitable control of hydrogen concentration on the front face and by recording the flux of hydrogen that exits the rear face. Additional insight is achieved by the more sophisticated three stage EP test: the concentration of free lattice hydrogen on the front face is set to an initial level, is then dropped to a lower intermediate value and is then restored to the initial level. The flux of hydrogen exiting the rear face is measured in all three stages of the test. In the present study, a transient analysis is performed of hydrogen permeation in a three stage EP test, assuming that lattice diffusion is accompanied by trapping and de-trapping. The sensitivity of the three stage EP response to the depth and density of hydrogen traps is quantified. A significant difference in permeation response can exist between the first and third stages of the EP test when the alloy contains a high number density of deep traps.

Submitted 13 June, 2023; originally announced June 2023.

arXiv:2306.00382 [pdf, other]

Calibrated and Conformal Propensity Scores for Causal Effect Estimation

Authors: Shachi Deshpande, Volodymyr Kuleshov

Abstract: Propensity scores are commonly used to estimate treatment effects from observational data. We argue that the probabilistic output of a learned propensity score model should be calibrated -- i.e., a predictive treatment probability of 90% should correspond to 90% of individuals being assigned the treatment group -- and we propose simple recalibration techniques to ensure this property. We prove that calibration is a necessary condition for unbiased treatment effect estimation when using popular inverse propensity weighted and doubly robust estimators. We derive error bounds on causal effect estimates that directly relate to the quality of uncertainties provided by the probabilistic propensity score model and show that calibration strictly improves this error bound while also avoiding extreme propensity weights. We demonstrate improved causal effect estimation with calibrated propensity scores in several tasks including high-dimensional image covariates and genome-wide association studies (GWASs). Calibrated propensity scores improve the speed of GWAS analysis by more than two-fold by enabling the use of simpler models that are faster to train.

Submitted 4 June, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: 23 pages, 3 figures

ACM Class: I.2.m
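
The calibration property described in the abstract above (a predicted treatment probability of 90% should match a 90% observed treatment rate) can be checked with a simple binned comparison, and the scores then feed a standard inverse propensity weighted (IPW) estimate. The sketch below illustrates only those two textbook pieces; it is not the authors' recalibration method, and the function names and toy data are assumptions.

```python
def calibration_table(propensities, treated, n_bins=5):
    """Per bin: (mean predicted propensity, observed treated fraction, bin size).

    A calibrated model has the first two entries roughly equal in every bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, t in zip(propensities, treated):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, t))
    table = []
    for members in bins:
        if members:
            mean_pred = sum(p for p, _ in members) / len(members)
            frac_treated = sum(t for _, t in members) / len(members)
            table.append((mean_pred, frac_treated, len(members)))
    return table


def ipw_ate(outcomes, treated, propensities):
    """Standard IPW estimate of the average treatment effect."""
    n = len(outcomes)
    treated_term = sum(y * t / e for y, t, e in zip(outcomes, treated, propensities)) / n
    control_term = sum(
        y * (1 - t) / (1 - e) for y, t, e in zip(outcomes, treated, propensities)
    ) / n
    return treated_term - control_term


# Toy data: ten units scored 0.9, nine of which were actually treated.
table = calibration_table([0.9] * 10, [1] * 9 + [0])
# Toy data: propensity 0.5 everywhere, outcome 1 if treated and 0 otherwise.
ate = ipw_ate([1, 0, 1, 0], [1, 0, 1, 0], [0.5] * 4)
```

Propensities close to 0 or 1 make the IPW weights very large, which is the "extreme propensity weights" problem the abstract mentions.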

arXiv:2305.05752 [pdf, other]

doi 10.1515/jqas-2023-0048

Evaluating plate discipline in Major League Baseball with Bayesian Additive Regression Trees

Authors: Ryan Yee, Sameer K. Deshpande

Abstract: We introduce a three-step framework to determine at which pitches Major League batters should swing. Unlike traditional plate discipline metrics, which implicitly assume that all batters should always swing at (resp. take) pitches inside (resp. outside) the strike zone, our approach explicitly accounts not only for the players and umpires involved in the pitch but also in-game contextual information like the number of outs, the count, baserunners, and score. We first fit flexible Bayesian nonparametric models to estimate (i) the probability that the pitch is called a strike if the batter takes the pitch; (ii) the probability that the batter makes contact if he swings; and (iii) the number of runs the batting team is expected to score following each pitch outcome (e.g. swing and miss, take a called strike, etc.). We then combine these intermediate estimates to determine whether swinging increases the batting team's run expectancy. Our approach enables natural uncertainty propagation so that we can not only determine the optimal swing/take decision but also quantify our confidence in that decision. We illustrate our framework using a case study of pitches faced by Mike Trout in 2019.

Submitted 20 September, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

arXiv:2305.05006 [pdf, other]

Synthesis of Annotated Colorectal Cancer Tissue Images from Gland Layout

Authors: Srijay Deshpande, Fayyaz Minhas, Nasir Rajpoot

Abstract: Generating realistic tissue images with annotations is a challenging task that is important in many computational histopathology applications. Synthetically generated images and annotations are valuable for training and evaluating algorithms in this domain. To address this, we propose an interactive framework generating pairs of realistic colorectal cancer histology images with corresponding glandular masks from glandular structure layouts. The framework accurately captures vital features like stroma, goblet cells, and glandular lumen. Users can control gland appearance by adjusting parameters such as the number of glands, their locations, and sizes. The generated images exhibit good Frechet Inception Distance (FID) scores compared to the state-of-the-art image-to-image translation model. Additionally, we demonstrate the utility of our synthetic annotations for evaluating gland segmentation algorithms. Furthermore, we present a methodology for constructing glandular masks using advanced deep generative models, such as latent diffusion models. These masks enable tissue image generation through a residual encoder-decoder network.

Submitted 4 April, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

arXiv:2304.06122 [pdf, other]

Analyzing ChatGPT's Aptitude in an Introductory Computer Engineering Course

Authors: Sanjay Deshpande, Jakub Szefer

Abstract: ChatGPT has recently gathered attention from the general public and academia as a tool that is able to generate plausible and human-sounding text answers to various questions. One potential use, or abuse, of ChatGPT is in answering various questions or even generating whole essays and research papers in an academic or classroom setting. While recent works have explored the use of ChatGPT in the context of humanities, business school, or medical school, this work explores how ChatGPT performs in the context of an introductory computer engineering course. This work assesses ChatGPT's aptitude in answering quizzes, homework, exam, and laboratory questions in an introductory-level computer engineering course. This work finds that ChatGPT can do well on questions asking about generic concepts. However, predictably, as a text-only tool, it cannot handle questions with diagrams or figures, nor can it generate diagrams and figures. Further, also clearly, the tool cannot do hands-on lab experiments, breadboard assembly, etc., but can generate plausible answers to some laboratory manual questions. One of the key observations presented in this work is that the ChatGPT tool could not be used to pass all components of the course. Nevertheless, it does well on quizzes and short-answer questions. On the other hand, plausible, human-sounding answers could confuse students when generating incorrect but still plausible answers.

Submitted 14 April, 2023; v1 submitted 13 March, 2023; originally announced April 2023.

Comments: 5 pages

arXiv:2304.05785 [pdf]

Non-Speckle-based DVC for Measuring Large Deformations in Homogeneous Solids using Laboratory X-ray CT

Authors: Zifan Wang, Akshay Joshi, Angkur Jyoti Dipanka Shaikeea, Vikram Susdhir Deshpande

Abstract: X-ray computed tomography (XCT) has become a reliable metrology tool for measuring internal flaws and other microstructural features in engineering materials. However, tracking of material points to measure three-dimensional (3D) deformations has hitherto relied on either artificially adding tracer particles (speckles) or exploiting inherent microstructural features such as inclusions. This has greatly limited the spatial resolution and magnitude of the deformation measurements. Here we report a novel Flux Enhanced Tomography for Correlation (FETC) technique that leverages the inherent inhomogeneities within nominally homogeneous engineering polymers to track 3D material point displacements without recourse to artificial speckles or microstructural features such as inclusions. The FETC is then combined with a Eulerian/Lagrangian transformation in a multi-step Digital Volume Correlation (DVC) methodology to measure all nine components of the deformation gradient within the volume of complex specimens undergoing extreme deformations. FETC is a powerful technique that greatly expands the capabilities of laboratory-based XCT to provide amongst other things the inputs required for data-driven constitutive modelling approaches.

Submitted 12 April, 2023; originally announced April 2023.

Comments: 20 pages, 7 figures

arXiv:2303.06274 [pdf]

CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

Authors: Simon Graham, Quoc Dang Vu, Mostafa Jahanifar, Martin Weigert, Uwe Schmidt, Wenhua Zhang, Jun Zhang, Sen Yang, Jinxi Xiang, Xiyue Wang, Josef Lorenz Rumberger, Elias Baumann, Peter Hirsch, Lihao Liu, Chenyang Hong, Angelica I. Aviles-Rivero, Ayushi Jain, Heeyoung Ahn, Yiyu Hong, Hussam Azzuni, Min Xu, Mohammad Yaqub, Marie-Claire Blache, Benoît Piégu, Bertrand Vernay, et al. (64 additional authors not shown)

Abstract: Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we set up a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of reproducible algorithms for cellular recognition with real-time result inspection on public leaderboards. We conducted an extensive post-challenge analysis based on the top-performing models using 1,658 whole-slide images of colon tissue. With around 700 million detected nuclei per model, associated features were used for dysplasia grading and survival analysis, where we demonstrated that the challenge's improvement over the previous state-of-the-art led to significant boosts in downstream performance. Our findings also suggest that eosinophils and neutrophils play an important role in the tumour microenvironment. We release challenge models and WSI-level results to foster the development of further methods for biomarker discovery.

Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

arXiv:2302.12196 [pdf, other]

Calibrated Regression Against An Adversary Without Regret

Authors: Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

Abstract: We are interested in probabilistic prediction in online settings in which data does not follow a probability distribution. Our work seeks to achieve two goals: (1) producing valid probabilities that accurately reflect model confidence; and (2) ensuring that traditional notions of performance (e.g., high accuracy) still hold. We introduce online algorithms guaranteed to achieve these goals on arbitrary streams of data points, including data chosen by an adversary. Specifically, our algorithms produce forecasts that are (1) calibrated -- i.e., an 80% confidence interval contains the true outcome 80% of the time -- and (2) have low regret relative to a user-specified baseline model. We implement a post-hoc recalibration strategy that provably achieves these goals in regression; previous algorithms applied to classification or achieved (1) but not (2). In the context of Bayesian optimization, an online model-based decision-making task in which the data distribution shifts over time, our method yields accelerated convergence to improved optima.

Submitted 4 June, 2024; v1 submitted 23 February, 2023; originally announced February 2023.

arXiv:2212.13780 [pdf, other]

SynCLay: Interactive Synthesis of Histology Images from Bespoke Cellular Layouts

Authors: Srijay Deshpande, Muhammad Dawood, Fayyaz Minhas, Nasir Rajpoot

Abstract: Automated synthesis of histology images has several potential applications in computational pathology. However, no existing method can generate realistic tissue images with a bespoke cellular layout or user-defined histology parameters. In this work, we propose a novel framework called SynCLay (Synthesis from Cellular Layouts) that can construct realistic and high-quality histology images from user-defined cellular layouts along with annotated cellular boundaries. Tissue image generation based on bespoke cellular layouts through the proposed framework allows users to generate different histological patterns from arbitrary topological arrangement of different types of cells. SynCLay-generated synthetic images can be helpful in studying the role of different types of cells present in the tumor microenvironment. Additionally, they can assist in balancing the distribution of cellular counts in tissue images for designing accurate cellular composition predictors by minimizing the effects of data imbalance. We train SynCLay in an adversarial manner and integrate a nuclear segmentation and classification model in its training to refine nuclear structures and generate nuclear masks in conjunction with synthetic images. During inference, we combine the model with another parametric model for generating colon images and associated cellular counts as annotations given the grade of differentiation and cell densities of different cells. We assess the generated images quantitatively and report on feedback from trained pathologists who assigned realism scores to a set of images generated by the framework. The average realism score across all pathologists for synthetic images was as high as that for the real images. We also show that augmenting limited real data with the synthetic data generated by our framework can significantly boost prediction performance of the cellular composition prediction task.

Submitted 28 December, 2022; originally announced December 2022.

arXiv:2212.01386 [pdf, other]

doi 10.3389/fmats.2023.1128954

Convolution, aggregation and attention based deep neural networks for accelerating simulations in mechanics

Authors: Saurabh Deshpande, Raúl I. Sosa, Stéphane P. A. Bordas, Jakub Lengiewicz

Abstract: Deep learning surrogate models are being increasingly used in accelerating scientific simulations as a replacement for costly conventional numerical techniques. However, their use remains a significant challenge when dealing with real-world complex examples. In this work, we demonstrate three types of neural network architectures for efficient learning of highly non-linear deformations of solid bodies. The first two architectures are based on the recently proposed CNN U-NET and MAgNET (graph U-NET) frameworks which have shown promising performance for learning on mesh-based data. The third architecture is Perceiver IO, a very recent architecture that belongs to the family of attention-based neural networks--a class that has revolutionised diverse engineering fields and is still unexplored in computational mechanics. We study and compare the performance of all three networks on two benchmark examples, and show their capabilities to accurately predict the non-linear mechanical responses of soft bodies.

Submitted 24 March, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

Journal ref: Front. Mater. 10:1128954

arXiv:2212.00219 [pdf, other]

Are you using test log-likelihood correctly?

Authors: Sameer K. Deshpande, Soumya Ghosh, Tin D. Nguyen, Tamara Broderick

Abstract: Test log-likelihood is commonly used to compare different models of the same data or different approximate inference algorithms for fitting the same probabilistic model. We present simple examples demonstrating how comparisons based on test log-likelihood can contradict comparisons according to other objectives. Specifically, our examples show that (i) approximate Bayesian inference algorithms that attain higher test log-likelihoods need not also yield more accurate posterior approximations and (ii) conclusions about forecast accuracy based on test log-likelihood comparisons may not agree with conclusions based on root mean squared error.

Submitted 18 January, 2024; v1 submitted 30 November, 2022; originally announced December 2022.

Comments: Presented at the ICBINB Workshop at NeurIPS 2022. This version accepted at TMLR, available at https://openreview.net/forum?id=n2YifD4Dxo

arXiv:2211.04459 [pdf, other]

flexBART: Flexible Bayesian regression trees with categorical predictors

Authors: Sameer K. Deshpande

Abstract: Most implementations of Bayesian additive regression trees (BART) one-hot encode categorical predictors, replacing each one with several binary indicators, one for every level or category. Regression trees built with these indicators partition the discrete set of categorical levels by repeatedly removing one level at a time. Unfortunately, the vast majority of partitions cannot be built with this strategy, severely limiting BART's ability to partially pool data across groups of levels. Motivated by analyses of baseball data and neighborhood-level crime dynamics, we overcame this limitation by re-implementing BART with regression trees that can assign multiple levels to both branches of a decision tree node. To model spatial data aggregated into small regions, we further proposed a new decision rule prior that creates spatially contiguous regions by deleting a random edge from a random spanning tree of a suitably defined network. Our re-implementation, which is available in the flexBART package, often yields improved out-of-sample predictive performance and scales better to larger datasets than existing implementations of BART.

Submitted 21 June, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: Software available at https://github.com/skdeshpande91/flexBART
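
The flexBART abstract above contrasts splits on one-hot indicators, which peel off one level at a time, with rules that send several levels down each branch. A small counting sketch for a single decision node makes the gap concrete: with K levels there are K one-versus-rest splits (for K > 2) but 2^(K-1) - 1 ways to divide the levels into two non-empty groups. This is only an illustration of the combinatorics, not code from the flexBART package, and the function names are hypothetical.

```python
from itertools import combinations


def two_way_splits(levels):
    """All ways to divide the levels into two non-empty groups (order ignored)."""
    levels = list(levels)
    first, rest = levels[0], levels[1:]
    splits = []
    # Pinning the first level to the left group avoids counting mirror images twice.
    for size in range(len(rest) + 1):
        for extra in combinations(rest, size):
            left = {first, *extra}
            right = set(levels) - left
            if right:
                splits.append((left, right))
    return splits


def one_vs_rest_splits(levels):
    """Splits reachable at a single node from a one-hot indicator."""
    levels = list(levels)
    seen, splits = set(), []
    for level in levels:
        left = frozenset([level])
        right = frozenset(levels) - left
        key = frozenset([left, right])  # same split regardless of branch order
        if right and key not in seen:
            seen.add(key)
            splits.append((set(left), set(right)))
    return splits


n_general = len(two_way_splits("abcd"))       # 2**3 - 1 two-way splits of 4 levels
n_one_hot = len(one_vs_rest_splits("abcd"))   # one split per level
```

The gap widens quickly: ten levels give 511 two-way splits but only 10 one-versus-rest splits.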

arXiv:2211.02104 [pdf, other]

Pre-analysis protocol for an observational study on the effects of adolescent sports participation on health in early adulthood

Authors: Ajinkya H Kokandakar, Yuzhou Lin, Steven Jin, Jordan Weiss, Amanda R Rabinowitz, Reuben A Buford May, Dylan Small, Sameer K Deshpande

Abstract: We will study the impact of adolescent sports participation on early-adulthood health using longitudinal data from the National Study of Youth and Religion. We focus on two primary outcomes measured at ages 23--28 -- self-rated health and total score on the PHQ9 Patient Depression Questionnaire -- and control for several potential confounders related to demographics and family socioeconomic status. Comparing outcomes between sports participants and matched non-sports participants with similar confounders is straightforward. Unfortunately, an analysis based on such a broad exposure cannot probe the possibility that participation in certain types of sports (e.g., collision sports like football or soccer) may have larger effects on health than others. In this study, we introduce a hierarchy of exposure definitions, ranging from broad (participation in any after-school organized activity) to narrow (e.g., participation in limited-contact sports). We will perform separate matched observational studies, one for each definition, to estimate the health effects of several levels of sports participation. In order to conduct these studies while maintaining a fixed family-wise error rate, we deployed an ordered testing approach that exploits the logical relationships between exposure definitions. Our study will also consider several secondary outcomes including body mass index, life satisfaction, and problematic drinking behavior.

Submitted 30 November, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

arXiv:2211.02020 [pdf, other]

Bayesian Causal Forests & the 2022 ACIC Data Challenge: Scalability and Sensitivity

Authors: Ajinkya H. Kokandakar, Hyunseung Kang, Sameer K. Deshpande

Abstract: We demonstrate how Hahn et al.'s Bayesian Causal Forests model (BCF) can be used to estimate conditional average treatment effects for the longitudinal dataset in the 2022 American Causal Inference Conference Data Challenge. Unfortunately, existing implementations of BCF do not scale to the size of the challenge data. Therefore, we developed flexBCF -- a more scalable and flexible implementation of BCF -- and used it in our challenge submission. We investigate the sensitivity of our results to the choice of propensity score estimation method and the use of sparsity-inducing regression tree priors. While we found that our overall point predictions were not especially sensitive to these modeling choices, we did observe that running BCF with flexibly estimated propensity scores often yielded better-calibrated uncertainty intervals.

Submitted 11 May, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

Journal ref: Observational Studies 9(3), 29-41 (2023). https://www.muse.jhu.edu/article/895651

arXiv:2211.00713 [pdf, other]

doi 10.1016/j.engappai.2024.108055

MAgNET: A Graph U-Net Architecture for Mesh-Based Simulations

Authors: Saurabh Deshpande, Stéphane P. A. Bordas, Jakub Lengiewicz

Abstract: In many cutting-edge applications, high-fidelity computational models prove to be too slow for practical use and are therefore replaced by much faster surrogate models. Recently, deep learning techniques have increasingly been utilized to accelerate such predictions. To enable learning on large-dimensional and complex data, specific neural network architectures have been developed, including convolutional and graph neural networks. In this work, we present a novel encoder-decoder geometric deep learning framework called MAgNET, which extends the well-known convolutional neural networks to accommodate arbitrary graph-structured data. MAgNET consists of innovative Multichannel Aggregation (MAg) layers and graph pooling/unpooling layers, forming a graph U-Net architecture that is analogous to convolutional U-Nets. We demonstrate the predictive capabilities of MAgNET in surrogate modeling for non-linear finite element simulations in the mechanics of solids.

Submitted 2 April, 2024; v1 submitted 1 November, 2022; originally announced November 2022.

Journal ref: Engineering Applications of Artificial Intelligence, Volume 133, Part B, 2024, 108055

arXiv:2210.06724 [pdf, other]

doi 10.1515/jqas-2022-0116

A Bayesian analysis of the time through the order penalty in baseball

Authors: Ryan S. Brill, Sameer K. Deshpande, Abraham J. Wyner

Abstract: As a baseball game progresses, batters appear to perform better the more times they face a particular pitcher. The apparent drop-off in pitcher performance from one time through the order to the next, known as the Time Through the Order Penalty (TTOP), is often attributed to within-game batter learning. Although the TTOP has largely been accepted within baseball and influences many managers' in-game decision making, we argue that existing approaches of estimating the size of the TTOP cannot disentangle continuous evolution in pitcher performance over the course of the game from discontinuities between successive times through the order. Using a Bayesian multinomial regression model, we find that, after adjusting for confounders like batter and pitcher quality, handedness, and home field advantage, there is little evidence of strong discontinuity in pitcher performance between times through the order. Our analysis suggests that the start of the third time through the order should not be viewed as a special cutoff point in deciding whether to pull a starting pitcher.

Submitted 31 May, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

Comments: Accepted to JQAS

arXiv:2209.04389 [pdf, other]

Posterior contraction and uncertainty quantification for the multivariate spike-and-slab LASSO

Authors: Yunyi Shen, Sameer K. Deshpande

Abstract: We study the asymptotic properties of Deshpande et al. (2019)'s multivariate spike-and-slab LASSO (mSSL) procedure for simultaneous variable and covariance selection in the sparse multivariate linear regression problem. In that problem, $q$ correlated responses are regressed onto $p$ covariates and the mSSL works by placing separate spike-and-slab priors on the entries in the matrix of marginal covariate effects and off-diagonal elements in the upper triangle of the residual precision matrix. Under mild assumptions about these matrices, we establish the posterior contraction rate for the mSSL posterior in the asymptotic regime where both $p$ and $q$ diverge with $n$. By "de-biasing" the corresponding MAP estimates, we obtain confidence intervals for each covariate effect and residual partial correlation. In extensive simulation studies, these intervals displayed close-to-nominal frequentist coverage in finite sample settings but tended to be substantially longer than those obtained using a version of the Bayesian bootstrap that randomly re-weights the prior. We further show that the de-biased intervals for individual covariate effects are asymptotically valid.

Submitted 22 May, 2024; v1 submitted 9 September, 2022; originally announced September 2022.

arXiv:2207.12075 [pdf, other]

The Quantum Advantage in Decentralized Control

Authors: Shashank A. Deshpande, Ankur A. Kulkarni

Abstract: It is known in the context of decentralised control that there exist control strategies consistent with the requirements of a given information structure, yet physically unimplementable through any amount of passive common randomness. This imposes a natural set of limitations on what is achievable through common randomness in both cooperative and competitive settings. We show that it is possible to breach these limitations with the use of quantum-physical architectures. In particular, we present a class of stochastic strategies that leverage quantum entanglement to produce strategic distributions which compose a strict superclass of strategies implemented through passive common randomness. We numerically investigate the 'quantum advantage' offered by this new class over a parametric family of cooperative decision problems with static information structure. We demonstrate through variations across the parametric family that fundamental decision-theoretic elements such as information and the cost determine the manifestation of quantum advantage in a given control problem. Our work motivates a novel decision and control paradigm with an enlarged space of control policies achievable by means of quantum architectures.

Submitted 16 June, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

MSC Class: 91A12

arXiv:2207.07020 [pdf, other]

Estimating sparse direct effects in multivariate regression with the spike-and-slab LASSO

Authors: Yunyi Shen, Claudia Solís-Lemus, Sameer K. Deshpande

Abstract: The multivariate regression interpretation of the Gaussian chain graph model simultaneously parametrizes (i) the direct effects of $p$ predictors on $q$ outcomes and (ii) the residual partial covariances between pairs of outcomes. We introduce a new method for fitting sparse Gaussian chain graph models with spike-and-slab LASSO (SSL) priors. We develop an Expectation Conditional Maximization algorithm to obtain sparse estimates of the $p \times q$ matrix of direct effects and the $q \times q$ residual precision matrix. Our algorithm iteratively solves a sequence of penalized maximum likelihood problems with self-adaptive penalties that gradually filter out negligible regression coefficients and partial covariances. Because it adaptively penalizes individual model parameters, our method is seen to outperform fixed-penalty competitors on simulated data. We establish the posterior contraction rate for our model, buttressing our method's excellent empirical performance with strong theoretical guarantees. Using our method, we estimated the direct effects of diet and residence type on the composition of the gut microbiome of elderly adults.

Submitted 26 March, 2024; v1 submitted 14 July, 2022; originally announced July 2022.

arXiv:2207.05016 [pdf, other]

Capacity Management in a Pandemic with Endogenous Patient Choices and Flows

Authors: Sanyukta Deshpande, Lavanya Marla, Alan Scheller-Wolf, Siddharth Prakash Singh

Abstract: Motivated by the experiences of a healthcare service provider during the Covid-19 pandemic, we aim to study the decisions of a provider that operates both an Emergency Department (ED) and a medical Clinic. Patients contact the provider through a phone call or may present directly at the ED: patients can be COVID (suspected/confirmed) or non-COVID, and have different severities. Depending on the severity, patients who contact the provider may be directed to the ED (to be seen in a few hours), be offered an appointment at the Clinic (to be seen in a few days), or be treated via phone or telemedicine, avoiding a visit to a facility. All patients make joining decisions based on comparing their own risk perceptions versus their anticipated benefits: they then choose to enter a facility only if it is beneficial enough. Also, after initial contact, their severities may evolve, which may change their decision. The hospital system's objective is to allocate service capacity across facilities so as to minimize costs from patient deaths or defections. We model the system using a fluid approximation over multiple periods, possibly with different demand profiles. While the feasible space for this problem can be extremely complex, it is amenable to decomposition into different sub-regions that can be analyzed individually; the global optimal solution can be reached via provably parsimonious computational methods over a single period and over multiple periods with different demand rates.
Our analytical and computational results indicate that endogeneity results in non-trivial and non-intuitive capacity allocations that do not always prioritize high severity patients, for both single and multi-period settings.

Submitted 11 July, 2022; originally announced July 2022.

arXiv:2205.02932 [pdf, other]

doi 10.1109/IGARSS46834.2022.9883890

Understanding Urban Water Consumption using Remotely Sensed Data

Authors: Shaswat Mohanty, Anirudh Vijay, Shailesh Deshpande

Abstract: Urban metabolism is an active field of research that deals with the estimation of emissions and resource consumption from urban regions. The analysis could be carried out through a manual survey or by the implementation of elegant machine learning algorithms. In this exploratory work, we estimate the water consumption by the buildings in the region captured by satellite imagery. To this end, we break our analysis into three parts: i) identification of building pixels, given a satellite image, followed by ii) identification of the building type (residential/non-residential) from the building pixels, and finally iii) using the building pixels along with their type to estimate the water consumption using the average per unit area consumption for different building types as obtained from municipal surveys.

Submitted 5 January, 2023; v1 submitted 3 May, 2022; originally announced May 2022.

Comments: 4 pages, 2 figures, IEEE Conference Proceedings (IGARSS 2022)

arXiv:2204.08491 [pdf, other]

Active Learning Helps Pretrained Models Learn the Intended Task

Authors: Alex Tamkin, Dat Nguyen, Salil Deshpande, Jesse Mu, Noah Goodman

Abstract: Models can fail in unpredictable ways during deployment due to task ambiguity, when multiple behaviors are consistent with the provided training data. An example is an object classifier trained on red squares and blue circles: when encountering blue squares, the intended behavior is undefined. We investigate whether pretrained models are better active learners, capable of disambiguating between the possible tasks a user may be trying to specify. Intriguingly, we find that better active learning is an emergent property of the pretraining process: pretrained models require up to 5 times fewer labels when using uncertainty-based active learning, while non-pretrained models see no or even negative benefit. We find these gains come from an ability to select examples with attributes that disambiguate the intended behavior, such as rare product categories or atypical backgrounds. These attributes are far more linearly separable in the representation spaces of pretrained models than in those of non-pretrained models, suggesting a possible mechanism for this behavior.

Submitted 18 April, 2022; originally announced April 2022.

arXiv:2204.07788 [pdf, ps, other]

doi 10.1103/PhysRevA.105.063111

A simple, passive design for large optical trap arrays for single atoms

Authors: P. Huft, Y. Song, T. M. Graham, K. Jooya, S. Deshpande, C. Fang, M. Kats, M. Saffman

Abstract: We present an approach for trapping cold atoms in a 2D optical trap array generated with a novel 4f filtering scheme and custom transmission mask without any active device. The approach can be used to generate arrays of bright or dark traps, or both simultaneously with a single wavelength for forming two-species traps. We demonstrate the design by creating a 2D array of 1225 dark trap sites, where single Cs atoms are loaded into regions of near-zero intensity in an approximately Gaussian profile trap. Moreover, we demonstrate a simple solution to the problem of out-of-focus trapped atoms, which occurs due to the Talbot effect in periodic optical lattices. Using a high power yet low cost spectrally and spatially broadband laser, out-of-focus interference is mitigated, leading to near perfect removal of Talbot plane traps.

Submitted 19 June, 2022; v1 submitted 16 April, 2022; originally announced April 2022.

Comments: v3: minor revisions

arXiv:2203.09672 [pdf, other]

Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies

Authors: Shachi Deshpande, Kaiwen Wang, Dhruv Sreenivas, Zheng Li, Volodymyr Kuleshov

Abstract: Estimating the effect of an intervention from observational data while accounting for confounding variables is a key task in causal inference. Oftentimes, the confounders are unobserved, but we have access to large amounts of additional unstructured data (images, text) that contain valuable proxy signal about the missing confounders. This paper argues that leveraging this unstructured data can greatly improve the accuracy of causal effect estimation. Specifically, we introduce deep multi-modal structural equations, a generative model for causal effect estimation in which confounders are latent variables and unstructured data are proxy variables. This model supports multiple multi-modal proxies (images, text) as well as missing data. We empirically demonstrate that our approach outperforms existing methods based on propensity scores and corrects for confounding using unstructured inputs on tasks in genomics and healthcare. Our methods can potentially support the use of large amounts of data that were previously not used in causal inference.

Submitted 11 December, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

Comments: NeurIPS 2022 (accepted version)

arXiv:2203.02649 [pdf, other]

Towards an Antivirus for Quantum Computers

Authors: Sanjay Deshpande, Chuanqi Xu, Theodoros Trochatos, Yongshan Ding, Jakub Szefer

Abstract: Researchers are today exploring models for cloud-based usage of quantum computers where multi-tenancy can be used to share quantum computer hardware among multiple users. Multi-tenancy promises better utilization of the quantum computer hardware, but also opens up the quantum computer to new types of security attacks. As this and other recent research shows, it is possible to perform a fault injection attack using crosstalk on quantum computers when victim and attacker circuits are instantiated as co-tenants on the same quantum computer. To ensure such attacks do not happen, this paper proposes that new techniques should be developed to help catch malicious circuits before they are loaded onto quantum computer hardware. Following ideas from classical computers, a compile-time technique can be designed to scan quantum computer programs for malicious or suspicious code patterns before they are compiled into quantum circuits that run on a quantum computer. This paper presents ongoing work which demonstrates how crosstalk can affect Grover's algorithm, and then presents suggestions of how quantum programs could be analyzed to catch circuits that generate large amounts of crosstalk with malicious intent.

Submitted 4 March, 2022; originally announced March 2022.

Comments: 4 pages, 5 figures, HOST 2022 author version

arXiv:2203.02510 [pdf, ps, other]

Cellular Segmentation and Composition in Routine Histology Images using Deep Learning

Authors: Muhammad Dawood, Raja Muhammad Saad Bashir, Srijay Deshpande, Manahil Raza, Adam Shephard

Abstract: Identification and quantification of nuclei in colorectal cancer haematoxylin & eosin (H&E) stained histology images is crucial to prognosis and patient management. In computational pathology these tasks are referred to as nuclear segmentation, classification and composition and are used to extract meaningful interpretable cytological and architectural features for downstream analysis. The CoNIC challenge poses the task of automated nuclei segmentation, classification and composition into six different types of nuclei from the largest publicly known nuclei dataset - Lizard. In this regard, we have developed pipelines for the prediction of nuclei segmentation using HoVer-Net and ALBRT for cellular composition. On testing on the preliminary test set, HoVer-Net achieved a PQ of 0.58, a PQ+ of 0.58 and finally a mPQ+ of 0.35. For the prediction of cellular composition with ALBRT on the preliminary test set, we achieved an overall $R^2$ score of 0.53, consisting of 0.84 for lymphocytes, 0.70 for epithelial cells, 0.70 for plasma and .060 for eosinophils.

Submitted 4 March, 2022; originally announced March 2022.

arXiv:2203.01183 [pdf]

doi 10.1109/CSCN53733.2021.9686150

Omnidirectional MediA Format (OMAF): Toolbox for Virtual Reality Services

Authors: Sachin Deshpande, Miska M. Hannuksela

Abstract: This paper provides an overview of the Omnidirectional Media Format (OMAF) standard, second edition, which has been recently finalized. OMAF specifies the media format for coding, storage, delivery, and rendering of omnidirectional media, including video, audio, images, and timed text. Additionally, OMAF supports multiple viewpoints corresponding to omnidirectional cameras and overlay images or video rendered over the omnidirectional background image or video. Many examples of usage scenarios for multiple viewpoints and overlays are described in the paper. OMAF provides a toolbox of features, which can be selectively used in virtual reality services. Consequently, the paper presents the interoperability points specified in the OMAF standard, which enable signaling which OMAF features are in use or required to be supported in implementations. Finally, the paper summarizes which OMAF interoperability points have been taken into use in virtual reality service specifications by the 3rd Generation Partnership Project (3GPP) and the Virtual Reality Industry Forum (VRIF).

Submitted 2 March, 2022; originally announced March 2022.

Comments: 7 pages, 1 figure. This document is the accepted version of the paper that has been published in 2021 IEEE Conference on Standards for Communications and Networking (CSCN)

Journal ref: 2021 IEEE Conference on Standards for Communications and Networking (CSCN), 2021, pp. 20-25

arXiv:2202.09427 [pdf, other]

Predicting deformation mechanisms in architected metamaterials using GNN

Authors: Padmeya Prashant Indurkar, Sri Karlapati, Angkur Jyoti Dipanka Shaikeea, Vikram S. Deshpande

Abstract: The present paradigm in design and modelling of lattice architected mechanical metamaterials is mostly limited to traditional numerical methods like finite element analysis. Recently, the use of machine learning and artificial intelligence techniques has become popular and here we extend these ideas to architected metamaterials. We show that truss based lattices have a natural resemblance to computational graphs which serve as an input for the rapidly emerging field of graph neural networks (GNNs). A GNN is trained on a dataset comprising thousands of such extremely complex lattices to predict the underlying dominant deformation mechanism, viz. stretching and bending. The trained GNN achieves > 90% accuracy on a previously unseen complex lattice dataset. Such graph-based learning of metamaterials has the capability to predict a range of properties, from elastic moduli to fracture toughness, and promises AI driven discovery of emergent metamaterials possessing superlative properties.

Submitted 5 March, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

Comments: 7 pages, 6 figures

arXiv:2201.04957 [pdf]

Dielectric Properties of Polysulfone Carbon Nanotube Composite Membranes

Authors: Bhakti Hirani, P. S. Goyal, Deepali Shrivastava, S. K. Deshpande

Abstract: Polymeric membranes, including Polysulfone (PSf) membranes, are routinely used for water treatment. To enhance water permeation of these membranes, it is common to synthesize polymeric membranes with carbon nanotubes (CNTs) embedded in them. It is seen that the water permeability of membranes having vertically aligned CNTs is higher than that of membranes where CNTs are not aligned. It is of interest to examine whether the dielectric constant of a CNT based nanocomposite membrane is sensitive to alignment of CNTs. This paper reports dielectric properties of PSf-MWCNT membranes for both aligned and unaligned MWCNTs. Multi Walled Carbon Nanotube (MWCNT) based polysulfone membranes were synthesized using standard methods. MWCNTs in these membranes were aligned by casting the membrane in the presence of a magnetic field. The present paper, for the first time, shows that the above result is valid for membranes also.

Submitted 6 January, 2022; originally announced January 2022.

Comments: Conference on Technologies for Future Cities 2021

arXiv:2112.07184 [pdf, other]

Calibrated and Sharp Uncertainties in Deep Learning via Density Estimation

Authors: Volodymyr Kuleshov, Shachi Deshpande

Abstract: Accurate probabilistic predictions can be characterized by two properties -- calibration and sharpness. However, standard maximum likelihood training yields models that are poorly calibrated and thus inaccurate -- a 90% confidence interval typically does not contain the true outcome 90% of the time. This paper argues that calibration is important in practice and is easy to maintain by performing low-dimensional density estimation. We introduce a simple training procedure based on recalibration that yields calibrated models without sacrificing overall performance; unlike previous approaches, ours ensures the most general property of distribution calibration and applies to any model, including neural networks. We formally prove the correctness of our procedure assuming that we can estimate densities in low dimensions and we establish uniform convergence bounds. Our results yield empirical performance improvements on linear and deep Bayesian models and suggest that calibration should be increasingly leveraged across machine learning.

Submitted 19 September, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

ACM Class: I.2; I.5

arXiv:2112.04620 [pdf, other]

Online Calibrated and Conformal Prediction Improves Bayesian Optimization

Authors: Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

Abstract: Accurate uncertainty estimates are important in sequential model-based decision-making tasks such as Bayesian optimization. However, these estimates can be imperfect if the data violates assumptions made by the model (e.g., Gaussianity). This paper studies which uncertainties are needed in model-based decision-making and in Bayesian optimization, and argues that uncertainties can benefit from calibration -- i.e., an 80% predictive interval should contain the true outcome 80% of the time. Maintaining calibration, however, can be challenging when the data is non-stationary and depends on our actions. We propose using simple algorithms based on online learning to provably maintain calibration on non-i.i.d. data, and we show how to integrate these algorithms in Bayesian optimization with minimal overhead. Empirically, we find that calibrated Bayesian optimization converges to better optima in fewer steps, and we demonstrate improved performance on standard benchmark functions and hyperparameter optimization tasks.

Submitted 25 June, 2024; v1 submitted 8 December, 2021; originally announced December 2021.

ACM Class: I.2; I.5

Journal ref: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, May 2024; PMLR 238:1450-1458
