-
Revisiting the decoupling limit of the Georgi-Machacek model with a scalar singlet
Authors:
Geneviève Bélanger,
Juhi Dutta,
Rohini M. Godbole,
Sabine Kraml,
Manimala Mitra,
Rojalin Padhan,
Abhishek Roy
Abstract:
We study the connection between collider and dark matter phenomenology in the singlet extension of the Georgi-Machacek model. In this framework, the singlet scalar serves as a suitable thermal dark matter (DM) candidate. Our focus lies on the region $v_\chi<1$ GeV, where $v_\chi$ is the common vacuum expectation value of the neutral components of the scalar triplets of the model. Setting bounds on the model parameters from theoretical, electroweak precision and LHC experimental constraints, we find that the BSM Higgs sector is highly constrained. Allowed values for the masses of the custodial fiveplets, triplets and singlet are restricted to the range $140~ {\rm GeV }< M_{H_5} < 350~ {\rm GeV }$, $150~ {\rm GeV }< M_{H_3} < 270 ~{\rm GeV }$ and $145~ {\rm GeV }< M_{H} < 300~ {\rm GeV }$. The extended scalar sector provides new channels for DM annihilation into BSM scalars that allow the observed relic density constraint to be satisfied while remaining consistent with direct DM detection limits. The allowed region of the parameter space of the model can be explored in the upcoming DM detection experiments, both direct and indirect. In particular, the possible high values of BR$(H^0_5\to\gamma\gamma)$ can lead to an indirect DM signal within the reach of CTA. The same feature also provides the possibility of exploring the model at the High-Luminosity run of the LHC. In a simple cut-based analysis, we find that a signal of about $4\sigma$ significance can be achieved in final states with at least two photons for one of our benchmark points.
Submitted 28 May, 2024;
originally announced May 2024.
-
The Evolution of Multimodal Model Architectures
Authors:
Shakti N. Wadekar,
Abhishek Chaurasia,
Aman Chadha,
Eugenio Culurciello
Abstract:
This work uniquely identifies and characterizes four prevalent multimodal model architectural patterns in the contemporary multimodal landscape. Systematically categorizing models by architecture type facilitates monitoring of developments in the multimodal domain. Distinct from recent survey papers that present general information on multimodal architectures, this research conducts a comprehensive exploration of architectural details and identifies four specific architectural types. The types are distinguished by their respective methodologies for integrating multimodal inputs into the deep neural network model. The first two types (Type A and B) deeply fuse multimodal inputs within the internal layers of the model, whereas the following two types (Type C and D) facilitate early fusion at the input stage. Type-A employs standard cross-attention, whereas Type-B utilizes custom-designed layers for modality fusion within the internal layers. On the other hand, Type-C utilizes modality-specific encoders, while Type-D leverages tokenizers to process the modalities at the model's input stage. The identified architecture types aid the monitoring of any-to-any multimodal model development. Notably, Type-C and Type-D are currently favored in the construction of any-to-any multimodal models. Type-C, distinguished by its non-tokenizing multimodal model architecture, is emerging as a viable alternative to Type-D, which utilizes input-tokenizing techniques. To assist in model selection, this work highlights the advantages and disadvantages of each architecture type based on data and compute requirements, architecture complexity, scalability, simplification of adding modalities, training objectives, and any-to-any multimodal generation capability.
Submitted 28 May, 2024;
originally announced May 2024.
-
RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control
Authors:
Litu Rout,
Yujia Chen,
Nataniel Ruiz,
Abhishek Kumar,
Constantine Caramanis,
Sanjay Shakkottai,
Wen-Sheng Chu
Abstract:
We propose Reference-Based Modulation (RB-Modulation), a new plug-and-play solution for training-free personalization of diffusion models. Existing training-free approaches exhibit difficulties in (a) style extraction from reference images in the absence of additional style or content text descriptions, (b) unwanted content leakage from reference style images, and (c) effective composition of style and content. RB-Modulation is built on a novel stochastic optimal controller where a style descriptor encodes the desired attributes through a terminal cost. The resulting drift not only overcomes the difficulties above, but also ensures high fidelity to the reference style and adheres to the given text prompt. We also introduce a cross-attention-based feature aggregation scheme that allows RB-Modulation to decouple content and style from the reference image. With theoretical justification and empirical evidence, our framework demonstrates precise extraction and control of content and style in a training-free manner. Further, our method allows a seamless composition of content and style, which marks a departure from the dependency on external adapters or ControlNets.
Submitted 27 May, 2024;
originally announced May 2024.
-
Generalized hydrodynamics and approach to Generalized Gibbs equilibrium for a classical harmonic chain
Authors:
Saurav Pandey,
Abhishek Dhar,
Anupam Kundu
Abstract:
We study the evolution of a classical harmonic chain with nearest-neighbor interactions starting from domain wall initial conditions. The initial state is taken to be either a product of two Gibbs Ensembles (GEs) with unequal temperatures on the two halves of the chain or a product of two Generalized Gibbs Ensembles (GGEs) with different parameters in the two halves. For this system, we construct the Wigner function and demonstrate that its evolution defines the Generalized Hydrodynamics (GHD) describing the evolution of the conserved quantities. We solve the GHD for both finite and infinite chains and compute the evolution of conserved densities and currents. For a finite chain with fixed boundaries, we show that these quantities relax as $\sim 1/\sqrt{t}$ to their respective steady-state values given by the final expected GE or GGE state, depending on the initial conditions. Exact expressions for the Lagrange multipliers of the final expected GGE state are obtained in terms of the steady state densities. In the case of an infinite chain, we find that the conserved densities and currents at any finite time exhibit ballistic scaling while, at infinite time, any finite segment of the system can be described by a current-carrying non-equilibrium steady state (NESS). We compute the scaling functions analytically and show that the relaxation to the NESS occurs as $\sim 1/t$ for the densities and as $\sim 1/t^2$ for the currents. We compare the analytic results from hydrodynamics with those from exact microscopic numerics and find excellent agreement.
Submitted 27 May, 2024;
originally announced May 2024.
-
Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models
Authors:
Abhishek Kumar,
Robert Morabito,
Sanzhar Umbet,
Jad Kabbara,
Ali Emami
Abstract:
As the use of Large Language Models (LLMs) becomes more widespread, understanding their self-evaluation of confidence in generated responses becomes increasingly important as it is integral to the reliability of the output of these models. We introduce the concept of Confidence-Probability Alignment, which connects an LLM's internal confidence, quantified by token probabilities, to the confidence conveyed in the model's response when explicitly asked about its certainty. Using various datasets and prompting techniques that encourage model introspection, we probe the alignment between models' internal and expressed confidence. These techniques encompass using structured evaluation scales to rate confidence, including answer options when prompting, and eliciting the model's confidence level for outputs it does not recognize as its own. Notably, among the models analyzed, OpenAI's GPT-4 showed the strongest confidence-probability alignment, with an average Spearman's $\hat{\rho}$ of 0.42, across a wide range of tasks. Our work contributes to the ongoing efforts to facilitate risk assessment in the application of LLMs and to further our understanding of model trustworthiness.
Submitted 15 June, 2024; v1 submitted 25 May, 2024;
originally announced May 2024.
-
Arbitrage equilibria in active matter systems
Authors:
Venkat Venkatasubramanian,
Abhishek Sivaram,
N. Sanjeevrajan,
Arun Sankar
Abstract:
The motility-induced phase separation (MIPS) phenomenon in active matter has been of great interest for the past decade or so. A central conceptual puzzle is that this behavior, which is generally characterized as a nonequilibrium phenomenon, can yet be explained using simple equilibrium models of thermodynamics. Here, we address this problem using a new theory, statistical teleodynamics, which is a conceptual synthesis of game theory and statistical mechanics. In this framework, active agents compete in their pursuit of maximum effective utility, and this self-organizing dynamics results in an arbitrage equilibrium in which all agents have the same effective utility. We show that MIPS is an example of arbitrage equilibrium and that it is mathematically equivalent to other phase-separation phenomena in entirely different domains, such as sociology and economics. As examples, we present the behavior of Janus particles in a potential trap and the effect of chemotaxis on MIPS.
Submitted 18 May, 2024;
originally announced May 2024.
-
Digitized Counterdiabatic Quantum Algorithms for Logistics Scheduling
Authors:
Archismita Dalal,
Iraitz Montalban,
Narendra N. Hegade,
Alejandro Gomez Cadavid,
Enrique Solano,
Abhishek Awasthi,
Davide Vodola,
Caitlin Jones,
Horst Weiss,
Gernot Füchsel
Abstract:
We study a job-shop scheduling problem for an automatized robot in a high-throughput laboratory and a traveling salesperson problem with recently proposed digitized counterdiabatic quantum optimization (DCQO) algorithms. In DCQO, we find the solution of an optimization problem via adiabatic quantum dynamics, which is accelerated with counterdiabatic protocols. Thereafter, we digitize the global unitary to encode it in a digital quantum computer. For the job-shop scheduling problem, we aim at finding the optimal schedule for a robot executing a number of tasks under specific constraints, such that the total execution time of the process is minimized. For the traveling salesperson problem, the goal is to find the path that covers all cities and is associated with the shortest traveling distance. We consider both hybrid and pure versions of DCQO algorithms and benchmark the performance against digitized quantum annealing and the quantum approximate optimization algorithm (QAOA). In comparison to QAOA, the DCQO solution is improved by several orders of magnitude in success probability using the same number of two-qubit gates. Moreover, we experimentally implement our algorithms on superconducting and trapped-ion quantum processors. Our results demonstrate that circuit compression using counterdiabatic protocols is amenable to current NISQ hardware and can solve logistics scheduling problems, where other digital quantum algorithms show insufficient performance.
Submitted 24 May, 2024;
originally announced May 2024.
-
Design and fabrication of autonomous electronic lablets for chemical control
Authors:
John S. McCaskill,
Thomas Maeke,
Dominic Funke,
Pierre Mayr,
Abhishek Sharma,
Patrick F. Wagler,
Jürgen Oehm
Abstract:
Lablets are autonomous microscopic particles with programmable CMOS electronics that can control electrokinetic phenomena and electrochemical reactions in solution via actuator and sensor microelectrodes. The lablets are designed to be rechargeable using an integrated supercapacitor, and to allow docking to one another or to a smart surface for interchange of energy, electronic information and chemicals. In this paper, we describe the design and fabrication of singulated lablets (CMOS2) at the scale of 100 by 200 μm, with the supercap adjacent to the functional lablet and occupying half the space. In other works, we have characterized the supercap and described the electronic design and proven functionality using arrays of these lablets. Here we present fabrication details for integrating functional coatings and the supercap and demonstrate electronic functionality of the lablets following singulation.
Submitted 24 May, 2024;
originally announced May 2024.
-
Semantic Aware Diffusion Inverse Tone Mapping
Authors:
Abhishek Goswami,
Aru Ranjan Singh,
Francesco Banterle,
Kurt Debattista,
Thomas Bashford-Rogers
Abstract:
The range of real-world scene luminance is larger than the capture capability of many digital camera sensors, which leads to details being lost in captured images, most typically in bright regions. Inverse tone mapping attempts to boost these captured Standard Dynamic Range (SDR) images back to High Dynamic Range (HDR) by creating a mapping that linearizes the well exposed values from the SDR image, and provides a luminance boost to the clipped content. However, in most cases, the details in the clipped regions cannot be recovered or estimated. In this paper, we present a novel inverse tone mapping approach for mapping SDR images to HDR that generates lost details in clipped regions through a semantic-aware diffusion based inpainting approach. Our method makes two major contributions. First, we propose to use a semantic graph to guide SDR diffusion based inpainting in masked regions in a saturated image. Second, drawing inspiration from traditional HDR imaging and bracketing methods, we propose a principled formulation to lift the SDR inpainted regions to HDR that is compatible with generative inpainting methods. Results show that our method demonstrates superior performance across different datasets on objective metrics, and subjective experiments show that the proposed method matches (and in most cases outperforms) state-of-the-art inverse tone mapping operators in terms of objective metrics and outperforms them for visual fidelity.
Submitted 24 May, 2024;
originally announced May 2024.
-
Necessity of Quantizable Geometry for Quantum Gravity
Authors:
Abhishek Kumar Mehta
Abstract:
In this paper, Dirac Quantization of $3D$ gravity in the first-order formalism is attempted where instead of quantizing the connection and triad fields, the connection and the triad 1-forms themselves are quantized. The exterior derivative operator on the space of differential forms is treated as the `time' derivative to compute the momenta conjugate to these 1-forms. This manner of quantization allows one to compute the transition amplitude in $3D$ gravity which has a close, but not exact, match with the transition amplitude computed via LQG techniques. This inconsistency is interpreted as being due to the non-quantizable nature of differential geometry.
Submitted 10 June, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models
Authors:
Abhishek Kumar,
Sarfaroz Yunusov,
Ali Emami
Abstract:
Research on Large Language Models (LLMs) has often neglected subtle biases that, although less apparent, can significantly influence the models' outputs toward particular social narratives. This study addresses two such biases within LLMs: representative bias, which denotes a tendency of LLMs to generate outputs that mirror the experiences of certain identity groups, and affinity bias, reflecting the models' evaluative preferences for specific narratives or viewpoints. We introduce two novel metrics to measure these biases: the Representative Bias Score (RBS) and the Affinity Bias Score (ABS), and present the Creativity-Oriented Generation Suite (CoGS), a collection of open-ended tasks such as short story writing and poetry composition, designed with customized rubrics to detect these subtle biases. Our analysis uncovers marked representative biases in prominent LLMs, with a preference for identities associated with being white, straight, and men. Furthermore, our investigation of affinity bias reveals distinctive evaluative patterns within each model, akin to `bias fingerprints'. This trend is also seen in human evaluators, highlighting a complex interplay between human and machine bias perceptions.
Submitted 3 June, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing
Authors:
Amutheezan Sivagnanam,
Ava Pettet,
Hunter Lee,
Ayan Mukhopadhyay,
Abhishek Dubey,
Aron Laszka
Abstract:
An emergency responder management (ERM) system dispatches responders, such as ambulances, when it receives requests for medical aid. ERM systems can also proactively reposition responders between predesignated waiting locations to cover any gaps that arise due to the prior dispatch of responders or significant changes in the distribution of anticipated requests. Optimal repositioning is computationally challenging due to the exponential number of ways to allocate responders between locations and the uncertainty in future requests. The state-of-the-art approach in proactive repositioning is a hierarchical approach based on spatial decomposition and online Monte Carlo tree search, which may require minutes of computation for each decision in a domain where seconds can save lives. We address the issue of long decision times by introducing a novel reinforcement learning (RL) approach, based on the same hierarchical decomposition, but replacing online search with learning. To address the computational challenges posed by large, variable-dimensional, and discrete state and action spaces, we propose: (1) actor-critic based agents that incorporate transformers to handle variable-dimensional states and actions, (2) projections to fixed-dimensional observations to handle complex states, and (3) combinatorial techniques to map continuous actions to discrete allocations. We evaluate our approach using real-world data from two U.S. cities, Nashville, TN and Seattle, WA. Our experiments show that compared to the state of the art, our approach reduces computation time per decision by three orders of magnitude, while also slightly reducing average ambulance response time by 5 seconds.
Submitted 8 June, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
-
Probing CP Violation and Mass Hierarchy in Neutrino Oscillations in Matter through Quantum Speed Limits
Authors:
Subhadip Bouri,
Abhishek Kumar Jha,
Subhashish Banerjee
Abstract:
The quantum speed limits (QSLs) set fundamental lower bounds on the time required for a quantum system to evolve from a given initial state to a final state. In this work, we investigate CP violation and the mass hierarchy problem of neutrino oscillations in matter using the QSL time as a key analytical tool. We examine the QSL time for the unitary evolution of two- and three-flavor neutrino states, both in vacuum and in the presence of matter. Two-flavor neutrino oscillations are used as a precursor to their three-flavor counterparts. We further compute the QSL time for neutrino state evolution and entanglement in terms of neutrino survival and oscillation probabilities, which are experimentally measurable quantities in neutrino experiments. A difference in the QSL time between the normal and inverted mass hierarchy scenarios, for neutrino state evolution as well as for entanglement, under the effect of a CP violation phase is observed. Our results are illustrated using energy-varying sets of accelerator neutrino sources from experiments such as T2K, NOvA, and DUNE. Notably, three-flavor neutrino oscillations in constant matter density exhibit faster state evolution across all these neutrino experiments in the normal mass hierarchy scenario. Additionally, we observe fast entanglement growth in DUNE assuming a normal mass hierarchy.
Submitted 21 May, 2024;
originally announced May 2024.
-
Stochastic Learning of Computational Resource Usage as Graph Structured Multimarginal Schrödinger Bridge
Authors:
Georgiy A. Bondar,
Robert Gifford,
Linh Thi Xuan Phan,
Abhishek Halder
Abstract:
We propose to learn the time-varying stochastic computational resource usage of software as a graph structured Schrödinger bridge problem. In general, learning the computational resource usage from data is challenging because resources such as the number of CPU instructions and the number of last level cache requests are both time-varying and statistically correlated. Our proposed method enables learning the joint time-varying stochasticity in computational resource usage from the measured profile snapshots in a nonparametric manner. The method can be used to predict the most-likely time-varying distribution of computational resource availability at a desired time. We provide detailed algorithms for stochastic learning in both single and multi-core cases, discuss the convergence guarantees, computational complexities, and demonstrate their practical use in two case studies: a single-core nonlinear model predictive controller, and a synthetic multi-core software.
Submitted 20 May, 2024;
originally announced May 2024.
-
URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images
Authors:
Zoey Chen,
Aaron Walsman,
Marius Memmel,
Kaichun Mo,
Alex Fang,
Karthikeya Vemuri,
Alan Wu,
Dieter Fox,
Abhishek Gupta
Abstract:
Constructing simulation scenes that are both visually and physically realistic is a problem of practical interest in domains ranging from robotics to computer vision. This problem has become even more relevant as researchers wielding large data-hungry learning methods seek new sources of training data for physical decision-making systems. However, building simulation models is often still done by hand. A graphic designer and a simulation engineer work with predefined assets to construct rich scenes with realistic dynamic and kinematic properties. While this may scale to small numbers of scenes, to achieve the generalization properties that are required for data-driven robotic control, we require a pipeline that is able to synthesize large numbers of realistic scenes, complete with 'natural' kinematic and dynamic structures. To attack this problem, we develop models for inferring structure and generating simulation scenes from natural images, allowing for scalable scene generation from web-scale datasets. To train these image-to-simulation models, we show how controllable text-to-image generative models can be used in generating paired training data that allows for modeling of the inverse problem, mapping from realistic images back to complete scene models. We show how this paradigm allows us to build large datasets of scenes in simulation with semantic and physical realism. We present an integrated end-to-end pipeline that generates simulation scenes complete with articulated kinematic and dynamic structures from real-world images and use these for training robotic control policies. We then robustly deploy in the real world for tasks like articulated object manipulation. In doing so, our work provides both a pipeline for large-scale generation of simulation environments and an integrated system for training robust robotic control policies in the resulting environments.
Submitted 31 May, 2024; v1 submitted 19 May, 2024;
originally announced May 2024.
-
Arbitrage equilibrium and the emergence of universal microstructure in deep neural networks
Authors:
Venkat Venkatasubramanian,
N Sanjeevrajan,
Manasi Khandekar,
Abhishek Sivaram,
Collin Szczepanski
Abstract:
Despite the stunning progress recently in large-scale deep neural network applications, our understanding of their microstructure, 'energy' functions, and optimal design remains incomplete. Here, we present a new game-theoretic framework, called statistical teleodynamics, that reveals important insights into these key properties. The optimally robust design of such networks inherently involves computational benefit-cost trade-offs that are not adequately captured by physics-inspired models. These trade-offs occur as neurons and connections compete to increase their effective utilities under resource constraints during training. In a fully trained network, this results in a state of arbitrage equilibrium, where all neurons in a given layer have the same effective utility, and all connections to a given layer have the same effective utility. The equilibrium is characterized by the emergence of two lognormal distributions of connection weights and neuronal output as the universal microstructure of large deep neural networks. We call such a network the Jaynes Machine. Our theoretical predictions are shown to be supported by empirical data from seven large-scale deep neural networks. We also show that the Hopfield network and the Boltzmann Machine are the same special case of the Jaynes Machine.
Submitted 5 June, 2024; v1 submitted 29 March, 2024;
originally announced May 2024.
-
Diagnosing and Decoupling the Degradation Mechanisms in Lithium Ion Cells: An Estimation Approach
Authors:
Raja Abhishek Appana,
Faissal El Idrissi,
Prashanth Ramesh,
Marcello Canova,
Chun Yong Kang,
Kimoon Um
Abstract:
Understanding battery degradation in electric vehicles (EVs) under real-world conditions remains a critical yet under-explored area of research. Central to this investigation is the challenge of estimating the specific degradation modes in aged cells with no available information on usage history, bypassing the conventional yet invasive method of tear-down tests. Using an electrochemical model, this study pioneers a methodology to decouple and isolate the aging mechanisms in batteries sourced from EVs with varying mileages. A robust correlation is established between the model parameters and distinct degradation processes, enabling the diagnosis and estimation of each mechanism's impact on the battery's parameters. This paper sheds light on battery degradation in real-world scenarios and demonstrates the feasibility of identifying and isolating the degradation mechanisms and approximately quantifying their effects.
Submitted 17 May, 2024;
originally announced May 2024.
-
Parameter Identification for Electrochemical Models of Lithium-Ion Batteries Using Bayesian Optimization
Authors:
Jianzong Pi,
Samuel Filgueira da Silva,
Mehmet Fatih Ozkan,
Abhishek Gupta,
Marcello Canova
Abstract:
Efficient parameter identification of electrochemical models is crucial for accurate monitoring and control of lithium-ion cells. This process becomes challenging when applied to complex models that rely on a considerable number of interdependent parameters that affect the output response. Gradient-based and metaheuristic optimization techniques, although previously employed for this task, are limited by their lack of robustness, high computational costs, and susceptibility to local minima. In this study, Bayesian Optimization is used for tuning the dynamic parameters of an electrochemical equivalent circuit battery model (E-ECM) for a nickel-manganese-cobalt (NMC)-graphite cell. The performance of the Bayesian Optimization is compared with baseline methods based on gradient-based and metaheuristic approaches. The robustness of the parameter optimization method is tested by performing verification using an experimental drive cycle. The results indicate that Bayesian Optimization outperforms Gradient Descent and PSO optimization techniques, achieving reductions in average testing loss of 28.8% and 5.8%, respectively. Moreover, Bayesian Optimization significantly reduces the variance in testing loss, by 95.8% and 72.7%, respectively.
Submitted 17 May, 2024;
originally announced May 2024.
-
Dealing Doubt: Unveiling Threat Models in Gradient Inversion Attacks under Federated Learning, A Survey and Taxonomy
Authors:
Yichuan Shi,
Olivera Kotevska,
Viktor Reshniak,
Abhishek Singh,
Ramesh Raskar
Abstract:
Federated Learning (FL) has emerged as a leading paradigm for decentralized, privacy-preserving machine learning training. However, recent research on gradient inversion attacks (GIAs) has shown that gradient updates in FL can leak information on private training samples. While existing surveys on GIAs have focused on the honest-but-curious server threat model, there is a dearth of research categorizing attacks under the realistic and far more privacy-infringing cases of malicious servers and clients. In this paper, we present a survey and novel taxonomy of GIAs that emphasize FL threat models, particularly that of malicious servers and clients. We first formally define GIAs and contrast conventional attacks with the malicious attacker. We then summarize existing honest-but-curious attack strategies, corresponding defenses, and evaluation metrics. Critically, we dive into attacks with malicious servers and clients to highlight how they break existing FL defenses, focusing specifically on reconstruction methods, target model architectures, target data, and evaluation metrics. Lastly, we discuss open problems and future research directions.
Submitted 16 May, 2024;
originally announced May 2024.
-
Trapped-Ion Quantum Simulation of Electron Transfer Models with Tunable Dissipation
Authors:
Visal So,
Midhuna Duraisamy Suganthi,
Abhishek Menon,
Mingjian Zhu,
Roman Zhuravel,
Han Pu,
Peter G. Wolynes,
José N. Onuchic,
Guido Pagano
Abstract:
Electron transfer is at the heart of many fundamental physical, chemical, and biochemical processes essential for life. Exact simulation of reactions in these systems is often hindered by the large number of degrees of freedom and by the essential role of quantum effects. In this work, we experimentally simulate a paradigmatic model of molecular electron transfer using a multi-species trapped-ion crystal, where the donor-acceptor gap, the electronic and vibronic couplings, and the bath relaxation dynamics can all be controlled independently. We employ the ground-state qubit of one ion to simulate the electronic degree of freedom and the optical qubit of another ion to perform reservoir engineering on a collective mode encoding a reaction coordinate. We observe the real-time dynamics of the spin excitation, measuring the transfer rate in several regimes of adiabaticity and relaxation dynamics. The setup allows access to the electron transfer dynamics in the non-perturbative regime, where there is no clear hierarchy among the energy scales in the model, as has been suggested to be optimal for many rate phenomena, including photosynthesis. Our results provide a testing ground for increasingly rich models of molecular excitation transfer processes that are relevant for molecular electronics and light-harvesting systems.
Submitted 16 May, 2024;
originally announced May 2024.
-
A note on the equivalence of Gromov boundary and metric boundary
Authors:
Vasudevarao Allu,
Abhishek Pandey
Abstract:
In this paper, we introduce the concept of quasihyperbolically visible spaces. As a tool, we study the connection between the Gromov boundary and the metric boundary.
Submitted 16 May, 2024;
originally announced May 2024.
-
SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation
Authors:
Abhishek Divekar,
Greg Durrett
Abstract:
Large language models (LLMs) are versatile and can address many tasks, but for computational efficiency, it is often desirable to distill their capabilities into smaller student models. One way to do this for classification tasks is via dataset synthesis, which can be accomplished by generating examples of each label from the LLM. Prior approaches to synthesis use few-shot prompting, which relies on the LLM's parametric knowledge to generate usable examples. However, this leads to issues of repetition, bias towards popular entities, and stylistic differences from human text. In this work, we propose Synthesize by Retrieval and Refinement (SynthesizRR), which uses retrieval augmentation to introduce variety into the dataset synthesis process: as retrieved passages vary, the LLM is "seeded" with different content to generate its examples. We empirically study the synthesis of six datasets, covering topic classification, sentiment analysis, tone detection, and humor, requiring complex synthesis strategies. We find SynthesizRR greatly improves lexical and semantic diversity, similarity to human-written text, and distillation performance, when compared to standard 32-shot prompting and six baseline approaches.
Submitted 16 May, 2024;
originally announced May 2024.
-
Reward Centering
Authors:
Abhishek Naik,
Yi Wan,
Manan Tomar,
Richard S. Sutton
Abstract:
We show that discounted methods for solving continuing reinforcement learning problems can perform significantly better if they center their rewards by subtracting out the rewards' empirical average. The improvement is substantial at commonly used discount factors and increases further as the discount factor approaches one. In addition, we show that if a problem's rewards are shifted by a constant, then standard methods perform much worse, whereas methods with reward centering are unaffected. Estimating the average reward is straightforward in the on-policy setting; we propose a slightly more sophisticated method for the off-policy setting. Reward centering is a general idea, so we expect almost every reinforcement-learning algorithm to benefit by the addition of reward centering.
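The centering idea in this abstract can be illustrated with a minimal Python sketch. It assumes the simplest possible estimator, a running empirical mean of the rewards seen so far; the function name `center_rewards` is hypothetical, and the paper's own on-policy and off-policy estimators may differ.

```python
def center_rewards(rewards):
    """Subtract the running empirical mean from each reward (illustrative only)."""
    centered = []
    mean = 0.0
    for t, r in enumerate(rewards, start=1):
        mean += (r - mean) / t  # incremental update of the empirical average
        centered.append(r - mean)
    return centered


rewards = [1.0, 2.0, 3.0, 4.0]
shifted = [r + 100.0 for r in rewards]  # same problem, rewards shifted by a constant

print(center_rewards(rewards))  # [0.0, 0.5, 1.0, 1.5]
print(center_rewards(shifted))  # [0.0, 0.5, 1.0, 1.5]
```

Shifting every reward by a constant leaves the centered sequence unchanged, which is the invariance the abstract describes.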
Submitted 16 May, 2024;
originally announced May 2024.
-
NH3 gas sensing over 2D Phosphorene sheet: A First-Principles Study
Authors:
Naresh Kumar,
Yogendra K. Gautam,
Soni Mishra,
Anuj Kumar,
Abhishek Kumar Mishra
Abstract:
First-principles calculations were performed to investigate the sensing of ammonia gas molecules on two-dimensional pristine black phosphorene, with a view to its use as a gas sensor and in related applications. We discuss in detail the interaction of ammonia gas molecules with the phosphorene single sheet through structural change analysis, electronic band gap, Bader charge transfer, and density-of-states calculations. Our calculations indicate that phosphorene could be used as a detector of ammonia: good sensitivity and a very short recovery time at room temperature confirm its potential for ammonia detection.
Submitted 16 May, 2024;
originally announced May 2024.
-
Properties that allow or prohibit transferability of adversarial attacks among quantized networks
Authors:
Abhishek Shrestha,
Jürgen Großmann
Abstract:
Deep Neural Networks (DNNs) are known to be vulnerable to adversarial examples. Further, these adversarial examples are found to be transferable from the source network in which they are crafted to a black-box target network. As the trend of using deep learning on embedded devices grows, it becomes relevant to study the transferability properties of adversarial examples among compressed networks. In this paper, we consider quantization as a network compression technique and evaluate the performance of transfer-based attacks when the source and target networks are quantized at different bitwidths. We explore how algorithm-specific properties affect transferability by considering various adversarial example generation algorithms. Furthermore, we examine transferability in a more realistic scenario where the source and target networks may differ in bitwidth and other model-related properties like capacity and architecture. We find that although quantization reduces transferability, certain attack types demonstrate an ability to enhance it. Additionally, the average transferability of adversarial examples among quantized versions of a network can be used to estimate the transferability to quantized target networks with varying capacity and architecture.
Submitted 15 May, 2024;
originally announced May 2024.
-
A Comparison of Electronic, Dielectric, and Thermoelectric Properties of Monolayer of HfX2N4 (X = Si, Ge) through First-Principles Calculations
Authors:
Chayan Das,
Abhishek,
Dibyajyoti Saikia,
Appala Naidu Gandi,
Satyajit Sahu
Abstract:
The newly emerged two-dimensional (2D) materials family of MSi2N4, where M is a transition metal atom (e.g., Mo, W), has the potential to be named after the conventional and very popular transition metal di-chalcogenides (TMDC), which got their reputation for having bandgap tunability and high mobility. The HfSi2N4 and HfGe2N4 2D materials are members of the MSi2N4 family and possess a very good figure of merit (ZT) and high mobility, proving their suitability for thermoelectric applications. The HfSi2N4 and HfGe2N4 showed considerable ZT of 0.90 and 0.89, respectively, for p-type and 0.83 and 0.79 for n-type, at 900 K along with high mobility according to the solutions obtained after solving the Boltzmann Transport Equation (BTE). The HfGe2N4 also showed a ZT of 0.84 at 600 K and 0.68 at 300 K, which is also excellent for low-temperature operation. The bandgaps (BG) obtained for HfSi2N4 and HfGe2N4 according to the Heyd-Scuseria-Ernzerhof (HSE) approximation were 2.89 eV and 2.75 eV. The first absorption peak appeared in the blue region of the visible spectrum; from this, their usefulness in visible range photodetectors can also be inferred.
Submitted 15 May, 2024;
originally announced May 2024.
-
Revealing the Production Mechanism of High-Energy Neutrinos from NGC 1068
Authors:
Abhishek Das,
B. Theodore Zhang,
Kohta Murase
Abstract:
The detection of high-energy neutrino signals from the nearby Seyfert galaxy NGC 1068 provides us with an opportunity to study nonthermal processes near the center of supermassive black holes. Using the IceCube and latest Fermi-LAT data, we present general multimessenger constraints on the energetics of cosmic rays and the size of neutrino emission regions. In the photohadronic scenario, the required cosmic-ray luminosity should be larger than about 1-10 percent of the Eddington luminosity, and the emission radius should be less than about 15 Schwarzschild radii in low-beta plasma and less than about 3 Schwarzschild radii in high-beta plasma. The leptonic scenario overshoots the NuSTAR or Fermi-LAT data for any emission radii we consider, and the required gamma-ray luminosity is much larger than the Eddington luminosity. The beta decay scenario also violates not only the energetics requirement but also gamma-ray constraints especially when the Bethe-Heitler and photomeson production processes are consistently considered. Our results rule out the leptonic and beta decay scenarios in a nearly model-independent manner, and support hadronic mechanisms in magnetically-powered coronae if NGC 1068 is a source of high-energy neutrinos.
Submitted 18 June, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
Identification via Permutation Channels
Authors:
Abhishek Sarkar,
Bikash Kumar Dey
Abstract:
We study message identification over a $q$-ary uniform permutation channel, where the transmitted vector is permuted by a permutation chosen uniformly at random. For discrete memoryless channels (DMCs), the number of identifiable messages grows doubly exponentially. Identification capacity, the maximum second-order exponent, is known to be the same as the Shannon capacity of the DMC. Permutation channels support reliable communication of only polynomially many messages. A simple achievability result shows that message sizes growing as $2^{c_nn^{q-1}}$ are identifiable for any $c_n\rightarrow 0$. We prove two converse results. A ``soft'' converse shows that for any $R>0$, there is no sequence of identification codes with message size growing as $2^{Rn^{q-1}}$ with a power-law decay ($n^{-μ}$) of the error probability. We also prove a ``strong'' converse showing that for any sequence of identification codes with message size $2^{Rn^{q-1}\log n}$ ($R>0$), the sum of type I and type II error probabilities approaches at least $1$ as $n\rightarrow \infty$. To prove the soft converse, we use a sequence of steps to construct a new identification code with a simpler structure which relates to a set system, and then use a lower bound on the normalized maximum pairwise intersection of a set system. To prove the strong converse, we use results on approximation of distributions.
Submitted 4 June, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
Tight Bounds for Online Convex Optimization with Adversarial Constraints
Authors:
Abhishek Sinha,
Rahul Vaze
Abstract:
A well-studied generalization of the standard online convex optimization (OCO) is constrained online convex optimization (COCO). In COCO, on every round, a convex cost function and a convex constraint function are revealed to the learner after the action for that round is chosen. The objective is to design an online policy that simultaneously achieves a small regret while ensuring small cumulative constraint violation (CCV) against an adaptive adversary. A long-standing open question in COCO is whether an online policy can simultaneously achieve $O(\sqrt{T})$ regret and $O(\sqrt{T})$ CCV without any restrictive assumptions. For the first time, we answer this in the affirmative and show that an online policy can simultaneously achieve $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV. We establish this result by effectively combining the adaptive regret bound of the AdaGrad algorithm with Lyapunov optimization - a classic tool from control theory. Surprisingly, the analysis is short and elegant.
Submitted 15 May, 2024;
originally announced May 2024.
-
The TDHF code Sky3D version 1.2
Authors:
Abhishek,
Paul Stevenson,
Yue Shi,
Esra Yüksel,
Sait Umar
Abstract:
The Sky3D code has been widely used to describe nuclear ground states, collective vibrational excitations, and heavy-ion collisions. The approach is based on Skyrme forces or related energy density functionals. The static and dynamic equations are solved on a three-dimensional grid, and pairing is implemented in the BCS approximation. This updated version of the code aims to facilitate the calculation of nuclear strength functions in the regime of linear response theory, while retaining all existing functionality and use cases. The strength functions are benchmarked against available RPA codes, and the user has the freedom of choice when selecting the nature of external excitation (from monopole to hexadecapole and more). Some utility programs are also provided that calculate the strength function from the time-dependent output of the dynamic calculations of the Sky3D code.
Submitted 14 May, 2024;
originally announced May 2024.
-
On the Sombor index of the total graph and the unit graph of commutative rings
Authors:
Abhishek Vaibhav Pathak,
Anukul Sachan,
Raisa DSouza
Abstract:
In this paper, we investigate the Sombor index of the total graph and the unit graph of $\mathbb{Z}_n$, denoted by $T_Γ(\mathbb{Z}_n)$ and $G(\mathbb{Z}_n)$ respectively, for $n \in \{2k, p^α, pq, p^2q\}$, where $p$ and $q$ are distinct odd prime numbers such that $p < q$. Moreover, we compute the Sombor index of any finite local ring.
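For reference, the Sombor index of a graph $G$ is $SO(G)=\sum_{uv\in E(G)}\sqrt{d_u^2+d_v^2}$, where $d_u$ is the degree of vertex $u$. The Python sketch below is an illustration, not the authors' method: it brute-forces the index for the unit graph of $\mathbb{Z}_n$, assuming the usual convention that distinct $x$ and $y$ are adjacent when $x+y$ is a unit. The function names are hypothetical.

```python
from math import gcd, sqrt


def unit_graph_edges(n):
    """Edges (x, y) with x < y of the unit graph of Z_n.

    Assumed convention: x and y are adjacent when x + y is a unit modulo n.
    """
    return [(x, y) for x in range(n) for y in range(x + 1, n)
            if gcd((x + y) % n, n) == 1]


def sombor_index(n_vertices, edges):
    """Sum of sqrt(d_u^2 + d_v^2) over all edges uv."""
    degree = [0] * n_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(sqrt(degree[u] ** 2 + degree[v] ** 2) for u, v in edges)


# Z_3: edges {0,1} and {0,2}, degrees (2, 1, 1), so the index is 2 * sqrt(5).
print(sombor_index(3, unit_graph_edges(3)))
```

Closed-form results such as those announced in the abstract can be spot-checked against this kind of enumeration for small $n$.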
Submitted 12 May, 2024;
originally announced May 2024.
-
Quantum-Accurate Machine Learning Potentials for Metal-Organic Frameworks using Temperature Driven Active Learning
Authors:
Abhishek Sharma,
Stefano Sanvito
Abstract:
Understanding how structural flexibility affects the properties of metal-organic frameworks (MOFs) is crucial for the design of better MOFs for targeted applications. Flexible MOFs can be studied with molecular dynamics simulations, whose accuracy depends on the force-field used to describe the interatomic interactions. Density functional theory (DFT) and quantum-chemistry methods are highly accurate, but the computational overheads limit their use in long time-dependent simulations for large systems. In contrast, classical force fields usually struggle with the description of coordination bonds.
In this work we develop a DFT-accurate machine-learning spectral neighbor analysis potential, trained on DFT energies, forces and stress tensors, for two representative MOFs, namely ZIF-8 and MOF-5. Their structural and vibrational properties are then studied as a function of temperature and closely compared with available experimental data. Most importantly, we demonstrate an active-learning algorithm, based on mapping the relevant internal coordinates, which drastically reduces the number of training data to be computed at the DFT level. Thus, the workflow presented here appears as an efficient strategy for the study of flexible MOFs with DFT accuracy, but at a fraction of the DFT computational cost.
Submitted 10 May, 2024;
originally announced May 2024.
-
NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry
Authors:
Yash Khandelwal,
Mayur Arvind,
Sriram Kumar,
Ashish Gupta,
Sachin Kumar Danisetty,
Piyush Bagad,
Anish Madan,
Mayank Lunayach,
Aditya Annavajjala,
Abhishek Maiti,
Sansiddh Jain,
Aman Dalmia,
Namrata Deka,
Jerome White,
Jigar Doshi,
Angjoo Kanazawa,
Rahul Panicker,
Alpan Raval,
Srinivas Rana,
Makarand Tapaswi
Abstract:
Malnutrition among newborns is a top public health concern in developing countries. Identification and subsequent growth monitoring are key to successful interventions. However, this is challenging in rural communities where health systems tend to be inaccessible and under-equipped, with poor adherence to protocol. Our goal is to equip health workers and public health systems with a solution for contactless newborn anthropometry in the community.
We propose NurtureNet, a multi-task model that fuses visual information (a video taken with a low-cost smartphone) with tabular inputs to regress multiple anthropometry estimates including weight, length, head circumference, and chest circumference. We show that visual proxy tasks of segmentation and keypoint prediction further improve performance. We establish the efficacy of the model through several experiments and achieve a relative error of 3.9% and mean absolute error of 114.3 g for weight estimation. Model compression to 15 MB also allows offline deployment to low-cost smartphones.
Submitted 8 May, 2024;
originally announced May 2024.
-
Deep learning-based variational autoencoder for classification of quantum and classical states of light
Authors:
Mahesh Bhupati,
Abhishek Mall,
Anshuman Kumar,
Pankaj K. Jha
Abstract:
Advancements in optical quantum technologies have been enabled by the generation, manipulation, and characterization of light, with identification based on its photon statistics. However, characterizing light and its sources through single photon measurements often requires efficient detectors and longer measurement times to obtain high-quality photon statistics. Here we introduce a deep learning-based variational autoencoder (VAE) method for classifying single photon added coherent state (SPACS), single photon added thermal state (SPATS), and mixed states between coherent/SPACS and thermal/SPATS of light. Our semisupervised learning-based VAE efficiently maps the photon statistics features of light to a lower dimension, enabling quasi-instantaneous classification with low average photon counts. The proposed VAE method is robust and maintains classification accuracy in the presence of losses inherent in an experiment, such as finite collection efficiency, non-unity quantum efficiency, finite number of detectors, etc. Additionally, leveraging the transfer learning capabilities of VAE enables successful classification of data of any quality using a single trained model. We envision that such a deep learning methodology will enable better classification of quantum light and light sources even in the presence of poor detection quality.
Submitted 8 May, 2024;
originally announced May 2024.
-
ShadowNav: Autonomous Global Localization for Lunar Navigation in Darkness
Authors:
Deegan Atha,
R. Michael Swan,
Abhishek Cauligi,
Anne Bettens,
Edwin Goh,
Dima Kogan,
Larry Matthies,
Masahiro Ono
Abstract:
The ability to determine the pose of a rover in an inertial frame autonomously is a crucial capability necessary for the next generation of surface rover missions on other planetary bodies. Currently, most ongoing rover missions utilize ground-in-the-loop interventions to manually correct for drift in the pose estimate, and this human supervision bottlenecks the distance over which rovers can operate autonomously and carry out scientific measurements. In this paper, we present ShadowNav, an autonomous approach for global localization on the Moon with an emphasis on driving in darkness and at nighttime. Our approach uses the leading edge of Lunar craters as landmarks, and a particle filtering approach is used to associate detected craters with known ones on an offboard map. We discuss the key design decisions in developing the ShadowNav framework for use with a Lunar rover concept equipped with a stereo camera and an external illumination source. Finally, we demonstrate the efficacy of our proposed approach in both a Lunar simulation environment and on data collected during a field test at Cinder Lakes, Arizona.
Submitted 6 May, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
Systematic Construction of Golay Complementary Sets of Arbitrary Lengths and Alphabet Sizes
Authors:
Abhishek Roy,
Sudhan Majhi,
Subhabrata Paul
Abstract:
One of the important applications of Golay complementary sets (GCSs) is the reduction of peak-to-mean envelope power ratio (PMEPR) in orthogonal frequency division multiplexing (OFDM) systems. OFDM has played a major role in modern wireless systems such as long-term-evolution (LTE), 5th generation (5G) wireless standards, etc. This paper searches for systematic constructions of GCSs of arbitrary lengths and alphabet sizes. The proposed constructions are based on extended Boolean functions (EBFs). For the first time, we can generate codes of independent parameter choices.
Submitted 8 May, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
On generators of $k$-PSD closures of the positive semidefinite cone
Authors:
Avinash Bhardwaj,
Vishnu Narayanan,
Abhishek Pathapati
Abstract:
The positive semidefinite (PSD) cone is the cone of positive semidefinite matrices and is the object of interest in semidefinite programming (SDP). A computationally efficient approximation of the PSD cone is the $k$-PSD closure, $1 \leq k < n$: the cone of $n\times n$ real symmetric matrices all of whose $k\times k$ principal submatrices are positive semidefinite. For $k=1$, one obtains a polyhedral approximation, while $k=2$ yields a second order conic (SOC) approximation of the PSD cone. These approximations of the PSD cone have been used extensively in real-world applications such as AC Optimal Power Flow (ACOPF) to address computational inefficiencies where SDP relaxations are utilized to convexify the non-convexities. However, a theoretical discussion of the geometry of these conic approximations of the PSD cone is rather sparse. In this short communication, we attempt to provide a characterization of some family of generators of the aforementioned conic approximations.
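The membership condition in this abstract can be made concrete with a small sketch. It is not taken from the paper: the function names `det` and `in_k_psd_closure` are hypothetical, and the check relies on the standard fact that a symmetric matrix is PSD exactly when all of its principal minors are nonnegative, so a matrix lies in the $k$-PSD closure exactly when every principal minor of order at most $k$ is nonnegative. Exact rational arithmetic is assumed (integer or rational entries), and the enumeration is exponential in $k$, so this is for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result  # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return result


def in_k_psd_closure(a, k):
    """True if every k x k principal submatrix of symmetric `a` is PSD.

    Checked through principal minors of order 1..k (hypothetical helper).
    """
    n = len(a)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    for size in range(1, k + 1):
        for idx in combinations(range(n), size):
            sub = [[a[i][j] for j in idx] for i in idx]
            if det(sub) < 0:
                return False
    return True


# A matrix whose 2 x 2 principal submatrices are all PSD but which is not PSD.
A = [[1, 1, -1],
     [1, 1, 1],
     [-1, 1, 1]]
print(in_k_psd_closure(A, 2), in_k_psd_closure(A, 3))
```

The example matrix shows that the 2-PSD closure is strictly larger than the PSD cone: all of its 2 x 2 principal minors vanish, while its determinant is negative.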
Submitted 2 May, 2024;
originally announced May 2024.
-
Modified least squares method and a review of its applications in machine learning and fractional differential/integral equations
Authors:
Abhishek Kumar Singh,
Mani Mehra,
Anatoly A. Alikhanov
Abstract:
The least squares method provides the best-fit curve by minimizing the total squared error. In this work, we provide the modified least squares method based on the fractional orthogonal polynomials that belong to the space $M_{n}^λ := \text{span}\{1,x^λ,x^{2λ},\ldots,x^{nλ}\},~λ\in (0,2]$. Numerical experiments demonstrate how to solve different problems using the modified least squares method. Moreover, the results show the advantage of the modified least squares method compared to the classical least squares method. Furthermore, we discuss the various applications of the modified least squares method in fields such as fractional differential/integral equations and machine learning.
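As a rough illustration of fitting in the space $M_{n}^λ$, the sketch below solves the normal equations in the plain monomial basis $1, x^λ, \ldots, x^{nλ}$. This is an assumption made for brevity: the paper works with fractional orthogonal polynomials, which span the same space but are not constructed here. The function names and the test function $\sqrt{x}$ are hypothetical choices, not taken from the paper.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= factor * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_fractional(xs, ys, n, lam):
    """Least squares coefficients c_0..c_n for sum_j c_j * x**(j*lam)."""
    basis = [[x ** (j * lam) for j in range(n + 1)] for x in xs]
    gram = [[sum(row[i] * row[j] for row in basis) for j in range(n + 1)]
            for i in range(n + 1)]
    rhs = [sum(row[i] * y for row, y in zip(basis, ys)) for i in range(n + 1)]
    return solve_linear_system(gram, rhs)


def sum_squared_error(xs, ys, coeffs, lam):
    """Total squared error of the fitted curve on the sample points."""
    return sum(
        (sum(c * x ** (j * lam) for j, c in enumerate(coeffs)) - y) ** 2
        for x, y in zip(xs, ys)
    )


xs = [i / 10 for i in range(11)]
ys = [math.sqrt(x) for x in xs]

frac = fit_fractional(xs, ys, n=1, lam=0.5)      # basis {1, x^0.5}
classical = fit_fractional(xs, ys, n=1, lam=1.0)  # basis {1, x}
print(sum_squared_error(xs, ys, frac, 0.5))
print(sum_squared_error(xs, ys, classical, 1.0))
```

With $λ = 1$ the space reduces to ordinary polynomials, so the second fit is the classical straight-line fit. For data with a square-root profile the fractional basis contains the target exactly, which is the kind of advantage the abstract refers to.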
Submitted 1 May, 2024;
originally announced May 2024.
-
Composite antiferromagnetic and orbital order with altermagnetic properties at a cuprate/manganite interface
Authors:
Subhrangsu Sarkar,
Roxana Capu,
Yurii G. Pashkevich,
Jonas Knobel,
Marli R. Cantarino,
Abhishek Nag,
Kurt Kummer,
Davide Betto,
Roberto Sant,
Christopher W. Nicholson,
Jarji Khmaladze,
Ke-Jin Zhou,
Nicholas B. Brookes,
Claude Monney,
Christian Bernhard
Abstract:
Heterostructures from complex oxides allow one to combine various electronic and magnetic orders so as to induce new quantum states. A prominent example is the coupling between superconducting and magnetic orders in multilayers from high-Tc cuprates and manganites. A key role is played here by the interfacial CuO2 layer whose distinct properties remain to be fully understood. Here, we study with resonant inelastic X-ray scattering (RIXS) the magnon excitations of this interfacial CuO2 layer. In particular, we show that the underlying antiferromagnetic exchange interaction at the interface is strongly suppressed to J ~ 70 meV, as compared to J ~ 130 meV for the CuO2 layers away from the interface. Moreover, we observe an anomalous momentum dependence of the intensity of the interfacial magnon mode and show that it suggests that the antiferromagnetic order is accompanied by a particular kind of orbital order that yields a so-called altermagnetic state. Such a two-dimensional altermagnet has recently been predicted to enable new spintronic applications and superconducting proximity effects.
Submitted 30 April, 2024;
originally announced April 2024.
-
FashionSD-X: Multimodal Fashion Garment Synthesis using Latent Diffusion
Authors:
Abhishek Kumar Singh,
Ioannis Patras
Abstract:
The rapid evolution of the fashion industry increasingly intersects with technological advancements, particularly through the integration of generative AI. This study introduces a novel generative pipeline designed to transform the fashion design process by employing latent diffusion models. Utilizing ControlNet and LoRA fine-tuning, our approach generates high-quality images from multimodal inputs such as text and sketches. We leverage and enhance state-of-the-art virtual try-on datasets, including Multimodal Dress Code and VITON-HD, by integrating sketch data. Our evaluation, utilizing metrics like FID, CLIP Score, and KID, demonstrates that our model significantly outperforms traditional stable diffusion models. The results not only highlight the effectiveness of our model in generating fashion-appropriate outputs but also underscore the potential of diffusion models in revolutionizing fashion design workflows. This research paves the way for more interactive, personalized, and technologically enriched methodologies in fashion design and representation, bridging the gap between creative vision and practical application.
Submitted 26 April, 2024;
originally announced April 2024.
-
A Biased Estimator for MinMax Sampling and Distributed Aggregation
Authors:
Joel Wolfrath,
Abhishek Chandra
Abstract:
MinMax sampling is a technique for downsampling a real-valued vector which minimizes the maximum variance over all vector components. This approach is useful for reducing the amount of data that must be sent over a constrained network link (e.g. in the wide-area). MinMax can provide unbiased estimates of the vector elements, along with unbiased estimates of aggregates when vectors are combined from multiple locations. In this work, we propose a biased MinMax estimation scheme, B-MinMax, which trades an increase in estimator bias for a reduction in variance. We prove that when no aggregation is performed, B-MinMax obtains a strictly lower MSE compared to the unbiased MinMax estimator. When aggregation is required, B-MinMax is preferable when sample sizes are small or the number of aggregated vectors is limited. Our experiments show that this approach can substantially reduce the MSE for MinMax sampling in many practical settings.
Submitted 26 April, 2024;
originally announced April 2024.
-
Phase diagram of generalized XY model using tensor renormalization group
Authors:
Abhishek Samlodia,
Vamika Longia,
Raghav G. Jha,
Anosh Joseph
Abstract:
We use the higher-order tensor renormalization group method to study the two-dimensional generalized XY model that admits integer and half-integer vortices. This model is the deformation of the classical XY model and has a rich phase structure consisting of nematic, ferromagnetic, and disordered phases and three transition lines belonging to the Berezinskii-Kosterlitz-Thouless and Ising class. We explore the model for a wide range of temperatures, $T$, and the deformation parameter, $Δ$, and compute specific heat along with integer and half-integer magnetic susceptibility, finding both BKT-like and Ising-like transitions and the region where they meet.
Submitted 26 April, 2024;
originally announced April 2024.
-
CyNetDiff -- A Python Library for Accelerated Implementation of Network Diffusion Models
Authors:
Eliot W. Robson,
Dhemath Reddy,
Abhishek K. Umrawal
Abstract:
In recent years, there has been increasing interest in network diffusion models and related problems. The most popular of these are the independent cascade and linear threshold models. Much of the recent experimental work done on these models requires a large number of simulations conducted on large graphs, a computationally expensive task suited for low-level languages. However, many researchers prefer the use of higher-level languages (such as Python) for their flexibility and shorter development times. Moreover, in many research tasks, these simulations are the most computationally intensive task, so it would be desirable to have a library for these with an interface to a high-level language with the performance of a low-level language. To fill this niche, we introduce CyNetDiff, a Python library with components written in Cython to provide improved performance for these computationally intensive diffusion tasks.
Submitted 25 April, 2024;
originally announced April 2024.
-
Quantification of 2D Interfaces: Quality of heterostructures, and what is inside a nanobubble
Authors:
Mainak Mondal,
Pawni Manchanda,
Soumadeep Saha,
Abhishek Jangid,
Akshay Singh
Abstract:
Trapped materials at the interfaces of two-dimensional heterostructures (HS) lead to reduced coupling between the layers, resulting in degraded optoelectronic performance and device variability. Further, nanobubbles can form at the interface during transfer or after annealing. The question of what is inside a nanobubble, i.e. the trapped material, remains unanswered, limiting the studies and applications of these nanobubble systems. In this work, we report two key advances. Firstly, we quantify the interface quality using RAW-format optical imaging, and distinguish between ideal and non-ideal interfaces. The HS-substrate ratio value is calculated using a transfer matrix model, and is able to detect the presence of trapped layers. The second key advance is identification of water as the trapped material inside a nanobubble. To the best of our knowledge, this is the first study to show that optical imaging alone can quantify interface quality, and find the type of trapped material inside spontaneously formed nanobubbles. We also define a quality index parameter to quantify the interface quality of HS. Quantitative measurement of the interface will help answer the question whether annealing is necessary during HS preparation, and will enable creation of complex HS with small twist angles. Identification of the trapped materials will pave the way towards using nanobubbles for novel optical and engineering applications.
Submitted 25 April, 2024;
originally announced April 2024.
-
Expected Time-Optimal Control: a Particle MPC-based Approach via Sequential Convex Programming
Authors:
Kazuya Echigo,
Abhishek Cauligi,
Behçet Açıkmeşe
Abstract:
In this paper, we consider the problem of minimum-time optimal control for a dynamical system with initial state uncertainties and propose a sequential convex programming (SCP) solution framework. We seek to minimize the expected terminal (mission) time, which is an essential capability for planetary exploration missions where ground rovers have to carry out scientific tasks efficiently within the mission timelines in uncertain environments. Our main contribution is to convert the underlying stochastic optimal control problem into a deterministic, numerically tractable, optimal control problem. To this end, the proposed solution framework combines two strategies from previous methods: i) a partial model predictive control with consensus horizon approach and ii) a sum-of-norm cost, a temporally strictly increasing weighted-norm, promoting minimum-time trajectories. Our contribution is to adopt these formulations into an SCP solution framework and obtain a numerically tractable stochastic control algorithm. We then demonstrate the resulting control method in multiple applications: i) a closed-loop linear system as a representative result (a spacecraft double integrator model), ii) an open-loop linear system (the same model), and then iii) a nonlinear system (Dubins car).
Submitted 24 April, 2024;
originally announced April 2024.
-
A proof theory of (omega-)context-free languages, via non-wellfounded proofs
Authors:
Anupam Das,
Abhishek De
Abstract:
We investigate the proof theory of regular expressions with fixed points, construed as a notation for (omega-)context-free grammars. Starting with a hypersequential system for regular expressions due to Das and Pous, we define its extension by least fixed points and prove soundness and completeness of its non-wellfounded proofs for the standard language model. From here we apply proof-theoretic techniques to recover an infinitary axiomatisation of the resulting equational theory, complete for inclusions of context-free languages. Finally, we extend our syntax by greatest fixed points, now computing omega-context-free languages. We show the soundness and completeness of the corresponding system using a mixture of proof-theoretic and game-theoretic techniques.
Submitted 24 April, 2024;
originally announced April 2024.
-
Warp Drives and Martel-Poisson charts
Authors:
Abhishek Chowdhury
Abstract:
We extend the construction of the Alcubierre-Natario class of warp drives to an infinite class of spacetimes with similar properties. This is achieved by utilising the Martel-Poisson charts which closely resemble the Weak Painleve-Gullstrand form for various background metrics (Mink, AdS, dS). The highlight of this construction is the non-flat intrinsic metric which in three-dimensional spacetimes introduces conical singularities at the origin and in higher dimensions generates a non-zero Ricci scalar for the spatial hypersurfaces away from the origin. We analyse the expansion/contraction of space and negative energy densities associated with this class of warp drives and find interesting deviations due to the global imprints of the conical defects. Several generalizations are also discussed.
Submitted 24 April, 2024;
originally announced April 2024.
-
Jamming memory into acoustically trained dense suspensions under shear
Authors:
Edward Y. X. Ong,
Anna R. Barth,
Navneet Singh,
Meera Ramaswamy,
Abhishek Shetty,
Bulbul Chakraborty,
James P. Sethna,
Itai Cohen
Abstract:
Systems driven far from equilibrium often retain structural memories of their processing history. This memory has, in some cases, been shown to dramatically alter the material response. For example, work hardening in crystalline metals can alter the hardness, yield strength, and tensile strength to prevent catastrophic failure. Whether memory of processing history can be similarly exploited in flowing systems, where significantly larger changes in structure should be possible, remains poorly understood. Here, we demonstrate a promising route to embedding such useful memories. We build on work showing that exposing a sheared dense suspension to acoustic perturbations of different power allows for dramatically tuning the sheared suspension viscosity and underlying structure. We find that, for sufficiently dense suspensions, upon removing the acoustic perturbations, the suspension shear jams with shear stress contributions from the maximum compressive and maximum extensive axes that reflect the acoustic training. Because the contributions from these two orthogonal axes to the total shear stress are antagonistic, it is possible to tune the resulting suspension response in surprising ways. For example, we show that differently trained sheared suspensions exhibit: 1) different susceptibility to the same acoustic perturbation; 2) orders of magnitude changes in their instantaneous viscosities upon shear reversal; and 3) even a shear stress that increases in magnitude upon shear cessation. To further illustrate the power of this approach for controlling suspension properties, we demonstrate that flowing states well below the shear jamming threshold can be shear jammed via acoustic training. Collectively, our work paves the way for using acoustically induced memory in dense suspensions to generate rapidly and widely tunable materials.
Submitted 24 April, 2024;
originally announced April 2024.
-
Inertia and Activity: Spiral transitions in semi-flexible, self-avoiding polymers
Authors:
Chitrak Karan,
Abhishek Chaudhuri,
Debasish Chaudhuri
Abstract:
We consider a two-dimensional, tangentially active, semi-flexible, self-avoiding polymer to find a dynamical re-entrant transition between motile open chains and spinning achiral spirals with increasing activity. Utilizing probability distributions of the turning number, we ascertain the comparative stability of the spiral structure and present a detailed phase diagram within the activity-inertia plane. The onset of spiral formation at low activity levels is governed by a torque balance and is independent of inertia. At higher activities, however, inertial effects lead to spiral destabilization, an effect absent in the overdamped limit. We further delineate alterations in size and shape by analyzing the end-to-end distance distribution and the radius of gyration tensor. The Kullback-Leibler divergence from equilibrium distributions exhibits a non-monotonic relationship with activity, reaching a peak at the most compact spirals characterized by the most persistent spinning. As inertia increases, this divergence from equilibrium diminishes.
Submitted 24 April, 2024;
originally announced April 2024.
-
Minimum Consistent Subset in Trees and Interval Graphs
Authors:
Aritra Banik,
Sayani Das,
Anil Maheshwari,
Bubai Manna,
Subhas C Nandy,
Krishna Priya K M,
Bodhayan Roy,
Sasanka Roy,
Abhishek Sahu
Abstract:
In the Minimum Consistent Subset (MCS) problem, we are presented with a connected simple undirected graph $G=(V,E)$, consisting of a vertex set $V$ of size $n$ and an edge set $E$. Each vertex in $V$ is assigned a color from the set $\{1,2,\ldots, c\}$. The objective is to determine a subset $V' \subseteq V$ with minimum possible cardinality, such that for every vertex $v \in V$, at least one of its nearest neighbors in $V'$ (measured in terms of the hop distance) shares the same color as $v$. The decision problem, indicating whether there exists a subset $V'$ of cardinality at most $l$ for some positive integer $l$, is known to be NP-complete even for planar graphs.
In this paper, we establish that the MCS problem for trees, when the number of colors $c$ is considered an input parameter, is NP-complete. We propose a fixed-parameter tractable (FPT) algorithm for MCS on trees running in $O(2^{6c}n^6)$ time, significantly improving the currently best-known algorithm whose running time is $O(2^{4c}n^{2c+3})$.
In an effort to comprehensively understand the computational complexity of the MCS problem across different graph classes, we extend our investigation to interval graphs. We show that it remains NP-complete for interval graphs, thus enriching graph classes where MCS remains intractable.
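The consistency condition defined in this abstract can be checked directly on small instances. The sketch below is not the paper's FPT algorithm: `is_consistent_subset` only verifies the definition using breadth-first search for hop distances, and `minimum_consistent_subset` is an exponential brute-force search meant for tiny graphs. All function names and the example graph are hypothetical.

```python
from collections import deque
from itertools import combinations


def hop_distances(adj, source):
    """Breadth-first search distances (in hops) from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_consistent_subset(adj, color, subset):
    """True if every vertex has a nearest member of `subset` of its own color.

    Assumes `adj` describes a connected simple undirected graph.
    """
    subset = set(subset)
    if not subset:
        return False
    for v in adj:
        dist = hop_distances(adj, v)
        nearest = min(dist[s] for s in subset)
        if not any(dist[s] == nearest and color[s] == color[v] for s in subset):
            return False
    return True


def minimum_consistent_subset(adj, color):
    """Brute force over subsets by increasing size (exponential time)."""
    vertices = list(adj)
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            if is_consistent_subset(adj, color, candidate):
                return set(candidate)
    return set(vertices)


# Path 0 - 1 - 2 - 3 with colors 1, 1, 2, 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
color = {0: 1, 1: 1, 2: 2, 3: 2}
print(is_consistent_subset(adj, color, {1, 2}))
print(len(minimum_consistent_subset(adj, color)))
```

The whole vertex set is always consistent, since every vertex is its own nearest neighbor at distance zero, so the brute-force search always terminates with an answer.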
Submitted 23 April, 2024;
originally announced April 2024.