-
Sunburst quantum Ising battery
Authors:
Akash Mitra,
Shashi C. L. Srivastava
Abstract:
We study the energy transfer process in the recently proposed sunburst quantum Ising model, which consists of two interacting integrable systems: a transverse Ising chain with a very small transverse field and a finite number of external isolated qubits. We show that in this model of the quantum battery, coupling between the battery and charger can be used to optimize the ergotropy, which is the maximum amount of energy that can be extracted from the battery. At the same time, maximum charging power increases with the coupling strength, allowing for the simultaneous optimization of both ergotropy and charging power in the strong coupling limit. Furthermore, we show that both ergotropy and charging power are independent of the initial state of the charger.
Submitted 26 June, 2024;
originally announced June 2024.
-
Deep Learning for Slum Mapping in Remote Sensing Images: A Meta-analysis and Review
Authors:
Anjali Raj,
Adway Mitra,
Manjira Sinha
Abstract:
The major Sustainable Development Goals (SDG) 2030, set by the United Nations Development Program (UNDP), include sustainable cities and communities, no poverty, and reduced inequalities. However, millions of people live in slums or informal settlements with poor living conditions in many major cities around the world, especially in less developed countries. To emancipate these settlements and their inhabitants through government intervention, accurate data about slum location and extent is required. While ground survey data is the most reliable, such surveys are costly and time-consuming. An alternative is remotely sensed data obtained from very high-resolution (VHR) imagery. With the advancement of new technology, remote sensing based mapping of slums has emerged as a prominent research area. The parallel rise of Artificial Intelligence, especially Deep Learning, has added a new dimension to this field as it allows automated analysis of satellite imagery to identify complex spatial patterns associated with slums. This article offers a detailed review and meta-analysis of research on slum mapping using remote sensing imagery from 2014 to 2024, with a special focus on deep learning approaches. Our analysis reveals a trend towards increasingly complex neural network architectures, with advancements in data preprocessing and model training techniques significantly enhancing slum identification accuracy. We have attempted to identify key methodologies that are effective across diverse geographic contexts. While acknowledging the transformative impact of Convolutional Neural Networks (CNNs) in slum detection, our review underscores the absence of a universally optimal model, suggesting the need for context-specific adaptations. We also identify prevailing challenges in this field, such as data limitations and a lack of model explainability, and suggest potential strategies for overcoming these.
Submitted 12 June, 2024;
originally announced June 2024.
-
Bottlenecking in graphs and a coarse Menger-type theorem
Authors:
Michael Bruner,
Atish Mitra,
Heidi Steiger
Abstract:
We expand upon the notion of bottlenecking introduced in our earlier work, characterizing a spectrum of graphs and showing that this naturally extends to a concept of coarse bottlenecking. We examine how bottlenecking differs from other notions of connectedness. We show how the notion of bottlenecking provides a different approach to coarsening measures of connectedness than the Coarse Menger Conjecture proposed independently by Georgakopoulos and Papasoglu as well as Albrechtsen, Huynh, Jacobs, Knappe, and Wollan, which was recently disproved by a counterexample. We give a proof of a coarse Menger-type theorem for the class of coarsely bottlenecked graphs, and provide a conjecture that would extend this to all graphs.
Submitted 17 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Exploring Alternative Cosmologies with the LSST: Simulated Forecasts and Current Observational Constraints
Authors:
Dharmendra Kumar,
Ayan Mitra,
Shahnawaz A. Adil,
Anjan A. Sen
Abstract:
In recent years, the Lambda Cold Dark Matter (LCDM) model, which has been pivotal in cosmological studies, has faced significant challenges due to emerging observational and theoretical inconsistencies. This paper explores alternative cosmological models to address these discrepancies, using simulated three-year photometric Supernovae Ia data from the Legacy Survey of Space and Time (LSST), supplemented with additional Pantheon+, Union, and the recently released Dark Energy Survey 5 Years (DESY5) supernova compilations and Baryon Acoustic Oscillation (BAO) measurements. We assess the constraining power of these datasets on various dynamic dark energy models, including CPL, BA, JBP, SCPL, and GCG. Our analysis demonstrates that the LSST, with its high-precision data, can provide tighter constraints on dark energy parameters compared to other datasets. Additionally, the inclusion of BAO measurements significantly improves parameter constraints across all models.
Except for Pantheon+, we find that across all the cosmological datasets, and the dark energy models considered in this work, there is a consistent deviation from the LCDM model that exceeds a 2-sigma significance level. Our findings underscore the necessity of exploring dynamic dark energy models, which offer more consistent frameworks with fundamental physics and observational data, potentially resolving tensions within the LCDM paradigm. Furthermore, the use of simulated LSST data highlights the survey's potential in offering significant advantages for exploring alternative cosmologies, suggesting that future LSST observations would play a crucial role.
Submitted 10 June, 2024;
originally announced June 2024.
-
Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text
Authors:
Avijit Mitra,
Emily Druhl,
Raelene Goodwin,
Hong Yu
Abstract:
Social and behavioral determinants of health (SBDH) play a crucial role in health outcomes and are frequently documented in clinical text. Automatically extracting SBDH information from clinical text relies on publicly available good-quality datasets. However, existing SBDH datasets exhibit substantial limitations in their availability and coverage. In this study, we introduce Synth-SBDH, a novel synthetic dataset with detailed SBDH annotations, encompassing status, temporal information, and rationale across 15 SBDH categories. We showcase the utility of Synth-SBDH on three tasks using real-world clinical datasets from two distinct hospital settings, highlighting its versatility, generalizability, and distillation capabilities. Models trained on Synth-SBDH consistently outperform counterparts with no Synth-SBDH training, achieving up to 62.5% macro-F improvements. Additionally, Synth-SBDH proves effective for rare SBDH categories and under-resource constraints. Human evaluation demonstrates a Human-LLM alignment of 71.06% and uncovers areas for future refinements.
Submitted 10 June, 2024;
originally announced June 2024.
-
Entanglement with neutral atoms in the simulation of nonequilibrium dynamics of one-dimensional spin models
Authors:
Anupam Mitra
Abstract:
Quantum entanglement is a key ingredient for quantum information processing with capabilities beyond that of classical computation. We study the generation and role of entanglement in the dynamics of spin-1/2 models, both for the design of quantum gates for general-purpose quantum computation and for quantum simulation of interacting spin models. We introduce the neutral atom Mølmer-Sørensen gate, involving rapid adiabatic Rydberg dressing interleaved in a spin-echo sequence. We show its robustness to quasi-static experimental imperfections and favorable scaling with the time-energy scales of Rydberg-mediated entanglement generation. In quantum simulation, we consider critical behavior in quench dynamics of transverse field Ising models. Using matrix product states to calculate the dynamics, we find that order parameters, critical point, and critical exponents can be estimated using modest bond dimensions. Considering the role of chaos and equilibration in quenches, we find that local observables are well approximated either due to low global entanglement or the proximity of local marginals to the maximally mixed state. These findings highlight the challenge of identifying relevant quantum phenomena that remain inaccessible to classical descriptions. Understanding the regimes where classical descriptions fail but remain accessible to pre-fault tolerant quantum hardware will help inform the design of future quantum information processors.
Submitted 7 June, 2024;
originally announced June 2024.
-
Graph Skeletons and Diminishing Minors
Authors:
Michael Bruner,
Atish Mitra,
Heidi Steiger
Abstract:
We define coarse skeletons of graphs in terms of two constants. We introduce the notion of coarse bottlenecking in graphs and show how it can guarantee that a skeleton resembles (up to quasi-isometry) the original graph. We show how notions similar to the coarse skeleton have been previously used to classify some coarse families of graphs. We explore the properties of a coarse skeleton and of combinations of skeletons. We show how these tools can be used to simplify the structure of graphs that have an excluded asymptotic minor, reducing it to a skeleton of the original containing at most a 3-fat minor. We give an example to show that a similar result does not hold for 2-fat minors.
Submitted 27 May, 2024;
originally announced May 2024.
-
Krylov complexity of deformed conformal field theories
Authors:
Arghya Chattopadhyay,
Vinay Malvimat,
Arpita Mitra
Abstract:
We consider a perturbative expansion of the Lanczos coefficients and the Krylov complexity for two-dimensional conformal field theories under integrable deformations. Specifically, we explore the consequences of $T{\bar{T}}$, $J{\bar{T}}$, and $J{\bar{J}}$ deformations, focusing on first-order corrections in the deformation parameter. Under $T\bar{T}$ deformation, we demonstrate that the Lanczos coefficients $b_n$ exhibit unexpected behavior, deviating from linear growth within the valid perturbative regime. Notably, the Krylov exponent characterizing the rate of exponential growth of complexity surpasses that of the undeformed theory for positive values of the deformation parameter, suggesting a potential violation of the conjectured operator growth bound within the realm of perturbative analysis. One may attribute this to the existence of logarithmic branch points along with higher-order poles in the autocorrelation function compared to the undeformed case. In contrast, both $J{\bar{J}}$ and $J{\bar{T}}$ deformations induce no first-order correction to either the linear growth of Lanczos coefficients at large-$n$ or the Krylov exponent, and hence the results for these two deformations align with those of the undeformed theory.
Submitted 20 May, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
-
A new hybrid gadolinium nanoparticles-loaded polymeric material for neutron detection in rare event searches
Authors:
DarkSide-20k Collaboration,
F. Acerbi,
P. Adhikari,
P. Agnes,
I. Ahmad,
S. Albergo,
I. F. Albuquerque,
T. Alexander,
A. K. Alton,
P. Amaudruz,
M. Angiolilli,
E. Aprile,
R. Ardito,
M. Atzori Corona,
D. J. Auty,
M. Ave,
I. C. Avetisov,
O. Azzolini,
H. O. Back,
Z. Balmforth,
A. Barrado Olmedo,
P. Barrillon,
G. Batignani,
P. Bhowmick
, et al. (290 additional authors not shown)
Abstract:
Experiments aimed at direct searches for WIMP dark matter require highly effective reduction of backgrounds and control of any residual radioactive contamination. In particular, neutrons interacting with atomic nuclei represent an important class of backgrounds due to the expected similarity of a WIMP-nucleon interaction, so that such experiments often feature a dedicated neutron detector surrounding the active target volume. In the context of the development of DarkSide-20k detector at INFN Gran Sasso National Laboratory (LNGS), several R&D projects were conceived and developed for the creation of a new hybrid material rich in both hydrogen and gadolinium nuclei to be employed as an essential element of the neutron detector. Thanks to its very high cross-section for neutron capture, gadolinium is one of the most widely used elements in neutron detectors, while the hydrogen-rich material is instrumental in efficiently moderating the neutrons. In this paper results from one of the R&Ds are presented. In this effort the new hybrid material was obtained as a poly(methyl methacrylate) (PMMA) matrix, loaded with gadolinium oxide in the form of nanoparticles. We describe its realization, including all phases of design, purification, construction, characterization, and determination of mechanical properties of the new material.
Submitted 29 April, 2024;
originally announced April 2024.
-
UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt -- A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis
Authors:
Parth Vashisht,
Abhilasha Lodha,
Mukta Maddipatla,
Zonghai Yao,
Avijit Mitra,
Zhichao Yang,
Junda Wang,
Sunjae Kwon,
Hong Yu
Abstract:
This paper presents our team's participation in the MEDIQA-ClinicalNLP2024 shared task B. We present a novel approach to diagnosing clinical dermatology cases by integrating large multimodal models, specifically leveraging the capabilities of GPT-4V under a retriever and a re-ranker framework. Our investigation reveals that GPT-4V, when used as a retrieval agent, can accurately retrieve the correct skin condition 85% of the time using dermatological images and brief patient histories. Additionally, we empirically show that Naive Chain-of-Thought (CoT) works well for retrieval while Medical Guidelines Grounded CoT is required for accurate dermatological diagnosis. Further, we introduce a Multi-Agent Conversation (MAC) framework and show its superior performance and potential over the best CoT strategy. The experiments suggest that using naive CoT for retrieval and multi-agent conversation for critique-based diagnosis, GPT-4V can lead to an early and accurate diagnosis of dermatological conditions. The implications of this work extend to improving diagnostic workflows, supporting dermatological education, and enhancing patient care by providing a scalable, accessible, and accurate diagnostic tool.
Submitted 8 May, 2024; v1 submitted 26 April, 2024;
originally announced April 2024.
-
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
Authors:
Mihir Parmar,
Nisarg Patel,
Neeraj Varshney,
Mutsumi Nakamura,
Man Luo,
Santosh Mashetty,
Arindam Mitra,
Chitta Baral
Abstract:
Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks. But, can they really "reason" over the natural language? This question has been receiving significant research attention and many reasoning skills such as commonsense, numerical, and qualitative have been studied. However, the crucial skill pertaining to 'logical reasoning' has remained underexplored. Existing work investigating this reasoning ability of LLMs has focused only on a couple of inference rules (such as modus ponens and modus tollens) of propositional and first-order logic. Addressing the above limitation, we comprehensively evaluate the logical reasoning ability of LLMs on 25 different reasoning patterns spanning over propositional, first-order, and non-monotonic logics. To enable systematic evaluation, we introduce LogicBench, a natural language question-answering dataset focusing on the use of a single inference rule. We conduct detailed analysis with a range of LLMs such as GPT-4, ChatGPT, Gemini, Llama-2, and Mistral using chain-of-thought prompting. Experimental results show that existing LLMs do not fare well on LogicBench; especially, they struggle with instances involving complex reasoning and negations. Furthermore, they sometimes overlook contextual information necessary for reasoning to arrive at the correct conclusion. We believe that our work and findings facilitate future research for evaluating and enhancing the logical reasoning ability of LLMs. Data and code are available at https://github.com/Mihir3009/LogicBench.
Submitted 6 June, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Authors:
Marah Abdin,
Sam Ade Jacobs,
Ammar Ahmad Awan,
Jyoti Aneja,
Ahmed Awadallah,
Hany Awadalla,
Nguyen Bach,
Amit Bahree,
Arash Bakhtiari,
Jianmin Bao,
Harkirat Behl,
Alon Benhaim,
Misha Bilenko,
Johan Bjorck,
Sébastien Bubeck,
Qin Cai,
Martin Cai,
Caio César Teodoro Mendes,
Weizhu Chen,
Vishrav Chaudhary,
Dong Chen,
Dongdong Chen,
Yen-Chun Chen,
Yi-Ling Chen,
Parul Chopra
, et al. (90 additional authors not shown)
Abstract:
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered publicly available web data and synthetic data. The model is also further aligned for robustness, safety, and chat format. We also provide some initial parameter-scaling results with 7B and 14B models trained for 4.8T tokens, called phi-3-small and phi-3-medium, both significantly more capable than phi-3-mini (e.g., respectively 75% and 78% on MMLU, and 8.7 and 8.9 on MT-bench). Moreover, we introduce phi-3-vision, a 4.2 billion parameter model based on phi-3-mini with strong reasoning capabilities for image and text prompts.
Submitted 23 May, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
The Localized Active Space Method with Unitary Selective Coupled Cluster
Authors:
Abhishek Mitra,
Ruhee D'Cunha,
Qiaohong Wang,
Matthew R. Hermes,
Yuri Alexeev,
Stephen K. Gray,
Matthew Otten,
Laura Gagliardi
Abstract:
We introduce a hybrid quantum-classical algorithm, the localized active space unitary selective coupled cluster singles and doubles (LAS-USCCSD) method. Derived from the localized active space unitary coupled cluster (LAS-UCCSD) method, LAS-USCCSD first performs a classical LASSCF calculation, then selectively identifies the most important parameters (cluster amplitudes used to build the multireference UCC ansatz) for restoring inter-fragment interaction energy using this reduced set of parameters with the variational quantum eigensolver method. We benchmark LAS-USCCSD against LAS-UCCSD by calculating the total energies of $(\mathrm{H}_2)_2$, $(\mathrm{H}_2)_4$ and \textit{trans}-butadiene, and the magnetic coupling constant for a bimetallic compound [Cr$_2$(OH)$_3$(NH$_3$)$_6$]$^{3+}$. For these systems, we find that LAS-USCCSD reduces the number of required parameters and thus the circuit depth by at least one order of magnitude, an aspect which is important for the practical implementation of multireference hybrid quantum-classical algorithms like LAS-UCCSD on near-term quantum computers.
Submitted 19 April, 2024;
originally announced April 2024.
-
Self-diffusion is temperature independent on active membranes
Authors:
Saurav G. Varma,
Argha Mitra,
Sumantra Sarkar
Abstract:
Molecular transport maintains cellular structures and functions. For example, lipid and protein diffusion sculpts the dynamic shapes and structures on the cell membrane that perform essential cellular functions, such as cell signaling. Temperature variations in thermal equilibrium rapidly change molecular transport properties. The coefficient of lipid self-diffusion increases exponentially with temperature in thermal equilibrium, for example. Hence, in the noisy cellular environment, where temperatures can fluctuate widely due to local heat generation, maintaining cellular homeostasis through molecular transport is hard in thermal equilibrium. In this paper, using both molecular and lattice-based modeling of membrane transport, we show that the presence of active transport originating from the cell's cytoskeleton can make the self-diffusion of the molecules on the membrane robust to temperature fluctuations. The resultant temperature-independence of self-diffusion keeps the precision of cellular signaling invariant over a broad range of ambient temperatures, allowing cells to make robust decisions. We have also found that the Kawasaki algorithm, the widely used model of lipid transport on lattices, predicts incorrect temperature dependence of lipid self-diffusion in equilibrium. We propose a new algorithm that correctly captures the equilibrium properties of lipid self-diffusion and reproduces experimental observations.
Submitted 16 April, 2024;
originally announced April 2024.
-
Monotonicity of renormalization group flow, Perelman's entropy functional, and emergent dual holography in the worldsheet nonlinear $σ$ model
Authors:
Ki-Seok Kim,
Arpita Mitra,
Debangshu Mukherjee,
Shinsei Ryu
Abstract:
Based on the renormalization group (RG) flow of worldsheet bosonic string theory, we construct an effective holographic dual description, where an extra dimension is identified with an RG scale. As a result, we obtain a dilaton-gravity effective theory for the dynamics of an emergent target spacetime, analogous to the low-energy description of bosonic M theory. We argue that this holographic dual effective field theory is non-perturbative in nature for the $α'$ expansion, where the RG flow of the target spacetime manifests in the level of an effective bulk action. Based on the holographic dual effective field theory, we investigate the monotonicity of the RG flow. Inspired by the monotonicity of the Ricci flow given by Perelman, we propose a holographic construction of the Perelman's entropy functional. Based on the equivalence between the Hamilton-Jacobi equation and the local RG equation, we show that the RG flow of holographic Perelman's entropy functional is nothing but the Weyl anomaly. This leads us to the monotonicity of the RG flow of the emergent target spacetime. Furthermore, considering the entropy production along the RG flow, we construct a microscopic entropy functional based on the probability distribution function of the holographic dual effective field theory, regarded as Gibbs or Shannon entropy. We find that the monotonicity of this microscopically constructed entropy functional shows a strong connection with the monotonicity of the holographic Perelman's entropy functional.
Submitted 19 April, 2024; v1 submitted 13 April, 2024;
originally announced April 2024.
-
Quantum Spin Liquids in Weak Mott Insulators with a Spin-Orbit Coupling
Authors:
Asimpunya Mitra,
Daniel J. Schultz,
Yong Baek Kim
Abstract:
The weak Mott insulating regime of the triangular lattice Hubbard model exhibits a rich magnetic phase diagram as a result of the ring exchange interaction in the spin Hamiltonian. These phases include the Kalmeyer-Laughlin type chiral spin liquid (CSL) and a valence bond solid (VBS). A natural question arises regarding the robustness of these phases in the presence of a weak spin-orbit coupling (SOC). In this study, we derive the effective spin model for the spin-orbit coupled triangular lattice Hubbard model in the weak Mott insulating regime, including all SOC-mediated spin-bilinears and ring-exchange interactions. We then construct a simplified spin model keeping only the most relevant SOC-mediated spin interactions. Using infinite density matrix renormalization group (iDMRG) we show that the CSL and VBS phases of the triangular lattice Hubbard model can be stabilized in the presence of a weak SOC. The stabilization results from a compensation between the Dzyaloshinskii-Moriya interaction and a SOC-mediated ring exchange interaction. We also provide additional qualitative arguments to intuitively understand the compensation mechanism in the iDMRG quantum phase diagrams. This mechanism for stabilization can potentially be useful for the experimental realization of quantum spin liquids.
Submitted 8 April, 2024;
originally announced April 2024.
-
Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods
Authors:
Roopkatha Dey,
Aivy Debnath,
Sayak Kumar Dutta,
Kaustav Ghosh,
Arijit Mitra,
Arghya Roy Chowdhury,
Jaydip Sen
Abstract:
In various real-world applications such as machine translation, sentiment analysis, and question answering, a pivotal role is played by NLP models, facilitating efficient communication and decision-making processes in domains ranging from healthcare to finance. However, a significant challenge is posed to the robustness of these natural language processing models by text adversarial attacks. These attacks involve the deliberate manipulation of input text to mislead the predictions of the model while maintaining human interpretability. Despite the remarkable performance achieved by state-of-the-art models like BERT in various natural language processing tasks, they are found to remain vulnerable to adversarial perturbations in the input text. In addressing the vulnerability of text classifiers to adversarial attacks, three distinct attack mechanisms are explored in this paper using the victim model BERT: BERT-on-BERT attack, PWWS attack, and Fraud Bargain's Attack (FBA). Leveraging the IMDB, AG News, and SST2 datasets, a thorough comparative analysis is conducted to assess the effectiveness of these attacks on the BERT classifier model. It is revealed by the analysis that PWWS emerges as the most potent adversary, consistently outperforming other methods across multiple evaluation scenarios, thereby emphasizing its efficacy in generating adversarial examples for text classification. Through comprehensive experimentation, the performance of these attacks is assessed and the findings indicate that the PWWS attack outperforms others, demonstrating lower runtime, higher accuracy, and favorable semantic similarity scores. The key insight of this paper lies in the assessment of the relative performances of three prevalent state-of-the-art attack mechanisms.
Submitted 7 April, 2024;
originally announced April 2024.
-
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Authors:
Corby Rosset,
Ching-An Cheng,
Arindam Mitra,
Michael Santacroce,
Ahmed Awadallah,
Tengyang Xie
Abstract:
This paper studies post-training large language models (LLMs) using preference feedback from a powerful oracle to help a model iteratively improve over itself. The typical approach for post-training LLMs involves Reinforcement Learning from Human Feedback (RLHF), which traditionally separates reward learning and subsequent policy optimization. However, such a reward maximization approach is limited by the nature of "point-wise" rewards (such as the Bradley-Terry model), which fail to express complex intransitive or cyclic preference relations. While advances on RLHF show reward learning and policy optimization can be merged into a single contrastive objective for stability, they still remain tethered to the reward maximization framework. Recently, a new wave of research sidesteps the reward maximization presumptions in favor of directly optimizing over "pair-wise" or general preferences. In this paper, we introduce Direct Nash Optimization (DNO), a provable and scalable algorithm that marries the simplicity and stability of contrastive learning with theoretical generality from optimizing general preferences. Because DNO is a batched on-policy algorithm using a regression-based objective, its implementation is straightforward and efficient. Moreover, DNO enjoys monotonic improvement across iterations that helps it improve even over a strong teacher (such as GPT-4). In our experiments, a resulting 7B parameter Orca-2.5 model aligned by DNO achieves the state-of-the-art win-rate against GPT-4-Turbo of 33% on AlpacaEval 2.0 (even after controlling for response length), an absolute gain of 26% (7% to 33%) over the initializing model. It outperforms models with far more parameters, including Mistral Large, Self-Rewarding LM (70B parameters), and older versions of GPT-4.
Submitted 4 April, 2024;
originally announced April 2024.
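The claim in the abstract above that "point-wise" rewards cannot express cyclic preferences can be illustrated with the standard Bradley-Terry form. This is a textbook sketch, not a derivation taken from the paper; the reward function $r$ and the logistic function $\sigma$ are notation assumed here for illustration.

```latex
% Bradley-Terry preference model with a scalar ("point-wise") reward r:
P(a \succ b) = \sigma\bigl(r(a) - r(b)\bigr), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}.

% Since \sigma(x) \ge 1/2 if and only if x \ge 0:
P(a \succ b) \ge \tfrac{1}{2} \ \text{and}\ P(b \succ c) \ge \tfrac{1}{2}
\;\Longrightarrow\; r(a) \ge r(b) \ge r(c)
\;\Longrightarrow\; P(a \succ c) \ge \tfrac{1}{2}.
```

Under this model a strict cycle such as $a \succ b \succ c \succ a$ would require $r(a) > r(b) > r(c) > r(a)$, which no scalar reward satisfies.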
-
Floquet Product Mode
Authors:
Hsiu-Chung Yeh,
Achim Rosch,
Aditi Mitra
Abstract:
Results are presented for the dynamics of edge modes in interacting Floquet Ising chains. It is shown that in addition to the quasi-stable $0$ and $π$ edge modes, a third long-lived edge mode arising from the operator product of the $0$ and $π$ edge modes exists. Depending on the microscopic parameters, this Floquet product mode is shown to have a substantially longer lifetime than the individual $0$ and $π$ modes. This is triggered by a scattering process which converts a $0$ mode into a $π$ mode while scattering two bulk excitations. This process can lead to a rapid decay of both the $0$ and $π$ modes without affecting the product mode.
Submitted 26 March, 2024;
originally announced March 2024.
-
DASA: Delay-Adaptive Multi-Agent Stochastic Approximation
Authors:
Nicolo Dal Fabbro,
Arman Adibi,
H. Vincent Poor,
Sanjeev R. Kulkarni,
Aritra Mitra,
George J. Pappas
Abstract:
We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server. We assume that the up-link transmissions to the server are subject to asynchronous and potentially unbounded time-varying delays. To mitigate the effect of delays and stragglers while reaping the benefits of distributed computation, we propose \texttt{DASA}, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation. We provide a finite-time analysis of \texttt{DASA} assuming that the agents' stochastic observation processes are independent Markov chains. Significantly advancing existing results, \texttt{DASA} is the first algorithm whose convergence rate depends only on the mixing time $τ_{mix}$ and on the average delay $τ_{avg}$ while jointly achieving an $N$-fold convergence speedup under Markovian sampling. Our work is relevant for various SA applications, including multi-agent and distributed temporal difference (TD) learning, Q-learning and stochastic optimization with correlated data.
Submitted 28 March, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Quantum Fluctuations Suppress the Critical Fields in BaCo$_2$(AsO$_4$)$_2$
Authors:
Shiva Safari,
William Bateman-Hemphill,
Asimpunya Mitra,
Félix Desrochers,
Emily Z. Zhang,
Lubuna Shafeek,
Austin Ferrenti,
Tyrel M. McQueen,
Arkady Shekhter,
Zoltán Köllö,
Yong Baek Kim,
B. J. Ramshaw,
K. A. Modic
Abstract:
Early efforts to realize exotic quantum ground states in frustrated magnets focused on frustration arising from the lattice geometry alone. Attention has shifted to bond-dependent anisotropic interactions, as well as further-neighbor interactions, on non-geometrically-frustrated lattices due to their greater versatility. The honeycomb magnet BaCo$_2$(AsO$_4$)$_2$ recently emerged as a candidate host for both bond-dependent (e.g. Kitaev) and third-neighbor ($J_3$) interactions, and has become a model experimental system due to its relatively low levels of disorder. Understanding the relative importance of different exchange interactions holds the key to achieving novel ground states, such as quantum spin liquids. Here, we use the magnetotropic susceptibility to map out the intermediate and high-field phase diagram of BaCo$_2$(AsO$_4$)$_2$ as a function of the out-of-plane magnetic field direction at $T = 1.6$ K. We show that the experimental data are qualitatively consistent with classical Monte Carlo results of the XXZ-$J_1$-$J_3$ model with small Kitaev and off-diagonal exchange couplings included. However, the calculated critical fields are systematically larger than the experimental values. Infinite-DMRG computations on the quantum model reveal that quantum corrections from a nearby ferromagnetic state are likely responsible for the suppressed critical fields. Together, our experiment and theory analyses demonstrate that, while quantum fluctuations play an important role in determining the phase diagram, most of the physics of BaCo$_2$(AsO$_4$)$_2$ can be understood in terms of the classical dynamics of long-range ordered states, leaving little room for the possibility of a quantum spin liquid.
Submitted 22 March, 2024;
originally announced March 2024.
-
ClinicalMamba: A Generative Clinical Language Model on Longitudinal Clinical Notes
Authors:
Zhichao Yang,
Avijit Mitra,
Sunjae Kwon,
Hong Yu
Abstract:
The advancement of natural language processing (NLP) systems in healthcare hinges on the ability of language models to interpret the intricate information contained within clinical notes. This process often requires integrating information from various time points in a patient's medical history. However, most earlier clinical language models were pretrained with a context length limited to roughly one clinical document. In this study, we introduce ClinicalMamba, a specialized version of the Mamba language model, pretrained on a vast corpus of longitudinal clinical notes to address the unique linguistic characteristics and information processing needs of the medical domain. ClinicalMamba, with 130 million and 2.8 billion parameters, demonstrates superior performance in modeling clinical language across extended text lengths compared to Mamba and clinical Llama. With few-shot learning, ClinicalMamba achieves notable benchmarks in speed and accuracy, outperforming existing clinical language models and general domain large models like GPT-4 in longitudinal clinical notes information extraction tasks.
Submitted 8 March, 2024;
originally announced March 2024.
-
A Simple Finite-Time Analysis of TD Learning with Linear Function Approximation
Authors:
Aritra Mitra
Abstract:
We study the finite-time convergence of TD learning with linear function approximation under Markovian sampling. Existing proofs for this setting either assume a projection step in the algorithm to simplify the analysis, or require a fairly intricate argument to ensure stability of the iterates. We ask: \textit{Is it possible to retain the simplicity of a projection-based analysis without actually performing a projection step in the algorithm?} Our main contribution is to show this is possible via a novel two-step argument. In the first step, we use induction to prove that under a standard choice of a constant step-size $α$, the iterates generated by TD learning remain uniformly bounded in expectation. In the second step, we establish a recursion that mimics the steady-state dynamics of TD learning up to a bounded perturbation on the order of $O(α^2)$ that captures the effect of Markovian sampling. Combining these pieces leads to an overall approach that considerably simplifies existing proofs. We conjecture that our inductive proof technique will find applications in the analyses of more complex stochastic approximation algorithms, and conclude by providing some examples of such applications.
Submitted 25 June, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
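The setting in the abstract above (TD learning with linear function approximation, a constant step-size $α$, Markovian sampling, no projection step) can be sketched as follows. This is a minimal illustration, not the paper's analysis or code: the two-state Markov chain, the rewards, the one-hot features, and the function name `td0_linear` are all assumptions made here for the example.

```python
import random


def td0_linear(num_steps=50_000, alpha=0.01, gamma=0.9, seed=0):
    """Run TD(0) with linear function approximation along a single
    Markovian trajectory, with a constant step size and no projection.

    The chain, rewards and features below are hypothetical toy choices.
    Returns the final parameter vector and the largest |theta_i| seen.
    """
    rng = random.Random(seed)
    transition = [[0.9, 0.1], [0.1, 0.9]]   # toy 2-state Markov chain
    rewards = [1.0, 0.0]                    # reward received in each state
    features = [[1.0, 0.0], [0.0, 1.0]]     # one-hot feature vectors phi(s)

    theta = [0.0, 0.0]
    largest = 0.0
    state = 0
    for _ in range(num_steps):
        # Markovian sampling: the next state depends on the current state.
        next_state = 0 if rng.random() < transition[state][0] else 1

        value = sum(t * f for t, f in zip(theta, features[state]))
        next_value = sum(t * f for t, f in zip(theta, features[next_state]))

        # TD error and the unprojected semi-gradient update.
        td_error = rewards[state] + gamma * next_value - value
        theta = [t + alpha * td_error * f
                 for t, f in zip(theta, features[state])]

        largest = max(largest, max(abs(t) for t in theta))
        state = next_state
    return theta, largest


if __name__ == "__main__":
    theta, largest = td0_linear()
    print("final theta:", theta)
    print("largest |theta_i| along the run:", largest)
```

In this toy instance the iterates stay bounded without any projection: each update is a convex combination that keeps every coordinate in $[0, 1/(1-γ)]$. That is a much simpler special case of the uniform boundedness the abstract establishes in expectation.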
-
Global shift symmetry on ADM hypersurface: Towards emergent "gravity"
Authors:
Ki-Seok Kim,
Arpita Mitra,
Debangshu Mukherjee,
Mitsuhiro Nishida
Abstract:
Generalized symmetries and their spontaneous breakdown serve as the fundamental concept to constrain the many-body entanglement structure, which allows us to characterize quantum phases of matter and emergent collective excitations. For example, emergent photons may be understood by spontaneous 1-form symmetry breaking, which results from a long-ranged entanglement structure between UV microscopic degrees of freedom. In this study, we show that emergent "gravity" may also arise in a similar fashion, where the quotation marks emphasize that the symmetry-constrained gravitons show unconventional properties compared to usual gravitons. As the electric 1-form symmetry in Maxwell theory is realized as a global shift symmetry of the spatial component of the U(1) gauge field, generated by the electric field, we demonstrate that a constant shift of the ADM metric on the spatial hypersurface can be viewed as a global symmetry, generated by the ADM canonical momentum. Deriving a vector-type conserved charge from the variation of action, we construct a shift symmetry operator. Considering a Wick rotation, we demonstrate that a gravitational Wilson loop is charged under the action of this shift symmetry operator, which thus confirms the existence of a generalized global symmetry on the ADM hypersurface. Based on the Ward identity, we show that the spontaneous breaking of this global shift symmetry may give rise to a non-propagating massless symmetric gauge field at the boundary of the hypersurface.
Submitted 4 March, 2024;
originally announced March 2024.
-
Transcription and translation of videos using fine-tuned XLSR Wav2Vec2 on custom dataset and mBART
Authors:
Aniket Tathe,
Anand Kamble,
Suyash Kumbharkar,
Atharva Bhandare,
Anirban C. Mitra
Abstract:
This research addresses the challenge of training an ASR model for personalized voices with minimal data. Utilizing just 14 minutes of custom audio from a YouTube video, we employ Retrieval-Based Voice Conversion (RVC) to create a custom Common Voice 16.0 corpus. Subsequently, a Cross-lingual Self-supervised Representations (XLSR) Wav2Vec2 model is fine-tuned on this dataset. The developed web-based GUI efficiently transcribes and translates input Hindi videos. By integrating XLSR Wav2Vec2 and mBART, the system aligns the translated text with the video timeline, delivering an accessible solution for multilingual video content transcription and translation for personalized voice.
Submitted 29 February, 2024;
originally announced March 2024.
-
From Simulations to Reality: Dark Energy Reconstruction with Simulated SNIa data from the Vera C. Rubin Observatory
Authors:
Ayan Mitra,
Isidro Gómez-Vargas,
Vasilios Zarikas
Abstract:
In this paper, we present an Artificial Neural Network (ANN) based reconstruction analysis of the Supernova Ia (SNIa) distance moduli ($μ(z)$), and hence dark energy, using LSST simulated three-year SNIa data. Our ANN reconstruction architecture can model both the distance moduli and their corresponding error estimates. For this we employ astroANN and incorporate Monte Carlo dropout techniques to quantify uncertainties in our predictions. We tune our hyperparameters through advanced genetic algorithms, including elitism, utilizing the DEAP library. We compared the performance of the ANN based reconstruction with two theoretical descriptions of dark energy models, $Λ$CDM and Chevallier-Linder-Polarski (CPL). We perform a Bayesian analysis for these two theoretical models using the LSST simulations and also compare with observations from Pantheon and Pantheon+ SNIa real data. We show that our model-independent reconstruction using ANN is consistent with both of them. We assessed the performance using mean squared error (MSE) and showed that the ANN can produce distance estimates in better agreement with the LSST dataset than either $Λ$CDM or CPL, albeit by a very small margin. We included an additional residual analysis and a null test with $F$-scores to show that the reconstructed distances from the ANN model are in excellent agreement with the $Λ$CDM or CPL model.
Submitted 29 February, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Spinning Black Hole in a Fluid
Authors:
Surojit Dalui,
Arpan Krishna Mitra,
Deeshani Mitra,
Subir Ghosh
Abstract:
In this paper, we propose a new Analogue Gravity example - a spinning (or Kerr) Black Hole in an extended fluid model. The fluid model receives Berry curvature contributions and applies to electron dynamics in Condensed Matter lattice systems in the hydrodynamic limit. We construct the acoustic metric for sonic fluctuations that obey a structurally relativistic wave equation in an effective curved background. In a novel approach of dimensional analysis, we have derived explicit expressions for effective mass and angular momentum per unit mass in the acoustic metric (in terms of fluid parameters), to identify with corresponding parameters of the Kerr metric. The spin is a manifestation of the Berry curvature-induced effective noncommutative structure in the fluid. Finally, we put the Kerr Black Hole analogy in a robust setting by revealing explicitly the presence of a horizon and an ergo-region for a specific background fluid velocity profile. We also show that the near-horizon behavior of the phase-space trajectory of a probe particle agrees with the Kerr Black Hole analogy. From a fluid dynamics perspective, the presence of a horizon signifies the wave blocking phenomenon.
Submitted 23 February, 2024;
originally announced February 2024.
-
On inference for modularity statistics in structured networks
Authors:
Anirban Mitra,
Konasale Prasad,
Joshua Cape
Abstract:
This paper revisits the classical concept of network modularity and its spectral relaxations used throughout graph data analysis. We formulate and study several modularity statistic variants for which we establish asymptotic distributional results in the large-network limit for networks exhibiting nodal community structure. Our work facilitates testing for network differences and can be used in conjunction with existing theoretical guarantees for stochastic blockmodel random graphs. Our results are enabled by recent advances in the study of low-rank truncations of large network adjacency matrices. We provide confirmatory simulation studies and real data analysis pertaining to the network neuroscience study of psychosis, specifically schizophrenia. Collectively, this paper contributes to the limited existing literature to date on statistical inference for modularity-based network analysis. Supplemental materials for this article are available online.
Submitted 23 February, 2024;
originally announced February 2024.
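For readers unfamiliar with the quantity the abstract above builds on, the classical (Newman-Girvan) modularity of a node partition can be computed as below. This sketch covers only the textbook statistic for a simple undirected graph, not the paper's variants or its asymptotic theory; the function name `modularity` and the edge-list input format are assumptions made here.

```python
def modularity(edges, community):
    """Classical modularity Q of a partition of a simple undirected graph.

    edges:     list of (u, v) pairs, each undirected edge listed once,
               with no self-loops (an assumption of this sketch).
    community: dict mapping every node to its community label.

    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    internal_edges = {}   # L_c for each community
    total_degree = {}     # d_c for each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        total_degree[cu] = total_degree.get(cu, 0) + 1
        total_degree[cv] = total_degree.get(cv, 0) + 1
        if cu == cv:
            internal_edges[cu] = internal_edges.get(cu, 0) + 1

    return sum(
        internal_edges.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in total_degree.items()
    )


if __name__ == "__main__":
    # Two disjoint triangles, each treated as its own community.
    triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
    labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(triangles, labels))
```

Putting every node in a single community gives $Q = 0$ by construction, which is a quick sanity check on any implementation.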
-
Orca-Math: Unlocking the potential of SLMs in Grade School Math
Authors:
Arindam Mitra,
Hamed Khanpour,
Corby Rosset,
Ahmed Awadallah
Abstract:
Mathematical word problem-solving has long been recognized as a complex task for small language models (SLMs). A recent study hypothesized that the smallest model size needed to achieve over 80% accuracy on the GSM8K benchmark is 34 billion parameters. To reach this level of performance with smaller models, researchers often train SLMs to generate Python code or use tools to help avoid calculation errors. Additionally, they employ ensembling, where outputs of up to 100 model runs are combined to arrive at a more accurate result. Result selection is done using consensus, majority vote or a separate verifier model used in conjunction with the SLM. Ensembling provides a substantial boost in accuracy but at a significant cost increase with multiple calls to the model (e.g., Phi-GSM uses top-48 to boost the performance from 68.2 to 81.5).
In this work, we present Orca-Math, a 7-billion-parameter SLM based on the Mistral-7B, which achieves 86.81% on GSM8K without the need for multiple model calls or the use of verifiers, code execution or any other external tools. Our approach has the following key elements: (1) A high quality synthetic dataset of 200K math problems created using a multi-agent setup where agents collaborate to create the data, (2) An iterative learning technique that enables the SLM to practice solving problems, receive feedback on its solutions and learn from preference pairs incorporating the SLM solutions and the feedback. When trained with Supervised Fine-Tuning alone, Orca-Math achieves 81.50% on the GSM8K pass@1 metric. With iterative preference learning, Orca-Math achieves 86.81% pass@1. Orca-Math surpasses the performance of significantly larger models such as LLAMA-2-70B, WizardMath-70B, Gemini-Pro, ChatGPT-3.5. It also significantly outperforms other smaller models while using far less data (hundreds of thousands vs. millions of problems).
Submitted 16 February, 2024;
originally announced February 2024.
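The majority-vote ensembling that the abstract above describes as a costly baseline (and that Orca-Math avoids) can be sketched in a few lines. This is only an illustration of the generic technique: the function name `majority_vote` and the idea of passing already-extracted final answers as strings are assumptions, not details from the paper.

```python
from collections import Counter


def majority_vote(answers):
    """Pick the most frequent final answer among several sampled model runs.

    answers: list of final answers (e.g. strings) extracted from
             independent generations for the same problem.
    Ties are resolved in favour of the answer that was seen first.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Five hypothetical samples for one problem; three of them agree.
    samples = ["42", "41", "42", "40", "42"]
    print(majority_vote(samples))
```

The cost the abstract points to is visible here: producing `answers` requires one model call per sample, so a vote over 48 or 100 samples multiplies inference cost accordingly.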
-
Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling
Authors:
Arman Adibi,
Nicolo Dal Fabbro,
Luca Schenato,
Sanjeev Kulkarni,
H. Vincent Poor,
George J. Pappas,
Hamed Hassani,
Aritra Mitra
Abstract:
Motivated by applications in large-scale and multi-agent reinforcement learning, we study the non-asymptotic performance of stochastic approximation (SA) schemes with delayed updates under Markovian sampling. While the effect of delays has been extensively studied for optimization, the manner in which they interact with the underlying Markov process to shape the finite-time performance of SA remains poorly understood. In this context, our first main contribution is to show that under time-varying bounded delays, the delayed SA update rule guarantees exponentially fast convergence of the \emph{last iterate} to a ball around the SA operator's fixed point. Notably, our bound is \emph{tight} in its dependence on both the maximum delay $τ_{max}$, and the mixing time $τ_{mix}$. To achieve this tight bound, we develop a novel inductive proof technique that, unlike various existing delayed-optimization analyses, relies on establishing uniform boundedness of the iterates. As such, our proof may be of independent interest. Next, to mitigate the impact of the maximum delay on the convergence rate, we provide the first finite-time analysis of a delay-adaptive SA scheme under Markovian sampling. In particular, we show that the exponent of convergence of this scheme gets scaled down by $τ_{avg}$, as opposed to $τ_{max}$ for the vanilla delayed SA rule; here, $τ_{avg}$ denotes the average delay across all iterations. Moreover, the adaptive scheme requires no prior knowledge of the delay sequence for step-size tuning. Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms, including TD learning, Q-learning, and stochastic gradient descent under Markovian sampling.
Submitted 27 March, 2024; v1 submitted 18 February, 2024;
originally announced February 2024.
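The vanilla delayed update rule discussed in the abstract above can be illustrated with a toy scalar problem: each step applies the SA operator evaluated at a stale iterate, with a bounded time-varying delay and noise driven by a Markov chain. Everything specific here (the linear operator, the two-state noise chain, the uniform delay distribution, and the name `delayed_sa`) is a hypothetical choice for illustration. The paper's delay-adaptive scheme is not reproduced, since the abstract does not specify its rule.

```python
import random


def delayed_sa(num_steps=2000, alpha=0.05, tau_max=5,
               theta_star=1.0, seed=0):
    """Stochastic approximation with delayed updates on a toy problem.

    Update: theta_{t+1} = theta_t + alpha * g(theta_{t - tau_t}, o_t),
    where g(theta, o) = (theta_star - theta) + o, tau_t is a delay in
    {0, ..., tau_max}, and o_t is noise from a 2-state Markov chain.
    Returns the full list of iterates [theta_0, ..., theta_T].
    """
    rng = random.Random(seed)
    history = [theta_star + 5.0]     # theta_0 starts far from the fixed point
    noise_values = [0.2, -0.2]       # bounded, state-dependent noise
    noise_state = 0

    for t in range(num_steps):
        # Time-varying bounded delay (cannot reach before the first iterate).
        tau = rng.randint(0, min(tau_max, t))
        stale_theta = history[t - tau]

        # Markovian noise: the chain flips state with probability 0.1.
        if rng.random() < 0.1:
            noise_state = 1 - noise_state

        update = (theta_star - stale_theta) + noise_values[noise_state]
        history.append(history[t] + alpha * update)
    return history


if __name__ == "__main__":
    iterates = delayed_sa()
    print("initial error:", abs(iterates[0] - 1.0))
    print("final error:  ", abs(iterates[-1] - 1.0))
```

In this toy run the last iterate settles in a small neighbourhood of `theta_star` rather than at it, which mirrors the "ball around the fixed point" phrasing of the abstract. Increasing `alpha * tau_max` toward 1 makes the recursion oscillate, which gives some intuition for why the maximum delay enters the rate.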
-
MERP: Metaverse Extended Reality Portal
Authors:
Anisha Ghosh,
Aditya Mitra,
Anik Saha,
Sibi Chakkaravarthy Sethuraman,
Anitha Subramanian
Abstract:
A standardized control system called Metaverse Extended Reality Portal (MERP) is presented as a solution to the issues with conventional VR eyewear. The MERP system improves user awareness of the physical world while offering an immersive 3D view of the metaverse by using a shoulder-mounted projector to display a Heads-Up Display (HUD) in a designated Metaverse Experience Room. To provide natural and secure interaction inside the metaverse, a compass module and gyroscope integration enable accurate mapping of real-world motions to avatar actions. Through user tests and research, the MERP system shows that it may reduce mishaps brought on by poor spatial awareness, offering an improved metaverse experience and laying the groundwork for future developments in virtual reality technology. MERP, which is compared with existing Virtual Reality (VR) glasses used to traverse the metaverse, is projected to become a seamless, novel and better alternative. Existing VR headsets and AR glasses have well-known drawbacks that make them ineffective for prolonged use, as they can harm the eyes.
Submitted 8 February, 2024;
originally announced February 2024.
-
The World of Generative AI: Deepfakes and Large Language Models
Authors:
Alakananda Mitra,
Saraju P. Mohanty,
Elias Kougianos
Abstract:
We live in the era of Generative Artificial Intelligence (GenAI). Deepfakes and Large Language Models (LLMs) are two examples of GenAI. Deepfakes, in particular, pose an alarming threat to society as they are capable of spreading misinformation and changing the truth. LLMs are powerful language models that generate general-purpose language. However, due to their generative nature, they can also pose a risk to people if used with ill intentions. The ethical use of these technologies is a big concern. This short article explores the interrelationship between them.
Submitted 6 February, 2024;
originally announced February 2024.
-
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning
Authors:
Chenyu Zhang,
Han Wang,
Aritra Mitra,
James Anderson
Abstract:
Federated reinforcement learning (FRL) has emerged as a promising paradigm for reducing the sample complexity of reinforcement learning tasks by exploiting information from different agents. However, when each agent interacts with a potentially different environment, little to nothing is known theoretically about the non-asymptotic performance of FRL algorithms. The lack of such results can be attributed to various technical challenges and their intricate interplay: Markovian sampling, linear function approximation, multiple local updates to save communication, heterogeneity in the reward functions and transition kernels of the agents' MDPs, and continuous state-action spaces. Moreover, in the on-policy setting, the behavior policies vary with time, further complicating the analysis. In response, we introduce FedSARSA, a novel federated on-policy reinforcement learning scheme, equipped with linear function approximation, to address these challenges and provide a comprehensive finite-time error analysis. Notably, we establish that FedSARSA converges to a policy that is near-optimal for all agents, with the extent of near-optimality proportional to the level of heterogeneity. Furthermore, we prove that FedSARSA leverages agent collaboration to enable linear speedups as the number of agents increases, which holds for both fixed and adaptive step-size configurations.
Submitted 14 April, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Strong zero modes in integrable quantum circuits
Authors:
Eric Vernier,
Hsiu-Chung Yeh,
Lorenzo Piroli,
Aditi Mitra
Abstract:
It is a classic result that certain interacting integrable spin chains host robust edge modes known as strong zero modes (SZMs). In this work, we extend this result to the Floquet setting of local quantum circuits, focusing on a prototypical model providing an integrable Trotterization for the evolution of the XXZ Heisenberg spin chain. By exploiting the algebraic structures of integrability, we show that an exact SZM operator can be constructed for these integrable quantum circuits in certain regions of parameter space. Our construction, which recovers a well-known result by Paul Fendley in the continuous-time limit, relies on a set of commuting transfer matrices known from integrability, and allows us to easily prove important properties of the SZM, including normalizability. Our approach is different from previous methods and could be of independent interest even in the Hamiltonian setting. Our predictions, which are corroborated by numerical simulations of infinite-temperature autocorrelation functions, are potentially interesting for implementations of the XXZ quantum circuit on available quantum platforms.
Submitted 17 February, 2024; v1 submitted 22 January, 2024;
originally announced January 2024.
-
End to end Hindi to English speech conversion using Bark, mBART and a finetuned XLSR Wav2Vec2
Authors:
Aniket Tathe,
Anand Kamble,
Suyash Kumbharkar,
Atharva Bhandare,
Anirban C. Mitra
Abstract:
Speech has long been a barrier to effective communication and connection, persisting as a challenge in our increasingly interconnected world. This research paper introduces a solution to this persistent obstacle: an end-to-end speech conversion framework tailored for Hindi-to-English translation, culminating in the synthesis of English audio. By integrating XLSR Wav2Vec2 for automatic speech recognition (ASR), mBART for neural machine translation (NMT), and a Text-to-Speech (TTS) synthesis component, this framework offers a unified approach to cross-lingual communication. We describe each component in detail, explaining their individual contributions and the synergies that enable a fluid transition from spoken Hindi to synthesized English audio.
Submitted 10 January, 2024;
originally announced January 2024.
-
Geometric embeddings of spaces of persistence diagrams with explicit distortions
Authors:
Atish Mitra,
Ziga Virk
Abstract:
Let $n$ be a positive integer. We provide an explicit geometrically motivated $1$-Lipschitz map from the space of persistence diagrams on $n$ points (equipped with the bottleneck distance) into Hilbert space. Such maps are a crucial step in topological data analysis, allowing the use of statistics (and thus data analysis) on collections of persistence diagrams. The main advantage of our maps as compared to most of the other such transformations is that they are coarse and uniform embeddings with explicit distortion functions. This allows us to control the amount of geometric information lost through their application. Furthermore, we provide an explicit $1$-Lipschitz map from the space of persistence diagrams on $n$ points on a bounded domain into a Euclidean space with an explicit distortion function. The mentioned maps are fairly simple, with each component depending only on the bottleneck distance to the corresponding landmark persistence diagram. Due to geometric motivation from classical dimension theory, our methods are best described as quantitative dimension theory. We discuss the advantages and disadvantages of our approach. We conclude with a differently flavoured embedding of the space of persistence diagrams on $n$ points on a bounded domain into $\mathbb{R}^{n(n+1)}$.
Submitted 10 January, 2024;
originally announced January 2024.
-
Towards Model-Free LQR Control over Rate-Limited Channels
Authors:
Aritra Mitra,
Lintao Ye,
Vijay Gupta
Abstract:
Given the success of model-free methods for control design in many problem settings, it is natural to ask how things will change if realistic communication channels are utilized for the transmission of gradients or policies. While the resulting problem has analogies with the formulations studied under the rubric of networked control systems, the rich literature in that area has typically assumed that the model of the system is known. As a step towards bridging the fields of model-free control design and networked control systems, we ask: \textit{Is it possible to solve basic control problems - such as the linear quadratic regulator (LQR) problem - in a model-free manner over a rate-limited channel?} Toward answering this question, we study a setting where a worker agent transmits quantized policy gradients (of the LQR cost) to a server over a noiseless channel with a finite bit-rate. We propose a new algorithm titled Adaptively Quantized Gradient Descent (\texttt{AQGD}), and prove that above a certain finite threshold bit-rate, \texttt{AQGD} guarantees exponentially fast convergence to the globally optimal policy, with \textit{no deterioration of the exponent relative to the unquantized setting}. More generally, our approach reveals the benefits of adaptive quantization in preserving fast linear convergence rates, and, as such, may be of independent interest to the literature on compressed optimization.
Submitted 2 January, 2024;
originally announced January 2024.
-
Evaluating District-based Election Surveys with Synthetic Dirichlet Likelihood
Authors:
Adway Mitra,
Palash Dey
Abstract:
In district-based multi-party elections, electors cast votes in their respective districts. In each district, the party with the most votes wins the corresponding seat in the governing body. Election surveys try to predict the election outcome (vote shares and seat shares of parties) by querying a random sample of electors. However, the survey results are often inconsistent with the actual results, which could be due to multiple reasons. The aim of this work is to estimate a posterior distribution over the possible outcomes of the election, given one or more survey results. This is achieved using a prior distribution over vote shares, election models to simulate the complete election from the vote share, and survey models to simulate survey results from a complete election. The desired posterior distribution over the space of possible outcomes is constructed using Synthetic Dirichlet Likelihoods, whose parameters are estimated from Monte Carlo sampling of elections using the election models. We further show that the same approach can also be used to evaluate the surveys, that is, to assess whether they were biased, based on the true outcome once it is known. Our work offers the first-ever probabilistic model to analyze district-based election surveys. We illustrate our approach with extensive experiments on real and simulated data of district-based political elections in India.
Submitted 23 December, 2023;
originally announced December 2023.
-
Scalable simulation of non-equilibrium quantum dynamics via classically optimised unitary circuits
Authors:
Luke Causer,
Felix Jung,
Asimpunya Mitra,
Frank Pollmann,
Adam Smith
Abstract:
The advent of near-term digital quantum computers could offer us an exciting opportunity to investigate quantum many-body phenomena beyond that of classical computing. To make the best use of the hardware available, it is paramount that we have methods that accurately simulate Hamiltonian dynamics for limited circuit depths. In this paper, we propose a method to classically optimise unitary brickwall circuits to approximate quantum time evolution operators. Our method is scalable in system size through the use of tensor networks. We demonstrate that, for various three-body Hamiltonians, our approach produces quantum circuits that can outperform Trotterization in both their accuracy and the quantum circuit depth needed to implement the dynamics, with the exact details being dependent on the Hamiltonian. We also explain how to choose an optimal time step that minimises the combined errors of the quantum device and the brickwall circuit approximation.
Submitted 21 December, 2023;
originally announced December 2023.
-
Quantum-centric Supercomputing for Materials Science: A Perspective on Challenges and Future Directions
Authors:
Yuri Alexeev,
Maximilian Amsler,
Paul Baity,
Marco Antonio Barroca,
Sanzio Bassini,
Torey Battelle,
Daan Camps,
David Casanova,
Young jai Choi,
Frederic T. Chong,
Charles Chung,
Chris Codella,
Antonio D. Corcoles,
James Cruise,
Alberto Di Meglio,
Jonathan Dubois,
Ivan Duran,
Thomas Eckl,
Sophia Economou,
Stephan Eidenbenz,
Bruce Elmegreen,
Clyde Fare,
Ismael Faro,
Cristina Sanz Fernández,
Rodrigo Neumann Barros Ferreira,
et al. (102 additional authors not shown)
Abstract:
Computational models are an essential tool for the design, characterization, and discovery of novel materials. Hard computational tasks in materials science stretch the limits of existing high-performance supercomputing centers, consuming much of their simulation, analysis, and data resources. Quantum computing, on the other hand, is an emerging technology with the potential to accelerate many of the computational tasks needed for materials science. In order to do that, the quantum technology must interact with conventional high-performance computing in several ways: approximate results validation, identification of hard problems, and synergies in quantum-centric supercomputing. In this paper, we provide a perspective on how quantum-centric supercomputing can help address critical computational problems in materials science, the challenges to face in order to solve representative use cases, and new suggested directions.
Submitted 14 December, 2023;
originally announced December 2023.
-
Three-dimensional modelling of polygonal ridges in salt playas
Authors:
R. A. I. Haque,
A. J. Mitra,
T. Dutta
Abstract:
Salt playas with their tessellated surface of polygonal salt ridges are beautiful and intriguing, but the scientific community lacks a realistic and physically meaningful model that thoroughly explains their formation. In this work, we investigate the formation phenomena via suitable three-dimensional modelling and simulation of the dynamical processes that are responsible. We employ fracture mechanics, principles of energy minimization, fluid and mass transport in fracture channels, and processes of crystallization and self-organisation to replicate the almost Voronoi-like pattern of salt ridges that tessellate salt playas. The model is applicable to playas having different salt compositions, as the effects of the salt diffusion coefficient and the critical salinity at supersaturation for a particular ambient condition are factored in. The model closely reproduces the height distribution and geometry of the salt ridges reported in the literature. Further, we prove that the final stable polygonal geometry of the salt playas is an effort towards the total minimization of system energy.
Submitted 5 December, 2023;
originally announced December 2023.
-
Cotton Yield Prediction Using Random Forest
Authors:
Alakananda Mitra,
Sahila Beegum,
David Fleisher,
Vangimalla R. Reddy,
Wenguang Sun,
Chittaranjan Ray,
Dennis Timlin,
Arindam Malakar
Abstract:
The cotton industry in the United States is committed to sustainable production practices that minimize water, land, and energy use while improving soil health and cotton output. Climate-smart agricultural technologies are being developed to boost yields while decreasing operating expenses. Crop yield prediction, however, is difficult because of the complex and nonlinear impacts of cultivar, soil type, management, pest and disease, climate, and weather patterns on crops. To address this issue, we employ machine learning (ML) to forecast production while considering climate change, soil diversity, cultivar, and inorganic nitrogen levels. From the 1980s to the 1990s, field data were gathered across the southern cotton belt of the United States. To capture the most current effects of climate change over the previous six years, a second data source was produced using the process-based crop model, GOSSYM. We concentrated our efforts on three distinct areas inside each of the three southern states: Texas, Mississippi, and Georgia. To reduce the amount of computation, accumulated heat units (AHU) for each set of experimental data were employed in place of time-series weather data. The Random Forest Regressor yielded a 97.75% accuracy rate, with a root mean square error of 55.05 kg/ha and an $R^2$ of around 0.98. These findings demonstrate how an ML technique may be developed and applied as a reliable and easy-to-use model to support the cotton climate-smart initiative.
Submitted 4 December, 2023;
originally announced December 2023.
-
Quantum Multiple Kernel Learning in Financial Classification Tasks
Authors:
Shungo Miyabe,
Brian Quanz,
Noriaki Shimada,
Abhijit Mitra,
Takahiro Yamamoto,
Vladimir Rastunkov,
Dimitris Alevras,
Mekena Metcalf,
Daniel J. M. King,
Mohammad Mamouei,
Matthew D. Jackson,
Martin Brown,
Philip Intallura,
Jae-Eun Park
Abstract:
Financial services is a prospect industry where unlocked near-term quantum utility could yield profitable potential, and, in particular, quantum machine learning algorithms could potentially benefit businesses by improving the quality of predictive models. Quantum kernel methods have demonstrated success in financial, binary classification tasks, like fraud detection, and avoid issues found in variational quantum machine learning approaches. However, choosing a suitable quantum kernel for a classical dataset remains a challenge. We propose a hybrid, quantum multiple kernel learning (QMKL) methodology that can improve classification quality over a single kernel approach. We test the robustness of QMKL on several financially relevant datasets using both fidelity and projected quantum kernel approaches. We further demonstrate QMKL on quantum hardware using an error mitigation pipeline and show the benefits of QMKL in the large qubit regime.
Submitted 30 November, 2023;
originally announced December 2023.
-
A Universal Model of Floquet Operator Krylov Space
Authors:
Hsiu-Chung Yeh,
Aditi Mitra
Abstract:
It is shown that the stroboscopic time-evolution under a Floquet unitary, in any spatial dimension, and of any Hermitian operator, can be mapped to an operator Krylov space which is identical to that generated by the edge operator of the non-interacting Floquet transverse-field Ising model (TFIM) in one spatial dimension, and with inhomogeneous Ising and transverse field couplings. The latter has four topological phases reflected by the absence (topologically trivial) or presence (topologically non-trivial) of edge modes at $0$ and/or $π$ quasi-energies. It is shown that the Floquet dynamics share certain universal features characterized by how the Krylov parameters vary in the topological phase diagram of the Floquet TFIM with homogeneous couplings. These results are highlighted through examples, all chosen for numerical convenience to be in one spatial dimension: non-integrable Floquet spin-$1/2$ chains and a Floquet $Z_3$ clock model, where the latter hosts period-tripled edge modes.
Submitted 25 November, 2023;
originally announced November 2023.
-
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion
Authors:
Anand Kamble,
Aniket Tathe,
Suyash Kumbharkar,
Atharva Bhandare,
Anirban C. Mitra
Abstract:
This paper proposes two methodologies to construct customized Common Voice datasets for low-resource languages like Hindi. The first methodology leverages Bark, a transformer-based text-to-audio model developed by Suno, and incorporates Meta's EnCodec and a pre-trained HuBERT model to enhance Bark's performance. The second methodology employs Retrieval-Based Voice Conversion (RVC) and uses the Ozen toolkit for data preparation. Both methodologies contribute to the advancement of ASR technology and offer insights into addressing the challenges of constructing customized Common Voice datasets for under-resourced languages. Furthermore, they provide a pathway to achieving high-quality, personalized voice generation for a range of applications.
Submitted 9 January, 2024; v1 submitted 24 November, 2023;
originally announced November 2023.
-
On the Maximum Energy Release from Formation of Static Compact Objects
Authors:
A. Mitra,
K. K. Singh
Abstract:
Type II Supernova 1987A (SN 1987A), observed in 1987, released an energy of \mbox{$Q \approx 3 \times 10^{53}$ erg}. This huge energy is essentially the magnitude of the gravitational potential or self-gravitational energy (PE) of a newborn cold neutron star having a gravitational compactness or redshift $z_b \approx 0.15$. One may wonder what could be the upper limit on the amount of energy that might be released with the formation of a cold Ultra Compact Object (UCO) with an arbitrarily high $z_b$. Accordingly, here, for the first time, we obtain an analytical expression for the PE of a homogeneous general relativistic UCO assuming it to be cold and static. It is found that the PE of a homogeneous UCO of mass $M$ may exceed $Mc^2$ and be as large as $1.34\,Mc^2$. This result, though surprising, follows from an \textit{exact and correct} analytical calculation based on the standard General Theory of Relativity (GTR). Further, UCOs supported by tangential stresses may be inhomogeneous and much more massive than neutron stars, with PE $\sim 2.1\,Mc^2$. Thus, in principle, formation of a UCO of a few solar masses ($M_\odot$) might release an energy $Q\sim10^{55}$ erg.
Submitted 21 November, 2023;
originally announced November 2023.
-
Equivariant Global Hopf Bifurcation in Abstract Nonlinear Parabolic Equations
Authors:
Zalman Balanov,
Wieslaw Krawcewicz,
Arnaja Mitra,
Dmitrii Rachinskii
Abstract:
In this paper we study local and global symmetric Hopf bifurcation in abstract parabolic systems by means of the twisted equivariant degree.
Submitted 21 November, 2023;
originally announced November 2023.
-
Orca 2: Teaching Small Language Models How to Reason
Authors:
Arindam Mitra,
Luciano Del Corro,
Shweti Mahajan,
Andres Codas,
Clarisse Simoes,
Sahaj Agarwal,
Xuxi Chen,
Anastasia Razdaibiedina,
Erik Jones,
Kriti Aggarwal,
Hamid Palangi,
Guoqing Zheng,
Corby Rosset,
Hamed Khanpour,
Ahmed Awadallah
Abstract:
Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We contend that excessive emphasis on imitation may restrict the potential of smaller models. We seek to teach small LMs to employ different solution strategies for different tasks, potentially different from the one used by the larger model. For example, while larger models might provide a direct answer to a complex task, smaller models may not have the same capacity. In Orca 2, we teach the model various reasoning techniques (step-by-step, recall then generate, recall-reason-generate, direct answer, etc.). More crucially, we aim to help the model learn to determine the most effective solution strategy for each task. We evaluate Orca 2 using a comprehensive set of 15 diverse benchmarks (corresponding to approximately 100 tasks and over 36,000 unique prompts). Orca 2 significantly surpasses models of similar size and attains performance levels similar to or better than those of models 5-10x larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings. We make Orca 2 weights publicly available at aka.ms/orca-lm to support research on the development, evaluation, and alignment of smaller LMs.
Submitted 21 November, 2023; v1 submitted 18 November, 2023;
originally announced November 2023.
-
Soft factors with AdS radius corrections
Authors:
Karan Fernandes,
Nabamita Banerjee,
Arpita Mitra
Abstract:
We review recent developments concerning the soft factorization of scattering amplitudes that arise in the large radius limit of four-dimensional Anti-de Sitter (AdS$_4$) spacetimes. This includes the presence of AdS radius dependent corrections to known flat spacetime soft factors and their implications for the relationship between soft theorems and Ward identities of the boundary conformal field theory.
Submitted 30 October, 2023;
originally announced October 2023.
-
Signature quasinormal modes of Ellis-Bronnikov wormhole embedded in warped braneworld background
Authors:
Antariksha Mitra,
Suman Ghosh
Abstract:
We examine the quasinormal modes of Ellis-Bronnikov wormholes embedded in a warped five-dimensional braneworld background and compare them with those of the four-dimensional counterpart. These scalar quasinormal frequencies are obtained using the WKB formula, the Prony method, and the direct integration method. The signature of the warped extra dimension shows up as two distinct quasinormal ringing eras, characterised by two distinct dominant quasinormal modes. Features of the latter era are similar to those observed earlier for a massive scalar field in a black hole background. We also discuss how the steepness of the neck of the wormhole affects the quasinormal frequencies.
Submitted 10 February, 2024; v1 submitted 26 October, 2023;
originally announced October 2023.