Search | arXiv e-print repository

arXiv:2407.02662 [pdf, other]

Supporters and Skeptics: LLM-based Analysis of Engagement with Mental Health (Mis)Information Content on Video-sharing Platforms

Authors: Viet Cuong Nguyen, Mini Jain, Abhijat Chauhan, Heather Jaime Soled, Santiago Alvarez Lesmes, Zihang Li, Michael L. Birnbaum, Sunny X. Tang, Srijan Kumar, Munmun De Choudhury

Abstract: Over one in five adults in the US lives with a mental illness. In the face of a shortage of mental health professionals and offline resources, online short-form video content has grown to serve as a crucial conduit for disseminating mental health help and resources. However, the ease of content creation and access also contributes to the spread of misinformation, posing risks to accurate diagnosis and treatment. Detecting and understanding engagement with such content is crucial to mitigating their harmful effects on public health. We perform the first quantitative study of the phenomenon using YouTube Shorts and Bitchute as the sites of study. We contribute MentalMisinfo, a novel labeled mental health misinformation (MHMisinfo) dataset of 739 videos (639 from Youtube and 100 from Bitchute) and 135372 comments in total, using an expert-driven annotation schema. We first found that few-shot in-context learning with large language models (LLMs) are effective in detecting MHMisinfo videos. Next, we discover distinct and potentially alarming linguistic patterns in how audiences engage with MHMisinfo videos through commentary on both video-sharing platforms. Across the two platforms, comments could exacerbate prevailing stigma with some groups showing heightened susceptibility to and alignment with MHMisinfo. We discuss technical and public health-driven adaptive solutions to tackling the "epidemic" of mental health misinformation online.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 12 pages, in submission to ICWSM

arXiv:2406.01678 [pdf, other]

New insights into axion freeze-in

Authors: Mudit Jain, Angelo Maggi, Wen-Yuan Ai, David J. E. Marsh

Abstract: Freeze-in via the axion-photon coupling, $g_{φγ}$, can produce axions in the early Universe. At low reheating temperatures close to the minimum allowed value $T_{\rm reh}\approx T_{\rm BBN}\approx 10\,{\rm MeV}$, the abundance peaks for axion masses $m_φ\approx T_{\rm reh}$. Such heavy axions are unstable and subsequently decay, leading to strong constraints on $g_{φγ}$ from astrophysics and cosmology. In this work, we revisit the computation of the freeze-in abundance and clarify important issues. We begin with a complete computation of the collision terms for the Primakoff process, electron-positron annihilation, and photon-to-axion (inverse-)decay, while approximately taking into account plasma screening and threshold effects. We then solve the Boltzmann equation for the full axion distribution function. We confirm previous results about the importance of both processes to the effective "relic abundance" (defined as density prior to decay), and provide useful fitting formulae to estimate the freeze-in abundance from the equilibrium interaction rate. For the distribution function, we find an out-of-equilibrium population of axions and introduce an effective temperature for them. We follow the evolution right up until decay, and find that the average axion kinetic energy is larger than a thermal relic by between 20\% and 80\%, which may have implications for limits on decaying axions from X-ray spectra. We extend our study to a two-axion system with quartic cross-coupling, and find that for typical/expected couplings, freeze-in of a second axion flavour by annihilations leads to a negligibly small contribution to the relic density.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 18 pages, 10 figures, 3 appendices

Report number: KCL-PH-TH/2024-31

arXiv:2405.20971 [pdf, other]

Amortizing intractable inference in diffusion models for vision, language, and control

Authors: Siddarth Venkatraman, Moksh Jain, Luca Scimeca, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yoshua Bengio, Glen Berseth, Nikolay Malkin

Abstract: Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or likelihood function $r(\mathbf{x})$. We state and prove the asymptotic correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior, a problem that existing methods solve only approximately or in restricted cases. Relative trajectory balance arises from the generative flow network perspective on diffusion models, which allows the use of deep reinforcement learning techniques to improve mode coverage. Experiments illustrate the broad potential of unbiased inference of arbitrary posteriors under diffusion priors: in vision (classifier guidance), language (infilling under a discrete diffusion LLM), and multimodal data (text-to-image generation). Beyond generative modeling, we apply relative trajectory balance to the problem of continuous control with a score-based behavior prior, achieving state-of-the-art results on benchmarks in offline reinforcement learning.

Submitted 31 May, 2024; originally announced May 2024.

Comments: Code: https://github.com/GFNOrg/diffusion-finetuning

arXiv:2405.18540 [pdf, other]

Learning diverse attacks on large language models for robust red-teaming and safety tuning

Authors: Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain

Abstract: Red-teaming, or identifying prompts that elicit harmful responses, is a critical step in ensuring the safe and responsible deployment of large language models (LLMs). Developing effective protection against many modes of attack prompts requires discovering diverse attacks. Automated red-teaming typically uses reinforcement learning to fine-tune an attacker language model to generate prompts that elicit undesirable responses from a target LLM, as measured, for example, by an auxiliary toxicity classifier. We show that even with explicit regularization to favor novelty and diversity, existing approaches suffer from mode collapse or fail to generate effective attacks. As a flexible and probabilistically principled alternative, we propose to use GFlowNet fine-tuning, followed by a secondary smoothing phase, to train the attacker model to generate diverse and effective attack prompts. We find that the attacks generated by our method are effective against a wide range of target LLMs, both with and without safety tuning, and transfer well between target LLMs. Finally, we demonstrate that models safety-tuned using a dataset of red-teaming prompts generated by our method are robust to attacks from other RL-based red-teaming approaches.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.18024 [pdf, other]

Emergent inhomogeneity and non-locality in a graphene field-effect transistor on a near-parallel moire superlattice of transition metal dichalcogenides

Authors: Shaili Sett, Rahul Debnath, Arup Singha, Shinjan Mandal, Jyothsna K, Monika Bhakar, Kenji Watanabe, Takashi Taniguchi, Varun Raghunathan, Goutam Sheet, Manish Jain, Arindam Ghosh

Abstract: At near-parallel orientation, twisted bilayer of transition metal dichalcogenides exhibit inter-layer charge transfer-driven out-of-plane ferroelectricity that may lead to unique electronic device architectures. Here we report detailed electrical transport in a dual-gated graphene field-effect transistor placed on 3R stacked twisted bilayer of WSe2 at a twist angle of 2.1 degree. We observe hysteretic transfer characteristics and an emergent charge inhomogeneity with multiple local Dirac points as the electric displacement field (D) is increased. Concomitantly, we also observe a strong non-local voltage signal at D = 0 V/nm that decreases rapidly with increasing D. A linear scaling of the non-local signal with longitudinal resistance suggests edge mode transport, which we attribute to the breaking of valley symmetry of the graphene channel due to the spatially fluctuating electric field from the moire domains of the underlying twisted WSe2. A quantitative analysis connecting the non-locality and channel inhomogeneity suggests emergence of finite-size domains in the graphene channel that modulate the charge and the valley currents simultaneously. This work underlines efficient control and impact of interfacial ferroelectricity that can trigger a new genre of devices for twistronic applications.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 16 pages, 15 figures

arXiv:2405.14684 [pdf, other]

Engineering ultra-strong electron-phonon coupling and nonclassical electron transport in crystalline gold with nanoscale interfaces

Authors: Shreya Kumbhakar, Tuhin Kumar Maji, Binita Tongbram, Shinjan Mandal, Shri Hari Soundararaj, Banashree Debnath, T. Phanindra Sai, Manish Jain, H. R. Krishnamurthy, Anshu Pandey, Arindam Ghosh

Abstract: Electrical resistivity in good metals, particularly noble metals such as gold (Au), silver (Ag), or copper, increases linearly with temperature ($T$) for $T > Θ_{\mathrm{D}}$, where $Θ_{\mathrm{D}}$ is the Debye temperature. This is because the coupling ($λ$) between the electrons and the lattice vibrations, or phonons, in these metals is rather weak with $λ\sim 0.1-0.2$, and a perturbative analysis suffices to explain the $T$-linear electron-phonon scattering rate. In this work, we outline a new nanostructuring strategy of crystalline Au where this foundational concept of metallic transport breaks down. We show that by embedding a distributed network of ultra-small Ag nanoparticles (AgNPs) of radius $\sim1-2$ nm inside a crystalline Au shell, an unprecedented enhancement in the electron-phonon interaction, with $λ$ as high as $\approx 20$, can be achieved. This is over hundred times that of bare Au or Ag, and ten times larger than any known metal. With increasing AgNP density, the electrical resistivity deviates from $T$-linearity, and approaches a saturation to the Mott-Ioffe-Regel scale $ρ_{\mathrm{MIR}}\sim h a /e^2$ for both disorder ($T\to 0$) and phonon ($T \gg Θ_{\mathrm{D}}$)-dependent components of resistivity (here, $a=0.3$~nm, is the lattice constant of Au). This giant electron-phonon interaction, which we suggest arises from the coulomb interaction-induced coupling of conduction electrons to the localized phonon modes at the buried Au-Ag hetero-interfaces, allows experimental access to a regime of nonclassical metallic transport that has never been probed before.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 7+12 pages, total 4+8 figures

arXiv:2405.13546 [pdf, other]

Knowledge-Driven Cross-Document Relation Extraction

Authors: Monika Jain, Raghava Mutharaju, Kuldeep Singh, Ramakanth Kavuluru

Abstract: Relation extraction (RE) is a well-known NLP application often treated as a sentence- or document-level task. However, a handful of recent efforts explore it across documents or in the cross-document setting (CrossDocRE). This is distinct from the single document case because different documents often focus on disparate themes, while text within a document tends to have a single goal. Linking findings from disparate documents to identify new relationships is at the core of the popular literature-based knowledge discovery paradigm in biomedicine and other domains. Current CrossDocRE efforts do not consider domain knowledge, which are often assumed to be known to the reader when documents are authored. Here, we propose a novel approach, KXDocRE, that embed domain knowledge of entities with input text for cross-document RE. Our proposed framework has three main benefits over baselines: 1) it incorporates domain knowledge of entities along with documents' text; 2) it offers interpretability by producing explanatory text for predicted relations between entities 3) it improves performance over the prior methods.

Submitted 18 June, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

Comments: Accepted in ACL 2024 Findings

arXiv:2405.01616 [pdf, other]

Generative Active Learning for the Search of Small-molecule Protein Binders

Authors: Maksym Korablyov, Cheng-Hao Liu, Moksh Jain, Almer M. van der Sloot, Eric Jolicoeur, Edward Ruediger, Andrei Cristian Nica, Emmanuel Bengio, Kostiantyn Lapchevskyi, Daniel St-Cyr, Doris Alexandra Schuetz, Victor Ion Butoi, Jarrid Rector-Brooks, Simon Blackburn, Leo Feng, Hadi Nekoei, SaiKrishna Gottipati, Priyesh Vijayan, Prateek Gupta, Ladislav Rampášek, Sasikanth Avancha, Pierre-Luc Bacon, William L. Hamilton, Brooks Paige, Sanchit Misra , et al. (9 additional authors not shown)

Abstract: Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecules to discover candidates with a desired property. We apply LambdaZero with molecular docking to design novel small molecules that inhibit the enzyme soluble Epoxide Hydrolase 2 (sEH), while enforcing constraints on synthesizability and drug-likeliness. LambdaZero provides an exponential speedup in terms of the number of calls to the expensive molecular docking oracle, and LambdaZero de novo designed molecules reach docking scores that would otherwise require the virtual screening of a hundred billion molecules. Importantly, LambdaZero discovers novel scaffolds of synthesizable, drug-like inhibitors for sEH. In in vitro experimental validation, a series of ligands from a generated quinazoline-based scaffold were synthesized, and the lead inhibitor N-(4,6-di(pyrrolidin-1-yl)quinazolin-2-yl)-N-methylbenzamide (UM0152893) displayed sub-micromolar enzyme inhibition of sEH.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.16494 [pdf, ps, other]

Theoretical Insights into Inorganic Antiperovskite Nitrides (X$_3$NA; X = Mg, Sr, Ca, Ba; A = Sb, As): An Emerging Class of Materials for Photovoltaics

Authors: Sanchi Monga, Manjari Jain, Claudia Draxl, Saswata Bhattacharya

Abstract: Antiperovskite nitrides are potential candidates for applications harvesting solar light. With a comprehensive state-of-the-art approach combining hybrid density-functional theory, many-body perturbation theory, the Wannier-Mott model, density-functional perturbation theory, and the Feynman polaron model, we explore excitonic and polaronic effects in X$_3$NA (X: Mg, Ca, Sr, Ba, A = Sb, As). For all of them, we uncover a significant influence of the ionic dielectric screening on the static dielectric constant. Small exciton binding energies, weak electron-phonon coupling, and high charge-carrier mobilities facilitate enhanced charge transport in Mg$_3$NSb, Sr$_3$NSb, and Ba$_3$NSb. Our results highlight the potential of these nitrides as optimal candidates for efficient photovoltaic absorbers.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.10094 [pdf, other]

Towards DNA-Encoded Library Generation with GFlowNets

Authors: Michał Koziarski, Mohammed Abukalam, Vedant Shah, Louis Vaillancourt, Doris Alexandra Schuetz, Moksh Jain, Almer van der Sloot, Mathieu Bourgey, Anne Marinier, Yoshua Bengio

Abstract: DNA-encoded libraries (DELs) are a powerful approach for rapidly screening large numbers of diverse compounds. One of the key challenges in using DELs is library design, which involves choosing the building blocks that will be combinatorially combined to produce the final library. In this paper we consider the task of protein-protein interaction (PPI) biased DEL design. To this end, we evaluate several machine learning algorithms on the PPI modulation task and use them as a reward for the proposed GFlowNet-based generative approach. We additionally investigate the possibility of using structural information about building blocks to design a hierarchical action space for the GFlowNet. The observed results indicate that GFlowNets are a promising approach for generating diverse combinatorial library candidates.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.08423 [pdf, other]

SIR-RL: Reinforcement Learning for Optimized Policy Control during Epidemiological Outbreaks in Emerging Market and Developing Economies

Authors: Maeghal Jain, Ziya Uddin, Wubshet Ibrahim

Abstract: The outbreak of COVID-19 has highlighted the intricate interplay between public health and economic stability on a global scale. This study proposes a novel reinforcement learning framework designed to optimize health and economic outcomes during pandemics. The framework leverages the SIR model, integrating both lockdown measures (via a stringency index) and vaccination strategies to simulate disease dynamics. The stringency index, indicative of the severity of lockdown measures, influences both the spread of the disease and the economic health of a country. Developing nations, which bear a disproportionate economic burden under stringent lockdowns, are the primary focus of our study. By implementing reinforcement learning, we aim to optimize governmental responses and strike a balance between the competing costs associated with public health and economic stability. This approach also enhances transparency in governmental decision-making by establishing a well-defined reward function for the reinforcement learning agent. In essence, this study introduces an innovative and ethical strategy to navigate the challenge of balancing public health and economic stability amidst infectious disease outbreaks.

Submitted 30 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

Comments: 27 pages, 12 figures

arXiv:2403.09692 [pdf, other]

Phonon Linewidths in Twisted Bilayer Graphene near Magic Angle

Authors: Shinjan Mandal, Indrajit Maity, H. R. Krishnamurthy, Manish Jain

Abstract: We present a computational study of the phonon linewidths in twisted bilayer graphene arising from electron-phonon interactions and anharmonic effects. The electronic structure is calculated using distance-dependent transfer integrals based on the atomistic Slater-Koster tight-binding formalism, including electron-electron interactions treated at the Hartree level, and the phonons are calculated using classical force fields. These ingredients are used to calculate the phonon linewidths arising from electron-phonon interactions. Furthermore, anharmonic effects on the linewidths are computed using the mode-projected velocity autocorrelation function obtained from classical molecular dynamics. We predict a moiré potential induced splitting of this mode, which arises due to contributions from high symmetry stacking regions. Our findings show that both electron-phonon and anharmonic effects have a significant impact on the linewidth of the Raman active G mode near the magic angle.

Submitted 2 July, 2024; v1 submitted 15 February, 2024; originally announced March 2024.

Comments: 6 pages, 4 figures (Supplementary 13 pages, 11 figures)

arXiv:2403.02381 [pdf, other]

Vector Wave Dark Matter and Terrestrial Quantum Sensors

Authors: Dorian W. P. Amaral, Mudit Jain, Mustafa A. Amin, Christopher Tunnell

Abstract: (Ultra)light spin-$1$ particles -- dark photons -- can constitute all of dark matter (DM) and have beyond Standard Model couplings. This can lead to a coherent, oscillatory signature in terrestrial detectors that depends on the coupling strength. We provide a signal analysis and statistical framework for inferring the properties of such DM by taking into account (i) the stochastic and (ii) the vector nature of the underlying field, along with (iii) the effects due to the Earth's rotation. Owing to equipartition, on time scales shorter than the coherence time the DM field vector typically traces out a fixed ellipse. Taking this ellipse and the rotation of the Earth into account, we highlight a distinctive three-peak signal in Fourier space that can be used to constrain DM coupling strengths. Accounting for all three peaks, we derive latitude-independent constraints on such DM couplings, unlike those stemming from single-peak studies. We apply our framework to the search for ultralight $B - L$ DM using optomechanical sensors, demonstrating the ability to delve into previously unprobed regions of this DM candidate's parameter space.

Submitted 26 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: 23 pages, 9 figures, and 5 appendices. Compared to v2, added 1 figure. Also, for a movie of the vector field behavior based on simulations, visit https://www.youtube.com/watch?v=bbw6yFRLS7s

Report number: KCL-PH-TH/2024-09

Journal ref: JCAP 06 (2024) 050

arXiv:2402.14705 [pdf, other]

Engineering and Revealing Dirac Strings in Spinor Condensates

Authors: Gui-Sheng Xu, Mudit Jain, Xiang-Fa Zhou, Guang-Can Guo, Mustafa A. Amin, Han Pu, Zheng-Wei Zhou

Abstract: Artificial monopoles have been engineered in various systems, yet there has been no systematic study of the singular vector potentials associated with the monopole field. We show that the Dirac string, the line singularity of the vector potential, can be engineered, manipulated, and made manifest in a spinor atomic condensate. We elucidate the connection among spin, orbital degrees of freedom, and the artificial gauge, and show that there exists a mapping between the vortex filament and the Dirac string. We also devise a proposal where preparing initial spin states with relevant symmetries can result in different vortex patterns, revealing an underlying correspondence between the internal spin states and the spherical vortex structures. Such a mapping also leads to a new way of constructing spherical Landau levels, and monopole harmonics. Our observation provides insights into the behavior of quantum matter possessing internal symmetries in curved spaces.

Submitted 9 April, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

Comments: 5 pages + 5 figures + 5 appendices. In comparison with the previous version, we have (1) added supplementary material as appendices, and (2) made some minor revisions in the main text. Also, a simulation video for creating two artificial Dirac Strings with a mixed spin-1 state, is available at https://youtu.be/PCQGMqd-DfQ

arXiv:2402.11649 [pdf, other]

Electric field tunable superconductivity with competing orders in near magic-angle twisted bilayer graphene

Authors: Ranit Dutta, Ayan Ghosh, Shinjan Mandal, K. Watanabe, T. Taniguchi, H. R. Krishnamurthy, Sumilan Banerjee, Manish Jain, Anindya Das

Abstract: Superconductivity (SC) in twisted bilayer graphene (tBLG) has been explored by varying carrier concentrations, twist angles, and screening strength, with the aim of uncovering its origin and possible connections to strong electronic correlations in narrow bands and various resulting broken symmetries. However, the link between the tBLG band structure and the onset of SC and other orders largely remains unclear. In this study, we address this crucial gap by examining in-situ band structure tuning of a near magic-angle ($θ\approx0.95^0$) tBLG device with displacement field ($D$) and reveal remarkable competition between SC and other broken symmetries. At zero $D$, the device exhibits superconducting signatures without the resistance peak at half-filling, a characteristic signature with a strong electronic correlation. As $D$ increases, the SC is suppressed, accompanied by the appearance of a resistance peak at half-filling. Hall density measurements reveal that at zero $D$, SC arises around the van Hove singularity (vHs) from an isospin or spin-valley unpolarized band. At higher $D$, the suppression of SC coincides with broken isospin symmetry near half-filling with lifted degeneracy ($g_d \sim 2$). Additionally, when SC is suppressed at higher $D$, density waves around the superconducting dome are seen. These findings, with our theoretical calculations, highlight the competition between SC and other orders, and the stabilization of the latter by the electric field enhanced nesting in the band structure of tBLG.

Submitted 18 February, 2024; originally announced February 2024.

arXiv:2402.04620 [pdf, other]

CataractBot: An LLM-Powered Expert-in-the-Loop Chatbot for Cataract Patients

Authors: Pragnya Ramjee, Bhuvan Sachdeva, Satvik Golechha, Shreyas Kulkarni, Geeta Fulari, Kaushik Murali, Mohit Jain

Abstract: The healthcare landscape is evolving, with patients seeking more reliable information about their health conditions, treatment options, and potential risks. Despite the abundance of information sources, the digital age overwhelms individuals with excess, often inaccurate information. Patients primarily trust doctors and hospital staff, highlighting the need for expert-endorsed health information. However, the pressure on experts has led to reduced communication time, impacting information sharing. To address this gap, we propose CataractBot, an experts-in-the-loop chatbot powered by large language models (LLMs). Developed in collaboration with a tertiary eye hospital in India, CataractBot answers cataract surgery related questions instantly by querying a curated knowledge base, and provides expert-verified responses asynchronously. CataractBot features multimodal support and multilingual capabilities. In an in-the-wild deployment study with 49 participants, CataractBot proved valuable, providing anytime accessibility, saving time, and accommodating diverse literacy levels. Trust was established through expert verification. Broadly, our results could inform future work on designing expert-mediated LLM bots.

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2402.02787 [pdf, other]

Ab initio Investigation of Thermal Transport in Insulators: Unveiling the Roles of Phonon Renormalization and Higher-Order Anharmonicity

Authors: Soham Mandal, Manish Jain, Prabal K. Maiti

Abstract: The occurrence of thermal transport phenomena is widespread, exerting a pivotal influence on the functionality of diverse electronic and thermo-electric energy-conversion devices. The traditional first-principles theory governing the thermal and thermodynamic characteristics of insulators relies on the perturbative treatment of interatomic potential and ad-hoc displacement of atoms within supercells. However, the limitations of these approaches for highly anharmonic and weakly bonded materials, along with discrepancies arising from not considering explicit finite temperature effects, highlight the necessity for a well-defined quasiparticle approach to the lattice vibrations. To address these limitations, we present a comprehensive numerical framework in this study, designed to compute the thermal and thermodynamic characteristics of crystalline semiconductors and insulators. The self-consistent phonon renormalization method we have devised reveals phonons as quasiparticles, diverging from their conventional characterization as bare normal modes of lattice vibration. The extension of the renormalization impact to interatomic force constants (IFCs) of third and fourth orders is also integrated and demonstrated. For the comprehensive physical insights, we employed an iterative solution of the Peierls-Boltzmann transport equation (PBTE) to determine thermal conductivity and carry out Helmholtz free energy calculations, encompassing anharmonicity effects up to the fourth order.
In this study, we utilize our numerical framework to showcase its applicability through an examination of phonon dispersion, phonon linewidth, anharmonic phonon scattering, and temperature-dependent lattice thermal conductivity in both highly anharmonic materials (NaCl and AgI) and weakly anharmonic materials (cBN and 3C-SiC).

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2401.13880 [pdf, other]

Principal Component Regression to Study the Impact of Economic Factors on Disadvantaged Communities

Authors: Narmadha M. Mohankumar, Milan Jain, Heng Wan, Sumitrra Ganguli, Kyle D. Wilson, David M. Anderson

Abstract: The Council on Environmental Quality's Climate and Economic Justice Screening Tool defines "disadvantaged communities" (DAC) in the USA, highlighting census tracts where benefits of climate and energy investments are not accruing. We use a principal component generalized linear model, which addresses the intertwined nature of economic factors, income and employment and model their relationship to DAC status. Our study 1) identifies the most significant income groups and employment industries that impact DAC status, 2) provides the probability of DAC status across census tracts and compares the predictive accuracy with widely used machine learning approaches, 3) obtains historical predictions of the probability of DAC status, 4) obtains spatial downscaling of DAC status across block groups. Our study provides valuable insights for policymakers and stakeholders to develop strategies that promote sustainable development and address inequities in climate and energy investments in the USA.

Submitted 24 January, 2024; originally announced January 2024.

Comments: 13 pages, 9 figures, 2 tables

arXiv:2401.11800 [pdf, other]

Revisiting Document-Level Relation Extraction with Context-Guided Link Prediction

Authors: Monika Jain, Raghava Mutharaju, Ramakanth Kavuluru, Kuldeep Singh

Abstract: Document-level relation extraction (DocRE) poses the challenge of identifying relationships between entities within a document as opposed to the traditional RE setting where a single sentence is input. Existing approaches rely on logical reasoning or contextual cues from entities. This paper reframes document-level RE as link prediction over a knowledge graph with distinct benefits: 1) Our approach combines entity context with document-derived logical reasoning, enhancing link prediction quality. 2) Predicted links between entities offer interpretability, elucidating employed reasoning. We evaluate our approach on three benchmark datasets: DocRED, ReDocRED, and DWIE. The results indicate that our proposed method outperforms the state-of-the-art models and suggests that incorporating context-based link prediction techniques can enhance the performance of document-level relation extraction models.

Submitted 22 January, 2024; originally announced January 2024.

Comments: Accepted in AAAI 2024

arXiv:2401.09154

ANFIS and metaheuristic optimization for green supply chain with inspection and rework

Authors: Nidhi Sharma, Madhu Jain, Dinesh Sharma

Abstract: The focus of present article is to investigate a supply chain inventory model along with inspection and stock dependent demand with use of green technology to reduce carbon emissions. Products that are decaying, or those that change over time, have a high sensitivity to the environment in terms of temperature, carbon emission, humidity, waste disposal, etc. This study develops a profit maximization model in the presence of deterioration, preservation, imperfect production, inspection error, rework, stock and price-dependent demand. The three carbon emission strategies are proposed to reduce the expenses in different carbon emissions scenarios. The suggested approach may be used to determine the optimal production period, preservation investment, and level of green investment. The solution of the non-linear constraint optimization is provided by using a penalty method in metaheuristic approaches. In order to conduct a sensitivity analysis for the essential model parameters, a numerical example is presented. The soft computing results produced by DE and PSO are compared with the results obtained by Adaptive Neuro-Fuzzy Inference System (ANFIS) technique.

Submitted 18 January, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

Comments: One of the co-author is not agreed

arXiv:2401.00714 [pdf, other]

Calculation of Gilbert damping and magnetic moment of inertia using torque-torque correlation model within ab initio Wannier framework

Authors: Robin Bajaj, Seung-Cheol Lee, H. R. Krishnamurthy, Satadeep Bhattacharjee, Manish Jain

Abstract: Magnetization dynamics in magnetic materials are well described by the modified semiclassical Landau-Lifshitz-Gilbert (LLG) equation, which includes the magnetic damping $α$ and the magnetic moment of inertia $\mathrm{I}$ tensors as key parameters. Both parameters are material-specific and physically represent the time scales of damping of precession and nutation in magnetization dynamics. $α$ and $\mathrm{I}$ can be calculated quantum mechanically within the framework of the torque-torque correlation model. The quantities required for the calculation are torque matrix elements, the real and imaginary parts of the Green's function and its derivatives. Here, we calculate these parameters for the elemental magnets such as Fe, Co and Ni in an ab initio framework using density functional theory and Wannier functions. We also propose a method to calculate the torque matrix elements within the Wannier framework. We demonstrate the effectiveness of the method by comparing it with the experiments and the previous ab initio and empirical studies and show its potential to improve our understanding of spin dynamics and to facilitate the design of spintronic devices.

Submitted 1 January, 2024; originally announced January 2024.

arXiv:2312.11805 [pdf, other]

Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.

Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2311.14018 [pdf]

Fabrication and extreme micromechanics of additive metal microarchitectures

Authors: Sung-Gyu Kang, Barbara Bellon, Lalithkumar Bhaskar, Siyuan Zhang, Alexander Gotz, Janis Wirth, Benjamin Apeleo Zubiri, Szilvia Kalacska, Manish Jain, Amit Sharma, Wabe Koelmans, Giorgio Ercolano, Erdmann Spiecker, Johann Michler, Jakob Schwiedrzik, Gerhard Dehm, Rajaprakash Ramachandramoorthy

Abstract: The mechanical performance of metallic metamaterials with 3-dimensional solid frames is typically a combination of the geometrical effect ("architecture") and the characteristic size effects of the base material ("microstructure"). In this study, for the first time, the temperature- and rate-dependent mechanical response of copper microlattices has been investigated. The microlattices were fabricated via a localized electrodeposition in liquid (LEL) process which enables high-precision additive manufacturing of metal at the micro-scale. The metal microlattices possess a unique microstructure with micron sized grains that are rich with randomly oriented growth twins and near-ideal nodal connectivity. Importantly, copper microlattices exhibited unique temperature (-150 and 25 degree C) and strain rate (0.001~100 s-1) dependent deformation behavior during in situ micromechanical testing. Systematic compression tests of fully dense copper micropillars, equivalent in diameter and length to the struts of the microlattice at comparable extreme loading conditions, allow us to investigate the intrinsic deformation mechanism of copper. Combined with the post-mortem microstructural analysis, substantial shifts in deformation mechanisms depending on the temperature and strain rate were revealed. On the one hand, at room temperature (25 degree C), dislocation slip based plastic deformation occurs and leads to a localized deformation of the micropillars.
On the other hand, at cryogenic temperature (-150 degree C), mechanical twinning occurs and leads to relatively homogeneous deformation of the micropillars. Based on the intrinsic deformation mechanisms of copper, the temperature and strain rate dependent deformation behavior of microlattices could be explained.

Submitted 3 April, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

arXiv:2311.08382 [pdf, other]

Mott insulating negative thermal expansion perovskite TiF3

Authors: Donal Sheets, Kaitlin Lyszak, Menka Jain, Gayanath W. Fernando, Ilya Sochnikov, Jacob Franklin, R. Mattias Geilhufe, Jason N. Hancock

Abstract: We characterize perovskite TiF_3, a material which displays significant negative thermal expansion at elevated temperatures above its cubic-to-rhombohedral structural phase transition at 330 K. We find the optical response favors an insulating state in both structural phases, which we show can be produced in density functional theory calculations only through the introduction of an on-site Coulomb repulsion. Analysis of the magnetic susceptibility data gives a S=1/2 local moment per Ti+3 ion and an antiferromagnetic exchange coupling. Together, these results show that TiF_3 is a strongly correlated electron system, a fact which constrains possible mechanisms of strong negative thermal expansion in the Sc_1-xTi_xF3 system. We consider the relative strength of the Jahn-Teller and electric dipole interactions in driving the structural transition.

Submitted 14 November, 2023; originally announced November 2023.

Comments: 8 pages, 4 figures, in review Physical Review B

arXiv:2310.08906 [pdf, other]

doi 10.1021/acs.nanolett.3c04223

Controlling Umklapp scattering in bilayer graphene moiré superlattice

Authors: Mohit Kumar Jat, Shubhankar Mishra, Harsimran Kaur Mann, Robin Bajaj, Kenji Watanabe, Takashi Taniguchi, H. R. Krishnamurthy, Manish Jain, Aveek Bid

Abstract: In this Letter, we present experimental findings on electron-electron scattering in a two-dimensional moiré heterostructure with tunable Fermi wave vector, reciprocal lattice vector, and band gap. We achieve this in high-mobility aligned heterostructures of bilayer graphene (BLG) and hBN. Around half-filling, the primary contribution to the resistance of BLG/hBN aligned superlattices arises from electron-electron Umklapp (Uee) scattering, making the resistance of graphene/hBN moiré devices significantly larger than that of non-aligned devices (where Uee is forbidden). We quantify the strength of the Uee scattering and find that it follows a universal scaling with Fermi energy and has a non-monotonic dependence on the charge carrier density. The Uee scattering is strongly electric field tunable and affected by layer-polarization of BLG. It has a strong particle-hole asymmetry - the resistance when the chemical potential is in the conduction band is significantly lesser than when it is in the valence band, making the electron-doped regime more practical for potential applications.

Submitted 15 February, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: Comments and suggestion are welcome

Journal ref: Nano Letters (2024)

arXiv:2310.08774 [pdf, other]

PhyloGFN: Phylogenetic inference with generative flow networks

Authors: Mingyang Zhou, Zichao Yan, Elliot Layne, Nikolay Malkin, Dinghuai Zhang, Moksh Jain, Mathieu Blanchette, Yoshua Bengio

Abstract: Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities. Its long history and numerous applications notwithstanding, inference of phylogenetic trees from sequence data remains challenging: the high complexity of tree space poses a significant obstacle for the current combinatorial and probabilistic techniques. In this paper, we adopt the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference. Because GFlowNets are well-suited for sampling complex combinatorial structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies and evolutionary distances. We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets. PhyloGFN is competitive with prior works in marginal likelihood estimation and achieves a closer fit to the target distribution than state-of-the-art variational inference methods. Our code is available at https://github.com/zmy1116/phylogfn.

Submitted 24 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.04363 [pdf, other]

Amortizing intractable inference in large language models

Authors: Edward J. Hu, Moksh Jain, Eric Elmoznino, Younesse Kaddar, Guillaume Lajoie, Yoshua Bengio, Nikolay Malkin

Abstract: Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions. This limits tractable querying of this knowledge to start-to-end autoregressive sampling. However, many tasks of interest -- including sequence continuation, infilling, and other forms of constrained generation -- involve sampling from intractable posterior distributions. We address this limitation by using amortized Bayesian inference to sample from these intractable posteriors. Such amortization is algorithmically achieved by fine-tuning LLMs via diversity-seeking reinforcement learning algorithms: generative flow networks (GFlowNets). We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training and reward-maximizing policy optimization. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem and demonstrate that our approach enables data-efficient adaptation of LLMs to tasks that require multi-step rationalization and tool use.

Submitted 13 March, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: ICLR 2024; 23 pages; code: https://github.com/GFNOrg/gfn-lm-tuning

arXiv:2310.03419 [pdf, other]

Pre-Training and Fine-Tuning Generative Flow Networks

Authors: Ling Pan, Moksh Jain, Kanika Madan, Yoshua Bengio

Abstract: Generative Flow Networks (GFlowNets) are amortized samplers that learn stochastic policies to sequentially generate compositional objects from a given unnormalized reward distribution. They can generate diverse sets of high-reward objects, which is an important consideration in scientific discovery tasks. However, as they are typically trained from a given extrinsic reward function, it remains an important open challenge about how to leverage the power of pre-training and train GFlowNets in an unsupervised fashion for efficient adaptation to downstream tasks. Inspired by recent successes of unsupervised pre-training in various domains, we introduce a novel approach for reward-free pre-training of GFlowNets. By framing the training as a self-supervised problem, we propose an outcome-conditioned GFlowNet (OC-GFN) that learns to explore the candidate space. Specifically, OC-GFN learns to reach any targeted outcomes, akin to goal-conditioned policies in reinforcement learning. We show that the pre-trained OC-GFN model can allow for a direct extraction of a policy capable of sampling from any new reward functions in downstream tasks. Nonetheless, adapting OC-GFN on a downstream task-specific reward involves an intractable marginalization over possible outcomes. We propose a novel way to approximate this marginalization by learning an amortized predictor enabling efficient fine-tuning.
Extensive experimental results validate the efficacy of our approach, demonstrating the effectiveness of pre-training the OC-GFN, and its ability to swiftly adapt to downstream tasks and discover modes more efficiently. This work may serve as a foundation for further exploration of pre-training strategies in the context of GFlowNets.

Submitted 5 October, 2023; originally announced October 2023.

arXiv:2310.02771 [pdf, other]

doi 10.1088/2058-9565/ad4c91

Controlling the interactions in a cold atom quantum impurity system

Authors: Thomas Hewitt, Tom Bertheas, Manan Jain, Yusuke Nishida, Giovanni Barontini

Abstract: We implement an experimental architecture in which a single atom of K is trapped in an optical tweezer, and is immersed in a bath of Rb atoms at ultralow temperatures. In this regime, the motion of the single trapped atom is confined to the lowest quantum vibrational levels. This realizes an elementary and fully controllable quantum impurity system. For the trapping of the K atom, we use a species-selective dipole potential, that allows us to independently manipulate the quantum impurity and the bath. We concentrate on the characterization and control of the interactions between the two subsystems. To this end, we perform Feshbach spectroscopy, detecting several inter-dimensional confinement-induced Feshbach resonances for the KRb interspecies scattering length, that parametrizes the strength of the interactions. We compare our data to a theory for inter-dimensional scattering, finding good agreement. Notably, we also detect a series of p-wave resonances stemming from the underlying free-space s-wave interactions. We further determine how the resonances behave as the temperature of the bath and the dimensionality of the interactions change. Additionally, we are able to screen the quantum impurity from the bath by finely tuning the wavelength of the light that produces the optical tweezer, providing us with a new effective tool to control and minimize the interactions.
Our results open a range of new possibilities in quantum simulations of quantum impurity models, quantum information, and quantum thermodynamics, where the interactions between a quantized system and the bath is a powerful yet largely underutilized resource.

Submitted 28 May, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

Journal ref: Quantum Sci. Technol. 9 035039 (2024)

arXiv:2310.00058 [pdf, other]

doi 10.1103/PhysRevD.109.016002

Kinetic relaxation and nucleation of Bose stars in self-interacting wave dark matter

Authors: Mudit Jain, Wisha Wanichwecharungruang, Jonathan Thomas

Abstract: We revisit kinetic relaxation and soliton/Boson star nucleation in fuzzy scalar dark matter featuring short-ranged self-interactions $\mathcal{H}_{\rm int} = -λ|ψ|^4/2m^2$, alongside gravitational self-interactions. We map out the full curve of nucleation timescale for both repulsive ($λ< 0$) and attractive ($λ> 0$) short-ranged self-interaction strength, and in doing so reveal two new points. Firstly, besides the two usual terms, $\propto G^2$ and $\propto λ^2$, in the total relaxation rate $Γ_{\rm relax}$, there is an additional cross term $\propto Gλ$ arising due to interference between gravitational and short-ranged self-interaction scattering amplitudes. This yields a critical repulsive interaction strength $λ_{\rm cr} \simeq - 2πGm^2/v_{0}^2$, at which the relaxation rate is smallest and serves as the transition point between typical net attractive self-interaction ($λ\gtrsim λ_{\rm cr}$), and net repulsive self-interaction ($-λ\gtrsim -λ_{\rm cr}$). Secondly, while in the net attractive regime, nucleation time scale is similar to inverse relaxation time scale $τ_{\rm nuc} \sim Γ^{-1}_{\rm relax}$, in the net repulsive regime nucleation occurs at a delayed time $τ_{\rm nuc} \sim (λ/λ_{\rm cr})Γ^{-1}_{\rm relax}$. We confirm our analytical understanding by performing 3D field simulations with varying average mass density $\barρ$, box size $L$ and grid size $N$.

Submitted 29 September, 2023; originally announced October 2023.

Comments: 11 pages, 5 figures, 2 appendices

Journal ref: Phys. Rev. D 109, 016002, 2 January 2024

arXiv:2309.09495 [pdf, other]

PwR: Exploring the Role of Representations in Conversational Programming

Authors: Pradyumna YM, Vinod Ganesan, Dinesh Kumar Arumugam, Meghna Gupta, Nischith Shadagopan, Tanay Dixit, Sameer Segal, Pratyush Kumar, Mohit Jain, Sriram Rajamani

Abstract: Large Language Models (LLMs) have revolutionized programming and software engineering. AI programming assistants such as GitHub Copilot X enable conversational programming, narrowing the gap between human intent and code generation. However, prior literature has identified a key challenge--there is a gap between user's mental model of the system's understanding after a sequence of natural language utterances, and the AI system's actual understanding. To address this, we introduce Programming with Representations (PwR), an approach that uses representations to convey the system's understanding back to the user in natural language. We conducted an in-lab task-centered study with 14 users of varying programming proficiency and found that representations significantly improve understandability, and instilled a sense of agency among our participants. Expert programmers use them for verification, while intermediate programmers benefit from confirmation. Natural language-based development with LLMs, coupled with representations, promises to transform software development, making it more accessible and efficient.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 23 pages, 3 figures, 2 tables, under submission for ACM CHI 2024

ACM Class: H.5.2

arXiv:2309.06386 [pdf, other]

Lung Diseases Image Segmentation using Faster R-CNNs

Authors: Mihir Jain

Abstract: Lung diseases are a leading cause of child mortality in the developing world, with India accounting for approximately half of global pneumonia deaths (370,000) in 2016. Timely diagnosis is crucial for reducing mortality rates. This paper introduces a low-density neural network structure to mitigate topological challenges in deep networks. The network incorporates parameters into a feature pyramid, enhancing data extraction and minimizing information loss. Soft Non-Maximal Suppression optimizes regional proposals generated by the Region Proposal Network. The study evaluates the model on chest X-ray images, computing a confusion matrix to determine accuracy, precision, sensitivity, and specificity. We analyze loss functions, highlighting their trends during training. The regional proposal loss and classification loss assess model performance during training and classification phases. This paper analyzes lung disease detection and neural network structures.

Submitted 10 September, 2023; originally announced September 2023.
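
The evaluation step named in the abstract above (a confusion matrix used to determine accuracy, precision, sensitivity, and specificity) can be sketched as follows. This is a minimal illustration for the binary case, not the author's code; the function name, the dataclass, and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    sensitivity: float  # also called recall or true-positive rate
    specificity: float  # true-negative rate


def metrics_from_confusion(tp: int, fp: int, fn: int, tn: int) -> BinaryMetrics:
    """Derive the four metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. A ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    return BinaryMetrics(
        accuracy=ratio(tp + tn, total),
        precision=ratio(tp, tp + fp),
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
    )


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(metrics_from_confusion(tp=40, fp=10, fn=5, tn=45))
```

Sensitivity and specificity are reported separately because accuracy alone can look high on imbalanced data, which is common in medical imaging.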

arXiv:2309.06082 [pdf, other]

A Machine Learning Framework to Deconstruct the Primary Drivers for Electricity Market Price Events

Authors: Milan Jain, Xueqing Sun, Sohom Datta, Abhishek Somani

Abstract: Power grids are moving towards 100% renewable energy source bulk power grids, and the overall dynamics of power system operations and electricity markets are changing. The electricity markets are not only dispatching resources economically but also taking into account various controllable actions like renewable curtailment, transmission congestion mitigation, and energy storage optimization to ensure grid reliability. As a result, price formations in electricity markets have become quite complex. Traditional root cause analysis and statistical approaches are rendered inapplicable to analyze and infer the main drivers behind price formation in the modern grid and markets with variable renewable energy (VRE). In this paper, we propose a machine learning-based analysis framework to deconstruct the primary drivers for price spike events in modern electricity markets with high renewable energy. The outcomes can be utilized for various critical aspects of market design, renewable dispatch and curtailment, operations, and cyber-security applications. The framework can be applied to any ISO or market data; however, in this paper, it is applied to open-source publicly available datasets from California Independent System Operator (CAISO) and ISO New England (ISO-NE).

Submitted 12 September, 2023; originally announced September 2023.

Comments: Published in IEEE PES GM 2023

arXiv:2309.04235 [pdf, other]

Quasi-integrability and nonlinear resonances in cold atoms under modulation

Authors: Rahul Gupta, Manan Jain, Sudhir R. Jain

Abstract: Quantum dynamics of a collection of atoms subjected to phase modulation has been carefully revisited. We present an exact analysis of the evolution of a two-level system (represented by a spinor) under the action of a time-dependent matrix Hamiltonian. The dynamics is shown to evolve on two coupled potential energy surfaces, one of them binding and the other of scattering type. The dynamics is shown to be quasi-integrable with nonlinear resonances. The bounded dynamics with intermittent scattering at random moments presents a scenario reminiscent of Anderson and dynamical localization. We believe that a careful analytical investigation of a multi-component system which is classically non-integrable is relevant to many other fields, including quantum computation with multi-qubit systems.

Submitted 8 September, 2023; originally announced September 2023.

Comments: 18 pages, 4 figures

arXiv:2309.01370 [pdf, other]

ReOnto: A Neuro-Symbolic Approach for Biomedical Relation Extraction

Authors: Monika Jain, Kuldeep Singh, Raghava Mutharaju

Abstract: Relation Extraction (RE) is the task of extracting semantic relationships between entities in a sentence and aligning them to relations defined in a vocabulary, which is generally in the form of a Knowledge Graph (KG) or an ontology. Various approaches have been proposed so far to address this task. However, applying these techniques to biomedical text often yields unsatisfactory results because it is hard to infer relations directly from sentences due to the nature of the biomedical relations. To address these issues, we present a novel technique called ReOnto that makes use of neuro-symbolic knowledge for the RE task. ReOnto employs a graph neural network to acquire the sentence representation and leverages publicly accessible ontologies as prior knowledge to identify the sentential relation between two entities. The approach involves extracting the relation path between the two entities from the ontology. We evaluate the effect of using symbolic knowledge from ontologies with graph neural networks. Experimental results on two public biomedical datasets, BioRel and ADE, show that our method outperforms all the baselines (by approximately 3%).

Submitted 4 September, 2023; originally announced September 2023.

Comments: Accepted in ECML 2023

arXiv:2308.09726 [pdf, other]

Equitable Restless Multi-Armed Bandits: A General Framework Inspired By Digital Health

Authors: Jackson A. Killian, Manish Jain, Yugang Jia, Jonathan Amar, Erich Huang, Milind Tambe

Abstract: Restless multi-armed bandits (RMABs) are a popular framework for algorithmic decision making in sequential settings with limited resources. RMABs are increasingly being used for sensitive decisions such as in public health, treatment scheduling, anti-poaching, and -- the motivation for this work -- digital health. For such high stakes settings, decisions must both improve outcomes and prevent disparities between groups (e.g., ensure health equity). We study equitable objectives for RMABs (ERMABs) for the first time. We consider two equity-aligned objectives from the fairness literature, minimax reward and max Nash welfare. We develop efficient algorithms for solving each -- a water filling algorithm for the former, and a greedy algorithm with theoretically motivated nuance to balance disparate group sizes for the latter. Finally, we demonstrate across three simulation domains, including a new digital health model, that our approaches can be multiple times more equitable than the current state of the art without drastic sacrifices to utility. Our findings underscore our work's urgency as RMABs permeate into systems that impact human and wildlife outcomes. Code is available at https://github.com/google-research/socialgood/tree/equitable-rmab

Submitted 17 August, 2023; originally announced August 2023.

Comments: 16 pages, 8 figures, 2 tables

arXiv:2308.07277 [pdf, other]

Quantum MASALA: Quantum MAterialS Ab initio eLectronic-structure pAckage

Authors: Shri Hari Soundararaj, Agrim Sharma, Manish Jain

Abstract: We present QuantumMASALA, a compact package that implements different electronic structure methods in Python. Within just 8000 lines of pure Python code, we have implemented Density Functional Theory (DFT), Time dependent Density Functional Theory (TD-DFT) and the GW Method. The program can run across multiple process cores and in Graphical Processing Units (GPU) with the help of easily-accessible Python libraries. With QuantumESPRESSO and BerkeleyGW I/O interfaces implemented, it can also be used as a substitute for small scale calculations, making it a perfect learning tool for ab initio methods. The package is aimed to provide a framework with its modular and simple code design to rapidly build and test new methods for first-principles calculation.

Submitted 14 August, 2023; originally announced August 2023.

Comments: 42 pages, 5 figures

arXiv:2306.17693 [pdf, other]

Thompson sampling for improved exploration in GFlowNets

Authors: Jarrid Rector-Brooks, Kanika Madan, Moksh Jain, Maksym Korablyov, Cheng-Hao Liu, Sarath Chandar, Nikolay Malkin, Yoshua Bengio

Abstract: Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy. Unlike other algorithms for hierarchical sampling that optimize a variational bound, GFlowNet algorithms can stably run off-policy, which can be advantageous for discovering modes of the target distribution. Despite this flexibility in the choice of behaviour policy, the optimal way of efficiently selecting trajectories for training has not yet been systematically explored. In this paper, we view the choice of trajectories for training as an active learning problem and approach it using Bayesian techniques inspired by methods for multi-armed bandits. The proposed algorithm, Thompson sampling GFlowNets (TS-GFN), maintains an approximate posterior distribution over policies and samples trajectories from this posterior for training. We show in two domains that TS-GFN yields improved exploration and thus faster convergence to the target distribution than the off-policy exploration strategies used in past work.

Submitted 30 June, 2023; originally announced June 2023.

Comments: Structured Probabilistic Inference and Generative Modeling (SPIGM) workshop @ ICML 2023
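
The abstract above says TS-GFN is inspired by Bayesian methods for multi-armed bandits. As background only, here is a minimal sketch of classic Thompson sampling on a Bernoulli bandit with Beta posteriors. It is not the TS-GFN algorithm, which keeps an approximate posterior over policies rather than over arm means; the function name, arm probabilities, and seed are hypothetical.

```python
import random


def thompson_bernoulli(true_probs, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pull counts per arm.

    true_probs: success probability of each arm (unknown to the agent).
    Each arm keeps a Beta(alpha, beta) posterior, starting from Beta(1, 1).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alpha = [1] * n_arms  # 1 + observed successes
    beta = [1] * n_arms   # 1 + observed failures
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible mean reward per arm from its posterior.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(thompson_bernoulli([0.2, 0.5, 0.8], rounds=2000))
```

Sampling from the posterior, rather than using its mean, is what drives exploration: arms with few observations have wide posteriors and are occasionally sampled high enough to be tried.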

arXiv:2306.15058 [pdf, other]

BatchGFN: Generative Flow Networks for Batch Active Learning

Authors: Shreshth A. Malik, Salem Lahlou, Andrew Jesson, Moksh Jain, Nikolay Malkin, Tristan Deleu, Yoshua Bengio, Yarin Gal

Abstract: We introduce BatchGFN -- a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward. With an appropriate reward function to quantify the utility of acquiring a batch, such as the joint mutual information between the batch and the model parameters, BatchGFN is able to construct highly informative batches for active learning in a principled way. We show our approach enables sampling near-optimal utility batches at inference time with a single forward pass per point in the batch in toy regression problems. This alleviates the computational complexity of batch-aware algorithms and removes the need for greedy approximations to find maximizers for the batch reward. We also present early results for amortizing training across acquisition steps, which will enable scaling to real-world tasks.

Submitted 26 June, 2023; originally announced June 2023.

Comments: Accepted at the Structured Probabilistic Inference & Generative Modeling workshop, ICML 2023

arXiv:2306.14939 [pdf, other]

The Art of Embedding Fusion: Optimizing Hate Speech Detection

Authors: Mohammad Aflah Khan, Neemesh Yadav, Mohit Jain, Sanyam Goyal

Abstract: Hate speech detection is a challenging natural language processing task that requires capturing linguistic and contextual nuances. Pre-trained language models (PLMs) offer rich semantic representations of text that can improve this task. However, there is still limited knowledge about ways to effectively combine representations across PLMs and leverage their complementary strengths. In this work, we shed light on various combination techniques for several PLMs and comprehensively analyze their effectiveness. Our findings show that combining embeddings leads to slight improvements but at a high computational cost, and the choice of combination has marginal effect on the final outcome. We also make our codebase public at https://github.com/aflah02/The-Art-of-Embedding-Fusion-Optimizing-Hate-Speech-Detection .

Submitted 8 October, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: Published as a Tiny Paper at ICLR 2023, 12 Pages

arXiv:2306.14464 [pdf, other]

doi 10.1038/s41467-024-47956-4

Quantum fluctuations lead to glassy electron dynamics in the good metal regime of electron doped KTaO3

Authors: Shashank Kumar Ojha, Sankalpa Hazra, Surajit Bera, Sanat Kumar Gogoi, Prithwijit Mandal, Jyotirmay Maity, A. Gloskovskii, C. Schlueter, Smarajit Karmakar, Manish Jain, Sumilan Banerjee, Venkatraman Gopalan, Srimanta Middey

Abstract: One of the central challenges in condensed matter physics is to comprehend systems that have strong disorder and strong interactions. In the strongly localized regime, their subtle competition leads to glassy electron dynamics which ceases to exist well before the insulator-to-metal transition is approached as a function of doping. Here, we report on the discovery of glassy electron dynamics deep inside the good metal regime of an electron-doped quantum paraelectric system: KTaO$_3$. We reveal that upon excitation of electrons from defect states to the conduction band, the excess injected carriers in the conduction band relax in a stretched exponential manner with a large relaxation time, and the system evinces simple aging phenomena - a telltale sign of glassy dynamics. Most significantly, we observe a critical slowing down of carrier dynamics below 35 K, concomitant with the onset of quantum paraelectricity in the undoped KTaO$_3$. Our combined investigation using second harmonic generation technique, density functional theory and phenomenological modeling demonstrates quantum fluctuation-stabilized soft polar modes as the impetus for the glassy behavior. This study addresses one of the most fundamental questions regarding the potential promotion of glassiness by quantum fluctuations and opens a route for exploring glassy dynamics of electrons in a well-delocalized regime.

Submitted 5 June, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: 51 pages and 18 figures including supplemental

Journal ref: Nature Communications 15, 3830 (2024)

arXiv:2306.11715 [pdf, other]

Multi-Fidelity Active Learning with GFlowNets

Authors: Alex Hernandez-Garcia, Nikita Saxena, Moksh Jain, Cheng-Hao Liu, Yoshua Bengio

Abstract: In the last decades, the capacity to generate large amounts of data in science and engineering applications has been growing steadily. Meanwhile, the progress in machine learning has turned it into a suitable tool to process and utilise the available data. Nonetheless, many relevant scientific and engineering problems present challenges where current machine learning methods cannot yet efficiently leverage the available data and resources. For example, in scientific discovery, we are often faced with the problem of exploring very large, high-dimensional spaces, where querying a high fidelity, black-box objective function is very expensive. Progress in machine learning methods that can efficiently tackle such problems would help accelerate currently crucial areas such as drug and materials discovery. In this paper, we propose the use of GFlowNets for multi-fidelity active learning, where multiple approximations of the black-box function are available at lower fidelity and cost. GFlowNets are recently proposed methods for amortised probabilistic inference that have proven efficient for exploring large, high-dimensional spaces and can hence be practical in the multi-fidelity setting too. Here, we describe our algorithm for multi-fidelity active learning with GFlowNets and evaluate its performance in both well-studied synthetic tasks and practically relevant applications of molecular discovery. Our results show that multi-fidelity active learning with GFlowNets can efficiently leverage the availability of multiple oracles with different costs and fidelities to accelerate scientific discovery and engineering design.

Submitted 20 June, 2023; originally announced June 2023.

Comments: Code: https://github.com/nikita-0209/mf-al-gfn

arXiv:2305.16082 [pdf]

doi 10.1103/PhysRevB.107.214433

Observation of c-axis Magnetization at Low Temperatures in Weak Ferromagnet FeBO$_3$ Reveals a Spin-Reorientation Transition

Authors: Jacob Franklin, Jacob Pfund, Joshua Bedard, Weiguo Zhang, P. Shiv Halasyamani, Menka Jain, Ilya Sochnikov

Abstract: The weak ferromagnet FeBO$_3$ is well known for being a unique system for modelling and testing magnetic dynamics primarily due to relatively simple and localized magnetic structure and its interesting spin wave dynamics. At room temperature, it has slightly canted iron moments lying in the a-b plane that result in a strong antiferromagnetic moment and a weak ferromagnetic moment, which results in pronounced ferromagnetic and antiferromagnetic spin modes. However, some previous studies have shown unusual low-temperature behavior that suggests a phase transition. By performing low-temperature magnetization measurements, both in bulk and on the mesoscale, we have observed a low temperature magnetic texture in this material in which a large c-axis magnetization occurs. Magnetic fields along the c-axis as high as 1300 Oe were observed close to the sample surface. This presents evidence for the onset of a Morin transition or another type of spin-reorientation phase transition wherein the Fe3+ moments would acquire a c-axis component to their canting below a critical temperature. The observation of this c-axis magnetization suggests that there is a different ground state in this material than has been previously expected and could be due to as yet unexplored intricacies of the Dzyaloshinskii-Moriya interaction.

Submitted 25 May, 2023; originally announced May 2023.

arXiv:2305.09600 [pdf, other]

Deep Reinforcement Learning to Maximize Arterial Usage during Extreme Congestion

Authors: Ashutosh Dutta, Milan Jain, Arif Khan, Arun Sathanur

Abstract: Collisions, crashes, and other incidents on road networks, if left unmitigated, can potentially cause cascading failures that can affect large parts of the system. Timely handling of such extreme congestion scenarios is imperative to reduce emissions, enhance productivity, and improve the quality of urban living. In this work, we propose a Deep Reinforcement Learning (DRL) approach to reduce traffic congestion on multi-lane freeways during extreme congestion. The agent is trained to learn adaptive detouring strategies for congested freeway traffic such that the freeway lanes along with the local arterial network in proximity are utilized optimally, with rewards being congestion reduction and traffic speed improvement. The experimental setup is a 2.6-mile-long 4-lane freeway stretch in Shoreline, Washington, USA with two exits and associated arterial roads simulated on a microscopic and continuous multi-modal traffic simulator SUMO (Simulation of Urban MObility) while using parameterized traffic profiles generated using real-world traffic data. Our analysis indicates that DRL-based controllers can improve average traffic speed by 21% when compared to no-action during steep congestion. The study further discusses the trade-offs involved in the choice of reward functions, the impact of human compliance on agent performance, and the feasibility of knowledge transfer from one agent to another to address data sparsity and scaling issues.

Submitted 16 May, 2023; originally announced May 2023.

arXiv:2305.07552 [pdf, other]

Dish detection in food platters: A framework for automated diet logging and nutrition management

Authors: Mansi Goel, Shashank Dargar, Shounak Ghatak, Nidhi Verma, Pratik Chauhan, Anushka Gupta, Nikhila Vishnumolakala, Hareesh Amuru, Ekta Gambhir, Ronak Chhajed, Meenal Jain, Astha Jain, Samiksha Garg, Nitesh Narwade, Nikhilesh Verhwani, Abhuday Tiwari, Kirti Vashishtha, Ganesh Bagler

Abstract: Diet is central to the epidemic of lifestyle disorders. Accurate and effortless diet logging is one of the significant bottlenecks for effective diet management and calorie restriction. Dish detection from food platters is a challenging problem due to a visually complex food layout. We present an end-to-end computational framework for diet management, from data compilation, annotation, and state-of-the-art model identification to its mobile app implementation. As a case study, we implement the framework in the context of Indian food platters known for their complex presentation that poses a challenge for the automated detection of dishes. Starting with the 61 most popular Indian dishes, we identify the state-of-the-art model through a comparative analysis of deep-learning-based object detection architectures. Rooted in a meticulous compilation of 68,005 platter images with 134,814 manual dish annotations, we first compare ten architectures for multi-label classification to identify ResNet152 (mAP=84.51%) as the best model. YOLOv8x (mAP=87.70%) emerged as the best model architecture for dish detection among the eight deep-learning models implemented after a thorough performance evaluation. By comparing with the state-of-the-art model for the IndianFood10 dataset, we demonstrate the superior object detection performance of YOLOv8x for this subset and establish ResNet152 as the best architecture for multi-label classification. The models thus trained on richly annotated data can be extended to include dishes from across global cuisines. The proposed framework is demonstrated through a proof-of-concept mobile application with diverse applications for diet logging, food recommendation systems, nutritional interventions, and mitigation of lifestyle disorders.

Submitted 12 May, 2023; originally announced May 2023.

Comments: 11 pages, 5 figures, 5 tables. Submitted to the 8th International Conference on Computer Vision & Image Processing (CVIP-2023)

ACM Class: I.4.9; I.5.4; J.3

arXiv:2305.01675 [pdf, other]

doi 10.1103/PhysRevE.108.055305

i-SPin 2: An integrator for general spin-s Gross-Pitaevskii systems

Authors: Mudit Jain, Mustafa A. Amin, Han Pu

Abstract: We provide an algorithm for evolving general spin-$s$ Gross-Pitaevskii / non-linear Schrödinger systems carrying a variety of interactions, where the $2s+1$ components of the `spinor' field represent the different spin-multiplicity states. We consider many nonrelativistic interactions up to quartic order in the Schrödinger field (both short and long-range, and spin-dependent and spin-independent interactions), including explicit spin-orbit couplings. The algorithm allows for spatially varying external and/or self-generated vector potentials that couple to the spin density of the field. Our work can be used for scenarios ranging from laboratory systems such as spinor Bose-Einstein condensates (BECs), to cosmological/astrophysical systems such as self-interacting bosonic dark matter. As examples, we provide results for two different setups of spin-$1$ BECs that employ a varying magnetic field and spin-orbit coupling, respectively, and also collisions of spin-$1$ solitons in dark matter. Our symplectic algorithm is second-order accurate in time, and is extensible to the known higher-order accurate methods.

Submitted 2 May, 2023; originally announced May 2023.

Comments: 13 pages, 3 figures, 2 appendices

Journal ref: Phys. Rev. E 108, 055305, 15 November 2023

arXiv:2304.14916 [pdf, other]

"Can't Take the Pressure?": Examining the Challenges of Blood Pressure Estimation via Pulse Wave Analysis

Authors: Suril Mehta, Nipun Kwatra, Mohit Jain, Daniel McDuff

Abstract: The use of observed wearable sensor data (e.g., photoplethysmograms [PPG]) to infer health measures (e.g., glucose level or blood pressure) is a very active area of research. Such technology can have a significant impact on health screening, chronic disease management and remote monitoring. A common approach is to collect sensor data and corresponding labels from a clinical grade device (e.g., blood pressure cuff), and train deep learning models to map one to the other. Although well intentioned, this approach often ignores a principled analysis of whether the input sensor data has enough information to predict the desired metric. We analyze the task of predicting blood pressure from PPG pulse wave analysis. Our review of the prior work reveals that many papers fall prey to data leakage, and to unrealistic constraints on the task and the preprocessing steps. We propose a set of tools to help determine if the input signal in question (e.g., PPG) is indeed a good predictor of the desired label (e.g., blood pressure). Using our proposed tools, we have found that blood pressure prediction using PPG has a high multi-valued mapping factor of 33.2% and low mutual information of 9.8%. In comparison, heart rate prediction using PPG, a well-established task, has a very low multi-valued mapping factor of 0.75% and high mutual information of 87.7%. We argue that these results provide a more realistic representation of the current progress towards the goal of wearable blood pressure measurement via PPG pulse wave analysis.

Submitted 23 April, 2023; originally announced April 2023.

arXiv:2304.01985 [pdf, other]

doi 10.1103/PhysRevD.108.043535

Kinetic relaxation and Bose-star formation in multicomponent dark matter- I

Authors: Mudit Jain, Mustafa A. Amin, Jonathan Thomas, Wisha Wanichwecharungruang

Abstract: Using wave kinetics, we estimate the emergence time-scale of gravitating Bose-Einstein condensates/Bose stars in the kinetic regime for a general multicomponent Schrödinger-Poisson (SP) system. We identify some effects of the diffusion and friction pieces in the wave-kinetic Boltzmann equation (at leading order in perturbation theory) and provide estimates for the kinetic nucleation rate of condensates. We test our analysis using full $3+1$ dimensional simulations of a multicomponent SP system. With an eye towards applications to multicomponent dark matter, we investigate two general cases in detail. First is a massive spin-$s$ field with $N=2s+1$ components (scalar $s=0$, vector $s=1$ and tensor $s=2$). We find that for a democratic population of different components, the condensation time-scale is $τ_{(s)}\approx τ_0\times N$, where $τ_0$ is the condensation time scale for the scalar case. Second is the case of two scalars with different boson masses. In this case, we map out how the condensation time depends on the ratios of their average mass densities and boson masses, revealing competition and assistance between components, and a guide towards which component condenses first. For instance, with $m_1 < m_2$ and not too disparate mass densities, we verify that the time scale of condensation of the first species quickly becomes independent of $m_2/m_1$, whereas for equal average number densities, the emergence time scale decreases with increasing $m_2/m_1$.

Submitted 19 May, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

Comments: 8 pages + 3 appendices, 5 figures. Videos from simulations are available at https://mustafa-amin.com/home/multicomponent-dark-matter. In comparison with the previous version, we have (1) added more references; (2) provided clarifications in Appendix A and accounted for an additional factor of 2 in the wave kinetic equation

Journal ref: Phys. Rev. D 108, 043535, 28 August 2023

arXiv:2304.01720 [pdf, other]

Higher-order Bragg gaps in the electronic band structure of bilayer graphene renormalized by recursive supermoiré potential

Authors: Mohit Kumar Jat, Priya Tiwari, Robin Bajaj, Ishita Shitut, Shinjan Mandal, Kenji Watanabe, Takashi Taniguchi, H. R. Krishnamurthy, Manish Jain, Aveek Bid

Abstract: This letter presents our findings on the recursive band gap engineering of chiral fermions in bilayer graphene doubly aligned with hBN. By utilizing two interfering moiré potentials, we generate a supermoiré pattern which renormalizes the electronic bands of the pristine bilayer graphene, resulting in higher-order fractal gaps even at very low energies. These Bragg gaps can be mapped using a unique linear combination of periodic areas within the system. To validate our findings, we used electronic transport measurements to identify the position of these gaps as functions of the carrier density and establish their agreement with the predicted carrier densities and corresponding quantum numbers obtained using the continuum model. Our work provides direct experimental evidence of the quantization of the area of quasi-Brillouin zones in supermoiré systems. It fills essential gaps in understanding the band structure engineering of Dirac fermions by a recursive doubly periodic superlattice potential.

Submitted 4 April, 2023; originally announced April 2023.

Comments: 29 pages (including Supplementary Materials)

arXiv:2304.00132 [pdf, other]

doi 10.1145/3544548.3581542

Understanding Journalists' Workflows in News Curation

Authors: Shubham Atreja, Shruthi Srinath, Mohit Jain, Joyojeet Pal

Abstract: With the increasing dominance of the internet as a source of news consumption, there has been a rise in the production and popularity of email newsletters compiled by individual journalists. However, there is little research on the processes of aggregation, and how these differ between expert journalists and trained machines. In this paper, we interviewed journalists who curate newsletters from around the world. Through an in-depth understanding of journalists' workflows, our findings lay out the role of their prior experience in the value they bring into the curation process, their use of algorithms in finding stories for their newsletter, and their internalization of their readers' interests and the context they are curating for. While identifying the role of human expertise, we highlight the importance of hybrid curation and provide design insights on how technology can support the work of these experts.

Submitted 31 March, 2023; originally announced April 2023.

Comments: accepted at CHI'23

Showing 1–50 of 228 results for author: Jain, M