Search | arXiv e-print repository

"Best" iterative coupled-cluster triples model: More evidence for 3CC

Authors: Nakul Teke, Ajay Melekamburath, Bimal Gaudel, Edward F. Valeev

Abstract: To follow up on the unexpectedly-good performance of several coupled-cluster models with approximate inclusion of 3-body clusters [J. Chem. Phys. 151, 064102 (2019)] we performed a more complete assessment of the 3CC method [J. Chem. Phys. 125, 204105 (2006)] for accurate computational thermochemistry in the standard HEAT framework. A new spin-integrated implementation of the 3CC method applicable to closed- and open-shell systems utilizes a new automated toolchain for derivation, optimization, and evaluation of operator algebra in many-body electronic structure. We found that with a double-zeta basis set the 3CC correlation energies and their atomization energy contributions are almost always more accurate (with respect to the CCSDTQ reference) than the CCSDT model as well as the standard CCSD(T) model. The mean absolute errors in cc-pVDZ {3CC, CCSDT, and CCSD(T)} electronic (per valence electron) and atomization energies relative to the CCSDTQ reference for the HEAT dataset [J. Chem. Phys. 121, 11599 (2004)] were {24, 70, 122} $μE_h/e$ and {0.46, 2.00, 2.58} kJ/mol, respectively. The mean absolute errors in the complete-basis-set limit {3CC, CCSDT, and CCSD(T)} atomization energies relative to the HEAT model reference were {0.52, 2.00, and 1.07} kJ/mol. The significant and systematic reduction of the error by the 3CC method and its lower cost than CCSDT suggest it as a viable candidate for post-CCSD(T) thermochemistry applications, as well as the preferred alternative to CCSDT in general.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 30 pages, 3 tables

arXiv:2407.08296 [pdf, other]

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Authors: Zhenyu Zhang, Ajay Jaiswal, Lu Yin, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang

Abstract: Training Large Language Models (LLMs) is memory-intensive due to the large number of parameters and associated optimization states. GaLore, a recent method, reduces memory usage by projecting weight gradients into a low-rank subspace without compromising performance. However, GaLore relies on time-consuming Singular Value Decomposition (SVD) operations to identify the subspace, and the frequent subspace updates lead to significant training time overhead. Moreover, GaLore offers minimal improvements in accuracy and efficiency compared to LoRA in more accessible fine-tuning scenarios. To address these limitations, we introduce Q-GaLore, a novel approach that substantially reduces memory usage by combining quantization and low-rank projection, surpassing the benefits of GaLore. Our method is based on two key observations: (i) the gradient subspace exhibits diverse properties, with some layers converging early in training while others are subject to frequent changes; (ii) the projection matrices are highly resilient to low-bit quantization. Leveraging these insights, Q-GaLore adaptively updates the gradient subspace based on its convergence statistics, achieving comparable performance while significantly reducing the number of SVD operations. We maintain the projection matrices in INT4 format and weights in INT8 format, incorporating stochastic rounding to capture accumulated gradient information. This approach enables a high-precision training trajectory using only low-precision weights.
We demonstrate that Q-GaLore achieves highly competitive performance with exceptional memory efficiency. At pre-training, Q-GaLore facilitates training a LLaMA-7B model from scratch on a single NVIDIA RTX 4060 Ti with only 16 GB memory. At fine-tuning, it reduces memory consumption by up to 50% compared to LoRA and GaLore, while consistently outperforming QLoRA at the same memory cost.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.07946 [pdf, other]

The Type I Superluminous Supernova Catalog I: Light Curve Properties, Models, and Catalog Description

Authors: Sebastian Gomez, Matt Nicholl, Edo Berger, Peter K. Blanchard, V. Ashley Villar, Sofia Rest, Griffin Hosseinzadeh, Aysha Aamer, Yukta Ajay, Wasundara Athukoralalage, David C. Coulter, Tarraneh Eftekhari, Achille Fiore, Noah Franz, Ori Fox, Alexander Gagliano, Daichi Hiramatsu, D. Andrew Howell, Brian Hsu, Mitchell Karmen, Matthew R. Siebert, Réka Könyves-Tóth, Harsh Kumar, Curtis McCully, Craig Pellegrino, et al. (3 additional authors not shown)

Abstract: We present the most comprehensive catalog to date of Type I Superluminous Supernovae (SLSNe), a class of stripped envelope supernovae (SNe) characterized by exceptionally high luminosities. We have compiled a sample of 262 SLSNe reported through 2022 December 31. We verified the spectroscopic classification of each SLSN and collated an exhaustive data set of UV, optical and IR photometry from both publicly available data and our own FLEET observational follow-up program, totaling over 30,000 photometric detections. Using these data we derive observational parameters such as the peak absolute magnitudes, rise and decline timescales, as well as bolometric luminosities, temperature and photospheric radius evolution for all SLSNe. Additionally, we model all light curves using a hybrid model that includes contributions from both a magnetar central engine and the radioactive decay of $^{56}$Ni. We explore correlations among various physical and observational parameters, and recover the previously found relation between ejecta mass and magnetar spin, as well as the overall progenitor pre-explosion mass distribution with a peak at $\approx 6.5$ M$_\odot$. We find no significant redshift dependence for any parameter, and no evidence for distinct sub-types of SLSNe. We find that $< 3$\% of SLSNe are best fit with a significant contribution from radioactive decay $\gtrsim 50$\%, representing a set of relatively dim and slowly declining SNe.
We provide several analytical tools designed to simulate typical SLSN light curves across a broad range of wavelengths and phases, enabling accurate K-corrections, bolometric scaling calculations, and inclusion of SLSNe in survey simulations or future comparison works. The complete catalog, including all of the photometry, models, and derived parameters, is made available as an open-source resource on GitHub.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 59 pages, 22 Figures, Submitted to MNRAS

arXiv:2407.07480 [pdf, other]

The discovery of a nearby 421~s transient with CHIME/FRB/Pulsar

Authors: Fengqiu Adam Dong, Tracy Clarke, Alice P. Curtin, Ajay Kumar, Ingrid Stairs, Shami Chatterjee, Amanda M. Cook, Emmanuel Fonseca, B. M. Gaensler, Jason W. T. Hessels, Victoria M. Kaspi, Mattias Lazda, Kiyoshi W. Masui, James W. McKee, Bradley W. Meyers, Aaron B. Pearlman, Scott M. Ransom, Paul Scholz, Kaitlyn Shin, Kendrick M. Smith, Chia Min Tan

Abstract: Neutron stars and white dwarfs are both dense remnants of post-main-sequence stars. Pulsars, magnetars and strongly magnetised white dwarfs have all been observed to exhibit coherent, pulsed radio emission in relation to their rotational period. Recently, a new type of radio long period transient (LPT) has been discovered. The bright radio emission of LPTs resembles that of radio pulsars and magnetars. However, they pulse on timescales (minutes) much longer than previously seen. While minute timescales are common rotation periods for white dwarfs, LPTs are much brighter than the known pulsating white dwarfs, and dipolar radiation from isolated (as opposed to binary) magnetic white dwarfs has yet to be observed. Here, we report the discovery of a new $\sim$421~s LPT, CHIME J0630+25, using the CHIME/FRB and CHIME/Pulsar instruments. We used standard pulsar timing techniques and obtained a phase-coherent timing solution which yielded limits on the inferred magnetic field and characteristic age. CHIME J0630+25 is remarkably nearby ($170 \pm 80$~pc), making it the closest LPT discovered to date.

Submitted 10 July, 2024; originally announced July 2024.

Comments: Submitted

arXiv:2407.04749 [pdf, other]

Towards imaging-spectro-polarimetry of solar flares in the X-rays

Authors: Sergio Fabiani, John Rankin, Stefano Basso, Enrico Costa, Ettore Del Monte, Klaus Desch, Alessandro Di Marco, Markus Gruber, Jochen Kaminski, Dawoon E. Kim, Saba Imtiaz, Carlo Lefevre, Pasqualino Loffredo, Hemant Manikantan, Alfredo Morbidini, Fabio Muleri, Giovanni Pareschi, Vladilavs Plesanovs, Ajay Ratheesh, Alda Rubini, Paolo Soffitta, Daniele Spiga

Abstract: X-ray polarimetry of solar flares is still not a well-established field of observation of our star. Past polarimeters were not able to measure the polarization in X-rays from solar flares with high significance. Moreover, they had no imaging capabilities and measured the polarization only by integrating over the whole image of the source. We propose a mission concept based on a gas photoelectric polarimeter, coupled with multilayer lobster-eye optics, to perform imaging-spectro-polarimetry of solar flares while monitoring the entire solar disc.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Proceeding of SPIE Conference "Astronomical Telescopes+ Instrumentation", Yokohama (Japan), 16-21 June 2024

Report number: Paper No. 13093-311

arXiv:2407.04182 [pdf, other]

Towards Generalized On-Chip Communication for Programmable Accelerators in Heterogeneous Architectures

Authors: Joseph Zuckerman, John-David Wellman, Ajay Vanamali, Manish Shankar, Gabriele Tombesi, Karthik Swaminathan, Kevin Lee, Mohit Kapur, Robert Philhower, Pradip Bose, Luca P. Carloni

Abstract: We present several enhancements to the open-source ESP platform to support flexible and efficient on-chip communication for programmable accelerators in heterogeneous SoCs. These enhancements include 1) a flexible point-to-point communication mechanism between accelerators, 2) a multicast NoC that supports data forwarding to multiple accelerators simultaneously, 3) accelerator synchronization leveraging the SoC's coherence protocol, 4) an accelerator interface that offers fine-grained control over the communication mode used, and 5) an example ISA extension to support our enhancements. Our solution adds negligible area to the SoC architecture and requires minimal changes to the accelerators themselves. We have validated most of these features in complex FPGA prototypes and plan to include them in the open-source release of ESP in the coming months.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Appeared in the Sixth International Workshop on Domain Specific System Architecture (DOSSA-6)

arXiv:2407.02425 [pdf, other]

Reinforcement Learning and Machine ethics: a systematic review

Authors: Ajay Vishwanath, Louise A. Dennis, Marija Slavkovik

Abstract: Machine ethics is the field that studies how ethical behaviour can be accomplished by autonomous systems. While there exist some systematic reviews aiming to consolidate the state of the art in machine ethics prior to 2020, these tend not to include work that uses reinforcement learning agents as entities whose ethical behaviour is to be achieved. The reason for this is that only in recent years have we witnessed an increase in machine ethics studies within reinforcement learning. We present here a systematic review of reinforcement learning for machine ethics and machine ethics within reinforcement learning. Additionally, we highlight trends in terms of ethics specifications, components and frameworks of reinforcement learning, and environments used to result in ethical behaviour. Our systematic review aims to consolidate the work in machine ethics and reinforcement learning, thus filling the gap in the state-of-the-art machine ethics landscape.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.02407 [pdf, other]

Emergence of spin-phonon coupling in Gd-doped Y$_2$CoMnO$_6$ double perovskite oxide: a combined experimental and ab-initio study

Authors: Anasua Khan, Debdatta Banerjee, Divya Rawat, T. K Nath, Ajay Soni, Swastika Chatterjee, A. Taraphder

Abstract: One of the fundamental interactions that is found in many functional materials is the spin-phonon coupling (SPC), which is at the heart of many novel functionalities. The simultaneous presence of multi-magnetic phases makes SPC even more intriguing. We have used Raman spectroscopy as well as first-principles methods to investigate the possibility of the appearance of SPC in Gd-doped Y$_2$CoMnO$_6$ (YGCMO) double perovskite oxide and the influence of anti-site disorder on the same. YGCMO is found to exhibit anti-site disorder leading to both ferromagnetic (between Co and Mn) and anti-ferromagnetic interactions (Co-Co, Mn-Mn, Gd-Co/Mn). An analysis of the temperature-dependent phonon frequency for the stretching modes of YGCMO, obtained using Raman spectroscopy, indicates that SPC is possibly emerging from the simultaneous presence of ferromagnetic and antiferromagnetic interactions. The nature of the phonon linewidth and the insulating state of the material eliminate the role of magnetostriction in the observed anomaly. The spin-phonon coupling strength comes out to be 0.29 cm$^{-1}$. Our experimental findings are corroborated by first-principles DFT calculations which indicate the presence of SPC in ordered YGCMO getting enhanced in the presence of anti-site disorder. This indicates a strong influence of B-site (Co/Mn) ordering on SPC in the bulk double perovskite systems. An analysis of the cause behind the enhanced SPC in the presence of anti-site disorder is also presented.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 23 pages, 15 figures

arXiv:2407.02352 [pdf, other]

Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification

Authors: Pritish Sahu, Karan Sikka, Ajay Divakaran

Abstract: Large Visual Language Models (LVLMs) struggle with hallucinations in visual instruction following task(s), limiting their trustworthiness and real-world applicability. We propose Pelican -- a novel framework designed to detect and mitigate hallucinations through claim verification. Pelican first decomposes the visual claim into a chain of sub-claims based on first-order predicates. These sub-claims consist of (predicate, question) pairs and can be conceptualized as nodes of a computational graph. We then use Program-of-Thought prompting to generate Python code for answering these questions through flexible composition of external tools. Pelican improves over prior work by introducing (1) intermediate variables for precise grounding of object instances, and (2) shared computation for answering the sub-question to enable adaptive corrections and inconsistency identification. We finally use the reasoning abilities of an LLM to verify the correctness of the claim by considering the consistency and confidence of the (question, answer) pairs from each sub-claim. Our experiments reveal a drop in hallucination rate by $\sim$8%-32% across various baseline LVLMs and a 27% drop compared to approaches proposed for hallucination mitigation on MMHal-Bench. Results on two other benchmarks further corroborate our results.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01655 [pdf, other]

Interpretation of recently discovered single bottom baryons in the relativistic flux tube model

Authors: Pooja Jakhad, Juhi Oudichhya, Ajay Kumar Rai

Abstract: Following recent experimental progress in the study of bottom baryons, we systematically calculate the mass spectra of $Λ_{b}$, $Ξ_{b}$, $Σ_{b}$, $Ξ_{b}^{'}$, and $Ω_{b}$ baryons with a quark-diquark picture in the framework of a relativistic flux tube model with spin-dependent interactions in the j-j coupling scheme. Furthermore, we calculate the strong decay width of bottom baryons decaying into a bottom baryon and a light pseudoscalar meson. A good agreement is found between the calculated masses and the experimentally available masses of singly bottom baryons. By analysing both mass spectra and strong decay widths, we interpret $Σ_{b}(6097)$ as a $1P(3/2^{-})$ state and $Ξ_{b}(6100)$ as a $1P(1/2^{-})$ state of $Ξ_{b}$ baryon. The $Ξ_{b}(6227)$ is identified to be an orbital excitation $1P$ of the $Ξ_{b}^{'}$ baryon with $J^{P}=3/2^{-}$. Further, we determine $Ξ_{b}(6327)$ and $Ξ_{b}(6333)$ as a $1P(3/2^{-})$ state and $1P(5/2^{-})$ state, respectively, of $Ξ_{b}^{'}$ baryon. From the obtained mass spectra, we construct the Regge trajectories in the $(J,M^{2})$ plane, which are found to be essentially linear, parallel, and equidistant.
Our predictions for higher orbital and radial excited states can help experimentalists identify missing excited states of singly bottom baryons.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 21 pages, 7 figures, Submitted in PRD

arXiv:2406.17963 [pdf, other]

Empowering Interdisciplinary Insights with Dynamic Graph Embedding Trajectories

Authors: Yiqiao Jin, Andrew Zhao, Yeon-Chang Lee, Meng Ye, Ajay Divakaran, Srijan Kumar

Abstract: We developed DyGETViz, a novel framework for effectively visualizing dynamic graphs (DGs) that are ubiquitous across diverse real-world systems. This framework leverages recent advancements in discrete-time dynamic graph (DTDG) models to adeptly handle the temporal dynamics inherent in dynamic graphs. DyGETViz effectively captures both micro- and macro-level structural shifts within these graphs, offering a robust method for representing complex and massive dynamic graphs. The application of DyGETViz extends to a diverse array of domains, including ethology, epidemiology, finance, genetics, linguistics, communication studies, social studies, and international relations. Through its implementation, DyGETViz has revealed or confirmed various critical insights. These include the diversity of content sharing patterns and the degree of specialization within online communities, the chronological evolution of lexicons across decades, and the distinct trajectories exhibited by aging-related and non-related genes. Importantly, DyGETViz enhances the accessibility of scientific findings to non-domain experts by simplifying the complexities of dynamic graphs. Our framework is released as an open-source Python package for use across diverse disciplines. Our work not only addresses the ongoing challenges in visualizing and analyzing DTDG models but also establishes a foundational framework for future investigations into dynamic graph representation and analysis across various disciplines.

Submitted 28 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

Comments: 27 pages, 11 figures

arXiv:2406.17304 [pdf, other]

Leveraging LLMs for Dialogue Quality Measurement

Authors: Jinghan Jia, Abi Komma, Timothy Leffel, Xujun Peng, Ajay Nagesh, Tamer Soliman, Aram Galstyan, Anoop Kumar

Abstract: In task-oriented conversational AI evaluation, unsupervised methods poorly correlate with human judgments, and supervised approaches lack generalization. Recent advances in large language models (LLMs) show robust zero-shot and few-shot capabilities across NLP tasks. This paper explores using LLMs for automated dialogue quality evaluation, experimenting with various configurations on public and proprietary datasets. Manipulating factors such as model size, in-context examples, and selection techniques, we examine "chain-of-thought" (CoT) reasoning and label extraction procedures. Our results show that (1) larger models yield more accurate dialogue labels; (2) algorithmic selection of in-context examples outperforms random selection; (3) CoT reasoning where an LLM is asked to provide justifications before outputting final labels improves performance; and (4) fine-tuned LLMs outperform out-of-the-box ones. Our results indicate that LLMs that are suitably fine-tuned and have sufficient reasoning capabilities can be leveraged for automated dialogue evaluation.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.15586 [pdf, other]

TinyStyler: Efficient Few-Shot Text Style Transfer with Authorship Embeddings

Authors: Zachary Horvitz, Ajay Patel, Kanishk Singh, Chris Callison-Burch, Kathleen McKeown, Zhou Yu

Abstract: The goal of text style transfer is to transform the style of texts while preserving their original meaning, often with only a few examples of the target style. Existing style transfer methods generally rely on the few-shot capabilities of large language models or on complex controllable text generation approaches that are inefficient and underperform on fluency metrics. We introduce TinyStyler, a lightweight but effective approach, which leverages a small language model (800M params) and pre-trained authorship embeddings to perform efficient, few-shot text style transfer. We evaluate on the challenging task of authorship style transfer and find TinyStyler outperforms strong approaches such as GPT-4. We also evaluate TinyStyler's ability to perform text attribute style transfer (formal $\leftrightarrow$ informal) with automatic and human evaluations and find that the approach outperforms recent controllable text generation methods. Our model has been made publicly available at https://huggingface.co/tinystyler/tinystyler .

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.12804 [pdf, other]

Varying activity and the bursts properties of FRB 20240114A probed with GMRT down to 300 MHz

Authors: Ajay Kumar, Yogesh Maan, Yash Bhusare

Abstract: Repeating fast radio bursts can exhibit a wide range of burst repetition rates, from none to hundreds of bursts per hour. Here, we report the detection and characteristics of 57 bursts from the recently discovered FRB 20240114A, observed with GMRT in the frequency ranges 300-500 MHz and 550-750 MHz. The majority of the bursts show a narrow emission bandwidth with $Δν/ν \sim 10$%. All of the bursts we detect are faint ($<$10 Jy ms), and thus probe the lower end of the energy distribution. We determine the rate function for FRB 20240114A at 400 MHz, and downward drift rates at 400 and 650 MHz, and discuss our measurements in the context of the repeating FRB population. We observe sudden variations in the burst activity of FRB 20240114A over time. From our data, as well as the other publicly available information on observations of FRB 20240114A so far, there is an indication that FRB 20240114A potentially exhibits chromaticity in its burst activity. While the burst properties of FRB 20240114A are similar to other repeating FRBs, the frequency-dependent activity, if established, could provide crucial clues to the origin of repeating FRBs. We also place the most stringent 5$σ$ upper limits of 600 $μ$Jy and 89 $μ$Jy on any persistent radio source (PRS) associated with FRB 20240114A at 400 MHz and 650 MHz, respectively, and compare these with the luminosity of the known PRSs associated with FRB121102A and FRB190520B.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 13 Pages, 5 Figures, Submitted to ApJ

arXiv:2406.12014 [pdf, other]

An IXPE-Led X-ray Spectro-Polarimetric Campaign on the Soft State of Cygnus X-1: X-ray Polarimetric Evidence for Strong Gravitational Lensing

Authors: James F. Steiner, Edward Nathan, Kun Hu, Henric Krawczynski, Michal Dovciak, Alexandra Veledina, Fabio Muleri, Jiri Svoboda, Kevin Alabarta, Maxime Parra, Yash Bhargava, Giorgio Matt, Juri Poutanen, Pierre-Olivier Petrucci, Allyn F. Tennant, M. Cristina Baglio, Luca Baldini, Samuel Barnier, Sudip Bhattacharyya, Stefano Bianchi, Maimouna Brigitte, Mauricio Cabezas, Floriane Cangemi, Fiamma Capitanio, Jacob Casey, et al. (112 additional authors not shown)

Abstract: We present the first X-ray spectropolarimetric results for Cygnus X-1 in its soft state from a campaign of five IXPE observations conducted during 2023 May-June. Companion multiwavelength data during the campaign are likewise shown. The 2-8 keV X-rays exhibit a net polarization degree PD=1.99%+/-0.13% (68% confidence). The polarization signal is found to increase with energy across IXPE's 2-8 keV bandpass. The polarized X-rays exhibit an energy-independent polarization angle of PA=-25.7+/-1.8 deg. East of North (68% confidence). This is consistent with being aligned to Cyg X-1's AU-scale compact radio jet and its pc-scale radio lobes. In comparison to earlier hard-state observations, the soft state exhibits a factor of 2 lower polarization degree, but a similar trend with energy and a similar (also energy-independent) position angle. When scaling by the natural unit of the disk temperature, we find the appearance of a consistent trendline in the polarization degree between soft and hard states. Our favored polarimetric model indicates Cyg X-1's spin is likely high (a* above ~0.96). The substantial X-ray polarization in Cyg X-1's soft state is most readily explained as resulting from a large portion of X-rays emitted from the disk returning and reflecting off the disk surface, generating a high polarization degree and a polarization direction parallel to the black hole spin axis and radio jet. In IXPE's bandpass, the polarization signal is dominated by the returning reflection emission.
This constitutes polarimetric evidence for strong gravitational lensing of X-rays close to the black hole.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 20 pages, accepted for publication in ApJL

arXiv:2406.08988 [pdf, other]

Probing the polarized emission from SMC X-1: the brightest X-ray pulsar observed by IXPE

Authors: Sofia V. Forsblom, Sergey S. Tsygankov, Juri Poutanen, Victor Doroshenko, Alexander A. Mushtukov, Mason Ng, Swati Ravi, Herman L. Marshall, Alessandro Di Marco, Fabio La Monaca, Christian Malacaria, Guglielmo Mastroserio, Vladislav Loktev, Andrea Possenti, Valery F. Suleimanov, Roberto Taverna, Ivan Agudo, Lucio A. Antonelli, Matteo Bachetti, Luca Baldini, Wayne H. Baumgartner, Ronaldo Bellazzini, Stefano Bianchi, Stephen D. Bongiorno, Raffaella Bonino , et al. (79 additional authors not shown)

Abstract: Recent observations of X-ray pulsars (XRPs) performed by the Imaging X-ray Polarimetry Explorer (IXPE) have made it possible to investigate the intricate details of these objects in a new way, thanks to the added value of X-ray polarimetry. Here we present the results of the IXPE observations of SMC X-1, a member of the small group of XRPs displaying super-orbital variability. SMC X-1 was observed by IXPE three separate times during the high state of its super-orbital period. The observed luminosity in the 2-8 keV energy band of $L=2\times10^{38}$ erg/s makes SMC X-1 the brightest XRP ever observed by IXPE. We detect significant polarization in all three observations, with values of the phase-averaged polarization degree (PD) and polarization angle (PA) of $3.2\pm0.8$% and $97°\pm8°$ for Observation 1, $3.0\pm0.9$% and $90°\pm8°$ for Observation 2, and $5.5\pm1.1$% and $80°\pm6°$ for Observation 3, for the spectro-polarimetric analysis. The observed PD shows an increase over time with decreasing luminosity, while the PA decreases in decrements of 10°. The phase-resolved spectro-polarimetric analysis reveals significant detection of polarization in three out of seven phase bins, with the PD ranging between 2% and 10%, and a corresponding range in the PA from $\sim$70° to $\sim$100°. The pulse-phase resolved PD displays an apparent anti-correlation with the flux. Using the rotating vector model, we obtain constraints on the pulsar's geometrical properties for the individual observations. The position angle of the pulsar displays an evolution over time supporting the idea that we observe changes related to different super-orbital phases. Scattering in the wind of the precessing accretion disk may be responsible for the behavior of the polarimetric properties observed during the high-state of SMC X-1's super-orbital period.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 11 pages, 11 figures, submitted to A&A

arXiv:2406.07629 [pdf, other]

Exact lattice bosonization of finite N matrix quantum mechanics and c = 1

Authors: Gautam Mandal, Ajay Mohan

Abstract: We describe a new exact lattice bosonization of matrix quantum mechanics (equivalently of non-relativistic fermions) that is valid for arbitrary rank N of the matrix, based on an exact operator bosonization introduced earlier in [1]. The trace identities are automatically incorporated in this formalism. The finite number N of fermions is reflected in the finite number N of bosonic oscillators, or equivalently to the finite number N of lattice points. The fermion Hamiltonian is exactly mappable to a bosonic Hamiltonian. At large N, the latter becomes local and corresponds to the lattice version of a relativistic boson Hamiltonian, with a lattice spacing of order 1/N. The finite lattice spacing leads to a finite entanglement entropy (EE) of the bosonic theory, which reproduces the finite EE of the fermionic theory. Such a description is not available in the standard bosonization in terms of fermion density fluctuations on the Fermi surface, which does not have a built-in short distance cut-off (see, however, [2]). The bosonic lattice is equipped with a geometry determined by the matrix potential or equivalently by the shape of the Fermi surface. Our bosonization also works in the double scaled c=1 model, where the bosonic EE again turns out to be finite, with the short distance cut-off turning as g_s l_s, and reproduces the matrix result. Once again, such a short distance cut-off cannot appear in the conventional dual of c=1 in terms of the 2D string ``tachyon'', where the expected short distance scale is l_s. This indicates our bosonization as a possibly different dual description to the c=1 matrix model appropriate for ``local physics'' like quantum entanglement, by contrast with the conventional duality to the eigenvalue density which works well for asymptotic observables like S-matrices. We briefly discuss possible relation of our bosonization to D0 branes.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 25 pages + appendices, 9 figures (v1)

Report number: TIFR/TH/24-6

arXiv:2406.07622 [pdf, other]

Primordial Black Holes from First-Order Phase Transition in the xSM

Authors: Dorival Gonçalves, Ajay Kaladharan, Yongcheng Wu

Abstract: Supercooled first-order phase transition (FOPT) can lead to the formation of primordial black holes (PBHs). This scenario imposes stringent requirements on the profile of the effective potential. In this work, we use the singlet extended Standard Model (xSM) as a benchmark model to investigate this possibility at the electroweak scale. The PBHs formed during a supercooled FOPT have a narrow mass distribution around the mass of Earth. This distribution is closely tied to the temperature at which the PBHs form, corresponding to the FOPT at the electroweak scale. This scenario can be probed with microlensing experiments, space-based gravitational wave detectors, and collider experiments. Remarkably, the future space-based gravitational wave detector LISA will hold the potential to either confirm this PBH scenario in the xSM or completely rule it out for extremely small total dark matter fraction made of PBHs, down to $f_{\rm PBH}> 10^{-300}$. Interestingly, our findings suggest that PBHs within the xSM framework may align with observations of the six ultrashort timescale events reported by the OGLE microlensing experiment.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 46 pages, 12 figures and 1 table

arXiv:2406.05643 [pdf, other]

doi 10.1103/PhysRevB.109.235401

Predicting edge-localized monovacancy defects in zigzag graphene nanoribbons from Floquet quasienergy spectrum

Authors: Gulshan Kumar, Shashikant Kumar, Ajay Kumar, Prakash Parida

Abstract: In this work, we prescribe a theoretical framework aiming at predicting the position of monovacancy defects at the edges of zigzag graphene nanoribbons (ZGNRs) using Floquet-Bloch formalism, which can be experimentally observed through time- and angle-resolved photoemission spectroscopy (tr-ARPES). Our methodology involves an in-depth investigation of the Floquet quasienergy band spectrum influenced by light with varying polarization across a range of frequencies. Particularly under the influence of circularly polarized light with a frequency comparable to the bandwidth of the system, our findings suggest a promising approach for locating monovacancy defects at either edge, a challenge that proves intricate to predict from the ARPES spectrum of ZGNRs with monovacancy defects. This has been achieved by analyzing the orientation of the Floquet edge state and the appearance of new Dirac points in the vicinity of the Fermi level. The real-world applications of these captivating characteristics underscore the importance and pertinence of our theoretical framework, paving the way for additional exploration and practical use. Our approach, employing the Floquet formalism, is not limited to monovacancy-type defects; rather, it can be expanded to encompass various types of vacancy defects.

Submitted 9 June, 2024; originally announced June 2024.

Comments: Total number of 10 pages and 12 figures

Journal ref: Physical Review B 109, 235401 (2024)

arXiv:2406.05342 [pdf]

Compensation for reactive power and harmonic current drawn by a non-linear load in a pv-micro hydro grid

Authors: Raj Krishna Nepal, Bibek Khanal, Sanket Khatiwada, Nirajan Bhandari, Bishal Rijal, Raisha Karmacharya, Ajay Thapa

Abstract: This paper presents a simulation approach to enhance the power quality of a PV-micro hydro grid supplying both linear consumer load and non-linear industrial load by integrating Shunt Active Power Filter (SAPF), utilizing instantaneous PQ theory and hysteresis current control band logic. The non-linear load draws reactive power and harmonic current from the source thereby affecting the power quality. The integration of the SAPF at the point of common coupling (PCC) offers reactive power and harmonic current compensation, ensuring that the current supply to the grid remains nearly sinusoidal and proportional to the active power. By injecting equal and opposite harmonic components, the SAPF effectively reduces Total Harmonic Distortion (THD) from 7% to 2.96%, thereby enhancing the overall power quality of the PV-micro hydro grid system.

Submitted 8 June, 2024; originally announced June 2024.

Comments: 5 pages, 21 figures, submitted on IEEE powercon 2024 conference

arXiv:2406.04805 [pdf, other]

GENIE: Watermarking Graph Neural Networks for Link Prediction

Authors: Venkata Sai Pranav Bachina, Ankit Gangwal, Aaryan Ajay Sharma, Charu Sharma

Abstract: Graph Neural Networks (GNNs) have advanced the field of machine learning by utilizing graph-structured data, which is ubiquitous in the real world. GNNs have applications in various fields, ranging from social network analysis to drug discovery. GNN training is strenuous, requiring significant computational resources and human expertise. It makes a trained GNN an indispensable Intellectual Property (IP) for its owner. Recent studies have shown GNNs to be vulnerable to model-stealing attacks, which raises concerns over IP rights protection. Watermarking has been shown to be effective at protecting the IP of a GNN model. Existing efforts to develop a watermarking scheme for GNNs have only focused on the node classification and the graph classification tasks. To the best of our knowledge, we introduce the first-ever watermarking scheme for GNNs tailored to the Link Prediction (LP) task. We call our proposed watermarking scheme GENIE (watermarking Graph nEural Networks for lInk prEdiction). We design GENIE using a novel backdoor attack to create a trigger set for two key methods of LP: (1) node representation-based and (2) subgraph-based. In GENIE, the watermark is embedded into the GNN model by training it on both the trigger set and a modified training set, resulting in a watermarked GNN model. To assess a suspect model, we verify the watermark against the trigger set. We extensively evaluate GENIE across 3 model architectures (i.e., SEAL, GCN, and GraphSAGE) and 7 real-world datasets. Furthermore, we validate the robustness of GENIE against 11 state-of-the-art watermark removal techniques and 3 model extraction attacks. We also demonstrate that GENIE is robust against ownership piracy attack. Our ownership demonstration scheme statistically guarantees both False Positive Rate (FPR) and False Negative Rate (FNR) to be less than $10^{-6}$.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 20 pages, 12 figures

arXiv:2406.02523 [pdf, other]

RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Authors: Soroush Nasiriany, Abhiram Maddukuri, Lance Zhang, Adeet Parikh, Aaron Lo, Abhishek Joshi, Ajay Mandlekar, Yuke Zhu

Abstract: Recent advancements in Artificial Intelligence (AI) have largely been propelled by scaling. In Robotics, scaling is hindered by the lack of access to massive robot datasets. We advocate using realistic physical simulation as a means to scale environments, tasks, and datasets for robot learning methods. We present RoboCasa, a large-scale simulation framework for training generalist robots in everyday environments. RoboCasa features realistic and diverse scenes focusing on kitchen environments. We provide thousands of 3D assets across over 150 object categories and dozens of interactable furniture and appliances. We enrich the realism and diversity of our simulation with generative AI tools, such as object assets from text-to-3D models and environment textures from text-to-image models. We design a set of 100 tasks for systematic evaluation, including composite tasks generated by the guidance of large language models. To facilitate learning, we provide high-quality human demonstrations and integrate automated trajectory generation methods to substantially enlarge our datasets with minimal human burden. Our experiments show a clear scaling trend in using synthetically generated robot data for large-scale imitation learning and show great promise in harnessing simulation data in real-world tasks. Videos and open-source code are available at https://robocasa.ai/

Submitted 4 June, 2024; originally announced June 2024.

Comments: RSS 2024

arXiv:2406.01693 [pdf, other]

IXPE observation of PKS 2155-304 reveals the most highly polarized blazar

Authors: Pouya M. Kouch, Ioannis Liodakis, Riccardo Middei, Dawoon E. Kim, Fabrizio Tavecchio, Alan P. Marscher, Herman L. Marshall, Steven R. Ehlert, Laura Di Gesu, Svetlana G. Jorstad, Iván Agudo, Grzegorz M. Madejski, Roger W. Romani, Manel Errando, Elina Lindfors, Kari Nilsson, Ella Toppari, Stephen B. Potter, Ryo Imazawa, Mahito Sasada, Yasushi Fukazawa, Koji S. Kawabata, Makoto Uemura, Tsunefumi Mizuno, Tatsuya Nakaoka , et al. (111 additional authors not shown)

Abstract: We report the X-ray polarization properties of the high-synchrotron-peaked (HSP) blazar PKS 2155$-$304 based on observations with the Imaging X-ray Polarimetry Explorer (IXPE). We observed the source between Oct 27 and Nov 7, 2023. We also conducted an extensive contemporaneous multiwavelength (MW) campaign. We find that during the first half ($T_1$) of the IXPE pointing, the source exhibited the highest X-ray polarization degree detected for an HSP blazar thus far, (30.7$\pm$2.0)%, which dropped to (15.3$\pm$2.1)% during the second half ($T_2$). The X-ray polarization angle remained stable during the IXPE pointing at 129.4$^\circ$$\pm$1.8$^\circ$ and 125.4$^\circ$$\pm$3.9$^\circ$ during $T_1$ and $T_2$, respectively. Meanwhile, the optical polarization degree remained stable during the IXPE pointing, with average host-galaxy-corrected values of (4.3$\pm$0.7)% and (3.8$\pm$0.9)% during the $T_1$ and $T_2$, respectively. During the IXPE pointing, the optical polarization angle changed achromatically from $\sim$140$^\circ$ to $\sim$90$^\circ$ and back to $\sim$130$^\circ$. Despite several attempts, we only detected (99.7% conf.) the radio polarization once (during $T_2$, at 225.5 GHz): with degree (1.7$\pm$0.4)% and angle 112.5$^\circ$$\pm$5.5$^\circ$. The direction of the broad pc-scale jet is rather ambiguous and has been found to point to the east and south at different epochs; however, on larger scales (> 1.5 pc) the jet points toward the southeast ($\sim$135$^\circ$), similar to all of the MW polarization angles. Moreover, the X-ray to optical polarization degree ratios of $\sim$7 and $\sim$4 during $T_1$ and $T_2$, respectively, are similar to previous IXPE results for several HSP blazars. These findings, combined with the lack of correlation of temporal variability between the MW polarization properties, agree with an energy-stratified shock-acceleration scenario in HSP blazars.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 17 pages, 10 figures, 4 tables, Accepted for publication in A&A

arXiv:2406.00989 [pdf, other]

On the exact limit of the time-dependent coupled cluster ansatz and its approximations in the real-time equation-of-motion coupled cluster cumulant Green's function approach

Authors: Bo Peng, Himadri Pathak, Ajay Panyala, Fernando D. Vila, John J. Rehr, Karol Kowalski

Abstract: In this paper, we analyze the properties of the recently proposed real-time equation-of-motion coupled-cluster (RT-EOM-CC) cumulant Green's function approach [J. Chem. Phys. 2020, 152, 174113]. We specifically focus on identifying the limitations of the original time-dependent coupled cluster (TDCC) ansatz and propose an enhanced extended TDCC ansatz ensuring the exactness in the expansion limit. Additionally, we introduce a practical cluster-analysis-based approach for characterizing the peaks in the computed spectral function from the RT-EOM-CC cumulant Green's function approach, which is particularly useful for the assignments of satellite peaks when many-body effects dominate the spectra. Our preliminary numerical tests focus on reproducing, approximating, and characterizing the exact impurity Green's function of the three-site and four-site single impurity Anderson models using the RT-EOM-CC cumulant Green's function approach. The numerical tests allow us to have a direct comparison between the RT-EOM-CC cumulant Green's function approach and other Green's function approaches in the numerical exact limit.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.00681 [pdf, other]

Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient

Authors: Zechu Li, Rickmer Krohn, Tao Chen, Anurag Ajay, Pulkit Agrawal, Georgia Chalvatzaki

Abstract: Deep reinforcement learning (RL) algorithms typically parameterize the policy as a deep network that outputs either a deterministic action or a stochastic one modeled as a Gaussian distribution, hence restricting learning to a single behavioral mode. Meanwhile, diffusion models emerged as a powerful framework for multimodal learning. However, the use of diffusion policies in online RL is hindered by the intractability of policy likelihood approximation, as well as the greedy objective of RL methods that can easily skew the policy to a single mode. This paper presents Deep Diffusion Policy Gradient (DDiffPG), a novel actor-critic algorithm that learns from scratch multimodal policies parameterized as diffusion models while discovering and maintaining versatile behaviors. DDiffPG explores and discovers multiple modes through off-the-shelf unsupervised clustering combined with novelty-based intrinsic motivation. DDiffPG forms a multimodal training batch and utilizes mode-specific Q-learning to mitigate the inherent greediness of the RL objective, ensuring the improvement of the diffusion policy across all modes. Our approach further allows the policy to be conditioned on mode-specific embeddings to explicitly control the learned modes. Empirical studies validate DDiffPG's capability to master multimodal behaviors in complex, high-dimensional continuous control tasks with sparse rewards, also showcasing proof-of-concept dynamic online replanning when navigating mazes with unseen obstacles.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00086 [pdf, ps, other]

Leptogenesis in exponential $f(R)$ gravity model

Authors: Suhail Khan, Ajay Bassi, Rathin Adhikari

Abstract: We show that gravitational leptogenesis with dynamical $CPT$ breaking in an expanding universe can be reconciled with exponential $f(R)$ gravity models with axion as cold dark matter. For $L$ violating interactions, we consider both non-supersymmetric model with heavy right handed neutrino decay and supersymmetric model with sneutrino decay. For both the cases, we have shown that the required baryonic asymmetry could be obtained and also have shown the variation of decoupling temperature for lepton number violating interactions with $β$ parameter in exponential $f(R)$ gravity. Lepton number violating model parameters are constrained with $β$ through decoupling temperature. Upper bound on $β$ parameter is also obtained.

Submitted 31 May, 2024; originally announced June 2024.

Comments: 6 pages, 3 figures

arXiv:2405.20820 [pdf, ps, other]

Constrained Dynamics Simulation: More With Less

Authors: Ajay Suresha Sathya

Abstract: Efficient robot dynamics simulation is a fundamental problem key for robot control, identification, design and analysis. This research statement explores my current progress in this field and future research directions.

Submitted 31 May, 2024; originally announced May 2024.

Comments: Accepted submission to RSS:24 Pioneers Workshop

arXiv:2405.20309 [pdf, other]

Large Language Models Can Self-Improve At Web Agent Tasks

Authors: Ajay Patel, Markus Hofmarcher, Claudiu Leoveanu-Condrei, Marius-Constantin Dinu, Chris Callison-Burch, Sepp Hochreiter

Abstract: Training models to act as agents that can effectively navigate and perform actions in a complex environment, such as a web browser, has typically been challenging due to lack of training data. Large language models (LLMs) have recently demonstrated some capability to navigate novel environments as agents in a zero-shot or few-shot fashion, purely guided by natural language instructions as prompts. Recent research has also demonstrated LLMs have the capability to exceed their base performance through self-improvement, i.e. fine-tuning on data generated by the model itself. In this work, we explore the extent to which LLMs can self-improve their performance as agents in long-horizon tasks in a complex environment using the WebArena benchmark. In WebArena, an agent must autonomously navigate and perform actions on web pages to achieve a specified objective. We explore fine-tuning on three distinct synthetic training data mixtures and achieve a 31\% improvement in task completion rate over the base model on the WebArena benchmark through a self-improvement procedure. We additionally contribute novel evaluation metrics for assessing the performance, robustness, capabilities, and quality of trajectories of our fine-tuned agent models to a greater degree than simple, aggregate-level benchmark scores currently used to measure self-improvement.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19457 [pdf, ps, other]

Construction of a Byzantine Linearizable SWMR Atomic Register from SWSR Atomic Registers

Authors: Ajay D. Kshemkalyani, Manaswini Piduguralla, Sathya Peri, Anshuman Misra

Abstract: The SWMR atomic register is a fundamental building block in shared memory distributed systems and implementing it from SWSR atomic registers is an important problem. While this problem has been solved in crash-prone systems, it has received less attention in Byzantine systems. Recently, Hu and Toueg gave such an implementation of the SWMR register from SWSR registers. While their definition of register linearizability is consistent with the definition of Byzantine linearizability of a concurrent history of Cohen and Keidar, it has these drawbacks. (1) If the writer is Byzantine, the register is linearizable no matter what values the correct readers return. (2) It ignores values written consistently by a Byzantine writer. We need a stronger notion of a {\em correct write operation}. (3) It allows a value written to just one or a few readers' SWSR registers to be returned, thereby not validating the intention of the writer to write that value honestly. (4) Its notion of a ``current'' value returned by a correct reader is not related to the most recent value written by a correct write operation of a Byzantine writer. We need a more up to date version of the value that can be returned by a correct reader. In this paper, we give a stronger definition of a Byzantine linearizable register that overcomes the above drawbacks. Then we give a construction of a Byzantine linearizable SWMR atomic register from SWSR registers that meets our stronger definition. The construction is correct when $n>3f$, where $n$ is the number of readers, $f$ is the maximum number of Byzantine readers, and the writer can also be Byzantine. The construction relies on a public-key infrastructure.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 18 pages

ACM Class: C.2.4; D.1.3

arXiv:2405.17247 [pdf, other]

An Introduction to Vision-Language Modeling

Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie , et al. (16 additional authors not shown)

Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technology. However, there are many challenges that need to be addressed to improve the reliability of those models. While language is discrete, vision evolves in a much higher dimensional space in which concepts cannot always be easily discretized. To better understand the mechanics behind mapping vision to language, we present this introduction to VLMs which we hope will help anyone who would like to enter the field. First, we introduce what VLMs are, how they work, and how to train them. Then, we present and discuss approaches to evaluate VLMs. Although this work primarily focuses on mapping images to language, we also discuss extending VLMs to videos.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16615 [pdf, ps, other]

Rough geometric integration

Authors: Ajay Chandra, Harprit Singh

Abstract: We introduce a notion of distributional $k$-forms on $d$-dimensional manifolds which can be integrated against suitably regular $k$-submanifolds. Our approach combines ideas from Whitney's geometric integration [Whi57] with those of sewing approaches to rough integration [Gub04, FdLP06].

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.15428 [pdf, other]

Enhancing Pollinator Conservation towards Agriculture 4.0: Monitoring of Bees through Object Recognition

Authors: Ajay John Alex, Chloe M. Barnes, Pedro Machado, Isibor Ihianle, Gábor Markó, Martin Bencsik, Jordan J. Bird

Abstract: In an era of rapid climate change and its adverse effects on food production, technological intervention to monitor pollinator conservation is of paramount importance for environmental monitoring and conservation for global food security. The survival of the human species depends on the conservation of pollinators. This article explores the use of Computer Vision and Object Recognition to autonomously track and report bee behaviour from images. A novel dataset of 9664 images containing bees is extracted from video streams and annotated with bounding boxes. With training, validation and testing sets (6722, 1915, and 997 images, respectively), the results of the COCO-based YOLO model fine-tuning approaches show that YOLOv5m is the most effective approach in terms of recognition accuracy. However, YOLOv5s was shown to be the most optimal for real-time bee detection with an average processing and inference time of 5.1ms per video frame at the cost of slightly lower ability. The trained model is then packaged within an explainable AI interface, which converts detection events into timestamped reports and charts, with the aim of facilitating use by non-technical users such as expert stakeholders from the apiculture industry towards informing responsible consumption and production.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.13856 [pdf, ps, other]

In-beam $γ-$spectroscopy of the transitional nucleus $^{217}$Ac

Authors: Dhananjaya Sahoo, A. Y. Deo, Madhu, Khamosh Yadav, S. S. Tiwary, P. C. Srivastava, R. Palit, S. K. Tandel, Anil Kumar, P. Dey, Biswajit Das, Vishal Malik, A. Kundu, A. Sindhu, S. V. Jadhav, B. S. Naidu, A. V. Thomas

Abstract: High-spin states in the transitional $^{217}$Ac nucleus are established up to 3.8 MeV excitation energy and $I^π =$ 41/2$^+$ with the addition of around 20 new transitions. The structure of the yrast and near-yrast states below the 29/2$^+$ isomer is revisited. The inconsistencies in the level schemes reported earlier are resolved. The level structure above the 29/2$^+$ isomer is established for the first time. Large-basis shell-model calculations with the KHPE interaction are performed to compare the experimentally observed level energies with the theoretical predictions. A comparison with the systematics of the N = 128 isotones suggests that the yrast structures result from a weak coupling of the odd proton to the even-even $^{216}$Ra core, which is consistent with the shell-model configurations. Furthermore, alpha decay of the 29/2$^+$ isomer is revisited and the decay scheme established from this work is discussed in the framework of the shell model.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 11 pages, 9 figures

arXiv:2405.11362 [pdf, other]

The Role of a Diluent in Deformation-Induced Bonding of Glassy Polymer Bidisperse Blends

Authors: Ajay Vallabh, John G Tsavalas

Abstract: Bonding between polymers below the glass transition temperature through molecular-scale dilatation (or densification)-based interdiffusion of macromolecules has recently been introduced. In this mechanism, short timeframe plastic deformation enables polymer chains to interdiffuse and form entanglements at the interface, facilitating rapid bonding below the glass transition temperature ($T_g$). Here, we are addressing the role of a lower molecular weight diluent in bonding polymer interfaces of bidisperse blends through deformation-induced bonding (DIB) at temperatures well below both the surface and bulk glass transition temperatures, $T_g^s$ and $T_g^b$, respectively, by using molecular simulations. These simulations reveal that addition of the diluent ($φ\le$20\%) drastically enhances the number of chain-ends at the interfacial region compared to a pure glass sample ($φ=0\%$) during deformation below $T_g^s$, which improves the possibility of opposite side entanglement formation. The changes in stress-strain response of debonded samples correlate with the normalized entanglement density. Likewise, the maximum interfacial fracture energy $G_{I,max}$ of debonded samples is correlated with the diluent concentration ($φ$), below $T_g^s$. Furthermore, the optimization of material and process conditions for DIB has yielded a notable advancement for the conditions tested here: achieving a higher bonding strength, approximately one-third of the bulk, all while remaining below $T_g$.

Submitted 29 May, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

arXiv:2405.11343 [pdf, other]

Sub-relativistic Outflow and Hours-Timescale Large-amplitude X-ray Dips during Super-Eddington Accretion onto a Low-mass Massive Black Hole in the Tidal Disruption Event AT2022lri

Authors: Yuhan Yao, Muryel Guolo, Francesco Tombesi, Ruancun Li, Suvi Gezari, Javier A. García, Lixin Dai, Ryan Chornock, Wenbin Lu, S. R. Kulkarni, Keith C. Gendreau, Dheeraj R. Pasham, S. Bradley Cenko, Erin Kara, Raffaella Margutti, Yukta Ajay, Thomas Wevers, Tom M. Kwan, Igor Andreoni, Joshua S. Bloom, Andrew J. Drake, Matthew J. Graham, Erica Hammerstein, Russ R. Laher, Natalie LeBaron , et al. (10 additional authors not shown)

Abstract: We present the tidal disruption event (TDE) AT2022lri, hosted in a nearby ($\approx\!144$ Mpc) quiescent galaxy with a low-mass massive black hole ($10^4\,M_\odot < M_{\rm BH} < 10^6\,M_\odot$). AT2022lri belongs to the TDE-H+He subtype. More than 1 Ms of X-ray data were collected with NICER, Swift, and XMM-Newton from 187 d to 672 d after peak. The X-ray luminosity gradually declined from $1.5\times 10^{44}\,{\rm erg\,s^{-1}}$ to $1.5\times 10^{43}\,{\rm erg\,s^{-1}}$ and remains much above the UV and optical luminosity, consistent with a super-Eddington accretion flow viewed face-on. Sporadic strong X-ray dips atop a long-term decline are observed, with variability timescale of $\approx\!0.5$ hr--1 d and amplitude of $\approx\!2$--8. When fitted with simple continuum models, the X-ray spectrum is dominated by a thermal disk component with inner temperature going from $\sim\! 146$ eV to $\sim\! 86$ eV. However, there are residual features that peak around 1 keV, which, in some cases, cannot be reproduced by a single broad emission line. We analyzed a subset of time-resolved spectra with two physically motivated models describing either a scenario where ionized absorbers contribute extra absorption and emission lines or where disk reflection plays an important role. Both models provide good and statistically comparable fits, show that the X-ray dips are correlated with drops in the inner disk temperature, and require the existence of sub-relativistic (0.1--0.3$c$) ionized outflows. We propose that the disk temperature fluctuation stems from episodic drops of the mass accretion rate triggered by magnetic instabilities or/and wobbling of the inner accretion disk along the black hole's spin axis.

Submitted 18 May, 2024; originally announced May 2024.

Comments: 35 pages, 20 figures, submitted

arXiv:2405.08107 [pdf, other]

Studying geometry of the ultraluminous X-ray pulsar Swift J0243.6+6124 using X-ray and optical polarimetry

Authors: Juri Poutanen, Sergey S. Tsygankov, Victor Doroshenko, Sofia V. Forsblom, Peter Jenke, Philip Kaaret, Andrei V. Berdyugin, Dmitry Blinov, Vadim Kravtsov, Ioannis Liodakis, Anastasia Tzouvanou, Alessandro Di Marco, Jeremy Heyl, Fabio La Monaca, Alexander A. Mushtukov, George G. Pavlov, Alexander Salganik, Alexandra Veledina, Martin C. Weisskopf, Silvia Zane, Vladislav Loktev, Valery F. Suleimanov, Colleen Wilson-Hodge, Svetlana V. Berdyugina, Masato Kagitani , et al. (86 additional authors not shown)

Abstract: Discovery of pulsations from a number of ultra-luminous X-ray (ULX) sources proved that accretion onto neutron stars can produce luminosities exceeding the Eddington limit by a couple of orders of magnitude. The conditions necessary to achieve such high luminosities as well as the exact geometry of the accretion flow in the neutron star vicinity are, however, a matter of debate. The pulse phase-resolved polarization measurements that became possible with the launch of the IXPE can be used to determine the pulsar geometry and its orientation relative to the orbital plane. They provide an avenue to test different theoretical models of ULX pulsars. In this paper we present the results of three IXPE observations of the first Galactic ULX pulsar Swift J0243.6+6124 during its 2023 outburst. We find strong variations of the polarization characteristics with the pulsar phase. The average polarization degree increased from about 5% to 15% as the flux dropped by a factor of three in the course of the outburst. The polarization angle (PA) as a function of the pulsar phase shows two peaks in the first two observations, but changes to a characteristic sawtooth pattern in the remaining data set. This is not consistent with a simple rotating vector model. Assuming the existence of an additional constant polarized component, we were able to fit the three observations with a common rotating vector model and obtain constraints on the pulsar geometry. In particular, we find the pulsar angular momentum inclination with respect to the line-of-sight of 15-40 deg, the magnetic obliquity of 60-80 deg, and the pulsar spin position angle of -50 deg, which differs from the constant component PA of about 10 deg. Combining these X-ray measurements with the optical PA, we find evidence for a 30 deg misalignment between the pulsar spin and the binary orbital axis.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 13 pages, 10 figures, submitted to A&A

arXiv:2405.07577 [pdf, other]

doi 10.3847/2041-8213/ad4a68

Discovery of a shock-compressed magnetic field in the north-western rim of the young supernova remnant RX J1713.7-3946 with X-ray polarimetry

Authors: Riccardo Ferrazzoli, Dmitry Prokhorov, Niccolò Bucciantini, Patrick Slane, Jacco Vink, Martina Cardillo, Yi-Jung Yang, Stefano Silvestri, ** Zhou, Enrico Costa, Nicola Omodei, C. -Y. Ng, Paolo Soffitta, Martin C. Weisskopf, Luca Baldini, Alessandro Di Marco, Victor Doroshenko, Jeremy Heyl, Philip Kaaret, Dawoon E. Kim, Frédéric Marin, Tsunefumi Mizuno, Melissa Pesce-Rollins, Carmelo Sgrò, Douglas A. Swartz , et al. (77 additional authors not shown)

Abstract: Supernova remnants (SNRs) provide insights into cosmic-ray acceleration and magnetic field dynamics at shock fronts. Recent X-ray polarimetric measurements by the Imaging X-ray Polarimetry Explorer (IXPE) have revealed radial magnetic fields near particle acceleration sites in young SNRs, including Cassiopeia A, Tycho, and SN 1006. We present here the spatially-resolved IXPE X-ray polarimetric observation of the northwestern rim of SNR RX J1713.7-3946. For the first time, our analysis shows that the magnetic field in particle acceleration sites of this SNR is oriented tangentially with respect to the shock front. Because of the lack of precise Faraday-rotation measurements in the radio band, this was not possible before. The average measured polarization degree (PD) of the synchrotron emission is 12.5 ± 3.3%, lower than the one measured by IXPE in SN 1006, comparable to the Tycho one, but notably higher than the one in Cassiopeia A. On sub-parsec scales, localized patches within RX J1713.7-3946 display PD up to 41.5 ± 9.5%. These results are compatible with a shock-compressed magnetic field. However, in order to explain the observed PD, either the presence of a radial net magnetic field upstream of the shock, or partial reisotropization of the turbulence downstream by radial magneto-hydrodynamical instabilities, can be invoked. From comparison of PD and magnetic field distribution with γ-rays and $^{12}$CO data, our results provide new inputs in favor of a leptonic origin of the γ-ray emission.

Submitted 10 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

Comments: 18 pages, 6 figures, 2 tables, published in ApJ Letters

Journal ref: ApJL 967 L38 (2024)

arXiv:2405.06777 [pdf, other]

Multiple magnetic interactions and large inverse magnetocaloric effect in TbSi and TbSi$_{0.6}$Ge$_{0.4}$

Authors: Ajay Kumar, Prashant Singh, Andrew Doyle, Deborah L. Schlagel, Yaroslav Mudryk

Abstract: We present a comprehensive investigation of the electronic structure, magnetization, specific heat, and crystallography of TbSi (FeB structure type) and TbSi$_{0.6}$Ge$_{0.4}$ (CrB structure type) compounds. Both TbSi and TbSi$_{0.6}$Ge$_{0.4}$ exhibit two antiferromagnetic (AFM) transitions at T$_{\rm N1}\approx$ 58~K and 57~K, and T$_{\rm N2}\approx$ 36~K and 44~K, respectively, along with an onset of weak metamagnetic-like transition around 6~T between T$_{\rm N1}$ and T$_{\rm N2}$. High-resolution specific heat (C$_{\rm P}$) measurements show the second- and first-order nature of the magnetic transition at T$_{\rm N1}$ and T$_{\rm N2}$, respectively, for both samples. However, in the case of TbSi, the low-temperature (LT) AFM to high-temperature (HT) AFM transition takes place via an additional AFM phase at the intermediate temperature (IT), where both LT to IT AFM and IT to HT AFM phase transitions exhibit a first-order nature. Both TbSi and TbSi$_{0.6}$Ge$_{0.4}$ manifest significant magnetic entropy changes ($ΔS_{\rm M}$) of 9.6 and 11.6~J/kg-K, respectively, for $Δμ_0H$=7~T, at T$_{\rm N2}$. The HT AFM phase of TbSi$_{0.6}$Ge$_{0.4}$ is found to be more susceptible to the external magnetic field, causing a significant broadening in the peaks of $ΔS_{\rm M}$ curves at higher magnetic fields. Temperature and field-dependent specific heat data have been utilized to construct the complex H-T phase diagram of these compounds. Furthermore, temperature-dependent x-ray diffraction measurements demonstrate substantial magnetostriction and anisotropic thermal expansion of the unit cell in both samples.

Submitted 10 May, 2024; originally announced May 2024.

Comments: Submitted on 10 May 2024

arXiv:2405.06676 [pdf, other]

EDA Corpus: A Large Language Model Dataset for Enhanced Interaction with OpenROAD

Authors: Bing-Yue Wu, Utsav Sharma, Sai Rahul Dhanvi Kankipati, Ajay Yadav, Bintu Kappil George, Sai Ritish Guntupalli, Austin Rovinski, Vidya A. Chhabria

Abstract: Large language models (LLMs) serve as powerful tools for design, providing capabilities for both task automation and design assistance. Recent advancements have shown tremendous potential for facilitating LLM integration into the chip design process; however, many of these works rely on data that are not publicly available and/or not permissively licensed for use in LLM training and distribution. In this paper, we present a solution aimed at bridging this gap by introducing an open-source dataset tailored for OpenROAD, a widely adopted open-source EDA toolchain. The dataset features over 1000 data points and is structured in two formats: (i) a pairwise set comprised of question prompts with prose answers, and (ii) a pairwise set comprised of code prompts and their corresponding OpenROAD scripts. By providing this dataset, we aim to facilitate LLM-focused research within the EDA domain. The dataset is available at https://github.com/OpenROAD-Assistant/EDA-Corpus.

Submitted 4 May, 2024; originally announced May 2024.

Comments: Under review at Workshop on LLM-Aided Design (LAD'24)

arXiv:2405.06417 [pdf, other]

Hadron Spectroscopy: Light, Strange Baryons

Authors: Chandni Menapara, Ajay Kumar Rai

Abstract: The resonance mass spectra have been studied through a non-relativistic hypercentral Constituent Quark Model (hCQM) using a linear potential. Also, the effects of higher order correction terms (${\cal{O}}(\frac{1}{m})$, ${\cal{O}}(\frac{1}{m^{2}})$) have been studied for improvement of the results. Other baryonic properties such as Regge trajectories, magnetic moment and decay widths have been considered. A detailed comparison with other approaches is discussed in the present review.

Submitted 10 May, 2024; originally announced May 2024.

Comments: This is a Brief Review of Light Baryons. To be published in Few-Body Systems

arXiv:2405.05506 [pdf, other]

Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias

Authors: Shan Chen, Jack Gallifant, Mingye Gao, Pedro Moreira, Nikolaj Munch, Ajay Muthukkumar, Arvind Rajan, Jaya Kolluri, Amelia Fiske, Janna Hastings, Hugo Aerts, Brian Anthony, Leo Anthony Celi, William G. La Cava, Danielle S. Bitterman

Abstract: Large language models (LLMs) are increasingly essential in processing natural languages, yet their application is frequently compromised by biases and inaccuracies originating in their training data. In this study, we introduce Cross-Care, the first benchmark framework dedicated to assessing biases and real world knowledge in LLMs, specifically focusing on the representation of disease prevalence across diverse demographic groups. We systematically evaluate how demographic biases embedded in pre-training corpora like $ThePile$ influence the outputs of LLMs. We expose and quantify discrepancies by juxtaposing these biases against actual disease prevalences in various U.S. demographic groups. Our results highlight substantial misalignment between LLM representation of disease prevalence and real disease prevalence rates across demographic subgroups, indicating a pronounced risk of bias propagation and a lack of real-world grounding for medical applications of LLMs. Furthermore, we observe that various alignment methods minimally resolve inconsistencies in the models' representation of disease prevalence across different languages. For further exploration and analysis, we make all data and a data visualization tool available at: www.crosscare.net.

Submitted 24 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: Submitted for review, data visualization tool available at: www.crosscare.net

arXiv:2405.04437 [pdf, other]

vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention

Authors: Ramya Prabhu, Ajay Nayak, Jayashree Mohan, Ramachandran Ramjee, Ashish Panwar

Abstract: Efficient management of GPU memory is essential for high throughput LLM inference. Prior systems reserved KV-cache memory ahead-of-time, which resulted in wasted capacity due to internal fragmentation. Inspired by demand paging, vLLM proposed PagedAttention to enable dynamic memory allocation for KV-cache. This approach eliminates fragmentation and improves serving throughput. However, to be able to allocate physical memory dynamically, PagedAttention changes the layout of KV-cache from contiguous virtual memory to non-contiguous virtual memory. As a consequence, one needs to rewrite the attention kernels to support paging, and implement a memory manager in the serving framework. This results in both performance and programming overheads, as well as portability challenges in adopting state-of-the-art attention kernels. In this paper, we propose vAttention, a new approach for dynamic KV-cache memory management. In contrast to PagedAttention, vAttention stores KV-cache in contiguous virtual memory and leverages OS support for on-demand allocation of physical memory. vAttention thus enables one to use state-of-the-art attention kernels out-of-the-box by adding support for dynamic allocation of physical memory without having to re-write their code. We implement vAttention in the vLLM serving stack to show that it also helps improve decode throughput by up to 1.99x over vLLM, and the end-to-end serving throughput by up to 1.22x and 1.29x, compared to using the state-of-the-art PagedAttention based kernels of FlashAttention and FlashInfer.

Submitted 12 July, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: 14 pages, 13 figures, 10 tables

arXiv:2405.04305 [pdf, other]

A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields

Authors: Raiyan Rahman, Christopher Indris, Goetz Bramesfeld, Tianxiao Zhang, Kaidong Li, Xiangyu Chen, Ivan Grijalva, Brian McCornack, Daniel Flippo, Ajay Sharda, Guanghui Wang

Abstract: Aphid infestations are one of the primary causes of extensive damage to wheat and sorghum fields and are one of the most common vectors for plant viruses, resulting in significant agricultural yield losses. To address this problem, farmers often employ the inefficient use of harmful chemical pesticides that have negative health and environmental impacts. As a result, a large amount of pesticide is wasted on areas without significant pest infestation. This brings to attention the urgent need for an intelligent autonomous system that can locate and spray sufficiently large infestations selectively within the complex crop canopies. We have developed a large multi-scale dataset for aphid cluster detection and segmentation, collected from actual sorghum fields and meticulously annotated to include clusters of aphids. Our dataset comprises a total of 54,742 image patches, showcasing a variety of viewpoints, diverse lighting conditions, and multiple scales, highlighting its effectiveness for real-world applications. In this study, we trained and evaluated four real-time semantic segmentation models and three object detection models specifically for aphid cluster segmentation and detection. Considering the balance between accuracy and efficiency, Fast-SCNN delivered the most effective segmentation results, achieving 80.46% mean precision, 81.21% mean recall, and 91.66 frames per second (FPS). For object detection, RT-DETR exhibited the best overall performance with a 61.63% mean average precision (mAP), 92.6% mean recall, and 72.55 on an NVIDIA V100 GPU. Our experiments further indicate that aphid cluster segmentation is more suitable for assessing aphid infestations than using detection models.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.02198 [pdf, other]

The Cambridge RoboMaster: An Agile Multi-Robot Research Platform

Authors: Jan Blumenkamp, Ajay Shankar, Matteo Bettini, Joshua Bird, Amanda Prorok

Abstract: Compact robotic platforms with powerful compute and actuation capabilities are key enablers for practical, real-world deployments of multi-agent research. This article introduces a tightly integrated hardware, control, and simulation software stack on a fleet of holonomic ground robot platforms designed with this motivation. Our robots, a fleet of customised DJI Robomaster S1 vehicles, offer a balance between small robots that do not possess sufficient compute or actuation capabilities and larger robots that are unsuitable for indoor multi-robot tests. They run a modular ROS2-based optimal estimation and control stack for full onboard autonomy, contain ad-hoc peer-to-peer communication infrastructure, and can zero-shot run multi-agent reinforcement learning (MARL) policies trained in our vectorized multi-agent simulation framework. We present an in-depth review of other platforms currently available, showcase new experimental validation of our system's capabilities, and introduce case studies that highlight the versatility and reliability of our system as a testbed for a wide range of research demonstrations. Our system as well as supplementary material is available online: https://proroklab.github.io/cambridge-robomaster

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2405.01472 [pdf, other]

IntervenGen: Interventional Data Generation for Robust and Data-Efficient Robot Imitation Learning

Authors: Ryan Hoque, Ajay Mandlekar, Caelan Garrett, Ken Goldberg, Dieter Fox

Abstract: Imitation learning is a promising paradigm for training robot control policies, but these policies can suffer from distribution shift, where the conditions at evaluation time differ from those in the training data. A popular approach for increasing policy robustness to distribution shift is interactive imitation learning (i.e., DAgger and variants), where a human operator provides corrective interventions during policy rollouts. However, collecting a sufficient amount of interventions to cover the distribution of policy mistakes can be burdensome for human operators. We propose IntervenGen (I-Gen), a novel data generation system that can autonomously produce a large set of corrective interventions with rich coverage of the state space from a small number of human interventions. We apply I-Gen to 4 simulated environments and 1 physical environment with object pose estimation error and show that it can increase policy robustness by up to 39x with only 10 human interventions. Videos and more results are available at https://sites.google.com/view/intervengen2024.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2405.00140 [pdf, other]

Hydrodynamical simulations of merging galaxy clusters: giant dark matter particle colliders, powered by gravity

Authors: Ellen L. Sirks, David Harvey, Richard Massey, Kyle A. Oman, Andrew Robertson, Carlos Frenk, Spencer Everett, Ajay S. Gill, David Lagattuta, Jacqueline McCleary

Abstract: Terrestrial particle accelerators collide charged particles, then watch the trajectory of outgoing debris - but they cannot manipulate dark matter. Fortunately, dark matter is the main component of galaxy clusters, which are continuously pulled together by gravity. We show that galaxy cluster mergers can be exploited as enormous, natural dark matter colliders. We analyse hydrodynamical simulations of a universe containing self-interacting dark matter (SIDM) in which all particles interact via gravity, and dark matter particles can also scatter off each other via a massive mediator. During cluster collisions, SIDM spreads out and lags behind cluster member galaxies. Individual systems can have quirky dynamics that makes them difficult to interpret. Statistically, however, we find that the mean or median of dark matter's spatial offset in many collisions can be robustly modelled, and is independent of our viewing angle and halo mass even in collisions between unequal-mass systems. If the SIDM cross-section were sigma/m = 0.1cm^2/g = 0.18 barn/GeV, the 'bulleticity' lag would be ~5 percent that of gas due to ram pressure, and could be detected at 95 percent confidence in weak lensing observations of ~100 well-chosen clusters.

Submitted 30 April, 2024; originally announced May 2024.

Comments: 8 pages, 5 figures plus appendices. MNRAS in press

arXiv:2404.18910 [pdf, ps, other]

Gyrokinetic investigation of toroidal Alfven eigenmode (TAE) turbulence

Authors: Ajay C. J., Ben McMillan, Arkaprava Bokshi, Alessandro di Siena, M. J. Pueschel, Juan Ruiz Ruiz

Abstract: Toroidal Alfvén eigenmodes (TAEs) can transport fusion-born energetic particles out of the plasma volume, thereby decreasing plasma self-heating efficiency and possibly damaging reactor walls. Therefore, understanding TAE destabilisation and identifying saturation mechanisms is crucial to achieving burning plasma. While TAEs have been studied extensively in the past using kinetic-MHD codes, here a fully gyrokinetic study is employed which allows for additional physics. In the case studied, the primary drive mechanism is identified as the resonance between the magnetic drifts and the TAE, and this is seen to be disrupted by equilibrium flow shear which can stabilize the mode by rotating it in the poloidal plane. It is found that zonal flows do not play a significant role in the saturation of these TAEs, and there are no saturation mechanisms present in the local gyrokinetic picture that are able to saturate the mode at physically relevant transport levels in the case of TAE-only turbulence. Instead, we confirm that the global profile flattening of fast-ion density is the key saturation mechanism. The nonlinear excitation of TAE travelling along the electron diamagnetic direction and its beating with the ion diamagnetic TAE, resulting in large amplitude oscillations that may help detect TAEs more easily in tokamaks, is also reported.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.18632 [pdf, ps, other]

Mass Spectra of $Ξ$ and $Ω$ Baryons using hypercentral Constituent Quark Model

Authors: Chandni Menapara, Ajay Kumar Rai

Abstract: Hadron spectroscopy is an important tool towards the study of internal quark dynamics in a composite system. The present article focuses on the study of resonance spectra of strange baryons with S=-2, -3. The non-relativistic approach utilizes screened potential as hypercentral one to obtain masses. The spin-dependent part for all possible hyperfine states has been incorporated.

Submitted 29 April, 2024; originally announced April 2024.

Comments: HADRON 2023, Nuovo Cimento II

arXiv:2404.15606 [pdf, other]

Multilevel Particle Filters for Partially Observed McKean-Vlasov Stochastic Differential Equations

Authors: Elsiddig Awadelkarim, Ajay Jasra

Abstract: In this paper we consider the filtering problem associated to partially observed McKean-Vlasov stochastic differential equations (SDEs). The model consists of data that are observed at regular and discrete times and the objective is to compute the conditional expectation of (functionals) of the solutions of the SDE at the current time. This problem, even in the ordinary SDE case, is challenging and requires numerical approximations. Based upon the ideas in [3, 12] we develop a new particle filter (PF) and multilevel particle filter (MLPF) to approximate the afore-mentioned expectations. We prove under assumptions that, for $ε>0$, to obtain a mean square error of $\mathcal{O}(ε^2)$ the PF has a cost per-observation time of $\mathcal{O}(ε^{-5})$ and the MLPF costs $\mathcal{O}(ε^{-4})$ (best case) or $\mathcal{O}(ε^{-4}\log(ε)^2)$ (worst case). Our theoretical results are supported by numerical experiments.

Submitted 25 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: 21 pages, 2 figures

arXiv:2404.15510 [pdf, other]

NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator

Authors: Kaustubh Shivdikar, Nicolas Bohm Agostini, Malith Jayaweera, Gilbert Jonatan, Jose L. Abellan, Ajay Joshi, John Kim, David Kaeli

Abstract: Graph Neural Networks (GNNs) are emerging as a formidable tool for processing non-Euclidean data across various domains, ranging from social network analysis to bioinformatics. Despite their effectiveness, their adoption has not been pervasive because of scalability challenges associated with large-scale graph datasets, particularly when leveraging message passing. To tackle these challenges, we introduce NeuraChip, a novel GNN spatial accelerator based on Gustavson's algorithm. NeuraChip decouples the multiplication and addition computations in sparse matrix multiplication. This separation allows for independent exploitation of their unique data dependencies, facilitating efficient resource allocation. We introduce a rolling eviction strategy to mitigate data idling in on-chip memory as well as address the prevalent issue of memory bloat in sparse graph computations. Furthermore, the compute resource load balancing is achieved through a dynamic reseeding hash-based mapping, ensuring uniform utilization of computing resources agnostic of sparsity patterns. Finally, we present NeuraSim, an open-source, cycle-accurate, multi-threaded, modular simulator for comprehensive performance analysis. Overall, NeuraChip presents a significant improvement, yielding an average speedup of 22.1x over Intel's MKL, 17.1x over NVIDIA's cuSPARSE, 16.7x over AMD's hipSPARSE, and 1.5x over prior state-of-the-art SpGEMM accelerator and 1.3x over GNN accelerator.
The source code for our open-sourced simulator and performance visualizer is publicly accessible on GitHub https://neurachip.us

Submitted 26 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: Visit https://neurachip.us for WebGUI based simulations

Showing 1–50 of 1,342 results for author: Ajay