-
The $\ell$-test: leveraging sparsity in the Gaussian linear model for improved inference
Authors:
Souhardya Sengupta,
Lucas Janson
Abstract:
We develop novel LASSO-based methods for coefficient testing and confidence interval construction in the Gaussian linear model with $n\ge d$. Our methods' finite-sample guarantees are identical to those of their ubiquitous ordinary-least-squares-$t$-test-based analogues, yet have substantially higher power when the true coefficient vector is sparse. In particular, our coefficient test, which we call the $\ell$-test, performs like the one-sided $t$-test (despite not being given any information about the sign) under sparsity, and the corresponding confidence intervals are more than 10% shorter than the standard $t$-test based intervals. The nature of the $\ell$-test directly provides a novel exact adjustment conditional on LASSO selection for post-selection inference, allowing for the construction of post-selection p-values and confidence intervals. None of our methods require resampling or Monte Carlo estimation. We perform a variety of simulations and a real data analysis on an HIV drug resistance data set to demonstrate the benefits of the $\ell$-test. We end with a discussion of how the $\ell$-test may asymptotically apply to a much more general class of parametric models.
Submitted 26 June, 2024;
originally announced June 2024.
-
Inverse melting and re-entrant transformations of the vortex lattice in amorphous Re6Zr thin film
Authors:
Rishabh Duhan,
Subhamita Sengupta,
John Jesudasan,
Somak Basistha,
Pratap Raychaudhuri
Abstract:
Melting of a solid is one of the most ubiquitous phenomena observed in nature. Most solids, when heated, melt from a crystalline state to an isotropic liquid at a characteristic temperature. There are, however, situations where an increase in temperature can induce a transition to a more ordered state. Broadly termed "inverse melting", experimental realisations of such situations are rare. Here, we report such a phenomenon in the 2-dimensional vortex liquid that forms in a moderately pinned amorphous Re6Zr (a-ReZr) thin film, from direct imaging of the vortex lattice using a scanning tunnelling microscope. At low temperatures and magnetic fields, we find that the vortices form a "pinned liquid" that is characterised by a low mobility of the vortices and a vortex density that is spatially inhomogeneous. As the temperature or magnetic field is increased, the vortices become more ordered, eventually forming a nearly perfectly ordered vortex lattice. Above this temperature/magnetic field, the ordered vortex lattice melts again into a vortex liquid. This re-entrant transformation from a liquid to a solid-like state and then back to a liquid also leaves a distinct signature in the magnetotransport properties of the superconductor.
Submitted 11 June, 2024;
originally announced June 2024.
-
Latent Directions: A Simple Pathway to Bias Mitigation in Generative AI
Authors:
Carolina Lopez Olmos,
Alexandros Neophytou,
Sunando Sengupta,
Dim P. Papadopoulos
Abstract:
Mitigating biases in generative AI, and particularly in text-to-image models, is of high importance given their growing implications in society. The biased datasets used for training pose challenges in ensuring the responsible development of these models, and mitigation through hard prompting or embedding alteration is the most common present solution. Our work introduces a novel approach to achieve diverse and inclusive synthetic images by learning a direction in the latent space and solely modifying the initial Gaussian noise provided for the diffusion process. Maintaining a neutral prompt and untouched embeddings, this approach successfully adapts to diverse debiasing scenarios, such as geographical biases. Moreover, our work proves it is possible to linearly combine these learned latent directions to introduce new mitigations and, if desired, integrate them with text embedding adjustments. Furthermore, text-to-image models lack transparency for assessing bias in outputs, unless visually inspected. Thus, we provide a tool to empower developers to select their desired concepts to mitigate. The project page with code is available online.
Submitted 10 June, 2024;
originally announced June 2024.
-
Oracle-Efficient Reinforcement Learning for Max Value Ensembles
Authors:
Marcel Hussing,
Michael Kearns,
Aaron Roth,
Sikata Bela Sengupta,
Jessica Sorrell
Abstract:
Reinforcement learning (RL) in large or infinite state spaces is notoriously challenging, both theoretically (where worst-case sample and computational complexities must scale with state space cardinality) and experimentally (where function approximation and policy gradient techniques often scale poorly and suffer from instability and high variance). One line of research attempting to address these difficulties makes the natural assumption that we are given a collection of heuristic base or $\textit{constituent}$ policies upon which we would like to improve in a scalable manner. In this work we aim to compete with the $\textit{max-following policy}$, which at each state follows the action of whichever constituent policy has the highest value. The max-following policy is always at least as good as the best constituent policy, and may be considerably better. Our main result is an efficient algorithm that learns to compete with the max-following policy, given only access to the constituent policies (but not their value functions). In contrast to prior work in similar settings, our theoretical results require only the minimal assumption of an ERM oracle for value function approximation for the constituent policies (and not the global optimal policy or the max-following policy itself) on samplable distributions. We illustrate our algorithm's experimental effectiveness and behavior on several robotic simulation testbeds.
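The max-following policy named in this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract stresses that the learner is given only the constituent policies, not their value functions, whereas the sketch below simply assumes value estimates are already available. The names `max_following_action`, `policies`, and `value_fns`, and the toy policies, are hypothetical.

```python
from typing import Callable, Hashable, Sequence

State = Hashable
Action = Hashable


def max_following_action(
    state: State,
    policies: Sequence[Callable[[State], Action]],
    value_fns: Sequence[Callable[[State], float]],
) -> Action:
    """Return the action of the constituent policy with the highest value at `state`.

    Assumption: value_fns[k](state) is an estimate of the value of policy k
    at `state`. How such estimates are obtained is outside this sketch.
    """
    if not policies or len(policies) != len(value_fns):
        raise ValueError("need one value function per constituent policy")
    # Index of the constituent policy whose estimated value is largest here.
    # Ties are broken in favour of the earliest policy in the list.
    best = max(range(len(policies)), key=lambda k: value_fns[k](state))
    return policies[best](state)


# Toy example on integer states with two hypothetical constituent policies.
def go_left(state: int) -> str:
    return "left"


def go_right(state: int) -> str:
    return "right"


def value_left(state: int) -> float:
    return -float(state)        # "left" looks better for small states


def value_right(state: int) -> float:
    return float(state) - 3.0   # "right" looks better for large states


if __name__ == "__main__":
    for s in (0, 5):
        action = max_following_action(s, [go_left, go_right], [value_left, value_right])
        print(s, action)
```

Because the choice is made state by state, the resulting policy can switch between constituents along a trajectory, which is why it can do better than any single constituent policy.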
Submitted 26 May, 2024;
originally announced May 2024.
-
Regularized-Renormalized-Resummed loop corrected power spectrum of non-singular bounce with Primordial Black Hole formation
Authors:
Sayantan Choudhury,
Ahaskar Karde,
Sudhakar Panda,
Soumitra SenGupta
Abstract:
We present a complete and consistent exposition of the regularization, renormalization, and resummation procedures in the setup of having a contraction and then non-singular bounce followed by inflation with a sharp transition from slow-roll (SR) to ultra-slow roll (USR) phase for generating primordial black holes (PBHs). We follow an effective field theory (EFT) approach and study the quantum loop corrections to the power spectrum from each phase. We demonstrate the complete removal of quadratic UV divergences after renormalization and softened logarithmic IR divergences after resummation and illustrate the scheme-independent nature of our renormalization approach. We further show that the addition of a contracting and bouncing phase allows us to successfully generate PBHs of solar-mass order, $M_{\rm PBH}\sim {\cal O}(M_{\odot})$, by achieving the minimum e-folds during inflation to be $\Delta N_{\rm Total}\sim {\cal O}(60)$ and in this process successfully evading the strict no-go theorem. We notice that varying the effective sound speed within $0.88\leq c_{s}\leq 1$ allows the peak spectrum amplitude to lie within $10^{-3}\leq A \leq 10^{-2}$, indicating that causality and unitarity remain protected in the theory. We analyse PBHs in the extremely small, $M_{\rm PBH}\sim {\cal O}(10^{-33}-10^{-27})M_{\odot}$, and the large, $M_{\rm PBH}\sim {\cal O}(10^{-6}-10^{-1})M_{\odot}$, mass limits and confront the PBH abundance results with the latest microlensing constraints. We also study the cosmological beta functions across all phases and find their interpretation consistent in the context of bouncing and inflationary scenarios while satisfying the pivot scale normalization requirement. Further, we estimate the spectral distortion effects and shed light on controlling PBH overproduction.
Submitted 4 June, 2024; v1 submitted 10 May, 2024;
originally announced May 2024.
-
Sheet model description of spatio-temporal evolution of upper-hybrid oscillations in an inhomogeneous magnetic field
Authors:
Nidhi Rathee,
Someswar Dutta,
R. Srinivasan,
Sudip Sengupta
Abstract:
Spatio-temporal evolution of large amplitude upper hybrid oscillations in a cold homogeneous plasma in the presence of an inhomogeneous magnetic field is studied analytically and numerically using the Dawson sheet model. It is observed that the inhomogeneity in the magnetic field, which causes the upper hybrid frequency to acquire a spatial dependence, results in phase mixing and subsequent breaking of the upper hybrid oscillations at arbitrarily low amplitudes. This result is in sharp contrast to the usual upper hybrid oscillations in a homogeneous magnetic field, where the oscillations break within a fraction of a period when the amplitude exceeds a certain critical value. Our perturbative calculations show that the phase mixing (wave breaking) time scales inversely with the amplitude of the magnetic field inhomogeneity ($\Delta$) and the amplitude of the imposed density perturbation ($\delta$), and scales directly with the ratio of the magnetic field inhomogeneity scale length to the imposed density perturbation scale length ($(\alpha/k_L)^{-1}$), as $\omega_{pe}\tau_{mix} \sim \left( 1+\beta^2 \right)^{3/2}k_L/(\beta^2\delta\Delta\alpha)$, where $\beta$ is the ratio of electron cyclotron frequency to electron plasma frequency. Further, the phase mixing time measured in simulations, performed using a 1-1/2 D code based on the Dawson sheet model, shows good agreement with the above-mentioned scaling. This result may be of relevance to plasma based particle acceleration experiments in the presence of a transverse inhomogeneous magnetic field.
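The scaling relation quoted in this abstract can be evaluated numerically to see how the phase mixing time responds to each parameter. The sketch below is only an illustration: the abstract gives a proportionality, so the O(1) prefactor is assumed to be 1, and the function name `phase_mixing_time` and its argument names are hypothetical.

```python
import math


def phase_mixing_time(beta: float, delta: float, Delta: float,
                      alpha: float, k_L: float) -> float:
    """Dimensionless phase mixing time omega_pe * tau_mix from the quoted scaling.

    Implements (1 + beta^2)^(3/2) * k_L / (beta^2 * delta * Delta * alpha).
    Assumption: the unknown O(1) prefactor of the scaling law is set to 1,
    so only ratios between two evaluations are meaningful.
    """
    for name, value in (("beta", beta), ("delta", delta), ("Delta", Delta),
                        ("alpha", alpha), ("k_L", k_L)):
        if value <= 0.0:
            raise ValueError(f"{name} must be positive, got {value}")
    return (1.0 + beta ** 2) ** 1.5 * k_L / (beta ** 2 * delta * Delta * alpha)


if __name__ == "__main__":
    base = phase_mixing_time(beta=1.0, delta=0.1, Delta=0.1, alpha=1.0, k_L=1.0)
    doubled = phase_mixing_time(beta=1.0, delta=0.2, Delta=0.1, alpha=1.0, k_L=1.0)
    # Doubling the density perturbation amplitude halves the mixing time.
    print(math.isclose(doubled / base, 0.5))
```

The same ratio test applies to $\Delta$ and $\alpha$, while doubling $k_L$ doubles the estimate, in line with the inverse and direct dependences stated in the abstract.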
Submitted 9 May, 2024;
originally announced May 2024.
-
Inflation via Moduli Potentials in a Nested Warped Geometry
Authors:
Arko Bhaumik,
Soumitra SenGupta
Abstract:
We analyze the effective four-dimensional dynamics of the extra-dimensional moduli fields in curved braneworlds having nested warping, with particular emphasis on the doubly warped model which is interesting in the light of current collider constraints on the mass of the Kaluza-Klein graviton. The presence of a non-zero brane cosmological constant ($\Omega$) naturally induces an effective moduli potential in the four-dimensional action, which shows distinct features in dS ($\Omega>0$) and AdS ($\Omega<0$) branches. For the observationally interesting case of dS 4-branes, a metastable minimum in the potential arises along the first modulus, with no minima along the higher moduli. The underlying nested geometry also leads to interesting separable forms of the non-canonical kinetic terms in the Einstein frame, where the brane curvature directly impacts the kinetic properties of only the first modulus. We subsequently explore the ability of curved multiply warped geometries to drive inflation with an in-built exit mechanism, by assuming predominant slow roll along each modular direction on a case-by-case basis. We find slow roll on top of the metastable plateau along the first modular direction to be the most viable scenario, with the higher-dimensional moduli parametrically tuning the height of the potential without significant impact on the inflationary observables. On the other hand, while slow roll along the higher moduli can successfully inflate the background and eventually lead to an exit, consistency with observations seemingly requires unphysical hierarchies among the extra-dimensional radii, thus disfavouring such scenarios.
Submitted 26 April, 2024;
originally announced April 2024.
-
BASS: Batched Attention-optimized Speculative Sampling
Authors:
Haifeng Qian,
Sujan Kumar Gonugondla,
Sungsoo Ha,
Mingyue Shang,
Sanjay Krishna Gouda,
Ramesh Nallapati,
Sudipta Sengupta,
Xiaofei Ma,
Anoop Deoras
Abstract:
Speculative decoding has emerged as a powerful method to improve latency and throughput in hosting large language models. However, most existing implementations focus on generating a single sequence. Real-world generative AI applications often require multiple responses and how to perform speculative decoding in a batched setting while preserving its latency benefits poses non-trivial challenges. This paper describes a system of batched speculative decoding that sets a new state of the art in multi-sequence generation latency and that demonstrates superior GPU utilization as well as quality of generations within a time budget. For example, for a 7.8B-size model on a single A100 GPU and with a batch size of 8, each sequence is generated at an average speed of 5.8ms per token, the overall throughput being 1.1K tokens per second. These results represent state-of-the-art latency and a 2.15X speed-up over optimized regular decoding. Within a time budget that regular decoding does not finish, our system is able to generate sequences with HumanEval Pass@First of 43% and Pass@All of 61%, far exceeding what's feasible with single-sequence speculative decoding. Our peak GPU utilization during decoding reaches as high as 15.8%, more than 3X the highest of that of regular decoding and around 10X of single-sequence speculative decoding.
Submitted 26 June, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
Asymptotically flat galactic rotation curves in gravity theory
Authors:
Sandipan Sengupta
Abstract:
We present a new set of four-geometries exhibiting asymptotically flat galactic rotation curves. These are found as explicit solutions to 5D vacuum Hilbert-Palatini theory, where the fifth dimension has vanishing proper length. In the emergent 4D dynamics, governed by the condition that the Ricci scalar must vanish (up to a cosmological constant), these correspond to anisotropic effective pressure. The enhancement in the deflection angle of a light ray penetrating the halo is obtained, which could provide a realistic testing ground for the model as a purely geometric alternative to 'dark matter'. For very large halo radii, the leading nonbaryonic contribution to the bending angle is predicted to be $3\pi v^2/2c^2$ ($v$ being the asymptotic rotational velocity), a constant that is different from the result for an isothermal CDM halo.
Submitted 23 April, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
Large bubble drives melting in circular DNA
Authors:
Souradeep Sengupta,
Somendra M. Bhattacharjee,
Garima Mishra
Abstract:
We investigate the melting transition of non-supercoiled circular DNA of different lengths, employing Brownian dynamics simulation. In the absence of supercoiling, we find that melting of circular DNA is driven by a large bubble, which agrees with the previous predictions of circular DNA melting in the presence of supercoiling. By analyzing sector-wise changes in average base-pair distance, our study reveals that the melting behavior of circular DNA closely resembles that of linear DNA. Additionally, we find a marked difference in the thermal stability of circular DNA over linear DNA at very short length scales, an effect that diminishes as the length of circular DNA increases. The stability of smaller circular DNA is linked to the occurrence of transient small bubbles, characterized by a lower probability of growth.
Submitted 12 April, 2024;
originally announced April 2024.
-
Second law of horizon thermodynamics during cosmic evolution
Authors:
Sergei D. Odintsov,
Tanmoy Paul,
Soumitra SenGupta
Abstract:
We examine the second law of thermodynamics in the context of horizon cosmology, in particular, whether the change of total entropy (i.e. the sum of the entropy for the apparent horizon and the entropy for the matter fields) proves to be positive with the cosmic expansion of the universe. The matter fields inside the horizon obey the thermodynamics of an open system, as the matter fields have a flux through the apparent horizon, which is either outward or inward depending on the background cosmological dynamics. Regarding the entropy of the apparent horizon, we consider different forms of the horizon entropy like the Tsallis entropy, the Rényi entropy, the Kaniadakis entropy, or even the 4-parameter generalized entropy, and determine the appropriate conditions on the respective entropic parameters coming from the second law of horizon thermodynamics. The constraints on the entropic parameters are found in such a way that they validate the second law of thermodynamics during a wide range of cosmic eras of the universe, particularly from inflation to the radiation dominated epoch followed by a reheating stage. Importantly, the present work provides a model independent way to constrain the entropic parameters directly from the second law of thermodynamics for the apparent horizon.
Submitted 8 April, 2024;
originally announced April 2024.
-
The State of Lithium-Ion Battery Health Prognostics in the CPS Era
Authors:
Gaurav Shinde,
Rohan Mohapatra,
Pooja Krishan,
Harish Garg,
Srikanth Prabhu,
Sanchari Das,
Mohammad Masum,
Saptarshi Sengupta
Abstract:
Lithium-ion (Li-ion) batteries have revolutionized energy storage technology, becoming integral to our daily lives by powering a diverse range of devices and applications. Their high energy density, fast power response, recyclability, and mobility advantages have made them the preferred choice for numerous sectors. This paper explores the seamless integration of Prognostics and Health Management (PHM) within batteries, presenting a multidisciplinary approach that enhances the reliability, safety, and performance of these powerhouses. Remaining useful life (RUL), a critical concept in prognostics, is examined in depth, emphasizing its role in predicting component failure before it occurs. The paper reviews various RUL prediction methods, from traditional models to cutting-edge data-driven techniques. Furthermore, it highlights the paradigm shift toward deep learning architectures within the field of Li-ion battery health prognostics, elucidating the pivotal role of deep learning in addressing battery system complexities. Practical applications of PHM across industries are also explored, offering readers insights into real-world implementations. This paper serves as a comprehensive guide, catering to both researchers and practitioners in the field of Li-ion battery PHM.
Submitted 28 March, 2024;
originally announced March 2024.
-
Fast and faithful interpolation of numerical relativity surrogate waveforms using meshfree approximation
Authors:
Lalit Pathak,
Amit Reza,
Anand S. Sengupta
Abstract:
Several theoretical waveform models have been developed over the years to capture the gravitational wave emission from the dynamical evolution of compact binary systems of neutron stars and black holes. As ground-based detectors improve their sensitivity at low frequencies, the real-time computation of these waveforms can become computationally expensive, exacerbating the steep cost of rapidly reconstructing source parameters using Bayesian methods. This paper describes an efficient numerical algorithm for generating high-fidelity interpolated compact binary waveforms at an arbitrary point in the signal manifold by leveraging computational linear algebra techniques such as singular value decomposition and meshfree approximation. The results are presented for the time-domain \texttt{NRHybSur3dq8} inspiral-merger-ringdown (IMR) waveform model that is fine-tuned to numerical relativity simulations and parameterized by the two component masses and two aligned spins. For demonstration, we target a specific region of the intrinsic parameter space inspired by the previously inferred parameters of the \texttt{GW200311\_115853} event -- a binary black hole system whose merger was recorded by the network of advanced-LIGO and Virgo detectors during the third observation run. We show that the meshfree interpolated waveforms can be evaluated in $\sim 2.3$ ms, which is about $38\times$ faster than the brute-force (frequency-domain tapered) implementation in the \textsc{PyCBC} software package, at a median accuracy of $\sim \mathcal{O}(10^{-5})$. The algorithm is computationally efficient and scales favourably with an increasing number of dimensions of the parameter space. This technique may find use in rapid parameter estimation and source reconstruction studies.
Submitted 28 March, 2024;
originally announced March 2024.
-
Bifurcated Attention: Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs
Authors:
Ben Athiwaratkun,
Sujan Kumar Gonugondla,
Sanjay Krishna Gouda,
Haifeng Qian,
Hantian Ding,
Qing Sun,
Jun Wang,
Jiacheng Guo,
Liangfu Chen,
Parminder Bhatia,
Ramesh Nallapati,
Sudipta Sengupta,
Bing Xiang
Abstract:
This study introduces bifurcated attention, a method designed to enhance language model inference in shared-context batch decoding scenarios. Our approach addresses the challenge of redundant memory IO costs, a critical factor contributing to latency in high batch sizes and extended context lengths. Bifurcated attention achieves this by strategically dividing the attention mechanism during incremental decoding into two separate GEMM operations: one focusing on the KV cache from prefill, and another on the decoding process itself. While maintaining the computational load (FLOPs) of standard attention mechanisms, bifurcated attention ensures precise computation with significantly reduced memory IO. Our empirical results show over 2.1$\times$ speedup when sampling 16 output sequences and more than 6.2$\times$ speedup when sampling 32 sequences at context lengths exceeding 8k tokens on a 7B model that uses multi-head attention. The efficiency gains from bifurcated attention translate into lower latency, making it particularly suitable for real-time applications. For instance, it enables massively parallel answer generation without substantially increasing latency, thus enhancing performance when integrated with post-processing techniques such as re-ranking.
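The split described in this abstract, one product against the shared prefill cache and one against each sample's own decoded tokens, can be checked for exactness with a small single-head sketch. This is not the authors' implementation: it uses plain Python lists instead of batched GEMMs, so it shows only that the two-part computation matches standard attention, not the memory IO savings. All function and variable names (`bifurcated_attention`, `k_ctx`, `k_dec`, and so on) are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights: List[float], vectors: List[Vector]) -> Vector:
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def bifurcated_attention(queries, k_ctx, v_ctx, k_dec, v_dec):
    """queries[b] is the query of sample b. k_ctx/v_ctx hold the shared prefix
    (stored once, assumed non-empty). k_dec[b]/v_dec[b] hold the keys/values
    of the tokens decoded so far by sample b."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for q, kd, vd in zip(queries, k_dec, v_dec):
        # Part 1: logits against the single shared prefix cache.
        ctx_logits = [scale * dot(q, k) for k in k_ctx]
        # Part 2: logits against this sample's own decoded tokens.
        dec_logits = [scale * dot(q, k) for k in kd]
        # One softmax over the concatenated logits keeps the result exact.
        weights = softmax(ctx_logits + dec_logits)
        w_ctx, w_dec = weights[:len(k_ctx)], weights[len(k_ctx):]
        out = weighted_sum(w_ctx, v_ctx)
        if vd:
            out = [a + b for a, b in zip(out, weighted_sum(w_dec, vd))]
        outputs.append(out)
    return outputs


def standard_attention(queries, k_ctx, v_ctx, k_dec, v_dec):
    """Reference: every sample attends over its own full copy of the cache."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for q, kd, vd in zip(queries, k_dec, v_dec):
        keys = [list(k) for k in k_ctx] + kd      # per-sample copy of the prefix
        values = [list(v) for v in v_ctx] + vd
        weights = softmax([scale * dot(q, k) for k in keys])
        outputs.append(weighted_sum(weights, values))
    return outputs


# Two samples sharing a three-token prefix, with one and two decoded tokens.
queries = [[1.0, 0.0], [0.0, 1.0]]
k_ctx = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v_ctx = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
k_dec = [[[0.5, 0.5]], [[1.0, -1.0], [0.2, 0.3]]]
v_dec = [[[7.0, 8.0]], [[9.0, 10.0], [11.0, 12.0]]]

if __name__ == "__main__":
    split = bifurcated_attention(queries, k_ctx, v_ctx, k_dec, v_dec)
    full = standard_attention(queries, k_ctx, v_ctx, k_dec, v_dec)
    print(all(math.isclose(a, b, abs_tol=1e-9)
              for row_s, row_f in zip(split, full) for a, b in zip(row_s, row_f)))
```

The reference version copies the prefix keys and values for every sample, which is the redundant memory traffic the abstract refers to; the split version reads the shared prefix once and leaves the softmax normalization unchanged.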
Submitted 11 July, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
FLAP: Flow-Adhering Planning with Constrained Decoding in LLMs
Authors:
Shamik Roy,
Sailik Sengupta,
Daniele Bonadiman,
Saab Mansour,
Arshit Gupta
Abstract:
Planning is a crucial task for agents in task-oriented dialogs (TODs). Human agents typically resolve user issues by following predefined workflows, decomposing workflow steps into actionable items, and performing actions by executing APIs in order, all of which require reasoning and planning. With the recent advances in LLMs, there have been increasing attempts to use them for task planning and API usage. However, the faithfulness of the plans to predefined workflows and API dependencies is not guaranteed with LLMs. Moreover, workflows in real life are often custom-defined and prone to changes; hence, adaptation is desirable. To study this, we propose the problem of faithful planning in TODs that needs to resolve user intents by following predefined flows and preserving API dependencies. To solve this problem, we propose FLAP, a Flow-Adhering Planning algorithm based on constrained decoding with a lookahead heuristic for LLMs. Our algorithm alleviates the need for finetuning LLMs using domain specific (plan/dependency) data, enables quick adaptation to predefined flows, and outperforms other decoding and prompting-based baselines. Further, our algorithm empowers smaller LLMs (7B) to perform on par with larger LLMs (30B-40B).
Submitted 4 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Can Your Model Tell a Negation from an Implicature? Unravelling Challenges With Intent Encoders
Authors:
Yuwei Zhang,
Siffi Singh,
Sailik Sengupta,
Igor Shalyminov,
Hang Su,
Hwanjun Song,
Saab Mansour
Abstract:
Conversational systems often rely on embedding models for intent classification and intent clustering tasks. The advent of Large Language Models (LLMs), which enable instructional embeddings allowing one to adjust semantics over the embedding space using prompts, is being viewed as a panacea for these downstream conversational tasks. However, traditional evaluation benchmarks rely solely on task metrics that don't particularly measure gaps related to semantic understanding. Thus, we propose an intent semantic toolkit that gives a more holistic view of intent embedding models by considering three tasks -- (1) intent classification, (2) intent clustering, and (3) a novel triplet task. The triplet task gauges the model's understanding of two semantic concepts paramount in real-world conversational systems -- negation and implicature. We observe that current embedding models fare poorly in semantic understanding of these concepts. To address this, we propose a pre-training approach to improve the embedding model by leveraging augmentation with data generated by an auto-regressive model and a contrastive loss term. Our approach improves the semantic understanding of the intent embedding model on the aforementioned linguistic dimensions while slightly affecting its performance on downstream task metrics.
Submitted 7 March, 2024;
originally announced March 2024.
-
Ultralight vector dark matter search using data from the KAGRA O3GK run
Authors:
The LIGO Scientific Collaboration,
the Virgo Collaboration,
the KAGRA Collaboration,
A. G. Abac,
R. Abbott,
H. Abe,
I. Abouelfettouh,
F. Acernese,
K. Ackley,
C. Adamcewicz,
S. Adhicary,
N. Adhikari,
R. X. Adhikari,
V. K. Adkins,
V. B. Adya,
C. Affeldt,
D. Agarwal,
M. Agathos,
O. D. Aguiar,
I. Aguilar,
L. Aiello,
A. Ain,
P. Ajith,
T. Akutsu,
S. Albanesi, et al. (1778 additional authors not shown)
Abstract:
Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.
Submitted 5 March, 2024;
originally announced March 2024.
-
Absence of torsion : Clue from Starobinsky model of f(R) gravity
Authors:
Sonej Alam,
Somasri Sen,
Soumitra Sengupta
Abstract:
One of the surprising aspects of the present Universe is the absence of any noticeable observable effects of higher-rank antisymmetric tensor fields, such as space-time torsion, in any natural phenomena. Here we address a possible explanation for this absence of torsion, which may often be identified with the field strength tensor of the second-rank antisymmetric Kalb-Ramond field. Within the framework of $f(R)$ gravity, we explore the cosmological evolution of the scalar degrees of freedom associated with the higher curvature term in a general higher curvature model $f(R) = R + \alpha_n R^n$. We show that while the different cosmological parameters take acceptable values within standard cosmology at different epochs for different forms of higher curvature gravity (i.e. different values of $n$), only for the Starobinsky model ($n = 2$) does the Kalb-Ramond field get naturally suppressed with cosmological evolution. In contrast, for other models ($n$ both positive and negative), despite their agreement with standard cosmology, the scalar field associated with the higher derivative degree of freedom induces an enhancement in the Kalb-Ramond field and thereby contradicts observation. The result does not change even if we include the cosmological constant. Thus our result reveals that among different $f(R)$ models, the Starobinsky model successfully explains the suppression of space-time torsion along with a consistent cosmological evolution.
Submitted 23 March, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
Automated Unit Test Improvement using Large Language Models at Meta
Authors:
Nadia Alshahwan,
Jubin Chheda,
Anastasia Finegenova,
Beliz Gokkaya,
Mark Harman,
Inna Harper,
Alexandru Marginean,
Shubho Sengupta,
Eddy Wang
Abstract:
This paper describes Meta's TestGen-LLM tool, which uses LLMs to automatically improve existing human-written tests. TestGen-LLM verifies that its generated test classes successfully clear a set of filters that assure measurable improvement over the original test suite, thereby eliminating problems due to LLM hallucination. We describe the deployment of TestGen-LLM at Meta test-a-thons for the Instagram and Facebook platforms. In an evaluation on Reels and Stories products for Instagram, 75% of TestGen-LLM's test cases built correctly, 57% passed reliably, and 25% increased coverage. During Meta's Instagram and Facebook test-a-thons, it improved 11.5% of all classes to which it was applied, with 73% of its recommendations being accepted for production deployment by Meta software engineers. We believe this is the first report on industrial scale deployment of LLM-generated code backed by such assurances of code improvement.
Submitted 14 February, 2024;
originally announced February 2024.
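The filter idea in the abstract above (a generated test must build, pass reliably, and increase coverage before it is recommended) can be sketched as a chain of predicates. This is an illustration under stated assumptions, not Meta's implementation: the `Candidate` fields, the filter functions, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """A hypothetical LLM-generated test case and its measured outcomes."""
    name: str
    builds: bool
    pass_runs: List[bool]   # outcome of each repeated execution
    coverage_gain: float    # coverage delta versus the original suite


def builds(c: Candidate) -> bool:
    return c.builds


def passes_reliably(c: Candidate) -> bool:
    # Assumption: "reliably" means every repeated run passed.
    return len(c.pass_runs) > 0 and all(c.pass_runs)


def increases_coverage(c: Candidate) -> bool:
    return c.coverage_gain > 0


FILTERS: List[Callable[[Candidate], bool]] = [
    builds,
    passes_reliably,
    increases_coverage,
]


def surviving(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that clear every filter, in order."""
    return [c for c in candidates if all(f(c) for f in FILTERS)]


candidates = [
    Candidate("t1", builds=False, pass_runs=[], coverage_gain=0.0),
    Candidate("t2", builds=True, pass_runs=[True, False, True], coverage_gain=1.5),
    Candidate("t3", builds=True, pass_runs=[True, True, True], coverage_gain=0.8),
    Candidate("t4", builds=True, pass_runs=[True, True, True], coverage_gain=0.0),
]
accepted = [c.name for c in surviving(candidates)]
```

Because `all` short-circuits, a candidate that fails to build is never checked against the later filters, which mirrors the successively smaller percentages reported for each stage.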
-
Locality Sensitive Hashing for Network Traffic Fingerprinting
Authors:
Nowfel Mashnoor,
Jay Thom,
Abdur Rouf,
Shamik Sengupta,
Batyr Charyyev
Abstract:
The advent of the Internet of Things (IoT) has brought forth additional intricacies and difficulties to computer networks. These gadgets are particularly susceptible to cyber-attacks because of their simplistic design. Therefore, it is crucial to recognise these devices inside a network for the purpose of network administration and to identify any harmful actions. Network traffic fingerprinting is a crucial technique for identifying devices and detecting anomalies. Currently, the predominant methods for this depend heavily on machine learning (ML). Nevertheless, ML methods need the selection of features, adjustment of hyperparameters, and retraining of models to attain optimal outcomes and provide resilience to concept drifts detected in a network. In this research, we suggest using locality-sensitive hashing (LSH) for network traffic fingerprinting as a solution to these difficulties. Our study focuses on examining several design options for the Nilsimsa LSH function. We then use this function to create unique fingerprints for network data, which may be used to identify devices. We also compared it with ML-based traffic fingerprinting and observed that our method increases the accuracy of the state of the art by 12%, achieving around 94% accuracy in identifying devices in a network.
Submitted 12 February, 2024;
originally announced February 2024.
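To illustrate how LSH fingerprints of the kind mentioned above might be compared, here is a small sketch. It assumes 256-bit digests written as 64 hexadecimal characters and the usual Nilsimsa-style similarity score of 128 minus the number of differing bits (so identical digests score 128). It does not compute a Nilsimsa digest, and the digest strings below are made-up placeholders, not real traffic fingerprints or anything from the paper.

```python
def nilsimsa_style_score(digest_a: str, digest_b: str) -> int:
    """Similarity of two 256-bit hex digests: 128 minus the Hamming distance.

    Ranges from -128 (every bit differs) to 128 (identical digests).
    """
    if len(digest_a) != 64 or len(digest_b) != 64:
        raise ValueError("expected 64 hex characters (256 bits) per digest")
    differing_bits = bin(int(digest_a, 16) ^ int(digest_b, 16)).count("1")
    return 128 - differing_bits


# Placeholder digests (hypothetical values for illustration only).
known_device = "0" * 64
same_device = "0" * 63 + "1"   # differs from known_device in a single bit
other_device = "f" * 64        # differs from known_device in every bit

score_same = nilsimsa_style_score(known_device, same_device)
score_other = nilsimsa_style_score(known_device, other_device)
```

A device could then be identified by picking the stored fingerprint with the highest score, which avoids the feature selection and retraining steps that the abstract attributes to ML-based approaches.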
-
DeAL: Decoding-time Alignment for Large Language Models
Authors:
James Y. Huang,
Sailik Sengupta,
Daniele Bonadiman,
Yi-an Lai,
Arshit Gupta,
Nikolaos Pappas,
Saab Mansour,
Katrin Kirchhoff,
Dan Roth
Abstract:
Large Language Models (LLMs) are nowadays expected to generate content aligned with human preferences. Current work focuses on alignment at model training time, through techniques such as Reinforcement Learning with Human Feedback (RLHF). However, it is unclear if such methods are an effective choice to teach alignment objectives to the model. First, the inability to incorporate multiple, custom rewards and reliance on a model developer's view of universal and static principles are key limitations. Second, the residual gaps in model training and the reliability of such approaches are also questionable (e.g. susceptibility to jail-breaking even after safety training). To address these, we propose DeAL, a framework that allows the user to customize reward functions and enables Decoding-time Alignment of LLMs (DeAL). At its core, we view decoding as a heuristic-guided search process and facilitate the use of a wide variety of alignment objectives. Our experiments with programmatic constraints such as keyword and length constraints (studied widely in the pre-LLM era) and abstract objectives such as harmlessness and helpfulness (proposed in the post-LLM era) show that we can DeAL with fine-grained trade-offs, improve adherence to alignment objectives, and address residual gaps in LLMs. Lastly, while DeAL can be effectively paired with RLHF and prompting techniques, its generality makes decoding slower, an optimization we leave for future work.
Submitted 20 February, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Assured LLM-Based Software Engineering
Authors:
Nadia Alshahwan,
Mark Harman,
Inna Harper,
Alexandru Marginean,
Shubho Sengupta,
Eddy Wang
Abstract:
In this paper we address the following question: How can we use Large Language Models (LLMs) to improve code independently of a human, while ensuring that the improved code
- does not regress the properties of the original code?
- improves the original in a verifiable and measurable way?
To address this question, we advocate Assured LLM-Based Software Engineering, a generate-and-test approach inspired by Genetic Improvement. Assured LLMSE applies a series of semantic filters that discard code that fails to meet these twin guarantees. This overcomes the potential problem of LLMs' propensity to hallucinate. It allows us to generate code using LLMs, independently of any human. The human plays the role only of final code reviewer, as they would do with code generated by other human engineers.
This paper is an outline of the content of the keynote by Mark Harman at the International Workshop on Interpretability, Robustness, and Benchmarking in Neural Software Engineering, Monday 15th April 2024, Lisbon, Portugal.
Submitted 6 February, 2024;
originally announced February 2024.
-
Leveraging External Knowledge Resources to Enable Domain-Specific Comprehension
Authors:
Saptarshi Sengupta,
Connor Heaton,
Prasenjit Mitra,
Soumalya Sarkar
Abstract:
Machine Reading Comprehension (MRC) has been a long-standing problem in NLP and, with the recent introduction of the BERT family of transformer-based language models, it has come a long way towards being solved. Unfortunately, however, when BERT variants trained on general text corpora are applied to domain-specific text, their performance inevitably degrades on account of the domain shift, i.e. the genre/subject matter discrepancy between the training and downstream application data. Knowledge graphs act as reservoirs for either open or closed domain information, and prior studies have shown that they can be used to improve the performance of general-purpose transformers in domain-specific applications. Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from knowledge graphs with the embedding spaces of pre-trained language models (LMs). We fuse the aligned embeddings with the open-domain LMs BERT and RoBERTa, and fine-tune them for two MRC tasks, namely span detection (COVID-QA) and multiple-choice questions (PubMedQA). On the COVID-QA dataset, we see that our approach allows these models to perform similarly to their domain-specific counterparts, Bio/Sci-BERT, as evidenced by the Exact Match (EM) metric. With regard to PubMedQA, we observe an overall improvement in accuracy while the F1 stays relatively the same over the domain-specific models.
Submitted 15 January, 2024;
originally announced January 2024.
-
Milestones in Bengali Sentiment Analysis leveraging Transformer-models: Fundamentals, Challenges and Future Directions
Authors:
Saptarshi Sengupta,
Shreya Ghosh,
Prasenjit Mitra,
Tarikul Islam Tamiti
Abstract:
Sentiment Analysis (SA) refers to the task of associating a view polarity (usually, positive, negative, or neutral; or even fine-grained such as slightly angry, sad, etc.) to a given text, essentially breaking it down to a supervised (since we have the view labels a priori) classification task. Although SA has been heavily studied in resource-rich languages such as English, pushing the SOTA by leaps and bounds owing to the arrival of the Transformer architecture, the same cannot be said for resource-poor languages such as Bengali (BN). For a language spoken by roughly 300 million people, the technology enabling them to run trials on their favored tongue is severely lacking. In this paper, we analyze the SOTA for SA in Bengali, particularly Transformer-based models. We discuss available datasets, their drawbacks, and the nuances associated with Bengali, i.e. what makes this a challenging language to apply SA on, and finally provide insights for future directions to mitigate the limitations in the field.
Submitted 15 January, 2024;
originally announced January 2024.
-
f(R) gravity with spacetime torsion
Authors:
Hitender Kumar,
Tanmoy Paul,
Soumitra SenGupta
Abstract:
The duality between a higher curvature $f(R)$ gravity model and a scalar-tensor theory helps to bring out the role of the additional degree of freedom originating from the higher derivative terms in the gravity action. Such a degree of freedom, which appears as a scalar field, has been shown to have multiple implications in cosmological and astrophysical scenarios. The present work proposes a novel generalization of this correspondence between $f(R)$ gravity and a dual scalar-tensor theory when the affine connection is considered to have an antisymmetric part. It turns out that the $f(R)$ action in the presence of spacetime torsion can be recast as a non-minimally coupled scalar-tensor theory with a rank-2 massless antisymmetric tensor field in the Einstein frame, where the scalar field gets coupled with the antisymmetric field through derivative coupling(s).
Submitted 20 June, 2024; v1 submitted 3 January, 2024;
originally announced January 2024.
-
Nilpotent polynomials over $\mathbb{Z}$
Authors:
Sayak Sengupta
Abstract:
For a polynomial $u(x)$ in $\mathbb{Z}[x]$ and $r\in\mathbb{Z}$, we consider the orbit of $u$ at $r$ denoted and defined by $\mathcal{O}_u(r):=\{u^{(n)}(r)~|~n\in\mathbb{N}\}$. Here we study polynomials for which $0$ is in the orbit, and we call such polynomials \textit{nilpotent at }$r$ of index $m$ where $m$ is the minimum element of the set $\{n\in\mathbb N~|~u^{(n)}(r)=0\}$. We provide here a complete classification of these polynomials when $|r|\le 4$, with $|r|\le 1$ already covered in the author's previous paper, titled \textit{Locally nilpotent polynomials over $\mathbb Z$}. The central goal of this paper is to study the following questions: (i) relation between the integers $r$ and $m$ when the set of nilpotent polynomials at $r$ of index $m$ is non-empty, (ii) classification of the integer polynomials with nilpotency index $|r|$ for large enough $|r|$, and (iii) bounded integer polynomial sequences $\{r_n\}_{n\ge 0}$.
Submitted 23 June, 2024; v1 submitted 2 January, 2024;
originally announced January 2024.
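The orbit and nilpotency index defined in the abstract above can be explored numerically with a short sketch. It assumes that $\mathbb{N}$ starts at 1, so the index is the smallest $n \ge 1$ with $u^{(n)}(r) = 0$. The function name, the iteration cap, and the example polynomials are illustrative choices, not taken from the paper.

```python
from typing import Callable, Optional


def nilpotency_index(
    u: Callable[[int], int], r: int, max_iter: int = 1000
) -> Optional[int]:
    """Smallest n >= 1 with u^(n)(r) == 0, or None if none is found.

    Returns None early when the orbit revisits a value (it is then periodic
    and can never reach 0), or after max_iter steps for diverging orbits.
    """
    seen = set()
    value = r
    for n in range(1, max_iter + 1):
        value = u(value)
        if value == 0:
            return n
        if value in seen:
            return None
        seen.add(value)
    return None


def shift_down(x: int) -> int:
    return x - 1          # u(x) = x - 1


def square_minus_four(x: int) -> int:
    return x * x - 4      # u(x) = x^2 - 4


index_shift = nilpotency_index(shift_down, 2)          # orbit: 1, 0
index_square = nilpotency_index(square_minus_four, 2)  # orbit: 0
index_negate = nilpotency_index(lambda x: -x, 1)       # orbit: -1, 1, -1, ...
```

For example, $u(x) = x - 1$ is nilpotent at $r$ of index $r$ for every positive integer $r$, since each application lowers the value by one.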
-
Forecasting CPI inflation under economic policy and geopolitical uncertainties
Authors:
Shovon Sengupta,
Tanujit Chakraborty,
Sunny Kumar Singh
Abstract:
Forecasting consumer price index (CPI) inflation is of paramount importance for both academics and policymakers at the central banks. This study introduces a filtered ensemble wavelet neural network (FEWNet) to forecast CPI inflation, which is tested on BRIC countries. FEWNet breaks down inflation data into high and low-frequency components using wavelets and utilizes them along with other economic factors (economic policy uncertainty and geopolitical risk) to produce forecasts. All the wavelet-transformed series and filtered exogenous variables are fed into downstream autoregressive neural networks to make the final ensemble forecast. Theoretically, we show that FEWNet reduces the empirical risk compared to fully connected autoregressive neural networks. FEWNet is more accurate than other forecasting methods and can also estimate the uncertainty in its predictions due to its capacity to effectively capture non-linearities and long-range dependencies in the data through its adaptable architecture. This makes FEWNet a valuable tool for central banks to manage inflation.
Submitted 2 July, 2024; v1 submitted 30 December, 2023;
originally announced January 2024.
-
Depth-dependent warming of the Gulf of Eilat (Aqaba)
Authors:
Sounav Sengupta,
Hezi Gildor,
Yosef Ashkenazy
Abstract:
The Gulf of Eilat (Gulf of Aqaba) is a semi-enclosed basin situated at the northern end of the Red Sea, renowned for its exceptional marine ecosystem. To evaluate the response of the Gulf to climate variations, we analyzed various factors including temperature down to 700 m, surface air temperature, and heat fluxes. We find that the sea temperature is rising at all depths despite inconclusive trends in local atmospheric variables, including the surface air temperature. The Gulf's sea surface temperature warms at a rate of a few hundredths of a degree Celsius per year, which is comparable to the warming of the global sea surface temperature and the Mediterranean Sea. The increase in sea warming is linked to fewer winter deep mixing events that used to occur more frequently in the past. Based on the analysis of the ocean-atmosphere heat fluxes, we conclude that the lateral advection of heat from the southern part of the Gulf likely leads to an increase in water temperature in the northern part of the Gulf. Our findings suggest that local ocean warming is not necessarily associated with local processes, but rather with the warming of other remote locations.
Submitted 13 December, 2023;
originally announced December 2023.
-
Charged particle dynamics in an elliptically polarized electromagnetic wave and a uniform axial magnetic field
Authors:
Shivam Kumar Mishra,
Sarveshwar Sharma,
Sudip Sengupta
Abstract:
An analytical study of the charged particle dynamics in the presence of an elliptically polarized electromagnetic wave and a uniform axial magnetic field is presented. It is found that for $g\omega_{0}/\omega' = \pm 1$, maximum energy gain occurs respectively for linear and circular polarization; $\omega_{0}$ and $\omega'$ respectively being the cyclotron frequency of the charged particle in the external magnetic field and the Doppler-shifted frequency of the wave seen by the particle, and $g = \pm 1$ respectively correspond to left- and right-handedness of the polarization. An explicit solution of the governing equation is presented in terms of particle position or laboratory time, for the specific case of resonant energy gain in a circularly polarized electromagnetic wave. These explicit position- or time-dependent expressions are useful for better insight into various phenomena, viz., cosmic ray generation, microwave generation, plasma heating, and particle acceleration.
Submitted 11 December, 2023;
originally announced December 2023.
-
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers and BERT
Authors:
Saurav Sengupta,
Donald E. Brown
Abstract:
Deep learning for histopathology has been successfully used for disease classification, image segmentation and more. However, combining image and text modalities using current state-of-the-art (SOTA) methods has been a challenge due to the high resolution of histopathology images. Automatic report generation for histopathology images is one such challenge. In this work, we show that using an existing pre-trained Vision Transformer (ViT) to encode 4096x4096 sized patches of the Whole Slide Image (WSI) and a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model as a language modeling-based decoder for report generation, we can build a performant and portable report generation mechanism that takes into account the whole high resolution image. Our method allows us not only to generate and evaluate captions that describe the image, but also to classify the image into tissue types and the gender of the patient. Our best performing model achieves an 89.52% accuracy in Tissue Type classification with a BLEU-4 score of 0.12 in our caption generation task.
Submitted 15 March, 2024; v1 submitted 3 December, 2023;
originally announced December 2023.
-
Pinpointing coalescing binary neutron star sources with the IGWN, including LIGO-Aundha
Authors:
Sachin R. Shukla,
Lalit Pathak,
Anand S. Sengupta
Abstract:
LIGO-Aundha (A1), the Indian gravitational wave detector, is expected to join the IGWN and begin operations in the early 2030s. We study the impact of A1 on the accuracy of determining the direction of incoming transient signals from coalescing BNS sources with moderately high SNRs. It is conceivable that A1's sensitivity, effective bandwidth, and duty cycle will improve incrementally through multiple detector commissioning rounds to achieve the desired `LIGO-A+' design sensitivity. For this purpose, we examine A1 under two distinct noise PSDs. One mirrors the conditions during the O4 run of the LIGO Hanford and Livingston detectors, simulating an early commissioning stage, while the other represents the A+ design sensitivity. We consider various duty cycles of A1 at the sensitivities mentioned above for a comprehensive analysis. We show that even at the O4 sensitivity with a modest $20\%$ duty cycle, A1's addition to the IGWN leads to a $15\%$ reduction in median sky-localization errors ($\Delta\Omega_{90\%}$) to $5.6$~sq.~deg. At its design sensitivity and $80\%$ duty cycle, this error shrinks further to $2.4$~sq.~deg, with $84\%$ of sources localized within a nominal error box of $10$~sq.~deg! Even in the worst-case scenario, where signals are sub-threshold in A1, we demonstrate its critical role in reducing the localization uncertainties of the BNS source. Our results are obtained from a large Bayesian PE study using simulated signals injected in a heterogeneous network of detectors using the recently developed meshfree approximation aided rapid Bayesian inference pipeline. We consider a seismic cut-off frequency of 10 Hz for all the detectors. We also present hypothetical improvements in sky localization for a few GWTC-like events injected in real data and demonstrate A1's role in resolving the degeneracy between the luminosity distance and inclination angle parameters.
Submitted 16 April, 2024; v1 submitted 27 November, 2023;
originally announced November 2023.
-
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers
Authors:
Saurav Sengupta,
Donald E. Brown
Abstract:
Deep learning for histopathology has been successfully used for disease classification, image segmentation and more. However, combining image and text modalities using current state-of-the-art methods has been a challenge due to the high resolution of histopathology images. Automatic report generation for histopathology images is one such challenge. In this work, we show that using an existing pre-trained Vision Transformer in a two-step process of first using it to encode 4096x4096 sized patches of the Whole Slide Image (WSI) and then using it as the encoder and an LSTM decoder for report generation, we can build a fairly performant and portable report generation mechanism that takes into account the whole of the high resolution image, instead of just the patches. We are also able to use representations from an existing powerful pre-trained hierarchical vision transformer and show its usefulness in not just zero shot classification but also for report generation.
Submitted 13 November, 2023; v1 submitted 10 November, 2023;
originally announced November 2023.
-
Template bank to search for exotic gravitational wave signals from astrophysical compact binaries
Authors:
Abhishek Sharma,
Soumen Roy,
Anand S. Sengupta
Abstract:
Modeled searches of gravitational wave signals from compact binary mergers rely on template waveforms determined by the theory of general relativity (GR). Once a signal is detected, one generally performs a model-agnostic test of GR, either looking for consistency between the GR waveform and data or introducing phenomenological deviations to detect the departure from GR. The non-trivial presence of beyond-GR physics can alter the waveform and could be missed by the GR template-based searches. A recent study [Phys. Rev. D 107, 024017 (2023)] targeted the binary black hole merger, assuming the parametrized deviation in lower post-Newtonian terms, and demonstrated a mild effect on the search sensitivity. Surprisingly, for the search space of binary neutron star (BNS) systems where component masses range from 1 to $2.4\:\rm{M}_\odot$ and parametrized deviations span the $1\sigma$ width of the deviation parameters measured from the GW170817 event, the GR template bank is highly ineffectual for detecting the non-GR signals. Here, we present a new hybrid method to construct a non-GR template bank for the BNS search space. The hybrid method uses the geometric approach of three-dimensional lattice placement to cover most of the parameter space volume, followed by the random method to cover the boundary regions of parameter space. We find that the non-GR bank size is $\sim$15 times larger than the conventional GR bank and is effectual towards detecting non-GR signals in the target search space.
Submitted 25 June, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Statistical Network Analysis: Past, Present, and Future
Authors:
Srijan Sengupta
Abstract:
This article provides a brief overview of statistical network analysis, a rapidly evolving field of statistics, which encompasses statistical models, algorithms, and inferential methods for analyzing data in the form of networks. Particular emphasis is given to connecting the historical developments in network science to today's statistical network analysis, and outlining important new areas for future research.
This invited article is intended as a book chapter for the volume "Frontiers of Statistics and Data Science" edited by Subhashis Ghoshal and Anindya Roy for the International Indian Statistical Association Series on Statistics and Data Science, published by Springer. This review article covers the material from the short course titled "Statistical Network Analysis: Past, Present, and Future" taught by the author at the Annual Conference of the International Indian Statistical Association, June 6-10, 2023, at Golden, Colorado.
Submitted 31 October, 2023;
originally announced November 2023.
-
Inferring to C or not to C: Evolutionary games with Bayesian inferential strategies
Authors:
Arunava Patra,
Supratim Sengupta,
Ayan Paul,
Sagar Chakraborty
Abstract:
Strategies for sustaining cooperation and preventing exploitation by selfish agents in repeated games have mostly been restricted to Markovian strategies where the response of an agent depends on the actions in the previous round. Such strategies are characterized by lack of learning. However, learning from accumulated evidence over time and using the evidence to dynamically update our response is a key feature of living organisms. Bayesian inference provides a framework for such evidence-based learning mechanisms. It is therefore imperative to understand how strategies based on Bayesian learning fare in repeated games with Markovian strategies. Here, we consider a scenario where the Bayesian player uses the accumulated evidence of the opponent's actions over several rounds to continuously update her belief about the reactive opponent's strategy. The Bayesian player can then act on her inferred belief in different ways. By studying repeated Prisoner's dilemma games with such Bayesian inferential strategies, both in infinite and finite populations, we identify the conditions under which such strategies can be evolutionarily stable. We find that a Bayesian strategy that is less altruistic than the inferred belief about the opponent's strategy can outperform a larger set of reactive strategies, whereas one that is more generous than the inferred belief is more successful when the benefit-to-cost ratio of mutual cooperation is high. Our analysis reveals how learning the opponent's strategy through Bayesian inference, as opposed to utility maximization, can be beneficial in the long run, in preventing exploitation and eventual invasion by reactive strategies.
Submitted 27 October, 2023;
originally announced October 2023.
-
Quality > Quantity: Synthetic Corpora from Foundation Models for Closed-Domain Extractive Question Answering
Authors:
Saptarshi Sengupta,
Connor Heaton,
Shreya Ghosh,
Preslav Nakov,
Prasenjit Mitra
Abstract:
Domain adaptation, the process of training a model in one domain and applying it to another, has been extensively explored in machine learning. While training a domain-specific foundation model (FM) from scratch is an option, recent methods have focused on adapting pre-trained FMs for domain-specific tasks. However, our experiments reveal that neither approach consistently achieves state-of-the-art (SOTA) results in the target domain. In this work, we study extractive question answering within closed domains and introduce the concept of targeted pre-training. This involves determining and generating relevant data to further pre-train our models, as opposed to the conventional philosophy of utilizing domain-specific FMs trained on a wide range of data. Our proposed framework uses Galactica to generate synthetic, ``targeted'' corpora that align with specific writing styles and topics, such as research papers and radiology reports. This process can be viewed as a form of knowledge distillation. We apply our method to two biomedical extractive question answering datasets, COVID-QA and RadQA, achieving a new benchmark on the former and demonstrating overall improvements on the latter. Code available at https://github.com/saptarshi059/CDQA-v1-Targetted-PreTraining/tree/main.
Submitted 25 October, 2023;
originally announced October 2023.
-
Constant Curvature 3-branes in 5-D f(R) Bulk
Authors:
Shafaq Gulzar Elahi,
Soumya Samrat Mandal,
Soumitra SenGupta
Abstract:
Braneworld models remain the most promising candidates to address several important questions in low-energy particle phenomenology and cosmology. The role of the moduli field(s) and its stabilization is an integral part of this question. In this work, we show that a 5-dimensional warped braneworld model with higher curvature gravity in the bulk admits de Sitter and anti-de Sitter solutions on the branes. The remarkable feature of having a positive vacuum energy on the visible brane is the presence of a metastable minimum and a global minimum for the modulus potential. While the metastable minimum leads to a consistent cosmological model of a bouncing universe, the concomitant existence of the global minimum provides a vacuum for the modulus to roll down to stability. Further, this model is shown to be consistent with the swampland conjecture to qualify as a viable candidate in the low energy description of the string landscape.
Submitted 24 October, 2023;
originally announced October 2023.
-
Body-mounted MR-conditional Robot for Minimally Invasive Liver Intervention
Authors:
Zhefeng Huang,
Anthony L. Gunderman,
Samuel E. Wilcox,
Saikat Sengupta,
Jay Shah,
Aiming Lu,
David Woodrum,
Yue Chen
Abstract:
MR-guided microwave ablation (MWA) has proven effective in treating hepatocellular carcinoma (HCC) with small-sized tumors, but the state-of-the-art technique suffers from sub-optimal workflow due to speed and accuracy of needle placement. This paper presents a compact body-mounted MR-conditional robot that can operate in closed-bore MR scanners for accurate needle guidance. The robotic platform consists of two stacked Cartesian XY stages, each with two degrees of freedom, that facilitate needle guidance. The robot is actuated using 3D-printed pneumatic turbines with MR-conditional bevel gear transmission systems. Pneumatic valves and control mechatronics are located inside the MRI control room and are connected to the robot with pneumatic transmission lines and optical fibers. Free space experiments indicated robot-assisted needle insertion error of 2.6$\pm$1.3 mm at an insertion depth of 80 mm. The MR-guided phantom studies were conducted to verify the MR-conditionality and targeting performance of the robot. Future work will focus on the system optimization and validations in animal trials.
Submitted 25 March, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Give and Take: Federated Transfer Learning for Industrial IoT Network Intrusion Detection
Authors:
Lochana Telugu Rajesh,
Tapadhir Das,
Raj Mani Shukla,
Shamik Sengupta
Abstract:
The rapid growth in Internet of Things (IoT) technology has become an integral part of today's industries forming the Industrial IoT (IIoT) initiative, where industries are leveraging IoT to improve communication and connectivity via emerging solutions like data analytics and cloud computing. Unfortunately, the rapid use of IoT has made it an attractive target for cybercriminals. Therefore, protecting these systems is of utmost importance. In this paper, we propose a federated transfer learning (FTL) approach to perform IIoT network intrusion detection. As part of the research, we also propose a combinational neural network as the centerpiece for performing FTL. The proposed technique splits IoT data between the client and server devices to generate corresponding models, and the weights of the client models are combined to update the server model. Results showcase high performance for the FTL setup between iterations on both the IIoT clients and the server. Additionally, the proposed FTL setup achieves better overall performance than contemporary machine learning algorithms at performing network intrusion detection.
Submitted 11 October, 2023;
originally announced October 2023.
-
Exploring Axions through the Photon Ring of a Spherically Symmetric Black Hole
Authors:
Sourov Roy,
Pratick Sarkar,
Subhadip Sau,
Soumitra SenGupta
Abstract:
In this study, we examine the phenomenon of photon-axion conversion occurring in the spacetime surrounding a black hole. Specifically, we focus on the potential existence of a magnetic field around the supermassive black hole M87*, which could facilitate the conversion of photons into axions in close proximity to the photon sphere. While photons traverse through the curved spacetime, they spend time near the photon sphere, where conversion of these photons into axions takes place. Consequently, this process leads to a decrease in the intensity of the black hole's photon ring. To explore the possibilities of detecting these hypothetical axion particles, we propose observing the photon sphere using higher resolution telescopes. By doing so, we can gain valuable insights into the conversion mechanism as well as the nature of the spherically symmetric black hole geometry. Moreover, we also investigate how the photon ring luminosities are affected if the black hole possesses a charge parameter. For instance, apart from U(1) electric charge, the presence of an extra dimension may induce a {\em tidal charge} with a characteristic signature. It is important to note that the success of the conversion mechanism relies on the axion-photon coupling and mass. As a result, the modified luminosity of the black hole's photon ring offers a valuable means of constraining the axion's mass and coupling parameter within a certain range. Thus our findings contribute to a better understanding of photon-axion conversion in the environment of a black hole spacetime and help us explore the possible existence of an extra spatial dimension.
Submitted 9 October, 2023;
originally announced October 2023.
-
Large Language Models for Software Engineering: Survey and Open Problems
Authors:
Angela Fan,
Beliz Gokkaya,
Mark Harman,
Mitya Lyubarskiy,
Shubho Sengupta,
Shin Yoo,
Jie M. Zhang
Abstract:
This paper provides a survey of the emerging area of Large Language Models (LLMs) for Software Engineering (SE). It also sets out open research challenges for the application of LLMs to technical problems faced by software engineers. LLMs' emergent properties bring novelty and creativity with applications right across the spectrum of Software Engineering activities including coding, design, requirements, repair, refactoring, performance improvement, documentation and analytics. However, these very same emergent properties also pose significant technical challenges; we need techniques that can reliably weed out incorrect solutions, such as hallucinations. Our survey reveals the pivotal role that hybrid techniques (traditional SE plus LLMs) have to play in the development and deployment of reliable, efficient and effective LLM-based SE.
Submitted 11 November, 2023; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Hamiltonian Form of Gravity around a Singularity
Authors:
Sandipan Sengupta
Abstract:
We show that the Hamiltonian form of gravity in terms of connection and densitized triad variables admits a simpler structure close to a (spacelike) curvature singularity. To define this regime, we construct a limit to a vanishing (spatial) metric determinant. The cosmological and black hole spacetime solutions that characterize such an approach to a singularity are obtained. We also elucidate the connection of this generic limit to the `Carroll' limit which could be interpreted as a special case.
Submitted 9 July, 2024; v1 submitted 1 October, 2023;
originally announced October 2023.
-
De-SaTE: Denoising Self-attention Transformer Encoders for Li-ion Battery Health Prognostics
Authors:
Gaurav Shinde,
Rohan Mohapatra,
Pooja Krishan,
Saptarshi Sengupta
Abstract:
The usage of Lithium-ion (Li-ion) batteries has gained widespread popularity across various industries, from powering portable electronic devices to propelling electric vehicles and supporting energy storage systems. A central challenge in Li-ion battery reliability lies in accurately predicting their Remaining Useful Life (RUL), which is a critical measure for proactive maintenance and predictive analytics. This study presents a novel approach that harnesses the power of multiple denoising modules, each trained to address specific types of noise commonly encountered in battery data. Specifically, a denoising auto-encoder and a wavelet denoiser are used to generate encoded/decomposed representations, which are subsequently processed through dedicated self-attention transformer encoders. After extensive experimentation on NASA and CALCE data, a broad spectrum of health indicator values are estimated under a set of diverse noise patterns. The reported error metrics on these data are on par with or better than the state-of-the-art reported in recent literature.
Submitted 11 November, 2023; v1 submitted 28 September, 2023;
originally announced October 2023.
-
The development of HISPEC for Keck and MODHIS for TMT: science cases and predicted sensitivities
Authors:
Quinn M. Konopacky,
Ashley D. Baker,
Dimitri Mawet,
Michael P. Fitzgerald,
Nemanja Jovanovic,
Charles Beichman,
Garreth Ruane,
Rob Bertz,
Hiroshi Terada,
Richard Dekany,
Larry Lingvay,
Marc Kassis,
David Anderson,
Motohide Tamura,
Bjorn Benneke,
Thomas Beatty,
Tuan Do,
Shogo Nishiyama,
Peter Plavchan,
Jason Wang,
Ji Wang,
Adam Burgasser,
Jean-Baptiste Ruffio,
Huihao Zhang,
Aaron Brown
, et al. (50 additional authors not shown)
Abstract:
HISPEC is a new, high-resolution near-infrared spectrograph being designed for the W.M. Keck II telescope. By offering single-shot, R=100,000 between 0.98 - 2.5 um, HISPEC will enable spectroscopy of transiting and non-transiting exoplanets in close orbits, direct high-contrast detection and spectroscopy of spatially separated substellar companions, and exoplanet dynamical mass and orbit measurements using precision radial velocity monitoring calibrated with a suite of state-of-the-art absolute and relative wavelength references. MODHIS is the counterpart to HISPEC for the Thirty Meter Telescope and is being developed in parallel with similar scientific goals. In this proceeding, we provide a brief overview of the current design of both instruments, and the requirements for the two spectrographs as guided by the scientific goals for each. We then outline the current science case for HISPEC and MODHIS, with focuses on the science enabled for exoplanet discovery and characterization. We also provide updated sensitivity curves for both instruments, in terms of both signal-to-noise ratio and predicted radial velocity precision.
Submitted 19 September, 2023;
originally announced September 2023.
-
Locally nilpotent polynomials over $\mathbb{Z}$
Authors:
Sayak Sengupta
Abstract:
For a polynomial $u=u(x)$ in $\mathbb{Z}[x]$ and $r\in\mathbb{Z}$, we consider the orbit of $u$ at $r$ denoted and defined by $\mathcal{O}_u(r):=\{u(r),u(u(r)),\ldots\}$. We ask two questions here: (i) what are the polynomials $u$ for which $0\in \mathcal{O}_u(r)$, and (ii) what are the polynomials for which $0\not\in \mathcal{O}_u(r)$ but, modulo every prime $p$, $0\in \mathcal{O}_u(r)$? In this paper, we give a complete classification of the polynomials for which (ii) holds for a given $r$. We also present some results for some special values of $r$ where (i) can be answered.
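The orbit condition in the abstract above can be explored numerically. The following is a minimal sketch, not the paper's classification method: it iterates $u$ from $r$ and reports the first iterate equal to $0$, either over $\mathbb{Z}$ or modulo a given prime. The function names `evaluate` and `first_zero_in_orbit`, the coefficient ordering, and the `max_steps` cutoff are illustrative assumptions; a bounded search can only confirm that $0$ lies in the orbit, never prove that it does not.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial by Horner's rule.

    coeffs lists integer coefficients from the highest degree down to the
    constant term (an assumed convention for this sketch).
    """
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


def first_zero_in_orbit(coeffs, r, max_steps=20, modulus=None):
    """Return the smallest n >= 1 with u^n(r) == 0, or None if not found.

    If modulus is given, iterates are reduced modulo it, which tests
    whether 0 lies in the orbit modulo that prime. max_steps is a
    hypothetical cutoff that keeps the search finite.
    """
    value = r
    for n in range(1, max_steps + 1):
        value = evaluate(coeffs, value)
        if modulus is not None:
            value %= modulus
        if value == 0:
            return n
    return None


# u(x) = x - 1 at r = 3: the orbit is 2, 1, 0, so zero appears at step 3.
print(first_zero_in_orbit([1, -1], 3))
# u(x) = x^2 + 1 at r = 0, modulo 5: the orbit is 1, 2, 0.
print(first_zero_in_orbit([1, 0, 1], 0, modulus=5))
# Modulo 3 the same orbit is 1, 2, 2, ... and never reaches 0.
print(first_zero_in_orbit([1, 0, 1], 0, modulus=3))
```

Because the orbit of $x^2+1$ at $0$ misses $0$ modulo $3$, that polynomial fails condition (ii) at $r=0$; the example only illustrates the definitions.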
Submitted 19 September, 2023;
originally announced September 2023.
-
$\texttt{NePhi}$: Neural Deformation Fields for Approximately Diffeomorphic Medical Image Registration
Authors:
Lin Tian,
Hastings Greer,
Raúl San José Estépar,
Soumyadip Sengupta,
Marc Niethammer
Abstract:
This work proposes NePhi, a generalizable neural deformation model which results in approximately diffeomorphic transformations. In contrast to the predominant voxel-based transformation fields used in learning-based registration approaches, NePhi represents deformations functionally, leading to great flexibility within the design space of memory consumption during training and inference, inference time, registration accuracy, as well as transformation regularity. Specifically, NePhi 1) requires less memory compared to voxel-based learning approaches, 2) improves inference speed by predicting latent codes, compared to current existing neural deformation based registration approaches that \emph{only} rely on optimization, 3) improves accuracy via instance optimization, and 4) shows excellent deformation regularity which is highly desirable for medical image registration. We demonstrate the performance of NePhi on a 2D synthetic dataset as well as for real 3D lung registration. Our results show that NePhi can match the accuracy of voxel-based representations in a single-resolution registration setting. For multi-resolution registration, our method matches the accuracy of current SOTA learning-based registration approaches with instance optimization while reducing memory requirements by a factor of five.
Submitted 26 March, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Prompt sky localization of compact binary sources using a meshfree approximation
Authors:
Lalit Pathak,
Sanket Munishwar,
Amit Reza,
Anand S. Sengupta
Abstract:
The number of gravitational wave signals from the merger of compact binary systems detected in the network of advanced LIGO and Virgo detectors is expected to increase considerably in the upcoming science runs. Once a confident detection is made, it is crucial to reconstruct the source's properties rapidly, particularly the sky position and chirp mass, to follow up on these transient sources with telescopes operating at different electromagnetic bands for multi-messenger astronomy. In this context, we present a rapid parameter estimation (PE) method aided by mesh-free approximations to accurately reconstruct properties of compact binary sources from data gathered by a network of gravitational wave detectors. This approach builds upon our previous algorithm [L. Pathak et al., Fast likelihood evaluation using meshfree approximations for reconstructing compact binary sources, https://journals.aps.org/prd/abstract/10.1103/PhysRevD.108.064055, Phys. Rev. D 108, 064055 (2023)] to expedite the evaluation of the likelihood function and extend it to enable coherent network PE in a ten-dimensional parameter space, including sky position and polarization angle. Additionally, we propose an optimized interpolation node placement strategy during the start-up stage to enhance the accuracy of the marginalized posterior distributions. With this updated method, we can estimate the properties of binary neutron star (BNS) sources in approximately 2.4~(2.7) min for the \texttt{TaylorF2}~(\texttt{IMRPhenomD}) signal model by utilizing 64 CPU cores on a shared memory architecture. Furthermore, our approach can be integrated into existing parameter estimation pipelines, providing a valuable tool for the broader scientific community. We also highlight some areas for improvement to this algorithm in the future, which include overcoming the limitations due to narrow prior bounds.
Submitted 17 February, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
TFBEST: Dual-Aspect Transformer with Learnable Positional Encoding for Failure Prediction
Authors:
Rohan Mohapatra,
Saptarshi Sengupta
Abstract:
Hard Disk Drive (HDD) failures in datacenters are costly: from catastrophic data loss to a question of goodwill, stakeholders want to avoid them like the plague. An important tool in proactively monitoring against HDD failure is timely estimation of the Remaining Useful Life (RUL). To this end, the Self-Monitoring, Analysis and Reporting Technology employed within HDDs (S.M.A.R.T.) provides critical logs for long-term maintenance of the security and dependability of these essential data storage devices. Data-driven predictive models in the past have used these S.M.A.R.T. logs and CNN/RNN based architectures heavily. However, they have suffered significantly in providing a confidence interval around the predicted RUL values as well as in processing very long sequences of logs. In addition, some of these approaches, such as those based on LSTMs, are inherently slow to train and have tedious feature engineering overheads. To overcome these challenges, in this work we propose a novel transformer architecture, a Temporal-fusion Bi-encoder Self-attention Transformer (TFBEST), for predicting failures in hard drives. It is an encoder-decoder based deep learning technique that enhances the context gained from understanding health statistics sequences and predicts a sequence of the number of days remaining before a disk potentially fails. In this paper, we also provide a novel confidence margin statistic that can help manufacturers replace a hard drive within a time frame. Experiments on Seagate HDD data show that our method significantly outperforms the state-of-the-art RUL prediction methods during testing over the exhaustive 10-year data from Backblaze (2013-present). Although validated on HDD failure prediction, the TFBEST architecture is well-suited for other prognostics applications and may be adapted for allied regression problems.
Submitted 5 September, 2023;
originally announced September 2023.
-
Spontaneous voltage peaks in superconducting Nb channels without engineered asymmetry
Authors:
Shamashis Sengupta,
Miguel Monteverde,
Sara Loucif,
Florian Pallier,
Louis Dumoulin,
Claire Marrache-Kikuchi
Abstract:
Rectification effects in solid-state devices are a consequence of nonreciprocal transport properties. This phenomenon is usually observed in systems with broken inversion symmetry. In most instances, nonreciprocal transport arises in the presence of an applied magnetic field and the rectified signal has an antisymmetric dependence on the field. We have observed rectification of environmental electromagnetic fluctuations in plain Nb channels without any asymmetry in design, leading to spontaneous voltage peaks at the superconducting transition. The signal is symmetric in the magnetic field and appears even without an applied field at the critical temperature. This is indicative of an unconventional mechanism of nonreciprocal transport resulting from a spontaneous breaking of inversion symmetry.
Submitted 5 September, 2023;
originally announced September 2023.
-
The MPIfR-MeerKAT Galactic Plane Survey II. The eccentric double neutron star system PSR J1208-5936 and a neutron star merger rate update
Authors:
M. Colom i Bernadich,
V. Balakrishnan,
E. Barr,
M. Berezina,
M. Burgay,
S. Buchner,
D. J. Champion,
W. Chen,
G. Desvignes,
P. C. C. Freire,
K. Grunthal,
M. Kramer,
Y. Men,
P. V. Padmanabh,
A. Parthasarathy,
D. Pillay,
I. Rammala,
S. Sengupta,
V. Venkatraman Krishnan
Abstract:
The MMGPS-L is the most sensitive pulsar survey in the Southern Hemisphere. We present a follow-up study of one of these new discoveries, PSR J1208-5936, a 28.71-ms recycled pulsar in a double neutron star system with an orbital period of Pb=0.632 days and an eccentricity of e=0.348. Through timing of almost one year of observations, we detected the relativistic advance of periastron (0.918(1) deg/yr), resulting in a total system mass of Mt=2.586(5) Mo. We also achieved low-significance constraints on the amplitude of the Einstein delay and Shapiro delay, in turn yielding constraints on the pulsar mass (Mp=1.26(+0.13/-0.25) Mo), the companion mass (Mc=1.32(+0.25/-0.13) Mo), and the inclination angle (i=57(12) degrees). This system is highly eccentric compared to other Galactic field double neutron stars with similar periods, possibly hinting at a larger-than-usual supernova kick during the formation of the second-born neutron star. The binary will merge within 7.2(2) Gyr due to the emission of gravitational waves. With the improved sensitivity of the MMGPS-L, we updated the Milky Way neutron star merger rate to be 25(+19/-9) Myr$^{-1}$ within 90% credible intervals, which is lower than previous studies based on known Galactic binaries owing to the lack of further detections despite the highly sensitive nature of the survey. This implies a local cosmic neutron star merger rate of 293(+222/-103) Gpc$^{-3}$ yr$^{-1}$, consistent with LIGO and Virgo O3 observations. With this, we predict the observation of 10(+8/-4) neutron star merger events during the LIGO-Virgo-KAGRA O4 run. We predict the uncertainties on the component masses and the inclination angle will be reduced to 5x10$^{-3}$ Mo and 0.4 degrees after two decades of timing, and that in at least a decade from now the detection of the shift in Pb and the sky proper motion will serve to make an independent constraint of the distance to the system.
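The step from the periastron advance to the total mass in the abstract above can be reproduced approximately with the standard general-relativistic relation $\dot\omega = 3\,(2\pi/P_b)^{5/3}\,(T_\odot M_t)^{2/3}/(1-e^2)$, where $T_\odot = G M_\odot/c^3$. The sketch below is not the authors' timing analysis. It assumes the measured advance is purely relativistic, uses the rounded values quoted in the abstract, and takes a Julian year of 365.25 days; the function name is hypothetical.

```python
import math

T_SUN = 4.925490947e-6               # G * M_sun / c^3 in seconds
SECONDS_PER_DAY = 86400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # Julian year (assumption)


def total_mass_from_periastron_advance(omega_dot_deg_per_yr, pb_days, ecc):
    """Invert the GR periastron-advance relation for the total mass.

    Returns the total system mass in solar masses, assuming the whole
    advance is due to general relativity.
    """
    omega_dot = math.radians(omega_dot_deg_per_yr) / SECONDS_PER_YEAR  # rad/s
    n = 2.0 * math.pi / (pb_days * SECONDS_PER_DAY)  # orbital angular frequency
    # Solve omega_dot = 3 n^(5/3) (T_sun M)^(2/3) / (1 - e^2) for (T_sun M)^(2/3).
    x = omega_dot * (1.0 - ecc**2) / (3.0 * n ** (5.0 / 3.0))
    return x**1.5 / T_SUN


# Rounded values from the abstract: 0.918 deg/yr, Pb = 0.632 d, e = 0.348.
mass = total_mass_from_periastron_advance(0.918, 0.632, 0.348)
print(f"Total mass: {mass:.2f} solar masses")
```

With these rounded inputs the result lands close to the quoted Mt=2.586(5) Mo; the small residual difference comes from the three-digit orbital period and eccentricity used here.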
Submitted 8 September, 2023; v1 submitted 31 August, 2023;
originally announced August 2023.