Unveiling desert region in inert doublet model assisted by Peccei-Quinn symmetry

Abstract: The Inert Higgs Doublet model (IDM), assisted by Peccei-Quinn (PQ) symmetry, offers a simple but natural framework of a dark sector that accommodates Weakly Interacting Massive Particle (WIMP) and axion as dark matter components. Spontaneous breaking of $U(1)_{PQ}$ symmetry, which was originally proposed as an elegant solution to the strong charge-parity (CP) problem, also ensures the stability of WIMP through a residual $\mathbb{Z}_2$ symmetry. Interestingly, additional fields necessitated by PQ symmetry further enrich the dark sector. These include a scalar field proprietor for axion DM and a vector-like quark (VLQ) that acts as a portal for the dark sector through Yukawa interactions. Moreover, this combination of the axion and WIMP components satisfies the observed DM relic density and reopens the phenomenologically exciting region of the IDM parameter space where the WIMP mass falls between 100 - 550 GeV. We investigate the model-independent pair production of VLQs exploring this region at the Large Hadron Collider (LHC), incorporating the effects of next-to-leading order (NLO) QCD corrections. After production, each VLQ decays into a top or bottom quark accompanied by an inert scalar, a consequence of the residual $\mathbb{Z}_2$ symmetry. Utilising relevant observables with a leptonic search channel and employing multivariate analysis, we demonstrate the ability of this analysis to exclude a significant portion of the parameter space with an integrated luminosity of 300 $\text{fb}^{-1}$.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 29 pages, 8 figures, 4 tables

arXiv:2407.01024 [pdf, other]

Unveiling frequency-dependent eclipsing in spider millisecond pulsars using broadband polarization observations with the Parkes

Authors: Sangita Kumari, Bhaswati Bhattacharyya, Rahul Sharan, Simon Johnston, Patrick Weltevrede, Benjamin Stappers, Devojyoti Kansabanik, Jayanta Roy, Ankita Ghosh

Abstract: This study presents an orbital phase-dependent analysis of three black widow spider millisecond pulsars (BW MSPs), aiming to investigate the magnetic field within the eclipse environment. The ultra-wide-bandwidth low-frequency receiver (UWL) of the Parkes 'Murriyang' radio telescope is utilised for full polarisation observations covering frequencies from 704-4032 MHz. Depolarisation of pulsed emission is observed during the eclipse phase of three BW MSPs, namely PSR J0024-7204J, PSR J1431-4715 and PSR J1959+2048, consistent with previous studies of other BW MSPs. We estimated orbital phase-dependent RM values for these MSPs. The wide-bandwidth observations also provided constraints on the eclipse cutoff frequency for these BW MSPs. For PSR J0024-7204J, we report temporal variation of the eclipse cutoff frequency coupled with changes in the electron column density within the eclipse medium across six observed eclipses. Moreover, the eclipse cutoff frequency for PSR J1431-4715 is determined to be 1251 $\pm$ 80 MHz, leading to the conclusion that synchrotron absorption is the primary mechanism responsible for the eclipsing. Additionally, for PSR J1959+2048, the estimated cutoff frequency exceeded 1400 MHz, consistent with previous studies. With this investigation, we have doubled the sample size of BW MSPs with orbital phase-resolved studies, allowing a better probe of the eclipse environment.

Submitted 1 July, 2024; originally announced July 2024.

Comments: Accepted for publication in ApJ

arXiv:2406.18700 [pdf, other]

On Fourier analysis of sparse Boolean functions over certain Abelian groups

Authors: Sourav Chakraborty, Swarnalipa Datta, Pranjal Dutta, Arijit Ghosh, Swagato Sanyal

Abstract: Given an Abelian group G, a Boolean-valued function f: G -> {-1,+1} is said to be s-sparse if it has at most s non-zero Fourier coefficients over the domain G. In a seminal paper, Gopalan et al. proved "Granularity" for Fourier coefficients of Boolean-valued functions over Z_2^n, a result that has found many diverse applications in theoretical computer science and combinatorics. They also studied structural results for Boolean functions over Z_2^n which are approximately Fourier-sparse. In this work, we obtain structural results for approximately Fourier-sparse Boolean-valued functions over Abelian groups G of the form G := Z_{p_1}^{n_1} \times ... \times Z_{p_t}^{n_t}, for distinct primes p_i. We also obtain a lower bound of the form 1/(m^{2}s)^ceiling(phi(m)/2) on the absolute value of the smallest non-zero Fourier coefficient of an s-sparse function, where m=p_1 ... p_t, and phi(m)=(p_1-1) ... (p_t-1). We carefully apply probabilistic techniques from Gopalan et al. to obtain our structural results, and use some non-trivial results from algebraic number theory to get the lower bound. We construct a family of at most s-sparse Boolean functions over Z_p^n, where p > 2, for arbitrarily large enough s, where the minimum non-zero Fourier coefficient is 1/omega(n). The "Granularity" result of Gopalan et al. implies that the absolute values of non-zero Fourier coefficients of any s-sparse Boolean-valued function over Z_2^n are 1/O(s). So, our result shows that one cannot expect such a lower bound for general Abelian groups.
Using our new structural results on the Fourier coefficients of sparse functions, we design an efficient testing algorithm for Fourier-sparse Boolean functions that requires poly((ms)^phi(m),1/epsilon)-many queries. Further, we prove an Omega(sqrt{s}) lower bound on the query complexity of any adaptive sparsity testing algorithm.
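To make the notion of s-sparsity concrete, the following is a minimal sketch for the simplest case G = Z_2^n only, where the characters are chi_S(x) = (-1)^(S.x). It is a brute-force illustration, not the testing algorithm from the paper; the function names `fourier_coefficients`, `sparsity`, `parity` and `and_like` are hypothetical and chosen here for illustration.

```python
from fractions import Fraction
from itertools import product


def fourier_coefficients(f, n):
    """Exact Fourier coefficients of f: {0,1}^n -> {-1,+1} over Z_2^n.

    The coefficient at S is the average over x of f(x) * (-1)^(S.x).
    Brute force: loops over all 2^n points for each of the 2^n characters.
    """
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for s in points:
        total = sum(
            f(x) * (-1) ** sum(si * xi for si, xi in zip(s, x))
            for x in points
        )
        coeffs[s] = Fraction(total, 2 ** n)
    return coeffs


def sparsity(f, n):
    """Number of non-zero Fourier coefficients of f."""
    return sum(1 for c in fourier_coefficients(f, n).values() if c != 0)


def parity(x):
    """A single character: its Fourier expansion has exactly one term."""
    return (-1) ** sum(x)


def and_like(x):
    """-1 only on the all-ones input; every coefficient is non-zero."""
    return -1 if all(x) else 1
```

For n = 2, `parity` is 1-sparse, while `and_like` has four non-zero coefficients, each of absolute value 1/2.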

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.17927 [pdf, other]

Onset of slow dynamics in dense suspensions of active colloids

Authors: Antina Ghosh, Sayan Maity, Vijayakumar Chikkadi

Abstract: Slow relaxation and heterogeneous dynamics are characteristic features of glasses. The presence of glassy dynamics in nonequilibrium systems, such as active matter, is of significant interest due to its implications for living systems and material science. In this study, we use dense suspensions of self-propelled Janus particles moving on a substrate to investigate the onset of slow dynamics. Our findings show that dense active suspensions exhibit several hallmark features of slow dynamics similar to systems approaching equilibrium. The relaxation time fits well with the Vogel-Fulcher-Tamman (VFT) equation, and the system displays heterogeneous dynamics. Furthermore, increasing the activity leads to faster relaxation of the system, and the glass transition density predicted by the VFT equation shifts to higher densities. Measurements of the cage length and persistence length reveal that they are of the same order over the range of activities explored in our study. These results are in agreement with recent particle simulations.
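For orientation, a commonly used density form of the VFT relation is sketched below. The abstract does not state the exact parametrisation fitted by the authors, so the symbols here are assumptions: $\tau_\alpha$ is the relaxation time, $\phi$ the particle density, $\tau_0$ a microscopic time scale, $D$ a fragility-like constant, and $\phi_0$ the density at which the fitted relaxation time diverges.

```latex
\tau_\alpha(\phi) = \tau_0 \exp\!\left( \frac{D\,\phi}{\phi_0 - \phi} \right)
```

In this form, the statement that the predicted glass transition density shifts to higher densities with activity would correspond to a fitted $\phi_0$ that increases with activity.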

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.16726 [pdf, other]

HI and CO spectroscopy of the unusual host of GRB 171205A: A grand design spiral galaxy with a distorted HI field

Authors: A. de Ugarte Postigo, M. Michalowski, C. C. Thoene, S. Martin, A. Ashok, J. F. Agui Fernandez, M. Bremer, K. Misra, D. A. Perley, K. E. Heintz, S. V. Cherukuri, W. Dimitrov, T. Geron, A. Ghosh, L. Izzo, D. A. Kann, M. P. Koprowski, A. Lesniewska, J. K. Leung, A. Levan, A. Omar, D. Oszkiewicz, M. Polinska, L. Resmi, S. Schulze

Abstract: GRBs produced by the collapse of massive stars are usually found near the most prominent star-forming regions of star-forming galaxies. GRB 171205A happened in the outskirts of a spiral galaxy, a peculiar location in an atypical GRB host. In this paper we present a highly-resolved study of the molecular gas of this host, with CO(1-0) observations from ALMA. We compare with GMRT atomic HI observations, and with data at other wavelengths to provide a broad-band view of the galaxy. The ALMA observations have a spatial resolution of 0.2" and a spectral resolution of 10 km/s, observed when the afterglow had a flux density of ~53 mJy. This allowed a molecular study both in emission and absorption. The HI observations allowed us to study the host galaxy and its extended environment. The CO emission shows an undisturbed spiral structure with a central bar, and no significant emission at the location of the GRB. Our CO spectrum does not reveal any CO absorption, with a column density limit of < 10^15 cm^-2. This argues against the progenitor forming in a massive molecular cloud. The molecular gas traces the galaxy arms with higher concentration in the regions dominated by dust. The HI gas does not follow the stellar light or the molecular gas and is concentrated in two blobs, with no emission towards the centre of the galaxy, and is slightly displaced towards the southwest of the galaxy, where the GRB exploded. Within the extended neighbourhood of the host galaxy, we identify another prominent HI source at the same redshift, at a projected distance of 188 kpc.
Our observations show that the progenitor of this GRB is not associated with a massive molecular cloud, but is more likely related to low-metallicity atomic gas. The distortion in the HI gas field is an indicator of an odd environment that could have triggered star formation and could be linked to a past interaction with the companion galaxy.

Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

Comments: 13 pages, 10 figures, 8 tables, A&A submitted after 1st referee review

arXiv:2406.16358 [pdf, other]

Approximate DCT and Quantization Techniques for Energy-Constrained Image Sensors

Authors: Ming-Che Li, Archisman Ghosh, Shreyas Sen

Abstract: Recent expansions in multimedia devices gather enormous amounts of real-time images for processing and inference. The images are first compressed using compression schemes, like JPEG, to reduce storage costs and power for transmitting the captured data. Due to inherent error resilience and imperceptibility in images, JPEG can be approximated to reduce the required computation power and area. This work demonstrates the first end-to-end approximation computing-based optimization of JPEG hardware using i) an approximate division realized using bit-shift operators to reduce the complexity of the quantization block, ii) loop perforation, and iii) precision scaling on top of a multiplier-less fast DCT architecture to achieve an extremely energy-efficient JPEG compression unit which will be a perfect fit for power/bandwidth-limited scenarios. Furthermore, a gradient descent-based heuristic composed of two conventional approximation strategies, i.e., Precision Scaling and Loop Perforation, is implemented for tuning the degree of approximation to trade off energy consumption with the quality degradation of the decoded image. The entire RTL design is coded in Verilog HDL, synthesized, mapped to TSMC 65nm CMOS technology, and simulated using Cadence Spectre Simulator under 25 °C, TT corner. The approximate division approach achieved around 28% reduction in the active design area.
The heuristic-based approximation technique combined with accelerator optimization achieves a significant energy reduction of 36% for a minimal image quality degradation of 2% SAD. Simulation results also show that the proposed architecture consumes 15 µW at the DCT and quantization stages to compress a colored 480p image at 6fps.
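The idea of replacing the quantization divider with bit shifts can be illustrated in software. The sketch below is an assumption-laden illustration, not the authors' Verilog design: it rounds each quantization step to the nearest power of two (ties go to the smaller power) and then divides with a right shift. The function names are hypothetical.

```python
def nearest_power_of_two_shift(step: int) -> int:
    """Return k such that 2**k is the power of two closest to step.

    Ties (for example step = 12, between 8 and 16) resolve to the smaller power.
    """
    if step < 1:
        raise ValueError("quantization step must be >= 1")
    k = step.bit_length() - 1            # floor(log2(step))
    lower, upper = 1 << k, 1 << (k + 1)
    return k if step - lower <= upper - step else k + 1


def quantize_shift(coeff: int, step: int) -> int:
    """Approximate round(coeff / step) using only an add and a shift."""
    k = nearest_power_of_two_shift(step)
    if k == 0:
        return coeff                     # dividing by 1 changes nothing
    sign = -1 if coeff < 0 else 1
    # Adding half of 2**k before shifting rounds instead of truncating.
    return sign * ((abs(coeff) + (1 << (k - 1))) >> k)
```

When the step is already a power of two the result matches rounded division (for example, 100 with step 16 gives 6). For other steps the result deviates (37 with step 10 gives 5 rather than 4), which is the kind of error such a scheme trades for a simpler quantization block.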

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.15824 [pdf, ps, other]

Non-Expanding Random walks on Homogeneous spaces and Diophantine approximation

Authors: Gaurav Aggarwal, Anish Ghosh

Abstract: We study non-expanding random walks on the space of affine lattices and establish a new classification theorem for stationary measures. Further, we prove a theorem that relates the genericity with respect to these random walks to Birkhoff genericity. Finally, we apply these theorems to obtain several results in inhomogeneous Diophantine approximation, especially on fractals.

Submitted 1 July, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

Comments: 33 Pages, Comments Welcome

MSC Class: 37A17; 11K60

arXiv:2406.14505 [pdf, other]

Dynamics of colloidal rods rotating in viscoelastic media

Authors: N Narinder, Jyotiprakash Behera, Ambarish Ghosh

Abstract: We experimentally investigate the in-plane rotational motion of ferromagnetic colloidal rods in viscoelastic media under a rotating magnetic field. Contrary to their rotation in a Newtonian fluid, in a viscoelastic fluid the rods continue to exhibit a net angular drift even for applied field frequencies that are an order of magnitude larger than the step-out frequency. Despite experimental evidence, previous studies failed to explain the observed behavior. This is due to the inherent assumption that the rods' angular velocity beyond step-out follows the same dependence on the applied field frequency as in Newtonian fluids. We demonstrate that the observed net rotation of rods after step-out originates from their interaction with the microstructural stress-relaxation processes within the viscoelastic fluid. Consequently, it exhibits a strong dependence on the rheological properties of the fluid. Our results are further supported by a minimal model which incorporates the memory-mediated response of the viscoelastic fluid on their motion. Furthermore, we demonstrate, both experimentally and numerically, that the observed effect represents a generic feature of viscoelastic media and is expected to manifest for elongated probes in other complex surroundings such as biological assays and colloidal glasses.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.09547

doi 10.1145/3637528.3671899

FLea: Addressing Data Scarcity and Label Skew in Federated Learning via Privacy-preserving Feature Augmentation

Authors: Tong Xia, Abhirup Ghosh, Xinchi Qiu, Cecilia Mascolo

Abstract: Federated Learning (FL) enables model development by leveraging data distributed across numerous edge devices without transferring local data to a central server. However, existing FL methods still face challenges when dealing with scarce and label-skewed data across devices, resulting in local model overfitting and drift, consequently hindering the performance of the global model. In response to these challenges, we propose a pioneering framework called FLea, incorporating the following key components: i) A global feature buffer that stores activation-target pairs shared from multiple clients to support local training. This design mitigates local model drift caused by the absence of certain classes; ii) A feature augmentation approach based on local and global activation mix-ups for local training. This strategy enlarges the training samples, thereby reducing the risk of local overfitting; iii) An obfuscation method to minimize the correlation between intermediate activations and the source data, enhancing the privacy of shared features. To verify the superiority of FLea, we conduct extensive experiments using a wide range of data modalities, simulating different levels of local data scarcity and label skew. The results demonstrate that FLea consistently outperforms state-of-the-art FL counterparts (in 13 of the 18 experimental settings, the improvement is over 5%) while concurrently mitigating the privacy vulnerabilities associated with shared features. Code is available at https://github.com/XTxiatong/FLea.git.

Submitted 18 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: This work was intended as a replacement of arXiv:2312.02327 and any subsequent updates will appear there

arXiv:2406.08549 [pdf, other]

Investigating Mutual Coupling in the Hydrogen Epoch of Reionization Array and Mitigating its Effects on the 21-cm Power Spectrum

Authors: E. Rath, R. Pascua, A. T. Josaitis, A. Ewall-Wice, N. Fagnoni, E. de Lera Acedo, Z. E. Martinot, Z. Abdurashidova, T. Adams, J. E. Aguirre, R. Baartman, A. P. Beardsley, L. M. Berkhout, G. Bernardi, T. S. Billings, J. D. Bowman, P. Bull, J. Burba, R. Byrne, S. Carey, K. -F. Chen, S. Choudhuri, T. Cox, D. R. DeBoer, M. Dexter, et al. (56 additional authors not shown)

Abstract: Interferometric experiments designed to detect the highly redshifted 21-cm signal from neutral hydrogen are producing increasingly stringent constraints on the 21-cm power spectrum, but some k-modes remain systematics-dominated. Mutual coupling is a major systematic that must be overcome in order to detect the 21-cm signal, and simulations that reproduce effects seen in the data can guide strategies for mitigating mutual coupling. In this paper, we analyse 12 nights of data from the Hydrogen Epoch of Reionization Array and compare the data against simulations that include a computationally efficient and physically motivated semi-analytic treatment of mutual coupling. We find that simulated coupling features qualitatively agree with coupling features in the data; however, coupling features in the data are brighter than the simulated features, indicating the presence of additional coupling mechanisms not captured by our model. We explore the use of fringe-rate filters as mutual coupling mitigation tools and use our simulations to investigate the effects of mutual coupling on a simulated cosmological 21-cm power spectrum in a "worst case" scenario where the foregrounds are particularly bright. We find that mutual coupling contaminates a large portion of the "EoR Window", and the contamination is several orders-of-magnitude larger than our simulated cosmic signal across a wide range of cosmological Fourier modes.
While our fiducial fringe-rate filtering strategy reduces mutual coupling by roughly a factor of 100 in power, a non-negligible amount of coupling cannot be excised with fringe-rate filters, so more sophisticated mitigation strategies are required.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 19 pages, 12 figures, submitted to MNRAS

arXiv:2406.07661 [pdf, other]

ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones

Authors: Anurag Ghosh, Robert Tamburo, Shen Zheng, Juan R. Alvarez-Padilla, Hailiang Zhu, Michael Cardei, Nicholas Dunn, Christoph Mertz, Srinivasa G. Narasimhan

Abstract: Perceiving and navigating through work zones is challenging and under-explored, even with major strides in self-driving research. An important reason is the lack of open datasets for developing new algorithms to address this long-tailed scenario. We propose the ROADWork dataset to learn how to recognize, observe, analyze and drive through work zones. We find that state-of-the-art foundation models perform poorly on work zones. With our dataset, we improve upon detecting work zone objects (+26.2 AP), while discovering work zones with higher precision (+32.5%) at a much higher discovery rate (12.8 times), significantly improve detecting (+23.9 AP) and reading (+14.2% 1-NED) work zone signs and describing work zones (+36.7 SPICE). We also compute drivable paths from work zone navigation videos and show that it is possible to predict navigational goals and pathways such that 53.6% of goals have angular error (AE) < 0.5 degrees (+9.9%) and 75.3% of pathways have AE < 0.5 degrees (+8.1%).

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.05288 [pdf, other]

Optimal Eye Surgeon: Finding Image Priors through Sparse Generators at Initialization

Authors: Avrajit Ghosh, Xitong Zhang, Kenneth K. Sun, Qing Qu, Saiprasad Ravishankar, Rongrong Wang

Abstract: We introduce Optimal Eye Surgeon (OES), a framework for pruning and training deep image generator networks. Typically, untrained deep convolutional networks, which include image sampling operations, serve as effective image priors (Ulyanov et al., 2018). However, they tend to overfit to noise in image restoration tasks due to being overparameterized. OES addresses this by adaptively pruning networks at random initialization to a level of underparameterization. This process effectively captures low-frequency image components even without training, by just masking. When trained to fit noisy images, these pruned subnetworks, which we term Sparse-DIP, resist overfitting to noise. This benefit arises from underparameterization and the regularization effect of masking, constraining them in the manifold of image priors. We demonstrate that subnetworks pruned through OES surpass other leading pruning methods, such as the Lottery Ticket Hypothesis, which is known to be suboptimal for image recovery tasks (Wu et al., 2023). Our extensive experiments demonstrate the transferability of OES-masks and the characteristics of sparse-subnetworks for image generation. Code is available at https://github.com/Avra98/Optimal-Eye-Surgeon.git.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Pruning image generator networks at initialization to alleviate overfitting

Journal ref: International Conference on Machine Learning (ICML 2024)

arXiv:2406.04672 [pdf, ps, other]

Partial semigroup partial dynamical systems and Partial Central Sets

Authors: H. Goodarzi, M. A. Tootkaboni, Arpita Ghosh

Abstract: H. Furstenberg defined central sets in $\mathbb{N}$ by using notions from topological dynamics; later, Bergelson and Hindman characterized central sets in $\mathbb{N}$, and also in an arbitrary semigroup, in terms of the algebra of the Stone-Čech compactification of that set. We state a new notion of large sets in a partial semigroup setting and characterize the algebraic structure of these sets by using the algebra of the Stone-Čech compactification. By using these notions, we introduce the Partial Semigroup Partial Dynamical System (PSPDS) and show that the topological dynamical characterization of central sets in a partial semigroup is equivalent to the usual algebraic characterization.

Submitted 7 June, 2024; originally announced June 2024.

MSC Class: 37B02; 22A15 (primary); 05D10 (secondary)

arXiv:2406.04231 [pdf, other]

Quantifying Misalignment Between Agents

Authors: Aidan Kierans, Avijit Ghosh, Hananel Hazan, Shiri Dori-Hacohen

Abstract: Growing concerns about the AI alignment problem have emerged in recent years, with previous work focusing mainly on (1) qualitative descriptions of the alignment problem; (2) attempting to align AI actions with human interests by focusing on value specification and learning; and/or (3) focusing on a single agent or on humanity as a singular unit. Recent work in sociotechnical AI alignment has made some progress in defining alignment inclusively, but the field as a whole still lacks a systematic understanding of how to specify, describe, and analyze misalignment among entities, which may include individual humans, AI agents, and complex compositional entities such as corporations, nation-states, and so forth. Previous work on controversy in computational social science offers a mathematical model of contention among populations (of humans). In this paper, we adapt this contention model to the alignment problem, and show how misalignment can vary depending on the population of agents (human or otherwise) being observed, the domain in question, and the agents' probability-weighted preferences between possible outcomes. Our model departs from value specification approaches and focuses instead on the morass of complex, interlocking, sometimes contradictory goals that agents may have in practice. We apply our model by analyzing several case studies ranging from social media moderation to autonomous vehicle behavior.
By applying our model with appropriately representative value data, AI engineers can ensure that their systems learn values maximally aligned with diverse human interests.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 10 pages, 2 figures, 4 tables, submitted to AIES-24

ACM Class: I.2.11; K.4.m

arXiv:2406.04146 [pdf, other]

Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness

Authors: Guangliang Liu, Milad Afshari, Xitong Zhang, Zhiyu Xue, Avrajit Ghosh, Bidhan Bashyal, Rongrong Wang, Kristen Johnson

Abstract: While task-agnostic debiasing provides notable generalizability and reduced reliance on downstream data, its impact on language modeling ability and the risk of relearning social biases from downstream task-specific data remain as the two most significant challenges when debiasing Pretrained Language Models (PLMs). The impact on language modeling ability can be alleviated given a high-quality and long-contextualized debiasing corpus, but there remains a deficiency in understanding the specifics of relearning biases. We empirically ascertain that the effectiveness of task-agnostic debiasing hinges on the quantitative bias level of both the task-specific data used for downstream applications and the debiased model. We empirically show that the lower bound of the bias level of the downstream fine-tuned model can be approximated by the bias level of the debiased model, in most practical cases. To gain more in-depth understanding about how the parameters of PLMs change during fine-tuning due to the forgetting issue of PLMs, we propose a novel framework which can Propagate Socially-fair Debiasing to Downstream Fine-tuning, ProSocialTuning. Our proposed framework can push the fine-tuned model to approach the bias lower bound during downstream fine-tuning, indicating that the ineffectiveness of debiasing can be alleviated by overcoming the forgetting issue through regularizing successfully debiased attention heads based on the PLMs' bias levels from stages of pretraining and debiasing.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.03939 [pdf, other]

Flux-density stability and temporal changes in spectra of millisecond pulsars using GMRT

Authors: Rahul Sharan, Bhaswati Bhattacharyaa, Sangita Kumari, Jayanta Roy, Ankita Ghosh

Abstract: This paper presents an investigation of spectral properties of 10 millisecond pulsars (MSPs) discovered by the uGMRT, observed from 2017-2023 using band 3 (300-500 MHz) and 4 (550-750 MHz) of uGMRT. For these MSPs, we have reported a range of spectral indices from ~0 to -4.8, while averaging the full observing band and all the observing epochs. For every MSP, we calculated the mean flux densities across 7-8 sub-bands each with approximately 25 MHz bandwidth spanning band 3 and band 4. We computed their modulation indices as well as average and maximum-to-median flux densities within each subband. Using a temporal variation of flux density we calculated the refractive scintillation time scales and estimated structure function with time lag for 8 MSPs in the sample. We note a significant temporal evolution of the in-band spectra, classified into three categories based on the nature of the best-fit power-law spectra, having single positive spectral indices, multiple broken power law, and single negative spectral indices. Additionally, indications of low-frequency turnover and a temporal variation of the turnover frequency (to the extent that turnover was observed for some of the epochs while not seen for the rest) were noted for all the MSPs. To the best of our knowledge, this is the first systematic investigation probing temporal changes in the MSP spectra as well as in turnover frequency. Future exploration with dense monitoring combined with modeling of spectra can provide vital insight into the intrinsic emission properties of the MSPs and ISM properties.
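
The spectral indices quoted above follow the usual power-law convention, in which flux density scales with frequency as $S_\nu \propto \nu^{\alpha}$. As a minimal sketch, a single in-band index could be estimated from sub-band flux densities with a least-squares fit in log-log space. The function name `fit_spectral_index`, the sub-band centres, and the flux values are all hypothetical illustrations, not data or code from the paper.

```python
import math

def fit_spectral_index(freqs_mhz, fluxes_mjy):
    """Least-squares fit of log10(S) = alpha * log10(nu) + c.

    Returns (alpha, c), where alpha is the spectral index.
    """
    xs = [math.log10(f) for f in freqs_mhz]
    ys = [math.log10(s) for s in fluxes_mjy]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    return alpha, mean_y - alpha * mean_x

# Hypothetical sub-band centres (MHz) across a 300-500 MHz band,
# with synthetic fluxes generated from a pure power law of index -1.5.
freqs = [312.5, 337.5, 362.5, 387.5, 412.5, 437.5, 462.5, 487.5]
fluxes = [2.0 * (f / 400.0) ** -1.5 for f in freqs]
alpha, intercept = fit_spectral_index(freqs, fluxes)
print(f"fitted spectral index: {alpha:.3f}")
```

A broken power law or a low-frequency turnover, as described in the abstract, would need a piecewise or curved model rather than this single straight-line fit.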

Submitted 6 June, 2024; originally announced June 2024.

Comments: 18 Pages, 8 Figures, 3 Tables. Accepted for Publication in ApJ

arXiv:2406.03864 [pdf, other]

PairNet: Training with Observed Pairs to Estimate Individual Treatment Effect

Authors: Lokesh Nagalapatti, Pranava Singhal, Avishek Ghosh, Sunita Sarawagi

Abstract: Given a dataset of individuals each described by a covariate vector, a treatment, and an observed outcome on the treatment, the goal of the individual treatment effect (ITE) estimation task is to predict outcome changes resulting from a change in treatment. A fundamental challenge is that in the observational data, a covariate's outcome is observed only under one treatment, whereas we need to infer the difference in outcomes under two different treatments. Several existing approaches address this issue through training with inferred pseudo-outcomes, but their success relies on the quality of these pseudo-outcomes. We propose PairNet, a novel ITE estimation training strategy that minimizes losses over pairs of examples based on their factual observed outcomes. Theoretical analysis for binary treatments reveals that PairNet is a consistent estimator of ITE risk, and achieves smaller generalization error than baseline models. Empirical comparison with thirteen existing methods across eight benchmarks, covering both discrete and continuous treatments, shows that PairNet achieves significantly lower ITE error compared to the baselines. Also, it is model-agnostic and easy to implement.

Submitted 6 June, 2024; originally announced June 2024.

Comments: Lokesh and Pranava contributed equally. Accepted at ICML-24

arXiv:2406.01973 [pdf, other]

Adaptive Relaxation based Non-Conservative Chance Constrained Stochastic MPC for Battery Scheduling Under Forecast Uncertainties

Authors: Avik Ghosh, Cristian Cortes-Aguirre, Yi-An Chen, Adil Khurram, Jan Kleissl

Abstract: Chance constrained stochastic model predictive controllers (CC-SMPC) trade off full constraint satisfaction for economical plant performance under uncertainty. Previous CC-SMPC works are over-conservative in constraint violations leading to worse economic performance. Other past works require a-priori information about the uncertainty set, limiting their application to real-world systems. This paper considers a discrete linear time invariant system with hard constraints on inputs and chance constraints on states, with unknown uncertainty distribution, statistics, or samples. This work proposes a novel adaptive online update rule to relax the state constraints based on the time-average of past constraint violations, for the SMPC to achieve reduced conservativeness in closed-loop. Under an ideal control policy assumption, it is proven that the time-average of constraint violations converges to the maximum allowed violation probability. The time-average of constraint violations is also proven to asymptotically converge even without the simplifying assumptions. The proposed method is applied to the optimal battery energy storage system (BESS) dispatch in a grid connected microgrid with PV generation and load demand with chance constraints on BESS state-of-charge (SOC). Realistic simulations show the superior electricity cost saving potential of the proposed method as compared to the traditional MPC (with hard constraints on BESS SOC), by satisfying the chance constraints non-conservatively in closed-loop, thereby effectively trading off increased cost savings with minimal adverse effects on BESS lifetime.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 16 pages, 2 figures

arXiv:2406.01149 [pdf, ps, other]

Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms

Authors: Avishek Ghosh, Arya Mazumdar

Abstract: Mixed linear regression is a well-studied problem in parametric statistics and machine learning. Given a set of samples, tuples of covariates and labels, the task of mixed linear regression is to find a small list of linear relationships that best fit the samples. Usually it is assumed that the label is generated stochastically by randomly selecting one of two or more linear functions, applying this chosen function to the covariates, and potentially introducing noise to the result. In that situation, the objective is to estimate the ground-truth linear functions up to some parameter error. The popular expectation maximization (EM) and alternating minimization (AM) algorithms have been previously analyzed for this. In this paper, we consider the more general problem of agnostic learning of mixed linear regression from samples, without such generative models. In particular, we show that the AM and EM algorithms, under standard conditions of separability and good initialization, lead to agnostic learning in mixed linear regression by converging to the population loss minimizers, for suitably defined loss functions. In some sense, this shows the strength of the AM and EM algorithms, which converge to "optimal solutions" even in the absence of realizable generative models.
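
The alternating minimization (AM) idea mentioned above can be illustrated in a few lines: alternately assign each sample to the linear function that fits it best, then refit each function by least squares on its assigned samples. The sketch below is a deliberately simplified toy, assuming scalar covariates, no intercept, noiseless labels, and a good initial guess. The function name `alternating_minimization` and the data are hypothetical, and the example covers the realizable case only, not the agnostic setting the paper analyzes.

```python
def alternating_minimization(xs, ys, thetas, n_iters=20):
    """AM for y ~ theta_j * x with scalar covariates and no intercept.

    `thetas` is the initial guess for the slopes; the refined slopes are returned.
    """
    thetas = list(thetas)
    for _ in range(n_iters):
        # Step 1: assign each sample to the slope with the smallest squared residual.
        groups = [[] for _ in thetas]
        for x, y in zip(xs, ys):
            j = min(range(len(thetas)), key=lambda k: (y - thetas[k] * x) ** 2)
            groups[j].append((x, y))
        # Step 2: refit each slope by least squares on its assigned samples.
        for j, pts in enumerate(groups):
            sxx = sum(x * x for x, _ in pts)
            if sxx > 0:  # keep the previous slope if no usable samples were assigned
                thetas[j] = sum(x * y for x, y in pts) / sxx
    return thetas

# Toy data: samples alternate between the lines y = 2x and y = -x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, -2.0, 6.0, -4.0, 10.0, -6.0]
slopes = alternating_minimization(xs, ys, thetas=[1.5, -0.5])
print(slopes)
```

With a poor initialization the assignment step can lock onto the wrong grouping, which is why the abstract's results are stated under separability and good-initialization conditions.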

Submitted 3 June, 2024; originally announced June 2024.

Comments: To appear in ICML 2024

arXiv:2406.00808 [pdf, other]

EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing

Authors: Hadrien Reynaud, Qingjie Meng, Mischa Dombrowski, Arijit Ghosh, Thomas Day, Alberto Gomez, Paul Leeson, Bernhard Kainz

Abstract: To make medical datasets accessible without sharing sensitive patient information, we introduce a novel end-to-end approach for generative de-identification of dynamic medical imaging data. Until now, generative methods have faced constraints in terms of fidelity, spatio-temporal coherence, and the length of generation, failing to capture the complete details of dataset distributions. We present a model designed to produce high-fidelity, long and complete data samples with near-real-time efficiency and explore our approach on a challenging task: generating echocardiogram videos. We develop our generation method based on diffusion models and introduce a protocol for medical video dataset anonymization. As an exemplar, we present EchoNet-Synthetic, a fully synthetic, privacy-compliant echocardiogram dataset with paired ejection fraction labels. As part of our de-identification protocol, we evaluate the quality of the generated dataset and propose to use clinical downstream tasks as a measurement on top of widely used but potentially biased image quality metrics. Experimental outcomes demonstrate that EchoNet-Synthetic achieves comparable dataset fidelity to the actual dataset, effectively supporting the ejection fraction regression task. Code, weights and dataset are available at https://github.com/HReynaud/EchoNet-Synthetic.

Submitted 2 June, 2024; originally announced June 2024.

Comments: Accepted at MICCAI 2024

arXiv:2406.00708 [pdf, other]

Les Houches 2023: Physics at TeV Colliders: Standard Model Working Group Report

Authors: J. Andersen, B. Assi, K. Asteriadis, P. Azzurri, G. Barone, A. Behring, A. Benecke, S. Bhattacharya, E. Bothmann, S. Caletti, X. Chen, M. Chiesa, A. Cooper-Sarkar, T. Cridge, A. Cueto Gomez, S. Datta, P. K. Dhani, M. Donega, T. Engel, S. Ferrario Ravasio, S. Forte, P. Francavilla, M. V. Garzelli, A. Ghira, A. Ghosh, et al. (59 additional authors not shown)

Abstract: This report presents a short summary of the activities of the "Standard Model" working group for the "Physics at TeV Colliders" workshop (Les Houches, France, 12-30 June, 2023).

Submitted 2 June, 2024; originally announced June 2024.

Comments: Proceedings of the Standard Model Working Group of the 2023 Les Houches Workshop, Physics at TeV Colliders, Les Houches 12-30 June 2023. 48 pages

Report number: DESY-24-076

arXiv:2405.20933 [pdf, ps, other]

Concentration Bounds for Optimized Certainty Equivalent Risk Estimation

Authors: Ayon Ghosh, L. A. Prashanth, Krishna Jagannathan

Abstract: We consider the problem of estimating the Optimized Certainty Equivalent (OCE) risk from independent and identically distributed (i.i.d.) samples. For the classic sample average approximation (SAA) of OCE, we derive mean-squared error as well as concentration bounds (assuming sub-Gaussianity). Further, we analyze an efficient stochastic approximation-based OCE estimator, and derive finite sample bounds for the same. To show the applicability of our bounds, we consider a risk-aware bandit problem, with OCE as the risk. For this problem, we derive a bound on the probability of mis-identification. Finally, we conduct numerical experiments to validate the theoretical findings.
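
To make the sample average approximation concrete, the sketch below assumes one common loss-based form of the OCE, $\inf_{\xi} \{\xi + \mathbb{E}[\phi(X - \xi)]\}$, and replaces the expectation with a sample mean. It uses the well-known special case $\phi(t) = \max(t, 0)/(1-\alpha)$, for which the OCE reduces to the conditional value-at-risk (CVaR) at level $\alpha$. The function names and the data are hypothetical, and the exact definition and estimator used in the paper may differ.

```python
def oce_saa_piecewise_linear(samples, phi):
    """SAA of inf_xi { xi + mean(phi(x - xi)) } for a hinge-type phi.

    For phi(t) = max(t, 0) / (1 - alpha) the objective is convex and piecewise
    linear in xi with kinks at the samples, so the minimum is attained at a
    sample point and a search over the samples is exact. This shortcut is an
    assumption tied to that phi and does not hold for general phi.
    """
    n = len(samples)
    return min(xi + sum(phi(x - xi) for x in samples) / n for xi in samples)

def cvar_phi(alpha):
    """Hinge function whose OCE is CVaR at level alpha."""
    return lambda t: max(t, 0.0) / (1.0 - alpha)

# Hypothetical losses 1, 2, ..., 10: CVaR at level 0.8 is the mean of the worst 20%.
losses = [float(i) for i in range(1, 11)]
cvar_80 = oce_saa_piecewise_linear(losses, cvar_phi(0.8))
print(f"SAA estimate of CVaR_0.8: {cvar_80:.4f}")
```

Here the two largest losses are 9 and 10, so the estimate agrees with their mean, 9.5, up to floating-point rounding.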

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.19274 [pdf, other]

A Hanani-Tutte Theorem for Cycles

Authors: Sutanoya Chakraborty, Arijit Ghosh

Abstract: Given a drawing $D$ of a graph $G$, we define the crossing number between any two cycles $C_{1}$ and $C_{2}$ in $D$ to be the number of crossings that involve at least one edge from each of $C_1$ and $C_2$ except the crossings between edges that are common to both cycles. We show that if the crossing number between every two cycles in $G$ is even in a drawing of $G$ on the plane, then there is a planar drawing of $G$. This result can be extended to arbitrary surfaces. We also establish an equivalence between our result and a fundamental result due to Cairns-Nikolayevsky and Pelsmajer-Schaefer-Štefankovič, about drawing graphs on surfaces, and derive the Loebl-Masbaum theorem from it.

Submitted 12 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: Included equivalence with an established result, and derived a previous theorem from the result

arXiv:2405.18024 [pdf, other]

Emergent inhomogeneity and non-locality in a graphene field-effect transistor on a near-parallel moire superlattice of transition metal dichalcogenides

Authors: Shaili Sett, Rahul Debnath, Arup Singha, Shinjan Mandal, Jyothsna K, Monika Bhakar, Kenji Watanabe, Takashi Taniguchi, Varun Raghunathan, Goutam Sheet, Manish Jain, Arindam Ghosh

Abstract: At near-parallel orientation, twisted bilayers of transition metal dichalcogenides exhibit inter-layer charge transfer-driven out-of-plane ferroelectricity that may lead to unique electronic device architectures. Here we report detailed electrical transport in a dual-gated graphene field-effect transistor placed on 3R stacked twisted bilayer of WSe2 at a twist angle of 2.1 degrees. We observe hysteretic transfer characteristics and an emergent charge inhomogeneity with multiple local Dirac points as the electric displacement field (D) is increased. Concomitantly, we also observe a strong non-local voltage signal at D = 0 V/nm that decreases rapidly with increasing D. A linear scaling of the non-local signal with longitudinal resistance suggests edge mode transport, which we attribute to the breaking of valley symmetry of the graphene channel due to the spatially fluctuating electric field from the moire domains of the underlying twisted WSe2. A quantitative analysis connecting the non-locality and channel inhomogeneity suggests emergence of finite-size domains in the graphene channel that modulate the charge and the valley currents simultaneously. This work underlines efficient control and impact of interfacial ferroelectricity that can trigger a new genre of devices for twistronic applications.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 16 pages, 15 figures

arXiv:2405.15445 [pdf, other]

Cracking of submerged beds

Authors: Satyanu Bhadra, Anit Sane, Akash Ghosh, Shankar Ghosh, Kirti Chandra Sahu

Abstract: We investigate the phenomena of crater formation and gas release caused by projectile impact on underwater beds, which occurs in many natural, geophysical, and industrial applications. The bed in our experiment is constructed of hydrophobic particles, which trap a substantial amount of air in its pores. In contrast to dry beds, the air-water interface in a submerged bed generates a granular skin that provides rigidity to the medium by producing skin over the bulk. The projectile's energy is used to reorganise the grains, which causes the skin to crack, allowing the trapped air to escape. The morphology of the craters as a function of impact energy in submerged beds exhibits different scaling laws than what is known for dry beds. This phenomenon is attributed to the contact line motion on the hydrophobic fractal-like surface of submerged grains. The volume of the gas released is a function of multiple factors, chiefly the velocity of the projectile, depth of the bed and depth of the water column.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 15 pages, 10 figures

arXiv:2405.14684 [pdf, other]

Engineering ultra-strong electron-phonon coupling and nonclassical electron transport in crystalline gold with nanoscale interfaces

Authors: Shreya Kumbhakar, Tuhin Kumar Maji, Binita Tongbram, Shinjan Mandal, Shri Hari Soundararaj, Banashree Debnath, T. Phanindra Sai, Manish Jain, H. R. Krishnamurthy, Anshu Pandey, Arindam Ghosh

Abstract: Electrical resistivity in good metals, particularly noble metals such as gold (Au), silver (Ag), or copper, increases linearly with temperature ($T$) for $T > Θ_{\mathrm{D}}$, where $Θ_{\mathrm{D}}$ is the Debye temperature. This is because the coupling ($λ$) between the electrons and the lattice vibrations, or phonons, in these metals is rather weak with $λ\sim 0.1-0.2$, and a perturbative analysis suffices to explain the $T$-linear electron-phonon scattering rate. In this work, we outline a new nanostructuring strategy of crystalline Au where this foundational concept of metallic transport breaks down. We show that by embedding a distributed network of ultra-small Ag nanoparticles (AgNPs) of radius $\sim1-2$ nm inside a crystalline Au shell, an unprecedented enhancement in the electron-phonon interaction, with $λ$ as high as $\approx 20$, can be achieved. This is over a hundred times that of bare Au or Ag, and ten times larger than any known metal. With increasing AgNP density, the electrical resistivity deviates from $T$-linearity, and approaches a saturation to the Mott-Ioffe-Regel scale $ρ_{\mathrm{MIR}}\sim h a /e^2$ for both disorder ($T\to 0$) and phonon ($T \gg Θ_{\mathrm{D}}$)-dependent components of resistivity (here, $a=0.3$ nm is the lattice constant of Au). This giant electron-phonon interaction, which we suggest arises from the Coulomb interaction-induced coupling of conduction electrons to the localized phonon modes at the buried Au-Ag hetero-interfaces, allows experimental access to a regime of nonclassical metallic transport that has never been probed before.
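
For a sense of scale, the Mott-Ioffe-Regel estimate quoted above can be evaluated directly. The worked value below is a back-of-the-envelope sketch that assumes the standard value $h/e^2 \approx 25.8\ \mathrm{k\Omega}$ together with the lattice constant stated in the abstract; it is not a number reported in the paper.

```latex
\rho_{\mathrm{MIR}} \sim \frac{h}{e^{2}}\, a
  \approx \left(2.58 \times 10^{4}\ \Omega\right) \times \left(0.3 \times 10^{-9}\ \mathrm{m}\right)
  \approx 7.7 \times 10^{-6}\ \Omega\,\mathrm{m}
  \approx 770\ \mu\Omega\,\mathrm{cm}
```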

Submitted 23 May, 2024; originally announced May 2024.

Comments: 7+12 pages, total 4+8 figures

arXiv:2405.12835 [pdf, ps, other]

$SU(2)$-bundles over highly connected $8$-manifolds

Authors: Samik Basu, Aloke Kr. Ghosh, Subhankar Sau

Abstract: In this paper, we analyze the possible homotopy types of the total space of a principal $SU(2)$-bundle over a $3$-connected $8$-dimensional Poincaré duality complex. Along the way, we also classify the $3$-connected $11$-dimensional complexes $E$ formed from a wedge of $S^4$ and $S^7$ by attaching an $11$-cell.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 20 pages. We welcome any comments

MSC Class: Primary: 55R25; 57P10; Secondary: 57R19; 55P35

arXiv:2405.10167 [pdf, ps, other]

Near Uniform Triangle Sampling Over Adjacency List Graph Streams

Authors: Arijit Bishnu, Arijit Ghosh, Gopinath Mishra, Sayantan Sen

Abstract: Triangle counting and sampling are two fundamental problems for streaming algorithms. Arguably, designing sampling algorithms is more challenging than their counting variants. It may be noted that triangle counting has received far greater attention in the literature than the sampling variant. In this work, we consider the problem of approximately sampling triangles in different models of streaming with the focus being on the adjacency list model. In this problem, the edges of a graph $G$ will arrive over a data stream. The goal is to design efficient streaming algorithms that can sample and output a triangle from a distribution, over the triangles in $G$, that is close to the uniform distribution over the triangles in $G$. The distance between distributions is measured in terms of $\ell_1$-distance. The main technical contribution of this paper is to design algorithms for this triangle sampling problem in the adjacency list model with the space complexities matching their counting variants. For the sake of completeness, we also show results on the vertex and edge arrival models.

Submitted 16 May, 2024; originally announced May 2024.

Comments: 26 pages

arXiv:2405.09589 [pdf, other]

Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Survey

Authors: Pranab Sahoo, Prabhash Meharia, Akash Ghosh, Sriparna Saha, Vinija Jain, Aman Chadha

Abstract: The rapid advancement of foundation models (FMs) across language, image, audio, and video domains has shown remarkable capabilities in diverse tasks. However, the proliferation of FMs brings forth a critical challenge: the potential to generate hallucinated outputs, particularly in high-stakes applications. The tendency of foundation models to produce hallucinated content arguably represents the biggest hindrance to their widespread adoption in real-world scenarios, especially in domains where reliability and accuracy are paramount. This survey paper presents a comprehensive overview of recent developments that aim to identify and mitigate the problem of hallucination in FMs, spanning text, image, video, and audio modalities. By synthesizing recent advancements in detecting and mitigating hallucination across various modalities, the paper aims to provide valuable insights for researchers, developers, and practitioners. Essentially, it establishes a clear framework encompassing definition, taxonomy, and detection strategies for addressing hallucination in multimodal foundation models, laying the foundation for future research in this pivotal area.

Submitted 20 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.06924 [pdf, other]

Revolutionizing Quantum Mechanics: The Birth and Evolution of the Many-Worlds Interpretation

Authors: Arnub Ghosh

Abstract: The Many-worlds Interpretation (MWI) of quantum mechanics has captivated physicists and philosophers alike since its inception in the mid-20th century. This paper explores the historical roots, evolution, and implications of the MWI within the context of quantum theory. Beginning with an overview of early developments in quantum mechanics and the emergence of foundational interpretations, we delve into the origins of the MWI through the groundbreaking work of physicist Hugh Everett III. Everett's doctoral thesis proposed a radical solution to the measurement problem, positing the existence of multiple branching universes to account for quantum phenomena. We trace the evolution of the MWI, examining its refinement and elaboration by subsequent physicists such as John Wheeler. Furthermore, we discuss the MWI's impact on contemporary physics, including its connections to quantum information theory and ongoing experimental tests. By providing a comprehensive analysis of the MWI's historical development and current relevance, this paper offers insights into one of the most provocative interpretations of quantum mechanics and its implications for our understanding of the universe.

Submitted 11 May, 2024; originally announced May 2024.

Comments: The article is currently being peer-reviewed at the journal: European Physical Journal H

arXiv:2405.06671 [pdf, other]

Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling

Authors: Subhendu Khatuya, Rajdeep Mukherjee, Akash Ghosh, Manjunath Hegde, Koustuv Dasgupta, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal

Abstract: We study the problem of automatically annotating relevant numerals (GAAP metrics) occurring in financial documents with their corresponding XBRL tags. Different from prior works, we investigate the feasibility of solving this extreme classification problem using a generative paradigm through instruction tuning of Large Language Models (LLMs). To this end, we leverage metric metadata information to frame our target outputs while proposing a parameter efficient solution for the task using LoRA. We perform experiments on two recently released financial numeric labeling datasets. Our proposed model, FLAN-FinXC, achieves new state-of-the-art performances on both the datasets, outperforming several strong baselines. We explain the better scores of our proposed model by demonstrating its capability for zero-shot as well as the least frequently occurring tags. Also, even when we fail to predict the XBRL tags correctly, our generated output has substantial overlap with the ground-truth in the majority of cases.

Submitted 15 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

Comments: This work has been accepted to appear at North American Chapter of the Association for Computational Linguistics (NAACL), 2024

arXiv:2405.05884 [pdf, other]

A meta inspiral-merger-ringdown consistency test of general relativity with gravitational wave signals from compact binaries

Authors: Sakshi Satish Madekar, Nathan K Johnson-McDaniel, Anuradha Gupta, Abhirup Ghosh

Abstract: The observation of gravitational waves from compact binary coalescences is a promising tool to test the validity of general relativity (GR) in a highly dynamical strong-field regime. There are now a variety of tests of GR performed on the observed compact binary signals. In this paper, we propose a new test of GR that compares the results of these individual tests. This meta inspiral-merger-ringdown consistency test (IMRCT) involves inferring the final mass and spin of the remnant black hole obtained from the analyses of two different tests of GR and checking for consistency. If there is a deviation from GR, we expect that different tests of GR will recover different values for the final mass and spin, in general. We check the performance of the meta IMRCT using a standard set of null tests used in various gravitational-wave analyses: the original IMRCT, parameterized phasing tests (TIGER and FTI) and the modified dispersion test. However, the meta IMRCT is applicable to any tests of GR that infer the initial masses and spins or the final mass and spin, including ones that are applied to binary neutron star or neutron star--black hole signals. We apply the meta IMRCT to simulated quasi-circular GR and non-GR binary black hole (BBH) signals as well as to eccentric BBH signals in GR (analyzed with quasicircular waveforms). We find that the meta IMRCT gives consistency with GR for the quasi-circular GR signals and picks up a deviation from GR in the other cases, as do other tests. In some cases, the meta IMRCT finds a significant GR deviation for a given pair of tests (and specific testing parameters) while the individual tests do not, showing that it is more sensitive than the individual tests to certain types of deviations. In addition, we also apply this test to a few selected real compact binary signals and find them consistent with GR.

Submitted 9 May, 2024; originally announced May 2024.

Comments: 27 pages, 30 figures

arXiv:2405.02173 [pdf, other]

Task Synthesis for Elementary Visual Programming in XLogoOnline Environment

Authors: Chao Wen, Ahana Ghosh, Jacqueline Staub, Adish Singla

Abstract: In recent years, the XLogoOnline programming platform has gained popularity among novice learners. It integrates the Logo programming language with visual programming, providing a visual interface for learning computing concepts. However, XLogoOnline offers only a limited set of tasks, which are inadequate for learners to master the computing concepts that require sufficient practice. To address this, we introduce XLogoSyn, a novel technique for synthesizing high-quality tasks for varying difficulty levels. Given a reference task, XLogoSyn can generate practice tasks at varying difficulty levels that cater to the varied needs and abilities of different learners. XLogoSyn achieves this by combining symbolic execution and constraint satisfaction techniques. Our expert study demonstrates the effectiveness of XLogoSyn. We have also deployed synthesized practice tasks into XLogoOnline, highlighting the educational benefits of these synthesized practice tasks.

Submitted 3 May, 2024; originally announced May 2024.

Comments: Accepted as a paper at the AIED'24 conference in the late-breaking results track

arXiv:2405.01295 [pdf, other]

Thermodynamic theory of inverse current in coupled quantum transport

Authors: Shuvadip Ghosh, Nikhil Gupt, Arnab Ghosh

Abstract: The inverse current in coupled (ICC) quantum transport, where one induced current opposes all thermodynamic forces of a system, is a highly counter-intuitive transport phenomenon. Using an exactly solvable model of strongly-coupled quantum dots, we present a thermodynamic description of ICC in energy and spin-induced particle currents, with potential applications towards unconventional and autonomous nanoscale thermoelectric generators. Our analysis reveals the connection between microscopic and macroscopic formulations of entropy production rates, elucidating the often-overlooked role of proper thermodynamic forces and conjugate fluxes in characterizing genuine ICC. In our model, the seemingly paradoxical results of ICC in the energy current arise from chemical work done by current-carrying quantum particles, while in spin-induced particle current, it stems from the relative competition between electron reservoirs controlling one particular transition.

Submitted 2 May, 2024; originally announced May 2024.

Comments: 18 pages, 6 figures

arXiv:2404.16156 [pdf, other]

Guardians of the Quantum GAN

Authors: Archisman Ghosh, Debarshi Kundu, Avimita Chatterjee, Swaroop Ghosh

Abstract: Quantum Generative Adversarial Networks (qGANs) are at the forefront of image-generating quantum machine learning models. To accommodate the growing demand for Noisy Intermediate-Scale Quantum (NISQ) devices to train and infer quantum machine learning models, the number of third-party vendors offering quantum hardware as a service is expected to rise. This expansion introduces the risk of untrusted vendors potentially stealing proprietary information from the quantum machine learning models. To address this concern we propose a novel watermarking technique that exploits the noise signature embedded during the training phase of qGANs as a non-invasive watermark. The watermark is identifiable in the images generated by the qGAN allowing us to trace the specific quantum hardware used during training hence providing strong proof of ownership. To further enhance the security robustness, we propose the training of qGANs on a sequence of multiple quantum hardware, embedding a complex watermark comprising the noise signatures of all the training hardware that is difficult for adversaries to replicate. We also develop a machine learning classifier to extract this watermark robustly, thereby identifying the training hardware (or the suite of hardware) from the images generated by the qGAN validating the authenticity of the model. We note that the watermark signature is robust against inferencing on hardware different than the hardware that was used for training. We obtain watermark extraction accuracy of 100% and ~90% for training the qGAN on individual and multiple quantum hardware setups (and inferencing on different hardware), respectively. Since parameter evolution during training is strongly modulated by quantum noise, the proposed watermark can be extended to other quantum machine learning models as well.

Submitted 15 May, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

Comments: 11 pages, 10 figures

arXiv:2404.15905 [pdf, ps, other]

Statistically characterized subgroups related to some non-arithmetic sequence of integers

Authors: Pratulananda Das, Ayan Ghosh

Abstract: Recently in [15], characterized subgroups were investigated for some special kinds of non-arithmetic sequences. In this note we study subsequent problems in the case of "statistically characterized subgroups" introduced in [18]. The whole investigation again reiterates that these statistically characterized subgroups behave in a much different manner compared to classical characterized subgroups, which resolves an open question raised in [18]. After that we provide the first ever example of a statistically characterized subgroup which is countable, resolving many open problems from the literature.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.14332 [pdf, other]

Full Event Particle-Level Unfolding with Variable-Length Latent Variational Diffusion

Authors: Alexander Shmakov, Kevin Greif, Michael James Fenton, Aishik Ghosh, Pierre Baldi, Daniel Whiteson

Abstract: The measurements performed by particle physics experiments must account for the imperfect response of the detectors used to observe the interactions. One approach, unfolding, statistically adjusts the experimental data for detector effects. Recently, generative machine learning models have shown promise for performing unbinned unfolding in a high number of dimensions. However, all current generative approaches are limited to unfolding a fixed set of observables, making them unable to perform full-event unfolding in the variable dimensional environment of collider data. A novel modification to the variational latent diffusion model (VLD) approach to generative unfolding is presented, which allows for unfolding of high- and variable-dimensional feature spaces. The performance of this method is evaluated in the context of semi-leptonic top quark pair production at the Large Hadron Collider.

Submitted 22 April, 2024; originally announced April 2024.

Comments: Submission to SciPost

arXiv:2404.10875 [pdf, other]

A Dataset for Large Language Model-Driven AI Accelerator Generation

Authors: Mahmoud Nazzal, Deepak Vungarala, Mehrdad Morsali, Chao Zhang, Arnob Ghosh, Abdallah Khreishah, Shaahin Angizi

Abstract: In the ever-evolving landscape of Deep Neural Networks (DNN) hardware acceleration, unlocking the true potential of systolic array accelerators has long been hindered by the daunting challenges of expertise and time investment. Large Language Models (LLMs) offer a promising solution for automating code generation which is key to unlocking unprecedented efficiency and performance in various domains, including hardware descriptive code. However, the successful application of LLMs to hardware accelerator design is contingent upon the availability of specialized datasets tailored for this purpose. To bridge this gap, we introduce the Systolic Array-based Accelerator DataSet (SA-DS). SA-DS comprises a diverse collection of spatial arrays following the standardized Berkeley's Gemmini accelerator generator template, enabling design reuse, adaptation, and customization. SA-DS is intended to spark LLM-centred research on DNN hardware accelerator architecture. We envision that SA-DS provides a framework which will shape the course of DNN hardware acceleration research for generations to come. SA-DS is open-sourced under the permissive MIT license at https://github.com/ACADLab/SA-DS.

Submitted 16 April, 2024; originally announced April 2024.

Comments: 4 pages, 4 figures

arXiv:2404.09361 [pdf, other]

Initial results of our spectro-photometric monitoring of XZ Tau

Authors: Arpan Ghosh, Saurabh Sharma, Joe Philip Ninan, Devendra K. Ojha, Aayushi Verma, Tarak Chand Sahu, Rakesh Pandey, Koshvendra Singh

Abstract: We present here initial results of our spectro-photometric monitoring of XZ Tau. During our monitoring period, XZ Tau exhibited several episodes of brightness variations on timescales of months at optical wavelengths in contrast to the mid-infrared wavelengths. The color evolution of XZ Tau during this period suggests that the brightness variations are driven by changes in accretion from the disc. The mid-infrared light curve shows an overall decline in brightness by $\sim$ 0.5 and 0.7 magnitude respectively in WISE W1 (3.4 $μ$m) and W2 (4.6 $μ$m) bands. The emission profile of the hydrogen recombination lines along with that of Ca II IRT lines points towards magnetospheric accretion of XZ Tau. We have detected a P Cygni profile in H$β$ indicative of outflowing winds from regions close to accretion. Forbidden transitions of oxygen are also detected, likely indicative of jets originating around the central pre-main sequence star.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 11 pages, 3 figures, Accepted for publication in The Bulletin de la Société Royale des Sciences de Liège

arXiv:2404.08737 [pdf, other]

Potential of quantum scientific machine learning applied to weather modelling

Authors: Ben Jaderberg, Antonio A. Gentile, Atiyo Ghosh, Vincent E. Elfving, Caitlin Jones, Davide Vodola, John Manobianco, Horst Weiss

Abstract: In this work we explore how quantum scientific machine learning can be used to tackle the challenge of weather modelling. Using parameterised quantum circuits as machine learning models, we consider two paradigms: supervised learning from weather data and physics-informed solving of the underlying equations of atmospheric dynamics. In the first case, we demonstrate how a quantum model can be trained to accurately reproduce real-world global stream function dynamics at a resolution of 4°. We detail a number of problem-specific classical and quantum architecture choices used to achieve this result. Subsequently, we introduce the barotropic vorticity equation (BVE) as our model of the atmosphere, which is a $3^{\text{rd}}$ order partial differential equation (PDE) in its stream function formulation. Using the differentiable quantum circuits algorithm, we successfully solve the BVE under appropriate boundary conditions and use the trained model to predict unseen future dynamics to high accuracy given an artificial initial weather state. Whilst challenges remain, our results mark an advancement in terms of the complexity of PDEs solved with quantum scientific machine learning.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 16 pages, 5 figures

arXiv:2404.07214 [pdf, other]

Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions

Authors: Akash Ghosh, Arkadeep Acharya, Sriparna Saha, Vinija Jain, Aman Chadha

Abstract: The advent of Large Language Models (LLMs) has significantly reshaped the trajectory of the AI revolution. Nevertheless, these LLMs exhibit a notable limitation, as they are primarily adept at processing textual information. To address this constraint, researchers have endeavored to integrate visual capabilities with LLMs, resulting in the emergence of Vision-Language Models (VLMs). These advanced models are instrumental in tackling more intricate tasks such as image captioning and visual question answering. In our comprehensive survey paper, we delve into the key advancements within the realm of VLMs. Our classification organizes VLMs into three distinct categories: models dedicated to vision-language understanding, models that process multimodal inputs to generate unimodal (textual) outputs and models that both accept and produce multimodal inputs and outputs. This classification is based on their respective capabilities and functionalities in processing and generating various modalities of data. We meticulously dissect each model, offering an extensive analysis of its foundational architecture, training data sources, as well as its strengths and limitations wherever possible, providing readers with a comprehensive understanding of its essential components. We also analyzed the performance of VLMs in various benchmark datasets. By doing so, we aim to offer a nuanced understanding of the diverse landscape of VLMs. Additionally, we underscore potential avenues for future research in this dynamic domain, anticipating further breakthroughs and advancements.

Submitted 12 April, 2024; v1 submitted 20 February, 2024; originally announced April 2024.

Comments: The most extensive and up to date Survey on Visual Language Models covering 76 Visual Language Models

arXiv:2404.05646 [pdf]

Linear and Nonlinear Coupling of Twin-Resonators with Kerr Nonlinearity

Authors: Arghadeep Pal, Alekhya Ghosh, Shuangyou Zhang, Lewis Hill, Haochen Yan, Hao Zhang, Toby Bi, Abdullah Alabbadi, Pascal Del'Haye

Abstract: Nonlinear effects in microresonators are efficient building blocks for all-optical computing and telecom systems. With the latest advances in microfabrication, coupled microresonators are used in a rapidly growing number of applications. In this work, we investigate the coupling between twin-resonators in the presence of Kerr-nonlinearity. We use an experimental setup with controllable coupling between two high-Q resonators and discuss the effects caused by the simultaneous presence of linear and non-linear coupling between the optical fields. Linear-coupling-induced mode splitting is observed at low input powers, with the controllable coupling leading to a tunable mode splitting. At high input powers, the hybridized resonances show spontaneous symmetry breaking (SSB) effects, in which the optical power is unevenly distributed between the resonators. Our experimental results are supported by a detailed theoretical model of nonlinear twin-resonators. With the recent interest in coupled resonator systems for neuromorphic computing, quantum systems, and optical frequency comb generation, our work provides important insights into the behavior of these systems at high circulating powers.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 7 pages, 5 figures

arXiv:2404.05420 [pdf, other]

Accretion Funnel Reconfiguration during an Outburst in a Young Stellar Object: EX Lupi

Authors: Koshvendra Singh, Joe P. Ninan, Marina M. Romanova, David A. H. Buckley, Devendra K. Ojha, Arpan Ghosh, Andrew Monson, Malte Schramm, Saurabh Sharma, Daniel E. Reichart, Joanna Mikolajewska, Juan Carlos Beamin, J. Borissova, Valentin D. Ivanov, Vladimir V. Kouprianov, Franz-Josef Hambsch, Andrew Pearce

Abstract: EX Lupi, a low-mass young stellar object, went into an accretion-driven outburst in March of 2022. The outburst caused a sudden phase change of ~ 112$^{\circ}$ $\pm$ 5$^{\circ}$ in periodically oscillating multiband lightcurves. Our high resolution spectra obtained with HRS on SALT also revealed a consistent phase change in the periodically varying radial velocities, along with an increase in the radial velocity amplitude of various emission lines. The phase change and increase of radial velocity amplitude morphologically translate to a change in the azimuthal and latitudinal location of the accretion hotspot over the stellar surface, which indicates a reconfiguration of the accretion funnel geometry. Our 3D MHD simulations reproduce the phase change for EX Lupi. To explain the observations we explored the possibility of forward shifting of the dipolar accretion funnel as well as the possibility of an emergence of a new accretion funnel. During the outburst, we also found evidence of the hotspot's morphology extending azimuthally, asymmetrically with a leading hot edge and cold tail along the stellar rotation. Our high cadence photometry showed that the accretion flow has clumps. We also detected possible clumpy accretion events in the HRS spectra, which showed episodically highly blue-shifted wings in the Ca II IRT and Balmer H lines.

Submitted 8 April, 2024; originally announced April 2024.

Comments: Accepted for publication in The Astrophysical Journal

arXiv:2404.04502 [pdf, ps, other]

The interplay between additive and symmetric large sets and their combinatorial applications

Authors: Arkabrata Ghosh, Sayan Goswami, Sourav Kanti Patra

Abstract: The study of symmetric structures is a new trend in Ramsey theory. Recently in [7], Di Nasso initiated a systematic study of symmetrization of classical Ramsey theoretical results, and proved a symmetric version of several Ramsey theoretic results. In this paper Di Nasso asked if his method could be adapted to find new non-linear Diophantine equations that are partition regular [7, Final remarks (4)]. By analyzing additive, multiplicative, and symmetric large sets, we construct new partition regular equations that give a first affirmative answer to this question. A special case of our result shows that if $P$ is a polynomial with no constant term then the equation $x+P(y-x)=z+w+zw$, where $y\neq x$, is partition regular. Also we prove several new monochromatic patterns involving additive, multiplicative, and symmetric structures. Throughout our work, we use tools from the Algebra of the Stone-Čech Compactifications of discrete semigroups.

Submitted 17 April, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

Comments: We corrected the mistake in the previous draft. Comments welcome

arXiv:2404.04224 [pdf, other]

Active Causal Learning for Decoding Chemical Complexities with Targeted Interventions

Authors: Zachary R. Fox, Ayana Ghosh

Abstract: Predicting and enhancing inherent properties based on molecular structures is paramount to design tasks in medicine, materials science, and environmental management. Most of the current machine learning and deep learning approaches have become standard for predictions, but they face challenges when applied across different datasets due to reliance on correlations between molecular representation and target properties. These approaches typically depend on large datasets to capture the diversity within the chemical space, facilitating a more accurate approximation, interpolation, or extrapolation of the chemical behavior of molecules. In our research, we introduce an active learning approach that discerns underlying cause-effect relationships through strategic sampling with the use of a graph loss function. This method identifies the smallest subset of the dataset capable of encoding the most information representative of a much larger chemical space. The identified causal relations are then leveraged to conduct systematic interventions, optimizing the design task within a chemical space that the models have not encountered previously. While our implementation focused on the QM9 quantum-chemical dataset for a specific design task (finding molecules with a large dipole moment), our active causal learning approach, driven by intelligent sampling and interventions, holds potential for broader applications in molecular and materials design and discovery.

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2404.04125 [pdf, other]

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance

Authors: Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H. S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge

Abstract: Web-crawled pretraining datasets underlie the impressive "zero-shot" evaluation performance of multimodal models, such as CLIP for classification/retrieval and Stable-Diffusion for image generation. However, it is unclear how meaningful the notion of "zero-shot" generalization is for such multimodal models, as it is not known to what extent their pretraining datasets encompass the downstream concepts targeted during "zero-shot" evaluation. In this work, we ask: How is the performance of multimodal models on downstream concepts influenced by the frequency of these concepts in their pretraining datasets? We comprehensively investigate this question across 34 models and five standard pretraining datasets (CC-3M, CC-12M, YFCC-15M, LAION-400M, LAION-Aesthetics), generating over 300GB of data artifacts. We consistently find that, far from exhibiting "zero-shot" generalization, multimodal models require exponentially more data to achieve linear improvements in downstream "zero-shot" performance, following a sample-inefficient log-linear scaling trend. This trend persists even when controlling for sample-level similarity between pretraining and downstream datasets, and testing on purely synthetic data distributions. Furthermore, upon benchmarking models on long-tailed data sampled based on our analysis, we demonstrate that multimodal models across the board perform poorly. We contribute this long-tail test set as the "Let it Wag!" benchmark to further research in this direction. Taken together, our study reveals an exponential need for training data which implies that the key to "zero-shot" generalization capabilities under large-scale training paradigms remains to be found.

Submitted 8 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: Extended version of the short paper accepted at DPFM, ICLR'24

arXiv:2404.03404 [pdf, other]

Robust inference for linear regression models with possibly skewed error distribution

Authors: Amarnath Nandy, Ayanendranath Basu, Abhik Ghosh

Abstract: Traditional methods for linear regression generally assume that the underlying error distribution, equivalently the distribution of the responses, is normal. Yet, sometimes real life response data may exhibit a skewed pattern, and assuming normality would not give reliable results in such cases. This is often observed in cases of some biomedical, behavioral, socio-economic and other variables. In this paper, we propose to use the class of skew normal (SN) distributions, which also includes the ordinary normal distribution as its special case, as the model for the errors in a linear regression setup and perform subsequent statistical inference using the popular and robust minimum density power divergence approach to get stable insights in the presence of possible data contamination (e.g., outliers). We provide the asymptotic distribution of the proposed estimator of the regression parameters and also propose robust Wald-type tests of significance for these parameters. We provide an influence function analysis of these estimators and test statistics, and also provide level and power influence functions. Numerical verification including simulation studies and real data analysis is provided to substantiate the theory developed.

Submitted 4 April, 2024; originally announced April 2024.

Comments: Pre-print; under review

arXiv:2404.01285 [pdf, ps, other]

Weak-coupling limits of the quantum Langevin equation for an oscillator

Authors: Aritra Ghosh, Sushanta Dattagupta

Abstract: The quantum Langevin equation as obtained from the independent-oscillator model describes a strong-coupling situation, devoid of the Born-Markov approximation that is employed in the context of the Gorini-Kossakowski-Sudarshan-Lindblad equation. The question we address is what happens when we implement such 'Born-Markov'-like approximations at the level of the quantum Langevin equation for a harmonic oscillator which carries a noise term satisfying a fluctuation-dissipation theorem. In this backdrop, we also comment on the rotating-wave approximation.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2404.00401 [pdf, other]

How Robust are the Tabular QA Models for Scientific Tables? A Study using Customized Dataset

Authors: Akash Ghosh, B Venkata Sahith, Niloy Ganguly, Pawan Goyal, Mayank Singh

Abstract: Question-answering (QA) on hybrid scientific tabular and textual data deals with scientific information, and relies on complex numerical reasoning. In recent years, while tabular QA has seen rapid progress, an understanding of the robustness of tabular QA models on scientific information is lacking due to the absence of any benchmark dataset. To investigate the robustness of the existing state-of-the-art QA models on scientific hybrid tabular data, we propose a new dataset, "SciTabQA", consisting of 822 question-answer pairs from scientific tables and their descriptions. With the help of this dataset, we assess the state-of-the-art Tabular QA models based on their ability (i) to use heterogeneous information requiring both structured data (table) and unstructured data (text) and (ii) to perform complex scientific reasoning tasks. In essence, we check the capability of the models to interpret scientific tables and text. Our experiments show that "SciTabQA" is an innovative dataset to study question-answering over scientific heterogeneous data. We benchmark three state-of-the-art Tabular QA models, and find that the best F1 score is only 0.462.

Submitted 30 March, 2024; originally announced April 2024.

arXiv:2403.20160 [pdf, other]

Stripe 82X Data Release 3: Multiwavelength Catalog with New Spectroscopic Redshifts and Black Hole Masses

Authors: Stephanie M. LaMassa, Alessandro Peca, C. Megan Urry, Eilat Glikman, Tonima Tasnim Ananna, Connor Auge, Francesca Civano, Aritra Ghosh, Allison Kirkpatrick, Michael J. Koss, Meredith Powell, Mara Salvato, Benny Trakhtenbrot

Abstract: We present the third catalog release of the wide-area (31.3 deg$^2$) Stripe 82 X-ray survey. This catalog combines previously published X-ray source properties with multiwavelength counterparts and photometric redshifts, presents 343 new spectroscopic redshifts, and provides black hole masses for 1396 Type 1 Active Galactic Nuclei (AGN). With spectroscopic redshifts for 3457 out of 6181 Stripe 82X sources, the survey has a spectroscopic completeness of 56%. This completeness rises to 90% when considering the contiguous portions of the Stripe 82X survey with homogeneous X-ray coverage at an optical magnitude limit of $r<22$. Within that portion of the survey, 23% of AGN can be considered obscured by being either a Type 2 AGN, reddened ($R-K > 4$, Vega), or X-ray obscured with a column density $N_{\rm H} > 10^{22}$ cm$^{-2}$. Unlike other surveys, there is only an 18% overlap between Type 2 and X-ray obscured AGN. We calculated black hole masses for Type 1 AGN that have SDSS spectra using virial mass estimators calibrated on the H$β$, MgII, H$α$, and CIV emission lines. We find systematic differences in these black hole mass estimates, indicating that statistical analyses should use black hole masses calculated from the same formula to minimize systematic bias. We find that the highest luminosity AGN are accreting at the highest Eddington ratios, consistent with the picture that most mass accretion happens in the phase when the AGN is luminous ($L_{\rm 2-10 keV} > 10^{45}$ erg s$^{-1}$).

Submitted 29 March, 2024; originally announced March 2024.

Comments: 32 pages, 11 figures, submitted to AAS journals

Showing 1–50 of 1,658 results for author: Ghosh, A