-
CuSINeS: Curriculum-driven Structure Induced Negative Sampling for Statutory Article Retrieval
Authors:
T. Y. S. S Santosh,
Kristina Kaiser,
Matthias Grabmair
Abstract:
In this paper, we introduce CuSINeS, a negative sampling approach to enhance the performance of Statutory Article Retrieval (SAR). CuSINeS offers three key contributions. Firstly, it employs a curriculum-based negative sampling strategy, guiding the model to focus on easier negatives initially and progressively tackle more difficult ones. Secondly, it leverages the hierarchical and sequential information derived from the structural organization of statutes to evaluate the difficulty of samples. Lastly, it introduces a dynamic semantic difficulty assessment that uses the model being trained itself, surpassing conventional static methods like BM25 by adapting the negatives to the model's evolving competence. Experimental results on a real-world expert-annotated SAR dataset validate the effectiveness of CuSINeS across four different baselines, demonstrating its versatility.
Submitted 31 March, 2024;
originally announced April 2024.
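The curriculum idea in the CuSINeS abstract can be illustrated with a minimal sketch. It assumes each candidate negative already carries a difficulty score in [0, 1]; the linear pacing function, the `start` value, and the names `competence` and `sample_negatives` are hypothetical choices for illustration, not the authors' implementation.

```python
import random

def competence(step, total_steps, start=0.2):
    """Linear pacing (an assumption): share of the difficulty range unlocked."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (1.0 - start) * progress

def sample_negatives(candidates, step, total_steps, k, rng=random):
    """Draw k negatives whose difficulty is within the current competence.

    candidates: list of (article_id, difficulty) pairs, difficulty in [0, 1].
    """
    limit = competence(step, total_steps)
    # Sort by difficulty so the fallback below prefers the easiest candidates.
    ranked = sorted(candidates, key=lambda c: c[1])
    allowed = [c for c in ranked if c[1] <= limit]
    if len(allowed) < k:
        # Too few easy negatives unlocked yet: fall back to the k easiest.
        allowed = ranked[:k]
    return rng.sample(allowed, min(k, len(allowed)))
```

Early in training only low-difficulty candidates are eligible; by the final step the whole pool is. In a dynamic variant, the difficulty scores would be recomputed periodically from the model under training.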
-
Combined Static Analysis and Machine Learning Prediction for Application Debloating
Authors:
Chris Porter,
Sharjeel Khan,
Kangqi Ni,
Santosh Pande
Abstract:
Software debloating can effectively thwart certain code reuse attacks by reducing attack surfaces to break gadget chains. Approaches based on static analysis enable a reduced set of functions reachable at a callsite for execution by leveraging static properties of the callgraph. This achieves low runtime overhead, but the function set is conservatively computed, negatively affecting reduction. In contrast, approaches based on machine learning (ML) have much better precision and can sharply reduce function sets, leading to significant improvement in attack surface. Nevertheless, mispredictions occur in ML-based approaches. These cause overheads, and worse, there is no clear way to distinguish between mispredictions and actual attacks.
In this work, we contend that a software debloating approach that incorporates ML-based predictions at runtime is realistic in a whole application setting, and that it can achieve significant attack surface reductions beyond the state of the art. We develop a framework, Predictive Debloat with Static Guarantees (PDSG). PDSG is fully sound and works on application source code. At runtime it predicts the dynamic callee set emanating from a callsite, and to resolve mispredictions, it employs a lightweight audit based on static invariants of call chains. We deduce the invariants offline and assert that they hold at runtime when there is a misprediction. To the best of our knowledge, it achieves the highest gadget reductions among similar techniques on SPEC CPU 2017, reducing 82.5% of the total gadgets on average. It triggers misprediction checks on only 3.8% of the total predictions invoked at runtime, and it leverages Datalog to verify dynamic call sequences conform to the static call relations. It has an overhead of 8.9%, which makes the scheme attractive for practical deployments.
Submitted 29 March, 2024;
originally announced April 2024.
-
A Review of Sustainable Practices in Road Freight Transport
Authors:
Subash Gupta,
Santosh Adhikari,
Arbia Hlali
Abstract:
Sustainable road freight transport has become indispensable in the field of transportation and logistics. Technological change, environmental impacts, and social responsibility confront road freight transport with various challenges, which makes sustainable practices a vital solution in the sector. This paper aims to provide theoretical research findings on sustainable road freight transport. The methodology discusses road freight transport sustainability indicators across literature studies carried out in different countries. The review analyzes the studies and practical applications from various countries. The results show that sustainability dimensions such as the economic, social, and environmental were discussed in different cases, which demonstrates the efforts of many countries to reduce environmental impact, improve economic efficiency, support social well-being, and expand technological innovations to achieve a sustainable transport system.
Submitted 16 April, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Beyond Borders: Investigating Cross-Jurisdiction Transfer in Legal Case Summarization
Authors:
T. Y. S. S Santosh,
Vatsal Venkatkrishna,
Saptarshi Ghosh,
Matthias Grabmair
Abstract:
Legal professionals face the challenge of managing an overwhelming volume of lengthy judgments, making automated legal case summarization crucial. However, prior approaches mainly focused on training and evaluating these models within the same jurisdiction. In this study, we explore the cross-jurisdictional generalizability of legal case summarization models. Specifically, we explore how to effectively summarize legal cases of a target jurisdiction where reference summaries are not available. In particular, we investigate whether supplementing models with unlabeled target jurisdiction corpus and extractive silver summaries obtained from unsupervised algorithms on target data enhances transfer performance. Our comprehensive study on three datasets from different jurisdictions highlights the role of pre-training in improving transfer performance. We shed light on the pivotal influence of jurisdictional similarity in selecting optimal source datasets for effective transfer. Furthermore, our findings underscore that incorporating unlabeled target data yields improvements in general pre-trained models, with additional gains when silver summaries are introduced. This augmentation is especially valuable when dealing with extractive datasets and scenarios featuring limited alignment between source and target jurisdictions. Our study provides key insights for developing adaptable legal case summarization systems, transcending jurisdictional boundaries.
Submitted 28 March, 2024;
originally announced March 2024.
-
TiBiX: Leveraging Temporal Information for Bidirectional X-ray and Report Generation
Authors:
Santosh Sanjeev,
Fadillah Adamsyah Maani,
Arsen Abzhanov,
Vijay Ram Papineni,
Ibrahim Almakky,
Bartłomiej W. Papież,
Mohammad Yaqub
Abstract:
With the emergence of vision language models in the medical imaging domain, numerous studies have focused on two dominant research activities: (1) report generation from Chest X-rays (CXR), and (2) synthetic scan generation from text or reports. Despite some research incorporating multi-view CXRs into the generative process, prior patient scans and reports have been generally disregarded. This can inadvertently lead to important medical information being left out, thus affecting generation quality. To address this, we propose TiBiX: Leveraging Temporal information for Bidirectional X-ray and Report Generation. Considering previous scans, our approach facilitates bidirectional generation, primarily addressing two challenging problems: (1) generating the current image from the previous image and current report and (2) generating the current report based on both the previous and current images. Moreover, we extract and release a curated temporal benchmark dataset derived from the MIMIC-CXR dataset, which focuses on temporal data. Our comprehensive experiments and ablation studies explore the merits of incorporating prior CXRs and achieve state-of-the-art (SOTA) results on the report generation task. Furthermore, we attain on-par performance with SOTA image generation efforts, thus serving as a new baseline in longitudinal bidirectional CXR-to-report generation. The code is available at https://github.com/BioMedIA-MBZUAI/TiBiX.
Submitted 20 March, 2024;
originally announced March 2024.
-
FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis
Authors:
Santosh Sanjeev,
Nuren Zhaksylyk,
Ibrahim Almakky,
Anees Ur Rehman Hashmi,
Mohammad Areeb Qazi,
Mohammad Yaqub
Abstract:
The scarcity of well-annotated medical datasets requires leveraging transfer learning from broader datasets like ImageNet or pre-trained models like CLIP. Model soups average multiple fine-tuned models, aiming to improve performance on In-Domain (ID) tasks and enhance robustness against Out-of-Distribution (OOD) datasets. However, applying these methods to the medical imaging domain faces challenges and results in suboptimal performance. This is primarily due to differences in error surface characteristics that stem from data complexities such as heterogeneity, domain shift, class imbalance, and distributional shifts between training and testing phases. To address this issue, we propose a hierarchical merging approach that involves local and global aggregation of models at various levels based on models' hyperparameter configurations. Furthermore, to alleviate the need for training a large number of models in the hyperparameter search, we introduce a computationally efficient method using a cyclical learning rate scheduler to produce multiple models for aggregation in the weight space. Our method demonstrates significant improvements over the model souping approach across multiple datasets (around 6% gain in HAM10000 and CheXpert datasets) while maintaining low computational costs for model generation and selection. Moreover, we achieve better results on OOD datasets than model soups. The code is available at https://github.com/BioMedIA-MBZUAI/FissionFusion.
Submitted 3 June, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
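For readers unfamiliar with model soups, the baseline the FissionFusion abstract compares against can be sketched as a plain average of parameters across fine-tuned models. This is only the uniform-soup baseline, not the hierarchical merging the paper proposes; the function name `uniform_soup` and the representation of a model as a dict of flat float lists are assumptions made to keep the example dependency-free.

```python
def uniform_soup(state_dicts):
    """Average parameters across models that share names and shapes.

    state_dicts: list of dicts mapping parameter name -> flat list of floats
    (a stand-in for framework tensors, assumed here for simplicity).
    """
    if not state_dicts:
        raise ValueError("need at least one model")
    keys = state_dicts[0].keys()
    if any(sd.keys() != keys for sd in state_dicts):
        raise ValueError("parameter names differ between models")
    n = len(state_dicts)
    # Element-wise mean of each parameter over all models.
    return {
        name: [sum(vals) / n for vals in zip(*(sd[name] for sd in state_dicts))]
        for name in keys
    }
```

A hierarchical variant would, under the same assumptions, apply this averaging first within groups of models that share hyperparameter settings and then across the group averages.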
-
MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks
Authors:
Ibrahim Almakky,
Santosh Sanjeev,
Anees Ur Rehman Hashmi,
Mohammad Areeb Qazi,
Mohammad Yaqub
Abstract:
Transfer learning has become a powerful tool to initialize deep learning models to achieve faster convergence and higher performance. This is especially useful in the medical imaging analysis domain, where data scarcity limits possible performance gains for deep learning models. Some advancements have been made in boosting the transfer learning performance gain by merging models starting from the same initialization. However, in the medical imaging analysis domain, there is an opportunity in merging models starting from different initializations, thus combining the features learnt from different tasks. In this work, we propose MedMerge, a method whereby the weights of different models can be merged, and their features can be effectively utilized to boost performance on a new task. With MedMerge, we learn kernel-level weights that can later be used to merge the models into a single model, even when starting from different initializations. Testing on various medical imaging analysis tasks, we show that our merged model can achieve significant performance gains, with up to 3% improvement on the F1 score. The code implementation of this work will be available at www.github.com/BioMedIA-MBZUAI/MedMerge.
Submitted 18 March, 2024;
originally announced March 2024.
-
Stability and Dynamics of f(Q,B) Gravity
Authors:
Santosh V Lohakare,
B. Mishra
Abstract:
This study explores the cosmological implications of a modified $f(Q, B)$ gravity model, incorporating both the nonmetricity scalar ($Q$) and the boundary term ($B$). A generic connection ($\Gamma_{\mu\nu}^{\alpha}$) has been used to define the four-dimensional metric tensor and the covariant derivative ($\nabla_\mu$). Statistical analysis using Markov Chain Monte Carlo (MCMC) techniques constrains the free parameters of the $H(z)$ model based on observational data from the Cosmic Chronometers (CC) sample, the extended Pantheon$^+$ dataset, and Baryonic Acoustic Oscillation (BAO) measurements. This analysis clarifies the deceleration and Equation of State (EoS) parameters and reveals a smooth transition from a decelerating to an accelerating expansion phase in the evolution history of the Universe. The most intriguing finding is the identification of a stable critical point within the dynamical system of the model. This critical point corresponds to the de Sitter phase, a well-known era of accelerated expansion. The stability of the critical point suggests that, under specific initial conditions, the trajectory of the Universe will inherently be drawn towards and remain within the de Sitter phase. This aligns with current observations, indicating a Universe dominated by dark energy (DE) and undergoing late-time accelerated expansion.
Submitted 16 March, 2024;
originally announced March 2024.
-
XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via Controllable Diffusion Model
Authors:
Anees Ur Rehman Hashmi,
Ibrahim Almakky,
Mohammad Areeb Qazi,
Santosh Sanjeev,
Vijay Ram Papineni,
Dwarikanath Mahapatra,
Mohammad Yaqub
Abstract:
Large-scale generative models have demonstrated impressive capacity in producing visually compelling images, with increasing applications in medical imaging. However, they continue to grapple with the challenge of image hallucination and the generation of anatomically inaccurate outputs. These limitations are mainly due to the sole reliance on textual inputs and lack of spatial control over the generated images, hindering the potential usefulness of such models in real-life settings. We present XReal, a novel controllable diffusion model for generating realistic chest X-ray images through precise anatomy and pathology location control. Our lightweight method can seamlessly integrate spatial control in a pre-trained text-to-image diffusion model without fine-tuning, retaining its existing knowledge while enhancing its generation capabilities. XReal outperforms state-of-the-art X-ray diffusion models in quantitative and qualitative metrics while showing 13% and 10% gains in anatomy and pathology realism, respectively, based on expert radiologist evaluation. Our model holds promise for advancing generative models in medical imaging, offering greater precision and adaptability while inviting further exploration in this evolving field. A large synthetically generated dataset with annotations and the code are publicly available at https://github.com/BioMedIA-MBZUAI/XReal.
Submitted 14 March, 2024;
originally announced March 2024.
-
Ciphertext-Only Attack on a Secure $k$-NN Computation on Cloud
Authors:
Shyam Murthy,
Santosh Kumar Upadhyaya,
Srinivas Vivek
Abstract:
The rise of cloud computing has spurred a trend of transferring data storage and computational tasks to the cloud. To protect confidential information such as customer data and business details, it is essential to encrypt this sensitive data before cloud storage. Implementing encryption can prevent unauthorized access, data breaches, and the resultant financial loss, reputation damage, and legal issues. Moreover, to facilitate the execution of data mining algorithms on the cloud-stored data, the encryption needs to be compatible with domain computation. The $k$-nearest neighbor ($k$-NN) computation for a specific query vector is widely used in fields like location-based services. Sanyashi et al. (ICISS 2023) proposed an encryption scheme to facilitate privacy-preserving $k$-NN computation on the cloud by utilizing Asymmetric Scalar-Product-Preserving Encryption (ASPE).
In this work, we identify a significant vulnerability in the aforementioned encryption scheme of Sanyashi et al. Specifically, we give an efficient algorithm and also empirically demonstrate that their encryption scheme is vulnerable to the ciphertext-only attack (COA).
Submitted 17 April, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Beyond the Labels: Unveiling Text-Dependency in Paralinguistic Speech Recognition Datasets
Authors:
Jan Pešán,
Santosh Kesiraju,
Lukáš Burget,
Jan "Honza" Černocký
Abstract:
Paralinguistic traits like cognitive load and emotion are increasingly recognized as pivotal areas in speech recognition research, often examined through specialized datasets like CLSE and IEMOCAP. However, the integrity of these datasets is seldom scrutinized for text-dependency. This paper critically evaluates the prevalent assumption that machine learning models trained on such datasets genuinely learn to identify paralinguistic traits, rather than merely capturing lexical features. By examining the lexical overlap in these datasets and testing the performance of machine learning models, we expose significant text-dependency in trait-labeling. Our results suggest that some machine learning models, especially large pre-trained models like HuBERT, might inadvertently focus on lexical characteristics rather than the intended paralinguistic features. The study serves as a call to action for the research community to reevaluate the reliability of existing datasets and methodologies, ensuring that machine learning models genuinely learn what they are designed to recognize.
Submitted 12 March, 2024;
originally announced March 2024.
-
Graph learning methods to extract empathy supporting regions in a naturalistic stimuli fMRI
Authors:
Sasanka GRS,
Ayushi Agrawal,
Santosh Nannuru,
Kavita Vemuri
Abstract:
Functional MRI (fMRI) research, employing naturalistic stimuli like movies, explores brain network interactions in complex cognitive processes such as empathy. The empathy network encompasses multiple brain areas, including the Insula, PFC, ACC, and parietal regions. Our novel processing pipeline applies graph learning methods to whole-brain timeseries signals, incorporating high-pass filtering, voxel-level clustering, and windowed graph learning with a sparsity-based approach. The study involves two short movies shown to 14 healthy volunteers, considering 54 regions extracted from the AAL Atlas. The sparsity-based graph learning consistently outperforms, achieving over 88% accuracy in capturing emotion contagion variations. Temporal analysis reveals a gradual induction of empathy, supported by the method's effectiveness in capturing dynamic connectomes through graph clustering. Edge-weight dynamics analysis underscores sparsity-based learning's superiority, while connectome-network analysis highlights the pivotal role of the Insula, Amygdala, and Thalamus in empathy. Spectral filtering analysis emphasizes the band-pass filter's significance in isolating regions linked to emotional and empathetic processing during empathy HIGH states. Key regions like Amygdala, Insula, and Angular Gyrus consistently activate, supporting their critical role in immediate emotional responses. Strong similarities across movies in graph cluster labels, connectome-network analysis, and spectral filtering-based analyses reveal robust neural correlates of empathy. These findings advance our understanding of empathy-related neural dynamics and identify specific regions in empathetic responses, offering insights for targeted interventions and treatments associated with empathetic processing.
Submitted 11 March, 2024;
originally announced March 2024.
-
Quasiparticle band structure and excitonic optical response in V2O5 bulk and monolayer
Authors:
Claudio Garcia,
Santosh Kumar Radha,
Swagata Acharya,
Walter R. L. Lambrecht
Abstract:
The electronic band structure of V$_2$O$_5$ is calculated using an all-electron quasiparticle self-consistent (QS) $GW$ method, including electron-hole ladder diagrams in the screening of $W$. The optical dielectric function calculated with the Bethe-Salpeter equation (BSE) exhibits excitons with large binding energy, consistent with spectroscopic ellipsometry data and other recent calculations. Sharp peaks in the direction perpendicular to the layers at high energy are found to be an artifact of the truncation of the number of bands included in the BSE calculation of the macroscopic dielectric function. The $\varepsilon_1(\omega=0)$ gives indices of refraction in good agreement with experiment. The excitons are charge transfer excitons with the hole primarily on oxygen and electrons on vanadium, but the distribution over different oxygens changes depending on the exciton. The exciton wave functions have a spread of about 5-15 Å, with asymmetric character for the electron distribution around the hole depending on which oxygen the hole is fixed at. The monolayer quasiparticle gap increases in inverse proportion to interlayer distance once the initial interlayer covalent couplings are removed, owing to the long-range nature of the self-energy and the reduced screening in a 2D system. The optical gap, on the other hand, is relatively independent of interlayer spacing because of the compensation between the self-energy gap shift and the exciton binding energy, both of which are proportional to the screened Coulomb interaction $\hat{W}$. Recent experimental results on very thin layer V$_2$O$_5$ obtained by chemical exfoliation provide experimental support for an increase in gap.
Submitted 17 May, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
IndicVoices: Towards building an Inclusive Multilingual Speech Dataset for Indian Languages
Authors:
Tahir Javed,
Janki Atul Nawale,
Eldho Ittan George,
Sakshi Joshi,
Kaushal Santosh Bhogale,
Deovrat Mehendale,
Ishvinder Virender Sethi,
Aparna Ananthanarayanan,
Hafsah Faquih,
Pratiti Palit,
Sneha Ravishankar,
Saranya Sukumaran,
Tripura Panchagnula,
Sunjay Murali,
Kunal Sharad Gandhi,
Ambujavalli R,
Manickam K M,
C Venkata Vaijayanthi,
Krishnan Srinivasa Raghavan Karunganni,
Pratyush Kumar,
Mitesh M Khapra
Abstract:
We present INDICVOICES, a dataset of natural and spontaneous speech containing a total of 7348 hours of read (9%), extempore (74%) and conversational (17%) audio from 16237 speakers covering 145 Indian districts and 22 languages. Of these 7348 hours, 1639 hours have already been transcribed, with a median of 73 hours per language. Through this paper, we share our journey of capturing the cultural, linguistic and demographic diversity of India to create a one-of-its-kind inclusive and representative dataset. More specifically, we share an open-source blueprint for data collection at scale comprising standardised protocols, centralised tools, a repository of engaging questions, prompts and conversation scenarios spanning multiple domains and topics of interest, quality control mechanisms, comprehensive transcription guidelines and transcription tools. We hope that this open-source blueprint will serve as a comprehensive starter kit for data collection efforts in other multilingual regions of the world. Using INDICVOICES, we build IndicASR, the first ASR model to support all the 22 languages listed in the 8th schedule of the Constitution of India. All the data, tools, guidelines, models and other materials developed as a part of this work will be made publicly available.
Submitted 4 March, 2024;
originally announced March 2024.
-
Impact of Diffusion on synchronization pattern of epidemics in nonidentical metapopulation networks
Authors:
Anika Roy,
Ujjwal Shekhar,
Aditi Bose,
Subrata Ghosh,
Santosh Nannuru,
Syamal Kumar Dana,
Chittaranjan Hens
Abstract:
In a prior study, a novel deterministic compartmental model known as the SEIHRK model was introduced, shedding light on the pivotal role of test kits as an intervention strategy for mitigating epidemics. Particularly in heterogeneous networks, it was empirically demonstrated that strategically distributing a limited number of test kits among nodes with higher degrees substantially diminishes the outbreak size. The network's dynamics were explored under varying values of infection rate. In this research, we expand upon these findings to investigate the influence of migration on infection dynamics within distinct communities of the network. Notably, we observe that nodes equipped with test kits and those without tend to segregate into two separate clusters when coupling strength is low, but beyond a critical threshold coupling coefficient, they coalesce into a unified cluster. Building on this clustering phenomenon, we develop a reduced equation model and rigorously validate its accuracy through comprehensive simulations. We show that this property is observed in both complete and random graphs.
Submitted 1 March, 2024;
originally announced March 2024.
-
B-mesons as essential probes of hot QCD matter
Authors:
Vinod Chandra,
Santosh K. Das
Abstract:
This article elucidates the pivotal role of B-mesons and bottomonium states in exploring the existence and properties of hot QCD matter, commonly known as quark-gluon plasma (QGP), produced in the crucible of heavy-ion collision experiments. Owing to the complex and confounding nature of the strong interaction force, directly probing the hot QCD matter is not feasible. In light of this, investigating the dynamics of b-quarks and anti-quarks within the hot QCD medium emerges as an invaluable indirect probe. The impact of b-quarks and B-mesons spans a spectrum of interesting domains regarding the physics of QCD at finite temperature, encompassing the QCD phase transition, color screening, quarkonia dissociation, heavy quark energy loss and collective flow, anisotropic aspects, and the strongly coupled nature of the hot QCD medium. These aspects underscore the indispensable nature of B-mesons in the quest to create and explore the complex nature of the strong interaction force through the QGP/hot QCD matter. In this context, we mainly focus on works related to transport studies of B-mesons in the hot QCD medium, lattice QCD and effective field theory studies on bottomonium states, and finally, open quantum system frameworks applied to quarkonia to explore the properties of the hot QCD medium in relativistic heavy-ion collision experiments.
Submitted 29 February, 2024;
originally announced February 2024.
-
A Multimodal Handover Failure Detection Dataset and Baselines
Authors:
Santosh Thoduka,
Nico Hochgeschwender,
Juergen Gall,
Paul G. Plöger
Abstract:
An object handover between a robot and a human is a coordinated action which is prone to failure for reasons such as miscommunication, incorrect actions and unexpected object properties. Existing works on handover failure detection and prevention focus on preventing failures due to object slip or external disturbances. However, there is a lack of datasets and evaluation methods that consider unpreventable failures caused by the human participant. To address this deficit, we present the multimodal Handover Failure Detection dataset, which consists of failures induced by the human participant, such as ignoring the robot or not releasing the object. We also present two baseline methods for handover failure detection: (i) a video classification method using 3D CNNs and (ii) a temporal action segmentation approach which jointly classifies the human action, robot action and overall outcome of the action. The results show that video is an important modality, but that using force-torque data and gripper position helps improve failure detection and action segmentation accuracy.
Submitted 28 February, 2024;
originally announced February 2024.
-
Towards Explainability and Fairness in Swiss Judgement Prediction: Benchmarking on a Multilingual Dataset
Authors:
Santosh T. Y. S. S,
Nina Baumgartner,
Matthias Stürmer,
Matthias Grabmair,
Joel Niklaus
Abstract:
The assessment of explainability in Legal Judgement Prediction (LJP) systems is of paramount importance in building trustworthy and transparent systems, particularly considering the reliance of these systems on factors that may lack legal relevance or involve sensitive attributes. This study delves into the realm of explainability and fairness in LJP models, utilizing Swiss Judgement Prediction (SJP), the only available multilingual LJP dataset. We curate a comprehensive collection of rationales that `support' and `oppose' judgement from legal experts for 108 cases in German, French, and Italian. By employing an occlusion-based explainability approach, we evaluate the explainability performance of state-of-the-art monolingual and multilingual BERT-based LJP models, as well as models developed with techniques such as data augmentation and cross-lingual transfer, which demonstrated prediction performance improvement. Notably, our findings reveal that improved prediction performance does not necessarily correspond to enhanced explainability performance, underscoring the significance of evaluating models from an explainability perspective. Additionally, we introduce a novel evaluation framework, Lower Court Insertion (LCI), which allows us to quantify the influence of lower court information on model predictions, exposing current models' biases.
Submitted 26 February, 2024;
originally announced February 2024.
-
Effects of anisotropy in an anisotropic extension of $w$CDM model
Authors:
Vikrant Yadav,
Santosh Kumar Yadav,
Rajpal
Abstract:
In this paper, we derive observational constraints on an anisotropic $w$CDM model from observational data including Baryonic Acoustic Oscillations (BAOs), Cosmic Chronometer (CC), Big Bang Nucleosynthesis (BBN), Pantheon Plus (PP) compilation of Type Ia supernovae, and SH0ES Cepheid host distance anchors. We find that anisotropy is of the order $10^{-13}$, and its presence in the $w$CDM model reduces $H_0$ tension by $\sim 2σ$ and $\sim 1σ$ in the analyses with BAO+CC+BBN+PP and BAO+CC+BBN+PPSH0ES data combinations, respectively. In both analyses, the quintessence form of dark energy is favored at 95\% CL.
Submitted 20 February, 2024;
originally announced February 2024.
-
Computation of marginal eigenvalue distributions in the Laguerre and Jacobi $β$ ensembles
Authors:
Peter J. Forrester,
Santosh Kumar
Abstract:
We consider the problem of the exact computation of the marginal eigenvalue distributions in the Laguerre and Jacobi $β$ ensembles. In the case $β=1$ this is a question of long standing in the mathematical statistics literature. A recursive procedure to accomplish this task is given for $β$ a positive integer, and the parameter $λ_1$ a non-negative integer. This case is special due to a finite basis of elementary functions, with coefficients which are polynomials. In the Laguerre case with $β= 1$ and $λ_1 + 1/2$ a non-negative integer, some evidence is given of there again being a finite basis, now consisting of elementary functions and the error function multiplied by elementary functions. Moreover, from this the corresponding distributions in the fixed trace case permit a finite basis of power functions, as also for $λ_1$ a non-negative integer. The fixed trace case in this setting is relevant to quantum information theory and quantum transport problems, allowing in particular the exact determination of Landauer conductance distributions in a previously intractable parameter regime. Our findings also aid in analyzing zeros of the generating function for specific gap probabilities, supporting the validity of an associated large $N$ local central limit theorem.
Submitted 25 February, 2024;
originally announced February 2024.
-
Distinct Composition-Dependent Topological Hall Effect in Mn2-xZnxSb
Authors:
Md Rafique Un Nabi,
Yue Li,
Suzanne G E te Velthuis,
Santosh Karki Chhetri,
Dinesh Upreti,
Rabindra Basnet,
Gokul Acharya,
Charudatta Phatak,
Jin Hu
Abstract:
Spintronics, an evolving interdisciplinary field at the intersection of magnetism and electronics, explores innovative applications of electron charge and spin properties for advanced electronic devices. The topological Hall effect (THE), a key component in spintronics, has gained significance due to emerging theories surrounding noncoplanar chiral spin textures. This study focuses on Mn2-xZnxSb, a material crystallizing in a centrosymmetric space group with rich magnetic phases tunable by Zn content. Through comprehensive magnetic and transport characterizations, we found that the high-Zn (x>0.6) samples display a THE that is enhanced with decreasing temperature, while the THE in the low-Zn (x<0.6) samples shows the opposite trend. The coexistence of those distinct temperature dependences for the THE suggests very different magnetic interactions/structure for different compositions and underscores the strong coupling between magnetism and transport in Mn2-xZnxSb. Our findings contribute to understanding topological magnetism in centrosymmetric tetragonal lattices, establishing Mn2-xZnxSb as a unique platform for exploring tunable transport effects and opening avenues for further exploration in the realm of spintronics.
Submitted 16 February, 2024;
originally announced February 2024.
-
Electronic structure and magnetism in P4/nmm KCoO2
Authors:
Ozan Dernek,
Santosh Kumar Radha,
Jerome Jackson,
Walter R. L. Lambrecht
Abstract:
KCoO2 was found in 1975 to exist in a unique structure with the P4/nmm space group, with Co in a square pyramidal coordination and the Co atoms in the plane linked by O in a square arrangement reminiscent of the cuprates, but its electronic structure has not been studied until now. Unlike Co atoms in LiCoO2 and NaCoO2 in octahedral coordination, which are non-magnetic band structure insulators, the unusual coordination of d6 Co^{3+} in KCoO2 is here shown to lead to a magnetic stabilization of an insulating structure with high magnetic moments of 4μB per Co. The electronic band structure is calculated using the quasiparticle self-consistent (QS)GW method and the basic formation of magnetic moments is explained in terms of the orbital decomposition of the bands. The optical dielectric function is calculated using the Bethe-Salpeter equation including only transitions between equal spin bands. The magnetic moments are shown to prefer an antiferromagnetic ordering along the [110] direction. Exchange interactions are calculated from the transverse spin susceptibility and a rigid spin approximation. The Néel temperature is estimated using the mean-field and Tyablikov methods and found to be between approximately 100 and 250 K. The band structure in the AFM ordering can be related to the FM ordering by band folding effects. The optical spectra are similar in both structures and show evidence of excitonic features below the quasiparticle gap of about 4 eV.
Submitted 16 February, 2024;
originally announced February 2024.
-
Understanding and tuning magnetism in layered Ising-type antiferromagnet FePSe3 for potential 2D magnet
Authors:
Rabindra Basnet,
Taksh Patel,
Jian Wang,
Dinesh Upreti,
Santosh Karki Chhetri,
Gokul Acharya,
Md Rafique Un Nabi,
Josh Sakon,
Jin Hu
Abstract:
Recent developments in two-dimensional (2D) magnetic materials have motivated the search for new van der Waals magnetic materials, especially Ising-type magnets with strong magnetic anisotropy. Fe-based MPX3 (M = transition metal, X = chalcogen) compounds such as FePS3 and FePSe3 both exhibit an Ising-type magnetic order, but FePSe3 receives much less attention compared to FePS3. This work focuses on establishing the strategy to engineer magnetic anisotropy and exchange interactions in this less-explored compound. Through chalcogen and metal substitutions, the magnetic anisotropy is found to be immune to S substitution for Se, whereas it is tunable only with heavy Mn substitution for Fe. In particular, Mn substitution leads to a continuous rotation of magnetic moments from the out-of-plane direction towards in-plane. Furthermore, the magnetic ordering temperature displays a non-monotonic doping dependence for both chalcogen and metal substitutions, but due to different mechanisms. These findings provide deeper insight into the Ising-type magnetism in this important van der Waals material, shedding light on the study of other Ising-type magnetic systems as well as on discovering novel 2D magnets for potential applications in spintronics.
Submitted 15 February, 2024;
originally announced February 2024.
-
Quantum Algorithm Exploration using Application-Oriented Performance Benchmarks
Authors:
Thomas Lubinski,
Joshua J. Goings,
Karl Mayer,
Sonika Johri,
Nithin Reddy,
Aman Mehta,
Niranjan Bhatia,
Sonny Rappaport,
Daniel Mills,
Charles H. Baldwin,
Luning Zhao,
Aaron Barbosa,
Smarak Maity,
Pranav S. Mundada
Abstract:
The QED-C suite of Application-Oriented Benchmarks provides the ability to gauge performance characteristics of quantum computers as applied to real-world applications. Its benchmark programs sweep over a range of problem sizes and inputs, capturing key performance metrics related to the quality of results, total time of execution, and quantum gate resources consumed. In this manuscript, we investigate challenges in broadening the relevance of this benchmarking methodology to applications of greater complexity. First, we introduce a method for improving landscape coverage by varying algorithm parameters systematically, exemplifying this functionality in a new scalable HHL linear equation solver benchmark. Second, we add a VQE implementation of a Hydrogen Lattice simulation to the QED-C suite, and introduce a methodology for analyzing the result quality and run-time cost trade-off. We observe a decrease in accuracy with increased number of qubits, but only a mild increase in the execution time. Third, unique characteristics of a supervised machine-learning classification application are explored as a benchmark to gauge the extensibility of the framework to new classes of application. Applying this to a binary classification problem revealed the increase in training time required for larger ansatz circuits, and the significant classical overhead. Fourth, we add methods to include optimization and error mitigation in the benchmarking workflow, which allows us to: identify a favourable trade-off between approximate gate synthesis and gate noise; observe the benefits of measurement error mitigation and a form of deterministic error mitigation algorithm; and contrast the improvement with the resulting time overhead. Looking ahead, we discuss how the benchmark framework can be instrumental in facilitating the exploration of algorithmic options and their impact on performance.
Submitted 14 February, 2024;
originally announced February 2024.
-
Joint User and Beam Selection in Millimeter Wave Networks
Authors:
Santosh Kumar Singh,
Satyabrata Sahu,
Ayushi Thawait,
Prasanna Chaporkar,
Gaurav S. Kasbekar
Abstract:
We study the problem of selecting a user equipment (UE) and a beam for each access point (AP) for concurrent transmissions in a millimeter wave (mmWave) network, such that the sum of weighted rates of UEs is maximized. We prove that this problem is NP-complete. We propose two algorithms -- Markov Chain Monte Carlo (MCMC) based and local interaction game (LIG) based UE and beam selection -- and prove that both of them asymptotically achieve the optimal solution. Also, we propose two fast greedy algorithms -- NGUB1 and NGUB2 -- for UE and beam selection. Through extensive simulations, we show that our proposed greedy algorithms outperform the most relevant algorithms proposed in prior work and perform close to the asymptotically optimal algorithms.
Submitted 15 March, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Through the Lens of Split Vote: Exploring Disagreement, Difficulty and Calibration in Legal Case Outcome Classification
Authors:
Shanshan Xu,
T. Y. S. S Santosh,
Oana Ichim,
Barbara Plank,
Matthias Grabmair
Abstract:
In legal decisions, split votes (SV) occur when judges cannot reach a unanimous decision, posing a difficulty for lawyers who must navigate diverse legal arguments and opinions. In high-stakes domains, understanding the alignment of perceived difficulty between humans and AI systems is crucial to build trust. However, existing NLP calibration methods focus on a classifier's awareness of predictive performance, measured against the human majority class, overlooking inherent human label variation (HLV). This paper explores split votes as naturally observable human disagreement and value pluralism. We collect judges' vote distributions from the European Court of Human Rights (ECHR), and present SV-ECHR, a case outcome classification (COC) dataset with SV information. We build a taxonomy of disagreement with SV-specific subcategories. We further assess the alignment of perceived difficulty between models and humans, as well as confidence- and human-calibration of COC models. We observe limited alignment with the judge vote distribution. To our knowledge, this is the first systematic exploration of calibration to human judgements in legal NLP. Our study underscores the necessity for further research on measuring and enhancing model calibration considering HLV in legal decision tasks.
Submitted 6 June, 2024; v1 submitted 11 February, 2024;
originally announced February 2024.
-
HD 12098: a highly distorted dipole mode in an obliquely pulsating roAp star
Authors:
D. W. Kurtz,
H. Saio,
D. L. Holdsworth,
Santosh Joshi,
S. Seetha
Abstract:
HD 12098 is an roAp star pulsating in the most distorted dipole mode yet observed in this class of star. Using TESS Sector 58 observations we show that there are photometric spots at both the magnetic poles of this star. It pulsates obliquely primarily in a strongly distorted dipole mode with a period of $P_{\rm puls} = 7.85$ min ($ν_{\rm puls} = 183.34905$ d$^{-1}$; 2.12210 mHz) that gives rise to an unusual quadruplet in the amplitude spectrum. Our magnetic pulsation model cannot account for the strong distortion of the pulsation in one hemisphere, although it is successful in the other hemisphere. There are high-overtone p modes with frequencies separated by more than the large separation, a challenging problem in mode selection. The mode frequencies observed in the TESS data are in the same frequency range as those previously observed in ground-based Johnson $B$ data, but are not for the same modes. Hence the star has either changed modes, or observations at different atmospheric depth detect different modes. There is also a low-overtone p mode and possibly g modes that are not expected theoretically with the $> 1$ kG magnetic field observed in this star.
Submitted 9 February, 2024;
originally announced February 2024.
-
Amplitude Modulation in a Delta Scuti star HD118660
Authors:
Mrinmoy Sarkar,
Santosh Joshi,
Peter de Cat
Abstract:
In this paper, we report the detection of amplitude modulation in the delta Scuti star HD118660. We found that the p-mode frequency at 24.3837 c/d varies periodically in amplitude with frequency 0.0558 c/d. However, all other modes are stable in both amplitude and phase, which is clear evidence of non-conservation of visible pulsation mode energy. We constructed a two-frequency model by superimposing two sinusoids with frequencies n1 = 24.3837 c/d and n2 = 24.4420 c/d and corresponding phases f1 = 0.5211 rad and f2 = 0.9481 rad to mimic the observed variations of amplitude and phase with time. A plausible explanation of the amplitude modulation in HD118660 is the beating of two unresolved close frequencies, n1 and n2.
Submitted 5 February, 2024;
originally announced February 2024.
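The two-frequency model described in this abstract can be illustrated with a minimal sketch. The frequencies and phases are the values quoted above; the equal unit amplitudes `a1` and `a2` and the function names are assumptions made for illustration, since the abstract does not give the fitted amplitudes. With the quoted values, the difference n2 - n1 is 0.0583 c/d.

```python
import numpy as np

# Frequencies (cycles/day) and phases (rad) quoted in the abstract.
N1, N2 = 24.3837, 24.4420
F1, F2 = 0.5211, 0.9481


def two_frequency_signal(t, a1=1.0, a2=1.0):
    """Sum of two sinusoids. The amplitudes a1, a2 are assumed, not from the source."""
    return (a1 * np.sin(2 * np.pi * N1 * t + F1)
            + a2 * np.sin(2 * np.pi * N2 * t + F2))


def beat_envelope(t, a1=1.0, a2=1.0):
    """Slowly varying amplitude of the summed signal (standard two-phasor sum)."""
    dphi = 2 * np.pi * (N2 - N1) * t + (F2 - F1)
    return np.sqrt(a1**2 + a2**2 + 2 * a1 * a2 * np.cos(dphi))


beat_frequency = abs(N2 - N1)        # cycles/day
beat_period = 1.0 / beat_frequency   # days

if __name__ == "__main__":
    t = np.linspace(0.0, 40.0, 20001)  # 40 days of synthetic time stamps
    signal = two_frequency_signal(t)
    envelope = beat_envelope(t)
    print(f"beat frequency = {beat_frequency:.4f} c/d, period = {beat_period:.2f} d")
    print("signal stays inside envelope:", bool(np.all(np.abs(signal) <= envelope + 1e-9)))
```

The envelope follows from adding two phasors of slightly different frequency, so the summed signal looks like a single mode whose amplitude rises and falls once per beat period.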
-
Improving Program Debloating with 1-DU Chain Minimality
Authors:
Myeongsoo Kim,
Santosh Pande,
Alessandro Orso
Abstract:
Modern software often struggles with bloat, leading to increased memory consumption and security vulnerabilities from unused code. In response, various program debloating techniques have been developed, typically utilizing test cases that represent functionalities users want to retain. These methods range from aggressive approaches, which prioritize maximal code reduction but may overfit to test cases and potentially reintroduce past security issues, to conservative strategies that aim to preserve all influenced code, often at the expense of less effective bloat reduction and security improvement. In this research, we present RLDebloatDU, an innovative debloating technique that employs 1-DU chain minimality within abstract syntax trees. Our approach maintains essential program data dependencies, striking a balance between aggressive code reduction and the preservation of program semantics. We evaluated RLDebloatDU on ten Linux kernel programs, comparing its performance with two leading debloating techniques: Chisel, known for its aggressive debloating approach, and Razor, recognized for its conservative strategy. RLDebloatDU significantly lowers the incidence of Common Vulnerabilities and Exposures (CVEs) and improves soundness compared to both, highlighting its efficacy in reducing security issues without reintroducing resolved security issues.
Submitted 31 January, 2024;
originally announced February 2024.
-
Localisation of Gamma Ray Bursts using AstroSat Mass Model
Authors:
Divita Saraogi,
J Venkata Aditya,
Varun Bhalerao,
Suman Bala,
Arvind Balasubramanian,
Sujay Mate,
Tanmoy Chattopadhyay,
Soumya Gupta,
Vipul Prasad,
Gaurav Waratkar,
Navaneeth P K,
Rahul Gopalakrishnan,
Dipankar Bhattacharya,
Gulab Dewangan,
Santosh Vadawale
Abstract:
The Cadmium Zinc Telluride Imager (CZTI) aboard AstroSat has good sensitivity to Gamma Ray Bursts (GRBs), with close to 600 detections including about 50 discoveries undetected by other missions. However, CZTI was not designed to be a GRB monitor and lacks localisation capabilities. We introduce a new method of localising GRBs using "shadows" cast on the CZTI detector plane due to absorption and scattering by satellite components and instruments. Comparing the observed distribution of counts on the detector plane with simulated distributions with the AstroSat Mass Model, we can localise GRBs in the sky. Our localisation uncertainty is defined by a two-component model, with a narrow Gaussian component that has close to 50% probability of containing the source, and the remaining spread over a broader Gaussian component with an 11.3 times higher $σ$. The width ($σ$) of the Gaussian components scales inversely with source counts. We test this model by applying the method to GRBs with known positions and find good agreement between the model and observations. This new ability expands the utility of CZTI in the study of GRBs and other rapid high-energy transients.
Submitted 29 January, 2024;
originally announced January 2024.
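As a rough illustration of the two-component uncertainty model in this abstract, the sketch below computes the probability that a source lies within a given angular radius of the best-fit position. Only the width ratio of 11.3 and the roughly 50% weight come from the abstract. Treating each component as a circular two-dimensional Gaussian, fixing the weight at exactly 0.5, the scaling constant `k`, and all function names are assumptions for illustration only.

```python
import math

BROAD_TO_NARROW = 11.3   # ratio of the two widths, from the abstract
NARROW_WEIGHT = 0.5      # "close to 50%" in the abstract; exact value assumed


def sigma_narrow(counts, k=1000.0):
    """Width of the narrow component, scaling inversely with source counts.

    k is a hypothetical constant (degrees x counts), not a value from the source.
    """
    return k / counts


def containment_probability(radius, counts, k=1000.0):
    """Probability that the source lies within `radius` of the best-fit position,
    assuming each component is a circular 2D Gaussian."""
    s1 = sigma_narrow(counts, k)
    s2 = BROAD_TO_NARROW * s1
    p1 = 1.0 - math.exp(-radius**2 / (2.0 * s1**2))
    p2 = 1.0 - math.exp(-radius**2 / (2.0 * s2**2))
    return NARROW_WEIGHT * p1 + (1.0 - NARROW_WEIGHT) * p2


if __name__ == "__main__":
    for counts in (500, 1000, 2000):
        p = containment_probability(radius=2.0, counts=counts)
        print(f"counts={counts}: P(within 2 deg) = {p:.3f}")
```

Under these assumptions, brighter bursts (more counts) give a narrower distribution, so the same search radius contains the source with higher probability.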
-
Momentary Stressor Logging and Reflective Visualizations: Implications for Stress Management with Wearables
Authors:
Sameer Neupane,
Mithun Saha,
Nasir Ali,
Timothy Hnat,
Shahin Alan Samiei,
Anandatirtha Nandugudi,
David M. Almeida,
Santosh Kumar
Abstract:
Commercial wearables from Fitbit, Garmin, and Whoop have recently introduced real-time notifications based on detecting changes in physiological responses indicating potential stress. In this paper, we investigate how these new capabilities can be leveraged to improve stress management. We developed a smartwatch app, a smartphone app, and a cloud service, and conducted a 100-day field study with 122 participants who received prompts triggered by physiological responses several times a day. They were asked whether they were stressed, and if so, to log the most likely stressor. Each week, participants received new visualizations of their data to self-reflect on patterns and trends. Participants reported better awareness of their stressors, and self-initiating fourteen kinds of behavioral changes to reduce stress in their daily lives. Repeated self-reports over 14 weeks showed reductions in both stress intensity (in 26,521 momentary ratings) and stress frequency (in 1,057 weekly surveys).
Submitted 29 January, 2024;
originally announced January 2024.
-
Guiding Soft Robots with Motor-Imagery Brain Signals and Impedance Control
Authors:
Maximilian Stölzle,
Sonal Santosh Baberwal,
Daniela Rus,
Shirley Coyle,
Cosimo Della Santina
Abstract:
Integrating Brain-Machine Interfaces into non-clinical applications like robot motion control remains difficult - despite remarkable advancements in clinical settings. Specifically, EEG-based motor imagery systems are still error-prone, posing safety risks when rigid robots operate near humans. This work presents an alternative pathway towards safe and effective operation by combining wearable EEG with physically embodied safety in soft robots. We introduce and test a pipeline that allows a user to move a soft robot's end effector in real time via brain waves that are measured by as few as three EEG channels. A robust motor imagery algorithm interprets the user's intentions to move the position of a virtual attractor to which the end effector is attracted, thanks to a new Cartesian impedance controller. We specifically focus here on planar soft robot-based architected metamaterials, which require the development of a novel control architecture to deal with the peculiar nonlinearities - e.g., non-affinity in control. We preliminarily but quantitatively evaluate the approach on the task of setpoint regulation. We observe that the user reaches the proximity of the setpoint in 66% of steps and that for successful steps, the average response time is 21.5s. We also demonstrate the execution of simple real-world tasks involving interaction with the environment, which would be extremely hard to perform if it were not for the robot's softness.
Submitted 24 January, 2024;
originally announced January 2024.
-
Families over the integral Bernstein Center and Tate cohomology of local Base change lifts for GL(n, F)
Authors:
Sabyasachi Dhar,
Santosh Nadimpalli
Abstract:
Let $p$ and $l$ be distinct odd primes, and let $F$ be a $p$-adic field. Let $π$ be a generic smooth integral representation of ${\rm GL}_n(F)$ over an $\overline{\mathbb{Q}}_l$-vector space. Let $E$ be a finite Galois extension of $F$ with $[E:F]=l$. Let $Π$ be the base change lift of $π$ to the group ${\rm GL}_n(E)$. Let $\mathbb{W}^0(Π, ψ_E)$ be the lattice of $\overline{\mathbb{Z}}_l$-valued functions in the Whittaker model of $Π$, with respect to a standard ${\rm Gal}(E/F)$-equivariant additive character $ψ_E:E\rightarrow \overline{\mathbb{Q}}_l^\times$. We show that the unique generic sub-quotient of the zero-th Tate cohomology group of $\mathbb{W}^0(Π, ψ_E)$ is isomorphic to the Frobenius twist of the unique generic sub-quotient of the mod-$l$ reduction of $π$. We first prove a version of this result for a family of smooth generic representations of ${\rm GL}_n(E)$ over the integral Bernstein center of ${\rm GL}_n(F)$. Our methods use the theory of Rankin-Selberg convolutions and simple identities of local $γ$-factors. The results of this article remove the hypothesis that $l$ does not divide the pro-order of ${\rm GL}_{n-1}(F)$ in our previous work.
Submitted 24 January, 2024;
originally announced January 2024.
-
Extropy and Varextropy estimators with applications
Authors:
Santosh Kumar Chaudhary,
Nitin Gupta
Abstract:
In many statistical studies, measures of uncertainty such as the entropy, extropy, varentropy and varextropy of a distribution function are of prime interest. This paper proposes estimators of extropy and varextropy. The proposed estimators are consistent. Based on the extropy estimator, a test of symmetry is given. The proposed test has the advantage that we do not need to estimate the centre of symmetry. The critical value and power of the proposed test statistic have been obtained. The test procedure has been implemented on six real-life data sets to verify its performance in identifying the symmetric nature.
Submitted 23 January, 2024;
originally announced January 2024.
-
Enabling clustering algorithms to detect clusters of varying densities through scale-invariant data preprocessing
Authors:
Sunil Aryal,
Jonathan R. Wells,
Arbind Agrahari Baniya,
KC Santosh
Abstract:
In this paper, we show that preprocessing data using a variant of rank transformation called 'Average Rank over an Ensemble of Sub-samples (ARES)' makes clustering algorithms robust to data representation and enables them to detect clusters of varying density. Our empirical results, obtained using three of the most widely used clustering algorithms, namely KMeans, DBSCAN, and DP (Density Peak), across a wide range of real-world datasets, show that clustering after ARES transformation produces better and more consistent results.
Submitted 20 January, 2024;
originally announced January 2024.
-
Salt Effects on Ionic Conductivity Mechanisms in Ethylene Carbonate Electrolytes: Interplay of Viscosity and Ion-ion Relaxations
Authors:
Hema Teherpuria,
Sapta Sindhu Paul Chowdhury,
Sridhar Kumar Kannam,
Prabhat K. Jaiswal,
Santosh Mogurampelly
Abstract:
The intricate role of shear viscosity and ion-pair relaxations on ionic conductivity mechanisms and the underlying changes induced by salt concentration ($c$) in organic liquid electrolytes remain poorly understood despite their widespread technological importance. Using molecular dynamics simulations employing nonpolarizable force fields for $c$ ranging from 10$^{-3}$ to 101 M, we show that the low and high $c$ regimes of the EC-LiTFSI electrolytes are distinctly characterized by $η\simτ_c^{1/2}$ and $η\simτ_c^{1}$, where $η$ and $τ_c$ are shear viscosity and cation-anion relaxation timescales, respectively. Our extensive simulations and analyses suggest a universal relationship between the ionic conductivity and $c$ as $σ(c)\sim c^αe^{-c/c_{0}} (α>0)$. The proposed relationship convincingly explains the ionic conductivity over a wide range of $c$, where the term $c^α$ accounts for the uncorrelated motion of ions and $e^{-c/c_0}$ captures the salt-induced changes in shear viscosity. Our simulations suggest that a vehicular mechanism dominates in the low-$c$ regime and transitions into a Grotthuss mechanism in the high-$c$ regime, where structural relaxation is the dominant form of ion transport. Our findings shed light on some of the fundamental aspects of the ion conductivity mechanisms in liquid electrolytes, offering insights into optimizing the ion transport in EC-LiTFSI electrolytes.
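One consequence of the proposed form $σ(c)\sim c^αe^{-c/c_{0}}$ can be worked out directly. The sketch below assumes the proportionality holds with a constant prefactor $A$ and that $c_0>0$; the symbols $A$ and $c^{*}$ are introduced here only for illustration and do not appear in the abstract. Under those assumptions the conductivity has a single maximum at $c^{*}=αc_0$.

```latex
\begin{align}
\sigma(c) &= A\, c^{\alpha} e^{-c/c_0}, \qquad A>0,\ \alpha>0,\ c_0>0 \\
\frac{d\sigma}{dc} &= A\, c^{\alpha-1} e^{-c/c_0}\left(\alpha - \frac{c}{c_0}\right) \\
\frac{d\sigma}{dc} = 0 \;&\Longrightarrow\; c^{*} = \alpha\, c_0, \qquad
\sigma(c^{*}) = A\,(\alpha c_0)^{\alpha} e^{-\alpha}
\end{align}
```

The derivative is positive for $c<αc_0$ and negative for $c>αc_0$, so in this reading the $c^α$ term drives the rise at low concentration and the exponential term drives the decline at high concentration.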
Submitted 20 January, 2024;
originally announced January 2024.
-
Leveraging Domain Adaptation for Accurate Machine Learning Predictions of New Halide Perovskites
Authors:
Dipannoy Das Gupta,
Zachary J. L. Bare,
Suxuen Yew,
Santosh Adhikari,
Brian DeCost,
Qi Zhang,
Charles Musgrave,
Christopher Sutton
Abstract:
We combine graph neural networks (GNN) with an inexpensive and reliable structure generation approach based on the bond-valence method (BVM) to train accurate machine learning models for screening 222,960 halide perovskites using statistical estimates of the DFT/PBE formation energy (Ef), and the PBE and HSE band gaps (Eg). The GNNs were fine-tuned using domain adaptation (DA) from a source model, which yields a 1.8-fold improvement in Ef and a 1.2- to 1.35-fold improvement in HSE Eg compared to direct training (i.e., without DA). Using these two ML models, 48 compounds out of 222,960 candidates were identified as both stable and having an HSE Eg that is relevant for photovoltaic applications. Of this subset, only 8 have been reported to date, indicating that 40 compounds remain unexplored to the best of our knowledge and therefore offer opportunities for potential experimental examination.
Submitted 19 January, 2024;
originally announced January 2024.
-
Instabilities in models of supergiants MWC 137 and MWC 314
Authors:
Sugyan Parida,
Abhay Pratap Yadav,
Santosh Joshi
Abstract:
In several B-type supergiants, photometric and spectroscopic variabilities together with episodes of enhanced mass-loss have been observed. Here we present the preliminary results of linear stability analysis followed by nonlinear numerical simulations in two B-type supergiants, MWC 137 and MWC 314. All the considered models of MWC 137 having mass in the range of 30 M$_{\odot}$ to 70 M$_{\odot}$ are unstable, while for MWC 314, models with mass below 31 M$_{\odot}$ are unstable. The instabilities have been followed into the nonlinear regime for selected models of these two supergiants. During the nonlinear numerical simulations, instabilities lead to finite amplitude pulsation with a well defined saturation level in the considered models of MWC 137 with mass greater than 42 M$_{\odot}$. The model of MWC 314 with mass of 40 M$_{\odot}$ - the suggested mass for the primary star - shows no instabilities in either the linear stability analysis or the nonlinear numerical simulations. The velocity amplitude reaches 10$^7$ cm/s in the nonlinear regime for the model of MWC 314 with mass of 30 M$_{\odot}$. Further extensive numerical simulations and observations are required to understand the origin of the observed variabilities in these stars.
Submitted 17 January, 2024;
originally announced January 2024.
-
Photometric Variability of Low Mass Stars and Brown Dwarfs in IC 348 and Taurus Star-Forming Regions
Authors:
Samrat Ghosh,
Soumen Mondal,
Santosh Joshi,
Sneh Lata,
Rajib Kumbhakar
Abstract:
Low-mass stars belonging to the M spectral type are the most numerous stars in our Galaxy, amounting to about two-thirds in number, and are found at the bottom of the main sequence in the H-R diagram. Multi-wavelength studies on star-forming regions help to understand the census of PMS stars, their formation process, and the interaction of expanding H II regions harboring massive stars with their natal molecular clouds. Photometric studies of low-mass stars, including brown dwarfs (BDs), provide important information on the evolution of their atmospheres, magnetic flares and chromospheric activity. This paper highlights a few interesting results from our optical I-band observations of 2MASS J03435638+3209591 in the young star-forming IC 348 region and three BDs in Taurus star-forming regions using ground-based telescopes as well as a space-based telescope. We estimated fast periodicities in the range of 1.5 to 3 hours in the Taurus BDs. Furthermore, using the long-term photometry from the Transiting Exoplanet Survey Satellite (TESS), we have conducted a time-resolved variability analysis of CFHT-BD-Tau 4. The periodogram analysis of TESS data reveals an orbital period of $\sim$ 3 days. We found two flare events in TESS sector 43 data for this BD and estimated the flare energies as $4.59\times10^{35}$ erg and $2.64\times10^{36}$ erg, which sit in the superflare range.
Submitted 15 January, 2024;
originally announced January 2024.
-
VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics
Authors:
Ammar A. Siddiqui,
Santosh Tirunagari,
Tehseen Zia,
David Windridge
Abstract:
Visual attribution in medical imaging seeks to make evident the diagnostically-relevant components of a medical image, in contrast to the more common detection of diseased tissue deployed in standard machine vision pipelines (which are less straightforwardly interpretable/explainable to clinicians). We here present a novel generative visual attribution technique, one that leverages latent diffusion models in combination with domain-specific large language models, in order to generate normal counterparts of abnormal images. The discrepancy between the two hence gives rise to a mapping indicating the diagnostically-relevant image components. To achieve this, we deploy image priors in conjunction with appropriate conditioning mechanisms in order to control the image generative process, including natural language text prompts acquired from medical science and applied radiology. We perform experiments and quantitatively evaluate our results on the COVID-19 Radiography Database containing labelled chest X-rays with differing pathologies via the Frechet Inception Distance (FID), Structural Similarity (SSIM) and Multi Scale Structural Similarity Metric (MS-SSIM) metrics obtained between real and generated images. The resulting system also exhibits a range of latent capabilities including zero-shot localized disease induction, which are evaluated with real examples from the CheXpert dataset.
Submitted 2 January, 2024;
originally announced January 2024.
-
b-it-bots RoboCup@Work Team Description Paper 2023
Authors:
Kevin Patel,
Vamsi Kalagaturu,
Vivek Mannava,
Ravisankar Selvaraju,
Shubham Shinde,
Dharmin Bakaraniya,
Deebul Nair,
Mohammad Wasil,
Santosh Thoduka,
Iman Awaad,
Sven Schneider,
Nico Hochgeschwender,
Paul G. Plöger
Abstract:
This paper presents the b-it-bots RoboCup@Work team and its current hardware and functional architecture for the KUKA youBot robot. We describe the underlying software framework and the developed capabilities required for operating in industrial environments including features such as reliable and precise navigation, flexible manipulation, robust object recognition and task planning. New developments include an approach to grasp vertical objects, placement of objects by considering the empty space on a workstation, and the process of porting our code to ROS2.
Submitted 29 December, 2023;
originally announced December 2023.
-
Report of the DOE/NSF Workshop on Correctness in Scientific Computing, June 2023, Orlando, FL
Authors:
Maya Gokhale,
Ganesh Gopalakrishnan,
Jackson Mayo,
Santosh Nagarakatte,
Cindy Rubio-González,
Stephen F. Siegel
Abstract:
This report is a digest of the DOE/NSF Workshop on Correctness in Scientific Computing (CSC'23) held on June 17, 2023, as part of the Federated Computing Research Conference (FCRC) 2023. CSC was conceived by DOE and NSF to address the growing concerns about correctness among those who employ computational methods to perform large-scale scientific simulations. These concerns have escalated, given the complexity, scale, and heterogeneity of today's HPC software and hardware. If correctness is not proactively addressed, there is the risk of producing flawed science on top of unacceptable productivity losses faced by computational scientists and engineers. HPC systems are beginning to include data-driven methods, including machine learning and surrogate models, and discussing their impact on overall HPC system correctness was also felt to be urgent.
Stakeholders of correctness in this space were identified to belong to several sub-disciplines of computer science; from computer architecture researchers who design special-purpose hardware that offers high energy efficiencies; numerical algorithm designers who develop efficient computational schemes based on reduced precision as well as reduced data movement; all the way to researchers in programming language and formal methods who seek methodologies for correct compilation and verification. To include attendees with such a diverse set of backgrounds, CSC was held during the Federated Computing Research Conference (FCRC) 2023.
Submitted 27 December, 2023; v1 submitted 25 December, 2023;
originally announced December 2023.
-
Muted: Multilingual Targeted Offensive Speech Identification and Visualization
Authors:
Christoph Tillmann,
Aashka Trivedi,
Sara Rosenthal,
Santosh Borse,
Rong Zhang,
Avirup Sil,
Bishwaranjan Bhattacharjee
Abstract:
Offensive language such as hate, abuse, and profanity (HAP) occurs in various content on the web. While previous work has mostly dealt with sentence level annotations, there have been a few recent attempts to identify offensive spans as well. We build upon this work and introduce Muted, a system to identify multilingual HAP content by displaying offensive arguments and their targets using heat maps to indicate their intensity. Muted can leverage any transformer-based HAP-classification model and its attention mechanism out-of-the-box to identify toxic spans, without further fine-tuning. In addition, we use the spaCy library to identify the specific targets and arguments for the words predicted by the attention heatmaps. We present the model's performance on identifying offensive spans and their targets in existing datasets and present new annotations on German text. Finally, we demonstrate our proposed visualization tool on multilingual inputs.
Submitted 18 December, 2023;
originally announced December 2023.
-
SkyScenes: A Synthetic Dataset for Aerial Scene Understanding
Authors:
Sahil Khose,
Anisha Pal,
Aayushi Agarwal,
Deepanshi,
Judy Hoffman,
Prithvijit Chattopadhyay
Abstract:
Real-world aerial scene understanding is limited by a lack of datasets that contain densely annotated images curated under a diverse set of conditions. Due to inherent challenges in obtaining such images in controlled real-world settings, we present SkyScenes, a synthetic dataset of densely annotated aerial images captured from Unmanned Aerial Vehicle (UAV) perspectives. We carefully curate SkyScenes images from CARLA to comprehensively capture diversity across layout (urban and rural maps), weather conditions, times of day, pitch angles and altitudes with corresponding semantic, instance and depth annotations. Through our experiments using SkyScenes, we show that (1) models trained on SkyScenes generalize well to different real-world scenarios, (2) augmenting training on real images with SkyScenes data can improve real-world performance, (3) controlled variations in SkyScenes can offer insights into how models respond to changes in viewpoint conditions, and (4) incorporating additional sensor modalities (depth) can improve aerial scene understanding.
Submitted 10 December, 2023;
originally announced December 2023.
-
JWST and ALMA discern the assembly of structural and obscured components in a high-redshift starburst galaxy
Authors:
Zhaoxuan Liu,
John D. Silverman,
Emanuele Daddi,
Annagrazia Puglisi,
Alvio Renzini,
Boris S. Kalita,
Jeyhan S. Kartaltepe,
Daichi Kashino,
Giulia Rodighiero,
Wiphu Rujopakarn,
Tomoko L. Suzuki,
Takumi S. Tanaka,
Francesco Valentino,
Irham Taufik Andika,
Caitlin M. Casey,
Andreas Faisst,
Maximilien Franco,
Ghassem Gozaliasl,
Steven Gillman,
Christopher C. Hayward,
Anton M. Koekemoer,
Vasily Kokorev,
Erini Lambrides,
Minju M. Lee,
Georgios E. Magdis
, et al. (5 additional authors not shown)
Abstract:
We present observations and analysis of the starburst, PACS-819, at z=1.45 ($M_*=10^{10.7}$ M$_{ \odot}$), using high-resolution ($0^{\prime \prime}.1$; 0.8 kpc) ALMA and multi-wavelength JWST images from the COSMOS-Web program. Dissimilar to HST/ACS images in the rest-frame UV, the redder NIRCam and MIRI images reveal a smooth central mass concentration and spiral-like features, atypical for such an intense starburst. Through dynamical modeling of the CO J=5--4 emission with ALMA, we find that PACS-819 is rotation-dominated and thus has a disk-like nature. However, kinematic anomalies in CO and asymmetric features in the bluer JWST bands (e.g., F150W) support a more disturbed nature likely due to interactions. The JWST imaging further enables us to map the distribution of stellar mass and dust attenuation, thus clarifying the relationships between different structural components, not discernible in the previous HST images. The CO J=5--4 and FIR dust continuum emission are co-spatial with a heavily-obscured starbursting core (<1 kpc) which is partially surrounded by much less obscured star-forming structures including a prominent arc, possibly a tidally-distorted dwarf galaxy, and a clump, either a sign of an ongoing violent disk instability or a recently accreted low-mass satellite. With spatially-resolved maps, we find a high molecular gas fraction in the central area reaching $\sim3$ ($M_{\text{gas}}$/$M_*$) and short depletion times ($M_{\text{gas}}/SFR\sim$ 120 Myr) across the entire system. These observations provide insights into the complex nature of starbursts in the distant universe and underscore the wealth of complementary information from high-resolution observations with both ALMA and JWST.
Submitted 10 May, 2024; v1 submitted 24 November, 2023;
originally announced November 2023.
-
Calibrated Language Models Must Hallucinate
Authors:
Adam Tauman Kalai,
Santosh S. Vempala
Abstract:
Recent language models generate false but plausible-sounding text with surprising frequency. Such "hallucinations" are an obstacle to the usability of language-based AI systems and can harm people who rely upon their outputs. This work shows that there is an inherent statistical lower-bound on the rate that pretrained language models hallucinate certain types of facts, having nothing to do with the transformer LM architecture or data quality. For "arbitrary" facts whose veracity cannot be determined from the training data, we show that hallucinations must occur at a certain rate for language models that satisfy a statistical calibration condition appropriate for generative language models. Specifically, if the maximum probability of any fact is bounded, we show that the probability of generating a hallucination is close to the fraction of facts that occur exactly once in the training data (a "Good-Turing" estimate), even assuming ideal training data without errors.
One conclusion is that models pretrained to be sufficiently good predictors (i.e., calibrated) may require post-training to mitigate hallucinations on the type of arbitrary facts that tend to appear once in the training set. However, our analysis also suggests that there is no statistical reason that pretraining will lead to hallucination on facts that tend to appear more than once in the training data (like references to publications such as articles and books, whose hallucinations have been particularly notable and problematic) or on systematic facts (like arithmetic calculations). Therefore, different architectures and learning algorithms may mitigate these latter types of hallucinations.
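The Good-Turing quantity mentioned in the abstract, the fraction of observations that occur exactly once, can be computed in a few lines. The sketch below is illustrative only: the function name `singleton_fraction` and the toy list of facts are assumptions made here, and the paper's formal definition of a fact and of the training data may differ.

```python
from collections import Counter


def singleton_fraction(observations):
    """Return the fraction of observations whose value appears exactly once.

    This is the classical Good-Turing estimate of the probability mass of
    unseen items: (number of values seen once) / (total observations).
    """
    if not observations:
        return 0.0
    counts = Counter(observations)
    # Count distinct values that were observed exactly one time.
    singletons = sum(1 for n in counts.values() if n == 1)
    return singletons / len(observations)


# Hypothetical toy corpus: "a" appears three times, "b", "c", "d" once each.
toy_facts = ["a", "b", "a", "c", "d", "a"]
print(singleton_fraction(toy_facts))
```

In the toy corpus three of the six observations are singletons, so the estimate is 0.5. A corpus in which every fact is repeated gives 0.0, which matches the abstract's point that the bound concerns facts that tend to appear only once.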
Submitted 19 March, 2024; v1 submitted 24 November, 2023;
originally announced November 2023.
-
On the stability and pulsation in models of B[e] star MWC 137
Authors:
Sugyan Parida,
Abhay Pratap Yadav,
Michaela Kraus,
Wolfgang Glatzel,
Yogesh Chandra Joshi,
Santosh Joshi
Abstract:
B[e] type stars are characterised by strong emission lines, photometric and spectroscopic variabilities and unsteady mass-loss rates. MWC 137 is a galactic B[e] type star situated in the constellation Orion. Recent photometric observation of MWC 137 by TESS has revealed variabilities with a dominant period of 1.9 d. The origin of this variability is not known but is suspected to be stellar pulsation. To understand the nature and origin of this variability, we have constructed three different sets of models of MWC 137 and performed non-adiabatic linear stability analysis. Several low order modes are found to be unstable, among which models having mass in the range of 31 to 34 M$_{\odot}$ and 43 to 46 M$_{\odot}$ have periods close to 1.9 d. The evolution of instabilities in the non-linear regime for a model having solar chemical composition and mass of 45 M$_{\odot}$ leads to finite amplitude pulsation with a period of 1.9 d. Therefore in the present study we confirm that this variability in MWC 137 is due to pulsation. Evolutionary tracks passing through the location of MWC 137 in the HR diagram indicate that the star is either in the post main sequence evolutionary phase or about to enter this evolutionary phase.
Submitted 21 November, 2023;
originally announced November 2023.
-
Optical polarisation study of Galactic Open clusters
Authors:
Namita Uppal,
Shashikiran Ganesh,
Santosh Joshi,
Mrinmoy Sarkar,
Prachi Prajapati,
Athul Dileep
Abstract:
Dust is a ubiquitous component in our Galaxy. It accounts for only $1\%$ of the mass of the ISM but still is an essential part of the Galaxy. It affects our view of the Galaxy by obscuring the starlight at shorter wavelengths and re-emitting at longer wavelengths. Studying the dust distribution in the Galaxy at longer wavelengths may cause discrepancies due to distance ambiguity caused by unknown Galactic potential. However, another aspect of dust, i.e., the polarisation of the background starlight, when combined with distance information, will help to give direct observational evidence of the number of dust clouds encountered in the line of sight. We observed 15 open clusters distributed at increasing distances in three lines of sight using two Indian national facilities. The measured polarisation results are used to scrutinize the dust distribution and orientation of the local plane of sky magnetic fields towards selected directions. The analysis of the stars observed towards the distant cluster King 8 shows two foreground layers at distances of $\sim 500$ pc and $\sim$ 3500 pc. Similar analysis towards different clusters also results in multiple dust layers.
Submitted 16 November, 2023;
originally announced November 2023.
-
AttentioNet: Monitoring Student Attention Type in Learning with EEG-Based Measurement System
Authors:
Dhruv Verma,
Sejal Bhalla,
S. V. Sai Santosh,
Saumya Yadav,
Aman Parnami,
Jainendra Shukla
Abstract:
Student attention is an indispensable input for uncovering their goals, intentions, and interests, which prove to be invaluable for a multitude of research areas, ranging from psychology to interactive systems. However, most existing methods to classify attention fail to model its complex nature. To bridge this gap, we propose AttentioNet, a novel Convolutional Neural Network-based approach that utilizes Electroencephalography (EEG) data to classify attention into five states: Selective, Sustained, Divided, Alternating, and relaxed state. We collected a dataset of 20 subjects through standard neuropsychological tasks to elicit different attentional states. The average across-student accuracy of our proposed model at this configuration is 92.3% (SD=3.04), which is well-suited for end-user applications. Our transfer learning-based approach for personalizing the model to individual subjects effectively addresses the issue of individual variability in EEG signals, resulting in improved performance and adaptability of the model for real-world applications. This represents a significant advancement in the field of EEG-based classification. Experimental results demonstrate that AttentioNet outperforms a popular EEGnet baseline (p-value < 0.05) in both subject-independent and subject-dependent settings, confirming the effectiveness of our proposed approach despite the limitations of our dataset. These results highlight the promising potential of AttentioNet for attention classification using EEG data.
Submitted 6 November, 2023;
originally announced November 2023.
-
Contrastive Moments: Unsupervised Halfspace Learning in Polynomial Time
Authors:
Xinyuan Cao,
Santosh S. Vempala
Abstract:
We give a polynomial-time algorithm for learning high-dimensional halfspaces with margins in $d$-dimensional space to within desired TV distance when the ambient distribution is an unknown affine transformation of the $d$-fold product of an (unknown) symmetric one-dimensional logconcave distribution, and the halfspace is introduced by deleting at least an $ε$ fraction of the data in one of the component distributions. Notably, our algorithm does not need labels and establishes the unique (and efficient) identifiability of the hidden halfspace under this distributional assumption. The sample and time complexity of the algorithm are polynomial in the dimension and $1/ε$. The algorithm uses only the first two moments of suitable re-weightings of the empirical distribution, which we call contrastive moments; its analysis uses classical facts about generalized Dirichlet polynomials and relies crucially on a new monotonicity property of the moment ratio of truncations of logconcave distributions. Such algorithms, based only on first and second moments were suggested in earlier work, but hitherto eluded rigorous guarantees.
Prior work addressed the special case when the underlying distribution is Gaussian via Non-Gaussian Component Analysis. We improve on this by providing polytime guarantees based on Total Variation (TV) distance, in place of existing moment-bound guarantees that can be super-polynomial. Our work is also the first to go beyond Gaussians in this setting.
Submitted 2 November, 2023;
originally announced November 2023.