Search | arXiv e-print repository

Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification

Authors: Pritish Sahu, Karan Sikka, Ajay Divakaran

Abstract: Large Visual Language Models (LVLMs) struggle with hallucinations in visual instruction-following tasks, limiting their trustworthiness and real-world applicability. We propose Pelican, a novel framework designed to detect and mitigate hallucinations through claim verification. Pelican first decomposes the visual claim into a chain of sub-claims based on first-order predicates. These sub-claims consist of (predicate, question) pairs and can be conceptualized as nodes of a computational graph. We then use Program-of-Thought prompting to generate Python code for answering these questions through flexible composition of external tools. Pelican improves over prior work by introducing (1) intermediate variables for precise grounding of object instances, and (2) shared computation for answering the sub-questions to enable adaptive corrections and inconsistency identification. We finally use the reasoning abilities of an LLM to verify the correctness of the claim by considering the consistency and confidence of the (question, answer) pairs from each sub-claim. Our experiments reveal a drop in hallucination rate by $\sim$8%-32% across various baseline LVLMs and a 27% drop compared to approaches proposed for hallucination mitigation on MMHal-Bench. Results on two other benchmarks further corroborate our results.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2405.11892 [pdf, other]

Emergence of giant orbital Hall and tunable spin Hall effects in centrosymmetric TMDs

Authors: Pratik Sahu, Jatin Kumar Bidika, Bubunu Biswal, S. Satpathy, B. R. K. Nanda

Abstract: We demonstrate the formation of orbital and spin Hall effects (OHE/SHE) in the 1T phase of non-magnetic transition metal dichalcogenides. With the aid of density functional theory calculations and model Hamiltonian studies on MX$_2$ (M = Pt, Pd and X = S, Se, and Te), we show an intrinsic orbital Hall conductivity ($\sim 10^3 \hbar /e\ Ω^{-1}cm^{-1}$), which primarily emerges due to the orbital texture around the valleys in the momentum space. The robust spin-orbit coupling in these systems induces a sizable SHE out of OHE. Furthermore, to resemble the typical experimental setups, where the magnetic overlayers produce a proximity magnetic field, we examine the effect of magnetic field on OHE and SHE and show that the latter can be doubled in this class of compounds. With a giant OHE and tunable SHE, the 1T-TMDs are promising candidates for spin and orbital driven quantum devices such as SOT-MRAM, spin nano-oscillators, spin logic devices etc., and to carry out spin-charge conversion experiments for fundamental research.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.11201 [pdf, ps, other]

On General Weighted Extropy of Percentile Ranked Set Sampling

Authors: Pradeep Kumar Sahu, Nitin Gupta

Abstract: The extropy measure, first proposed by Lad, Sanfilippo, and Agro in their 2015 paper in Statistical Science, has attracted considerable attention in recent years. Our study introduces a fresh approach to representing weighted extropy in the framework of percentile ranked set sampling. Furthermore, we provide additional insights such as stochastic orders, characterizations, and bounds. Our findings illuminate the comparison between the weighted extropy of percentile ranked set sampling and its equivalent in simple random sampling.

Submitted 18 May, 2024; originally announced May 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2403.02673, arXiv:2207.02003

arXiv:2405.09557 [pdf, other]

Machine Learning in Short-Reach Optical Systems: A Comprehensive Survey

Authors: Chen Shao, Elias Giacoumidis, Syed Moktacim Billah, Shi Li, Jialei Li, Prashasti Sahu, Andre Richter, Tobias Kaefer, Michael Faerber

Abstract: In recent years, extensive research has been conducted to explore the utilization of machine learning algorithms in various direct-detected and self-coherent short-reach communication applications. These applications encompass a wide range of tasks, including bandwidth request prediction, signal quality monitoring, fault detection, traffic prediction, and digital signal processing (DSP)-based equalization. As a versatile approach, machine learning demonstrates the ability to address stochastic phenomena in optical systems networks where deterministic methods may fall short. However, when it comes to DSP equalization algorithms, their performance improvements are often marginal, and their complexity is prohibitively high, especially in cost-sensitive short-reach communications scenarios such as passive optical networks (PONs). They excel in capturing temporal dependencies, handling irregular or nonlinear patterns effectively, and accommodating variable time intervals. Within this extensive survey, we outline the application of machine learning techniques in short-reach communications, specifically emphasizing their utilization in high-bandwidth demanding PONs. Notably, we introduce a novel taxonomy for time-series methods employed in machine learning signal processing, providing a structured classification framework. Our taxonomy categorizes current time series methods into four distinct groups: traditional methods, Fourier convolution-based methods, transformer-based models, and time-series convolutional networks. Finally, we highlight prospective research directions within this rapidly evolving field and outline specific solutions to mitigate the complexity associated with hardware implementations. We aim to pave the way for more practical and efficient deployment of machine learning approaches in short-reach optical communication systems by addressing complexity concerns.

Submitted 29 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: 23 pages, 2 figures, 3 tables, accepted for the MDPI Photonics Journal Special Issue "Machine Learning Applied to Optical Communication Systems"

arXiv:2404.14353 [pdf]

Electroporation-mediated Metformin for effective anticancer treatment of triple-negative breast cancer cells

Authors: Praveen Sahu, Ignacio G. Camarillo, Pragatheiswar Giri, Raji Sundararajan

Abstract: In this research, we investigated the efficacy of Metformin, the most commonly administered type-2 diabetes drug, for triple negative breast cancer (TNBC) treatment, due to its various anticancer properties. It is a plant-based bio-compound, synthesized as a novel biguanide, called dimethyl biguanide or metformin. One of the ways it operates is by hindering electron transport chain complex I in mitochondria, which causes a drop in energy (ATP) generation. This eventually builds energetic stress and a decline in energy. Therefore, the natural cellular processes and proliferating tumor cells are obstructed. Here, we used electroporation, where MDA-MB-231 human TNBC cells were subjected to high-intensity, short-duration electrical pulses (EP) in the presence of Metformin. The cell viability results indicate lower cell viability of 43.45% as compared to 85.20% with drug alone at 5 mM concentration. This indicates that Metformin, the most common diabetes drug, could also be explored for cancer treatment.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.13007 [pdf]

Influence of strain and point defects on the electronic structure and related properties of (111)NiO epitaxial films

Authors: Bhabani Prasad Sahu, Poonam Sharma, Santosh Kumar Yadav, Alok Shukla, Subhabrata Dhar

Abstract: (111)NiO epitaxial films are grown on c-sapphire substrates at various growth temperatures ranging from room temperature to 600 °C using the pulsed laser deposition (PLD) technique. Two series of samples, where different laser fluences are used to ablate the target, are studied here. Films grown with higher laser fluence are found to be embedded with Ni-clusters crystallographically aligned with the (111)NiO matrix, whereas the layers grown with lower laser energy density exhibit p-type conductivity, especially at low growth temperatures. X-ray diffraction study shows the coexistence of biaxial compressive and tensile hydrostatic strains in these samples, which results in an expansion of the lattice primarily along the growth direction. This effective uniaxial expansion $\epsilon_\perp$ increases with the reduction of the growth temperature. Band gap of these samples is found to decrease linearly with $\epsilon_\perp$. This result is validated by density functional theory (DFT) calculations. Experimental findings and the theoretical study further indicate that V_Ni + O_I and V_O + Ni_I complexes exist as the dominant native defects in samples grown with Ni-deficient (low laser fluence) and Ni-rich (high laser fluence) conditions, respectively. P-type conductivity observed in the samples grown in Ni-deficient condition is more likely to result from V_Ni + O_I defects than from Ni-vacancies (V_Ni).

Submitted 19 April, 2024; originally announced April 2024.

Comments: 9 pages and 9 figures (Main manuscript), 4 pages and 4 figures (supplemental material)

arXiv:2403.02673 [pdf, ps, other]

On General Weighted Extropy of Extreme Ranked Set Sampling

Authors: Pradeep Kumar Sahu, Nitin Gupta

Abstract: The extropy measure, introduced by Lad, Sanfilippo, and Agro in their 2015 paper in Statistical Science, has garnered significant interest over the past years. In this study, we present a novel representation for the weighted extropy within the context of extreme ranked set sampling. Additionally, we offer related findings such as stochastic orders, characterizations, and precise bounds. Our results shed light on the comparison between the weighted extropy of extreme ranked set sampling and its counterpart in simple random sampling.

Submitted 5 March, 2024; originally announced March 2024.

Comments: arXiv admin note: text overlap with arXiv:2207.02003

arXiv:2312.03484 [pdf, other]

Higher order dynamical charge fluctuations in heavy-ion collisions

Authors: Bhanu Sharma, P. K. Sahu

Abstract: The event-by-event charge fluctuation measurements are proposed to provide the signature of quark-gluon plasma (QGP) in heavy-ion collisions. The measure of dynamical charge fluctuations is expected to carry information of the initial fractional charge of the QGP phase at the final state. We propose the higher order charge fluctuation measurement to study the QGP signal in heavy-ion collisions. This higher order charge fluctuation observable can amplify the signature of QGP. Also, the SMASH model is used to study the behavior of these observables in heavy-ion collisions at center of mass energies accessible in the STAR beam energy scan program.

Submitted 6 December, 2023; originally announced December 2023.

Comments: 14 pages, 4 figures

arXiv:2310.09337 [pdf, other]

doi 10.1007/JHEP03(2024)029

Leptogenesis in a Left-Right Symmetric Model with double seesaw

Authors: Utkarsh Patel, Pratik Adarsh, Sudhanwa Patra, Purushottam Sahu

Abstract: We explore the connection between low-scale CP-violating Dirac phase $(δ)$ and high-scale leptogenesis in a Left-Right Symmetric Model (LRSM) with scalar bidoublet and doublets. The fermion sector of the model is extended with one sterile neutrino $(S_L)$ per generation to implement a double seesaw mechanism in the neutral fermion mass matrix. The double seesaw is performed via the implementation of type-I seesaw twice. The first seesaw facilitates the generation of Majorana mass term for heavy right-handed (RH) neutrinos $(N_R)$, and the light neutrino mass becomes linearly dependent on $S_L$ mass in the second. In our framework, we have taken charge conjugation ($C$) as the discrete left-right (LR) symmetry. This choice assists in deriving the Dirac neutrino mass matrix ($M_D$) in terms of the light and heavy RH neutrino masses and light neutrino mixing matrix $U_{PMNS}$ (containing $δ$). We illustrate the viability of unflavored thermal leptogenesis via the decay of RH neutrinos by using the obtained $M_D$ with the masses of RH neutrinos as input parameters. A complete analysis of the Boltzmann equations describing the asymmetry evolution is performed in the unflavored regime, and it is shown that with or without Majorana phases, the CP-violating Dirac phase is sufficient to produce the required asymmetry in the leptonic sector within this framework for a given choice of input parameters. Finally, we comment on the possibility of constraining our model with the current and near-future oscillation experiments, which are aimed at refining the value of $δ$.

Submitted 15 February, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: 45 pages, 13 figures, 5 Tables

Journal ref: Journal of High Energy Physics 2024-03-05 | Journal article

arXiv:2309.13581 [pdf, other]

CDF II W-mass anomaly and SO(10) GUT

Authors: Purushottam Sahu, Hiranmaya Mishra, Prasanta K. Panigrahi, Sudhanwa Patra, Utpal Sarkar

Abstract: The W-mass anomaly has yet to be established, but a huge proliferation of articles on the subject has established the rich potential of such an event. We investigate the SO(10) GUT constraints from the recently reported W-mass anomaly. We consider both supersymmetric (SUSY) and non-supersymmetric (non-SUSY) grand unified theories by studying renormalization group equations (RGEs) for gauge coupling unification and their predictions on proton decay. In the non-SUSY models, single-stage unification is possible if one includes a light (around TeV) real triplet Higgs scalar. However, these models predict speedy proton decay, inconsistent with the present experimental bound on the proton decay. This situation may be improved by including newer scalars and new intermediate-mass scales, which are present in the $SO(10)$ GUTs. The standard model is extended to a left-right symmetric model (LR), and the scale of LR breaking naturally introduces the intermediate scale in the model. A single-stage unification is possible even without including any triplet Higgs scalar in a minimal supersymmetric standard model.

Submitted 24 September, 2023; originally announced September 2023.

Comments: 9 pages, 4 figures

arXiv:2309.12938 [pdf, other]

Frustrated with Code Quality Issues? LLMs can Help!

Authors: Nalin Wadhwa, Jui Pradhan, Atharv Sonwane, Surya Prakash Sahu, Nagarajan Natarajan, Aditya Kanade, Suresh Parthasarathy, Sriram Rajamani

Abstract: As software projects progress, quality of code assumes paramount importance as it affects reliability, maintainability and security of software. For this reason, static analysis tools are used in developer workflows to flag code quality issues. However, developers need to spend extra effort to revise their code to improve code quality based on the tool findings. In this work, we investigate the use of (instruction-following) large language models (LLMs) to assist developers in revising code to resolve code quality issues. We present a tool, CORE (short for COde REvisions), architected using a pair of LLMs organized as a duo comprised of a proposer and a ranker. Providers of static analysis tools recommend ways to mitigate the tool warnings and developers follow them to revise their code. The proposer LLM of CORE takes the same set of recommendations and applies them to generate candidate code revisions. The candidates which pass the static quality checks are retained. However, the LLM may introduce subtle, unintended functionality changes which may go undetected by the static analysis. The ranker LLM evaluates the changes made by the proposer using a rubric that closely follows the acceptance criteria that a developer would enforce. CORE uses the scores assigned by the ranker LLM to rank the candidate revisions before presenting them to the developer. CORE could revise 59.2% of Python files (across 52 quality checks) so that they pass scrutiny by both a tool and a human reviewer. The ranker LLM is able to reduce false positives by 25.8% in these cases. CORE produced revisions that passed the static analysis tool in 76.8% of Java files (across 10 quality checks), comparable to 78.3% of a specialized program repair tool, with significantly less engineering effort.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2308.11151 [pdf]

Predicting Pair Correlation Functions of Glasses using Machine Learning

Authors: Kumar Ayush, Pooja Sahu, Sk Musharaf Ali, Tarak K Patra

Abstract: Glasses offer a broad range of tunable thermophysical properties that are linked to their compositions. However, it is challenging to establish a universal composition-property relation of glasses due to their enormous composition and chemical space. Here, we address this problem and develop a metamodel of composition-atomistic structure relation of a class of glassy material via a machine learning (ML) approach. Within this ML framework, an unsupervised deep learning technique, viz. convolutional neural network (CNN) autoencoder, and a regression algorithm, viz. random forest (RF), are integrated into a fully automated pipeline to predict the spatial distribution of atoms in a glass. The RF regression model predicts the pair correlation function of a glass in a latent space. Subsequently, the decoder of the CNN converts the latent space representation to the actual pair correlation function of the given glass. The atomistic structures of silicate (SiO2) and sodium borosilicate (NBS) based glasses with varying compositions and dopants are collected from molecular dynamics (MD) simulations to establish and validate this ML pipeline. The model is found to predict the atom pair correlation function for many unknown glasses very accurately. This method is very generic and can accelerate the design, discovery, and fundamental understanding of composition-atomistic structure relations of glasses and other materials.

Submitted 21 August, 2023; originally announced August 2023.

arXiv:2308.06999 [pdf]

Ni cluster embedded (111)NiO layers grown on (0001)GaN films using pulsed laser deposition technique

Authors: Simran Arora, Shivesh Yadav, Amandeep Kaur, Bhabani Prasad Sahu, Zainab Hussain, Subhabrata Dhar

Abstract: (111) NiO epitaxial layers embedded with crystallographically oriented Ni-clusters are grown on c-GaN/Sapphire templates using the pulsed laser deposition technique. Structural and magnetic properties of the films are examined by a variety of techniques including high resolution x-ray diffraction, precession-electron diffraction and superconducting quantum interference device magnetometry. The study reveals that the inclusion, orientation, shape, size, density and magnetic properties of these clusters depend strongly on the growth temperature (TG). Though most of the Ni-clusters are found to be crystallographically aligned with the NiO matrix with Ni(111) parallel to NiO(111), clusters with other orientations also exist, especially in samples grown at lower temperatures. Average size and density of the clusters increase with TG. Proportion of the Ni(111) parallel to NiO(111) oriented clusters also improves as TG is increased. All cluster embedded films show ferromagnetic behaviour even at room temperature. Easy-axis is found to be oriented in the layer plane in samples grown at relatively lower temperatures. However, it turns perpendicular to the layer plane for samples grown at sufficiently high temperatures. This reversal of easy-axis has been attributed to the size dependent competition between the shape, magnetoelastic and the surface anisotropies of the clusters. This composite material thus has great potential to serve as spin-injector and spin-storage medium in GaN based spintronics of the future.

Submitted 14 August, 2023; originally announced August 2023.

arXiv:2307.16122 [pdf]

doi 10.1016/j.msea.2023.145648

Enabling plastic co-deformation of disparate phases in a laser rapid solidified Sr-modified Al-Si eutectic through partial-dislocation-mediated-plasticity in Si

Authors: Arkajit Ghosh, Wenqian Wu, Bibhu Prasad Sahu, Jian Wang, Amit Misra

Abstract: Nano-scale eutectics, such as rapid solidified Al-Si, exhibit enhanced yield strength and strain hardening but plasticity is limited by cracking of the hard phase (Si). Mechanisms that may suppress cracking and enable plastic co-deformation of soft and hard phases are key to maximizing plasticity in these high-strength microstructures. Using a combination of laser rapid solidification and chemical (Sr) modification, we have synthesized fully eutectic Al-Si microstructures with heavily twinned Si nano-fibers that exhibit high hardness up to 2.9 GPa, and high compressive flow strength (~840 MPa) with stable plastic flow to ~26% plastic strain. After deformation, the hard Si(Sr) fibers did not exhibit cracks, but a high density of stacking faults was observed in the Si(Sr) fibers, suggesting partial dislocation mediated plasticity. Mechanisms for suppression of cracking and activation of partial dislocations in Si deformed at room temperature are discussed in terms of nanoscale fiber geometry with reduced aspect ratio and lowering of the Peierls barrier in chemically-modified, nano-twinned Si fibers.

Submitted 30 July, 2023; originally announced July 2023.

Comments: 29 pages, 15 figures

arXiv:2307.11524 [pdf, other]

doi 10.1021/acs.nanolett.3c04060

Magnetic Proximity induced efficient charge-to-spin conversion in large area PtSe$_{2}$/Ni$_{80}$Fe$_{20}$ heterostructures

Authors: Richa Mudgal, Alka Jakhar, Pankhuri Gupta, Ram Singh Yadav, B. Biswal, P. Sahu, Himanshu Bangar, Akash Kumar, Niru Chowdhury, Biswarup Satpati, B. R. K. Nanda, S. Satpathy, Samaresh Das, P. K. Muduli

Abstract: As a topological Dirac semimetal with controllable spin-orbit coupling and conductivity, PtSe$_2$, a transition-metal dichalcogenide, is a promising material for several applications from optoelectric to sensors. However, its potential for spintronics applications is yet to be explored. In this work, we demonstrate that PtSe$_{2}$/Ni$_{80}$Fe$_{20}$ heterostructure can generate large damping-like current-induced spin-orbit torques (SOT), despite the absence of spin-splitting in bulk PtSe$_{2}$. The efficiency of charge-to-spin conversion is found to be $(-0.1 \pm 0.02)$ nm$^{-1}$ in PtSe$_{2}$/Ni$_{80}$Fe$_{20}$, which is three times that of the control sample, Ni$_{80}$Fe$_{20}$/Pt. Our band structure calculations show that the SOT due to the PtSe$_2$ arises from an unexpectedly large spin splitting in the interfacial region of PtSe$_2$ introduced by the proximity magnetic field of the Ni$_{80}$Fe$_{20}$ layer. Our results open up the possibilities of using large-area PtSe$_{2}$ for energy-efficient nanoscale devices by utilizing the proximity-induced SOT.

Submitted 21 July, 2023; originally announced July 2023.

Comments: 18 pages, 4 figures

Journal ref: Nano Lett. 23, 11925 (2023)

arXiv:2306.15792 [pdf, other]

Sidecars on the Central Lane: Impact of Network Proxies on Microservices

Authors: Prateek Sahu, Lucy Zheng, Marco Bueso, Shijia Wei, Neeraja J. Yadwadkar, Mohit Tiwari

Abstract: Cloud applications are moving away from the monolithic model towards loosely-coupled microservices designs. Service meshes are widely used for implementing microservices applications mainly because they provide a modular architecture for modern applications by separating operational features from application business logic. Sidecar proxies in service meshes enable this modularity by applying security, networking, and monitoring policies on the traffic to and from services. To implement these policies, sidecars often execute complex chains of logic that vary across associated applications and end up unevenly impacting the performance of the overall application. Lack of understanding of how the sidecars impact the performance of microservice-based applications stands in the way of building performant and resource-efficient applications. To this end, we bring sidecar proxies in focus and argue that we need to deeply study their impact on the system performance and resource utilization. We identify and describe challenges in characterizing sidecars, namely the need for microarchitectural metrics and comprehensive methodologies, and discuss research directions where such characterization will help in building efficient service mesh infrastructure for microservice applications.

Submitted 17 October, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

Comments: Presented at HotInfra 2023 (co-located with ISCA 2023, Orlando, FL)

arXiv:2302.14685 [pdf, other]

DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks

Authors: Samyak Jain, Sravanti Addepalli, Pawan Sahu, Priyam Dey, R. Venkatesh Babu

Abstract: Generalization of neural networks is crucial for deploying them safely in the real world. Common training strategies to improve generalization involve the use of data augmentations, ensembling and model averaging. In this work, we first establish a surprisingly simple but strong benchmark for generalization which utilizes diverse augmentations within a training minibatch, and show that this can learn a more balanced distribution of features. Further, we propose the Diversify-Aggregate-Repeat Training (DART) strategy that first trains diverse models using different augmentations (or domains) to explore the loss basin, and further Aggregates their weights to combine their expertise and obtain improved generalization. We find that Repeating the step of Aggregation throughout training improves the overall optimization trajectory and also ensures that the individual models have a sufficiently low loss barrier to obtain improved generalization on combining them. We shed light on our approach by casting it in the framework proposed by Shen et al. and theoretically show that it indeed generalizes better. In addition to improvements in In-Domain generalization, we demonstrate SOTA performance on the Domain Generalization benchmarks in the popular DomainBed framework as well. Our method is generic and can easily be integrated with several base training algorithms to achieve performance gains.

Submitted 10 June, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

Comments: CVPR 2023. First two authors contributed equally

arXiv:2302.14538 [pdf, other]

doi 10.1103/PhysRevD.107.075037

Neutrinoless double beta decay in Left-Right symmetric model with double seesaw mechanism

Authors: Sudhanwa Patra, S. T. Petcov, Prativa Pritimita, Purushottam Sahu

Abstract: We discuss a left-right (L-R) symmetric model with the double seesaw mechanism at the TeV scale generating Majorana masses for the active left-handed (LH) flavour neutrinos $ν_{αL}$ and the heavy right-handed (RH) neutrinos $N_{βR}$, $α,β= e,μ,τ$, which in turn mediate lepton number violating processes, including neutrinoless double beta decay. The Higgs sector is composed of two Higgs doublets $H_L$, $H_R$, and a bi-doublet $Φ$. The fermion sector has the quarks and leptons usual for L-R symmetric models, along with three $SU(2)$ singlet fermions $S_{γL}$. The choice of bare Majorana mass term for these sterile fermions induces large Majorana masses for the heavy RH neutrinos leading to two sets of heavy Majorana particles $N_j$ and $S_k$, $j,k=1,2,3$, with masses $m_{N_j} \ll m_{S_k}$. Working with a specific version of the model in which the $ν_{αL} - N_{βR}$ and the $N_{βR} - S_{γL}$ Dirac mass terms are diagonal, and assuming that $m_{N_j} \sim (1 - 1000)$ GeV and ${\rm max}(m_{S_k}) \sim (1 - 10)$ TeV, $m_{N_j} \ll m_{S_k}$, we study in detail the new "non-standard" contributions to the $0νββ$ decay amplitude and half-life arising due to the exchange of virtual $N_j$ and $S_k$. We find that in both cases of NO and IO light neutrino mass spectra, these contributions are strongly enhanced and are dominant at relatively small values of the lightest neutrino mass $m_{1(3)} \sim (10^{-4} - 10^{-2})$ eV over the light Majorana neutrino exchange contribution.
In a large part of the parameter space, the predictions of the model for the $0νββ$ decay generalised effective Majorana mass and half-life are within the sensitivity range of the planned next generation of neutrinoless double beta decay experiments LEGEND-200 (LEGEND-1000), nEXO, KamLAND-Zen-II, CUPID, NEXT-HD.

Submitted 27 April, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

Comments: 48 pages, 4 figures

Journal ref: Phys. Rev. D 107, 075037 2023-04-26

arXiv:2211.13769 [pdf, other]

On Designing Light-Weight Object Trackers through Network Pruning: Use CNNs or Transformers?

Authors: Saksham Aggarwal, Taneesh Gupta, Pawan Kumar Sahu, Arnav Chavan, Rishabh Tiwari, Dilip K. Prasad, Deepak K. Gupta

Abstract: Object trackers deployed on low-power devices need to be light-weight; however, most of the current state-of-the-art (SOTA) methods rely on compute-heavy backbones built using CNNs or transformers. The large size of such models does not allow their deployment in low-power conditions, and designing compressed variants of large tracking models is of great importance. This paper demonstrates how highly compressed light-weight object trackers can be designed using neural architectural pruning of large CNN and transformer based trackers. Further, a comparative study on architectural choices best suited to design light-weight trackers is provided. A comparison between SOTA trackers using CNNs, transformers as well as the combination of the two is presented to study their stability at various compression ratios. Finally, results for extreme pruning scenarios going as low as 1% in some cases are shown to study the limits of network pruning in object tracking. This work provides deeper insights into designing highly efficient trackers from existing SOTA methods.

Submitted 26 March, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

Comments: Accepted at IEEE ICASSP 2023

arXiv:2211.13441 [pdf, ps, other]

doi 10.1080/02331888.2023.2241595

On weighted cumulative residual extropy and weighted negative cumulative extropy

Authors: Nitin Gupta, Santosh Kumar Chaudhary, Pradeep Kumar Sahu

Abstract: In this paper, we define general weighted cumulative residual extropy (GWCRJ) and general weighted negative cumulative extropy (GWNCJ). We obtain simple estimators of them for complete and right censored data. We obtain some results on GWCRJ and GWNCJ. We establish their connection to reliability theory and coherent systems. We also propose empirical estimators of weighted negative cumulative extropy (WNCJ).

Submitted 24 November, 2022; originally announced November 2022.

Report number: GSTA 2241595

MSC Class: 94A17; 62N05; 60E15

Journal ref: 2023

arXiv:2210.09048 [pdf, other]

ATHENA Detector Proposal -- A Totally Hermetic Electron Nucleus Apparatus proposed for IP6 at the Electron-Ion Collider

Authors: ATHENA Collaboration, J. Adam, L. Adamczyk, N. Agrawal, C. Aidala, W. Akers, M. Alekseev, M. M. Allen, F. Ameli, A. Angerami, P. Antonioli, N. J. Apadula, A. Aprahamian, W. Armstrong, M. Arratia, J. R. Arrington, A. Asaturyan, E. C. Aschenauer, K. Augsten, S. Aune, K. Bailey, C. Baldanza, M. Bansal, F. Barbosa, L. Barion, et al. (415 additional authors not shown)

Abstract: ATHENA has been designed as a general purpose detector capable of delivering the full scientific scope of the Electron-Ion Collider. Careful technology choices provide fine tracking and momentum resolution, high performance electromagnetic and hadronic calorimetry, hadron identification over a wide kinematic range, and near-complete hermeticity. This article describes the detector design and its expected performance in the most relevant physics channels. It includes an evaluation of detector technology choices, the technical challenges to realizing the detector and the R&D required to meet those challenges.

Submitted 13 October, 2022; originally announced October 2022.

Journal ref: JINST 17 (2022) 10, P10019

arXiv:2210.07712 [pdf, ps, other]

On general weighted cumulative past extropy

Authors: Pradeep Kumar Sahu, Nitin Gupta

Abstract: In this paper, we study some properties and characterization of the general weighted cumulative past extropy (n-WCPJ). Many results including some bounds, inequalities, and effects of linear transformations are obtained. We study the characterization of n-WCPJ based on the largest order statistics. Conditional WCPJ and some of its properties are discussed.

Submitted 8 May, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

Comments: 12

MSC Class: 62B10; 62N05; 94A15; 94A17

arXiv:2209.15093 [pdf, other]

Unpacking Large Language Models with Conceptual Consistency

Authors: Pritish Sahu, Michael Cogswell, Yunye Gong, Ajay Divakaran

Abstract: If a Large Language Model (LLM) answers "yes" to the question "Are mountains tall?" then does it know what a mountain is? Can you rely on it responding correctly or incorrectly to other questions about mountains? The success of Large Language Models (LLMs) indicates they are increasingly able to answer queries like these accurately, but that ability does not necessarily imply a general understanding of concepts relevant to the anchor query. We propose conceptual consistency to measure an LLM's understanding of relevant concepts. This novel metric measures how well a model can be characterized by finding out how consistent its responses to queries about conceptually relevant background knowledge are. To compute it we extract background knowledge by traversing paths between concepts in a knowledge base and then try to predict the model's response to the anchor query from the background knowledge. We investigate the performance of current LLMs in a commonsense reasoning setting using the CSQA dataset and the ConceptNet knowledge base. While conceptual consistency, like other metrics, does increase with the scale of the LLM used, we find that popular models do not necessarily have high conceptual consistency. Our analysis also shows significant variation in conceptual consistency across different kinds of relations, concepts, and prompts. This serves as a step toward building models that humans can apply a theory of mind to, and thus interact with intuitively.

Submitted 29 September, 2022; originally announced September 2022.

arXiv:2209.08372 [pdf, other]

CodeQueries: A Dataset of Semantic Queries over Code

Authors: Surya Prakash Sahu, Madhurima Mandal, Shikhar Bharadwaj, Aditya Kanade, Petros Maniatis, Shirish Shevade

Abstract: Developers often have questions about semantic aspects of code they are working on, e.g., "Is there a class whose parent classes declare a conflicting attribute?". Answering them requires understanding code semantics such as attributes and inheritance relation of classes. An answer to such a question should identify code spans constituting the answer (e.g., the declaration of the subclass) as well as supporting facts (e.g., the definitions of the conflicting attributes). The existing work on question-answering over code has considered yes/no questions or method-level context. We contribute a labeled dataset, called CodeQueries, of semantic queries over Python code. Compared to the existing datasets, in CodeQueries, the queries are about code semantics, the context is file level and the answers are code spans. We curate the dataset based on queries supported by a widely-used static analysis tool, CodeQL, and include both positive and negative examples, and queries requiring single-hop and multi-hop reasoning. To assess the value of our dataset, we evaluate baseline neural approaches. We study a large language model (GPT3.5-Turbo) in zero-shot and few-shot settings on a subset of CodeQueries. We also evaluate a BERT style model (CuBERT) with fine-tuning. We find that these models achieve limited success on CodeQueries. CodeQueries is thus a challenging dataset for testing the ability of neural models to understand code semantics in the extractive question-answering setting.

Submitted 14 July, 2023; v1 submitted 17 September, 2022; originally announced September 2022.

arXiv:2209.02873 [pdf, ps, other]

A novel difference equation approach for the stability and robustness of compact schemes for variable coefficient PDEs

Authors: Anindya Goswami, Kuldip Singh Patel, Pradeep Kumar Sahu

Abstract: Fourth-order accurate compact schemes for variable coefficient convection diffusion equations are considered. A sufficient condition for the stability of the fully discrete problem is derived using a difference equation based approach. The constant coefficient problems are considered as a special case, and the unconditional stability of compact schemes for such cases is proved theoretically. The condition number of the amplification matrix is also analyzed, and an estimate for the same is derived. Examples are provided to support the assumption taken to assure stability.

Submitted 28 January, 2024; v1 submitted 6 September, 2022; originally announced September 2022.

arXiv:2207.08872 [pdf]

doi 10.1103/PhysRevMaterials.6.074206

Room temperature spin-orbit torque efficiency in sputtered low-temperature superconductor delta-TaN

Authors: Przemyslaw Wojciech Swatek, Xudong Hang, Yihong Fan, Wei Jiang, Hwanhui Yun, Deyuan Lyu, Delin Zhang, Thomas J. Peterson, Protyush Sahu, Onri Jay Benally, Zach Cresswell, Jinming Liu, Rabindra Pahari, Daniel Kukla, Tony Low, K. Andre Mkhoyan, Jian-Ping Wang

Abstract: In the course of searching for promising topological materials for applications in future topological electronics, we evaluated spin-orbit torques (SOTs) in high-quality sputtered $δ-$TaN/Co20Fe60B20 devices through spin-torque ferromagnetic resonance (ST-FMR) and spin pumping measurements. From the ST-FMR characterization we observed a significant linewidth modulation in the magnetic Co20Fe60B20 layer attributed to the charge-to-spin conversion generated from the $δ-$TaN layer. Remarkably, the spin-torque efficiency determined from ST-FMR and spin pumping measurements is as large as $Θ =$ 0.034 and 0.031, respectively. These values are over two times larger than for $α-$Ta, but almost five times lower than for $β-$Ta, which can be attributed to the low room temperature electrical resistivity $\sim 74μΩ$ cm in $δ-$TaN. A large spin diffusion length of at least $\sim8$ nm is estimated, which is comparable to the spin diffusion length in pure Ta. Comprehensive experimental analysis, together with density functional theory calculations, indicates that the origin of the pronounced SOT effect in $δ-$TaN can be mostly related to a significant contribution from the Berry curvature associated with the presence of a topologically nontrivial electronic band structure in the vicinity of the Fermi level (EF).
Through additional detailed theoretical analysis, we also found that an isostructural allotrope of the superconducting $δ-$TaN phase, the simple hexagonal structure, $θ-$TaN, has larger Berry curvature, and that, together with expected reasonable charge conductivity, it can also be a promising candidate for exploring a generation of spin-orbit torque magnetic random access memory as cheap, temperature stable, and highly efficient spin current sources.

Submitted 29 July, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

Comments: 28 pages, 7 figures

Journal ref: Phys. Rev. Materials 6, 074206 (2022)

arXiv:2206.14913 [pdf, other]

GPTs at Factify 2022: Prompt Aided Fact-Verification

Authors: Pawan Kumar Sahu, Saksham Aggarwal, Taneesh Gupta, Gyanendra Das

Abstract: One of the most pressing societal issues is the fight against false news. False claims, as difficult as they are to expose, cause a lot of damage. To tackle the problem, fact verification becomes crucial and thus has been a topic of interest among diverse research communities. Using only the textual form of data, we propose our solution to the problem and achieve competitive results with other approaches. We present our solution based on two approaches: a PLM (pre-trained language model) based method and a Prompt based method. The PLM-based approach uses traditional supervised learning, where the model is trained to take 'x' as input and output prediction 'y' as P(y|x). Prompt-based learning, in contrast, reflects the idea of designing the input to fit the model such that the original objective may be re-framed as a problem of (masked) language modeling. We may further stimulate the rich knowledge provided by PLMs to better serve downstream tasks by employing extra prompts to fine-tune PLMs. Our experiments showed that the proposed method performs better than just fine-tuning PLMs. We achieved an F1 score of 0.6946 on the FACTIFY dataset and a 7th position on the competition leader-board.

Submitted 29 June, 2022; originally announced June 2022.

Comments: Accepted in AAAI'22: First Workshop on Multimodal Fact-Checking and Hate Speech Detection, February 22 - March 1, 2022, Vancouver, BC, Canada

arXiv:2206.09265 [pdf, ps, other]

SAViR-T: Spatially Attentive Visual Reasoning with Transformers

Authors: Pritish Sahu, Kalliopi Basioti, Vladimir Pavlovic

Abstract: We present a novel computational model, "SAViR-T", for the family of visual reasoning problems embodied in the Raven's Progressive Matrices (RPM). Our model considers explicit spatial semantics of visual elements within each image in the puzzle, encoded as spatio-visual tokens, and learns the intra-image as well as the inter-image token dependencies, highly relevant for the visual reasoning task. Token-wise relationships, modeled through a transformer-based SAViR-T architecture, extract group (row or column) driven representations by leveraging the group-rule coherence and use this as the inductive bias to extract the underlying rule representations in the top two rows (or columns) per token in the RPM. We use these relation representations to locate the correct choice image that completes the last row or column for the RPM. Extensive experiments across both synthetic RPM benchmarks, including RAVEN, I-RAVEN, RAVEN-FAIR, and PGM, and the natural image-based "V-PROM" demonstrate that SAViR-T sets a new state-of-the-art for visual reasoning, exceeding prior models' performance by a considerable margin.

Submitted 21 June, 2022; v1 submitted 18 June, 2022; originally announced June 2022.

arXiv:2204.07334 [pdf, other]

Effect of right-handed currents and dark side of the solar neutrino parameter space to Neutrinoless Double Beta Decay

Authors: Pritam Kumar Bishee, Purushottam Sahu, Sudhanwa Patra

Abstract: The Majorana nature of neutrinos will be confirmed by the observation of the rare process called neutrinoless double beta decay, i.e., the simultaneous decay of two neutrons in the nucleus of an isotope (A, Z) into two protons and two electrons without the emission of any neutrinos, i.e., $(A, Z) \to (A, Z + 2) + 2 e^-$. The non-observation of such a decay so far has been interpreted as a lower limit on the half life of the isotope under investigation, which puts severe constraints on any new physics giving rise to LNV in the electron sector. On the other hand, the standard mechanism with normal ordering and inverted ordering cannot saturate the present experimental limit, while quasi-degenerate light neutrinos are strongly disfavored by the upper limits on the sum of light neutrino masses from cosmological data sets. In this work, we show how the dark side of the solar neutrino parameter space and the effect of new physics contributions from right-handed currents can saturate the experimental limit provided by KamLAND-Zen and GERDA.

Submitted 26 April, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

Comments: 6 pages, 2 figures

Journal ref: STUDENT JOURNAL OF PHYSICS 2018 Vol. 7, No. 4 Oct-Dec. 2018

arXiv:2204.06392 [pdf, other]

doi 10.1007/JHEP11(2022)029

Flavour anomalies and dark matter assisted unification in $SO(10)$ GUT

Authors: Purushottam Sahu, Aishwarya Bhatta, Rukmani Mohanta, Shivaramakrishna Singirala, Sudhanwa Patra

Abstract: With the recent experimental hint of new physics from flavor physics anomalies, combined with the evidence from neutrino mass and dark matter, we consider a minimal extension of SM with a scalar leptoquark and a fermion triplet. The scalar leptoquark with couplings to leptons and quarks can explain lepton flavor non-universality observables $R_K$, $R_{K^{(*)}}$, $R_{D^{(*)}}$ and $R_{J/ψ}$. The neutral component of the fermion triplet provides the current abundance of dark matter in the Universe. The interesting feature of the proposal is that the minimal addition of these phenomenologically rich particles (scalar leptoquark and fermion triplet) assists in realizing the unification of the gauge couplings associated with the strong and electroweak forces of the standard model when embedded in the non-supersymmetric $SO(10)$ grand unified theory. We discuss the unification mass scale and the corresponding proton decay constraints while taking into account the GUT threshold corrections.

Submitted 25 October, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

Comments: 43 pages, 9 figures

Report number: 29 (2022)

Journal ref: Journal of High Energy Physics 2022-11-07

arXiv:2203.05437 [pdf]

IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages

Authors: Aman Kumar, Himani Shrotriya, Prachi Sahu, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Amogh Mishra, Mitesh M. Khapra, Pratyush Kumar

Abstract: Natural Language Generation (NLG) for non-English languages is hampered by the scarcity of datasets in these languages. In this paper, we present the IndicNLG Benchmark, a collection of datasets for benchmarking NLG for 11 Indic languages. We focus on five diverse tasks, namely, biography generation using Wikipedia infoboxes, news headline generation, sentence summarization, paraphrase generation, and question generation. We describe the created datasets and use them to benchmark the performance of several monolingual and multilingual baselines that leverage pre-trained sequence-to-sequence models. Our results exhibit the strong performance of multilingual language-specific pre-trained models, and the utility of models trained on our dataset for other related NLG tasks. Our dataset creation methods can be easily applied to modest-resource languages as they involve simple steps such as scraping news articles and Wikipedia infoboxes, light cleaning, and pivoting through machine translation data. To the best of our knowledge, the IndicNLG Benchmark is the first NLG benchmark for Indic languages and the most diverse multilingual NLG dataset, with approximately 8M examples across 5 tasks and 11 languages. The datasets and models are publicly available at https://ai4bharat.iitm.ac.in/indicnlg-suite.

Submitted 26 October, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

Comments: Accepted at EMNLP 2022

arXiv:2201.07441 [pdf, other]

doi 10.1142/S0217751X21502638

LHC signatures of sterile neutrinos in a minimal radiative extended seesaw framework

Authors: Sudhanwa Patra, Utkarsh Patel, Purushottam Sahu

Abstract: The presence of small neutrino masses and flavour mixings can be accounted for naturally in various models extending the standard model, particularly in the seesaw mechanism models. In this work, we present a minimally extended seesaw framework with two right-handed neutrinos, where the active neutrino masses are derived in the radiative regime. Using the framework it can be shown that within certain mass limits, the light neutrino mass term can approach a form that is similar to its form under the type-I seesaw mechanism. Apart from this, we show that the decay width of right-handed neutrinos (produced through the decay of the W boson in a particle collider) is small enough to cause a sufficiently long lifetime for the particles, thus ensuring an observable displacement in the LHC between the production and decay vertices. We comment on the fact that these displaced vertex signatures can thus serve as a means to verify the existence of these right-handed neutrinos in future experiments. Lastly, we outline possible future work on the vertex signatures of particles heavier than the W boson.

Submitted 19 January, 2022; originally announced January 2022.

Comments: arXiv admin note: text overlap with arXiv:1903.04905, arXiv:1801.03624 by other authors

Journal ref: International Journal of Modern Physics A (2021) 2150263, S0217751X21502638

arXiv:2112.13181 [pdf, other]

doi 10.1016/j.pmcj.2022.101582

DeepMTL Pro: Deep Learning Based Multiple Transmitter Localization and Power Estimation

Authors: Caitao Zhan, Mohammad Ghaderibaneh, Pranjal Sahu, Himanshu Gupta

Abstract: In this paper, we address the problem of Multiple Transmitter Localization (MTL). MTL is to determine the locations of potential multiple transmitters in a field, based on readings from a distributed set of sensors. In contrast to the widely studied single transmitter localization problem, the MTL problem has only been studied recently in a few works. MTL is of great significance in many applications wherein intruders may be present. E.g., in shared spectrum systems, detection of unauthorized transmitters and estimating their power are imperative to efficient utilization of the shared spectrum. In this paper, we present DeepMTL, a novel deep-learning approach to address the MTL problem. In particular, we frame MTL as a sequence of two steps, each of which is a computer vision problem: image-to-image translation and object detection. The first step of image-to-image translation essentially maps an input image representing sensor readings to an image representing the distribution of transmitter locations, and the second object detection step derives precise locations of transmitters from the image of transmitter distributions. For the first step, we design our learning model Sen2Peak, while for the second step, we customize a state-of-the-art object detection model YOLO-cust. Using DeepMTL as a building block, we also develop techniques to estimate transmit power of the localized transmitters.
We demonstrate the effectiveness of our approach via extensive large-scale simulations and show that our approach outperforms the previous approaches significantly (by 50% or more) in performance metrics including localization error, miss rate, and false alarm rate. Our method also incurs a very small latency. We evaluate our techniques over a small-scale area with real testbed data and the testbed results align with the simulation results.

Submitted 22 March, 2022; v1 submitted 24 December, 2021; originally announced December 2021.

Comments: 38 pages, 27 figures. This is the final revision version of a journal paper submitted to Pervasive and Mobile Computing (PMC). This is an extension of an accepted paper at IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM 2021)

arXiv:2112.04117 [pdf, ps, other]

doi 10.1016/j.nuclphysa.2021.122362

Thermal properties of hot and dense medium in interacting hadron resonance gas model

Authors: S. Sahoo, D. K. Mishra, P. K. Sahu

Abstract: The meson exchange interaction based on relativistic mean-field (RMF) theory has been introduced in the hadron resonance gas (HRG) model, called interacting HRG (iHRG) model. This model can be used to explain the experimental data both at finite temperature ($T$) with finite chemical potential ($μ_B$) and finite temperature at vanishing chemical potential. The nuclear matter equation of state can also be explained at zero temperature with finite baryon density (finite chemical potential) due to the presence of attractive and repulsive interactions between the hadrons in the iHRG model. Similarly, the lattice equation of state is well described at $μ_B$ = 0 and finite temperature by the iHRG model. In the present study, we have calculated the thermodynamical quantities as a function of temperature and chemical potential using both HRG and iHRG models. Also, we have presented the isothermal compressibility ($k_T$), specific heat ($C_V$), and speed of sound ($c_s^2$) as a function of $μ_B$, $T$, and center of mass energies. The effect of kinematic acceptance on these quantities is also presented as a function of $μ$ and $T$. Results from this study on $k_T$ are compared with results from other heavy-ion transport models and experimental data up to LHC energies.

Submitted 8 December, 2021; originally announced December 2021.

Comments: 34 pages, 16 figures, accepted for publication in Nuclear Physics A

arXiv:2111.06916 [pdf]

Offense Detection in Dravidian Languages using Code-Mixing Index based Focal Loss

Authors: Debapriya Tula, Shreyas MS, Viswanatha Reddy, Pranjal Sahu, Sumanth Doddapaneni, Prathyush Potluri, Rohan Sukumaran, Parth Patwa

Abstract: Over the past decade, we have seen exponential growth in online content fueled by social media platforms. Data generation of this scale comes with the caveat of insurmountable offensive content in it. The complexity of identifying offensive content is exacerbated by the usage of multiple modalities (image, language, etc.), code-mixed language and more. Moreover, even after careful sampling and annotation of offensive content, there will always exist a significant class imbalance between offensive and non-offensive content. In this paper, we introduce a novel Code-Mixing Index (CMI) based focal loss which circumvents two challenges for Dravidian language offense detection: (1) code-mixing in languages and (2) class imbalance. We also replace the conventional dot product-based classifier with a cosine-based classifier, which results in a boost in performance. Further, we use multilingual models that help transfer characteristics learnt across languages to work effectively with low-resource languages. It is also important to note that our model handles instances of mixed script (say usage of Latin and Dravidian-Tamil script) as well. To summarize, our model can handle offensive language detection in a low-resource, class imbalanced, multilingual and code-mixed setting.

Submitted 6 May, 2022; v1 submitted 12 November, 2021; originally announced November 2021.

Comments: Accepted for publication at SN Computer Science Journal

arXiv:2110.11899 [pdf, other]

Challenges in Procedural Multimodal Machine Comprehension: A Novel Way To Benchmark

Authors: Pritish Sahu, Karan Sikka, Ajay Divakaran

Abstract: We focus on Multimodal Machine Reading Comprehension (M3C) where a model is expected to answer questions based on a given passage (or context), and the context and the questions can be in different modalities. Previous works such as RecipeQA have proposed datasets and cloze-style tasks for evaluation. However, we identify three critical biases stemming from the question-answer generation process and memorization capabilities of large deep models. These biases make it easier for a model to overfit by relying on spurious correlations or naive data patterns. We propose a systematic framework to address these biases through three Control-Knobs that enable us to generate a test bed of datasets of progressive difficulty levels. We believe that our benchmark (referred to as Meta-RecipeQA) will provide, for the first time, a fine grained estimate of a model's generalization capabilities. We also propose a general M3C model that is used to realize several prior SOTA models and motivate a novel hierarchical transformer based reasoning network (HTRN). We perform a detailed evaluation of these models with different language and visual features on our benchmark. We observe a consistent improvement with HTRN over SOTA (~18% in Visual Cloze task and ~13% in average over all the tasks). We also observe a drop in performance across all the models when testing on RecipeQA and proposed Meta-RecipeQA (e.g. 83.6% versus 67.1% for HTRN), which shows that the proposed dataset is relatively less biased. We conclude by highlighting the impact of the control knobs with some quantitative results.

Submitted 22 October, 2021; originally announced October 2021.

arXiv:2109.13156 [pdf, ps, other]

DAReN: A Collaborative Approach Towards Reasoning And Disentangling

Authors: Pritish Sahu, Kalliopi Basioti, Vladimir Pavlovic

Abstract: Computational learning approaches to solving visual reasoning tests, such as Raven's Progressive Matrices (RPM), critically depend on the ability to identify the visual concepts used in the test (i.e., the representation) as well as the latent rules based on those concepts (i.e., the reasoning). However, learning of representation and reasoning is a challenging and ill-posed task, often approached in a stage-wise manner (first representation, then reasoning). In this work, we propose an end-to-end joint representation-reasoning learning framework, which leverages a weak form of inductive bias to improve both tasks together. Specifically, we introduce a general generative graphical model for RPMs, GM-RPM, and apply it to solve the reasoning test. We accomplish this using a novel learning framework Disentangling based Abstract Reasoning Network (DAReN) based on the principles of GM-RPM. We perform an empirical evaluation of DAReN over several benchmark datasets. DAReN shows consistent improvement over state-of-the-art (SOTA) models on both the reasoning and the disentanglement tasks. This demonstrates the strong correlation between disentangled latent representation and the ability to solve abstract visual reasoning tasks.

Submitted 29 June, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

arXiv:2109.12536 [pdf, other]

Gauge coupling unification in a minimal non-supersymmetric $E_6$ GUT

Authors: Chandini Dash, Snigdha Mishra, Sudhanwa Patra, Purushottam Sahu

Abstract: We consider a minimal renormalizable non-supersymmetric $E_6$ Grand Unified Theory using fundamental representation $27$ for fermions and scalars. The scalar with adjoint representation ${78}$ is also taken for direct breaking of $E_{6}$ to SM. The proposed model, guided by TeV-scale vector-like fermions and a scalar leptoquark, offers successful gauge unification even in the absence of any intermediate symmetry. Embedded with threshold corrections, it is shown to be compatible with the present experimental limit on proton decay lifetime. A unique feature of the model is that the GUT threshold corrections to the unification mass are controlled by superheavy gauge bosons only, thereby minimising the uncertainty of the GUT predictions. The scalar leptoquark and vector-like fermions residing in the $27$ representation can explain flavor physics anomalies like $R_{D^{(\ast)}}$ as reported by the LHCb collaboration and the muon anomalous magnetic moment reported by the recent muon $g-2$ experiment at Fermilab. The model can also predict a sub-eV scale neutrino at one-loop level via exchange of $W$ and $Z$ gauge bosons through MRIS mechanism.

Submitted 6 February, 2024; v1 submitted 26 September, 2021; originally announced September 2021.

Comments: 5 pages, 4 tables, 2 figures

arXiv:2109.12240 [pdf, other]

Logical Credal Networks

Authors: Haifeng Qian, Radu Marinescu, Alexander Gray, Debarun Bhattacharjya, Francisco Barahona, Tian Gao, Ryan Riegel, Pravinda Sahu

Abstract: This paper introduces Logical Credal Networks, an expressive probabilistic logic that generalizes many prior models that combine logic and probability. Given imprecise information represented by probability bounds and conditional probability bounds of logic formulas, this logic specifies a set of probability distributions over all interpretations. On the one hand, our approach allows propositional and first-order logic formulas with few restrictions, e.g., without requiring acyclicity. On the other hand, it has a Markov condition similar to Bayesian networks and Markov random fields that is critical in real-world applications. Having both these properties makes this logic unique, and we investigate its performance on maximum a posteriori inference tasks, including solving Mastermind games with uncertainty and detecting credit card fraud. The results show that the proposed method outperforms existing approaches, and its advantage lies in aggregating multiple sources of imprecise information.

Submitted 24 September, 2021; originally announced September 2021.

arXiv:2106.15970 [pdf, other]

doi 10.1016/j.nima.2021.165596

Measurement of ion backflow fraction in GEM detectors

Authors: A. Tripathy, P. K Sahu, S. Swain, S. Sahu

Abstract: A systematic study is performed to measure the ion backflow fraction of the GEM detectors. The effects of different voltage configurations and Ar/CO$_2$ gas mixtures, in ratios of 70:30, 80:20 and 90:10, on positive ion fraction are investigated in detail. Moreover, a comparative study is performed between single and quadruple GEM detectors. The ion current with detector effective gain is measured with various field configurations and with three proportions of gas mixtures. The ion backflow fraction for the GEM is substantially reduced with the lower drift field. A minimum ion backflow fraction of 18% is achieved in the single GEM detector with the Ar/CO$_2$ 80:20 gas mixture, whereas minimum ion backflow fractions of 3.5%, 3.0%, and 3.8% are obtained for a drift field of 0.1 kV/cm with Ar/CO$_2$ 70:30, 80:20 and 90:10 gas mixtures, respectively, for the quadruple GEM detector. Similar values of effective gain and ion backflow fraction have been found by calculating the current from the pulse height spectrum method, obtained in the Multi Channel Analyser.

Submitted 30 June, 2021; originally announced June 2021.

Comments: 24 pages, To be published in Nuclear Instruments and Methods in Physics Research A

arXiv:2106.04653 [pdf, other]

Comprehension Based Question Answering using Bloom's Taxonomy

Authors: Pritish Sahu, Michael Cogswell, Sara Rutherford-Quach, Ajay Divakaran

Abstract: Current pre-trained language models have lots of knowledge, but a more limited ability to use that knowledge. Bloom's Taxonomy helps educators teach children how to use knowledge by categorizing comprehension skills, so we use it to analyze and improve the comprehension skills of large pre-trained language models. Our experiments focus on zero-shot question answering, using the taxonomy to provide proximal context that helps the model answer questions by being relevant to those questions. We show targeting context in this manner improves performance across 4 popular common sense question answer datasets.

Submitted 8 June, 2021; originally announced June 2021.

arXiv:2104.10139 [pdf, other]

Towards Solving Multimodal Comprehension

Authors: Pritish Sahu, Karan Sikka, Ajay Divakaran

Abstract: This paper targets the problem of procedural multimodal machine comprehension (M3C). This task requires an AI to comprehend given steps of multimodal instructions and then answer questions. Compared to vanilla machine comprehension tasks where an AI is required only to understand a textual input, procedural M3C is more challenging as the AI needs to comprehend both the temporal and causal factors along with multimodal inputs. Recently Yagcioglu et al. [35] introduced the RecipeQA dataset to evaluate M3C. Our first contribution is the introduction of two new M3C datasets, WoodworkQA and DecorationQA, with 16K and 10K instructional procedures, respectively. We then evaluate M3C using a textual cloze style question-answering task and highlight an inherent bias in the question answer generation method from [35] that enables a naive baseline to cheat by learning from only answer choices. This naive baseline performs similarly to a popular method used in question answering, Impatient Reader [6], that uses attention over both the context and the query. We hypothesized that this naturally occurring bias present in the dataset affects even the best performing model. We verify our proposed hypothesis and propose an algorithm capable of modifying the given dataset to remove the bias elements. Finally, we report our performance on the debiased dataset with several strong baselines. We observe that the performance of all methods falls by a margin of 8% - 16% after correcting for the bias. We hope these datasets and the analysis will provide valuable benchmarks and encourage further research in this area.

Submitted 20 April, 2021; originally announced April 2021.

arXiv:2102.05397 [pdf, other]

doi 10.1088/1367-2630/ac23f1

Geometric signatures of tissue surface tension in a three-dimensional model of confluent tissue

Authors: Preeti Sahu, J. M. Schwarz, M. Lisa Manning

Abstract: In dense biological tissues, cell types performing different roles remain segregated by maintaining sharp interfaces. To better understand the mechanisms for such sharp compartmentalization, we study the effect of an imposed heterotypic tension at the interface between two distinct cell types in a fully 3D model for confluent tissues. We find that cells rapidly sort and self-organize to generate a tissue-scale interface between cell types, and cells adjacent to this interface exhibit signature geometric features including nematic-like ordering, bimodal facet areas, and registration, or alignment, of cell centers on either side of the two-tissue interface. The magnitude of these features scales directly with the magnitude of imposed tension, suggesting that biologists can estimate the magnitude of tissue surface tension between two tissue types simply by segmenting a 3D tissue. To uncover the underlying physical mechanisms driving these geometric features, we develop two minimal, ordered models using two different underlying lattices that identify an energetic competition between bulk cell shapes and tissue interface area. When the interface area dominates, changes to neighbor topology are costly and occur less frequently, which generates the observed geometric features.

Submitted 10 February, 2021; originally announced February 2021.

Comments: 16 pages, 14 figures

arXiv:2012.13556 [pdf]

Graphene oxide based synaptic memristor device for neuromorphic computing

Authors: Dwipak Prasad Sahu, Prabana Jetty, S. Narayana Jammalamadaka

Abstract: Brain-inspired neuromorphic computing, which consists of neurons and synapses with an ability to perform complex information processing, has unfolded a new paradigm of computing to overcome the von Neumann bottleneck. Electronic synaptic memristor devices which can compete with the biological synapses are indeed significant for neuromorphic computing. In this work, we demonstrate our efforts to develop and realize the graphene oxide (GO) based memristor device as a synaptic device, which mimics a biological synapse. Indeed, this device exhibits the essential synaptic learning behavior including analog memory characteristics, potentiation and depression. Furthermore, the spike-timing-dependent-plasticity learning rule is mimicked by engineering the pre- and post-synaptic spikes. In addition, non-volatile properties such as endurance, retentivity, and multilevel switching of the device are explored. These results suggest that the Ag/GO/FTO memristor device would indeed be a potential candidate for future neuromorphic computing applications. Keywords: RRAM, Graphene oxide, neuromorphic computing, synaptic device, potentiation, depression

Submitted 25 December, 2020; originally announced December 2020.

Comments: Nanotechnology (accepted) (IOP publishing)

arXiv:2011.11610 [pdf, other]

Transfer Learning for Oral Cancer Detection using Microscopic Images

Authors: Rutwik Palaskar, Renu Vyas, Vilas Khedekar, Sangeeta Palaskar, Pranjal Sahu

Abstract: Oral cancer has a survival rate of more than 83% if detected in its early stages; however, only 29% of cases are currently detected early. Deep learning techniques can detect patterns of oral cancer cells and can aid in its early detection. In this work, we present the first results of neural networks for oral cancer detection using microscopic images. We compare numerous state-of-the-art models via a transfer learning approach and collect and release an augmented dataset of high-quality microscopic images of oral cancer. We present a comprehensive study of different models and report their performance on this type of data. Overall, we obtain a 10-15% absolute improvement with transfer learning methods compared to a simple Convolutional Neural Network baseline. Ablation studies show the added benefit of data augmentation techniques with finetuning for this task.

Submitted 9 April, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

arXiv:2011.10889 [pdf, other]

Zero-Shot Learning with Knowledge Enhanced Visual Semantic Embeddings

Authors: Karan Sikka, Jihua Huang, Andrew Silberfarb, Prateeth Nayak, Luke Rohrer, Pritish Sahu, John Byrnes, Ajay Divakaran, Richard Rohwer

Abstract: We improve zero-shot learning (ZSL) by incorporating common-sense knowledge in DNNs. We propose Common-Sense based Neuro-Symbolic Loss (CSNL) that formulates prior knowledge as novel neuro-symbolic loss functions that regularize visual-semantic embedding. CSNL forces visual features in the VSE to obey common-sense rules relating to hypernyms and attributes. We introduce two key novelties for improved learning: (1) enforcement of rules for a group instead of a single concept to take into account class-wise relationships, and (2) confidence margins inside logical operators that enable implicit curriculum learning and prevent premature overfitting. We evaluate the advantages of incorporating each knowledge source and show consistent gains over prior state-of-art methods in both conventional and generalized ZSL e.g. 11.5%, 5.5%, and 11.6% improvements on AWA2, CUB, and Kinetics respectively.

Submitted 21 November, 2020; originally announced November 2020.

arXiv:2008.07330 [pdf, other]

Optimal Posteriors for Chi-squared Divergence based PAC-Bayesian Bounds and Comparison with KL-divergence based Optimal Posteriors and Cross-Validation Procedure

Authors: Puja Sahu, Nandyala Hemachandra

Abstract: We investigate optimal posteriors for recently introduced \cite{begin2016pac} chi-squared divergence based PAC-Bayesian bounds in terms of nature of their distribution, scalability of computations, and test set performance. For a finite classifier set, we deduce bounds for three distance functions: KL-divergence, linear and squared distances. Optimal posterior weights are proportional to deviations of empirical risks, usually with subset support. For uniform prior, it is sufficient to search among posteriors on classifier subsets ordered by these risks. We show the bound minimization for linear distance as a convex program and obtain a closed-form expression for its optimal posterior, whereas that for squared distance is a quasi-convex program under a specific condition, and the one for KL-divergence is non-convex optimization (a difference of convex functions). To compute such optimal posteriors, we derive fast converging fixed point (FP) equations. We apply these approaches to a finite set of SVM regularization parameter values to yield stochastic SVMs with tight bounds. We perform a comprehensive performance comparison between our optimal posteriors and known KL-divergence based posteriors on a variety of UCI datasets with varying ranges and variances in risk values, etc. Chi-squared divergence based posteriors have weaker bounds and worse test errors, hinting at an underlying regularization by KL-divergence based posteriors. Our study highlights the impact of divergence function on the performance of PAC-Bayesian classifiers. We compare our stochastic classifiers with a cross-validation based deterministic classifier. The latter has better test errors, but ours is more sample robust, has quantifiable generalization guarantees, and is computationally much faster.

Submitted 13 August, 2020; originally announced August 2020.

Comments: arXiv admin note: text overlap with arXiv:1912.06803

arXiv:2007.11818 [pdf, other]

Speculative Interference Attacks: Breaking Invisible Speculation Schemes

Authors: Mohammad Behnia, Prateek Sahu, Riccardo Paccagnella, Jiyong Yu, Zirui Zhao, Xiang Zou, Thomas Unterluggauer, Josep Torrellas, Carlos Rozas, Adam Morrison, Frank Mckeen, Fangfei Liu, Ron Gabor, Christopher W. Fletcher, Abhishek Basak, Alaa Alameldeen

Abstract: Recent security vulnerabilities that target speculative execution (e.g., Spectre) present a significant challenge for processor design. The highly publicized vulnerability uses speculative execution to learn victim secrets by changing cache state. As a result, recent computer architecture research has focused on invisible speculation mechanisms that attempt to block changes in cache state due to speculative execution. Prior work has shown significant success in preventing Spectre and other vulnerabilities at modest performance costs. In this paper, we introduce speculative interference attacks, which show that prior invisible speculation mechanisms do not fully block these speculation-based attacks. We make two key observations. First, misspeculated younger instructions can change the timing of older, bound-to-retire instructions, including memory operations. Second, changing the timing of a memory operation can change the order of that memory operation relative to other memory operations, resulting in persistent changes to the cache state. Using these observations, we demonstrate (among other attack variants) that secret information accessed by mis-speculated instructions can change the order of bound-to-retire loads. Load timing changes can therefore leave secret-dependent changes in the cache, even in the presence of invisible speculation mechanisms. We show that this problem is not easy to fix: Speculative interference converts timing changes to persistent cache-state changes, and timing is typically ignored by many cache-based defenses. We develop a framework to understand the attack and demonstrate concrete proof-of-concept attacks against invisible speculation mechanisms. We provide security definitions sufficient to block speculative interference attacks; describe a simple defense mechanism with a high performance cost; and discuss how future research can improve its performance.

Submitted 23 April, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

Comments: Updated CR Version

arXiv:2007.11185 [pdf, other]

Study of strange non-strange hadron ratios in pp and p-Pb collisions at LHC energies

Authors: Sarita Sahoo, Rama Chandra Baral, Pradip Kumar Sahu, Mina Ketan Parida

Abstract: It has been observed that the yields of strange and multi-strange hadrons relative to pions increase significantly with the event charged-particle multiplicity. We notice from experimental data that yield ratios between non-strange hadrons, like p/$π$, or hadrons of the same strange content, like $Λ$/K$_s^0$, show similar enhancement. We have studied this behavior within the ambit of a parton model (EPOS3) and A Multi-Phase Transport (AMPT) model in pp and p-Pb collisions at LHC energies. We investigate model predictions of yields and yield ratios of different identified hadron productions as a function of charged-particle multiplicity and compare them with published ALICE results. The string melting versions of AMPT and EPOS are found to establish enhancements in the particle yield ratios.

Submitted 21 July, 2020; originally announced July 2020.

Comments: 5 pages, 3 figures

arXiv:2005.07635 [pdf, other]

Magnetic Skyrmions in Condensed Matter Physics

Authors: Sayantika Bhowal, S. Satpathy, Pratik Sahu

Abstract: Skyrmions were originally introduced in Particle Physics as a possible mechanism to explain the stability of particles. Lately the concept has been applied in Condensed Matter Physics to describe the newly discovered topologically protected magnetic configurations called the magnetic Skyrmions. This elementary review introduces the concept at a level suitable for beginning students of Physics.

Submitted 15 May, 2020; originally announced May 2020.

Comments: 16 pages, 6 figures

Showing 1–50 of 220 results for author: Sahu, P