-
Mitigating Hallucination in Fictional Character Role-Play
Authors:
Nafis Sadeq,
Zhouhang Xie,
Byungkyu Kang,
Prarit Lamba,
Xiang Gao,
Julian McAuley
Abstract:
Role-playing has wide-ranging applications in customer support, embodied agents, computational social science, etc. The influence of parametric world knowledge of large language models (LLMs) often causes role-playing characters to act out of character and hallucinate about things outside the scope of their knowledge. In this work, we focus on the evaluation and mitigation of hallucination in fictional character role-play. We introduce a dataset with more than 2,000 characters and 72,000 interviews, including 18,000 adversarial questions. We propose RoleFact, a role-playing method that mitigates hallucination by modulating the influence of parametric knowledge using a pre-calibrated confidence threshold. Experiments show that the proposed method improves the factual precision of generated responses by 18% for adversarial questions with a 44% reduction in temporal hallucination for time-sensitive interviews. The code and the dataset will be available at https://github.com/NafisSadeq/rolefact.git.
Submitted 24 June, 2024;
originally announced June 2024.
-
LLM+Reasoning+Planning for supporting incomplete user queries in presence of APIs
Authors:
Sudhir Agarwal,
Anu Sreepathy,
David H. Alonso,
Prarit Lamba
Abstract:
Recent availability of Large Language Models (LLMs) has led to the development of numerous LLM-based approaches aimed at providing natural language interfaces for various end-user tasks. These end-user tasks in turn can typically be accomplished by orchestrating a given set of APIs. In practice, natural language task requests (user queries) are often incomplete, i.e., they may not contain all the information required by the APIs. While LLMs excel at natural language processing (NLP) tasks, they frequently hallucinate on missing information or struggle with orchestrating the APIs. The key idea behind our proposed approach is to leverage logical reasoning and classical AI planning along with an LLM for accurately answering user queries, including identification and gathering of any missing information in these queries. Our approach uses an LLM and an ASP (Answer Set Programming) solver to translate a user query to a representation in Planning Domain Definition Language (PDDL) via an intermediate representation in ASP. We introduce a special API "get_info_api" for gathering missing information. We model all the APIs as PDDL actions in a way that supports dataflow between the APIs. Our approach then uses a classical AI planner to generate an orchestration of API calls (including calls to get_info_api) to answer the user query. Our evaluation results show that our approach significantly outperforms a pure LLM-based approach by achieving over 95% success rate in most cases on a dataset containing complete and incomplete single-goal and multi-goal queries, where the multi-goal queries may or may not require dataflow among the APIs.
Submitted 20 May, 2024;
originally announced May 2024.
-
Vector detection of AC magnetic fields by Nitrogen-Vacancy centers of single orientation in diamond
Authors:
Pooja Lamba,
Akshat Rana,
Sougata Halder,
Siddharth Dhomkar,
Dieter Suter,
Rama K. Kamineni
Abstract:
Nitrogen-Vacancy (NV) centers in diamond have useful properties for detecting both AC and DC magnetic fields with high sensitivity at nano-scale resolution. Vector detection of AC magnetic fields can be achieved by using NV centers having three different orientations. Here, we propose a method to achieve this by using NV centers of single orientation. In this method, a static magnetic field is applied perpendicular to the NV axis, leading to strong mixing of the $m_{s}=-1$ and $1$ electron spin states. As a result, all three electron spin transitions of the triplet ground state have non-zero dipole moments, with each transition coupling to a single component of the magnetic field. This can be used to measure both strength and orientation of the applied AC field. To validate the technique, we perform a proof of principle experiment using a subset of ensemble NV centers in diamond, all having the same orientation.
Submitted 18 August, 2023;
originally announced August 2023.
-
Supersymmetry : A decade after Higgs discovery
Authors:
V. Suryanarayana Mummidi,
Priyanka Lamba,
Sudhir K. Vempati
Abstract:
Supersymmetric extensions of the Standard Model have been in vogue for over half a century. They have many interesting theoretical properties like calculability, absence of quadratic divergences, and phenomenologically impactful features like gauge coupling unification, dark matter candidates, signatures at present and future colliders, etc. A defining feature of these models is the calculability of Higgs mass in terms of a few parameters. The discovery of a Higgs particle with a mass of around 125 GeV thus has significant implications. The null results for the searches of superpartners at the LHC have also put further constraints. Taken together with direct detection limits on WIMP (Weakly Interacting Massive Particle) dark matter, it appears that TeV scale supersymmetry is not realised in Nature and the theoretical expectations have reached a turning point. The present onslaught from the experiments suggests that supersymmetric models need a more complex particle structure, Lagrangian and breaking patterns to be a natural solution to the hierarchy problem. We review existing models and discuss their feasibility in the current and future experimental programs.
Submitted 9 June, 2023;
originally announced June 2023.
-
Unsupervised Improvement of Factual Knowledge in Language Models
Authors:
Nafis Sadeq,
Byungkyu Kang,
Prarit Lamba,
Julian McAuley
Abstract:
Masked language modeling (MLM) plays a key role in pretraining large language models. But the MLM objective is often dominated by high-frequency words that are sub-optimal for learning factual knowledge. In this work, we propose an approach for influencing MLM pretraining in a way that can improve language model performance on a variety of knowledge-intensive tasks. We force the language model to prioritize informative words in a fully unsupervised way. Experiments demonstrate that the proposed approach can significantly improve the performance of pretrained language models on tasks such as factual recall, question answering, sentiment analysis, and natural language inference in a closed-book setting.
Submitted 4 April, 2023;
originally announced April 2023.
-
Quantum information and CP measurement in $H \to τ^+ τ^-$ at future lepton colliders
Authors:
Mohammad Mahdi Altakach,
Priyanka Lamba,
Fabio Maltoni,
Kentarou Mawatari,
Kazuki Sakurai
Abstract:
We introduce a methodology and investigate the feasibility of measuring quantum properties of tau lepton pairs in the $H \to τ^+ τ^-$ decay at future lepton colliders. In particular, observation of entanglement, steerability and violation of Bell inequalities are examined for the ILC and FCC-ee. We find that detecting quantum correlation crucially relies on precise reconstruction of the tau lepton rest frame and a simple kinematics reconstruction does not suffice due to the finite energy resolution of the colliding beams and detectors. To correct for energy mismeasurements, a log-likelihood method is developed that incorporates the information of impact parameters of tau lepton decays. We demonstrate that an accurate measurement of quantum properties is possible with this method. As a by-product, we show that a novel model-independent test of CP violation can be performed and the CP-phase of $H ττ$ interaction can be constrained with an accuracy comparable to dedicated analyses, i.e., up to $7.9^{\circ}$ and $5.4^{\circ}$ at ILC and FCC-ee, respectively.
Submitted 12 May, 2023; v1 submitted 18 November, 2022;
originally announced November 2022.
-
Aspects of Heavy Supersymmetry
Authors:
Priyanka Lamba
Abstract:
The discovery of the Higgs boson raises the question of its "lightness" in mass when the Standard Model is considered as an effective quantum field theory. Supersymmetry is the only currently known symmetry which can protect the Higgs mass while still treating the Higgs as an elementary quantum field. However, in view of the null experimental confirmation from both direct (LHC) and indirect searches (flavour, dark matter) of the supersymmetric particles and the constraints from the Higgs mass, several possible heavy spectra for supersymmetric partners have been proposed.
In the present thesis, we study the possible origins of these heavy spectra by considering many sequestered spurion fields as carriers of supersymmetry breaking. We show that a "natural" supersymmetric spectrum is possible in these models and in particular a "coherent" scenario leads to low fine tuning and light Higgsino mixed dark matter (à la focus point region) even with a heavy supersymmetric spectrum. We then consider this model within the context of the string landscape, where we use the Bousso-Polchinski framework of four-form fluxes to model the spurions. We show that the flavour violating parameters of the supersymmetric spectrum can be "diluted" away in the presence of a large number of fluxes.
One of the possible supersymmetric spectra which emerges by considering all the data is the generation split (Gensplit) spectrum, which allows for flavour violation to be present for the first two generations, which are heavy. We study this spectrum within the context of supersymmetric SU(5) and proton decay. The results are quite interesting and dependent on the proton decay mode considered. The strongest bound, from $p \to K ν$, is now modified depending on the flavour of the neutrino and brings the parameter space within the realms of the upcoming experiments JUNO, DUNE and Hyper-K. These results will be discussed.
Submitted 19 July, 2022;
originally announced July 2022.
-
Restricting $q^2 l^2$ operators from $π^0 \rightarrow μe$
Authors:
Mathew Thomas Arun,
Priyanka Lamba,
Sudhir K. Vempati
Abstract:
In this paper we consider semileptonic lepton flavor violating operators of the type $q^2l^2$ in low energy effective theory (LEFT). At the chiral scale, we match these operators to chiral perturbation theory ($χ$PT) and place constraints from the process $π^0\rightarrow μ^+ \,\, e^-$. These bounds are shown to depend on the chiral nature of the operators. The scalar operators are significantly more constrained compared to the vector operators. We then compare the limits from $μ\to e$ conversion in nuclei and show that the limits on scalar operators are within an order of magnitude of the corresponding limits from $μ\to e$ conversion in Ti. On the other hand, the limits on vector operators are much weaker. Towards the end, we evolve the LEFT operators to the W-boson mass scale using RGE, and match them to the Standard Model effective field theory (SMEFT) operators. We then derive the constraints on the parameter space of Leptoquark models that could generate these SMEFT operators at tree level.
Submitted 14 April, 2022;
originally announced April 2022.
-
Discovery prospects for long-lived multiply charged particles at the LHC
Authors:
Mohammad Mahdi Altakach,
Priyanka Lamba,
Rafał Masełek,
Vasiliki A. Mitsou,
Kazuki Sakurai
Abstract:
In this work, we aim to provide a comprehensive and largely model independent investigation of the prospects to detect long-lived multiply charged particles at the LHC. We consider particles with spin 0 and $\frac{1}{2}$, with electric charges in the range $1 \le |Q/e| \le 8$, which are singlet or triplet under $SU(3)_C$. Such particles might be produced as particle-antiparticle pairs and propagate through detectors, or form a positronium(quarkonium)-like bound state. We consider both possibilities and estimate lower mass bounds on new particles that can be provided by the ATLAS, CMS and MoEDAL experiments at the end of the Run 3 and HL-LHC data taking periods. We find that the sensitivities of ATLAS and CMS are generally stronger than those of MoEDAL at Run 3, while they may be competitive at HL-LHC for $3 \lesssim |Q/e| \lesssim 7$ for all types of long-lived particles we consider.
Submitted 8 October, 2022; v1 submitted 7 April, 2022;
originally announced April 2022.
-
Diluting SUSY flavour problem on the Landscape
Authors:
Emilian Dudas,
Priyanka Lamba,
Sudhir K. Vempati
Abstract:
We consider an explicit effective field theory example based on the Bousso-Polchinski framework with a large number N of hidden sectors contributing to supersymmetry breaking. Each contribution comes from four form quantized fluxes, multiplied by random couplings. The soft terms in the observable sector in this case become random variables, with mean values and standard deviations which are computable. We show that this setup naturally leads to a solution of the flavor problem in low-energy supersymmetry if N is sufficiently large. We investigate the consequences for flavor violating processes at low-energy and for dark matter.
Submitted 30 December, 2019;
originally announced December 2019.