-
InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States
Authors:
Mohammad Beigi,
Ying Shen,
Runing Yang,
Zihao Lin,
Qifan Wang,
Ankith Mohan,
Jianfeng He,
Ming **,
Chang-Tien Lu,
Lifu Huang
Abstract:
Despite their vast capabilities, Large Language Models (LLMs) often struggle to generate reliable outputs, frequently producing high-confidence inaccuracies known as hallucinations. To address this challenge, our research introduces InternalInspector, a novel framework designed to enhance confidence estimation in LLMs by leveraging contrastive learning on internal states, including the attention states, feed-forward states, and activation states of all layers. Unlike existing methods that primarily focus on the final activation state, InternalInspector conducts a comprehensive analysis across all internal states of every layer to accurately identify both correct and incorrect prediction processes. Benchmarked against existing confidence estimation methods across various natural language understanding and generation tasks, including factual question answering, commonsense reasoning, and reading comprehension, InternalInspector achieves significantly higher accuracy in aligning the estimated confidence scores with the correctness of the LLM's predictions, as well as lower calibration error. Furthermore, InternalInspector excels at HaluEval, a hallucination detection benchmark, outperforming other internal-based confidence estimation methods on this task.
Submitted 17 June, 2024;
originally announced June 2024.
-
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models
Authors:
Zihao Lin,
Mohammad Beigi,
Hongxuan Li,
Yufan Zhou,
Yuxiang Zhang,
Qifan Wang,
Wenpeng Yin,
Lifu Huang
Abstract:
Memory Editing (ME) has emerged as an efficient method to modify erroneous facts or inject new facts into Large Language Models (LLMs). Two mainstream ME methods exist: parameter-modifying ME and parameter-preserving ME (integrating extra modules while preserving the original parameters). Regrettably, previous studies on ME evaluation have two critical limitations: (i) they evaluate LLMs with a single edit only, neglecting the need for continuous editing, and (ii) they focus solely on basic factual triples, overlooking broader LLM capabilities such as logical reasoning and reading comprehension. This study addresses these limitations with three contributions: (i) We explore how ME affects a wide range of fundamental capabilities of LLMs under sequential editing. Experimental results reveal an intriguing phenomenon: most parameter-modifying ME methods consistently degrade performance across all tasks after a few sequential edits. In contrast, parameter-preserving ME methods effectively maintain LLMs' fundamental capabilities but struggle to accurately recall edited knowledge presented in a different format. (ii) We extend our evaluation to different editing settings, such as the layers to edit, model size, instruction tuning, etc. Experimental findings indicate several strategies that can potentially mitigate the adverse effects of ME. (iii) We further explain why parameter-modifying ME damages LLMs along three dimensions: parameter changes after editing, language modeling capability, and in-context learning capability. Our in-depth study advocates more careful use of ME in real-world scenarios.
Submitted 16 February, 2024;
originally announced February 2024.
-
Scale-Invariant Local Descriptor for Event Recognition in 1D Sensor Signals
Authors:
Jierui Xie,
Mandis S. Beigi
Abstract:
In this paper, we introduce a shape-based, time-scale invariant feature descriptor for 1-D sensor signals. The time-scale invariance of the feature allows us to use a feature from one training event to describe events of the same semantic class that may take place over varying time scales, such as walking slowly and walking quickly. It therefore requires less training data. The descriptor takes advantage of invariant location detection from scale-space theory and employs a high-level shape encoding scheme to capture invariant local features of events. Based on this descriptor, a scale-invariant classifier with "R" metric (SIC-R) is designed to recognize multi-scale events of human activities. The R metric combines the number of keypoint matches in scale space with the Dynamic Time Warping score. SIC-R is tested on various types of 1-D sensor data from passive infrared, accelerometer, and seismic sensors, with more than 90% classification accuracy.
Submitted 27 May, 2011;
originally announced May 2011.