-
generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation
Authors:
Thilo Spinner,
Rebecca Kehlbeck,
Rita Sevastjanova,
Tobias Stähle,
Daniel A. Keim,
Oliver Deussen,
Mennatallah El-Assady
Abstract:
Large language models (LLMs) are widely deployed in various downstream tasks, e.g., auto-completion, aided writing, or chat-based text generation. However, the considered output candidates of the underlying search algorithm are under-explored and under-explained. We tackle this shortcoming by proposing a tree-in-the-loop approach, where a visual representation of the beam search tree is the centra…
▽ More
Large language models (LLMs) are widely deployed in various downstream tasks, e.g., auto-completion, aided writing, or chat-based text generation. However, the considered output candidates of the underlying search algorithm are under-explored and under-explained. We tackle this shortcoming by proposing a tree-in-the-loop approach, where a visual representation of the beam search tree is the central component for analyzing, explaining, and adapting the generated outputs. To support these tasks, we present generAItor, a visual analytics technique, augmenting the central beam search tree with various task-specific widgets, providing targeted visualizations and interaction possibilities. Our approach allows interactions on multiple levels and offers an iterative pipeline that encompasses generating, exploring, and comparing output candidates, as well as fine-tuning the model based on adapted data. Our case study shows that our tool generates new insights in gender bias analysis beyond state-of-the-art template-based methods. Additionally, we demonstrate the applicability of our approach in a qualitative user study. Finally, we quantitatively evaluate the adaptability of the model to few samples, as occurring in text-generation use cases.
△ Less
Submitted 12 March, 2024;
originally announced March 2024.
-
Revealing the Unwritten: Visual Investigation of Beam Search Trees to Address Language Model Prompting Challenges
Authors:
Thilo Spinner,
Rebecca Kehlbeck,
Rita Sevastjanova,
Tobias Stähle,
Daniel A. Keim,
Oliver Deussen,
Andreas Spitz,
Mennatallah El-Assady
Abstract:
The growing popularity of generative language models has amplified interest in interactive methods to guide model outputs. Prompt refinement is considered one of the most effective means to influence output among these methods. We identify several challenges associated with prompting large language models, categorized into data- and model-specific, linguistic, and socio-linguistic challenges. A co…
▽ More
The growing popularity of generative language models has amplified interest in interactive methods to guide model outputs. Prompt refinement is considered one of the most effective means to influence output among these methods. We identify several challenges associated with prompting large language models, categorized into data- and model-specific, linguistic, and socio-linguistic challenges. A comprehensive examination of model outputs, including runner-up candidates and their corresponding probabilities, is needed to address these issues. The beam search tree, the prevalent algorithm to sample model outputs, can inherently supply this information. Consequently, we introduce an interactive visual method for investigating the beam search tree, facilitating analysis of the decisions made by the model during generation. We quantitatively show the value of exposing the beam search tree and present five detailed analysis scenarios addressing the identified challenges. Our methodology validates existing results and offers additional insights.
△ Less
Submitted 17 October, 2023;
originally announced October 2023.
-
Reducing Ambiguities in Line-based Density Plots by Image-space Colorization
Authors:
Yumeng Xue,
Patrick Paetzold,
Rebecca Kehlbeck,
Bin Chen,
Kin Chung Kwan,
Yunhai Wang,
Oliver Deussen
Abstract:
Line-based density plots are used to reduce visual clutter in line charts with a multitude of individual lines. However, these traditional density plots are often perceived ambiguously, which obstructs the user's identification of underlying trends in complex datasets. Thus, we propose a novel image space coloring method for line-based density plots that enhances their interpretability. Our method…
▽ More
Line-based density plots are used to reduce visual clutter in line charts with a multitude of individual lines. However, these traditional density plots are often perceived ambiguously, which obstructs the user's identification of underlying trends in complex datasets. Thus, we propose a novel image space coloring method for line-based density plots that enhances their interpretability. Our method employs color not only to visually communicate data density but also to highlight similar regions in the plot, allowing users to identify and distinguish trends easily. We achieve this by performing hierarchical clustering based on the lines passing through each region and map** the identified clusters to the hue circle using circular MDS. Additionally, we propose a heuristic approach to assign each line to the most probable cluster, enabling users to analyze density and individual lines. We motivate our method by conducting a small-scale user study, demonstrating the effectiveness of our method using synthetic and real-world datasets, and providing an interactive online tool for generating colored line-based density plots.
△ Less
Submitted 22 November, 2023; v1 submitted 16 July, 2023;
originally announced July 2023.
-
SpEuler: Semantics-preserving Euler Diagrams
Authors:
Rebecca Kehlbeck,
Jochen Görtler,
Yunhai Wang,
Oliver Deussen
Abstract:
Creating comprehensible visualizations of highly overlap** set-typed data is a challenging task due to its complexity. To facilitate insights into set connectivity and to leverage semantic relations between intersections, we propose a fast two-step layout technique for Euler diagrams that are both well-matched and well-formed. Our method conforms to established form guidelines for Euler diagrams…
▽ More
Creating comprehensible visualizations of highly overlap** set-typed data is a challenging task due to its complexity. To facilitate insights into set connectivity and to leverage semantic relations between intersections, we propose a fast two-step layout technique for Euler diagrams that are both well-matched and well-formed. Our method conforms to established form guidelines for Euler diagrams regarding semantics, aesthetics, and readability. First, we establish an initial ordering of the data, which we then use to incrementally create a planar, connected, and monotone dual graph representation. In the next step, the graph is transformed into a circular layout that maintains the semantics and yields simple Euler diagrams with smooth curves. When the data cannot be represented by simple diagrams, our algorithm always falls back to a solution that is not well-formed but still well-matched, whereas previous methods often fail to produce expected results. We show the usefulness of our method for visualizing set-typed data using examples from text analysis and infographics. Furthermore, we discuss the characteristics of our approach and evaluate our method against state-of-the-art methods.
△ Less
Submitted 7 August, 2021;
originally announced August 2021.
-
Semantic Concept Spaces: Guided Topic Model Refinement using Word-Embedding Projections
Authors:
Mennatallah El-Assady,
Rebecca Kehlbeck,
Christopher Collins,
Daniel Keim,
Oliver Deussen
Abstract:
We present a framework that allows users to incorporate the semantics of their domain knowledge for topic model refinement while remaining model-agnostic. Our approach enables users to (1) understand the semantic space of the model, (2) identify regions of potential conflicts and problems, and (3) readjust the semantic relation of concepts based on their understanding, directly influencing the top…
▽ More
We present a framework that allows users to incorporate the semantics of their domain knowledge for topic model refinement while remaining model-agnostic. Our approach enables users to (1) understand the semantic space of the model, (2) identify regions of potential conflicts and problems, and (3) readjust the semantic relation of concepts based on their understanding, directly influencing the topic modeling. These tasks are supported by an interactive visual analytics workspace that uses word-embedding projections to define concept regions which can then be refined. The user-refined concepts are independent of a particular document collection and can be transferred to related corpora. All user interactions within the concept space directly affect the semantic relations of the underlying vector space model, which, in turn, change the topic modeling. In addition to direct manipulation, our system guides the users' decision-making process through recommended interactions that point out potential improvements. This targeted refinement aims at minimizing the feedback required for an efficient human-in-the-loop process. We confirm the improvements achieved through our approach in two user studies that show topic model quality improvements through our visual knowledge externalization and learning process.
△ Less
Submitted 1 August, 2019;
originally announced August 2019.
-
VIANA: Visual Interactive Annotation of Argumentation
Authors:
Fabian Sperrle,
Rita Sevastjanova,
Rebecca Kehlbeck,
Mennatallah El-Assady
Abstract:
Argumentation Mining addresses the challenging tasks of identifying boundaries of argumentative text fragments and extracting their relationships. Fully automated solutions do not reach satisfactory accuracy due to their insufficient incorporation of semantics and domain knowledge. Therefore, experts currently rely on time-consuming manual annotations. In this paper, we present a visual analytics…
▽ More
Argumentation Mining addresses the challenging tasks of identifying boundaries of argumentative text fragments and extracting their relationships. Fully automated solutions do not reach satisfactory accuracy due to their insufficient incorporation of semantics and domain knowledge. Therefore, experts currently rely on time-consuming manual annotations. In this paper, we present a visual analytics system that augments the manual annotation process by automatically suggesting which text fragments to annotate next. The accuracy of those suggestions is improved over time by incorporating linguistic knowledge and language modeling to learn a measure of argument similarity from user interactions. Based on a long-term collaboration with domain experts, we identify and model five high-level analysis tasks. We enable close reading and note-taking, annotation of arguments, argument reconstruction, extraction of argument relations, and exploration of argument graphs. To avoid context switches, we transition between all views through seamless morphing, visually anchoring all text- and graph-based layers. We evaluate our system with a two-stage expert user study based on a corpus of presidential debates. The results show that experts prefer our system over existing solutions due to the speedup provided by the automatic suggestions and the tight integration between text and graph views.
△ Less
Submitted 29 July, 2019;
originally announced July 2019.