Search | arXiv e-print repository

Interactive Prompt Debugging with Sequence Salience

Authors: Ian Tenney, Ryan Mullins, Bin Du, Shree Pandya, Minsuk Kahng, Lucas Dixon

Abstract: We present Sequence Salience, a visual tool for interactive prompt debugging with input salience methods. Sequence Salience builds on widely used salience methods for text classification and single-token prediction, and extends this to a system tailored for debugging complex LLM prompts. Our system is well-suited for long texts, and expands on previous work by 1) providing controllable aggregation… ▽ More We present Sequence Salience, a visual tool for interactive prompt debugging with input salience methods. Sequence Salience builds on widely used salience methods for text classification and single-token prediction, and extends this to a system tailored for debugging complex LLM prompts. Our system is well-suited for long texts, and expands on previous work by 1) providing controllable aggregation of token-level salience to the word, sentence, or paragraph level, making salience over long inputs tractable; and 2) supporting rapid iteration where practitioners can act on salience results, refine prompts, and run salience on the new output. We include case studies showing how Sequence Salience can help practitioners work with several complex prompting strategies, including few-shot, chain-of-thought, and constitutional principles. Sequence Salience is built on the Learning Interpretability Tool, an open-source platform for ML model visualizations, and code, notebooks, and tutorials are available at http://goo.gle/sequence-salience. △ Less

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.01361 [pdf, other]

LLM Attributor: Interactive Visual Attribution for LLM Generation

Authors: Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy, Alec Helbling, ShengYun Peng, Mansi Phute, Duen Horng Chau, Minsuk Kahng

Abstract: While large language models (LLMs) have shown remarkable capability to generate convincing text across diverse domains, concerns around its potential risks have highlighted the importance of understanding the rationale behind text generation. We present LLM Attributor, a Python library that provides interactive visualizations for training data attribution of an LLM's text generation. Our library o… ▽ More While large language models (LLMs) have shown remarkable capability to generate convincing text across diverse domains, concerns around its potential risks have highlighted the importance of understanding the rationale behind text generation. We present LLM Attributor, a Python library that provides interactive visualizations for training data attribution of an LLM's text generation. Our library offers a new way to quickly attribute an LLM's text generation to training data points to inspect model behaviors, enhance its trustworthiness, and compare model-generated text with user-provided text. We describe the visual and interactive design of our tool and highlight usage scenarios for LLaMA2 models fine-tuned with two different datasets: online articles about recent disasters and finance-related question-answer pairs. Thanks to LLM Attributor's broad support for computational notebooks, users can easily integrate it into their workflow to interactively visualize attributions of their models. For easier access and extensibility, we open-source LLM Attributor at https://github.com/poloclub/ LLM-Attribution. The video demo is available at https://youtu.be/mIG2MDQKQxM. △ Less

Submitted 1 April, 2024; originally announced April 2024.

Comments: 8 pages, 3 figures, For a video demo, see https://youtu.be/mIG2MDQKQxM

arXiv:2403.12075 [pdf, other]

Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation

Authors: Jessica Quaye, Alicia Parrish, Oana Inel, Charvi Rastogi, Hannah Rose Kirk, Minsuk Kahng, Erin van Liemt, Max Bartolo, Jess Tsang, Justin White, Nathan Clement, Rafael Mosquera, Juan Ciro, Vijay Janapa Reddi, Lora Aroyo

Abstract: With the rise of text-to-image (T2I) generative AI models reaching wide audiences, it is critical to evaluate model robustness against non-obvious attacks to mitigate the generation of offensive images. By focusing on ``implicitly adversarial'' prompts (those that trigger T2I models to generate unsafe images for non-obvious reasons), we isolate a set of difficult safety issues that human creativit… ▽ More With the rise of text-to-image (T2I) generative AI models reaching wide audiences, it is critical to evaluate model robustness against non-obvious attacks to mitigate the generation of offensive images. By focusing on ``implicitly adversarial'' prompts (those that trigger T2I models to generate unsafe images for non-obvious reasons), we isolate a set of difficult safety issues that human creativity is well-suited to uncover. To this end, we built the Adversarial Nibbler Challenge, a red-teaming methodology for crowdsourcing a diverse set of implicitly adversarial prompts. We have assembled a suite of state-of-the-art T2I models, employed a simple user interface to identify and annotate harms, and engaged diverse populations to capture long-tail safety issues that may be overlooked in standard testing. The challenge is run in consecutive rounds to enable a sustained discovery and analysis of safety pitfalls in T2I models. In this paper, we present an in-depth account of our methodology, a systematic study of novel attack strategies and discussion of safety failures revealed by challenge participants. We also release a companion visualization tool for easy exploration and derivation of insights from the dataset. The first challenge round resulted in over 10k prompt-image pairs with machine annotations for safety. A subset of 1.5k samples contains rich human annotations of harm types and attack styles. We find that 14% of images that humans consider harmful are mislabeled as ``safe'' by machines. We have identified new attack strategies that highlight the complexity of ensuring T2I model robustness. Our findings emphasize the necessity of continual auditing and adaptation as new vulnerabilities emerge. We are confident that this work will enable proactive, iterative safety assessments and promote responsible development of T2I models. △ Less

Submitted 13 May, 2024; v1 submitted 14 February, 2024; originally announced March 2024.

Comments: 10 pages, 6 figures

arXiv:2402.16611 [pdf, other]

doi 10.1145/3613905.3651007

Understanding the Dataset Practitioners Behind Large Language Model Development

Authors: Crystal Qian, Emily Reif, Minsuk Kahng

Abstract: As large language models (LLMs) become more advanced and impactful, it is increasingly important to scrutinize the data that they rely upon and produce. What is it to be a dataset practitioner doing this work? We approach this in two parts: first, we define the role of "dataset practitioners" by performing a retrospective analysis on the responsibilities of teams contributing to LLM development at… ▽ More As large language models (LLMs) become more advanced and impactful, it is increasingly important to scrutinize the data that they rely upon and produce. What is it to be a dataset practitioner doing this work? We approach this in two parts: first, we define the role of "dataset practitioners" by performing a retrospective analysis on the responsibilities of teams contributing to LLM development at a technology company, Google. Then, we conduct semi-structured interviews with a cross-section of these practitioners (N=10). We find that although data quality is a top priority, there is little consensus around what data quality is and how to evaluate it. Consequently, practitioners either rely on their own intuition or write custom code to evaluate their data. We discuss potential reasons for this phenomenon and opportunities for alignment. △ Less

Submitted 1 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: 7 pages, 2 figures. To be published in In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA '24). Revised to reflect updates from CHI LBW reviewer feedback

arXiv:2402.14880 [pdf, other]

Automatic Histograms: Leveraging Language Models for Text Dataset Exploration

Authors: Emily Reif, Crystal Qian, James Wexler, Minsuk Kahng

Abstract: Making sense of unstructured text datasets is perennially difficult, yet increasingly relevant with Large Language Models. Data workers often rely on dataset summaries, especially distributions of various derived features. Some features, like toxicity or topics, are relevant to many datasets, but many interesting features are domain specific: instruments and genres for a music dataset, or diseases… ▽ More Making sense of unstructured text datasets is perennially difficult, yet increasingly relevant with Large Language Models. Data workers often rely on dataset summaries, especially distributions of various derived features. Some features, like toxicity or topics, are relevant to many datasets, but many interesting features are domain specific: instruments and genres for a music dataset, or diseases and symptoms for a medical dataset. Accordingly, data workers often run custom analyses for each dataset, which is cumbersome and difficult. We present AutoHistograms, a visualization tool leveragingLLMs. AutoHistograms automatically identifies relevant features, visualizes them with histograms, and allows the user to interactively query the dataset for categories of entities and create new histograms. In a user study with 10 data workers (n=10), we observe that participants can quickly identify insights and explore the data using AutoHistograms, and conceptualize a broad range of applicable use cases. Together, this tool and user study contributeto the growing field of LLM-assisted sensemaking tools. △ Less

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2402.10524 [pdf, other]

LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models

Authors: Minsuk Kahng, Ian Tenney, Mahima Pushkarna, Michael Xieyang Liu, James Wexler, Emily Reif, Krystal Kallarackal, Minsuk Chang, Michael Terry, Lucas Dixon

Abstract: Automatic side-by-side evaluation has emerged as a promising approach to evaluating the quality of responses from large language models (LLMs). However, analyzing the results from this evaluation approach raises scalability and interpretability challenges. In this paper, we present LLM Comparator, a novel visual analytics tool for interactively analyzing results from automatic side-by-side evaluat… ▽ More Automatic side-by-side evaluation has emerged as a promising approach to evaluating the quality of responses from large language models (LLMs). However, analyzing the results from this evaluation approach raises scalability and interpretability challenges. In this paper, we present LLM Comparator, a novel visual analytics tool for interactively analyzing results from automatic side-by-side evaluation. The tool supports interactive workflows for users to understand when and why a model performs better or worse than a baseline model, and how the responses from two models are qualitatively different. We iteratively designed and developed the tool by closely working with researchers and engineers at a large technology company. This paper details the user challenges we identified, the design and development of the tool, and an observational study with participants who regularly evaluate their models. △ Less

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2309.06703 [pdf, other]

VLSlice: Interactive Vision-and-Language Slice Discovery

Authors: Eric Slyman, Minsuk Kahng, Stefan Lee

Abstract: Recent work in vision-and-language demonstrates that large-scale pretraining can learn generalizable models that are efficiently transferable to downstream tasks. While this may improve dataset-scale aggregate metrics, analyzing performance around hand-crafted subgroups targeting specific bias dimensions reveals systemic undesirable behaviors. However, this subgroup analysis is frequently stalled… ▽ More Recent work in vision-and-language demonstrates that large-scale pretraining can learn generalizable models that are efficiently transferable to downstream tasks. While this may improve dataset-scale aggregate metrics, analyzing performance around hand-crafted subgroups targeting specific bias dimensions reveals systemic undesirable behaviors. However, this subgroup analysis is frequently stalled by annotation efforts, which require extensive time and resources to collect the necessary data. Prior art attempts to automatically discover subgroups to circumvent these constraints but typically leverages model behavior on existing task-specific annotations and rapidly degrades on more complex inputs beyond "tabular" data, none of which study vision-and-language models. This paper presents VLSlice, an interactive system enabling user-guided discovery of coherent representation-level subgroups with consistent visiolinguistic behavior, denoted as vision-and-language slices, from unlabeled image sets. We show that VLSlice enables users to quickly generate diverse high-coherency slices in a user study (n=22) and release the tool publicly. △ Less

Submitted 13 September, 2023; originally announced September 2023.

Comments: Conference paper at ICCV 2023. 17 pages, 11 figures. https://ericslyman.com/vlslice/

ACM Class: I.4.10; I.2.7; J.4

arXiv:2305.11364 [pdf, other]

Visualizing Linguistic Diversity of Text Datasets Synthesized by Large Language Models

Authors: Emily Reif, Minsuk Kahng, Savvas Petridis

Abstract: Large language models (LLMs) can be used to generate smaller, more refined datasets via few-shot prompting for benchmarking, fine-tuning or other use cases. However, understanding and evaluating these datasets is difficult, and the failure modes of LLM-generated data are still not well understood. Specifically, the data can be repetitive in surprising ways, not only semantically but also syntactic… ▽ More Large language models (LLMs) can be used to generate smaller, more refined datasets via few-shot prompting for benchmarking, fine-tuning or other use cases. However, understanding and evaluating these datasets is difficult, and the failure modes of LLM-generated data are still not well understood. Specifically, the data can be repetitive in surprising ways, not only semantically but also syntactically and lexically. We present LinguisticLens, a novel inter-active visualization tool for making sense of and analyzing syntactic diversity of LLM-generated datasets. LinguisticLens clusters text along syntactic, lexical, and semantic axes. It supports hierarchical visualization of a text dataset, allowing users to quickly scan for an overview and inspect individual examples. The live demo is available at shorturl.at/zHOUV. △ Less

Submitted 27 September, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

arXiv:2206.02039 [pdf, other]

Beyond Value: CHECKLIST for Testing Inferences in Planning-Based RL

Authors: Kin-Ho Lam, Delyar Tabatabai, Jed Irvine, Donald Bertucci, Anita Ruangrotsakun, Minsuk Kahng, Alan Fern

Abstract: Reinforcement learning (RL) agents are commonly evaluated via their expected value over a distribution of test scenarios. Unfortunately, this evaluation approach provides limited evidence for post-deployment generalization beyond the test distribution. In this paper, we address this limitation by extending the recent CheckList testing methodology from natural language processing to planning-based… ▽ More Reinforcement learning (RL) agents are commonly evaluated via their expected value over a distribution of test scenarios. Unfortunately, this evaluation approach provides limited evidence for post-deployment generalization beyond the test distribution. In this paper, we address this limitation by extending the recent CheckList testing methodology from natural language processing to planning-based RL. Specifically, we consider testing RL agents that make decisions via online tree search using a learned transition model and value function. The key idea is to improve the assessment of future performance via a CheckList approach for exploring and assessing the agent's inferences during tree search. The approach provides the user with an interface and general query-rule mechanism for identifying potential inference flaws and validating expected inference invariances. We present a user study involving knowledgeable AI researchers using the approach to evaluate an agent trained to play a complex real-time strategy game. The results show the approach is effective in allowing users to identify previously-unknown flaws in the agent's reasoning. In addition, our analysis provides insight into how AI experts use this type of testing approach, which may help improve future instantiations. △ Less

Submitted 7 June, 2022; v1 submitted 4 June, 2022; originally announced June 2022.

Comments: This work will appear in the Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS2022) https://icaps22.icaps-conference.org/papers

arXiv:2205.06935 [pdf, other]

DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps

Authors: Donald Bertucci, Md Montaser Hamid, Yashwanthi Anand, Anita Ruangrotsakun, Delyar Tabatabai, Melissa Perez, Minsuk Kahng

Abstract: In this paper, we present DendroMap, a novel approach to interactively exploring large-scale image datasets for machine learning (ML). ML practitioners often explore image datasets by generating a grid of images or projecting high-dimensional representations of images into 2-D using dimensionality reduction techniques (e.g., t-SNE). However, neither approach effectively scales to large datasets be… ▽ More In this paper, we present DendroMap, a novel approach to interactively exploring large-scale image datasets for machine learning (ML). ML practitioners often explore image datasets by generating a grid of images or projecting high-dimensional representations of images into 2-D using dimensionality reduction techniques (e.g., t-SNE). However, neither approach effectively scales to large datasets because images are ineffectively organized and interactions are insufficiently supported. To address these challenges, we develop DendroMap by adapting Treemaps, a well-known visualization technique. DendroMap effectively organizes images by extracting hierarchical cluster structures from high-dimensional representations of images. It enables users to make sense of the overall distributions of datasets and interactively zoom into specific areas of interests at multiple levels of abstraction. Our case studies with widely-used image datasets for deep learning demonstrate that users can discover insights about datasets and trained models by examining the diversity of images, identifying underperforming subgroups, and analyzing classification errors. We conducted a user study that evaluates the effectiveness of DendroMap in grou** and searching tasks by comparing it with a gridified version of t-SNE and found that participants preferred DendroMap. DendroMap is available at https://div-lab.github.io/dendromap/. △ Less

Submitted 15 August, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

Comments: This paper has been accepted for the IEEE VIS 2022 Conference and will be published in the IEEE Transactions on Visualization and Computer Graphics

arXiv:2109.13978 [pdf, other]

Identifying Reasoning Flaws in Planning-Based RL Using Tree Explanations

Authors: Kin-Ho Lam, Zhengxian Lin, Jed Irvine, Jonathan Dodge, Zeyad T Shureih, Roli Khanna, Minsuk Kahng, Alan Fern

Abstract: Enabling humans to identify potential flaws in an agent's decision making is an important Explainable AI application. We consider identifying such flaws in a planning-based deep reinforcement learning (RL) agent for a complex real-time strategy game. In particular, the agent makes decisions via tree search using a learned model and evaluation function over interpretable states and actions. This gi… ▽ More Enabling humans to identify potential flaws in an agent's decision making is an important Explainable AI application. We consider identifying such flaws in a planning-based deep reinforcement learning (RL) agent for a complex real-time strategy game. In particular, the agent makes decisions via tree search using a learned model and evaluation function over interpretable states and actions. This gives the potential for humans to identify flaws at the level of reasoning steps in the tree, even if the entire reasoning process is too complex to understand. However, it is unclear whether humans will be able to identify such flaws due to the size and complexity of trees. We describe a user interface and case study, where a small group of AI experts and developers attempt to identify reasoning flaws due to inaccurate agent learning. Overall, the interface allowed the group to identify a number of significant flaws of varying types, demonstrating the promise of this approach. △ Less

Submitted 28 September, 2021; originally announced September 2021.

arXiv:2109.06365 [pdf, other]

doi 10.1002/ail2.46

From Heatmaps to Structural Explanations of Image Classifiers

Authors: Li Fuxin, Zhongang Qi, Saeed Khorram, Vivswan Shitole, Prasad Tadepalli, Minsuk Kahng, Alan Fern

Abstract: This paper summarizes our endeavors in the past few years in terms of explaining image classifiers, with the aim of including negative results and insights we have gained. The paper starts with describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network, without relying on human linguistic concepts. This helps us… ▽ More This paper summarizes our endeavors in the past few years in terms of explaining image classifiers, with the aim of including negative results and insights we have gained. The paper starts with describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network, without relying on human linguistic concepts. This helps users understand network classifications that are less intuitive and substantially improves user performance on a difficult fine-grained classification task of discriminating among different species of seagulls. Realizing that an important missing piece is a reliable heatmap visualization tool, we have developed I-GOS and iGOS++ utilizing integrated gradients to avoid local optima in heatmap generation, which improved the performance across all resolutions. During the development of those visualizations, we realized that for a significant number of images, the classifier has multiple different paths to reach a confident prediction. This has lead to our recent development of structured attention graphs (SAGs), an approach that utilizes beam search to locate multiple coarse heatmaps for a single image, and compactly visualizes a set of heatmaps by capturing how different combinations of image regions impact the confidence of a classifier. Through the research process, we have learned much about insights in building deep network explanations, the existence and frequency of multiple explanations, and various tricks of the trade that make explanations work. In this paper, we attempt to share those insights and opinions with the readers with the hope that some of them will be informative for future researchers on explainable deep learning. △ Less

Submitted 13 September, 2021; originally announced September 2021.

Comments: Submitted to Applied AI Letters

Journal ref: Applied AI Letters.2021;2:e46

arXiv:2108.08000 [pdf, other]

Contrastive Identification of Covariate Shift in Image Data

Authors: Matthew L. Olson, Thuy-Vy Nguyen, Gaurav Dixit, Neale Ratzlaff, Weng-Keen Wong, Minsuk Kahng

Abstract: Identifying covariate shift is crucial for making machine learning systems robust in the real world and for detecting training data biases that are not reflected in test data. However, detecting covariate shift is challenging, especially when the data consists of high-dimensional images, and when multiple types of localized covariate shift affect different subspaces of the data. Although automated… ▽ More Identifying covariate shift is crucial for making machine learning systems robust in the real world and for detecting training data biases that are not reflected in test data. However, detecting covariate shift is challenging, especially when the data consists of high-dimensional images, and when multiple types of localized covariate shift affect different subspaces of the data. Although automated techniques can be used to detect the existence of covariate shift, our goal is to help human users characterize the extent of covariate shift in large image datasets with interfaces that seamlessly integrate information obtained from the detection algorithms. In this paper, we design and evaluate a new visual interface that facilitates the comparison of the local distributions of training and test data. We conduct a quantitative user study on multi-attribute facial data to compare two different learned low-dimensional latent representations (pretrained ImageNet CNN vs. density ratio) and two user analytic workflows (nearest-neighbor vs. cluster-to-cluster). Our results indicate that the latent representation of our density ratio model, combined with a nearest-neighbor comparison, is the most effective at hel** humans identify covariate shift. △ Less

Submitted 19 August, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

Comments: IEEE VIS 2021

arXiv:2011.06733 [pdf, other]

One Explanation is Not Enough: Structured Attention Graphs for Image Classification

Authors: Vivswan Shitole, Li Fuxin, Minsuk Kahng, Prasad Tadepalli, Alan Fern

Abstract: Attention maps are a popular way of explaining the decisions of convolutional networks for image classification. Typically, for each image of interest, a single attention map is produced, which assigns weights to pixels based on their importance to the classification. A single attention map, however, provides an incomplete understanding since there are often many other maps that explain a classifi… ▽ More Attention maps are a popular way of explaining the decisions of convolutional networks for image classification. Typically, for each image of interest, a single attention map is produced, which assigns weights to pixels based on their importance to the classification. A single attention map, however, provides an incomplete understanding since there are often many other maps that explain a classification equally well. In this paper, we introduce structured attention graphs (SAGs), which compactly represent sets of attention maps for an image by capturing how different combinations of image regions impact a classifier's confidence. We propose an approach to compute SAGs and a visualization for SAGs so that deeper insight can be gained into a classifier's decisions. We conduct a user study comparing the use of SAGs to traditional attention maps for answering counterfactual questions about image classifications. Our results show that the users are more correct when answering comparative counterfactual questions based on SAGs compared to the baselines. △ Less

Submitted 7 November, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

Comments: 26 pages, 25 figures

Journal ref: NeuRIPS 2021

arXiv:2004.15004 [pdf, other]

doi 10.1109/TVCG.2020.3030418

CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization

Authors: Zijie J. Wang, Robert Turko, Omar Shaikh, Haekyu Park, Nilaksh Das, Fred Hohman, Minsuk Kahng, Duen Horng Chau

Abstract: Deep learning's great success motivates many practitioners and students to learn about this exciting technology. However, it is often challenging for beginners to take their first step due to the complexity of understanding and applying deep learning. We present CNN Explainer, an interactive visualization tool designed for non-experts to learn and examine convolutional neural networks (CNNs), a fo… ▽ More Deep learning's great success motivates many practitioners and students to learn about this exciting technology. However, it is often challenging for beginners to take their first step due to the complexity of understanding and applying deep learning. We present CNN Explainer, an interactive visualization tool designed for non-experts to learn and examine convolutional neural networks (CNNs), a foundational deep learning model architecture. Our tool addresses key challenges that novices face while learning about CNNs, which we identify from interviews with instructors and a survey with past students. CNN Explainer tightly integrates a model overview that summarizes a CNN's structure, and on-demand, dynamic visual explanation views that help users understand the underlying components of CNNs. Through smooth transitions across levels of abstraction, our tool enables users to inspect the interplay between low-level mathematical operations and high-level model structures. A qualitative user study shows that CNN Explainer helps users more easily understand the inner workings of CNNs, and is engaging and enjoyable to use. We also derive design lessons from our study. Developed using modern web technologies, CNN Explainer runs locally in users' web browsers without the need for installation or specialized hardware, broadening the public's education access to modern deep learning techniques. △ Less

Submitted 28 August, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

Comments: 11 pages, 14 figures, to be presented at IEEE VIS 2020. For a demo video, see https://youtu.be/HnWIHWFbuUQ . For a live demo, visit https://poloclub.github.io/cnn-explainer/

arXiv:2001.02004 [pdf, other]

doi 10.1145/3334480.3382899

CNN 101: Interactive Visual Learning for Convolutional Neural Networks

Authors: Zijie J. Wang, Robert Turko, Omar Shaikh, Haekyu Park, Nilaksh Das, Fred Hohman, Minsuk Kahng, Duen Horng Chau

Abstract: The success of deep learning solving previously-thought hard problems has inspired many non-experts to learn and understand this exciting technology. However, it is often challenging for learners to take the first steps due to the complexity of deep learning models. We present our ongoing work, CNN 101, an interactive visualization system for explaining and teaching convolutional neural networks.… ▽ More The success of deep learning solving previously-thought hard problems has inspired many non-experts to learn and understand this exciting technology. However, it is often challenging for learners to take the first steps due to the complexity of deep learning models. We present our ongoing work, CNN 101, an interactive visualization system for explaining and teaching convolutional neural networks. Through tightly integrated interactive views, CNN 101 offers both overview and detailed descriptions of how a model works. Built using modern web technologies, CNN 101 runs locally in users' web browsers without requiring specialized hardware, broadening the public's education access to modern deep learning techniques. △ Less

Submitted 27 February, 2020; v1 submitted 7 January, 2020; originally announced January 2020.

Comments: CHI'20 Late-Breaking Work (April 25-30, 2020), 7 pages, 3 figures

arXiv:1904.05419 [pdf, other]

doi 10.1109/VAST47406.2019.8986948

FairVis: Visual Analytics for Discovering Intersectional Bias in Machine Learning

Authors: Ángel Alexander Cabrera, Will Epperson, Fred Hohman, Minsuk Kahng, Jamie Morgenstern, Duen Horng Chau

Abstract: The growing capability and accessibility of machine learning has led to its application to many real-world domains and data about people. Despite the benefits algorithmic systems may bring, models can reflect, inject, or exacerbate implicit and explicit societal biases into their outputs, disadvantaging certain demographic subgroups. Discovering which biases a machine learning model has introduced… ▽ More The growing capability and accessibility of machine learning has led to its application to many real-world domains and data about people. Despite the benefits algorithmic systems may bring, models can reflect, inject, or exacerbate implicit and explicit societal biases into their outputs, disadvantaging certain demographic subgroups. Discovering which biases a machine learning model has introduced is a great challenge, due to the numerous definitions of fairness and the large number of potentially impacted subgroups. We present FairVis, a mixed-initiative visual analytics system that integrates a novel subgroup discovery technique for users to audit the fairness of machine learning models. Through FairVis, users can apply domain knowledge to generate and investigate known subgroups, and explore suggested and similar subgroups. FairVis' coordinated views enable users to explore a high-level overview of subgroup performance and subsequently drill down into detailed investigation of specific subgroups. We show how FairVis helps to discover biases in two real datasets used in predicting income and recidivism. As a visual analytics system devoted to discovering bias in machine learning, FairVis demonstrates how interactive visualization may help data scientists and the general public understand and create more equitable algorithmic systems. △ Less

Submitted 1 September, 2019; v1 submitted 10 April, 2019; originally announced April 2019.

Comments: Accepted as a VAST conference paper to IEEE VIS'19

arXiv:1809.01587 [pdf, other]

doi 10.1109/TVCG.2018.2864500

GAN Lab: Understanding Complex Deep Generative Models using Interactive Visual Experimentation

Authors: Minsuk Kahng, Nikhil Thorat, Duen Horng Chau, Fernanda Viégas, Martin Wattenberg

Abstract: Recent success in deep learning has generated immense interest among practitioners and students, inspiring many to learn about this new technology. While visual and interactive approaches have been successfully developed to help people more easily learn deep learning, most existing tools focus on simpler models. In this work, we present GAN Lab, the first interactive visualization tool designed fo… ▽ More Recent success in deep learning has generated immense interest among practitioners and students, inspiring many to learn about this new technology. While visual and interactive approaches have been successfully developed to help people more easily learn deep learning, most existing tools focus on simpler models. In this work, we present GAN Lab, the first interactive visualization tool designed for non-experts to learn and experiment with Generative Adversarial Networks (GANs), a popular class of complex deep learning models. With GAN Lab, users can interactively train generative models and visualize the dynamic training process's intermediate results. GAN Lab tightly integrates an model overview graph that summarizes GAN's structure, and a layered distributions view that helps users interpret the interplay between submodels. GAN Lab introduces new interactive experimentation features for learning complex deep learning models, such as step-by-step training at multiple levels of abstraction for understanding intricate training dynamics. Implemented using TensorFlow.js, GAN Lab is accessible to anyone via modern web browsers, without the need for installation or specialized hardware, overcoming a major practical challenge in deploying interactive tools for deep learning. △ Less

Submitted 5 September, 2018; originally announced September 2018.

Comments: This paper will be published in the IEEE Transactions on Visualization and Computer Graphics, 25(1), January 2019, and presented at IEEE VAST 2018

arXiv:1801.06889 [pdf, other]

Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers

Authors: Fred Hohman, Minsuk Kahng, Robert Pienta, Duen Horng Chau

Abstract: Deep learning has recently seen rapid development and received significant attention due to its state-of-the-art performance on previously-thought hard problems. However, because of the internal complexity and nonlinear structure of deep neural networks, the underlying decision making processes for why these models are achieving such performance are challenging and sometimes mystifying to interpre… ▽ More Deep learning has recently seen rapid development and received significant attention due to its state-of-the-art performance on previously-thought hard problems. However, because of the internal complexity and nonlinear structure of deep neural networks, the underlying decision making processes for why these models are achieving such performance are challenging and sometimes mystifying to interpret. As deep learning spreads across domains, it is of paramount importance that we equip users of deep learning with tools for understanding when a model works correctly, when it fails, and ultimately how to improve its performance. Standardized toolkits for building neural networks have helped democratize deep learning; visual analytics systems have now been developed to support model explanation, interpretation, debugging, and improvement. We present a survey of the role of visual analytics in deep learning research, which highlights its short yet impactful history and thoroughly summarizes the state-of-the-art using a human-centered interrogative framework, focusing on the Five W's and How (Why, Who, What, How, When, and Where). We conclude by highlighting research directions and open research problems. This survey helps researchers and practitioners in both visual analytics and deep learning to quickly learn key aspects of this young and rapidly growing body of research, whose impact spans a diverse range of domains. △ Less

Submitted 14 May, 2018; v1 submitted 21 January, 2018; originally announced January 2018.

Comments: Under review for IEEE Transactions on Visualization and Computer Graphics (TVCG)

ACM Class: H.5.2; I.5.1.d; I.6.9.c; I.6.9.f; I.2.6.g

arXiv:1704.01942 [pdf, other]

ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models

Authors: Minsuk Kahng, Pierre Y. Andrews, Aditya Kalro, Duen Horng Chau

Abstract: While deep learning models have achieved state-of-the-art accuracies for many prediction tasks, understanding these models remains a challenge. Despite the recent interest in develo** visual tools to help users interpret deep learning models, the complexity and wide variety of models deployed in industry, and the large-scale datasets that they used, pose unique design challenges that are inadequ… ▽ More While deep learning models have achieved state-of-the-art accuracies for many prediction tasks, understanding these models remains a challenge. Despite the recent interest in develo** visual tools to help users interpret deep learning models, the complexity and wide variety of models deployed in industry, and the large-scale datasets that they used, pose unique design challenges that are inadequately addressed by existing work. Through participatory design sessions with over 15 researchers and engineers at Facebook, we have developed, deployed, and iteratively improved ActiVis, an interactive visualization system for interpreting large-scale deep learning models and results. By tightly integrating multiple coordinated views, such as a computation graph overview of the model architecture, and a neuron activation view for pattern discovery and comparison, users can explore complex deep neural network models at both the instance- and subset-level. ActiVis has been deployed on Facebook's machine learning platform. We present case studies with Facebook researchers and engineers, and usage scenarios of how ActiVis may work with different models. △ Less

Submitted 8 August, 2017; v1 submitted 6 April, 2017; originally announced April 2017.

Comments: Will be presented at IEEE VAST 2017 and published in IEEE Transactions on Visualization and Computer Graphics, 24(1)

arXiv:1609.08535 [pdf, other]

Chronodes: Interactive Multi-focus Exploration of Event Sequences

Authors: Peter J Polack Jr, Shang-Tse Chen, Minsuk Kahng, Kaya de Barbaro, Moushumi Sharmin, Rahul Basole, Duen Horng Chau

Abstract: The advent of mobile health technologies presents new challenges that existing visualizations, interactive tools, and algorithms are not yet designed to support. In dealing with uncertainty in sensor data and high-dimensional physiological records, we must seek to improve current tools that make sense of health data from traditional perspectives in event-based trend discovery. With Chronodes, a sy… ▽ More The advent of mobile health technologies presents new challenges that existing visualizations, interactive tools, and algorithms are not yet designed to support. In dealing with uncertainty in sensor data and high-dimensional physiological records, we must seek to improve current tools that make sense of health data from traditional perspectives in event-based trend discovery. With Chronodes, a system developed to help researchers collect, interpret, and model mobile health (mHealth) data, we posit a series of interaction techniques that enable new approaches to understanding and exploring event-based data. From numerous and discontinuous mobile health data streams, Chronodes finds and visualizes frequent event sequences that reveal common chronological patterns across participants and days. By then promoting the sequences as interactive elements, Chronodes presents opportunities for finding, defining, and comparing cohorts of participants that exhibit particular behaviors. We applied Chronodes to a real 40GB mHealth dataset capturing about 400 hours of data. Through our pilot study with 20 behavioral and biomedical health experts, we gained insights into Chronodes' efficacy, limitations, and potential applicability to a wide range of healthcare scenarios. △ Less

Submitted 27 September, 2016; originally announced September 2016.

arXiv:1603.02371 [pdf, other]

doi 10.14778/2994509.2994520

Interactive Browsing and Navigation in Relational Databases

Authors: Minsuk Kahng, Shamkant B. Navathe, John T. Stasko, Duen Horng Chau

Abstract: Although researchers have devoted considerable attention to hel** database users formulate queries, many users still find it challenging to specify queries that involve joining tables. To help users construct join queries for exploring relational databases, we propose ETable, a novel presentation data model that provides users with a presentation-level interactive view. This view compactly prese… ▽ More Although researchers have devoted considerable attention to hel** database users formulate queries, many users still find it challenging to specify queries that involve joining tables. To help users construct join queries for exploring relational databases, we propose ETable, a novel presentation data model that provides users with a presentation-level interactive view. This view compactly presents one-to-many and many-to-many relationships within a single enriched table by allowing a cell to contain a set of entity references. Users can directly interact with this enriched table to incrementally construct complex queries and navigate databases on a conceptual entity-relationship level. In a user study, participants performed a range of database querying tasks faster with ETable than with a commercial graphical query builder. Subjective feedback about ETable was also positive. All participants found that ETable was easier to learn and helpful for exploring databases. △ Less

Submitted 20 August, 2016; v1 submitted 7 March, 2016; originally announced March 2016.

Comments: VLDB 2016

Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 9, No. 12, pp. 1017-1028, 2016

arXiv:1506.01406 [pdf, other]

doi 10.1007/978-3-319-46227-1_39

M-Flash: Fast Billion-scale Graph Computation Using a Bimodal Block Processing Model

Authors: Hugo Gualdron, Robson Cordeiro, Jose Rodrigues-Jr, Duen Chau, Minsuk Kahng, U Kang

Abstract: Recent graph computation approaches have demonstrated that a single PC can perform efficiently on billion-scale graphs. While these approaches achieve scalability by optimizing I/O operations, they do not fully exploit the capabilities of modern hard drives and processors. To overcome their performance, in this work, we introduce the Bimodal Block Processing (BBP), an innovation that is able to bo… ▽ More Recent graph computation approaches have demonstrated that a single PC can perform efficiently on billion-scale graphs. While these approaches achieve scalability by optimizing I/O operations, they do not fully exploit the capabilities of modern hard drives and processors. To overcome their performance, in this work, we introduce the Bimodal Block Processing (BBP), an innovation that is able to boost the graph computation by minimizing the I/O cost even further. With this strategy, we achieved the following contributions: (1) M-Flash, the fastest graph computation framework to date; (2) a flexible and simple programming model to easily implement popular and essential graph algorithms, including the first single-machine billion-scale eigensolver; and (3) extensive experiments on real graphs with up to 6.6 billion edges, demonstrating M-Flash's consistent and significant speedup. △ Less

Submitted 14 September, 2016; v1 submitted 3 June, 2015; originally announced June 2015.

Comments: Hugo Gualdron, Robson Cordeiro, Jose Rodrigues-Jr, Duen Chau, Minsuk Kahng, U Kang (2016) M-Flash: Fast Billion-scale Graph Computation Using a Bimodal Block Processing Model, In: ECML-PKDD16, pages 623-640, LNCS, Springer

arXiv:1505.06792 [pdf, other]

Seeing the Forest through the Trees: Adaptive Local Exploration of Large Graphs

Authors: Robert Pienta, Zhiyuan Lin, Minsuk Kahng, Jilles Vreeken, Partha P. Talukdar, James Abello, Ganesh Parameswaran, Duen Horng Chau

Abstract: Visualization is a powerful paradigm for exploratory data analysis. Visualizing large graphs, however, often results in a meaningless hairball. In this paper, we propose a different approach that helps the user adaptively explore large million-node graphs from a local perspective. For nodes that the user investigates, we propose to only show the neighbors with the most subjectively interesting nei… ▽ More Visualization is a powerful paradigm for exploratory data analysis. Visualizing large graphs, however, often results in a meaningless hairball. In this paper, we propose a different approach that helps the user adaptively explore large million-node graphs from a local perspective. For nodes that the user investigates, we propose to only show the neighbors with the most subjectively interesting neighborhoods. We contribute novel ideas to measure this interestingness in terms of how surprising a neighborhood is given the background distribution, as well as how well it fits the nodes the user chose to explore. We introduce FACETS, a fast and scalable method for visually exploring large graphs. By implementing our above ideas, it allows users to look into the forest through its trees. Empirical evaluation shows that our method works very well in practice, providing rankings of nodes that match interests of users. Moreover, as it scales linearly, FACETS is suited for the exploration of very large graphs. △ Less

Submitted 21 July, 2016; v1 submitted 25 May, 2015; originally announced May 2015.

Showing 1–24 of 24 results for author: Kahng, M