-
Investigating Interaction Modes and User Agency in Human-LLM Collaboration for Domain-Specific Data Analysis
Authors:
Jiajing Guo,
Vikram Mohanty,
Jorge Piazentin Ono,
Hongtao Hao,
Liang Gou,
Liu Ren
Abstract:
Despite demonstrating robust capabilities in general-domain data-operation tasks, Large Language Models (LLMs) may exhibit shortcomings when applied to domain-specific tasks. We consider the design of domain-specific AI-powered data analysis tools from two dimensions: interaction and user agency. We implemented two design probes that fall on the two ends of the two dimensions: an open-ended high agency (OHA) prototype and a structured low agency (SLA) prototype. We conducted an interview study with nine data scientists to investigate (1) how users perceived the LLM outputs for data analysis assistance, and (2) how the two design probes, OHA and SLA, affected user behavior, performance, and perceptions. Our study revealed insights regarding participants' interactions with LLMs, how they perceived the results, their desire for explainability concerning LLM outputs, their need for collaboration with other users, and how they envisioned the utility of LLMs in their workflow.
Submitted 9 May, 2024;
originally announced May 2024.
-
Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows
Authors:
Jasmine Y. Shih,
Vishal Mohanty,
Yannis Katsis,
Hariharan Subramonyam
Abstract:
Domain experts can play a crucial role in guiding data scientists to optimize machine learning models while ensuring contextual relevance for downstream use. However, in current workflows, such collaboration is challenging due to differing expertise, abstract documentation practices, and lack of access and visibility into low-level implementation artifacts. To address these challenges and enable domain expert participation, we introduce CellSync, a collaboration framework comprising (1) a Jupyter Notebook extension that continuously tracks changes to dataframes and model metrics and (2) a Large Language Model powered visualization dashboard that makes those changes interpretable to domain experts. Through CellSync's cell-level dataset visualization with code summaries, domain experts can interactively examine how individual data and modeling operations impact different data segments. The chat features enable data-centric conversations and targeted feedback to data scientists. Our preliminary evaluation shows that CellSync provides transparency and promotes critical discussions about the intents and implications of data operations.
Submitted 3 May, 2024;
originally announced May 2024.
-
What Lies Beneath? Exploring the Impact of Underlying AI Model Updates in AI-Infused Systems
Authors:
Vikram Mohanty,
Jude Lim,
Kurt Luther
Abstract:
As AI models evolve, understanding the influence of underlying models on user experience and performance in AI-infused systems becomes critical, particularly while transitioning between different model versions. We studied the influence of model change by conducting two complementary studies in the context of AI-based facial recognition for historical person identification tasks. First, we ran an online experiment where crowd workers interacted with two different facial recognition models: an older version and a recently updated, developer-certified more accurate model. Second, we studied a real-world deployment of these models on a popular historical photo platform through a diary study with 10 users. Our findings shed light on models affecting human-AI team performance, users' abilities to differentiate between different models, the folk theories they develop, and how these theories influence their preferences. Drawing from these insights, we discuss design implications for updating models in AI-infused systems.
Submitted 17 November, 2023;
originally announced November 2023.
-
DoubleCheck: Designing Community-based Assessability for Historical Person Identification
Authors:
Vikram Mohanty,
Kurt Luther
Abstract:
Historical photos are valuable for their cultural and economic significance, but can be difficult to identify accurately due to various challenges such as low-quality images, lack of corroborating evidence, and limited research resources. Misidentified photos can have significant negative consequences, including lost economic value, incorrect historical records, and the spread of misinformation that can lead to perpetuating conspiracy theories. To accurately assess the credibility of a photo identification (ID), it may be necessary to conduct investigative research, use domain knowledge, and consult experts. In this paper, we introduce DoubleCheck, a quality assessment framework for verifying historical photo IDs on Civil War Photo Sleuth (CWPS), a popular online platform for identifying American Civil War-era photos using facial recognition and crowdsourcing. DoubleCheck focuses on improving CWPS's user experience and system architecture to display information useful for assessing the quality of historical photo IDs on CWPS. In a mixed-methods evaluation of DoubleCheck, we found that users contributed a wide diversity of sources for photo IDs, which helped facilitate the community's assessment of these IDs through DoubleCheck's provenance visualizations. Further, DoubleCheck's quality assessment badges and visualizations supported users in making accurate assessments of photo IDs, even in cases involving ID conflicts.
Submitted 28 August, 2023;
originally announced August 2023.
-
Probabilistic Genotype-Phenotype Maps Reveal Mutational Robustness of RNA Folding, Spin Glasses, and Quantum Circuits
Authors:
Anna Sappington,
Vaibhav Mohanty
Abstract:
Recent studies of genotype-phenotype (GP) maps have reported universally enhanced phenotypic robustness to genotype mutations, a feature essential to evolution. Virtually all of these studies make a simplifying assumption that each genotype maps deterministically to a single phenotype. Here, we introduce probabilistic genotype-phenotype (PrGP) maps, where each genotype maps to a vector of phenotype probabilities, as a more realistic framework for investigating robustness. We study three model systems to show that our generalized framework can handle uncertainty emerging from various physical sources: (1) thermal fluctuation in RNA folding, (2) external field disorder in spin glass ground state finding, and (3) superposition and entanglement in quantum circuits, which are realized experimentally on a 7-qubit IBM quantum computer. In all three cases, we observe a novel biphasic robustness scaling which is enhanced relative to random expectation for more frequent phenotypes and approaches random expectation for less frequent phenotypes.
Submitted 4 January, 2023;
originally announced January 2023.
-
Robustness and Stability of Spin Glass Ground States to Perturbed Interactions
Authors:
Vaibhav Mohanty,
Ard A. Louis
Abstract:
Across many scientific and engineering disciplines, it is important to consider how much the output of a given system changes due to perturbations of the input. Here, we investigate the glassy phase of $\pm J$ spin glasses at zero temperature by calculating the robustness of the ground states to flips in the sign of single interactions. For random graphs and the Sherrington-Kirkpatrick model, we find relatively large sets of bond configurations that generate the same ground state. These sets can themselves be analyzed as subgraphs of the interaction domain, and we compute many of their topological properties. In particular, we find that the robustness, equivalent to the average degree, of these subgraphs is much higher than one would expect from a random model. Most notably, it scales in the same logarithmic way with the size of the subgraph as has been found in genotype-phenotype maps for RNA secondary structure folding, protein quaternary structure, gene regulatory networks, as well as for models for genetic programming. The similarity between these disparate systems suggests that this scaling may have a more universal origin.
Submitted 20 October, 2022; v1 submitted 9 December, 2020;
originally announced December 2020.
-
Auditing Indian Elections
Authors:
Vishal Mohanty,
Nicholas Akinyokun,
Andrew Conway,
Chris Culnane,
Philip B. Stark,
Vanessa Teague
Abstract:
Indian Electronic Voting Machines (EVMs) will be fitted with printers that produce Voter-Verifiable Paper Audit Trails (VVPATs) in time for the 2019 general election. VVPATs provide evidence that each vote was recorded as the voter intended, without having to trust the perfection or security of the EVMs.
However, confidence in election results requires more: VVPATs must be preserved inviolate and then actually used to check the reported election result in a trustworthy way that the public can verify. A full manual tally from the VVPATs could be prohibitively expensive and time-consuming; moreover, it is difficult for the public to determine whether a full hand count was conducted accurately. We show how Risk-Limiting Audits (RLAs) could provide high confidence in Indian election results. Compared to full hand recounts, RLAs typically require manually inspecting far fewer VVPATs when the outcome is correct, and are much easier for the electorate to observe in adequate detail to determine whether the result is trustworthy.
Submitted 25 January, 2019; v1 submitted 10 January, 2019;
originally announced January 2019.
-
A 5-Dimensional Tonnetz for Nearly Symmetric Hexachords
Authors:
Vaibhav Mohanty
Abstract:
The standard 2-dimensional Tonnetz describes parsimonious voice-leading connections between major and minor triads as the 3-dimensional Tonnetz does for dominant seventh and half-diminished seventh chords. In this paper, I present a geometric model for a 5-dimensional Tonnetz for parsimonious voice-leading between nearly symmetric hexachords of the mystic-Wozzeck genus. Cartesian coordinates for points on this discretized grid, generalized coordinate collections for 5-simplices corresponding to mystic and Wozzeck chords, and the geometric nearest-neighbors of a selected chord are derived.
Submitted 15 June, 2018;
originally announced June 2018.
-
Dodecatonic Cycles and Parsimonious Voice-Leading in the Mystic-Wozzeck Genus
Authors:
Vaibhav Mohanty
Abstract:
This paper develops a unified voice-leading model for the genus of mystic and Wozzeck chords. These voice-leading regions are constructed by perturbing symmetric partitions of the octave, and new Neo-Riemannian transformations between nearly symmetric hexachords are defined. The behaviors of these transformations are shown within visual representations of the voice-leading regions for the mystic-Wozzeck genus.
Submitted 26 May, 2018;
originally announced May 2018.
-
DeepVO: A Deep Learning approach for Monocular Visual Odometry
Authors:
Vikram Mohanty,
Shubh Agrawal,
Shaswat Datta,
Arna Ghosh,
Vishnu Dutt Sharma,
Debashish Chakravarty
Abstract:
Deep Learning based techniques have been adopted with precision to solve many standard computer vision problems, including image classification, object detection and segmentation. Despite the widespread success of these approaches, they have not yet been widely exploited for solving the standard perception-related problems encountered in autonomous navigation, such as Visual Odometry (VO), Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM). This paper analyzes the problem of Monocular Visual Odometry using a Deep Learning-based framework, instead of the regular 'feature detection and tracking' pipeline approaches. Several experiments were performed to understand the influence of a known/unknown environment, a conventional trackable feature and pre-trained activations tuned for object classification on the network's ability to accurately estimate the motion trajectory of the camera (or the vehicle). Based on these observations, we propose a Convolutional Neural Network architecture that is best suited for estimating the object's pose under known environment conditions and that displays promising results when inferring the actual scale using just a single camera in real time.
Submitted 18 November, 2016;
originally announced November 2016.