-
Magic Markup: Maintaining Document-External Markup with an LLM
Authors:
Edward Misback,
Zachary Tatlock,
Steven L. Tanimoto
Abstract:
Text documents, including programs, typically have human-readable semantic structure. Historically, programmatic access to these semantics has required explicit in-document tagging. Especially in systems where the text has an execution semantics, this means it is an opt-in feature that is hard to support properly. Today, language models offer a new method: metadata can be bound to entities in chan…
▽ More
Text documents, including programs, typically have human-readable semantic structure. Historically, programmatic access to these semantics has required explicit in-document tagging. Especially in systems where the text has an execution semantics, this means it is an opt-in feature that is hard to support properly. Today, language models offer a new method: metadata can be bound to entities in changing text using a model's human-like understanding of semantics, with no requirements on the document structure. This method expands the applications of document annotation, a fundamental operation in program writing, debugging, maintenance, and presentation. We contribute a system that employs an intelligent agent to re-tag modified programs, enabling rich annotations to automatically follow code as it evolves. We also contribute a formal problem definition, an empirical synthetic benchmark suite, and our benchmark generator. Our system achieves an accuracy of 90% on our benchmarks and can replace a document's tags in parallel at a rate of 5 seconds per tag. While there remains significant room for improvement, we find performance reliable enough to justify further exploration of applications.
△ Less
Submitted 6 March, 2024;
originally announced March 2024.
-
rTisane: Externalizing conceptual models for data analysis increases engagement with domain knowledge and improves statistical model quality
Authors:
Eunice Jun,
Edward Misback,
Jeffrey Heer,
René Just
Abstract:
Statistical models should accurately reflect analysts' domain knowledge about variables and their relationships. While recent tools let analysts express these assumptions and use them to produce a resulting statistical model, it remains unclear what analysts want to express and how externalization impacts statistical model quality. This paper addresses these gaps. We first conduct an exploratory s…
▽ More
Statistical models should accurately reflect analysts' domain knowledge about variables and their relationships. While recent tools let analysts express these assumptions and use them to produce a resulting statistical model, it remains unclear what analysts want to express and how externalization impacts statistical model quality. This paper addresses these gaps. We first conduct an exploratory study of analysts using a domain-specific language (DSL) to express conceptual models. We observe a preference for detailing how variables relate and a desire to allow, and then later resolve, ambiguity in their conceptual models. We leverage these findings to develop rTisane, a DSL for expressing conceptual models augmented with an interactive disambiguation process. In a controlled evaluation, we find that rTisane's DSL helps analysts engage more deeply with and accurately externalize their assumptions. rTisane also leads to statistical models that match analysts' assumptions, maintain analysis intent, and better fit the data.
△ Less
Submitted 24 October, 2023;
originally announced October 2023.
-
Odyssey: An Interactive Workbench for Expert-Driven Floating-Point Expression Rewriting
Authors:
Edward Misback,
Caleb C. Chan,
Brett Saiki,
Eunice Jun,
Zachary Tatlock,
Pavel Panchekha
Abstract:
In recent years, researchers have proposed a number of automated tools to identify and improve floating-point rounding error in mathematical expressions. However, users struggle to effectively apply these tools. In this paper, we work with novices, experts, and tool developers to investigate user needs during the expression rewriting process. We find that users follow an iterative design process.…
▽ More
In recent years, researchers have proposed a number of automated tools to identify and improve floating-point rounding error in mathematical expressions. However, users struggle to effectively apply these tools. In this paper, we work with novices, experts, and tool developers to investigate user needs during the expression rewriting process. We find that users follow an iterative design process. They want to compare expressions on multiple input ranges, integrate and guide various rewriting tools and understand where errors come from. We organize this investigation's results into a three-stage workflow and implement that workflow in a new, extensible workbench dubbed Odyssey. Odyssey enables users to: (1) diagnose problems in an expression, (2) generate solutions automatically or by hand, and (3) tune their results. Odyssey tracks a working set of expressions and turns a state-of-the-art automated tool "inside out," giving the user access to internal heuristics, algorithms, and functionality. In a user study, Odyssey enabled five expert numerical analysts to solve challenging rewriting problems where state-of-the-art automated tools fail. In particular, the experts unanimously praised Odyssey's novel support for interactive range modification and local error visualization.
△ Less
Submitted 9 August, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.