-
(A)I Am Not a Lawyer, But...: Engaging Legal Experts towards Responsible LLM Policies for Legal Advice
Authors:
Inyoung Cheong,
King Xia,
K. J. Kevin Feng,
Quan Ze Chen,
Amy X. Zhang
Abstract:
Large language models (LLMs) are increasingly capable of providing users with advice in a wide range of professional domains, including legal advice. However, relying on LLMs for legal queries raises concerns due to the significant expertise required and the potential real-world consequences of the advice. To explore \textit{when} and \textit{why} LLMs should or should not provide advice to users, we conducted workshops with 20 legal experts using methods inspired by case-based reasoning. The provided realistic queries ("cases") allowed experts to examine granular, situation-specific concerns and overarching technical and legal constraints, producing a concrete set of contextual considerations for LLM developers. By synthesizing the factors that impacted LLM response appropriateness, we present a 4-dimension framework: (1) User attributes and behaviors, (2) Nature of queries, (3) AI capabilities, and (4) Social impacts. We share experts' recommendations for LLM response strategies, which center around helping users identify `right questions to ask' and relevant information rather than providing definitive legal judgments. Our findings reveal novel legal considerations, such as unauthorized practice of law, confidentiality, and liability for inaccurate advice, that have been overlooked in the literature. The case-based deliberation method enabled us to elicit fine-grained, practice-informed insights that surpass those from de-contextualized surveys or speculative principles. These findings underscore the applicability of our method for translating domain-specific professional knowledge and practices into policies that can guide LLM behavior in a more responsible direction.
Submitted 3 May, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Bringing Social Computing to Secondary School Classrooms
Authors:
Kianna Bolante,
Kevin Chen,
Quan Ze Chen,
Amy Zhang
Abstract:
Social computing is the study of how technology shapes human social interactions. This topic has become increasingly relevant to secondary school students (ages 11--18) as more of young people's everyday social experiences take place online, particularly with the continuing effects of the COVID-19 pandemic. However, social computing topics are rarely touched upon in existing middle and high school curricula. We seek to introduce concepts from social computing to secondary school students so they can understand how computing has wide-ranging social implications that touch upon their everyday lives, as well as think critically about both the positive and negative sides of different social technology designs.
In this report, we present a series of six lessons combining presentations and hands-on activities covering topics within social computing and detail our experience teaching these lessons to approximately 1,405 students across 13 middle and high schools in our local school district. We developed lessons covering how social computing relates to the topics of Data Management, Encrypted Messaging, Human-Computer Interaction Careers, Machine Learning and Bias, Misinformation, and Online Behavior. We found that 81.13% of students expressed greater interest in the content of our lessons compared to their interest in STEM overall. We also found from pre- and post-lesson comprehension questions that 63.65% learned new concepts from the main activity. We release all lesson materials on a website for public use. From our experience, we observed that students were engaged in these topics and found enjoyment in finding connections between computing and their own lives.
Submitted 17 January, 2024;
originally announced January 2024.
-
Case Repositories: Towards Case-Based Reasoning for AI Alignment
Authors:
K. J. Kevin Feng,
Quan Ze Chen,
Inyoung Cheong,
King Xia,
Amy X. Zhang
Abstract:
Case studies commonly form the pedagogical backbone in law, ethics, and many other domains that face complex and ambiguous societal questions informed by human values. Similar complexities and ambiguities arise when we consider how AI should be aligned in practice: when faced with vast quantities of diverse (and sometimes conflicting) values from different individuals and communities, with whose values is AI to align, and how should AI do so? We propose a complementary approach to constitutional AI alignment, grounded in ideas from case-based reasoning (CBR), that focuses on the construction of policies through judgments on a set of cases. We present a process to assemble such a case repository by: 1) gathering a set of ``seed'' cases -- questions one may ask an AI system -- in a particular domain, 2) eliciting domain-specific key dimensions for cases through workshops with domain experts, 3) using LLMs to generate variations of cases not seen in the wild, and 4) engaging with the public to judge and improve cases. We then discuss how such a case repository could assist in AI alignment, both through directly acting as precedents to ground acceptable behaviors, and as a medium for individuals and communities to engage in moral reasoning around AI.
Submitted 26 November, 2023; v1 submitted 17 November, 2023;
originally announced November 2023.
-
Case Law Grounding: Aligning Judgments of Humans and AI on Socially-Constructed Concepts
Authors:
Quan Ze Chen,
Amy X. Zhang
Abstract:
Systems for making determinations on socially-constructed and complex concepts at scale are increasingly being deployed. To make such fuzzy concepts tractable for training and evaluating AI, aligning model outputs, or human-in-the-loop workflows, the prevailing strategy involves developing `constitutions' in the form of rules, policies, or principles. However, high-level rules often fail to capture situational nuances or have differing interpretations, resulting in inconsistent decisions. In this work, we introduce case law grounding (CLG), a hybrid workflow inspired by case law in the legal realm where past judgments on specific cases inform new decisions. Evaluating on two task domains, we find that CLG can improve alignment of decisions (+9.6% and +10.9% accuracy) and consistency ($\Delta\bar{\kappa}$ of +0.263 and +0.433) of human decision-makers, while also providing auditable rationales. We also find similarly substantial alignment improvements for an LLM decision-maker (+25% and +23% accuracy).
Submitted 10 October, 2023;
originally announced October 2023.
-
Confidence Contours: Uncertainty-Aware Annotation for Medical Semantic Segmentation
Authors:
Andre Ye,
Quan Ze Chen,
Amy Zhang
Abstract:
Medical image segmentation modeling is a high-stakes task where understanding of uncertainty is crucial for addressing visual ambiguity. Prior work has developed segmentation models utilizing probabilistic or generative mechanisms to infer uncertainty from labels where annotators draw a singular boundary. However, as these annotations cannot represent an individual annotator's uncertainty, models trained on them produce uncertainty maps that are difficult to interpret. We propose a novel segmentation representation, Confidence Contours, which uses high- and low-confidence ``contours'' to capture uncertainty directly, and develop a novel annotation system for collecting contours. We conduct an evaluation on the Lung Image Database Consortium (LIDC) dataset and a synthetic dataset. From an annotation study with 30 participants, results show that Confidence Contours provide high representative capacity without considerably higher annotator effort. We also find that general-purpose segmentation models can learn Confidence Contours at the same performance level as standard singular annotations. Finally, from interviews with 5 medical experts, we find that Confidence Contour maps are more interpretable than Bayesian maps due to representation of structural uncertainty.
Submitted 20 December, 2023; v1 submitted 14 August, 2023;
originally announced August 2023.
-
Skin Deep: Investigating Subjectivity in Skin Tone Annotations for Computer Vision Benchmark Datasets
Authors:
Teanna Barrett,
Quan Ze Chen,
Amy X. Zhang
Abstract:
To investigate the well-observed racial disparities in computer vision systems that analyze images of humans, researchers have turned to skin tone as a more objective annotation than race metadata for fairness performance evaluations. However, the current state of skin tone annotation procedures is highly varied. For instance, researchers use a range of untested scales and skin tone categories, have unclear annotation procedures, and provide inadequate analyses of uncertainty. In addition, little attention is paid to the positionality of the humans involved in the annotation process--both designers and annotators alike--and the historical and sociological context of skin tone in the United States. Our work is the first to investigate the skin tone annotation process as a sociotechnical project. We surveyed recent skin tone annotation procedures and conducted annotation experiments to examine how subjective understandings of skin tone are embedded in skin tone annotation procedures. Our systematic literature review revealed the uninterrogated association between skin tone and race and the limited effort to analyze annotator uncertainty in current procedures for skin tone annotation in computer vision evaluation. Our experiments demonstrated that design decisions in the annotation procedure, such as the order in which the skin tone scale is presented or additional context in the image (i.e., presence of a face), significantly affected the resulting inter-annotator agreement and individual uncertainty of skin tone annotations. We call for greater reflexivity in the design, analysis, and documentation of procedures for evaluation using skin tone.
Submitted 15 May, 2023;
originally announced May 2023.
-
Judgment Sieve: Reducing Uncertainty in Group Judgments through Interventions Targeting Ambiguity versus Disagreement
Authors:
Quan Ze Chen,
Amy X. Zhang
Abstract:
When groups of people are tasked with making a judgment, the issue of uncertainty often arises. Existing methods to reduce uncertainty typically focus on iteratively improving specificity in the overall task instruction. However, uncertainty can arise from multiple sources, such as ambiguity of the item being judged due to limited context, or disagreements among the participants due to different perspectives and an under-specified task. A one-size-fits-all intervention may be ineffective if it is not targeted to the right source of uncertainty. In this paper we introduce a new workflow, Judgment Sieve, to reduce uncertainty in tasks involving group judgment in a targeted manner. By utilizing measurements that separate different sources of uncertainty during an initial round of judgment elicitation, we can then select a targeted intervention adding context or deliberation to most effectively reduce uncertainty on each item being judged. We test our approach on two tasks: rating word pair similarity and toxicity of online comments, showing that targeted interventions reduced uncertainty for the most uncertain cases. In the top 10% of cases, we saw an ambiguity reduction of 21.4% and 25.7%, and a disagreement reduction of 22.2% and 11.2% for the two tasks respectively. We also found through a simulation that our targeted approach reduced the average uncertainty scores for both sources of uncertainty as opposed to uniform approaches where reductions in average uncertainty from one source came with an increase for the other.
Submitted 2 May, 2023;
originally announced May 2023.
-
Designing Word Filter Tools for Creator-led Comment Moderation
Authors:
Shagun Jhaver,
Quan Ze Chen,
Detlef Knauss,
Amy Zhang
Abstract:
Online social platforms centered around content creators often allow comments on content, where creators moderate the comments they receive. As creators can face overwhelming numbers of comments, with some of them harassing or hateful, platforms typically provide tools such as word filters for creators to automate aspects of moderation. From needfinding interviews with 19 creators about how they use existing tools, we found that they struggled with writing good filters as well as organizing and revisiting their filters, due to the difficulty of determining what the filters actually catch. To address these issues, we present FilterBuddy, a system that supports creators in authoring new filters or building from existing filter lists, as well as organizing their filters and visualizing what comments are captured over time. We conducted an early-stage evaluation of FilterBuddy with YouTube creators, finding that participants see FilterBuddy not just as a moderation tool, but also a means to organize their comments to better understand their audiences.
Submitted 17 February, 2022;
originally announced February 2022.
-
Truncated eigenvalue equation and long wavelength behavior of lattice gauge theory
Authors:
S. H. Guo,
Q. Z. Chen,
X. Fang,
J. Liu,
X. Q. Luo,
W. Zheng
Abstract:
We review our new method, which might be the most direct and efficient way for approaching the continuum physics from Hamiltonian lattice gauge theory. It consists of solving the eigenvalue equation with a truncation scheme preserving the continuum limit. The efficiency has been confirmed by the observations of the scaling behaviors for the long wavelength vacuum wave functions and mass gaps in (2+1)-dimensional models and (1+1)-dimensional $σ$ model even at very low truncation orders. Most of these results show rapid convergence to the available Monte Carlo data, ensuring the reliability of our method.
Submitted 7 September, 1995;
originally announced September 1995.
-
Spectroscopy and large scale wave functions
Authors:
Q. Z. Chen,
S. H. Guo,
X. Q. Luo,
A. Segui
Abstract:
We discuss the relevance of long wavelength excitations for the low energy spectrum of QCD, and try to develop an efficient method for solving the Schrödinger equation, and for extracting the glueball masses and long wavelength functions of the ground and excited states. Some technical problems appearing in the calculations of SU(3) gauge theory are discussed.
Submitted 6 September, 1995;
originally announced September 1995.