-
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Authors:
Anton Xue,
Avishree Khare,
Rajeev Alur,
Surbhi Goel,
Eric Wong
Abstract:
We study how to subvert language models from following the rules. We model rule-following as inference in propositional Horn logic, a mathematical system in which rules have the form "if $P$ and $Q$, then $R$" for some propositions $P$, $Q$, and $R$. We prove that although transformers can faithfully abide by such rules, maliciously crafted prompts can nevertheless mislead even theoretically constructed models. Empirically, we find that attacks on our theoretical models mirror popular attacks on large language models. Our work suggests that studying smaller theoretical models can help understand the behavior of large language models in rule-based settings like logical reasoning and jailbreak attacks.
Submitted 21 June, 2024;
originally announced July 2024.
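The abstract above models rule-following as inference over rules of the form "if $P$ and $Q$, then $R$". As a minimal illustrative sketch (not the paper's transformer construction), plain forward chaining over such Horn rules can be written as follows; the `Rule` encoding and the name `forward_chain` are assumptions made for this example only.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Hypothetical encoding: a Horn rule is (set of premises, single conclusion).
Rule = Tuple[FrozenSet[str], str]


def forward_chain(rules: Iterable[Rule], facts: Iterable[str]) -> Set[str]:
    """Return every proposition derivable from `facts` using `rules`."""
    rules = list(rules)
    known = set(facts)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for premises, conclusion in rules:
            # A rule fires once all of its premises are already known.
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# "if P and Q, then R" and "if R, then S"
rules = [(frozenset({"P", "Q"}), "R"), (frozenset({"R"}), "S")]
derived = forward_chain(rules, {"P", "Q"})
```

With both `P` and `Q` given, the loop derives `R` and then `S`; with only `P`, no rule fires and the fact set is returned unchanged.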
-
Towards Compositionality in Concept Learning
Authors:
Adam Stein,
Aaditya Naik,
Yinjun Wu,
Mayur Naik,
Eric Wong
Abstract:
Concept-based interpretability methods offer a lens into the internals of foundation models by decomposing their embeddings into high-level concepts. These concept representations are most useful when they are compositional, meaning that the individual concepts compose to explain the full sample. We show that existing unsupervised concept extraction methods find concepts which are not compositional. To automatically discover compositional concept representations, we identify two salient properties of such representations, and propose Compositional Concept Extraction (CCE) for finding concepts which obey these properties. We evaluate CCE on five different datasets over image and text data. Our evaluation shows that CCE finds more compositional concept representations than baselines and yields better accuracy on four downstream classification tasks. Code and data are available at https://github.com/adaminsky/compositional_concepts.
Submitted 26 June, 2024;
originally announced June 2024.
-
Avoiding Copyright Infringement via Machine Unlearning
Authors:
Guangyao Dou,
Zheyuan Liu,
Qing Lyu,
Kaize Ding,
Eric Wong
Abstract:
Pre-trained Large Language Models (LLMs) have demonstrated remarkable capabilities but also pose risks by learning and generating copyrighted material, leading to significant legal and ethical concerns. To address these issues, it is critical for model owners to be able to unlearn copyrighted content at various time steps. We explore the setting of sequential unlearning, where copyrighted content is removed over multiple time steps - a scenario that has not been rigorously addressed. To tackle this challenge, we propose Stable Sequential Unlearning (SSU), a novel unlearning framework for LLMs, designed to have a more stable process to remove copyrighted content from LLMs throughout different time steps using task vectors, by incorporating additional random labeling loss and applying gradient-based weight saliency mapping. Experiments demonstrate that SSU finds a good balance between unlearning efficacy and maintaining the model's general knowledge compared to existing baselines.
Submitted 16 June, 2024;
originally announced June 2024.
-
Data-Efficient Learning with Neural Programs
Authors:
Alaia Solko-Breslin,
Seewon Choi,
Ziyang Li,
Neelay Velingker,
Rajeev Alur,
Mayur Naik,
Eric Wong
Abstract:
Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiable logic programming language, techniques from neurosymbolic learning are applicable, but in general, the learning for neural programs requires estimating the gradients of black-box components. We present an algorithm for learning neural programs, called ISED, that only relies on input-output samples of black-box components. For evaluation, we introduce new benchmarks that involve calls to modern LLMs such as GPT-4 and also consider benchmarks from the neurosymbolic learning literature. Our evaluation shows that for the latter benchmarks, ISED has comparable performance to state-of-the-art neurosymbolic frameworks. For the former, we use adaptations of prior work on gradient approximations of black-box components as a baseline, and show that ISED achieves comparable accuracy but in a more data- and sample-efficient manner.
Submitted 10 June, 2024;
originally announced June 2024.
-
Rethinking Programming Paradigms in the QC-HPC Context
Authors:
Silvina Caino-Lores,
Daniel Claudino,
Eugene Dumitrescu,
Travis S. Humble,
Sonia Lopez Alarcon,
Elaine Wong
Abstract:
Programming for today's quantum computers is making significant strides toward modern workflows compatible with high performance computing (HPC), but fundamental challenges still remain in the integration of these vastly different technologies. Quantum computing (QC) programming languages share some common ground, as well as their emerging runtimes and algorithmic modalities. In this short paper, we explore avenues of refinement for the quantum processing unit (QPU) in the context of many-tasks management, asynchronous or otherwise, in order to understand the value it can play in linking QC with HPC. Through examples, we illustrate how its potential for scientific discovery might be realized.
Submitted 5 June, 2024;
originally announced June 2024.
-
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
Authors:
Yinjun Wu,
Mayank Keoliya,
Kan Chen,
Neelay Velingker,
Ziyang Li,
Emily J Getzen,
Qi Long,
Mayur Naik,
Ravi B Parikh,
Eric Wong
Abstract:
Designing faithful yet accurate AI models is challenging, particularly in the field of individual treatment effect estimation (ITE). ITE prediction models deployed in critical settings such as healthcare should ideally be (i) accurate, and (ii) provide faithful explanations. However, current solutions are inadequate: state-of-the-art black-box models do not supply explanations, post-hoc explainers for black-box models lack faithfulness guarantees, and self-interpretable models greatly compromise accuracy. To address these issues, we propose DISCRET, a self-interpretable ITE framework that synthesizes faithful, rule-based explanations for each sample. A key insight behind DISCRET is that explanations can serve dually as database queries to identify similar subgroups of samples. We provide a novel RL algorithm to efficiently synthesize these explanations from a large search space. We evaluate DISCRET on diverse tasks involving tabular, image, and text data. DISCRET outperforms the best self-interpretable models and has accuracy comparable to the best black-box models while providing faithful explanations. DISCRET is available at https://github.com/wuyinjun-1993/DISCRET-ICML2024.
Submitted 2 June, 2024;
originally announced June 2024.
-
Analyzing Language Bias Between French and English in Conventional Multilingual Sentiment Analysis Models
Authors:
Ethan Parker Wong,
Faten M'hiri
Abstract:
Inspired by the 'Bias Considerations in Bilingual Natural Language Processing' report by Statistics Canada, this study delves into potential biases in multilingual sentiment analysis between English and French. Given a 50-50 dataset of French and English, we aim to determine if there exists a language bias and explore how the incorporation of more diverse datasets in the future might affect the equity of multilingual Natural Language Processing (NLP) systems. By employing Support Vector Machine (SVM) and Naive Bayes models on three balanced datasets, we reveal potential biases in multilingual sentiment classification. Utilizing Fairlearn, a tool for assessing bias in machine learning models, our findings indicate nuanced outcomes. French data outperforms English across accuracy, recall, and F1 score metrics in both models, hinting at a language bias favoring French. However, Fairlearn's metrics suggest that the SVM approaches equitable levels with demographic parity ratios of 0.963, 0.989, and 0.985 for the three separate datasets, indicating near-equitable treatment across languages. In contrast, Naive Bayes demonstrates greater disparities, evidenced by demographic parity ratios of 0.813, 0.908, and 0.961. These findings reveal the importance of developing equitable multilingual NLP systems, particularly as we anticipate the inclusion of more datasets in various languages in the future.
Submitted 7 May, 2024;
originally announced May 2024.
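The demographic parity ratio quoted in the abstract above is commonly defined as the smallest per-group selection rate divided by the largest, so a value of 1.0 means every group receives positive predictions at the same rate. The sketch below computes that quantity in plain Python rather than through Fairlearn; the function name, the toy predictions, and the use of language codes as groups are assumptions for illustration and are not taken from the study.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def demographic_parity_ratio(y_pred: Sequence[int],
                             groups: Sequence[Hashable]) -> float:
    """Smallest group selection rate divided by the largest one."""
    if len(y_pred) != len(groups):
        raise ValueError("y_pred and groups must have the same length")
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    # Selection rate = fraction of a group's samples predicted positive.
    rates = [positives[g] / totals[g] for g in totals]
    highest = max(rates)
    # Assumption: if no group receives any positive prediction, treat the
    # groups as equally treated and report 1.0.
    return min(rates) / highest if highest > 0 else 1.0


# Hypothetical toy data: four French ("fr") and four English ("en") samples.
y_pred = [1, 0, 1, 1, 1, 0, 0, 1]
groups = ["fr", "fr", "fr", "fr", "en", "en", "en", "en"]
ratio = demographic_parity_ratio(y_pred, groups)
```

In this toy case the French selection rate is 0.75 and the English rate is 0.5, so the ratio is 0.5 / 0.75, roughly 0.667, which is well below the near-parity values reported for the SVM.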
-
Practice-informed Patterns for Organising Large Groups in Distributed Mixed Reality Collaboration
Authors:
Emily Wong,
Juan Sánchez Esquivel,
Jens Emil Grønbæk,
Germán Leiva,
Eduardo Velloso
Abstract:
Collaborating across dissimilar, distributed spaces presents numerous challenges for computer-aided spatial communication. Mixed reality (MR) can blend selected surfaces, allowing collaborators to work in blended f-formations (facing formations), even when their workstations are physically misaligned. Since collaboration often involves more than just participant pairs, this research examines how we might scale MR experiences for large-group collaboration. To do so, this study recruited collaboration designers (CDs) to evaluate and reimagine MR for large-scale collaboration. These CDs were engaged in a four-part user study that involved a technology probe, a semi-structured interview, a speculative low-fidelity prototyping activity and a validation session. The outcomes of this paper contribute (1) a set of collaboration design principles to inspire future computer-supported collaborative work, (2) eight collaboration patterns for blended f-formations and collaboration at scale and (3) theoretical implications for f-formations and space-place relationships. As a result, this work creates a blueprint for scaling collaboration across distributed spaces.
Submitted 9 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
Expressiveness of Commutative Quantum Circuits: A Probabilistic Approach
Authors:
Jorge M. Ramirez,
Elaine Wong,
Caio Alves,
Sarah Chehade,
Ryan Bennink
Abstract:
This study investigates the frame potential and expressiveness of commutative quantum circuits. Based on the Fourier series representation of these circuits, we express quantum expectation and pairwise fidelity as characteristic functions of random variables, and expressiveness as the recurrence probability of a random walk on a lattice. A central outcome of our work includes formulas to approximate the frame potential and expressiveness for any commutative quantum circuit, underpinned by convergence theorems in probability theory. We identify the lattice volume of the random walk as a means to approximate expressiveness based on circuit architecture. In the specific case of commutative circuits involving Pauli-$Z$ rotations, we provide theoretical results relating expressiveness and circuit structure. Our probabilistic representation also provides a means for bounding and approximately calculating the frame potential of a circuit through sampling methods.
Submitted 30 April, 2024;
originally announced April 2024.
-
A Cross-Platform Execution Engine for the Quantum Intermediate Representation
Authors:
Elaine Wong,
Vicente Leyton Ortega,
Daniel Claudino,
Seth Johnson,
Sharmin Afrose,
Meenambika Gowrishankar,
Anthony M. Cabrera,
Travis S. Humble
Abstract:
Hybrid languages like the Quantum Intermediate Representation (QIR) are essential for programming systems that mix quantum and conventional computing models, while execution of these programs is often deferred to a system-specific implementation. Here, we describe and demonstrate the QIR Execution Engine (QIR-EE) for parsing, interpreting, and executing QIR across multiple hardware platforms. QIR-EE uses LLVM to execute hybrid instructions specifying quantum programs and, by design, presents extension points that support customized runtime and hardware environments. We demonstrate an implementation that uses the XACC quantum hardware-accelerator library to dispatch prototypical quantum programs on different commercial quantum platforms and numerical simulators, and we validate execution of QIR-EE on the IonQ Harmony and Quantinuum H1-1 hardware. Our results highlight the efficiency of hybrid executable architectures for handling mixed instructions, managing mixed data, and integrating with quantum computing frameworks to realize cross-platform execution.
Submitted 22 April, 2024;
originally announced April 2024.
-
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
Authors:
Patrick Chao,
Edoardo Debenedetti,
Alexander Robey,
Maksym Andriushchenko,
Francesco Croce,
Vikash Sehwag,
Edgar Dobriban,
Nicolas Flammarion,
George J. Pappas,
Florian Tramer,
Hamed Hassani,
Eric Wong
Abstract:
Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content. Evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation techniques do not adequately address. First, there is no clear standard of practice regarding jailbreaking evaluation. Second, existing works compute costs and success rates in incomparable ways. And third, numerous works are not reproducible, as they withhold adversarial prompts, involve closed-source code, or rely on evolving proprietary APIs. To address these challenges, we introduce JailbreakBench, an open-sourced benchmark with the following components: (1) an evolving repository of state-of-the-art adversarial prompts, which we refer to as jailbreak artifacts; (2) a jailbreaking dataset comprising 100 behaviors -- both original and sourced from prior work -- which align with OpenAI's usage policies; (3) a standardized evaluation framework at https://github.com/JailbreakBench/jailbreakbench that includes a clearly defined threat model, system prompts, chat templates, and scoring functions; and (4) a leaderboard at https://jailbreakbench.github.io/ that tracks the performance of attacks and defenses for various LLMs. We have carefully considered the potential ethical implications of releasing this benchmark, and believe that it will be a net positive for the community.
Submitted 16 June, 2024; v1 submitted 27 March, 2024;
originally announced April 2024.
-
Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing
Authors:
Jiabao Ji,
Bairu Hou,
Alexander Robey,
George J. Pappas,
Hamed Hassani,
Yang Zhang,
Eric Wong,
Shiyu Chang
Abstract:
Aligned large language models (LLMs) are vulnerable to jailbreaking attacks, which bypass the safeguards of targeted LLMs and fool them into generating objectionable content. While initial defenses show promise against token-based threat models, there do not exist defenses that provide robustness against semantic attacks and avoid unfavorable trade-offs between robustness and nominal performance. To meet this need, we propose SEMANTICSMOOTH, a smoothing-based defense that aggregates the predictions of multiple semantically transformed copies of a given input prompt. Experimental results demonstrate that SEMANTICSMOOTH achieves state-of-the-art robustness against GCG, PAIR, and AutoDAN attacks while maintaining strong nominal performance on instruction following benchmarks such as InstructionFollowing and AlpacaEval. The codes will be publicly available at https://github.com/UCSB-NLP-Chang/SemanticSmooth.
Submitted 28 February, 2024; v1 submitted 25 February, 2024;
originally announced February 2024.
-
Tyche: Stochastic In-Context Learning for Medical Image Segmentation
Authors:
Marianne Rakic,
Hallee E. Wong,
Jose Javier Gonzalez Ortiz,
Beth Cimini,
John Guttag,
Adrian V. Dalca
Abstract:
Existing learning-based solutions to medical image segmentation have two important shortcomings. First, for most new segmentation tasks, a new model has to be trained or fine-tuned. This requires extensive resources and machine learning expertise, and is therefore often infeasible for medical researchers and clinicians. Second, most existing segmentation methods produce a single deterministic segmentation mask for a given image. In practice, however, there is often considerable uncertainty about what constitutes the correct segmentation, and different expert annotators will often segment the same image differently. We tackle both of these problems with Tyche, a model that uses a context set to generate stochastic predictions for previously unseen tasks without the need to retrain. Tyche differs from other in-context segmentation methods in two important ways. (1) We introduce a novel convolution block architecture that enables interactions among predictions. (2) We introduce in-context test-time augmentation, a new mechanism to provide prediction stochasticity. When combined with appropriate model design and loss functions, Tyche can predict a set of plausible diverse segmentation candidates for new or unseen medical images and segmentation tasks without the need to retrain.
Submitted 24 January, 2024;
originally announced January 2024.
-
ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image
Authors:
Hallee E. Wong,
Marianne Rakic,
John Guttag,
Adrian V. Dalca
Abstract:
Biomedical image segmentation is a crucial part of both scientific research and clinical care. With enough labelled data, deep learning models can be trained to accurately automate specific biomedical image segmentation tasks. However, manually segmenting images to create training data is highly labor intensive and requires domain expertise. We present ScribblePrompt, a flexible neural network based interactive segmentation tool for biomedical imaging that enables human annotators to segment previously unseen structures using scribbles, clicks, and bounding boxes. Through rigorous quantitative experiments, we demonstrate that given comparable amounts of interaction, ScribblePrompt produces more accurate segmentations than previous methods on datasets unseen during training. In a user study with domain experts, ScribblePrompt reduced annotation time by 28% while improving Dice by 15% compared to the next best method. ScribblePrompt's success rests on a set of careful design decisions. These include a training strategy that incorporates both a highly diverse set of images and tasks, novel algorithms for simulated user interactions and labels, and a network that enables fast inference. We showcase ScribblePrompt in an online demo and provide code at https://scribbleprompt.csail.mit.edu
Submitted 12 April, 2024; v1 submitted 12 December, 2023;
originally announced December 2023.
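The abstract above reports segmentation quality as Dice. For reference, the standard Dice coefficient between a predicted mask $A$ and a reference mask $B$ is $2|A \cap B| / (|A| + |B|)$. The sketch below computes it for flat binary masks; the function name and the toy masks are assumptions for illustration and do not come from the ScribblePrompt code.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient between two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumption: two empty masks are treated as a perfect match.
    return 2 * intersection / total if total else 1.0


# Hypothetical 4-pixel masks: the prediction over-segments by one pixel.
pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 0, 0]
score = dice_score(pred_mask, true_mask)
```

Here one pixel overlaps and the masks contain three foreground pixels in total, giving a score of 2/3.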
-
Initialization Matters for Adversarial Transfer Learning
Authors:
Andong Hua,
Jindong Gu,
Zhiyu Xue,
Nicholas Carlini,
Eric Wong,
Yao Qin
Abstract:
With the prevalence of the Pretraining-Finetuning paradigm in transfer learning, the robustness of downstream tasks has become a critical concern. In this work, we delve into adversarial robustness in transfer learning and reveal the critical role of initialization, including both the pretrained model and the linear head. First, we discover the necessity of an adversarially robust pretrained model. Specifically, we reveal that with a standard pretrained model, Parameter-Efficient Finetuning (PEFT) methods either fail to be adversarially robust or continue to exhibit significantly degraded adversarial robustness on downstream tasks, even with adversarial training during finetuning. Leveraging a robust pretrained model, surprisingly, we observe that a simple linear probing can outperform full finetuning and other PEFT methods with random initialization on certain datasets. We further identify that linear probing excels in preserving robustness from the robust pretraining. Based on this, we propose Robust Linear Initialization (RoLI) for adversarial finetuning, which initializes the linear head with the weights obtained by adversarial linear probing to maximally inherit the robustness from pretraining. Across five different image classification datasets, we demonstrate the effectiveness of RoLI and achieve new state-of-the-art results. Our code is available at https://github.com/DongXzz/RoLI.
Submitted 30 March, 2024; v1 submitted 9 December, 2023;
originally announced December 2023.
-
Deep Multimodal Fusion for Surgical Feedback Classification
Authors:
Rafal Kocielnik,
Elyssa Y. Wong,
Timothy N. Chu,
Lydia Lin,
De-An Huang,
Jiayun Wang,
Anima Anandkumar,
Andrew J. Hung
Abstract:
Quantification of real-time informal feedback delivered by an experienced surgeon to a trainee during surgery is important for skill improvements in surgical training. Such feedback in the live operating room is inherently multimodal, consisting of verbal conversations (e.g., questions and answers) as well as non-verbal elements (e.g., through visual cues like pointing to anatomic elements). In this work, we leverage a clinically-validated five-category classification of surgical feedback: "Anatomic", "Technical", "Procedural", "Praise" and "Visual Aid". We then develop a multi-label machine learning model to classify these five categories of surgical feedback from inputs of text, audio, and video modalities. The ultimate goal of our work is to help automate the annotation of real-time contextual surgical feedback at scale. Our automated classification of surgical feedback achieves AUCs ranging from 71.5 to 77.6 with the fusion improving performance by 3.1%. We also show that high-quality manual transcriptions of feedback audio from experts improve AUCs to between 76.5 and 96.2, which demonstrates a clear path toward future improvements. Empirically, we find that the Staged training strategy, with first pre-training each modality separately and then training them jointly, is more effective than training different modalities all together. We also present intuitive findings on the importance of modalities for different feedback categories. This work offers an important first look at the feasibility of automated classification of real-world live surgical feedback based on text, audio, and video modalities.
Submitted 5 December, 2023;
originally announced December 2023.
-
Sum-of-Parts Models: Faithful Attributions for Groups of Features
Authors:
Weiqiu You,
Helen Qu,
Marco Gatti,
Bhuvnesh Jain,
Eric Wong
Abstract:
An explanation of a machine learning model is considered "faithful" if it accurately reflects the model's decision-making process. However, explanations such as feature attributions for deep learning are not guaranteed to be faithful, and can produce potentially misleading interpretations. In this work, we develop Sum-of-Parts (SOP), a class of models whose predictions come with grouped feature attributions that are faithful-by-construction. This model decomposes a prediction into an interpretable sum of scores, each of which is directly attributable to a sparse group of features. We evaluate SOP on benchmarks with standard interpretability metrics, and in a case study, we use the faithful explanations from SOP to help astrophysicists discover new knowledge about galaxy formation.
Submitted 24 October, 2023;
originally announced October 2023.
-
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
Authors:
Chongyu Fan,
Jiancheng Liu,
Yihua Zhang,
Eric Wong,
Dennis Wei,
Sijia Liu
Abstract:
With evolving data regulations, machine unlearning (MU) has become an important tool for fostering trust and safety in today's AI models. However, existing MU methods focusing on data and/or weight perspectives often suffer limitations in unlearning accuracy, stability, and cross-domain applicability. To address these challenges, we introduce the concept of 'weight saliency' for MU, drawing parallels with input saliency in model explanation. This innovation directs MU's attention toward specific model weights rather than the entire model, improving effectiveness and efficiency. The resultant method that we call saliency unlearning (SalUn) narrows the performance gap with 'exact' unlearning (model retraining from scratch after removing the forgetting data points). To the best of our knowledge, SalUn is the first principled MU approach that can effectively erase the influence of forgetting data, classes, or concepts in both image classification and generation tasks. As highlighted below, For example, SalUn yields a stability advantage in high-variance random data forgetting, e.g., with a 0.2% gap compared to exact unlearning on the CIFAR-10 dataset. Moreover, in preventing conditional diffusion models from generating harmful images, SalUn achieves nearly 100% unlearning accuracy, outperforming current state-of-the-art baselines like Erased Stable Diffusion and Forget-Me-Not. Codes are available at https://github.com/OPTML-Group/Unlearn-Saliency. (WARNING: This paper contains model outputs that may be offensive in nature.)
Submitted 4 April, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.
-
Jailbreaking Black Box Large Language Models in Twenty Queries
Authors:
Patrick Chao,
Alexander Robey,
Edgar Dobriban,
Hamed Hassani,
George J. Pappas,
Eric Wong
Abstract:
There is growing interest in ensuring that large language models (LLMs) align with human values. However, the alignment of such models is vulnerable to adversarial jailbreaks, which coax LLMs into overriding their safety guardrails. The identification of these vulnerabilities is therefore instrumental in understanding inherent weaknesses and preventing future misuse. To this end, we propose Prompt Automatic Iterative Refinement (PAIR), an algorithm that generates semantic jailbreaks with only black-box access to an LLM. PAIR -- which is inspired by social engineering attacks -- uses an attacker LLM to automatically generate jailbreaks for a separate targeted LLM without human intervention. In this way, the attacker LLM iteratively queries the target LLM to update and refine a candidate jailbreak. Empirically, PAIR often requires fewer than twenty queries to produce a jailbreak, which is orders of magnitude more efficient than existing algorithms. PAIR also achieves competitive jailbreaking success rates and transferability on open and closed-source LLMs, including GPT-3.5/4, Vicuna, and PaLM-2.
Submitted 13 October, 2023; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Comparing Styles across Languages
Authors:
Shreya Havaldar,
Matthew Pressimone,
Eric Wong,
Lyle Ungar
Abstract:
Understanding how styles differ across languages is advantageous for training both humans and computers to generate culturally appropriate text. We introduce an explanation framework to extract stylistic differences from multilingual LMs and compare styles across languages. Our framework (1) generates comprehensive style lexica in any language and (2) consolidates feature importances from LMs into comparable lexical categories. We apply this framework to compare politeness, creating the first holistic multilingual politeness dataset and exploring how politeness varies across four languages. Our approach enables an effective evaluation of how distinct linguistic categories contribute to stylistic variations and provides interpretable insights into how people communicate differently around the world.
Submitted 4 December, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks
Authors:
Alexander Robey,
Eric Wong,
Hamed Hassani,
George J. Pappas
Abstract:
Despite efforts to align large language models (LLMs) with human intentions, widely-used LLMs such as GPT, Llama, and Claude are susceptible to jailbreaking attacks, wherein an adversary fools a targeted LLM into generating objectionable content. To address this vulnerability, we propose SmoothLLM, the first algorithm designed to mitigate jailbreaking attacks. Based on our finding that adversarially-generated prompts are brittle to character-level changes, our defense randomly perturbs multiple copies of a given input prompt, and then aggregates the corresponding predictions to detect adversarial inputs. Across a range of popular LLMs, SmoothLLM sets the state-of-the-art for robustness against the GCG, PAIR, RandomSearch, and AmpleGCG jailbreaks. SmoothLLM is also resistant against adaptive GCG attacks, exhibits a small, though non-negligible trade-off between robustness and nominal performance, and is compatible with any LLM. Our code is publicly available at https://github.com/arobey1/smooth-llm.
Submitted 11 June, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Saturn's ancient regular satellites
Authors:
Emily W. Wong,
Ramon Brasser,
Stephanie C. Werner,
Michelle R. Kirchoff
Abstract:
Are Saturn's regular satellites young or old? And how old are Enceladus' cratered plains? To answer these questions we computed model surface ages of the most heavily cratered terrains on Saturn's regular icy satellites using new high-resolution outer Solar System evolution simulations coupled with improved estimates of the trans-Neptunian object populations. The output of the simulations allowed us to construct a model impact chronology onto Saturn, which automatically applies to its regular satellites. We used crater densities and our impact chronology onto Saturn to construct model impact-crater isochrons, i.e., the scaling of the satellite crater production function with time. The surface ages derived for the cratered plains on Mimas, Enceladus, Tethys, Dione and Rhea range from 4.1 Ga to 4.4 Ga, with the surfaces of Mimas and Enceladus roughly 200 Myr younger than those of the outer three satellites. Uncertainties in these ages are less than 300 Myr. The calculated model surface ages of these satellites are consistent over as much as two orders of magnitude in the observed crater diameter. The similarity of the crater production function amongst all satellites suggests that they were bombarded by a single impactor source. This work supports the idea that Saturn's regular satellites are ancient, and has implications for their formation and their tidal evolution.
Submitted 7 September, 2023;
originally announced September 2023.
-
Fe substitution in URu$_2$Si$_2$: singlet magnetism in an extended Doniach phase diagram
Authors:
Andrea Marino,
Denise S. Christovam,
Chun-Fu Chang,
Johannes Falke,
Chang-Yang Kuo,
Chi-Nan Wu,
Martin Sundermann,
Andrea Amorese,
Hlynur Gretarsson,
Eric Lee Wong,
Camilla M. Moir,
Yuang Deng,
M. Brian Maple,
Peter Thalmeier,
Liu Hao Tjeng,
Andrea Severing
Abstract:
The application of pressure as well as the successive substitution of Ru with Fe in the hidden order (HO) compound URu$_2$Si$_2$ leads to the formation of the large moment antiferromagnetic phase (LMAFM). Here we have investigated the substitution series URu$_{2-x}$Fe$_x$Si$_2$ from $x$\,=\,0.0 to 2.0 by U\,4$f$ core-level photoelectron spectroscopy and have observed non-monotonic changes in the spectra. The initial increase and subsequent decrease of the spectral weight of the 4$f$ core level satellite with increasing $x$ stands for a non-monotonic 5$f$ filling across the substitution series. The competition of chemical pressure and increase of the density of states at the Fermi energy, both due to substitution of Ru with Fe, can explain such a behavior. An extended Doniach phase diagram including the $x$ dependence of the density of states is proposed. Also in URu$_{2-x}$Fe$_x$Si$_2$ the ground state is a singlet or quasi-doublet state consisting of two singlets. Hence, the formation of magnetic order in the URu$_{2-x}$Fe$_x$Si$_2$ substitution series must be explained within a singlet magnetism model.
Submitted 23 August, 2023;
originally announced August 2023.
-
TorchQL: A Programming Framework for Integrity Constraints in Machine Learning
Authors:
Aaditya Naik,
Adam Stein,
Yinjun Wu,
Mayur Naik,
Eric Wong
Abstract:
Finding errors in machine learning applications requires a thorough exploration of their behavior over data. Existing approaches used by practitioners are often ad-hoc and lack the abstractions needed to scale this process. We present TorchQL, a programming framework to evaluate and improve the correctness of machine learning applications. TorchQL allows users to write queries to specify and check integrity constraints over machine learning models and datasets. It seamlessly integrates relational algebra with functional programming to allow for highly expressive queries using only eight intuitive operators. We evaluate TorchQL on diverse use-cases including finding critical temporal inconsistencies in objects detected across video frames in autonomous driving, finding data imputation errors in time-series medical records, finding data labeling errors in real-world images, and evaluating biases and constraining outputs of language models. Our experiments show that TorchQL enables up to 13x faster query executions than baselines like Pandas and MongoDB, and up to 40% shorter queries than native Python. We also conduct a user study and find that TorchQL is natural enough for developers familiar with Python to specify complex integrity constraints.
Submitted 14 February, 2024; v1 submitted 13 August, 2023;
originally announced August 2023.
-
A Search for Technosignatures Around 11,680 Stars with the Green Bank Telescope at 1.15-1.73 GHz
Authors:
Jean-Luc Margot,
Megan G. Li,
Pavlo Pinchuk,
Nathan Myhrvold,
Larry Lesyna,
Lea E. Alcantara,
Megan T. Andrakin,
Jeth Arunseangroj,
Damien S. Baclet,
Madison H. Belk,
Zerxes R. Bhadha,
Nicholas W. Brandis,
Robert E. Carey,
Harrison P. Cassar,
Sai S. Chava,
Calvin Chen,
James Chen,
Kellen T. Cheng,
Alessia Cimbri,
Benjamin Cloutier,
Jordan A. Combitsis,
Kelly L. Couvrette,
Brandon P. Coy,
Kyle W. Davis,
Antoine F. Delcayre,
et al. (56 additional authors not shown)
Abstract:
We conducted a search for narrowband radio signals over four observing sessions in 2020-2023 with the L-band receiver (1.15-1.73 GHz) of the 100 m diameter Green Bank Telescope. We pointed the telescope in the directions of 62 TESS Objects of Interest, capturing radio emissions from a total of ~11,680 stars and planetary systems in the ~9 arcminute beam of the telescope. All detections were either automatically rejected or visually inspected and confirmed to be of anthropogenic nature. In this work, we also quantified the end-to-end efficiency of radio SETI pipelines with a signal injection and recovery analysis. The UCLA SETI pipeline recovers 94.0% of the injected signals over the usable frequency range of the receiver and 98.7% of the injections when regions of dense RFI are excluded. In another pipeline that uses incoherent sums of 51 consecutive spectra, the recovery rate is ~15 times smaller at ~6%. The pipeline efficiency affects calculations of transmitter prevalence and SETI search volume. Accordingly, we developed an improved Drake Figure of Merit and a formalism to place upper limits on transmitter prevalence that take the pipeline efficiency and transmitter duty cycle into account. Based on our observations, we can state at the 95% confidence level that fewer than 6.6% of stars within 100 pc host a transmitter that is detectable in our search (EIRP > 1e13 W). For stars within 20,000 ly, the fraction of stars with detectable transmitters (EIRP > 5e16 W) is at most 3e-4. Finally, we showed that the UCLA SETI pipeline natively detects the signals detected with AI techniques by Ma et al. (2023).
Submitted 15 October, 2023; v1 submitted 4 August, 2023;
originally announced August 2023.
-
Stability Guarantees for Feature Attributions with Multiplicative Smoothing
Authors:
Anton Xue,
Rajeev Alur,
Eric Wong
Abstract:
Explanation methods for machine learning models tend not to provide any formal guarantees and may not reflect the underlying decision-making process. In this work, we analyze stability as a property for reliable feature attribution methods. We prove that relaxed variants of stability are guaranteed if the model is sufficiently Lipschitz with respect to the masking of features. We develop a smoothing method called Multiplicative Smoothing (MuS) to achieve such a model. We show that MuS overcomes the theoretical limitations of standard smoothing techniques and can be integrated with any classifier and feature attribution method. We evaluate MuS on vision and language models with various feature attribution methods, such as LIME and SHAP, and demonstrate that MuS endows feature attributions with non-trivial stability guarantees.
Submitted 26 October, 2023; v1 submitted 12 July, 2023;
originally announced July 2023.
-
Rationality of Four-Valued Families of Weil Sums of Binomials
Authors:
Daniel J. Katz,
Allison E. Wong
Abstract:
We investigate the rationality of Weil sums of binomials of the form $W^{K,s}_u=\sum_{x \in K} ψ(x^s - u x)$, where $K$ is a finite field whose canonical additive character is $ψ$, and where $u$ is an element of $K^{\times}$ and $s$ is a positive integer relatively prime to $|K^\times|$, so that $x \mapsto x^s$ is a permutation of $K$. The Weil spectrum for $K$ and $s$, which is the family of values $W^{K,s}_u$ as $u$ runs through $K^\times$, is of interest in arithmetic geometry and in several information-theoretic applications. The Weil spectrum always contains at least three distinct values if $s$ is nondegenerate (i.e., if $s$ is not a power of $p$ modulo $|K^\times|$, where $p$ is the characteristic of $K$). It is already known that if the Weil spectrum contains precisely three distinct values, then they must all be rational integers. We show that if the Weil spectrum contains precisely four distinct values, then they must all be rational integers, with the sole exception of the case where $|K|=5$ and $s \equiv 3 \pmod{4}$.
Submitted 6 April, 2024; v1 submitted 26 June, 2023;
originally announced June 2023.
-
Plasma flows during the ablation stage of an over-massed pulsed-power-driven exploding planar wire array
Authors:
R. Datta,
J. Angel,
J. B. Greenly,
S. N. Bland,
J. P. Chittenden,
E. S. Lavine,
W. M. Potter,
D. Robinson,
T. W. O. Varnish,
E. Wong,
D. A. Hammer,
B. R. Kusse,
J. D. Hare
Abstract:
We characterize the plasma flows generated during the ablation stage of an over-massed exploding planar wire array, fielded on the COBRA pulsed-power facility (1 MA peak current, 250 ns rise time). The planar wire array is designed to provide a driving magnetic field (80-100 T) and current per wire distribution (about 60 kA), similar to that in a 10 MA cylindrical exploding wire array fielded on the Z machine. Over-massing the arrays enables continuous plasma ablation over the duration of the experiment. The requirement to over-mass on the Z machine necessitates wires with diameters of 75-100 $μ$m, which are thicker than wires usually fielded on wire array experiments. To test ablation with thicker wires, we perform a parametric study by varying the initial wire diameter between 33-100 $μ$m. The largest wire diameter (100 $μ$m) array exhibits early closure of the AK gap, while the gap remains open during the duration of the experiment for wire diameters between 33-75 $μ$m. Laser plasma interferometry and time-gated XUV imaging are used to probe the plasma flows ablating from the wires. The plasma flows from the wires converge to generate a pinch, which appears as a fast-moving ($V \approx 100$ km s$^{-1}$) column of increased plasma density ($\bar{n}_e \approx 2 \times 10^{18}$ cm$^{-3}$) and strong XUV emission. Finally, we compare the results with three-dimensional resistive-magnetohydrodynamic (MHD) simulations performed using the code GORGON, the results of which reproduce the dynamics of the experiment reasonably well.
Submitted 12 July, 2023; v1 submitted 5 June, 2023;
originally announced June 2023.
-
TopEx: Topic-based Explanations for Model Comparison
Authors:
Shreya Havaldar,
Adam Stein,
Eric Wong,
Lyle Ungar
Abstract:
Meaningfully comparing language models is challenging with current explanation methods. Current explanations are overwhelming for humans due to large vocabularies or incomparable across models. We present TopEx, an explanation method that enables a level playing field for comparing language models via model-agnostic topics. We demonstrate how TopEx can identify similarities and differences between DistilRoBERTa and GPT-2 on a variety of NLP tasks.
Submitted 1 June, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Rectifying Group Irregularities in Explanations for Distribution Shift
Authors:
Adam Stein,
Yinjun Wu,
Eric Wong,
Mayur Naik
Abstract:
It is well-known that real-world changes constituting distribution shift adversely affect model performance. How to characterize those changes in an interpretable manner is poorly understood. Existing techniques to address this problem take the form of shift explanations that elucidate how to map samples from the original distribution toward the shifted one by reducing the disparity between these two distributions. However, these methods can introduce group irregularities, leading to explanations that are less feasible and robust. To address these issues, we propose Group-aware Shift Explanations (GSE), a method that produces interpretable explanations by leveraging worst-group optimization to rectify group irregularities. We demonstrate how GSE not only maintains group structures, such as demographic and hierarchical subpopulations, but also enhances feasibility and robustness in the resulting explanations in a wide range of tabular, language, and image settings.
Submitted 25 May, 2023;
originally announced May 2023.
-
Cooperation and Cognition in Social Networks
Authors:
Edoardo Gallo,
Joseph Lee,
Yohanes Eko Riyanto,
Erwin Wong
Abstract:
Social networks can sustain cooperation by amplifying the consequences of a single defection through a cascade of relationship losses. Building on Jackson et al. (2012), we introduce a novel robustness notion to characterize low cognitive complexity (LCC) networks - a subset of equilibrium networks that imposes a minimal cognitive burden to calculate and comprehend the consequences of defection. We test our theory in a laboratory experiment and find that cooperation is higher in equilibrium than in non-equilibrium networks. Within equilibrium networks, LCC networks exhibit higher levels of cooperation than non-LCC networks. Learning is essential for the emergence of equilibrium play.
Submitted 2 May, 2023;
originally announced May 2023.
-
Rigorous Analytic Combinatorics in Several Variables in SageMath
Authors:
Benjamin Hackl,
Andrew Luo,
Stephen Melczer,
Jesse Selover,
Elaine Wong
Abstract:
We introduce the new sage_acsv package for the SageMath computer algebra system, allowing users to rigorously compute asymptotics for a large variety of multivariate sequences with rational generating functions. Using Sage's support for exact computations over the algebraic number field, this package provides the first rigorous implementation of algorithms from the theory of analytic combinatorics in several variables.
Submitted 31 August, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Do Machine Learning Models Learn Statistical Rules Inferred from Data?
Authors:
Aaditya Naik,
Yinjun Wu,
Mayur Naik,
Eric Wong
Abstract:
Machine learning models can make critical errors that are easily hidden within vast amounts of data. Such errors often run counter to rules based on human intuition. However, rules based on human knowledge are challenging to scale or to even formalize. We thereby seek to infer statistical rules from the data and quantify the extent to which a model has learned them. We propose a framework SQRL that integrates logic-based methods with statistical inference to derive these rules from a model's training data without supervision. We further show how to adapt models at test time to reduce rule violations and produce more coherent predictions. SQRL generates up to 300K rules over datasets from vision, tabular, and language settings. We uncover up to 158K violations of those rules by state-of-the-art models for classification, object detection, and data imputation. Test-time adaptation reduces these violations by up to 68.7% with relative performance improvement up to 32%. SQRL is available at https://github.com/DebugML/sqrl.
Submitted 6 June, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
In-context Example Selection with Influences
Authors:
Tai Nguyen,
Eric Wong
Abstract:
In-context learning (ICL) is a powerful paradigm that has emerged from large language models (LLMs). Despite its promise, ICL performance is known to be highly sensitive to input examples. In this work, we use $\textit{in-context influences}$ to analyze few-shot ICL performance directly from the in-context examples. Our proposed influence-based example selection method can identify both positive and negative examples, outperforming several baselines when evaluated on 9 SuperGLUE tasks. Our analysis uncovers up to a $16.3\%$ performance gap between using the most negative in-context examples and using the most positive ones. In a case study, we apply our influence-based framework to quantify the phenomenon of recency bias in example ordering for few-shot ICL.
Submitted 5 June, 2023; v1 submitted 21 February, 2023;
originally announced February 2023.
-
Black Box Adversarial Prompting for Foundation Models
Authors:
Natalie Maus,
Patrick Chao,
Eric Wong,
Jacob Gardner
Abstract:
Prompting interfaces allow users to quickly adjust the output of generative models in both vision and language. However, small changes and design choices in the prompt can lead to significant differences in the output. In this work, we develop a black-box framework for generating adversarial prompts for unstructured image and text generation. These prompts, which can be standalone or prepended to benign prompts, induce specific behaviors into the generative process, such as generating images of a particular object or generating high perplexity text.
Submitted 29 May, 2023; v1 submitted 8 February, 2023;
originally announced February 2023.
-
A Unified Approach to Unimodality of Gaussian Polynomials
Authors:
Christoph Koutschan,
Ali K. Uncu,
Elaine Wong
Abstract:
In 2013, Pak and Panova proved the strict unimodality property of $q$-binomial coefficients $\binom{\ell+m}{m}_q$ (as polynomials in $q$) based on the combinatorics of Young tableaux and the semigroup property of Kronecker coefficients. They showed it to be true for all $\ell,m\geq 8$ and a few other cases. We propose a different approach to this problem based on computer algebra, where we establish a closed form for the coefficients of these polynomials and then use cylindrical algebraic decomposition to identify exactly the range of coefficients where strict unimodality holds. This strategy allows us to tackle generalizations of the problem, e.g., to show unimodality with larger gaps or unimodality of related sequences. In particular, we present proofs of two additional cases of a conjecture by Stanley and Zanello.
Submitted 31 August, 2023; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Improving Models for Student Retention and Graduation using Markov Chains
Authors:
Mason N Tedeschi,
Tiana M Hose,
Emily K Mehlman,
Scott Franklin,
Tony E Wong
Abstract:
Graduation rates are a key measure of the long-term efficacy of academic interventions. However, challenges to using traditional estimates of graduation rates for underrepresented students include inherently small sample sizes and high data requirements. Here, we show that a Markov model increases confidence and reduces biases in estimated graduation rates for underrepresented minority and first-generation students. We use a Learning Assistant program to demonstrate the Markov model's strength for assessing program efficacy. We find that Learning Assistants in gateway science courses are associated with a 9% increase in the six-year graduation rate. These gains are larger for underrepresented minority (21%) and first-generation students (18%). Our results indicate that Learning Assistants can improve overall graduation rates and address inequalities in graduation rates for underrepresented students.
Submitted 1 February, 2023;
originally announced February 2023.
-
Faithful Chain-of-Thought Reasoning
Authors:
Qing Lyu,
Shreya Havaldar,
Adam Stein,
Li Zhang,
Delip Rao,
Eric Wong,
Marianna Apidianaki,
Chris Callison-Burch
Abstract:
While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (a.k.a. faithfulness). We propose Faithful CoT, a reasoning framework involving two stages: Translation (Natural Language query $\rightarrow$ symbolic reasoning chain) and Problem Solving (reasoning chain $\rightarrow$ answer), using an LM and a deterministic solver respectively. This guarantees that the reasoning chain provides a faithful explanation of the final answer. Aside from interpretability, Faithful CoT also improves empirical performance: it outperforms standard CoT on 9 of 10 benchmarks from 4 diverse domains, with a relative accuracy gain of 6.3% on Math Word Problems (MWP), 3.4% on Planning, 5.5% on Multi-hop Question Answering (QA), and 21.4% on Relational Inference. Furthermore, with GPT-4 and Codex, it sets the new state-of-the-art few-shot performance on 7 datasets (with 95.0+ accuracy on 6 of them), showing a strong synergy between faithfulness and accuracy.
Submitted 20 September, 2023; v1 submitted 30 January, 2023;
originally announced January 2023.
-
WS$_2$ Band Gap Renormalization Induced by Tomonaga Luttinger Liquid Formation in Mirror Twin Boundaries
Authors:
Antonio Rossi,
John C. Thomas,
Johannes T. Küchle,
Elyse Barré,
Zhuohang Yu,
Da Zhou,
Shalini Kumari,
Hsin-Zon Tsai,
Ed Wong,
Chris Jozwiak,
Aaron Bostwick,
Joshua A. Robinson,
Mauricio Terrones,
Archana Raja,
Adam Schwartzberg,
D. Frank Ogletree,
Jeffrey B. Neaton,
Michael F. Crommie,
Francesco Allegretti,
Willi Auwärter,
Eli Rotenberg,
Alexander Weber-Bargioni
Abstract:
Tomonaga-Luttinger liquid (TLL) behavior in one-dimensional systems has been predicted and shown to occur at semiconductor-to-metal transitions within two-dimensional materials. Reports of mirror twin boundaries (MTBs) hosting a Fermi liquid or a TLL have suggested a dependence on the underlying substrate; however, unveiling the physical details of electronic contributions from the substrate requires cross-correlative investigation. Here, we study TLL formation in MTBs within defectively engineered WS$_2$ atop graphene, where band structure and the atomic environment are visualized with nano angle-resolved photoelectron spectroscopy, scanning tunneling microscopy and scanning tunneling spectroscopy, and non-contact atomic force microscopy. Correlations between the local density of states and electronic band dispersion elucidated the electron transfer from graphene into a TLL hosted by MTB defects. We find that MTB defects can be substantially charged at a local level, which drives a band gap shift by $\sim$0.5 eV.
Submitted 18 January, 2023; v1 submitted 6 January, 2023;
originally announced January 2023.
-
Sea Level and Socioeconomic Uncertainty Drives High-End Coastal Adaptation Costs
Authors:
Tony E. Wong,
Catherine Ledna,
Lisa Rennels,
Hannah Sheets,
Frank C. Errickson,
Delavane Diaz,
David Anthoff
Abstract:
Sea-level rise and associated flood hazards pose severe risks to the millions of people globally living in coastal zones. Models representing coastal adaptation and impacts are important tools to inform the design of strategies to manage these risks. Representing the often deep uncertainties influencing these risks poses nontrivial challenges. A common uncertainty characterization approach is to use a few benchmark cases to represent the range and relative probabilities of the set of possible outcomes. This has been done in coastal adaptation studies, for example, by using low, moderate, and high percentiles of an input of interest, like sea-level changes. A key consideration is how this simplified characterization of uncertainty influences the distributions of estimated coastal impacts. Here, we show that using only a few benchmark percentiles to represent uncertainty in future sea-level change can lead to overconfident projections and underestimate high-end risks as compared to using full ensembles for sea-level change and socioeconomic parametric uncertainties. When uncertainty in future sea level is characterized by low, moderate, and high percentiles of global mean sea-level rise, estimates of high-end (95th percentile) damages are underestimated by between 18% (SSP1-2.6) and 46% (SSP5-8.5). Additionally, using the 5th and 95th percentiles of sea-level scenarios underestimates the 5-95% width of the distribution of adaptation costs by a factor ranging from about two to four, depending on SSP-RCP pathway. The resulting underestimation of the uncertainty range in adaptation costs can bias adaptation and mitigation decision-making.
Submitted 29 November, 2022;
originally announced November 2022.
-
Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-channel Speech Enhancement
Authors:
Kuan-Lin Chen,
Daniel D. E. Wong,
Ke Tan,
Buye Xu,
Anurag Kumar,
Vamsi Krishna Ithapu
Abstract:
Most speech enhancement (SE) models learn a point estimate and do not make use of uncertainty estimation in the learning process. In this paper, we show that modeling heteroscedastic uncertainty by minimizing a multivariate Gaussian negative log-likelihood (NLL) improves SE performance at no extra cost. During training, our approach augments a model learning complex spectral mapping with a temporary submodel to predict the covariance of the enhancement error at each time-frequency bin. Due to unrestricted heteroscedastic uncertainty, the covariance introduces an undersampling effect, detrimental to SE performance. To mitigate undersampling, our approach inflates the uncertainty lower bound and weights each loss component with their uncertainty, effectively compensating severely undersampled components with more penalties. Our multivariate setting reveals common covariance assumptions such as scalar and diagonal matrices. By weakening these assumptions, we show that the NLL achieves superior performance compared to popular loss functions including the mean squared error (MSE), mean absolute error (MAE), and scale-invariant signal-to-distortion ratio (SI-SDR).
Submitted 8 March, 2023; v1 submitted 15 November, 2022;
originally announced November 2022.
-
Superconducting Niobium Tip Electron Beam Source
Authors:
Cameron W. Johnson,
Andreas K. Schmid,
Marian Mankos,
Robin Röpke,
Nicole Kerker,
Ing-Shouh Hwang,
Ed K. Wong,
D. Frank Ogletree,
Andrew M. Minor,
Alexander Stibor
Abstract:
Modern electron microscopy and spectroscopy is a key technology for studying the structure and composition of quantum and biological materials in fundamental and applied sciences. High-resolution spectroscopic techniques and aberration-corrected microscopes are often limited by the relatively large energy distribution of currently available beam sources. This can be improved by a monochromator, with the significant drawback of losing most of the beam current. Here, we study the field emission properties of a monocrystalline niobium tip electron field emitter at 5.2 K, well below the superconducting transition temperature. The emitter fabrication process can generate two tip configurations, with or without a nano-protrusion at the apex, strongly influencing the field-emission energy distribution. The geometry without the nano-protrusion has a high beam current, long-term stability, and an energy width of around 100 meV. The beam current can be increased by two orders of magnitude by xenon gas adsorption. We also studied the emitter performance up to 82 K and demonstrated the beam's energy width can be below 40 meV even at liquid nitrogen cooling temperatures when an apex nano-protrusion is present. Furthermore, the spatial and temporal electron-electron correlations of the field emission are studied at normal and superconducting temperatures and the influence of Nottingham heating is discussed. This new monochromatic source will allow unprecedented accuracy and resolution in electron microscopy, spectroscopy, and high-coherence quantum applications.
Submitted 14 November, 2022;
originally announced November 2022.
-
Computed Decision Weights and a New Learning Algorithm for Neural Classifiers
Authors:
Eugene Wong
Abstract:
In this paper we consider the possibility of computing rather than training the decision layer weights of a neural classifier. Such a possibility arises in two ways: from making an appropriate choice of loss function and from solving a problem of constrained optimization. The latter formulation leads to a promising new learning process for pre-decision weights with both simplicity and efficacy.
Submitted 17 September, 2022;
originally announced September 2022.
-
Web3 Challenges and Opportunities for the Market
Authors:
Dan Sheridan,
James Harris,
Frank Wear,
Jerry Cowell Jr,
Easton Wong,
Abbas Yazdinejad
Abstract:
The inability of a computer to think has been a limit on its usefulness and a point of reassurance for humanity since the first computers were created. The semantic web is the first step toward removing that barrier, enabling computers to operate based on conceptual understanding, and AI and ML are the second. Both semantic knowledge and the ability to learn are fundamental to Web3, as are blockchain, decentralization, transactional transparency, and ownership. Web3 is the next generational step in the information age, where the web evolves into a more digestible medium for users and machines to browse knowledge. The slow introduction of Web3 across the global software ecosystem will impact the people who enable the current iteration. This evolution of the internet space will expand the way knowledge is shared, consumed, and owned, which will lessen the requirement for a global standard and allow data to interact efficiently, no matter the construction of the knowledge. This paper examines: 1) the enablement of Web3 across the digital ecosystem; 2) what a Web3 developer will look like; and 3) how this change will reshape the market around software and knowledge in general.
Submitted 6 September, 2022;
originally announced September 2022.
-
A Reservoir Model of Explicit Human Intelligence
Authors:
Eric C. Wong
Abstract:
A fundamental feature of human intelligence is that we accumulate and transfer knowledge as a society and across generations. We describe here a network architecture for the human brain that may support this feature and suggest that two key innovations were the ability to consider an offline model of the world, and the use of language to record and communicate knowledge within this model. We propose that these two innovations, together with pre-existing mechanisms for associative learning, allowed us to develop a conceptually simple associative network that operates like a reservoir of attractors and can learn in a rapid, flexible, and robust manner. We hypothesize that explicit human intelligence is based primarily on this type of network, which works in conjunction with older and likely more complex deep networks that perform sensory, motor, and other implicit forms of processing.
Submitted 6 July, 2022;
originally announced July 2022.
-
A Data-Based Perspective on Transfer Learning
Authors:
Saachi Jain,
Hadi Salman,
Alaa Khaddaj,
Eric Wong,
Sung Min Park,
Aleksander Madry
Abstract:
It is commonly believed that, in transfer learning, including more pre-training data translates into better performance. However, recent evidence suggests that removing data from the source dataset can actually help too. In this work, we take a closer look at the role of the source dataset's composition in transfer learning and present a framework for probing its impact on downstream performance. Our framework gives rise to new capabilities such as pinpointing transfer learning brittleness as well as detecting pathologies such as data-leakage and the presence of misleading examples in the source dataset. In particular, we demonstrate that removing detrimental datapoints identified by our framework improves transfer learning performance from ImageNet on a variety of target tasks. Code is available at https://github.com/MadryLab/data-transfer
Submitted 12 July, 2022;
originally announced July 2022.
-
When does Bias Transfer in Transfer Learning?
Authors:
Hadi Salman,
Saachi Jain,
Andrew Ilyas,
Logan Engstrom,
Eric Wong,
Aleksander Madry
Abstract:
Using transfer learning to adapt a pre-trained "source model" to a downstream "target task" can dramatically increase performance with seemingly no downside. In this work, we demonstrate that there can exist a downside after all: bias transfer, or the tendency for biases of the source model to persist even after adapting the model to the target class. Through a combination of synthetic and natural experiments, we show that bias transfer both (a) arises in realistic settings (such as when pre-training on ImageNet or other standard datasets) and (b) can occur even when the target dataset is explicitly de-biased. As transfer-learned models are increasingly deployed in the real world, our work highlights the importance of understanding the limitations of pre-trained source models. Code is available at https://github.com/MadryLab/bias-transfer
Submitted 6 July, 2022;
originally announced July 2022.
-
Recent Developments in Security-Constrained AC Optimal Power Flow: Overview of Challenge 1 in the ARPA-E Grid Optimization Competition
Authors:
Ignacio Aravena,
Daniel K. Molzahn,
Shixuan Zhang,
Cosmin G. Petra,
Frank E. Curtis,
Shenyinying Tu,
Andreas Wächter,
Ermin Wei,
Elizabeth Wong,
Amin Gholami,
Kaizhao Sun,
Xu Andy Sun,
Stephen T. Elbert,
Jesse T. Holzer,
Arun Veeramany
Abstract:
The optimal power flow problem is central to many tasks in the design and operation of electric power grids. This problem seeks the minimum cost operating point for an electric power grid while satisfying both engineering requirements and physical laws describing how power flows through the electric network. By additionally considering the possibility of component failures and using an accurate AC power flow model of the electric network, the security-constrained AC optimal power flow (SC-AC-OPF) problem is of paramount practical relevance. To assess recent progress in solution algorithms for SC-AC-OPF problems and spur new innovations, the U.S. Department of Energy's Advanced Research Projects Agency--Energy (ARPA-E) organized Challenge 1 of the Grid Optimization (GO) competition. This paper describes the SC-AC-OPF problem formulation used in the competition, overviews historical developments and the state of the art in SC-AC-OPF algorithms, discusses the competition, and summarizes the algorithms used by the top three teams in Challenge 1 of the GO Competition (Teams gollnlp, GO-SNIP, and GMI-GO).
Submitted 15 June, 2022;
originally announced June 2022.
-
Convexity, Squeezing, and the Elekes-Szabó Theorem
Authors:
Oliver Roche-Newton,
Elaine Wong
Abstract:
This paper explores the relationship between convexity and sum sets. In particular, we show that elementary number theoretical methods, principally the application of a squeezing principle, can be augmented with the Elekes-Szabó Theorem in order to give new information. Namely, if we let $A \subset \mathbb R$, we prove that there exist $a,a' \in A$ such that \[\left | \frac{(aA+1)^{(2)}(a'A+1)^{(2)}}{(aA+1)^{(2)}(a'A+1)} \right | \gtrsim |A|^{31/12}.\] We are also able to prove that \[ \max \{|A+A-A|, |A^2+A^2-A^2|, |A^3 + A^3 - A^3|\} \gtrsim |A|^{19/12}.\] Both of these bounds are improvements on recent results and take advantage of computer algebra to tackle some of the computations.
Submitted 14 January, 2024; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Near-monochromatic tuneable cryogenic niobium electron field emitter
Authors:
Cameron W. Johnson,
Andreas K. Schmid,
Marian Mankos,
Robin Röpke,
Nicole Kerker,
Ed K. Wong,
D. Frank Ogletree,
Andrew M. Minor,
Alexander Stibor
Abstract:
Creating, manipulating, and detecting coherent electrons is at the heart of future quantum microscopy and spectroscopy technologies. Leveraging and specifically altering the quantum features of an electron beam source at low temperatures can enhance its emission properties. Here, we describe electron field emission from a monocrystalline, superconducting niobium nanotip at a temperature of 5.9 K. The emitted electron energy spectrum reveals an ultra-narrow distribution down to 16 meV due to tunable resonant tunneling field emission via localized band states at a nano-protrusion's apex and a cut-off at the sharp low-temperature Fermi-edge. This is an order of magnitude lower than for conventional field emission electron sources. The self-focusing geometry of the tip leads to emission in an angle of 3.7 deg, a reduced brightness of $3.8 \times 10^{8}$ A/(m$^2$ sr V), and a stability of hours at 4.1 nA beam current and 69 meV energy width. This source will decrease the impact of lens aberration and enable new modes in low-energy electron microscopy, electron energy loss spectroscopy, and high-resolution vibrational spectroscopy.
Submitted 6 October, 2022; v1 submitted 11 May, 2022;
originally announced May 2022.