-
Out-Of-Context Prompting Boosts Fairness and Robustness in Large Language Model Predictions
Authors:
Leonardo Cotta,
Chris J. Maddison
Abstract:
Frontier Large Language Models (LLMs) are increasingly being deployed for high-stakes decision-making. On the other hand, these models are still consistently making predictions that contradict users' or society's expectations, e.g., by hallucinating or discriminating. Thus, it is important that we develop test-time strategies to improve their trustworthiness. Inspired by prior work, we leverage causality as a tool to formally encode two aspects of trustworthiness in LLMs: fairness and robustness. Under this perspective, existing test-time solutions explicitly instructing the model to be fair or robust implicitly depend on the LLM's causal reasoning capabilities. In this work, we explore the opposite approach. Instead of explicitly asking the LLM for trustworthiness, we design prompts to encode the underlying causal inference algorithm that will, by construction, result in more trustworthy predictions. Concretely, we propose out-of-context prompting as a test-time solution to encourage fairness and robustness in LLMs. Out-of-context prompting leverages the user's prior knowledge of the task's causal model to apply (random) counterfactual transformations and improve the model's trustworthiness. Empirically, we show that out-of-context prompting consistently improves the fairness and robustness of frontier LLMs across five different benchmark datasets without requiring additional data, fine-tuning, or pre-training.
Submitted 11 June, 2024;
originally announced June 2024.
-
Probabilistic Invariant Learning with Randomized Linear Classifiers
Authors:
Leonardo Cotta,
Gal Yehuda,
Assaf Schuster,
Chris J. Maddison
Abstract:
Designing models that are both expressive and preserve known invariances of tasks is an increasingly hard problem. Existing solutions trade off invariance for computational or memory resources. In this work, we show how to leverage randomness and design models that are both expressive and invariant but use fewer resources. Inspired by randomized algorithms, our key insight is that accepting probabilistic notions of universal approximation and invariance can reduce our resource requirements. More specifically, we propose a class of binary classification models called Randomized Linear Classifiers (RLCs). We give parameter and sample size conditions in which RLCs can, with high probability, approximate any (smooth) function while preserving invariance to compact group transformations. Leveraging this result, we design three RLCs that are provably probabilistically invariant for classification tasks over sets, graphs, and spherical data. We show how these models can achieve probabilistic invariance and universality using fewer resources than (deterministic) neural networks and their invariant counterparts. Finally, we empirically demonstrate the benefits of this new class of models on invariant tasks where deterministic invariant neural networks are known to struggle.
Submitted 27 September, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
Control of Small Spacecraft by Optimal Output Regulation: A Reinforcement Learning Approach
Authors:
Joao Leonardo Silva Cotta,
Omar Qasem,
Paula do Vale Pereira,
Hector Gutierrez
Abstract:
The growing number of noncooperative flying objects has prompted interest in sample-return and space debris removal missions. Current solutions are both costly and largely dependent on specific object identification and capture methods. In this paper, a low-cost modular approach for control of a swarm flight of small satellites in rendezvous and capture missions is proposed by solving the optimal output regulation problem. By integrating the theories of tracking control, adaptive optimal control, and output regulation, the optimal control policy is designed as a feedback-feedforward controller to guarantee the asymptotic tracking of a class of reference inputs generated by the leader. The estimated state vector of the space object of interest and communication within satellites are assumed to be available. The controller rejects the nonvanishing disturbances injected into the follower satellite while maintaining the closed-loop stability of the overall leader-follower system. The simulation results under the Basilisk-ROS2 framework environment for high-fidelity space applications with accurate spacecraft dynamics are compared with those from a classical linear quadratic regulator controller, and the results reveal the efficiency and practicality of the proposed method.
Submitted 18 July, 2023;
originally announced July 2023.
-
Causal Lifting and Link Prediction
Authors:
Leonardo Cotta,
Beatrice Bevilacqua,
Nesreen Ahmed,
Bruno Ribeiro
Abstract:
Existing causal models for link prediction assume an underlying set of inherent node factors -- an innate characteristic defined at the node's birth -- that governs the causal evolution of links in the graph. In some causal tasks, however, link formation is path-dependent: The outcome of link interventions depends on existing links. Unfortunately, these existing causal methods are not designed for path-dependent link formation, as the cascading functional dependencies between links (arising from path dependence) are either unidentifiable or require an impractical number of control variables. To overcome this, we develop the first causal model capable of dealing with path dependencies in link prediction. In this work we introduce the concept of causal lifting, an invariance in causal models of independent interest that, on graphs, allows the identification of causal link prediction queries using limited interventional data. Further, we show how structural pairwise embeddings exhibit lower bias and correctly represent the task's causal structure, as opposed to existing node embeddings, e.g., graph neural network node embeddings and matrix factorization. Finally, we validate our theoretical findings on three scenarios for causal link prediction tasks: knowledge base completion, covariance matrix estimation and consumer-product recommendations.
Submitted 27 July, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Reconstruction for Powerful Graph Representations
Authors:
Leonardo Cotta,
Christopher Morris,
Bruno Ribeiro
Abstract:
Graph neural networks (GNNs) have limited expressive power, failing to represent many graph classes correctly. While more expressive graph representation learning (GRL) alternatives can distinguish some of these classes, they are significantly harder to implement, may not scale well, and have not been shown to outperform well-tuned GNNs in real-world tasks. Thus, devising simple, scalable, and expressive GRL architectures that also achieve real-world improvements remains an open challenge. In this work, we show the extent to which graph reconstruction -- reconstructing a graph from its subgraphs -- can mitigate the theoretical and practical problems currently faced by GRL architectures. First, we leverage graph reconstruction to build two new classes of expressive graph representations. Second, we show how graph reconstruction boosts the expressive power of any GNN architecture while being a (provably) powerful inductive bias for invariances to vertex removals. Empirically, we show how reconstruction can boost a GNN's expressive power -- while maintaining its invariance to permutations of the vertices -- by solving seven graph property tasks not solvable by the original GNN. Further, we demonstrate how it boosts state-of-the-art GNNs' performance across nine real-world benchmark datasets.
Submitted 6 December, 2021; v1 submitted 1 October, 2021;
originally announced October 2021.
-
Unsupervised Joint $k$-node Graph Representations with Compositional Energy-Based Models
Authors:
Leonardo Cotta,
Carlos H. C. Teixeira,
Ananthram Swami,
Bruno Ribeiro
Abstract:
Existing Graph Neural Network (GNN) methods that learn inductive unsupervised graph representations focus on learning node and edge representations by predicting observed edges in the graph. Although such approaches have shown advances in downstream node classification tasks, they are ineffective in jointly representing larger $k$-node sets, $k{>}2$. We propose MHM-GNN, an inductive unsupervised graph representation approach that combines joint $k$-node representations with energy-based models (hypergraph Markov networks) and GNNs. To address the intractability of the loss that arises from this combination, we endow our optimization with a loss upper bound using a finite-sample unbiased Markov Chain Monte Carlo estimator. Our experiments show that MHM-GNN produces better unsupervised representations than existing approaches from the literature.
Submitted 8 October, 2020;
originally announced October 2020.
-
Graph Pattern Mining and Learning through User-defined Relations (Extended Version)
Authors:
Carlos H. C. Teixeira,
Leonardo Cotta,
Bruno Ribeiro,
Wagner Meira Jr
Abstract:
In this work we propose R-GPM, a parallel computing framework for graph pattern mining (GPM) through a user-defined subgraph relation. More specifically, we enable the computation of statistics of patterns through their subgraph classes, generalizing traditional GPM methods. R-GPM provides efficient estimators for these statistics by employing an MCMC sampling algorithm combined with several optimizations. We provide both theoretical guarantees and empirical evaluations of our estimators in application scenarios such as stochastic optimization of deep high-order graph neural network models and pattern (motif) counting. We also propose and evaluate optimizations that improve our estimators' accuracy while reducing their computational costs by up to three orders of magnitude. Finally, we show that R-GPM is scalable, providing near-linear speedups on 44 cores in all of our tests.
Submitted 10 October, 2020; v1 submitted 13 September, 2018;
originally announced September 2018.