Search | arXiv e-print repository

Out-Of-Context Prompting Boosts Fairness and Robustness in Large Language Model Predictions

Authors: Leonardo Cotta, Chris J. Maddison

Abstract: Frontier Large Language Models (LLMs) are increasingly being deployed for high-stakes decision-making. On the other hand, these models are still consistently making predictions that contradict users' or society's expectations, e.g., hallucinating, or discriminating. Thus, it is important that we develop test-time strategies to improve their trustworthiness. Inspired by prior work, we leverage causality as a tool to formally encode two aspects of trustworthiness in LLMs: fairness and robustness. Under this perspective, existing test-time solutions explicitly instructing the model to be fair or robust implicitly depend on the LLM's causal reasoning capabilities. In this work, we explore the opposite approach. Instead of explicitly asking the LLM for trustworthiness, we design prompts to encode the underlying causal inference algorithm that will, by construction, result in more trustworthy predictions. Concretely, we propose out-of-context prompting as a test-time solution to encourage fairness and robustness in LLMs. Out-of-context prompting leverages the user's prior knowledge of the task's causal model to apply (random) counterfactual transformations and improve the model's trustworthiness. Empirically, we show that out-of-context prompting consistently improves the fairness and robustness of frontier LLMs across five different benchmark datasets without requiring additional data, finetuning or pre-training.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2308.04412

Probabilistic Invariant Learning with Randomized Linear Classifiers

Authors: Leonardo Cotta, Gal Yehuda, Assaf Schuster, Chris J. Maddison

Abstract: Designing models that are both expressive and preserve known invariances of tasks is an increasingly hard problem. Existing solutions trade off invariance for computational or memory resources. In this work, we show how to leverage randomness and design models that are both expressive and invariant but use fewer resources. Inspired by randomized algorithms, our key insight is that accepting probabilistic notions of universal approximation and invariance can reduce our resource requirements. More specifically, we propose a class of binary classification models called Randomized Linear Classifiers (RLCs). We give parameter and sample size conditions in which RLCs can, with high probability, approximate any (smooth) function while preserving invariance to compact group transformations. Leveraging this result, we design three RLCs that are provably probabilistic invariant for classification tasks over sets, graphs, and spherical data. We show how these models can achieve probabilistic invariance and universality using fewer resources than (deterministic) neural networks and their invariant counterparts. Finally, we empirically demonstrate the benefits of this new class of models on invariant tasks where deterministic invariant neural networks are known to struggle.

Submitted 27 September, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

arXiv:2307.09428

Control of Small Spacecraft by Optimal Output Regulation: A Reinforcement Learning Approach

Authors: Joao Leonardo Silva Cotta, Omar Qasem, Paula do Vale Pereira, Hector Gutierrez

Abstract: The growing number of noncooperative flying objects has prompted interest in sample-return and space debris removal missions. Current solutions are both costly and largely dependent on specific object identification and capture methods. In this paper, a low-cost modular approach for control of a swarm flight of small satellites in rendezvous and capture missions is proposed by solving the optimal output regulation problem. By integrating the theories of tracking control, adaptive optimal control, and output regulation, the optimal control policy is designed as a feedback-feedforward controller to guarantee the asymptotic tracking of a class of reference input generated by the leader. The estimated state vector of the space object of interest and communication within satellites are assumed to be available. The controller rejects the nonvanishing disturbances injected into the follower satellite while maintaining the closed-loop stability of the overall leader-follower system. The simulation results under the Basilisk-ROS2 framework environment for high-fidelity space applications with accurate spacecraft dynamics are compared with those from a classical linear quadratic regulator controller, and the results reveal the efficiency and practicality of the proposed method.

Submitted 18 July, 2023; originally announced July 2023.

Comments: Accepted for presentation at the 37th Annual/USU Conference on Small Satellites. arXiv admin note: substantial text overlap with arXiv:2301.12489

arXiv:2302.01198

Causal Lifting and Link Prediction

Authors: Leonardo Cotta, Beatrice Bevilacqua, Nesreen Ahmed, Bruno Ribeiro

Abstract: Existing causal models for link prediction assume an underlying set of inherent node factors -- an innate characteristic defined at the node's birth -- that governs the causal evolution of links in the graph. In some causal tasks, however, link formation is path-dependent: The outcome of link interventions depends on existing links. Unfortunately, these existing causal methods are not designed for path-dependent link formation, as the cascading functional dependencies between links (arising from path dependence) are either unidentifiable or require an impractical number of control variables. To overcome this, we develop the first causal model capable of dealing with path dependencies in link prediction. In this work we introduce the concept of causal lifting, an invariance in causal models of independent interest that, on graphs, allows the identification of causal link prediction queries using limited interventional data. Further, we show how structural pairwise embeddings exhibit lower bias and correctly represent the task's causal structure, as opposed to existing node embeddings, e.g., graph neural network node embeddings and matrix factorization. Finally, we validate our theoretical findings on three scenarios for causal link prediction tasks: knowledge base completion, covariance matrix estimation and consumer-product recommendations.

Submitted 27 July, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2110.00577

Reconstruction for Powerful Graph Representations

Authors: Leonardo Cotta, Christopher Morris, Bruno Ribeiro

Abstract: Graph neural networks (GNNs) have limited expressive power, failing to represent many graph classes correctly. While more expressive graph representation learning (GRL) alternatives can distinguish some of these classes, they are significantly harder to implement, may not scale well, and have not been shown to outperform well-tuned GNNs in real-world tasks. Thus, devising simple, scalable, and expressive GRL architectures that also achieve real-world improvements remains an open challenge. In this work, we show the extent to which graph reconstruction -- reconstructing a graph from its subgraphs -- can mitigate the theoretical and practical problems currently faced by GRL architectures. First, we leverage graph reconstruction to build two new classes of expressive graph representations. Second, we show how graph reconstruction boosts the expressive power of any GNN architecture while being a (provably) powerful inductive bias for invariances to vertex removals. Empirically, we show how reconstruction can boost a GNN's expressive power -- while maintaining its invariance to permutations of the vertices -- by solving seven graph property tasks not solvable by the original GNN. Further, we demonstrate how it boosts state-of-the-art GNNs' performance across nine real-world benchmark datasets.

Submitted 6 December, 2021; v1 submitted 1 October, 2021; originally announced October 2021.

Comments: Accepted to NeurIPS 2021

arXiv:2010.04259

Unsupervised Joint $k$-node Graph Representations with Compositional Energy-Based Models

Authors: Leonardo Cotta, Carlos H. C. Teixeira, Ananthram Swami, Bruno Ribeiro

Abstract: Existing Graph Neural Network (GNN) methods that learn inductive unsupervised graph representations focus on learning node and edge representations by predicting observed edges in the graph. Although such approaches have shown advances in downstream node classification tasks, they are ineffective in jointly representing larger $k$-node sets, $k{>}2$. We propose MHM-GNN, an inductive unsupervised graph representation approach that combines joint $k$-node representations with energy-based models (hypergraph Markov networks) and GNNs. To address the intractability of the loss that arises from this combination, we endow our optimization with a loss upper bound using a finite-sample unbiased Markov Chain Monte Carlo estimator. Our experiments show that MHM-GNN produces better unsupervised representations than existing approaches from the literature.

Submitted 8 October, 2020; originally announced October 2020.

Comments: Accepted at NeurIPS 2020

arXiv:1809.05241

Graph Pattern Mining and Learning through User-defined Relations (Extended Version)

Authors: Carlos H. C. Teixeira, Leonardo Cotta, Bruno Ribeiro, Wagner Meira Jr

Abstract: In this work we propose R-GPM, a parallel computing framework for graph pattern mining (GPM) through a user-defined subgraph relation. More specifically, we enable the computation of statistics of patterns through their subgraph classes, generalizing traditional GPM methods. R-GPM provides efficient estimators for these statistics by employing an MCMC sampling algorithm combined with several optimizations. We provide both theoretical guarantees and empirical evaluations of our estimators in application scenarios such as stochastic optimization of deep high-order graph neural network models and pattern (motif) counting. We also propose and evaluate optimizations that improve our estimators' accuracy while reducing their computational costs by up to three orders of magnitude. Finally, we show that R-GPM is scalable, providing near-linear speedups on 44 cores in all of our tests.

Submitted 10 October, 2020; v1 submitted 13 September, 2018; originally announced September 2018.

Comments: Extended version of the paper published at ICDM 2018

Showing 1–7 of 7 results for author: Cotta, L