REvolve: Reward Evolution with Large Language Models for Autonomous Driving

Authors: Rishi Hazra, Alkis Sygkounas, Andreas Persson, Amy Loutfi, Pedro Zuidberg Dos Martires

Abstract: Designing effective reward functions is crucial to training reinforcement learning (RL) algorithms. However, this design is non-trivial, even for domain experts, due to the subjective nature of certain tasks that are hard to quantify explicitly. In recent works, large language models (LLMs) have been used for reward generation from natural language task descriptions, leveraging their extensive instruction tuning and commonsense understanding of human behavior. In this work, we hypothesize that LLMs, guided by human feedback, can be used to formulate human-aligned reward functions. Specifically, we study this in the challenging setting of autonomous driving (AD), wherein notions of "good" driving are tacit and hard to quantify. To this end, we introduce REvolve, an evolutionary framework that uses LLMs for reward design in AD. REvolve creates and refines reward functions by utilizing human feedback to guide the evolution process, effectively translating implicit human knowledge into explicit reward functions for training (deep) RL agents. We demonstrate that agents trained on REvolve-designed rewards align closely with human driving standards, thereby outperforming other state-of-the-art baselines.

Submitted 3 June, 2024; originally announced June 2024.

Comments: project page: https://rishihazra.github.io/REvolve. arXiv admin note: text overlap with arXiv:2310.12931 by other authors

arXiv:2403.06235 [pdf, ps, other]

Probabilistic Neural Circuits

Authors: Pedro Zuidberg Dos Martires

Abstract: Probabilistic circuits (PCs) have gained prominence in recent years as a versatile framework for discussing probabilistic models that support tractable queries and are yet expressive enough to model complex probability distributions. Nevertheless, tractability comes at a cost: PCs are less expressive than neural networks. In this paper we introduce probabilistic neural circuits (PNCs), which strike a balance between PCs and neural nets in terms of tractability and expressive power. Theoretically, we show that PNCs can be interpreted as deep mixtures of Bayesian networks. Experimentally, we demonstrate that PNCs constitute powerful function approximators.

Submitted 10 March, 2024; originally announced March 2024.

Comments: Proceedings of the AAAI Conference on Artificial Intelligence

arXiv:2402.13782 [pdf, other]

doi 10.1016/j.ijar.2024.109130

Semirings for Probabilistic and Neuro-Symbolic Logic Programming

Authors: Vincent Derkinderen, Robin Manhaeve, Pedro Zuidberg Dos Martires, Luc De Raedt

Abstract: The field of probabilistic logic programming (PLP) focuses on integrating probabilistic models into programming languages based on logic. Over the past 30 years, numerous languages and frameworks have been developed for modeling, inference and learning in probabilistic logic programs. While originally PLP focused on discrete probability, more recent approaches have incorporated continuous distributions as well as neural networks, effectively yielding neural-symbolic methods. We provide a unified algebraic perspective on PLP, showing that many if not most of the extensions of PLP can be cast within a common algebraic logic programming framework, in which facts are labeled with elements of a semiring and disjunction and conjunction are replaced by addition and multiplication. This does not only hold for the PLP variations itself but also for the underlying execution mechanism that is based on (algebraic) model counting.

Submitted 21 February, 2024; originally announced February 2024.

Journal ref: International Journal of Approximate Reasoning (2024): 109130
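
Illustration (not from the paper): the abstract's idea of labeling facts with semiring elements and replacing disjunction and conjunction by addition and multiplication can be sketched with a brute-force algebraic model count. Everything below is an assumption made for illustration: the `Semiring` class, the `algebraic_model_count` function, and the two-fact alarm example are hypothetical names, and enumerating all worlds stands in for the knowledge-compilation machinery real systems use.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring given by its neutral elements and two operations."""
    zero: Any
    one: Any
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


def algebraic_model_count(variables, formula, label, semiring):
    """Sum (semiring plus) over all models of `formula` of the
    product (semiring times) of the labels of the literals in that model.

    Brute-force enumeration: exponential in len(variables), for illustration only.
    """
    total = semiring.zero
    for values in product([True, False], repeat=len(variables)):
        world = dict(zip(variables, values))
        if not formula(world):
            continue  # not a model, contributes nothing
        weight = semiring.one
        for name in variables:
            weight = semiring.times(weight, label(name, world[name]))
        total = semiring.plus(total, weight)
    return total


# Three semirings over the same formula (hypothetical example).
PROBABILITY = Semiring(0.0, 1.0, lambda a, b: a + b, lambda a, b: a * b)
COUNTING = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
MOST_PROBABLE = Semiring(0.0, 1.0, max, lambda a, b: a * b)

VARIABLES = ["burglary", "earthquake"]
FACT_PROBABILITY = {"burglary": 0.1, "earthquake": 0.2}


def alarm(world):
    """The query: the alarm rings if there is a burglary or an earthquake."""
    return world["burglary"] or world["earthquake"]


def probability_label(name, value):
    """Label a positive literal with p and a negative literal with 1 - p."""
    p = FACT_PROBABILITY[name]
    return p if value else 1.0 - p


def unit_label(name, value):
    """Label every literal with 1, so the count is the number of models."""
    return 1


p_alarm = algebraic_model_count(VARIABLES, alarm, probability_label, PROBABILITY)
n_models = algebraic_model_count(VARIABLES, alarm, unit_label, COUNTING)
best_world_weight = algebraic_model_count(VARIABLES, alarm, probability_label, MOST_PROBABLE)
```

Swapping the semiring changes the task while the counting routine stays the same: the probability semiring yields the query probability, the counting semiring the number of models, and the max-times semiring the weight of the most probable model.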

arXiv:2311.12569 [pdf, other]

Differentiable Sampling of Categorical Distributions Using the CatLog-Derivative Trick

Authors: Lennert De Smet, Emanuele Sansone, Pedro Zuidberg Dos Martires

Abstract: Categorical random variables can faithfully represent the discrete and uncertain aspects of data as part of a discrete latent variable model. Learning in such models necessitates taking gradients with respect to the parameters of the categorical probability distributions, which is often intractable due to their combinatorial nature. A popular technique to estimate these otherwise intractable gradients is the Log-Derivative trick. This trick forms the basis of the well-known REINFORCE gradient estimator and its many extensions. While the Log-Derivative trick allows us to differentiate through samples drawn from categorical distributions, it does not take into account the discrete nature of the distribution itself. Our first contribution addresses this shortcoming by introducing the CatLog-Derivative trick - a variation of the Log-Derivative trick tailored towards categorical distributions. Secondly, we use the CatLog-Derivative trick to introduce IndeCateR, a novel and unbiased gradient estimator for the important case of products of independent categorical distributions with provably lower variance than REINFORCE. Thirdly, we empirically show that IndeCateR can be efficiently implemented and that its gradient estimates have significantly lower bias and variance for the same number of samples compared to the state of the art.

Submitted 21 November, 2023; originally announced November 2023.

MSC Class: 68T05 ACM Class: G.3; G.4; I.2.6
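
Illustration (not from the paper): the sketch below shows the plain Log-Derivative trick (the REINFORCE estimator) that the abstract takes as its starting point, for a single categorical variable parameterized by softmax logits. It is not the CatLog-Derivative trick or IndeCateR. The function names, the payoff values, and the sample count are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exact_gradient(logits, f):
    """Exact gradient of E_{x ~ Cat(softmax(logits))}[f(x)] w.r.t. the logits.

    For softmax logits, d/d logit_j = p_j * (f(j) - E[f]).
    """
    probs = softmax(logits)
    expected = sum(p * f(x) for x, p in enumerate(probs))
    return [p * (f(j) - expected) for j, p in enumerate(probs)]


def log_derivative_estimate(logits, f, num_samples, rng):
    """Monte Carlo estimate of the same gradient via the Log-Derivative trick:
    grad E[f(x)] = E[f(x) * grad log p(x)], where for softmax logits
    grad_j log p(x) = 1[x == j] - p_j.
    """
    probs = softmax(logits)
    k = len(logits)
    grad = [0.0] * k
    samples = rng.choices(range(k), weights=probs, k=num_samples)
    for x in samples:
        fx = f(x)
        for j in range(k):
            score = (1.0 if j == x else 0.0) - probs[j]
            grad[j] += fx * score / num_samples
    return grad


# Hypothetical toy problem: three categories with fixed payoffs.
payoff = [1.0, 2.0, 4.0]
logits = [0.0, 0.0, 0.0]

exact = exact_gradient(logits, lambda x: payoff[x])
estimate = log_derivative_estimate(
    logits, lambda x: payoff[x], num_samples=20000, rng=random.Random(0)
)
```

With a single small categorical variable the exact gradient is cheap, which makes it a convenient check on the estimator; the combinatorial case the abstract targets (products of many categorical variables) is where enumeration stops being feasible and sampling-based estimators are needed.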

arXiv:2308.12682 [pdf, other]

SayCanPay: Heuristic Planning with Large Language Models using Learnable Domain Knowledge

Authors: Rishi Hazra, Pedro Zuidberg Dos Martires, Luc De Raedt

Abstract: Large Language Models (LLMs) have demonstrated impressive planning abilities due to their vast "world knowledge". Yet, obtaining plans that are both feasible (grounded in affordances) and cost-effective (in plan length), remains a challenge, despite recent progress. This contrasts with heuristic planning methods that employ domain knowledge (formalized in action models such as PDDL) and heuristic search to generate feasible, optimal plans. Inspired by this, we propose to combine the power of LLMs and heuristic planning by leveraging the world knowledge of LLMs and the principles of heuristic search. Our approach, SayCanPay, employs LLMs to generate actions (Say) guided by learnable domain knowledge, that evaluates actions' feasibility (Can) and long-term reward/payoff (Pay), and heuristic search to select the best sequence of actions. Our contributions are (1) a novel framing of the LLM planning problem in the context of heuristic planning, (2) integrating grounding and cost-effective elements into the generated plans, and (3) using heuristic search over actions. Our extensive evaluations show that our model surpasses other LLM planning approaches.

Submitted 1 January, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

Comments: Accepted in AAAI 2024. Website: https://rishihazra.github.io/SayCanPay/

arXiv:2306.04541 [pdf, ps, other]

Top-Down Knowledge Compilation for Counting Modulo Theories

Authors: Vincent Derkinderen, Pedro Zuidberg Dos Martires, Samuel Kolb, Paolo Morettin

Abstract: Propositional model counting (#SAT) can be solved efficiently when the input formula is in deterministic decomposable negation normal form (d-DNNF). Translating an arbitrary formula into a representation that allows inference tasks, such as counting, to be performed efficiently, is called knowledge compilation. Top-down knowledge compilation is a state-of-the-art technique for solving #SAT problems that leverages the traces of exhaustive DPLL search to obtain d-DNNF representations. While knowledge compilation is well studied for propositional approaches, knowledge compilation for the (quantifier free) counting modulo theory setting (#SMT) has been studied to a much lesser degree. In this paper, we discuss compilation strategies for #SMT. We specifically advocate for a top-down compiler based on the traces of exhaustive DPLL(T) search.

Submitted 30 November, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: 9 pages; submitted to Workshop on Counting and Sampling 2023 at SAT2023

arXiv:2303.04660 [pdf, other]

Neural Probabilistic Logic Programming in Discrete-Continuous Domains

Authors: Lennert De Smet, Pedro Zuidberg Dos Martires, Robin Manhaeve, Giuseppe Marra, Angelika Kimmig, Luc De Raedt

Abstract: Neural-symbolic AI (NeSy) allows neural networks to exploit symbolic background knowledge in the form of logic. It has been shown to aid learning in the limited data regime and to facilitate inference on out-of-distribution data. Probabilistic NeSy focuses on integrating neural networks with both logic and probability theory, which additionally allows learning under uncertainty. A major limitation of current probabilistic NeSy systems, such as DeepProbLog, is their restriction to finite probability distributions, i.e., discrete random variables. In contrast, deep probabilistic programming (DPP) excels in modelling and optimising continuous probability distributions. Hence, we introduce DeepSeaProbLog, a neural probabilistic logic programming language that incorporates DPP techniques into NeSy. Doing so results in the support of inference and learning of both discrete and continuous probability distributions under logical constraints. Our main contributions are 1) the semantics of DeepSeaProbLog and its corresponding inference algorithm, 2) a proven asymptotically unbiased learning algorithm, and 3) a series of experiments that illustrate the versatility of our approach.

Submitted 14 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

Comments: 27 pages, 9 figures

ACM Class: D.3.1; I.2.4; I.2.6

arXiv:2302.10674 [pdf, other]

Declarative Probabilistic Logic Programming in Discrete-Continuous Domains

Authors: Pedro Zuidberg Dos Martires, Luc De Raedt, Angelika Kimmig

Abstract: Over the past three decades, the logic programming paradigm has been successfully expanded to support probabilistic modeling, inference and learning. The resulting paradigm of probabilistic logic programming (PLP) and its programming languages owes much of its success to a declarative semantics, the so-called distribution semantics. However, the distribution semantics is limited to discrete random variables only. While PLP has been extended in various ways for supporting hybrid, that is, mixed discrete and continuous random variables, we are still lacking a declarative semantics for hybrid PLP that not only generalizes the distribution semantics and the modeling language but also the standard inference algorithm that is based on knowledge compilation. We contribute the hybrid distribution semantics together with the hybrid PLP language DC-ProbLog and its inference engine infinitesimal algebraic likelihood weighting (IALW). These have the original distribution semantics, standard PLP languages such as ProbLog, and standard inference engines for PLP based on knowledge compilation as special cases. Thus, we generalize the state-of-the-art of PLP towards hybrid PLP in three different aspects: semantics, language and inference. Furthermore, IALW is the first inference algorithm for hybrid probabilistic programming based on knowledge compilation.

Submitted 21 February, 2023; originally announced February 2023.

arXiv:2103.13901 [pdf, other]

Measure Theoretic Weighted Model Integration

Authors: Ivan Miosic, Pedro Zuidberg Dos Martires

Abstract: Weighted model counting (WMC) is a popular framework to perform probabilistic inference with discrete random variables. Recently, WMC has been extended to weighted model integration (WMI) in order to additionally handle continuous variables. At their core, WMI problems consist of computing integrals and sums over weighted logical formulas. From a theoretical standpoint, WMI has been formulated by patching the sum over weighted formulas, which is already present in WMC, with Riemann integration. A more principled approach to integration, which is rooted in measure theory, is Lebesgue integration. Lebesgue integration allows one to treat discrete and continuous variables on equal footing in a principled fashion. We propose a theoretically sound measure theoretic formulation of weighted model integration, which naturally reduces to weighted model counting in the absence of continuous variables. Instead of regarding weighted model integration as an extension of weighted model counting, WMC emerges as a special case of WMI in our formulation.

Submitted 25 March, 2021; originally announced March 2021.

arXiv:2002.10373 [pdf, other]

Symbolic Learning and Reasoning with Noisy Data for Probabilistic Anchoring

Authors: Pedro Zuidberg Dos Martires, Nitesh Kumar, Andreas Persson, Amy Loutfi, Luc De Raedt

Abstract: Robotic agents should be able to learn from sub-symbolic sensor data, and at the same time, be able to reason about objects and communicate with humans on a symbolic level. This raises the question of how to overcome the gap between symbolic and sub-symbolic artificial intelligence. We propose a semantic world modeling approach based on bottom-up object anchoring using an object-centered representation of the world. Perceptual anchoring processes continuous perceptual sensor data and maintains a correspondence to a symbolic representation. We extend the definitions of anchoring to handle multi-modal probability distributions and we couple the resulting symbol anchoring system to a probabilistic logic reasoner for performing inference. Furthermore, we use statistical relational learning to enable the anchoring framework to learn symbolic knowledge in the form of a set of probabilistic logic rules of the world from noisy and sub-symbolic sensor input. The resulting framework, which combines perceptual anchoring and statistical relational learning, is able to maintain a semantic world model of all the objects that have been perceived over time, while still exploiting the expressiveness of logical rules to reason about the state of objects which are not directly observed through sensory input data.
To validate our approach we demonstrate, on the one hand, the ability of our system to perform probabilistic reasoning over multi-modal probability distributions, and on the other hand, the learning of probabilistic logical rules from anchored objects produced by perceptual observations. The learned logical rules are, subsequently, used to assess our proposed probabilistic anchoring procedure. We demonstrate our system in a setting involving object interactions where object occlusions arise and where probabilistic inference is needed to correctly anchor objects.

Submitted 24 February, 2020; originally announced February 2020.

arXiv:2001.04566 [pdf, other]

Monte Carlo Anti-Differentiation for Approximate Weighted Model Integration

Authors: Pedro Zuidberg Dos Martires, Samuel Kolb

Abstract: Probabilistic inference in the hybrid domain, i.e. inference over discrete-continuous domains, requires tackling two well known #P-hard problems: 1) weighted model counting (WMC) over discrete variables and 2) integration over continuous variables. For both of these problems inference techniques have been developed separately in order to manage their #P-hardness, such as knowledge compilation for WMC and Monte Carlo (MC) methods for (approximate) integration in the continuous domain. Weighted model integration (WMI), the extension of WMC to the hybrid domain, has been proposed as a formalism to study probabilistic inference over discrete and continuous variables alike. Recently developed WMI solvers have focused on exploiting structure in WMI problems, for which they rely on symbolic integration to find the primitive of an integrand, i.e. to perform anti-differentiation. To combine these advances with state-of-the-art Monte Carlo integration techniques, we introduce Monte Carlo anti-differentiation (MCAD), which computes MC approximations of anti-derivatives. In our empirical evaluation we substitute the exact symbolic integration backend in an existing WMI solver with an MCAD backend. Our experiments show that equipping existing WMI solvers with MCAD yields a fast yet reliable approximate inference scheme.

Submitted 13 January, 2020; originally announced January 2020.

arXiv:1904.13324 [pdf, other]

Learning from Implicit Information in Natural Language Instructions for Robotic Manipulations

Authors: Ozan Arkan Can, Pedro Zuidberg Dos Martires, Andreas Persson, Julian Gaal, Amy Loutfi, Luc De Raedt, Deniz Yuret, Alessandro Saffiotti

Abstract: Human-robot interaction often occurs in the form of instructions given from a human to a robot. For a robot to successfully follow instructions, a common representation of the world and objects in it should be shared between humans and the robot so that the instructions can be grounded. Achieving this representation can be done via learning, where both the world representation and the language grounding are learned simultaneously. However, in robotics this can be a difficult task due to the cost and scarcity of data. In this paper, we tackle the problem by separately learning the world representation of the robot and the language grounding. While this approach can address the challenges in getting sufficient data, it may give rise to inconsistencies between both learned components. Therefore, we further propose Bayesian learning to resolve such inconsistencies between the natural language grounding and a robot's world representation by exploiting spatio-relational information that is implicitly present in instructions given by a human. Moreover, we demonstrate the feasibility of our approach on a scenario involving a robotic arm in the physical world.

Submitted 30 April, 2019; originally announced April 2019.

arXiv:1902.09937 [pdf, other]

doi 10.1109/TCDS.2019.2915763

Semantic Relational Object Tracking

Authors: Andreas Persson, Pedro Zuidberg Dos Martires, Amy Loutfi, Luc De Raedt

Abstract: This paper addresses the topic of semantic world modeling by conjoining probabilistic reasoning and object anchoring. The proposed approach uses a so-called bottom-up object anchoring method that relies on the rich continuous data from perceptual sensor data. A novel anchoring matching function method learns to maintain object entities in space and time and is validated using a large set of trained humanly annotated ground truth data of real-world objects. For more complex scenarios, a high-level probabilistic object tracker has been integrated with the anchoring framework and handles the tracking of occluded objects via reasoning about the state of unobserved objects. We demonstrate the performance of our integrated approach through scenarios such as the shell game scenario, where we illustrate how anchored objects are retained by preserving relations through probabilistic reasoning.

Submitted 26 February, 2019; originally announced February 2019.

arXiv:1807.00614 [pdf, other]

Knowledge Compilation with Continuous Random Variables and its Application in Hybrid Probabilistic Logic Programming

Authors: Pedro Zuidberg Dos Martires, Anton Dries, Luc De Raedt

Abstract: In probabilistic reasoning, the traditionally discrete domain has been elevated to the hybrid domain encompassing additionally continuous random variables. Inference in the hybrid domain, however, usually necessitates to condone trade-offs on either the inference on discrete or continuous random variables. We introduce a novel approach based on weighted model integration and algebraic model counting that circumvents these trade-offs. We then show how it supports knowledge compilation and exact probabilistic inference. Moreover, we introduce the hybrid probabilistic logic programming language HAL-ProbLog, an extension of ProbLog, to which we apply our inference approach.

Submitted 12 July, 2018; v1 submitted 2 July, 2018; originally announced July 2018.

Comments: 8 pages, 2 figures, StarAI

Showing 1–14 of 14 results for author: Martires, P Z D