Showing 1–7 of 7 results for author: Hasanbeig, H

  1. arXiv:2312.11314

    cs.LG cs.LO eess.SY

    Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis

    Authors: Rohan Mitta, Hosein Hasanbeig, Jun Wang, Daniel Kroening, Yiannis Kantaros, Alessandro Abate

    Abstract: This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL), such that the safety constraint violations are bounded at any point during learning. In a variety of RL applications the safety of the agent is particularly important, e.g. autonomous platforms or robots that work in proximity of humans. As enforcing safety during training might severely limit th…

    Submitted 18 December, 2023; originally announced December 2023.

  2. arXiv:2311.17059

    cs.RO cs.AI cs.LG

    Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications

    Authors: Jun Wang, Hosein Hasanbeig, Kaiyuan Tan, Zihe Sun, Yiannis Kantaros

    Abstract: This paper addresses the problem of designing optimal control policies for mobile robots with mission and safety requirements specified using Linear Temporal Logic (LTL). We consider robots with unknown stochastic dynamics operating in environments with unknown geometric structure. The robots are equipped with sensors allowing them to detect obstacles. Our goal is to synthesize a control policy th…

    Submitted 28 November, 2023; originally announced November 2023.

  3. arXiv:2310.00313

    cs.CL

    Decoding In-Context Learning: Neuroscience-inspired Analysis of Representations in Large Language Models

    Authors: Safoora Yousefi, Leo Betthauser, Hosein Hasanbeig, Raphaël Millière, Ida Momennejad

    Abstract: Large language models (LLMs) exhibit remarkable performance improvement through in-context learning (ICL) by leveraging task-specific examples in the input. However, the mechanisms behind this improvement remain elusive. In this work, we investigate how LLM embeddings and attention representations change following in-context-learning, and how these changes mediate improvement in behavior. We emplo…

    Submitted 21 February, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

  4. arXiv:2309.15129

    cs.AI cs.CL cs.LG

    Evaluating Cognitive Maps and Planning in Large Language Models with CogEval

    Authors: Ida Momennejad, Hosein Hasanbeig, Felipe Vieira, Hiteshi Sharma, Robert Osazuwa Ness, Nebojsa Jojic, Hamid Palangi, Jonathan Larson

    Abstract: Recently an influx of studies claim emergent cognitive abilities in large language models (LLMs). Yet, most rely on anecdotes, overlook contamination of training sets, or lack systematic evaluation involving multiple tasks, control conditions, multiple iterations, and statistical robustness tests. Here we make two major contributions. First, we propose CogEval, a cognitive science-inspired protoco…

    Submitted 24 September, 2023; originally announced September 2023.

  5. arXiv:2309.13701

    cs.CL cs.AI cs.HC

    ALLURE: Auditing and Improving LLM-based Evaluation of Text using Iterative In-Context-Learning

    Authors: Hosein Hasanbeig, Hiteshi Sharma, Leo Betthauser, Felipe Vieira Frujeri, Ida Momennejad

    Abstract: From grading papers to summarizing medical documents, large language models (LLMs) are evermore used for evaluation of text generated by humans and AI alike. However, despite their extensive utility, LLMs exhibit distinct failure modes, necessitating a thorough audit and improvement of their text evaluation capabilities. Here we introduce ALLURE, a systematic approach to Auditing Large Language Mo…

    Submitted 26 September, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

  6. arXiv:2209.10341

    cs.LG cs.AI cs.LO

    LCRL: Certified Policy Synthesis via Logically-Constrained Reinforcement Learning

    Authors: Hosein Hasanbeig, Daniel Kroening, Alessandro Abate

    Abstract: LCRL is a software tool that implements model-free Reinforcement Learning (RL) algorithms over unknown Markov Decision Processes (MDPs), synthesising policies that satisfy a given linear temporal specification with maximal probability. LCRL leverages partially deterministic finite-state machines known as Limit Deterministic Buchi Automata (LDBA) to express a given linear temporal specification. A…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: Evaluated and Accepted by the 19th International Conference on Quantitative Evaluation of Systems 2022

  7. arXiv:1902.00778

    cs.LG stat.ML

    Certified Reinforcement Learning with Logic Guidance

    Authors: Hosein Hasanbeig, Daniel Kroening, Alessandro Abate

    Abstract: Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied to a variety of control problems. However, applications in safety-critical domains require a systematic and formal approach to specifying requirements as tasks or goals. We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuo…

    Submitted 6 June, 2023; v1 submitted 2 February, 2019; originally announced February 2019.