Showing 1–28 of 28 results for author: Ammar, H B

  1. arXiv:2405.20304  [pdf, other]

    cs.CL cs.LG

    Group Robust Preference Optimization in Reward-free RLHF

    Authors: Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas, Viraj Mehta, Pier Giuseppe Sessa, Haitham Bou Ammar, Ilija Bogunovic

    Abstract: Adapting large language models (LLMs) for specific tasks usually involves fine-tuning through reinforcement learning with human feedback (RLHF) on preference data. While these data often come from diverse labelers' groups (e.g., different demographics, ethnicities, company teams, etc.), traditional RLHF approaches adopt a "one-size-fits-all" approach, i.e., they indiscriminately assume and optimiz…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Preprint

  2. arXiv:2402.12940  [pdf]

    cs.CL

    Normalized Orthography for Tunisian Arabic

    Authors: Houcemeddine Turki, Kawthar Ellouze, Hager Ben Ammar, Mohamed Ali Hadj Taieb, Imed Adel, Mohamed Ben Aouicha, Pier Luigi Farri, Abderrezak Bennour

    Abstract: Tunisian Arabic (ISO 639-3: aeb) is a distinct variety native to Tunisia, derived from Arabic and enriched by various historical influences. This research introduces the "Normalized Orthography for Tunisian Arabic" (NOTA), an adaptation of CODA* guidelines for transcribing Tunisian Arabic using Arabic script. The aim is to enhance language resource development by ensuring user-friendliness and cons…

    Submitted 11 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Final Report for the Derja Association. Camera-Ready for LPKM 2024

  3. arXiv:2306.09803  [pdf, other]

    cs.LG

    Framework and Benchmarks for Combinatorial and Mixed-variable Bayesian Optimization

    Authors: Kamil Dreczkowski, Antoine Grosnit, Haitham Bou Ammar

    Abstract: This paper introduces a modular framework for Mixed-variable and Combinatorial Bayesian Optimization (MCBO) to address the lack of systematic benchmarking and standardized evaluation in the field. Current MCBO papers often introduce non-diverse or non-standard benchmarks to evaluate their methods, impeding the proper assessment of different MCBO primitives and their combinations. Additionally, pap…

    Submitted 11 December, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

  4. arXiv:2305.15930  [pdf, other]

    cs.LG

    End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes

    Authors: Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Haitham Bou Ammar

    Abstract: Meta-Bayesian optimisation (meta-BO) aims to improve the sample efficiency of Bayesian optimisation by leveraging data from related tasks. While previous methods successfully meta-learn either a surrogate model or an acquisition function independently, joint training of both components remains an open challenge. This paper proposes the first end-to-end differentiable meta-BO framework that general…

    Submitted 22 December, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

  5. arXiv:2305.09793  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Reinforcement Learning for Safe Robot Control using Control Lyapunov Barrier Functions

    Authors: Desong Du, Shaohang Han, Naiming Qi, Haitham Bou Ammar, Jun Wang, Wei Pan

    Abstract: Reinforcement learning (RL) exhibits impressive performance when managing complicated control tasks for robots. However, its wide application to physical robots is limited by the absence of strong safety guarantees. To overcome this challenge, this paper explores the control Lyapunov barrier function (CLBF) to analyze the safety and reachability solely based on data without explicitly employing a…

    Submitted 16 May, 2023; originally announced May 2023.

  6. arXiv:2302.00815  [pdf, ps, other]

    math.AG

    Slope inequality for an arbitrary divisor

    Authors: Houari Benammar Ammar

    Abstract: Let $f: S \longrightarrow C$ be a surjective morphism with connected fibers from a smooth complex projective surface $S$ to a smooth complex projective curve $C$ with general fiber $F$. In this paper, we develop a more general version of slope inequality for a data $(D, \mathcal{F})$ where $D$ is an arbitrary relatively effective divisor on $S$ and $\mathcal{F}$ is a locally free sub-sheaf of…

    Submitted 12 October, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: Many changes and comments are very welcome

  7. arXiv:2209.04698  [pdf, other]

    cs.LG

    Structured Q-learning For Antibody Design

    Authors: Alexander I. Cowen-Rivers, Philip John Gorinski, Aivar Sootla, Asif Khan, Liu Furui, Jun Wang, Jan Peters, Haitham Bou Ammar

    Abstract: Optimizing combinatorial structures is core to many real-world problems, such as those encountered in life sciences. For example, one of the crucial steps involved in antibody design is to find an arrangement of amino acids in a protein sequence that improves its binding with a pathogen. Combinatorial optimization of antibodies is difficult due to extremely large search spaces and non-linear objec…

    Submitted 13 September, 2022; v1 submitted 10 September, 2022; originally announced September 2022.

  8. arXiv:2206.02675  [pdf, other]

    cs.LG cs.AI

    Effects of Safety State Augmentation on Safe Exploration

    Authors: Aivar Sootla, Alexander I. Cowen-Rivers, Jun Wang, Haitham Bou Ammar

    Abstract: Safe exploration is a challenging and important problem in model-free reinforcement learning (RL). Often the safety cost is sparse and unknown, which unavoidably leads to constraint violations -- a phenomenon ideally to be avoided in safety-critical applications. We tackle this problem by augmenting the state-space with a safety state, which is nonnegative if and only if the constraint is satisfie…

    Submitted 12 October, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Published in NeurIPS 2022

  9. arXiv:2205.13902  [pdf, other]

    cs.LG

    Sample-Efficient Optimisation with Probabilistic Transformer Surrogates

    Authors: Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Rasul Tutunov, Jun Wang, Haitham Bou Ammar

    Abstract: Faced with problems of increasing complexity, recent research in Bayesian Optimisation (BO) has focused on adapting deep probabilistic models as flexible alternatives to Gaussian Processes (GPs). In a similar vein, this paper investigates the feasibility of employing state-of-the-art probabilistic transformers in BO. Upon further investigation, we observe two drawbacks stemming from their training…

    Submitted 30 May, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

  10. arXiv:2201.09612  [pdf, other]

    cs.RO

    Learning Geometric Constraints in Task and Motion Planning

    Authors: Tianyu Ren, Alexander Imani Cowen-Rivers, Haitham Bou Ammar, Jan Peters

    Abstract: Searching for bindings of geometric parameters in task and motion planning (TAMP) is a finite-horizon stochastic planning problem with high-dimensional decision spaces. A robot manipulator can only move in a subspace of its whole range that is subjected to many geometric constraints. A TAMP solver usually takes many explorations before finding a feasible binding set for each task. It is favorable…

    Submitted 24 January, 2022; originally announced January 2022.

  11. arXiv:2111.06178  [pdf, other]

    cs.LG stat.ML

    BOiLS: Bayesian Optimisation for Logic Synthesis

    Authors: Antoine Grosnit, Cedric Malherbe, Rasul Tutunov, Xingchen Wan, Jun Wang, Haitham Bou Ammar

    Abstract: Optimising the quality-of-results (QoR) of circuits during logic synthesis is a formidable challenge necessitating the exploration of exponentially sized search spaces. While expert-designed operations aid in uncovering effective sequences, the increase in complexity of logic circuits favours automated procedures. Inspired by the successes of machine learning, researchers adapted deep learning and…

    Submitted 11 November, 2021; originally announced November 2021.

  12. arXiv:2107.02474  [pdf, other]

    stat.ML cs.LG

    Viscos Flows: Variational Schur Conditional Sampling With Normalizing Flows

    Authors: Vincent Moens, Aivar Sootla, Haitham Bou Ammar, Jun Wang

    Abstract: We present a method for conditional sampling for pre-trained normalizing flows when only part of an observation is available. We derive a lower bound to the conditioning variable log-probability using Schur complement properties in the spirit of Gaussian conditional sampling. Our derivation relies on partitioning flow's domain in such a way that the flow restrictions to subdomains remain bijective…

    Submitted 15 October, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

  13. arXiv:2103.07780  [pdf, other]

    cs.AI cs.GT

    Online Double Oracle

    Authors: Le Cong Dinh, Yaodong Yang, Stephen McAleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang

    Abstract: Solving strategic games with huge action space is a critical yet under-explored topic in economics, operations research and artificial intelligence. This paper proposes new learning algorithms for solving two-player zero-sum normal-form games where the number of pure strategies is prohibitively large. Specifically, we combine no-regret analysis from online learning with Double Oracle (DO) methods…

    Submitted 15 February, 2023; v1 submitted 13 March, 2021; originally announced March 2021.

    Comments: Accepted at Transactions on Machine Learning Research (TMLR)

    Journal ref: Transactions on Machine Learning Research 2022

  14. arXiv:2102.07659  [pdf, other]

    cs.AI cs.MA

    Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems

    Authors: Yaodong Yang, Jun Luo, Ying Wen, Oliver Slumbers, Daniel Graves, Haitham Bou Ammar, Jun Wang, Matthew E. Taylor

    Abstract: Multiagent reinforcement learning (MARL) has achieved a remarkable amount of success in solving various types of video games. A cornerstone of this success is the auto-curriculum framework, which shapes the learning process by continually creating new challenging tasks for agents to adapt to, thereby facilitating the acquisition of new skills. In order to extend MARL methods to real-world domains…

    Submitted 16 February, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: AAMAS 2021

  15. arXiv:2012.03826  [pdf, other]

    cs.LG math.OC

    HEBO Pushing The Limits of Sample-Efficient Hyperparameter Optimisation

    Authors: Alexander I. Cowen-Rivers, Wenlong Lyu, Rasul Tutunov, Zhi Wang, Antoine Grosnit, Ryan Rhys Griffiths, Alexandre Max Maraval, Hao Jianye, Jun Wang, Jan Peters, Haitham Bou Ammar

    Abstract: In this work we rigorously analyse assumptions inherent to black-box optimisation hyper-parameter tuning tasks. Our results on the Bayesmark benchmark indicate that heteroscedasticity and non-stationarity pose significant challenges for black-box optimisers. Based on these findings, we propose a Heteroscedastic and Evolutionary Bayesian Optimisation solver (HEBO). HEBO performs non-linear input an…

    Submitted 25 May, 2022; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: Accepted at JAIR

  16. arXiv:2010.09776  [pdf, other]

    cs.MA cs.AI cs.GT cs.LG eess.SY

    SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving

    Authors: Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, et al. (12 additional authors not shown)

    Abstract: Multi-agent interaction is a fundamental aspect of autonomous driving in the real world. Despite more than a decade of research and development, the problem of how to competently interact with diverse road users in diverse scenarios remains largely unsolved. Learning methods have much to offer towards solving this problem. But they require a realistic multi-agent simulator that generates diverse a…

    Submitted 31 October, 2020; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 20 pages, 11 figures. Paper accepted to CoRL 2020

  17. arXiv:1910.08285  [pdf, other]

    cs.LG stat.ML

    Multi-View Reinforcement Learning

    Authors: Minne Li, Lisheng Wu, Haitham Bou Ammar, Jun Wang

    Abstract: This paper is concerned with multi-view reinforcement learning (MVRL), which allows for decision making when agents share common dynamics but adhere to different observation models. We define the MVRL framework by extending partially observable Markov decision processes (POMDPs) to support more than one observation model and propose two solution methods through observation augmentation and cross-v…

    Submitted 18 October, 2019; originally announced October 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  18. arXiv:1910.04034  [pdf, ps, other]

    cs.LG stat.ML

    Derivative-Free & Order-Robust Optimisation

    Authors: Victor Gabillon, Rasul Tutunov, Michal Valko, Haitham Bou Ammar

    Abstract: In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments, while recovering favorable rates under stochastic reward-generating processes. Our results are the first to target simple regret definitions in adversarial sc…

    Submitted 22 October, 2019; v1 submitted 9 October, 2019; originally announced October 2019.

  19. arXiv:1909.11628  [pdf, other]

    cs.MA cs.LG

    $α^α$-Rank: Practically Scaling $α$-Rank through Stochastic Optimisation

    Authors: Yaodong Yang, Rasul Tutunov, Phu Sakulwongtana, Haitham Bou Ammar

    Abstract: Recently, $α$-Rank, a graph-based algorithm, has been proposed as a solution to ranking joint policy profiles in large scale multi-agent systems. $α$-Rank claimed tractability through a polynomial time implementation with respect to the total number of pure strategy profiles. Here, we note that inputs to the algorithm were not clearly specified in the original presentation; as such, we deem comple…

    Submitted 2 March, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: AAMAS 2020 Full Paper

  20. arXiv:1907.13196  [pdf, other]

    cs.LG cs.AI stat.ML

    Wasserstein Robust Reinforcement Learning

    Authors: Mohammed Amin Abdullah, Hang Ren, Haitham Bou Ammar, Vladimir Milenkovic, Rui Luo, Mingtian Zhang, Jun Wang

    Abstract: Reinforcement learning algorithms, though successful, tend to over-fit to training environments hampering their application to the real-world. This paper proposes $\text{W}\text{R}^{2}\text{L}$ -- a robust reinforcement learning algorithm with significant robust performance on low and high-dimensional control tasks. Our method formalises robust reinforcement learning as a novel min-max game with a…

    Submitted 16 September, 2019; v1 submitted 30 July, 2019; originally announced July 2019.

  21. arXiv:1810.04444  [pdf, other]

    cs.AI

    Learning to Communicate Implicitly By Actions

    Authors: Zheng Tian, Shihao Zou, Ian Davies, Tim Warr, Lisheng Wu, Haitham Bou Ammar, Jun Wang

    Abstract: In situations where explicit communication is limited, human collaborators act by learning to: (i) infer meaning behind their partner's actions, and (ii) convey private information about the state to their partner implicitly through actions. The first component of this learning process has been well-studied in multi-agent systems, whereas the second --- which is equally crucial for successful coll…

    Submitted 20 November, 2019; v1 submitted 10 October, 2018; originally announced October 2018.

    Comments: AAAI 2020

  22. arXiv:1606.06593  [pdf, other]

    cs.DC math.OC

    A Distributed Newton Method for Large Scale Consensus Optimization

    Authors: Rasul Tutunov, Haitham Bou Ammar, Ali Jadbabaie

    Abstract: In this paper, we propose a distributed Newton method for consensus optimization. Our approach outperforms state-of-the-art methods, including ADMM. The key idea is to exploit the sparsity of the dual Hessian and recast the computation of the Newton step as one of efficiently solving symmetric diagonally dominant linear equations. We validate our algorithm both theoretically and empirically. On th…

    Submitted 21 June, 2016; originally announced June 2016.

  23. Estimating 3D Trajectories from 2D Projections via Disjunctive Factored Four-Way Conditional Restricted Boltzmann Machines

    Authors: Decebal Constantin Mocanu, Haitham Bou Ammar, Luis Puig, Eric Eaton, Antonio Liotta

    Abstract: Estimation, recognition, and near-future prediction of 3D trajectories based on their two dimensional projections available from one camera source is an exceptionally difficult problem due to uncertainty in the trajectories and environment, high dimensionality of the specific trajectory states, lack of enough labeled data and so on. In this article, we propose a solution to solve this problem base…

    Submitted 29 April, 2017; v1 submitted 20 April, 2016; originally announced April 2016.

    Comments: Pattern Recognition, ISSN 0031-3203, Elsevier, 2017

  24. arXiv:1604.03986  [pdf, other]

    cs.LG

    Theoretically-Grounded Policy Advice from Multiple Teachers in Reinforcement Learning Settings with Applications to Negative Transfer

    Authors: Yusen Zhan, Haitham Bou Ammar, Matthew E. Taylor

    Abstract: Policy advice is a transfer learning method where a student agent is able to learn faster via advice from a teacher. However, both this and other reinforcement learning transfer methods have little theoretical analysis. This paper formally defines a setting where multiple teacher agents can provide advice to a student and introduces an algorithm to leverage both autonomous exploration and teacher'…

    Submitted 13 April, 2016; originally announced April 2016.

    Comments: 10 pages, 6 figures, IJCAI 2016 conference paper

  25. arXiv:1505.05798  [pdf, other]

    cs.LG

    Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret

    Authors: Haitham Bou Ammar, Rasul Tutunov, Eric Eaton

    Abstract: Lifelong reinforcement learning provides a promising framework for developing versatile agents that can accumulate knowledge over a lifetime of experience and rapidly learn new tasks by building upon prior knowledge. However, current lifelong learning methods exhibit non-vanishing regret as the amount of experience increases and include limitations that can lead to suboptimal or unsafe control pol…

    Submitted 21 May, 2015; originally announced May 2015.

  26. arXiv:1503.06883  [pdf, ps, other]

    math.OC

    Fast, Accurate Second Order Methods for Network Optimization

    Authors: Rasul Tutunov, Haitham Bou Ammar, Ali Jadbabaie

    Abstract: Dual descent methods are commonly used to solve network flow optimization problems, since their implementation can be distributed over the network. These algorithms, however, often exhibit slow convergence rates. Approximate Newton methods which compute descent directions locally have been proposed as alternatives to accelerate the convergence rates of conventional dual descent. The effectiveness…

    Submitted 23 March, 2015; originally announced March 2015.

    Comments: arXiv admin note: text overlap with arXiv:1502.03158

  27. arXiv:1502.03158  [pdf, ps, other]

    cs.DC

    A Fast Distributed Solver for Symmetric Diagonally Dominant Linear Equations

    Authors: Rasul Tutunov, Haitham Bou Ammar, Ali Jadbabaie

    Abstract: In this paper, we propose a fast distributed solver for linear equations given by symmetric diagonally dominant M-Matrices. Our approach is based on a distributed implementation of the parallel solver of Spielman and Peng by considering a specific approximated inverse chain which can be computed efficiently in a distributed fashion. Representing the system of equations by a graph $\mathbb{G}$, the…

    Submitted 10 February, 2015; originally announced February 2015.

  28. arXiv:1410.8515  [pdf, other]

    math.PR cs.SI physics.soc-ph

    On the Degree Distribution of Pólya Urn Graph Processes

    Authors: Rasul Tutunov, Haitham Bou Ammar, Ali Jadbabaie, Eric Eaton

    Abstract: This paper presents a tighter bound on the degree distribution of arbitrary Pólya urn graph processes, proving that the proportion of vertices with degree $d$ obeys a power-law distribution $P(d) \propto d^{-γ}$ for $d \leq n^{\frac{1}{6}-ε}$ for any $ε > 0$, where $n$ represents the number of vertices in the network. Previous work by Bollobás et al. formalized the well-known preferential attachmen…

    Submitted 30 October, 2014; originally announced October 2014.

    Comments: 26 pages, 2 figures

    MSC Class: 60Cxx