Search | arXiv e-print repository

WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks

Authors: Léo Boisvert, Megh Thakkar, Maxime Gasse, Massimo Caccia, Thibault Le Sellier De Chezelles, Quentin Cappart, Nicolas Chapados, Alexandre Lacoste, Alexandre Drouin

Abstract: The ability of large language models (LLMs) to mimic human-like intelligence has led to a surge in LLM-based autonomous agents. Though recent LLMs seem capable of planning and reasoning given user instructions, their effectiveness in applying these capabilities for autonomous task solving remains underexplored. This is especially true in enterprise settings, where automated agents hold the promise of a high impact. To fill this gap, we propose WorkArena++, a novel benchmark consisting of 682 tasks corresponding to realistic workflows routinely performed by knowledge workers. WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents. Our empirical studies across state-of-the-art LLMs and vision-language models (VLMs), as well as human workers, reveal several challenges for such models to serve as useful assistants in the workplace. In addition to the benchmark, we provide a mechanism to effortlessly generate thousands of ground-truth observation/action traces, which can be used for fine-tuning existing models. Overall, we expect this work to serve as a useful resource to help the community progress toward capable autonomous agents. The benchmark can be found at https://github.com/ServiceNow/WorkArena/tree/workarena-plus-plus.

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2403.07718 [pdf, other]

WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?

Authors: Alexandre Drouin, Maxime Gasse, Massimo Caccia, Issam H. Laradji, Manuel Del Verme, Tom Marty, Léo Boisvert, Megh Thakkar, Quentin Cappart, David Vazquez, Nicolas Chapados, Alexandre Lacoste

Abstract: We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on measuring the agents' ability to perform tasks that span the typical daily work of knowledge workers utilizing enterprise software systems. To this end, we propose WorkArena, a remote-hosted benchmark of 33 tasks based on the widely-used ServiceNow platform. We also introduce BrowserGym, an environment for the design and evaluation of such agents, offering a rich set of actions as well as multimodal observations. Our empirical evaluation reveals that while current agents show promise on WorkArena, there remains a considerable gap towards achieving full task automation. Notably, our analysis uncovers a significant performance disparity between open and closed-source LLMs, highlighting a critical area for future exploration and development in the field.

Submitted 14 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

Comments: 21 pages, 11 figures, preprint

arXiv:2403.06026 [pdf, other]

Towards a Generic Representation of Combinatorial Problems for Learning-Based Approaches

Authors: Léo Boisvert, Hélène Verhaeghe, Quentin Cappart

Abstract: In recent years, there has been a growing interest in using learning-based approaches for solving combinatorial problems, either in an end-to-end manner or in conjunction with traditional optimization algorithms. In both scenarios, the challenge lies in encoding the targeted combinatorial problems into a structure compatible with the learning algorithm. Many existing works have proposed problem-specific representations, often in the form of a graph, to leverage the advantages of graph neural networks. However, these approaches lack generality, as the representation cannot be easily transferred from one combinatorial problem to another. While some attempts have been made to bridge this gap, they still offer only partial generality. In response to this challenge, this paper advocates for progress toward a fully generic representation of combinatorial problems for learning-based approaches. The approach we propose involves constructing a graph by breaking down any constraint of a combinatorial problem into an abstract syntax tree and expressing relationships (e.g., a variable involved in a constraint) through the edges. Furthermore, we introduce a graph neural network architecture capable of efficiently learning from this representation. The tool provided operates on combinatorial problems expressed in the XCSP3 format, handling all the constraints available in the 2023 mini-track competition. Experimental results on four combinatorial problems demonstrate that our architecture achieves performance comparable to dedicated architectures while maintaining generality. Our code and trained models are publicly available at https://github.com/corail-research/learning-generic-csp.

Submitted 12 March, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

arXiv:2402.06040 [pdf, other]

Deep Learning for Data-Driven Districting-and-Routing

Authors: Arthur Ferraz, Quentin Cappart, Thibaut Vidal

Abstract: Districting-and-routing is a strategic problem aiming to aggregate basic geographical units (e.g., zip codes) into delivery districts. Its goal is to minimize the expected long-term routing cost of performing deliveries in each district separately. Solving this stochastic problem poses critical challenges since repeatedly evaluating routing costs on a set of scenarios while searching for optimal districts takes considerable time. Consequently, solution approaches usually replace the true cost estimation with continuous cost approximation formulas extending Beardwood-Halton-Hammersley and Daganzo's work. These formulas commit errors that can be magnified during the optimization step. To reconcile speed and solution quality, we introduce a supervised learning and optimization methodology leveraging a graph neural network for delivery-cost estimation. This network is trained to imitate known costs generated on a limited subset of training districts. It is used within an iterated local search procedure to produce high-quality districting plans. Our computational experiments, conducted on five metropolitan areas in the United Kingdom, demonstrate that the graph neural network predicts long-term district cost operations more accurately, and that optimizing over this oracle permits large economic gains (10.12% on average) over baseline methods that use continuous approximation formulas or shallow neural networks. Finally, we observe that having compact districts alone does not guarantee high-quality solutions and that other learnable geometrical features of the districts play an essential role.

Submitted 8 February, 2024; originally announced February 2024.

arXiv:2312.14836 [pdf, other]

Learning Lagrangian Multipliers for the Travelling Salesman Problem

Authors: Augustin Parjadis, Quentin Cappart, Bistra Dilkina, Aaron Ferber, Louis-Martin Rousseau

Abstract: Lagrangian relaxation is a versatile mathematical technique employed to relax constraints in an optimization problem, enabling the generation of dual bounds to prove the optimality of feasible solutions and the design of efficient propagators in constraint programming (such as the weighted circuit constraint). However, the conventional process of deriving Lagrangian multipliers (e.g., using subgradient methods) is often computationally intensive, limiting its practicality for large-scale or time-sensitive problems. To address this challenge, we propose an innovative unsupervised learning approach that harnesses the capabilities of graph neural networks to exploit the problem structure, aiming to generate accurate Lagrangian multipliers efficiently. We apply this technique to the well-known Held-Karp Lagrangian relaxation for the travelling salesman problem. The core idea is to predict accurate Lagrangian multipliers and to employ them as a warm start for generating Held-Karp relaxation bounds. These bounds are subsequently utilized to enhance the filtering process carried out by branch-and-bound algorithms. In contrast to much of the existing literature, which primarily focuses on finding feasible solutions, our approach operates on the dual side, demonstrating that learning can also accelerate the proof of optimality. We conduct experiments across various distributions of the metric travelling salesman problem, considering instances with up to 200 cities. The results illustrate that our approach can improve the filtering level of the weighted circuit global constraint, reduce the optimality gap by a factor of two for unsolved instances up to a timeout, and reduce the execution time for solved instances by 10%.

Submitted 22 December, 2023; originally announced December 2023.

arXiv:2312.08884 [pdf, other]

Global Rewards in Multi-Agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems

Authors: Heiko Hoppe, Tobias Enders, Quentin Cappart, Maximilian Schiffer

Abstract: We study vehicle dispatching in autonomous mobility on demand (AMoD) systems, where a central operator assigns vehicles to customer requests or rejects these with the aim of maximizing its total profit. Recent approaches use multi-agent deep reinforcement learning (MADRL) to realize scalable yet performant algorithms, but train agents based on local rewards, which distorts the reward signal with respect to the system-wide profit, leading to lower performance. We therefore propose a novel global-rewards-based MADRL algorithm for vehicle dispatching in AMoD systems, which resolves so far existing goal conflicts between the trained agents and the operator by assigning rewards to agents leveraging a counterfactual baseline. Our algorithm shows statistically significant improvements across various settings on real-world data compared to state-of-the-art MADRL algorithms with local rewards. We further provide a structural analysis which shows that the utilization of global rewards can improve implicit vehicle balancing and demand forecasting abilities. Our code is available at https://github.com/tumBAIS/GR-MADRL-AMoD.

Submitted 19 May, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 22 pages, 6 figures, extended version of paper accepted at the 6th Learning for Dynamics & Control Conference (L4DC 2024)

arXiv:2312.01404 [pdf, other]

Decision Diagrams in Space!

Authors: Isaac Rudich, Quentin Cappart, Manuel López-Ibáñez, Michael Römer, Louis-Martin Rousseau

Abstract: The Asteroid Routing Problem is like the Traveling Salesman Problem, but in space. The European Space Agency is interested in visiting asteroids to extract minerals; however, the asteroids are moving, and calculating trajectories between asteroids can be computationally expensive. The goal of the Asteroid Routing Problem is to find the optimal path that visits a set of asteroids while taking the movement of the asteroids into account. Existing methods of solving this problem do not yield exact solutions. We provide the first exact solution to the Asteroid Routing Problem by using a method of solving optimization problems with decision diagrams called peel-and-bound. We also discuss how this methodology can be used to generate new heuristic search techniques for global trajectory optimization problems.

Submitted 3 December, 2023; originally announced December 2023.

Comments: 9 pages, 2 figures, submitted to CPAIOR 2024

MSC Class: 90-08

arXiv:2302.05483 [pdf, other]

doi 10.1613/jair.1.14607

Improved Peel-and-Bound: Methods for Generating Dual Bounds with Multivalued Decision Diagrams

Authors: Isaac Rudich, Quentin Cappart, Louis-Martin Rousseau

Abstract: Decision diagrams are an increasingly important tool in cutting-edge solvers for discrete optimization. However, the field of decision diagrams is relatively new, and is still incorporating the library of techniques that conventional solvers have had decades to build. We drew inspiration from the warm-start technique used in conventional solvers to address one of the major challenges faced by decision diagram based methods. Decision diagrams become more useful the wider they are allowed to be, but also become more costly to generate, especially with large numbers of variables. In the original version of this paper, we presented a method of peeling off a sub-graph of previously constructed diagrams and using it as the initial diagram for subsequent iterations that we call peel-and-bound. We tested the method on the sequence ordering problem, and our results indicate that our peel-and-bound scheme generates stronger bounds than a branch-and-bound scheme using the same propagators, and at significantly less computational cost. In this extended version of the paper, we also propose new methods for using relaxed decision diagrams to improve the solutions found using restricted decision diagrams, discuss the heuristic decisions involved with the parallelization of peel-and-bound, and discuss how peel-and-bound can be hyper-optimized for sequencing problems. Furthermore, we test the new methods on the sequence ordering problem and the traveling salesman problem with time-windows (TSPTW), and include an updated and generalized implementation of the algorithm capable of handling any discrete optimization problem. The new results show that peel-and-bound outperforms ddo (a decision diagram based branch-and-bound solver) on the TSPTW. We also close 15 open benchmark instances of the TSPTW.

Submitted 17 August, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

Comments: 50 pages, 31 figures, published by JAIR, supplementary materials at https://github.com/IsaacRudich/ImprovedPnB. arXiv admin note: substantial text overlap with arXiv:2205.05216

MSC Class: 90-08

Journal ref: Journal of Artificial Intelligence Research, 77:1489-1538 (2023)

arXiv:2301.01913 [pdf, other]

doi 10.4230/LIPIcs.CP.2023.25

Learning a Generic Value-Selection Heuristic Inside a Constraint Programming Solver

Authors: Tom Marty, Tristan François, Pierre Tessier, Louis Gauthier, Louis-Martin Rousseau, Quentin Cappart

Abstract: Constraint programming is known for being an efficient approach for solving combinatorial problems. Important design choices in a solver are the branching heuristics, which are designed to lead the search to the best solutions in a minimum amount of time. However, developing these heuristics is a time-consuming process that requires problem-specific expertise. This observation has motivated many efforts to use machine learning to automatically learn efficient heuristics without expert intervention. To the best of our knowledge, it is still an open research question. Although several generic variable-selection heuristics are available in the literature, the options for a generic value-selection heuristic are more scarce. In this paper, we propose to tackle this issue by introducing a generic learning procedure that can be used to obtain a value-selection heuristic inside a constraint programming solver. This has been achieved thanks to the combination of a deep Q-learning algorithm, a tailored reward signal, and a heterogeneous graph neural network architecture. Experiments on graph coloring, maximum independent set, and maximum cut problems show that our framework is able to find better solutions close to optimality without requiring a large number of backtracks while being generic.

Submitted 2 October, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: 15 pages

Journal ref: Constraint Programming 29 (2023) 25:1--25:19

arXiv:2205.05216 [pdf, other]

doi 10.4230/LIPIcs.CP.2022.35

Peel-and-Bound: Generating Stronger Relaxed Bounds with Multivalued Decision Diagrams

Authors: Isaac Rudich, Quentin Cappart, Louis-Martin Rousseau

Abstract: Decision diagrams are an increasingly important tool in cutting-edge solvers for discrete optimization. However, the field of decision diagrams is relatively new, and is still incorporating the library of techniques that conventional solvers have had decades to build. We drew inspiration from the warm-start technique used in conventional solvers to address one of the major challenges faced by decision diagram based methods. Decision diagrams become more useful the wider they are allowed to be, but also become more costly to generate, especially with large numbers of variables. We present a method of peeling off a sub-graph of previously constructed diagrams and using it as the initial diagram for subsequent iterations that we call peel-and-bound. We test the method on the sequence ordering problem, and our results indicate that our peel-and-bound scheme generates stronger bounds than a branch-and-bound scheme using the same propagators, and at significantly less computational cost.

Submitted 10 May, 2022; originally announced May 2022.

Comments: 20 pages with 11 figures, to be published in CP 2022

MSC Class: 90-08

Journal ref: 28th International Conference on Principles and Practice of Constraint Programming (CP 2022): 235:1-235:20

arXiv:2203.02433 [pdf, ps, other]

The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

Authors: Maxime Gasse, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Chételat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M. Kazachkov, Elias Khalil, Pawel Lichocki, Andrea Lodi, Miles Lubin, Chris J. Maddison, Christopher Morris, Dimitri J. Papageorgiou, Augustin Parjadis, Sebastian Pokutta, Antoine Prouvost, Lara Scavuzzo, Giulia Zarpellon, Linxin Yang, Sha Lai, Akang Wang, Xiaodong Luo, et al. (16 additional authors not shown)

Abstract: Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning as a new approach for solving combinatorial problems, either directly as solvers or by enhancing exact solvers. Based on this context, the ML4CO aims at improving state-of-the-art combinatorial optimization solvers by replacing key heuristic components. The competition featured three challenging tasks: finding the best feasible solution, producing the tightest optimality certificate, and giving an appropriate solver configuration. Three realistic datasets were considered: balanced item placement, workload apportionment, and maritime inventory routing. This last dataset was kept anonymous for the contestants.

Submitted 17 March, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

Comments: NeurIPS 2021 competition. arXiv admin note: text overlap with arXiv:2112.12251 by other authors

arXiv:2112.10229 [pdf, other]

On Causal Inference for Data-free Structured Pruning

Authors: Martin Ferianc, Anush Sankaran, Olivier Mastropietro, Ehsan Saboori, Quentin Cappart

Abstract: Neural networks (NNs) are making a large impact both on research and industry. Nevertheless, as NNs' accuracy increases, it is followed by an expansion in their size, required number of compute operations and energy consumption. Increase in resource consumption results in NNs' reduced adoption rate and real-world deployment impracticality. Therefore, NNs need to be compressed to make them available to a wider audience and at the same time decrease their runtime costs. In this work, we approach this challenge from a causal inference perspective, and we propose a scoring mechanism to facilitate structured pruning of NNs. The approach is based on measuring mutual information under a maximum entropy perturbation, sequentially propagated through the NN. We demonstrate the method's performance on two datasets and various NNs' sizes, and we show that our approach achieves competitive performance under challenging conditions.

Submitted 19 December, 2021; originally announced December 2021.

Comments: Accepted to ITCI'22: The AAAI-22 Workshop on Information-Theoretic Methods for Causal Inference and Discovery

arXiv:2102.09544 [pdf, ps, other]

Combinatorial optimization and reasoning with graph neural networks

Authors: Quentin Cappart, Didier Chételat, Elias Khalil, Andrea Lodi, Christopher Morris, Petar Veličković

Abstract: Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning, especially graph neural networks (GNNs), as a key building block for combinatorial tasks, either directly as solvers or by enhancing exact solvers. The inductive bias of GNNs effectively encodes combinatorial and relational input due to their invariance to permutations and awareness of input sparsity. This paper presents a conceptual review of recent key advancements in this emerging field, aiming at optimization and machine learning researchers.

Submitted 23 September, 2022; v1 submitted 18 February, 2021; originally announced February 2021.

Journal ref: Journal of Machine Learning Research, 24(130):1-61, 2023

arXiv:2102.09193 [pdf, other]

SeaPearl: A Constraint Programming Solver guided by Reinforcement Learning

Authors: Félix Chalumeau, Ilan Coulon, Quentin Cappart, Louis-Martin Rousseau

Abstract: The design of efficient and generic algorithms for solving combinatorial optimization problems has been an active field of research for many years. Standard exact solving approaches are based on a clever and complete enumeration of the solution set. A critical and non-trivial design choice with such methods is the branching strategy, directing how the search is performed. The last decade has shown an increasing interest in the design of machine learning-based heuristics to solve combinatorial optimization problems. The goal is to leverage knowledge from historical data to solve similar new instances of a problem. Used alone, such heuristics are only able to provide approximate solutions efficiently, but cannot prove optimality nor bounds on their solution. Recent works have shown that reinforcement learning can be successfully used for driving the search phase of constraint programming (CP) solvers. However, it has also been shown that this hybridization is challenging to build, as standard CP frameworks do not natively include machine learning mechanisms, leading to some sources of inefficiencies. This paper presents the proof of concept for SeaPearl, a new CP solver implemented in Julia, that supports machine learning routines in order to learn branching decisions using reinforcement learning. Support for modeling the learning component is also provided. We illustrate the modeling and solution performance of this new solver on two problems. Although not yet competitive with industrial solvers, SeaPearl aims to provide a flexible and open-source framework in order to facilitate future research in the hybridization of constraint programming and machine learning.

Submitted 20 April, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

arXiv:2006.07054 [pdf, other]

doi 10.4230/LIPIcs.CP.2021.33

Learning the Travelling Salesperson Problem Requires Rethinking Generalization

Authors: Chaitanya K. Joshi, Quentin Cappart, Louis-Martin Rousseau, Thomas Laurent

Abstract: End-to-end training of neural network solvers for graph combinatorial optimization problems such as the Travelling Salesperson Problem (TSP) has seen a surge of interest recently, but remains intractable and inefficient beyond graphs with a few hundred nodes. While state-of-the-art learning-driven approaches for TSP perform closely to classical solvers when trained on trivially small sizes, they are unable to generalize the learnt policy to larger instances at practical scales. This work presents an end-to-end neural combinatorial optimization pipeline that unifies several recent papers in order to identify the inductive biases, model architectures and learning algorithms that promote generalization to instances larger than those seen in training. Our controlled experiments provide the first principled investigation into such zero-shot generalization, revealing that extrapolating beyond training data requires rethinking the neural combinatorial optimization pipeline, from network layers and learning paradigms to evaluation protocols. Additionally, we analyze recent advances in deep learning for routing problems through the lens of our pipeline and provide new directions to stimulate future research.

Submitted 25 May, 2022; v1 submitted 12 June, 2020; originally announced June 2020.

Comments: Accepted to the 27th International Conference on Principles and Practice of Constraint Programming (CP 2021) and Constraints (2022). Code and data available at https://github.com/chaitjo/learning-tsp

arXiv:2006.01610 [pdf, other]

Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization

Authors: Quentin Cappart, Thierry Moisan, Louis-Martin Rousseau, Isabeau Prémont-Schwarz, Andre Cire

Abstract: Combinatorial optimization has found applications in numerous fields, from aerospace to transportation planning and economics. The goal is to find an optimal solution among a finite set of possibilities. The well-known challenge one faces with combinatorial optimization is the state-space explosion problem: the number of possibilities grows exponentially with the problem size, which makes solving intractable for large problems. In recent years, deep reinforcement learning (DRL) has shown its promise for designing good heuristics dedicated to solving NP-hard combinatorial optimization problems. However, current approaches have two shortcomings: (1) they mainly focus on the standard travelling salesman problem and they cannot be easily extended to other problems, and (2) they only provide an approximate solution with no systematic ways to improve it or to prove optimality. In another context, constraint programming (CP) is a generic tool to solve combinatorial optimization problems. Based on a complete search procedure, it will always find the optimal solution if we allow an execution time large enough. A critical design choice that makes CP non-trivial to use in practice is the branching decision, directing how the search space is explored. In this work, we propose a general and hybrid approach, based on DRL and CP, for solving combinatorial optimization problems. The core of our approach is based on a dynamic programming formulation that acts as a bridge between both techniques. We experimentally show that our solver is efficient at solving two challenging problems: the traveling salesman problem with time windows, and the 4-moments portfolio optimization problem. Results obtained show that the framework introduced outperforms the stand-alone RL and CP solutions, while being competitive with industrial solvers.

Submitted 2 June, 2020; originally announced June 2020.

arXiv:1909.13121 [pdf, other]

How to Evaluate Machine Learning Approaches for Combinatorial Optimization: Application to the Travelling Salesman Problem

Authors: Antoine François, Quentin Cappart, Louis-Martin Rousseau

Abstract: Combinatorial optimization is the field devoted to the study and practice of algorithms that solve NP-hard problems. As Machine Learning (ML) and deep learning have popularized, several research groups have started to use ML to solve combinatorial optimization problems, such as the well-known Travelling Salesman Problem (TSP). Based on deep (reinforcement) learning, new models and architecture for the TSP have been successively developed and have gained increasing performances. At the time of writing, state-of-the-art models provide solutions to TSP instances of 100 cities that are roughly 1.33% away from optimal solutions. However, despite these apparently positive results, the performances remain far from those that can be achieved using a specialized search procedure. In this paper, we address the limitations of ML approaches for solving the TSP and investigate two fundamental questions: (1) how can we measure the level of accuracy of the pure ML component of such methods; and (2) what is the impact of a search procedure plugged inside a ML model on the performances? To answer these questions, we propose a new metric, ratio of optimal decisions (ROD), based on a fair comparison with a parametrized oracle, mimicking a ML model with a controlled accuracy. All the experiments are carried out on four state-of-the-art ML approaches dedicated to solve the TSP. Finally, we made ROD open-source in order to ease future research in the field.

Submitted 28 September, 2019; originally announced September 2019.

arXiv:1809.03359 [pdf, other]

Improving Optimization Bounds using Machine Learning: Decision Diagrams meet Deep Reinforcement Learning

Authors: Quentin Cappart, Emmanuel Goutierre, David Bergman, Louis-Martin Rousseau

Abstract: Finding tight bounds on the optimal solution is a critical element of practical solution methods for discrete optimization problems. In the last decade, decision diagrams (DDs) have brought a new perspective on obtaining upper and lower bounds that can be significantly better than classical bounding mechanisms, such as linear relaxations. It is well known that the quality of the bounds achieved through this flexible bounding method is highly reliant on the ordering of variables chosen for building the diagram, and finding an ordering that optimizes standard metrics is an NP-hard problem. In this paper, we propose an innovative and generic approach based on deep reinforcement learning for obtaining an ordering for tightening the bounds obtained with relaxed and restricted DDs. We apply the approach to both the Maximum Independent Set Problem and the Maximum Cut Problem. Experimental results on synthetic instances show that the deep reinforcement learning approach, by achieving tighter objective function bounds, generally outperforms ordering methods commonly used in the literature when the distribution of instances is known. To the best knowledge of the authors, this is the first paper to apply machine learning to directly improve relaxation bounds obtained by general-purpose bounding mechanisms for combinatorial optimization problems.

Submitted 27 February, 2019; v1 submitted 10 September, 2018; originally announced September 2018.

Comments: Accepted and presented at AAAI'19

arXiv:1605.06245 [pdf, other]

Verification of railway interlocking - Compositional approach with OCRA

Authors: Christophe Limbree, Quentin Cappart, Charles Pecheur, Stefano Tonetta

Abstract: In the railway domain, an electronic interlocking is a computerised system that controls the railway signalling components (e.g. switches or signals) in order to allow a safe operation of the train traffic. Interlockings are controlled by a software logic that relies on a generic software and a set of application data particular to the station under control. The verification of the application data is time consuming and error prone as it is mostly performed by human testers. In the first stage of our research, we built a model of a small Belgian railway station and we performed the verification of the application data with the NuSMV model checker. However, the verification of larger stations fails due to the state space explosion problem. The intuition is that large stations can be split into smaller components that can be verified separately. This concept is known as compositional verification. This article explains how we used the OCRA tool in order to model a medium size station and how we verified safety properties by means of contracts. We also took advantage of new algorithms (k-liveness and IC3) recently implemented in nuXmv in order to verify LTL properties on our model.

Submitted 20 May, 2016; originally announced May 2016.

Comments: 16 pages

arXiv:1605.02529 [pdf, other]

doi 10.1109/HASE.2017.10

Verification of interlocking systems using statistical model checking

Authors: Quentin Cappart, Christophe Limbree, Pierre Schaus, Jean Quilbeuf, Louis-Marie Traonouez, Axel Legay

Abstract: In the railway domain, an interlocking is the system ensuring safe train traffic inside a station by controlling its active elements such as the signals or points. Modern interlockings are configured using particular data, called application data, reflecting the track layout and defining the actions that the interlocking can take. The safety of the train traffic relies thereby on application data correctness; errors inside them can cause safety issues such as derailments or collisions. Given the high level of safety required by such a system, its verification is a critical concern. In addition to the safety, an interlocking must also ensure that availability properties, stating that no train would be stopped forever in a station, are satisfied. Most of the research dealing with this verification relies on model checking. However, due to the state space explosion problem, this approach does not scale for large stations. More recently, a discrete event simulation approach, limiting the verification to a set of likely scenarios, was proposed. The simulation enables the verification of larger stations, but with no proof that all the interesting scenarios are covered by the simulation. In this paper, we apply an intermediate statistical model checking approach, offering both the advantages of model checking and simulation. Even if exhaustiveness is not obtained, statistical model checking evaluates with a parametrizable confidence the reliability and the availability of the entire system.

Submitted 4 August, 2017; v1 submitted 9 May, 2016; originally announced May 2016.

Comments: 12 pages, 3 figures, 2 tables

Journal ref: IEEE 18th International Symposium on High Assurance Systems Engineering (HASE), Singapore, pp. 61-68 (2017)

arXiv:1506.03554 [pdf, other]

doi 10.4204/EPTCS.184.2

Verification of railway interlocking systems

Authors: Simon Busard, Quentin Cappart, Christophe Limbrée, Charles Pecheur, Pierre Schaus

Abstract: In the railway domain, an interlocking is a computerised system that controls the railway signalling objects in order to allow a safe operation of the train traffic. Each interlocking makes use of particular data, called application data, that reflects the track layout of the station under control. The verification and validation of the application data are performed manually and are thus error-prone and costly. In this paper, we explain how we built an executable model in NuSMV of a railway interlocking based on the application data. We also detail the tool that we have developed in order to translate the application data into our model automatically. Finally, we show how we could verify a realistic set of safety properties on a real-size station model by customizing the existing model-checking algorithm with PyNuSMV, a Python library based on NuSMV.

Submitted 11 June, 2015; originally announced June 2015.

Comments: In Proceedings ESSS 2015, arXiv:1506.03250

Journal ref: EPTCS 184, 2015, pp. 19-31

Showing 1–21 of 21 results for author: Cappart, Q