-
MurTree: Optimal Classification Trees via Dynamic Programming and Search
Authors:
Emir Demirović,
Anna Lukina,
Emmanuel Hebrard,
Jeffrey Chan,
James Bailey,
Christopher Leckie,
Kotagiri Ramamohanarao,
Peter J. Stuckey
Abstract:
Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy and size. In recent years, this motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and number of nodes. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally been plagued by high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances, providing several orders of magnitude improvements and notably contributing towards the practical realisation of optimal decision trees.
Submitted 28 June, 2022; v1 submitted 24 July, 2020;
originally announced July 2020.
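
To make the contrast between global optimisation and greedy heuristics concrete, the following is a minimal sketch of a naive exhaustive recursion for depth-bounded optimal trees on binary features and binary labels. It is not the MurTree algorithm and uses none of its specialised techniques; the names `leaf_error` and `optimal_error` and the data layout are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical data layout: (binary feature vector, binary label).
Instance = Tuple[Tuple[int, ...], int]


def leaf_error(data: List[Instance]) -> int:
    """Misclassifications made by the best single-label leaf."""
    positives = sum(label for _, label in data)
    return min(positives, len(data) - positives)


def optimal_error(data: List[Instance], depth: int) -> int:
    """Minimum misclassifications over all trees of depth at most `depth`."""
    best = leaf_error(data)
    if depth == 0 or best == 0:
        return best
    num_features = len(data[0][0])
    for f in range(num_features):
        left = [d for d in data if d[0][f] == 0]
        right = [d for d in data if d[0][f] == 1]
        if not left or not right:
            continue  # this feature does not split the data
        # The two subtrees are independent, so their optima simply add up.
        cost = optimal_error(left, depth - 1) + optimal_error(right, depth - 1)
        best = min(best, cost)
    return best


# XOR labels: no single split helps, so a purely greedy choice sees no gain.
xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(optimal_error(xor, 1), optimal_error(xor, 2))
```

On the XOR data every depth-1 tree still misclassifies two instances, while a depth-2 tree is perfect, which is the kind of case where a sequence of locally optimal decisions can miss the global optimum.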
-
Partial Queries for Constraint Acquisition
Authors:
Christian Bessiere,
Clement Carbonnel,
Anton Dries,
Emmanuel Hebrard,
George Katsirelos,
Nadjib Lazaar,
Nina Narodytska,
Claude-Guy Quimper,
Kostas Stergiou,
Dimosthenis C. Tsouros,
Toby Walsh
Abstract:
Learning constraint networks is known to require a number of membership queries exponential in the number of variables. In this paper, we learn constraint networks by asking the user partial queries. That is, we ask the user to classify assignments to subsets of the variables as positive or negative. We provide an algorithm, called QUACQ, that, given a negative example, focuses onto a constraint of the target network in a number of queries logarithmic in the size of the example. The whole constraint network can then be learned with a polynomial number of partial queries. We give information theoretic lower bounds for learning some simple classes of constraint networks and show that our generic algorithm is optimal in some cases.
Submitted 12 October, 2021; v1 submitted 14 March, 2020;
originally announced March 2020.
-
Propagation via Kernelization: The Vertex Cover Constraint
Authors:
Clément Carbonnel,
Emmanuel Hébrard
Abstract:
The technique of kernelization consists in extracting, from an instance of a problem, an essentially equivalent instance whose size is bounded in a parameter k. Besides being the basis for efficient parameterized algorithms, this method also provides a wealth of information to reason about in the context of constraint programming. We study the use of kernelization for designing propagators through the example of the Vertex Cover constraint. Since the classic kernelization rules often correspond to dominance rather than consistency, we introduce the notion of "loss-less" kernel. While our preliminary experimental results show the potential of the approach, they also show some of its limits. In particular, this method is more effective for vertex covers of large and sparse graphs, as they tend to have, relatively, smaller kernels.
Submitted 7 February, 2017;
originally announced February 2017.
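
As an illustration of what a kernelization rule looks like, here is a minimal sketch of the classic high-degree (Buss) rule for Vertex Cover with budget k. This is the textbook rule, not the "loss-less" kernel introduced in the paper; the function name `buss_kernel` and the edge-list input format are assumptions, and the graph is assumed to be simple (no self-loops).

```python
from typing import Iterable, Optional, Set, Tuple, FrozenSet

Edge = FrozenSet[int]


def buss_kernel(
    edges: Iterable[Tuple[int, int]], k: int
) -> Optional[Tuple[Set[int], Set[Edge], int]]:
    """Return (forced vertices, kernel edges, remaining budget),
    or None when no vertex cover of size at most k can exist."""
    remaining: Set[Edge] = {frozenset(e) for e in edges}
    forced: Set[int] = set()
    while True:
        budget = k - len(forced)
        if budget < 0:
            return None
        degree = {}
        for e in remaining:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # A vertex with more than `budget` neighbours must be in the cover,
        # otherwise all of its neighbours would be, exceeding the budget.
        high = next((v for v, d in degree.items() if d > budget), None)
        if high is None:
            break
        forced.add(high)
        remaining = {e for e in remaining if high not in e}
    # Each remaining vertex covers at most `budget` edges.
    if len(remaining) > budget * budget:
        return None
    return forced, remaining, budget


star = [(0, leaf) for leaf in range(1, 6)]
print(buss_kernel(star, 1))
```

After the rule is exhausted, the residual graph has at most k² edges, which is the sense in which the kernel size is bounded in the parameter.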
-
Constraint programming for planning test campaigns of communications satellites
Authors:
Emmanuel Hébrard,
Marie-José Huguet,
Daniel Veysseire,
Ludivine Sauvan,
Bertrand Cabon
Abstract:
The payload of communications satellites must go through a series of tests to assert their ability to survive in space. Each test involves some equipment of the payload to be active, which has an impact on the temperature of the payload. Sequencing these tests in a way that ensures the thermal stability of the payload and minimizes the overall duration of the test campaign is a very important objective for satellite manufacturers. The problem can be decomposed into two sub-problems corresponding to two objectives. First, the number of distinct configurations necessary to run the tests must be minimized. This can be modeled as packing the tests into configurations, and we introduce a set of implied constraints to improve the lower bound of the model. Second, tests must be sequenced so that the number of times an equipment unit has to be switched on or off is minimized. We model this aspect using the constraint Switch, where a buffer with limited capacity represents the currently active equipment units, and we introduce an improvement of the propagation algorithm for this constraint. We then introduce a search strategy in which we sequentially solve the sub-problems (packing and sequencing). Experiments conducted on real and random instances show the respective interest of our contributions.
Submitted 23 January, 2017;
originally announced January 2017.
-
On Backdoors To Tractable Constraint Languages
Authors:
Clement Carbonnel,
Martin C. Cooper,
Emmanuel Hebrard
Abstract:
In the context of CSPs, a strong backdoor is a subset of variables such that every complete assignment yields a residual instance guaranteed to have a specified property. If the property allows efficient solving, then a small strong backdoor provides a reasonable decomposition of the original instance into easy instances. An important challenge is the design of algorithms that can quickly find a small strong backdoor if one exists. We present a systematic study of the parameterized complexity of backdoor detection when the target property is a restricted type of constraint language defined by means of a family of polymorphisms. In particular, we show that under the weak assumption that the polymorphisms are idempotent, the problem is unlikely to be FPT when the parameter is either r (the constraint arity) or k (the size of the backdoor) unless P = NP or FPT = W[2]. When the parameter is k+r, however, we are able to identify large classes of languages for which the problem of finding a small backdoor is FPT.
Submitted 10 October, 2014; v1 submitted 14 April, 2014;
originally announced April 2014.
-
Soft Constraints of Difference and Equality
Authors:
Emmanuel Hebrard,
Dániel Marx,
Barry O'Sullivan,
Igor Razgon
Abstract:
In many combinatorial problems one may need to model the diversity or similarity of assignments in a solution. For example, one may wish to maximise or minimise the number of distinct values in a solution. To formulate problems of this type, we can use soft variants of the well-known AllDifferent and AllEqual constraints. We present a taxonomy of six soft global constraints, generated by combining the two latter ones and the two standard cost functions, which are either maximised or minimised. We characterise the complexity of achieving arc and bounds consistency on these constraints, resolving those cases for which NP-hardness was neither proven nor disproven. In particular, we explore in depth the constraint ensuring that at least k pairs of variables have a common value. We show that achieving arc consistency is NP-hard; however, achieving bounds consistency can be done in polynomial time through dynamic programming. Moreover, we show that the maximum number of pairs of equal variables can be approximated by a factor 1/2 with a linear time greedy algorithm. Finally, we provide a fixed parameter tractable algorithm with respect to the number of values appearing in more than two distinct domains. Interestingly, this taxonomy shows that enforcing equality is harder than enforcing difference.
Submitted 16 January, 2014;
originally announced January 2014.
-
Models and Strategies for Variants of the Job Shop Scheduling Problem
Authors:
Diarmuid Grimes,
Emmanuel Hebrard
Abstract:
Recently, a variety of constraint programming and Boolean satisfiability approaches to scheduling problems have been introduced. They have in common the use of relatively simple propagation mechanisms and an adaptive way to focus on the most constrained part of the problem. In some cases, these methods compare favorably to more classical constraint programming methods relying on propagation algorithms for global unary or cumulative resource constraints and dedicated search heuristics. In particular, we described an approach that combines restarting with a generic adaptive heuristic and solution-guided branching on a simple model based on a decomposition of disjunctive constraints. In this paper, we introduce an adaptation of this technique for an important subclass of job shop scheduling problems (JSPs), where the objective function involves minimization of earliness/tardiness costs. We further show that our technique can be improved by adding domain-specific information for one variant of the JSP (involving time lag constraints). In particular, we introduce a dedicated greedy heuristic, and an improved model for the case where the maximal time lag is 0 (also referred to as no-wait JSPs).
Submitted 27 September, 2011;
originally announced September 2011.
-
The Complexity of Reasoning with Global Constraints
Authors:
Christian Bessiere,
Emmanuel Hebrard,
Brahim Hnich,
Toby Walsh
Abstract:
Constraint propagation is one of the techniques central to the success of constraint programming. To reduce search, fast algorithms associated with each constraint prune the domains of variables. With global (or non-binary) constraints, the cost of such propagation may be much greater than the quadratic cost for binary constraints. We therefore study the computational complexity of reasoning with global constraints. We first characterise a number of important questions related to constraint propagation. We show that such questions are intractable in general, and identify dependencies between the tractability and intractability of the different questions. We then demonstrate how the tools of computational complexity can be used in the design and analysis of specific global constraints. In particular, we illustrate how computational complexity can be used to determine when a lesser level of local consistency should be enforced, when constraints can be safely generalized, when decomposing constraints will reduce the amount of pruning, and when combining constraints is tractable.
Submitted 5 March, 2009;
originally announced March 2009.
-
SLIDE: A Useful Special Case of the CARDPATH Constraint
Authors:
Christian Bessiere,
Emmanuel Hebrard,
Brahim Hnich,
Zeynep Kiziltan,
Toby Walsh
Abstract:
We study the CardPath constraint. This ensures a given constraint holds a number of times down a sequence of variables. We show that SLIDE, a special case of CardPath where the slid constraint must hold always, can be used to encode a wide range of sliding sequence constraints including CardPath itself. We consider how to propagate SLIDE and provide a complete propagator for CardPath. Since propagation is NP-hard in general, we identify special cases where propagation takes polynomial time. Our experiments demonstrate that using SLIDE to encode global constraints can be as efficient and effective as specialised propagators.
Submitted 3 March, 2009;
originally announced March 2009.
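
The relationship between the two constraints can be sketched as a pair of satisfaction checkers over a fully assigned sequence. This is only a semantic illustration, not a propagator; the window width `k`, the function names, and the reading of CardPath as "holds exactly n times" are assumptions made for this sketch.

```python
from typing import Callable, Sequence


def window_hits(constraint: Callable[..., bool], xs: Sequence[int], k: int) -> int:
    """Count the windows of k consecutive values on which `constraint` holds."""
    return sum(1 for i in range(len(xs) - k + 1) if constraint(*xs[i:i + k]))


def card_path(n: int, xs: Sequence[int], constraint: Callable[..., bool], k: int) -> bool:
    """Assumed semantics: the constraint holds on exactly n windows."""
    return window_hits(constraint, xs, k) == n


def slide(constraint: Callable[..., bool], xs: Sequence[int], k: int) -> bool:
    """The slid constraint must hold on every window."""
    return window_hits(constraint, xs, k) == max(0, len(xs) - k + 1)


def increasing(a: int, b: int) -> bool:
    return a < b


print(slide(increasing, [1, 2, 3, 4], 2))      # every adjacent pair increases
print(card_path(1, [1, 3, 2], increasing, 2))  # exactly one increasing pair
```

Under these assumptions, SLIDE is the case of CardPath in which the count equals the number of windows.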
-
The Parameterized Complexity of Global Constraints
Authors:
Christian Bessiere,
Emmanuel Hebrard,
Brahim Hnich,
Zeynep Kiziltan,
Toby Walsh
Abstract:
We argue that parameterized complexity is a useful tool with which to study global constraints. In particular, we show that many global constraints which are intractable to propagate completely have natural parameters which make them fixed-parameter tractable and which are easy to compute. This tractability tends either to be the result of a simple dynamic program or of a decomposition which has a strong backdoor of bounded size. This strong backdoor is often a cycle cutset. We also show that parameterized complexity can be used to study other aspects of constraint programming like symmetry breaking. For instance, we prove that value symmetry is fixed-parameter tractable to break in the number of symmetries. Finally, we argue that parameterized complexity can be used to derive results about the approximability of constraint propagation.
Submitted 3 March, 2009;
originally announced March 2009.
-
Range and Roots: Two Common Patterns for Specifying and Propagating Counting and Occurrence Constraints
Authors:
Christian Bessiere,
Emmanuel Hebrard,
Brahim Hnich,
Zeynep Kiziltan,
Toby Walsh
Abstract:
We propose Range and Roots which are two common patterns useful for specifying a wide range of counting and occurrence constraints. We design specialised propagation algorithms for these two patterns. Counting and occurrence constraints specified using these patterns thus directly inherit a propagation algorithm. To illustrate the capabilities of the Range and Roots constraints, we specify a number of global constraints taken from the literature. Preliminary experiments demonstrate that propagating counting and occurrence constraints using these two patterns leads to a small loss in performance when compared to specialised global constraints and is competitive with alternative decompositions using elementary constraints.
Submitted 2 March, 2009;
originally announced March 2009.