-
A Knowledge Compilation Take on Binary Polynomial Optimization
Authors:
Florent Capelli,
Alberto Del Pia,
Silvia Di Gregorio
Abstract:
The Binary Polynomial Optimization (BPO) problem is defined as the problem of maximizing a given polynomial function over all binary points. The main contribution of this paper is to draw a novel connection between BPO and the problem of finding the maximal assignment for a Boolean function with weights on variables. This connection allows us to give a strongly polynomial algorithm that solves BPO with a hypergraph that is either $β$-acyclic or with bounded incidence treewidth. This result unifies and significantly extends the known tractable classes of BPO. The generality of our technique allows us to deal also with extensions of BPO, where we enforce extended cardinality constraints on the set of binary points, and where we seek $k$ best feasible solutions. We also extend our results to the significantly more general problem where variables are replaced by literals. Preliminary computational results show that the resulting algorithms can be significantly faster than current state-of-the-art.
Submitted 31 October, 2023;
originally announced November 2023.
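As an illustration of the problem statement in the abstract above, BPO can be written as maximizing $\sum_{e} c_e \prod_{v \in e} x_v$ over $x \in \{0,1\}^n$, where each monomial corresponds to a hyperedge $e$. The sketch below is a plain exponential-time enumeration for tiny instances, not the strongly polynomial algorithm of the paper; the function name `bpo_brute_force`, the `terms` dictionary layout, and the example coefficients are assumptions made only for this illustration.

```python
from itertools import product
from math import prod


def bpo_brute_force(n, terms):
    """Maximize sum_e c_e * prod_{v in e} x_v over all x in {0,1}^n.

    `terms` maps a frozenset of variable indices (one hyperedge per
    monomial) to its coefficient. This is exhaustive enumeration and
    takes time exponential in n; it only illustrates the definition.
    """
    best_value, best_point = None, None
    for x in product((0, 1), repeat=n):
        value = sum(c * prod(x[v] for v in e) for e, c in terms.items())
        if best_value is None or value > best_value:
            best_value, best_point = value, x
    return best_value, best_point


# Hypothetical instance: f(x) = 2*x0 - x1 - 3*x0*x1 + 4*x1*x2 - 2*x0*x1*x2
terms = {
    frozenset({0}): 2,
    frozenset({1}): -1,
    frozenset({0, 1}): -3,
    frozenset({1, 2}): 4,
    frozenset({0, 1, 2}): -2,
}
best_value, best_point = bpo_brute_force(3, terms)
print(best_value, best_point)
```

The keys of `terms` are exactly the hyperedges whose structure (β-acyclicity, incidence treewidth) the abstract refers to.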
-
Partial Optimality in Cubic Correlation Clustering
Authors:
David Stein,
Silvia Di Gregorio,
Bjoern Andres
Abstract:
The higher-order correlation clustering problem is an expressive model, and recently, local search heuristics have been proposed for several applications. Certifying optimality, however, is NP-hard and practically hampered already by the complexity of the problem statement. Here, we focus on establishing partial optimality conditions for the special case of complete graphs and cubic objective functions. In addition, we define and implement algorithms for testing these conditions and examine their effect numerically, on two datasets.
Submitted 31 March, 2023; v1 submitted 9 February, 2023;
originally announced February 2023.
-
A Polyhedral Study of Lifted Multicuts
Authors:
Bjoern Andres,
Silvia Di Gregorio,
Jannik Irmai,
Jan-Hendrik Lange
Abstract:
Fundamental to many applications in data analysis are the decompositions of a graph, i.e. partitions of the node set into component-inducing subsets. One way of encoding decompositions is by multicuts, the subsets of those edges that straddle distinct components. Recently, a lifting of multicuts from a graph $G = (V, E)$ to an augmented graph $\hat G = (V, E \cup F)$ has been proposed in the field of image analysis, with the goal of obtaining a more expressive characterization of graph decompositions in which it is made explicit also for pairs $F \subseteq \tbinom{V}{2} \setminus E$ of non-neighboring nodes whether these are in the same or distinct components. In this work, we study in detail the polytope in $\mathbb{R}^{E \cup F}$ whose vertices are precisely the characteristic vectors of multicuts of $\hat G$ lifted from $G$, connecting it, in particular, to the rich body of prior work on the clique partitioning and multilinear polytope.
Submitted 16 February, 2022;
originally announced February 2022.
-
SIMDRAM: An End-to-End Framework for Bit-Serial SIMD Computing in DRAM
Authors:
Nastaran Hajinazar,
Geraldo F. Oliveira,
Sven Gregorio,
João Ferreira,
Nika Mansouri Ghiasi,
Minesh Patel,
Mohammed Alser,
Saugata Ghose,
Juan Gómez Luna,
Onur Mutlu
Abstract:
Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic operations, addition). However, in order to enable full adoption of processing-using-DRAM, it is necessary to provide support for more complex operations. In this paper, we propose SIMDRAM, a flexible general-purpose processing-using-DRAM framework that (1) enables the efficient implementation of complex operations, and (2) provides a flexible mechanism to support the implementation of arbitrary user-defined operations. The SIMDRAM framework comprises three key steps. The first step builds an efficient MAJ/NOT representation of a given desired operation. The second step allocates DRAM rows that are reserved for computation to the operation's input and output operands, and generates the required sequence of DRAM commands to perform the MAJ/NOT implementation of the desired operation in DRAM. The third step uses the SIMDRAM control unit located inside the memory controller to manage the computation of the operation from start to end, by executing the DRAM commands generated in the second step of the framework. We design the hardware and ISA support for SIMDRAM framework to (1) address key system integration challenges, and (2) allow programmers to employ new SIMDRAM operations without hardware changes.
We evaluate SIMDRAM for reliability, area overhead, throughput, and energy efficiency using a wide range of operations and seven real-world applications to demonstrate SIMDRAM's generality. Using 16 DRAM banks, SIMDRAM provides (1) 88x and 5.8x the throughput, and 257x and 31x the energy efficiency, of a CPU and a high-end GPU, respectively, over 16 operations; (2) 21x and 2.1x the performance of the CPU and GPU, over seven real-world applications. SIMDRAM incurs an area overhead of only 0.2% in a high-end CPU.
Submitted 30 June, 2021; v1 submitted 26 May, 2021;
originally announced May 2021.
-
SIMDRAM: A Framework for Bit-Serial SIMD Processing Using DRAM
Authors:
Nastaran Hajinazar,
Geraldo F. Oliveira,
Sven Gregorio,
João Dinis Ferreira,
Nika Mansouri Ghiasi,
Minesh Patel,
Mohammed Alser,
Saugata Ghose,
Juan Gómez-Luna,
Onur Mutlu
Abstract:
Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic operations, addition). However, in order to enable the full adoption of processing-using-DRAM, it is necessary to provide support for more complex operations. In this paper, we propose SIMDRAM, a flexible general-purpose processing-using-DRAM framework that enables massively-parallel computation of a wide range of operations by using each DRAM column as an independent SIMD lane to perform bit-serial operations. SIMDRAM consists of three key steps to enable a desired operation in DRAM: (1) building an efficient majority-based representation of the desired operation, (2) mapping the operation input and output operands to DRAM rows and to the required DRAM commands that produce the desired operation, and (3) executing the operation. These three steps ensure efficient computation of any arbitrary and complex operation in DRAM. The first two steps give users the flexibility to efficiently implement and compute any desired operation in DRAM. The third step controls the execution flow of the in-DRAM computation, transparently from the user. We comprehensively evaluate SIMDRAM's reliability, area overhead, operation throughput, and energy efficiency using a wide range of operations and seven diverse real-world kernels to demonstrate its generality. Our results show that SIMDRAM provides up to 5.1x higher operation throughput and 2.5x higher energy efficiency than a state-of-the-art in-DRAM computing mechanism, and up to 2.5x speedup for real-world kernels while incurring less than 1% DRAM chip area overhead. Compared to a CPU and a high-end GPU, SIMDRAM is 257x and 31x more energy-efficient, while providing 93x and 6x higher operation throughput, respectively.
Submitted 22 December, 2020;
originally announced December 2020.
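The bit-serial, majority-based idea described in the abstract above can be modeled in software. The sketch below is not SIMDRAM's implementation: it treats each "DRAM row" as a Python integer whose bit j belongs to SIMD lane j, stores operands vertically (row i holds bit i of every lane's operand), and uses a standard full-adder identity built only from MAJ and NOT. The helper names (`maj`, `bit_serial_add`, `to_rows`, `from_rows`) and the row/lane encoding are assumptions made for this illustration.

```python
def maj(a, b, c):
    """Bitwise 3-input majority, applied to every lane at once."""
    return (a & b) | (a & c) | (b & c)


def bit_serial_add(a_rows, b_rows, lanes):
    """Add two vertically stored operands one bit position at a time.

    a_rows[i] and b_rows[i] hold bit i of the operand in every lane.
    Uses only MAJ and NOT:
      carry_out = MAJ(a, b, carry_in)
      sum       = MAJ(NOT carry_out, MAJ(a, b, NOT carry_in), carry_in)
    """
    mask = (1 << lanes) - 1  # keeps NOT confined to the modeled lanes
    carry = 0
    sum_rows = []
    for a, b in zip(a_rows, b_rows):
        carry_out = maj(a, b, carry)
        s = maj(~carry_out & mask, maj(a, b, ~carry & mask), carry)
        sum_rows.append(s)
        carry = carry_out
    sum_rows.append(carry)  # final carry becomes the top result bit
    return sum_rows


def to_rows(values, width):
    """Transpose per-lane integers into one row per bit position."""
    return [sum(((v >> i) & 1) << j for j, v in enumerate(values))
            for i in range(width)]


def from_rows(rows, lanes):
    """Transpose rows back into one integer per lane."""
    return [sum(((r >> j) & 1) << i for i, r in enumerate(rows))
            for j in range(lanes)]


a = [3, 5, 7, 0]  # one 3-bit operand per lane
b = [1, 6, 7, 2]
result = from_rows(bit_serial_add(to_rows(a, 3), to_rows(b, 3), 4), 4)
print(result)
```

Each loop iteration processes one bit position for all lanes simultaneously, which is the sense in which the latency depends on operand width while the parallelism depends on the number of columns.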
-
On the complexity of binary polynomial optimization over acyclic hypergraphs
Authors:
Alberto Del Pia,
Silvia Di Gregorio
Abstract:
In this work we advance the understanding of the fundamental limits of computation for Binary Polynomial Optimization (BPO), which is the problem of maximizing a given polynomial function over all binary points. In our main result we provide a novel class of BPO that can be solved efficiently both from a theoretical and computational perspective. In fact, we give a strongly polynomial-time algorithm for instances whose corresponding hypergraph is beta-acyclic. We note that the beta-acyclicity assumption is natural in several applications including relational database schemes and the lifted multicut problem on trees. Due to the novelty of our proving technique, we obtain an algorithm which is interesting also from a practical viewpoint. This is because our algorithm is very simple to implement and the running time is a polynomial of very low degree in the number of nodes and edges of the hypergraph. Our result completely settles the computational complexity of BPO over acyclic hypergraphs, since the problem is NP-hard on alpha-acyclic instances. Our algorithm can also be applied to any general BPO problem that contains beta-cycles. For these problems, the algorithm returns a smaller instance together with a rule to extend any optimal solution of the smaller instance to an optimal solution of the original instance.
Submitted 14 December, 2022; v1 submitted 11 July, 2020;
originally announced July 2020.