Approximating the Maximum Independent Set of Convex Polygons with a Bounded Number of Directions

Authors: Fabrizio Grandoni, Edin Husić, Mathieu Mari, Antoine Tinguely

Abstract: In the maximum independent set of convex polygons problem, we are given a set of $n$ convex polygons in the plane with the objective of selecting a maximum cardinality subset of non-overlapping polygons. Here we study a special case of the problem where the edges of the polygons can take at most $d$ fixed directions. We present an $8d/3$-approximation algorithm for this problem running in time $O((nd)^{O(d4^d)})$. The previous-best polynomial-time approximation (for constant $d$) was a classical $n^\varepsilon$-approximation by Fox and Pach [SODA'11] that has recently been improved to an $OPT^{\varepsilon}$-approximation algorithm by Cslovjecsek, Pilipczuk and Węgrzycki [SODA'24], which also extends to an arbitrary set of convex polygons. Our result builds on and generalizes the recent constant factor approximation algorithms for the maximum independent set of axis-parallel rectangles problem (which is a special case of our problem with $d=2$) by Mitchell [FOCS'21] and Gálvez, Khan, Mari, Mömke, Reddy, and Wiese [SODA'22].

Submitted 12 February, 2024; originally announced February 2024.

Comments: To appear at SoCG 2024

ACM Class: F.2.2

arXiv:2212.07492 [pdf, other]

Machine Learning Coarse-Grained Potentials of Protein Thermodynamics

Authors: Maciej Majewski, Adrià Pérez, Philipp Thölke, Stefan Doerr, Nicholas E. Charron, Toni Giorgino, Brooke E. Husic, Cecilia Clementi, Frank Noé, Gianni De Fabritiis

Abstract: A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.

Submitted 14 December, 2022; originally announced December 2022.

arXiv:2211.03883 [pdf, ps, other]

Approximating Nash Social Welfare by Matching and Local Search

Authors: Jugal Garg, Edin Husić, Wenzheng Li, László A. Végh, Jan Vondrák

Abstract: For any $\varepsilon>0$, we give a simple, deterministic $(4+\varepsilon)$-approximation algorithm for the Nash social welfare (NSW) problem under submodular valuations. The previous best approximation factor was $380$ via a randomized algorithm. We also consider the asymmetric variant of the problem, where the objective is to maximize the weighted geometric mean of agents' valuations, and give an $(\omega + 2 + \varepsilon) e$-approximation if the ratio between the largest weight and the average weight is at most $\omega$. We also show that the $1/2$-EFX envy-freeness property can be attained simultaneously with a constant-factor approximation. More precisely, we can find an allocation in polynomial time which is both $1/2$-EFX and an $(8+\varepsilon)$-approximation to the symmetric NSW problem under submodular valuations. The previous best approximation factor under $1/2$-EFX was linear in the number of agents.

Submitted 29 March, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

Comments: 28 pages, 1 figure. To appear in STOC 2023

arXiv:2209.09896 [pdf, ps, other]

On the Correlation Gap of Matroids

Authors: Edin Husić, Zhuan Khye Koh, Georg Loho, László A. Végh

Abstract: A set function can be extended to the unit cube in various ways; the correlation gap measures the ratio between two natural extensions. This quantity has been identified as the performance guarantee in a range of approximation algorithms and mechanism design settings. It is known that the correlation gap of a monotone submodular function is at least $1-1/e$, and this is tight for simple matroid rank functions. We initiate a fine-grained study of the correlation gap of matroid rank functions. In particular, we present an improved lower bound on the correlation gap as parametrized by the rank and girth of the matroid. We also show that for any matroid, the correlation gap of its weighted matroid rank function is minimized under uniform weights. Such improved lower bounds have direct applications for submodular maximization under matroid constraints, mechanism design, and contention resolution schemes.

Submitted 21 June, 2024; v1 submitted 20 September, 2022; originally announced September 2022.

arXiv:2112.10199 [pdf, ps, other]

Tractable Fragments of the Maximum Nash Welfare Problem

Authors: Jugal Garg, Edin Husić, Aniket Murhekar, László Végh

Abstract: We study the problem of maximizing Nash welfare (MNW) while allocating indivisible goods to asymmetric agents. The Nash welfare of an allocation is the weighted geometric mean of agents' utilities, and the allocation with maximum Nash welfare is known to satisfy several desirable fairness and efficiency properties. However, computing such an MNW allocation is NP-hard, even for two agents with identical, additive valuations. Hence, we aim to identify tractable classes that either admit a PTAS, an FPTAS, or an exact polynomial-time algorithm. To this end, we design a PTAS for finding an MNW allocation for the case of asymmetric agents with identical, additive valuations, thus generalizing a similar result for symmetric agents. Our techniques can also be adapted to give a PTAS for the problem of computing the optimal $p$-mean welfare. We also show that an MNW allocation can be computed exactly in polynomial time for identical agents with $k$-ary valuations when $k$ is a constant, where every agent has at most $k$ different values for the goods. Next, we consider the special case where every agent finds at most two goods valuable, and show that this class admits an efficient algorithm, even for general monotone valuations. In contrast, we note that when agents can value three or more goods, maximizing Nash welfare is NP-hard, even when agents are symmetric and have additive valuations, showing our algorithmic result is essentially tight. Finally, we show that for constantly many asymmetric agents with additive valuations, the MNW problem admits an FPTAS.

Submitted 28 April, 2022; v1 submitted 19 December, 2021; originally announced December 2021.

Comments: 20 pages
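
Illustration (not part of the listing or the paper): the abstract above defines the Nash welfare of an allocation as the weighted geometric mean of agents' utilities. The sketch below only makes that definition concrete for additive valuations and finds a maximizer by exhaustive search on a tiny instance. It is not the paper's PTAS or any of its algorithms; the function names `nash_welfare` and `brute_force_mnw` and the example numbers are assumptions chosen for illustration.

```python
from itertools import product
from math import prod


def nash_welfare(allocation, values, weights):
    """Weighted geometric mean of the agents' additive utilities.

    allocation[j] is the index of the agent who receives good j,
    values[i][j] is agent i's value for good j,
    weights[i] is agent i's weight (all equal in the symmetric case).
    """
    utilities = [0.0] * len(weights)
    for good, agent in enumerate(allocation):
        utilities[agent] += values[agent][good]
    total_weight = sum(weights)
    # prod_i u_i^(w_i / sum of weights); zero if some agent gets nothing of value
    return prod(u ** (w / total_weight) for u, w in zip(utilities, weights))


def brute_force_mnw(values, weights):
    """Try all n^m allocations; exponential, for tiny illustrative instances only."""
    n_agents, n_goods = len(values), len(values[0])
    return max(
        product(range(n_agents), repeat=n_goods),
        key=lambda allocation: nash_welfare(allocation, values, weights),
    )


# Two symmetric agents with identical additive valuations over three goods.
values = [[1, 2, 3], [1, 2, 3]]
weights = [1, 1]
best = brute_force_mnw(values, weights)
print(best, nash_welfare(best, values, weights))
```

The exhaustive search is exponential in the number of goods, which is consistent with the abstract's remark that the problem is NP-hard even for two agents with identical, additive valuations.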

arXiv:2110.15013 [pdf, other]

doi 10.1088/2632-2153/ac3de0

Deeptime: a Python library for machine learning dynamical models from time series data

Authors: Moritz Hoffmann, Martin Scherer, Tim Hempel, Andreas Mardt, Brian de Silva, Brooke E. Husic, Stefan Klus, Hao Wu, Nathan Kutz, Steven L. Brunton, Frank Noé

Abstract: Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use but also for easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software.

Submitted 11 December, 2021; v1 submitted 28 October, 2021; originally announced October 2021.

Journal ref: Machine Learning: Science and Technology, Volume 3, Number 1, 2021

arXiv:2107.06961 [pdf, ps, other]

On complete classes of valuated matroids

Authors: Edin Husić, Georg Loho, Ben Smith, László A. Végh

Abstract: We characterize a rich class of valuated matroids, called R-minor valuated matroids, that includes the indicator functions of matroids, and is closed under operations such as taking minors, duality, and induction by network. We exhibit a family of valuated matroids that are not R-minor based on sparse paving matroids. Valuated matroids are inherently related to gross substitute valuations in mathematical economics. By the same token we refute the Matroid Based Valuation Conjecture by Ostrovsky and Paes Leme (Theoretical Economics 2015) asserting that every gross substitute valuation arises from weighted matroid rank functions by repeated applications of merge and endowment operations. Our result also has implications in the context of Lorentzian polynomials: it reveals the limitations of known construction operations.

Submitted 27 February, 2023; v1 submitted 14 July, 2021; originally announced July 2021.

Comments: 53 pages, 13 figures. An extended abstract appeared in SODA 2022

arXiv:2009.14793 [pdf, ps, other]

Approximating Nash Social Welfare under Rado Valuations

Authors: Jugal Garg, Edin Husic, Laszlo A. Vegh

Abstract: We consider the problem of approximating maximum Nash social welfare (NSW) while allocating a set of indivisible items to $n$ agents. The NSW is a popular objective that provides a balanced tradeoff between the often conflicting requirements of fairness and efficiency, defined as the weighted geometric mean of agents' valuations. For the symmetric additive case of the problem, where agents have the same weight with additive valuations, the first constant-factor approximation algorithm was obtained in 2015. This led to a flurry of work obtaining constant-factor approximation algorithms for the symmetric case under mild generalizations of additive, and $O(n)$-approximation algorithms for more general valuations and for the asymmetric case. In this paper, we make significant progress towards both symmetric and asymmetric NSW problems. We present the first constant-factor approximation algorithm for the symmetric case under Rado valuations. Rado valuations form a general class of valuation functions that arise from maximum cost independent matching problems, including as special cases assignment (OXS) valuations and weighted matroid rank functions. Furthermore, our approach also gives the first constant-factor approximation algorithm for the asymmetric case under Rado valuations, provided that the maximum ratio between the weights is bounded by a constant.

Submitted 30 September, 2020; originally announced September 2020.

Comments: 44 pages, 3 figures

arXiv:2007.09768 [pdf, other]

FPT Algorithms for Finding Near-Cliques in $c$-Closed Graphs

Authors: Balaram Behera, Edin Husić, Shweta Jain, Tim Roughgarden, C. Seshadhri

Abstract: Finding large cliques or cliques missing a few edges is a fundamental algorithmic task in the study of real-world graphs, with applications in community detection, pattern recognition, and clustering. A number of effective backtracking-based heuristics for these problems have emerged from recent empirical work in social network analysis. Given the NP-hardness of variants of clique counting, these results raise a challenge for beyond worst-case analysis of these problems. Inspired by the triadic closure of real-world graphs, Fox et al. (SICOMP 2020) introduced the notion of $c$-closed graphs and proved that maximal clique enumeration is fixed-parameter tractable with respect to $c$. In practice, due to noise in data, one wishes to actually discover "near-cliques", which can be characterized as cliques with a sparse subgraph removed. In this work, we prove that many different kinds of maximal near-cliques can be enumerated in polynomial time (and FPT in $c$) for $c$-closed graphs. We study various established notions of such substructures, including $k$-plexes, complements of bounded-degeneracy and bounded-treewidth graphs. Interestingly, our algorithms follow relatively simple backtracking procedures, analogous to what is done in practice. Our results underscore the significance of the $c$-closed graph class for theoretical understanding of social network analysis.

Submitted 19 November, 2021; v1 submitted 19 July, 2020; originally announced July 2020.

Comments: Accepted to ITCS 2022

MSC Class: 68W01; 68R10; 05C85

arXiv:1908.07948 [pdf, other]

Auction Algorithms for Market Equilibrium with Weak Gross Substitute Demands

Authors: Jugal Garg, Edin Husić, László A. Végh

Abstract: We consider the Arrow--Debreu exchange market model under the assumption that the agents' demands satisfy the weak gross substitutes (WGS) property. We present a simple auction algorithm that obtains an approximate market equilibrium for WGS demands assuming the availability of a price update oracle. We exhibit specific implementations of such an oracle for WGS demands with bounded price elasticities and for Gale demand systems. As an application of our result, we obtain an efficient algorithm to find an approximate spending-restricted market equilibrium for WGS demands, a model that has been recently introduced as a continuous relaxation of the Nash social welfare (NSW) problem. This leads to a polynomial-time constant factor approximation algorithm for the NSW problem with capped additive separable piecewise linear utility functions; only a pseudopolynomial approximation algorithm was known for this setting previously.

Submitted 1 May, 2022; v1 submitted 21 August, 2019; originally announced August 2019.

Comments: 42 pages, 1 figure. A preliminary version appeared in STACS 2021

arXiv:1807.04427 [pdf, other]

doi 10.1371/journal.pone.0212442

Simultaneous Coherent Structure Coloring facilitates interpretable clustering of scientific data by amplifying dissimilarity

Authors: Brooke E. Husic, Kristy L. Schlueter-Kuck, John O. Dabiri

Abstract: The clustering of data into physically meaningful subsets often requires assumptions regarding the number, size, or shape of the subgroups. Here, we present a new method, simultaneous coherent structure coloring (sCSC), which accomplishes the task of unsupervised clustering without a priori guidance regarding the underlying structure of the data. sCSC performs a sequence of binary splittings on the dataset such that the most dissimilar data points are required to be in separate clusters. To achieve this, we obtain a set of orthogonal coordinates along which dissimilarity in the dataset is maximized from a generalized eigenvalue problem based on the pairwise dissimilarity between the data points to be clustered. This sequence of bifurcations produces a binary tree representation of the system, from which the number of clusters in the data and their interrelationships naturally emerge. To illustrate the effectiveness of the method in the absence of a priori assumptions, we apply it to three exemplary problems in fluid dynamics. Then, we illustrate its capacity for interpretability using a high-dimensional protein folding simulation dataset. While we restrict our examples to dynamical physical systems in this work, we anticipate straightforward translation to other fields where existing analysis tools require ad hoc assumptions on the data structure, lack the interpretability of the present method, or in which the underlying processes are less accessible, such as genomics and neuroscience.

Submitted 13 March, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

Journal ref: PLoS ONE 14(3): e0212442 (2019)

arXiv:1803.04465 [pdf, other]

PotentialNet for Molecular Property Prediction

Authors: Evan N. Feinberg, Debnil Sur, Zhenqin Wu, Brooke E. Husic, Huanghao Mai, Yang Li, Saisai Sun, Jianyi Yang, Bharath Ramsundar, Vijay S. Pande

Abstract: The arc of drug discovery entails a multiparameter optimization problem spanning vast length scales. The key parameters range from solubility (angstroms) to protein-ligand binding (nanometers) to in vivo toxicity (meters). Through feature learning---instead of feature engineering---deep neural networks promise to outperform both traditional physics-based and knowledge-based machine learning models for predicting molecular properties pertinent to drug discovery. To this end, we present the PotentialNet family of graph convolutions. These models are specifically designed for and achieve state-of-the-art performance for protein-ligand binding affinity. We further validate these deep neural networks by setting new standards of performance in several ligand-based tasks. In parallel, we introduce a new metric, the Regression Enrichment Factor $EF_\chi^{(R)}$, to measure the early enrichment of computational models for chemical data. Finally, we introduce a cross-validation strategy based on structural homology clustering that can more accurately measure model generalizability, which crucially distinguishes the aims of machine learning for drug discovery from standard machine learning tasks.

Submitted 22 October, 2018; v1 submitted 12 March, 2018; originally announced March 2018.

Comments: 13 pages, 5 figures, 8 tables

arXiv:1701.05492 [pdf, other]

doi 10.1145/3182178

Perfect phylogenies via branchings in acyclic digraphs and a generalization of Dilworth's theorem

Authors: Ademir Hujdurović, Edin Husić, Martin Milanič, Romeo Rizzi, Alexandru I. Tomescu

Abstract: Motivated by applications in cancer genomics and following the work of Hajirasouliha and Raphael (WABI 2014), Hujdurović et al. (IEEE TCBB, to appear) introduced the minimum conflict-free row split (MCRS) problem: split each row of a given binary matrix into a bitwise OR of a set of rows so that the resulting matrix corresponds to a perfect phylogeny and has the minimum possible number of rows among all matrices with this property. Hajirasouliha and Raphael also proposed the study of a similar problem, in which the task is to minimize the number of distinct rows of the resulting matrix. Hujdurović et al. proved that both problems are NP-hard, gave a related characterization of transitively orientable graphs, and proposed a polynomial-time heuristic algorithm for the MCRS problem based on coloring cocomparability graphs. We give new, more transparent formulations of the two problems, showing that the problems are equivalent to two optimization problems on branchings in a derived directed acyclic graph. Building on these formulations, we obtain new results on the two problems, including: (i) a strengthening of the heuristic by Hujdurović et al. via a new min-max result in digraphs generalizing Dilworth's theorem, which may be of independent interest, (ii) APX-hardness results for both problems, (iii) approximation algorithms, and (iv) exponential-time algorithms solving the two problems to optimality faster than the naïve brute-force approach.
Our work relates to several well studied notions in combinatorial optimization: chain partitions in partially ordered sets, laminar hypergraphs, and (classical and weighted) colorings of graphs.

Submitted 27 January, 2018; v1 submitted 19 January, 2017; originally announced January 2017.

Comments: 29 pages, 10 figures, extended abstract appeared in Proceedings of WG 2017, full paper accepted for publication in ACM Transactions on Algorithms

Showing 1–13 of 13 results for author: Husić, E