-
Efficiently Computing Similarities to Private Datasets
Authors:
Arturs Backurs,
Zinan Lin,
Sepideh Mahabadi,
Sandeep Silwal,
Jakub Tarnawski
Abstract:
Many methods in differentially private model training rely on computing the similarity between a query point (such as public or synthetic data) and private data. We abstract out this common subroutine and study the following fundamental algorithmic problem: Given a similarity function $f$ and a large high-dimensional private dataset $X \subset \mathbb{R}^d$, output a differentially private (DP) data structure which approximates $\sum_{x \in X} f(x,y)$ for any query $y$. We consider the cases where $f$ is a kernel function, such as $f(x,y) = e^{-\|x-y\|_2^2/σ^2}$ (also known as DP kernel density estimation), or a distance function such as $f(x,y) = \|x-y\|_2$, among others.
Our theoretical results improve upon prior work and give better privacy-utility trade-offs as well as faster query times for a wide range of kernels and distance functions. The unifying approach behind our results is leveraging `low-dimensional structures' present in the specific functions $f$ that we study, using tools such as provable dimensionality reduction, approximation theory, and one-dimensional decomposition of the functions. Our algorithms empirically exhibit improved query times and accuracy over prior state of the art. We also present an application to DP classification. Our experiments demonstrate that the simple methodology of classifying based on average similarity is orders of magnitude faster than prior DP-SGD based approaches for comparable accuracy.
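As a point of reference for the problem statement above, the following is a minimal, non-private sketch of the quantity being approximated, $\sum_{x \in X} f(x,y)$, for the two example functions named in the abstract. It is not the paper's DP data structure: it adds no noise and gives no privacy guarantee. The function names and the toy dataset are assumptions made for illustration only.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    # f(x, y) = exp(-||x - y||_2^2 / sigma^2), as written in the abstract.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / sigma ** 2)

def euclidean_distance(x, y):
    # f(x, y) = ||x - y||_2
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def similarity_sum(dataset, query, f):
    # Exact (non-private) value of sum_{x in X} f(x, query).
    # A DP data structure would return a noisy approximation instead.
    return sum(f(x, query) for x in dataset)

# Hypothetical toy dataset and query, for illustration only.
X = [(0.0, 0.0), (3.0, 4.0)]
y = (0.0, 0.0)

kde_value = similarity_sum(X, y, gaussian_kernel)      # 1 + exp(-25)
dist_value = similarity_sum(X, y, euclidean_distance)  # 0 + 5 = 5.0
```

The exact computation touches every point of $X$ per query, which is the cost that a precomputed data structure is meant to avoid.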
Submitted 13 March, 2024;
originally announced March 2024.
-
DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving
Authors:
Foteini Strati,
Sara Mcallister,
Amar Phanishayee,
Jakub Tarnawski,
Ana Klimovic
Abstract:
Distributed LLM serving is costly and often underutilizes hardware accelerators due to three key challenges: bubbles in pipeline-parallel deployments caused by the bimodal latency of prompt and token processing, GPU memory overprovisioning, and long recovery times in case of failures. In this paper, we propose DéjàVu, a system to address all these challenges using a versatile and efficient KV cache streaming library (DéjàVuLib). Using DéjàVuLib, we propose and implement efficient prompt-token disaggregation to reduce pipeline bubbles, microbatch swapping for efficient GPU memory management, and state replication for fault-tolerance. We highlight the efficacy of these solutions on a range of large models across cloud deployments.
Submitted 4 March, 2024;
originally announced March 2024.
-
Fairness in Submodular Maximization over a Matroid Constraint
Authors:
Marwa El Halabi,
Jakub Tarnawski,
Ashkan Norouzi-Fard,
Thuy-Duong Vuong
Abstract:
Submodular maximization over a matroid constraint is a fundamental problem with various applications in machine learning. Some of these applications involve decision-making over datapoints with sensitive attributes such as gender or race. In such settings, it is crucial to guarantee that the selected solution is fairly distributed with respect to this attribute. Recently, fairness has been investigated in submodular maximization under a cardinality constraint in both the streaming and offline settings; however, the more general problem with a matroid constraint has only been considered in the streaming setting and only for monotone objectives. This work fills this gap. We propose various algorithms and impossibility results offering different trade-offs between quality, fairness, and generality.
Submitted 21 December, 2023;
originally announced December 2023.
-
Fairness in Streaming Submodular Maximization over a Matroid Constraint
Authors:
Marwa El Halabi,
Federico Fusco,
Ashkan Norouzi-Fard,
Jakab Tardos,
Jakub Tarnawski
Abstract:
Streaming submodular maximization is a natural model for the task of selecting a representative subset from a large-scale dataset. If datapoints have sensitive attributes such as gender or race, it becomes important to enforce fairness to avoid bias and discrimination. This has spurred significant interest in developing fair machine learning algorithms. Recently, such algorithms have been developed for monotone submodular maximization under a cardinality constraint.
In this paper, we study the natural generalization of this problem to a matroid constraint. We give streaming algorithms as well as impossibility results that provide trade-offs between efficiency, quality and fairness. We validate our findings empirically on a range of well-known real-world applications: exemplar-based clustering, movie recommendation, and maximum coverage in social networks.
Submitted 19 October, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Near-Optimal Correlation Clustering with Privacy
Authors:
Vincent Cohen-Addad,
Chenglin Fan,
Silvio Lattanzi,
Slobodan Mitrović,
Ashkan Norouzi-Fard,
Nikos Parotsidis,
Jakub Tarnawski
Abstract:
Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more. In the correlation clustering problem one receives as input a set of nodes and for each node a list of co-clustering preferences, and the goal is to output a clustering that minimizes the disagreement with the specified nodes' preferences. In this paper, we introduce a simple and computationally efficient algorithm for the correlation clustering problem with provable privacy guarantees. Our approximation guarantees are stronger than those shown in prior work and are optimal up to logarithmic factors.
Submitted 2 March, 2022;
originally announced March 2022.
-
Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers
Authors:
Youjie Li,
Amar Phanishayee,
Derek Murray,
Jakub Tarnawski,
Nam Sung Kim
Abstract:
Deep neural networks (DNNs) have grown exponentially in size over the past decade, leaving only those who have massive datacenter-based resources with the ability to develop and train such models. One of the main challenges for the long tail of researchers who might have only limited resources (e.g., a single multi-GPU server) is limited GPU memory capacity compared to model size. The problem is so acute that the memory requirement of training massive DNN models can often exceed the aggregate capacity of all available GPUs on a single server; this problem only gets worse with the trend of ever-growing model sizes. Current solutions that rely on virtualizing GPU memory (by swapping to/from CPU memory) incur excessive swapping overhead. In this paper, we present a new training framework, Harmony, and advocate rethinking how DNN frameworks schedule computation and move data to push the boundaries of training massive models efficiently on a single commodity server. Across various massive DNN models, Harmony is able to reduce swap load by up to two orders of magnitude and obtain a training throughput speedup of up to 7.6x over highly optimized baselines with virtualized memory.
Submitted 1 August, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
-
Online Edge Coloring via Tree Recurrences and Correlation Decay
Authors:
Janardhan Kulkarni,
Yang P. Liu,
Ashwin Sah,
Mehtaab Sawhney,
Jakub Tarnawski
Abstract:
We give an online algorithm that with high probability computes a $\left(\frac{e}{e-1} + o(1)\right)Δ$ edge coloring on a graph $G$ with maximum degree $Δ= ω(\log n)$ under online edge arrivals against oblivious adversaries, making first progress on the conjecture of Bar-Noy, Motwani, and Naor in this general setting. Our algorithm is based on reducing to a matching problem on locally treelike graphs, and then applying a tree recurrences based approach for arguing correlation decay.
Submitted 1 November, 2021;
originally announced November 2021.
-
Correlation Clustering in Constant Many Parallel Rounds
Authors:
Vincent Cohen-Addad,
Silvio Lattanzi,
Slobodan Mitrović,
Ashkan Norouzi-Fard,
Nikos Parotsidis,
Jakub Tarnawski
Abstract:
Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining. In correlation clustering, one receives as input a signed graph and the goal is to partition it to minimize the number of disagreements. In this work we propose a massively parallel computation (MPC) algorithm for this problem that is considerably faster than prior work. In particular, our algorithm uses machines with memory sublinear in the number of nodes in the graph and returns a constant approximation while running only for a constant number of rounds. To the best of our knowledge, our algorithm is the first that can provably approximate a clustering problem on graphs using only a constant number of MPC rounds in the sublinear memory regime. We complement our analysis with an experimental analysis of our techniques.
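To make the objective in the abstract above concrete, here is a small sketch of how the number of disagreements of a given clustering could be counted on a signed graph. It only evaluates the objective and is not the MPC algorithm from the paper. The edge representation (a dictionary from node pairs to +1 or -1) and all names are assumptions chosen for illustration.

```python
def count_disagreements(signed_edges, cluster_of):
    """Count edges that disagree with a clustering.

    A '+' edge disagrees when its endpoints are in different clusters;
    a '-' edge disagrees when its endpoints share a cluster.
    """
    disagreements = 0
    for (u, v), sign in signed_edges.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        if (sign > 0 and not same_cluster) or (sign < 0 and same_cluster):
            disagreements += 1
    return disagreements

# Hypothetical signed triangle: a-b and b-c are '+', a-c is '-'.
edges = {("a", "b"): +1, ("b", "c"): +1, ("a", "c"): -1}

# Clustering {a, b}, {c}: only the '+' edge b-c is cut.
clustering = {"a": 0, "b": 0, "c": 1}
cost = count_disagreements(edges, clustering)  # 1
```

On this triangle no clustering has zero disagreements, so one disagreement is the best achievable value.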
Submitted 15 June, 2021;
originally announced June 2021.
-
On the Hardness of Scheduling With Non-Uniform Communication Delays
Authors:
Sami Davies,
Janardhan Kulkarni,
Thomas Rothvoss,
Sai Sandeep,
Jakub Tarnawski,
Yihao Zhang
Abstract:
In the scheduling with non-uniform communication delay problem, the input is a set of jobs with precedence constraints. Associated with every precedence constraint between a pair of jobs is a communication delay, the time duration the scheduler has to wait between the two jobs if they are scheduled on different machines. The objective is to assign the jobs to machines to minimize the makespan of the schedule. Despite being a fundamental problem in theory and a consequential problem in practice, the approximability of scheduling problems with communication delays is not very well understood. One of the top ten open problems in scheduling theory, in the influential list by Schuurman and Woeginger and its latest update by Bansal, asks if the problem admits a constant factor approximation algorithm. In this paper, we answer the question in the negative by proving that there is a logarithmic hardness for the problem under the standard complexity theory assumption that NP-complete problems do not admit quasi-polynomial time algorithms.
Our hardness result is obtained using a surprisingly simple reduction from a problem that we call Unique Machine Precedence constraints Scheduling (UMPS). We believe that this problem is of central importance in understanding the hardness of many scheduling problems and conjecture that it is very hard to approximate. Among other things, our conjecture implies a logarithmic hardness of related machine scheduling with precedences, a long-standing open problem in scheduling theory and approximation algorithms.
Submitted 30 April, 2021;
originally announced May 2021.
-
Fairness in Streaming Submodular Maximization: Algorithms and Hardness
Authors:
Marwa El Halabi,
Slobodan Mitrović,
Ashkan Norouzi-Fard,
Jakab Tardos,
Jakub Tarnawski
Abstract:
Submodular maximization has become established as the method of choice for the task of selecting representative and diverse summaries of data. However, if datapoints have sensitive attributes such as gender or age, such machine learning algorithms, left unchecked, are known to exhibit bias: under- or over-representation of particular groups. This has made the design of fair machine learning algorithms increasingly important. In this work we address the question: Is it possible to create fair summaries for massive datasets? To this end, we develop the first streaming approximation algorithms for submodular maximization under fairness constraints, for both monotone and non-monotone functions. We validate our findings empirically on exemplar-based clustering, movie recommendation, DPP-based summarization, and maximum coverage in social networks, showing that fairness constraints do not significantly impact utility.
Submitted 18 October, 2020; v1 submitted 14 October, 2020;
originally announced October 2020.
-
Efficient Algorithms for Device Placement of DNN Graph Operators
Authors:
Jakub Tarnawski,
Amar Phanishayee,
Nikhil R. Devanur,
Divya Mahajan,
Fanny Nina Paravecino
Abstract:
Modern machine learning workloads use large models, with complex structures, that are very expensive to execute. The devices that execute complex models are becoming increasingly heterogeneous as we see a flourishing of domain-specific accelerators being offered as hardware accelerators in addition to CPUs. These trends necessitate distributing the workload across multiple devices. Recent work has shown that significant gains can be obtained with model parallelism, i.e., partitioning a neural network's computational graph onto multiple devices. In particular, this form of parallelism assumes a pipeline of devices, which is fed a stream of samples and yields high throughput for training and inference of DNNs. However, for such settings (large models and multiple heterogeneous devices), we require automated algorithms and toolchains that can partition the ML workload across devices. In this paper, we identify and isolate the structured optimization problem at the core of device placement of DNN operators, for both inference and training, especially in modern pipelined settings. We then provide algorithms that solve this problem to optimality. We demonstrate the applicability and efficiency of our approaches using several contemporary DNN computation graphs.
Submitted 29 October, 2020; v1 submitted 29 June, 2020;
originally announced June 2020.
-
Fully Dynamic Algorithm for Constrained Submodular Optimization
Authors:
Silvio Lattanzi,
Slobodan Mitrović,
Ashkan Norouzi-Fard,
Jakub Tarnawski,
Morteza Zadimoghaddam
Abstract:
The task of maximizing a monotone submodular function under a cardinality constraint is at the core of many machine learning and data mining applications, including data summarization, sparse regression and coverage problems. We study this classic problem in the fully dynamic setting, where elements can be both inserted and removed. Our main result is a randomized algorithm that maintains an efficient data structure with a poly-logarithmic amortized update time and yields a $(1/2-ε)$-approximate solution. We complement our theoretical analysis with an empirical study of the performance of our algorithm.
Submitted 24 May, 2023; v1 submitted 8 June, 2020;
originally announced June 2020.
-
Hierarchy-Based Algorithms for Minimizing Makespan under Precedence and Communication Constraints
Authors:
Janardhan Kulkarni,
Shi Li,
Jakub Tarnawski,
Minwei Ye
Abstract:
We consider the classic problem of scheduling jobs with precedence constraints on a set of identical machines to minimize the makespan objective function. Understanding the exact approximability of the problem when the number of machines is a constant is a well-known question in scheduling theory. Indeed, an outstanding open problem from the classic book of Garey and Johnson asks whether this problem is NP-hard even in the case of 3 machines and unit-length jobs. In a recent breakthrough, Levey and Rothvoss gave a $(1+ε)$-approximation algorithm, which runs in nearly quasi-polynomial time, for the case when jobs have unit lengths. However, a substantially more difficult case where jobs have arbitrary processing lengths has remained open.
We make progress on this more general problem. We show that there exists a $(1+ε)$-approximation algorithm (with similar running time as that of Levey and Rothvoss) for the non-migratory setting: when every job has to be scheduled entirely on a single machine, but within a machine the job need not be scheduled during consecutive time steps. Further, we also show that our algorithmic framework generalizes to another classic scenario where, along with the precedence constraints, the jobs also have communication delay constraints. Both of these fundamental problems are highly relevant to the practice of datacenter scheduling.
Submitted 28 April, 2020;
originally announced April 2020.
-
Scheduling with Communication Delays via LP Hierarchies and Clustering
Authors:
Sami Davies,
Janardhan Kulkarni,
Thomas Rothvoss,
Jakub Tarnawski,
Yihao Zhang
Abstract:
We consider the classic problem of scheduling jobs with precedence constraints on identical machines to minimize makespan, in the presence of communication delays. In this setting, denoted by $\mathsf{P} \mid \mathsf{prec}, c \mid C_{\mathsf{max}}$, if two dependent jobs are scheduled on different machines, then at least $c$ units of time must pass between their executions. Despite its relevance to many applications, this model remains one of the most poorly understood in scheduling theory. Even for a special case where an unlimited number of machines is available, the best known approximation ratio is $2/3 \cdot (c+1)$, whereas Graham's greedy list scheduling algorithm already gives a $(c+1)$-approximation in that setting. An outstanding open problem in the top-10 list by Schuurman and Woeginger and its recent update by Bansal asks whether there exists a constant-factor approximation algorithm.
In this work we give a polynomial-time $O(\log c \cdot \log m)$-approximation algorithm for this problem, where $m$ is the number of machines and $c$ is the communication delay. Our approach is based on a Sherali-Adams lift of a linear programming relaxation and a randomized clustering of the semimetric space induced by this lift.
Submitted 20 April, 2020;
originally announced April 2020.
-
Beyond $1/2$-Approximation for Submodular Maximization on Massive Data Streams
Authors:
Ashkan Norouzi-Fard,
Jakub Tarnawski,
Slobodan Mitrović,
Amir Zandieh,
Aida Mousavifar,
Ola Svensson
Abstract:
Many tasks in machine learning and data mining, such as data diversification, non-parametric learning, kernel machines, clustering etc., require extracting a small but representative summary from a massive dataset. Often, such problems can be posed as maximizing a submodular set function subject to a cardinality constraint. We consider this question in the streaming setting, where elements arrive over time at a fast pace and thus we need to design an efficient, low-memory algorithm. One such method, proposed by Badanidiyuru et al. (2014), always finds a $0.5$-approximate solution. Can this approximation factor be improved? We answer this question affirmatively by designing a new algorithm SALSA for streaming submodular maximization. It is the first low-memory, single-pass algorithm that improves the factor $0.5$, under the natural assumption that elements arrive in a random order. We also show that this assumption is necessary, i.e., that there is no such algorithm with better than $0.5$-approximation when elements arrive in arbitrary order. Our experiments demonstrate that SALSA significantly outperforms the state of the art in applications related to exemplar-based clustering, social graph analysis, and recommender systems.
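The threshold-based streaming approach mentioned above can be illustrated with a simplified single-pass sketch. This is not SALSA, and it is only a simplified rendering of a thresholding rule in the spirit of the cited prior method: it assumes that a guess of the optimal value (`opt_guess`) is supplied, whereas practical algorithms maintain several guesses in parallel. The coverage objective, the function names, and the toy instance are all assumptions made for illustration.

```python
def coverage(sets, chosen):
    # Monotone submodular objective: number of items covered by the chosen sets.
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

def threshold_stream(stream, sets, k, opt_guess):
    """Single pass: keep an element if its marginal gain is large enough
    to stay on track for value opt_guess / 2 with the remaining budget."""
    chosen = []
    value = 0
    for e in stream:
        if len(chosen) >= k:
            break  # cardinality budget exhausted
        gain = coverage(sets, chosen + [e]) - value
        needed = (opt_guess / 2 - value) / (k - len(chosen))
        if gain >= needed:
            chosen.append(e)
            value += gain
    return chosen, value

# Hypothetical instance: four sets over items 1..6, budget k = 2.
sets = {0: {1, 2}, 1: {2, 3}, 2: {4, 5, 6}, 3: {1}}
chosen, value = threshold_stream([0, 1, 2, 3], sets, k=2, opt_guess=5)
# chosen == [0, 1], value == 3 (the best pair covers 5 items)
```

In this run the rule commits to the first two sets and misses the large set that arrives third, ending at value 3 against an optimum of 5. That still meets the half-of-optimum target of 2.5, and it shows how arrival order affects a single-pass rule.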
Submitted 6 August, 2018;
originally announced August 2018.
-
Streaming Robust Submodular Maximization: A Partitioned Thresholding Approach
Authors:
Slobodan Mitrović,
Ilija Bogunovic,
Ashkan Norouzi-Fard,
Jakub Tarnawski,
Volkan Cevher
Abstract:
We study the classical problem of maximizing a monotone submodular function subject to a cardinality constraint k, with two additional twists: (i) elements arrive in a streaming fashion, and (ii) m items from the algorithm's memory are removed after the stream is finished. We develop a robust submodular algorithm STAR-T. It is based on a novel partitioning structure and an exponentially decreasing thresholding rule. STAR-T makes one pass over the data and retains a short but robust summary. We show that after the removal of any m elements from the obtained summary, a simple greedy algorithm STAR-T-GREEDY that runs on the remaining elements achieves a constant-factor approximation guarantee. In two different data summarization tasks, we demonstrate that it matches or outperforms existing greedy and streaming methods, even if they are allowed the benefit of knowing the removed subset in advance.
Submitted 7 November, 2017;
originally announced November 2017.
-
A Constant-Factor Approximation Algorithm for the Asymmetric Traveling Salesman Problem
Authors:
Ola Svensson,
Jakub Tarnawski,
László A. Végh
Abstract:
We give a constant-factor approximation algorithm for the asymmetric traveling salesman problem (ATSP). Our approximation guarantee is analyzed with respect to the standard LP relaxation, and thus our result confirms the conjectured constant integrality gap of that relaxation.
The main idea of our approach is a reduction to Subtour Partition Cover, an easier problem obtained by significantly relaxing the general connectivity requirements into local connectivity conditions. We first show that any algorithm for Subtour Partition Cover can be turned into an algorithm for ATSP while only losing a small constant factor in the performance guarantee. Next, we present a reduction from general ATSP instances to structured instances, on which we then solve Subtour Partition Cover, yielding our constant-factor approximation algorithm for ATSP.
Submitted 15 September, 2020; v1 submitted 14 August, 2017;
originally announced August 2017.
-
The Matching Problem in General Graphs is in Quasi-NC
Authors:
Ola Svensson,
Jakub Tarnawski
Abstract:
We show that the perfect matching problem in general graphs is in Quasi-NC. That is, we give a deterministic parallel algorithm which runs in $O(\log^3 n)$ time on $n^{O(\log^2 n)}$ processors. The result is obtained by a derandomization of the Isolation Lemma for perfect matchings, which was introduced in the classic paper by Mulmuley, Vazirani and Vazirani [1987] to obtain a Randomized NC algorithm.
Our proof extends the framework of Fenner, Gurjar and Thierauf [2016], who proved the analogous result in the special case of bipartite graphs. Compared to that setting, several new ingredients are needed due to the significantly more complex structure of perfect matchings in general graphs. In particular, our proof heavily relies on the laminar structure of the faces of the perfect matching polytope.
Submitted 4 September, 2017; v1 submitted 6 April, 2017;
originally announced April 2017.
-
Active Learning and Proofreading for Delineation of Curvilinear Structures
Authors:
Agata Mosinska,
Jakub Tarnawski,
Pascal Fua
Abstract:
Many state-of-the-art delineation methods rely on supervised machine learning algorithms. As a result, they require manually annotated training data, which is tedious to obtain. Furthermore, even minor classification errors may significantly affect the topology of the final result. In this paper we propose a generic approach to addressing both of these problems by taking into account the influence of a potential misclassification on the resulting delineation. In an Active Learning context, we identify parts of linear structures that should be annotated first in order to train a classifier effectively. In a proofreading context, we similarly find regions of the resulting reconstruction that should be verified in priority to obtain a nearly-perfect result. In both cases, by focusing the attention of the human expert on potential classification mistakes which are the most critical parts of the delineation, we reduce the amount of required supervision. We demonstrate the effectiveness of our approach on microscopy images depicting blood vessels and neurons.
Submitted 13 March, 2017; v1 submitted 23 December, 2016;
originally announced December 2016.
-
Unrelated Machine Scheduling of Jobs with Uniform Smith Ratios
Authors:
Christos Kalaitzis,
Ola Svensson,
Jakub Tarnawski
Abstract:
We consider the classic problem of scheduling jobs on unrelated machines so as to minimize the weighted sum of completion times. Recently, for a small constant $\varepsilon > 0$, Bansal et al. gave a $(3/2-\varepsilon)$-approximation algorithm improving upon the natural barrier of $3/2$ which follows from independent randomized rounding. In simplified terms, their result is obtained by an enhancement of independent randomized rounding via strong negative correlation properties.
In this work, we take a different approach and propose to use the same elegant rounding scheme for the weighted completion time objective as devised by Shmoys and Tardos for optimizing a linear function subject to makespan constraints. Our main result is a $1.21$-approximation algorithm for the natural special case where the weight of a job is proportional to its processing time (specifically, all jobs have the same Smith ratio), which expresses the notion that each unit of work has the same weight. In addition, as a direct consequence of the rounding, our algorithm also achieves a bi-criteria $2$-approximation for the makespan objective. Our technical contribution is a tight analysis of the expected cost of the solution compared to the one given by the Configuration-LP relaxation - we reduce this task to that of understanding certain worst-case instances which are simple to analyze.
Submitted 3 November, 2016; v1 submitted 26 July, 2016;
originally announced July 2016.
-
Constant Factor Approximation for ATSP with Two Edge Weights
Authors:
Ola Svensson,
Jakub Tarnawski,
László A. Végh
Abstract:
We give a constant factor approximation algorithm for the Asymmetric Traveling Salesman Problem on shortest path metrics of directed graphs with two different edge weights. For the case of unit edge weights, the first constant factor approximation was given recently by Svensson. This was accomplished by introducing an easier problem called Local-Connectivity ATSP and showing that a good solution to this problem can be used to obtain a constant factor approximation for ATSP. In this paper, we solve Local-Connectivity ATSP for two different edge weights. The solution is based on a flow decomposition theorem for solutions of the Held-Karp relaxation, which may be of independent interest.
Submitted 4 September, 2017; v1 submitted 22 November, 2015;
originally announced November 2015.
-
Fast Generation of Random Spanning Trees and the Effective Resistance Metric
Authors:
Aleksander Madry,
Damian Straszak,
Jakub Tarnawski
Abstract:
We present a new algorithm for generating a uniformly random spanning tree in an undirected graph. Our algorithm samples such a tree in expected $\tilde{O}(m^{4/3})$ time. This improves over the best previously known bound of $\min(\tilde{O}(m\sqrt{n}),O(n^{\omega}))$ -- that follows from the work of Kelner and Mądry [FOCS'09] and of Colbourn et al. [J. Algorithms'96] -- whenever the input graph is sufficiently sparse.
At a high level, our result stems from carefully exploiting the interplay of random spanning trees, random walks, and the notion of effective resistance, as well as from devising a way to algorithmically relate these concepts to the combinatorial structure of the graph. This involves, in particular, establishing a new connection between the effective resistance metric and the cut structure of the underlying graph.
Submitted 1 January, 2015;
originally announced January 2015.