Computer Science > Data Structures and Algorithms
[Submitted on 29 Oct 2015 (v1), last revised 16 Aug 2016 (this version, v2)]
Title:Mixed Robust/Average Submodular Partitioning: Fast Algorithms, Guarantees, and Applications to Parallel Machine Learning and Multi-Label Image Segmentation
View PDFAbstract:We study two mixed robust/average-case submodular partitioning problems that we collectively call Submodular Partitioning. These problems generalize both purely robust instances of the problem (namely max-min submodular fair allocation (SFA) and min-max submodular load balancing (SLB) and also generalize average-case instances (that is the submodular welfare problem (SWP) and submodular multiway partition (SMP). While the robust versions have been studied in the theory community, existing work has focused on tight approximation guarantees, and the resultant algorithms are not, in general, scalable to very large real-world applications. This is in contrast to the average case, where most of the algorithms are scalable. In the present paper, we bridge this gap, by proposing several new algorithms (including those based on greedy, majorization-minimization, minorization-maximization, and relaxation algorithms) that not only scale to large sizes but that also achieve theoretical approximation guarantees close to the state-of-the-art, and in some cases achieve new tight bounds. We also provide new scalable algorithms that apply to additive combinations of the robust and average-case extreme objectives. We show that these problems have many applications in machine learning (ML). This includes: 1) data partitioning and load balancing for distributed machine algorithms on parallel machines; 2) data clustering; and 3) multi-label image segmentation with (only) Boolean submodular functions via pixel partitioning. We empirically demonstrate the efficacy of our algorithms on real-world problems involving data partitioning for distributed optimization of standard machine learning objectives (including both convex and deep neural network objectives), and also on purely unsupervised (i.e., no supervised or semi-supervised learning, and no interactive segmentation) image segmentation.
Submission history
From: Kai Wei [view email][v1] Thu, 29 Oct 2015 20:07:32 UTC (3,652 KB)
[v2] Tue, 16 Aug 2016 04:00:41 UTC (4,171 KB)
Current browse context:
cs.DS
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.