A Sinkhorn-type Algorithm for Constrained Optimal Transport

Authors: Xun Tang, Holakou Rahmanian, Michael Shavlovsky, Kiran Koshy Thekumparampil, Tesi Xiao, Lexing Ying

Abstract: Entropic optimal transport (OT) and the Sinkhorn algorithm have made it practical for machine learning practitioners to perform the fundamental task of calculating transport distance between statistical distributions. In this work, we focus on a general class of OT problems under a combination of equality and inequality constraints. We derive the corresponding entropy regularization formulation and introduce a Sinkhorn-type algorithm for such constrained OT problems supported by theoretical guarantees. We first bound the approximation error when solving the problem through entropic regularization, which reduces exponentially with the increase of the regularization parameter. Furthermore, we prove a sublinear first-order convergence rate of the proposed Sinkhorn-type algorithm in the dual space by characterizing the optimization procedure with a Lyapunov function. To achieve fast and higher-order convergence under weak entropy regularization, we augment the Sinkhorn-type algorithm with dynamic regularization scheduling and second-order acceleration. Overall, this work systematically combines recent theoretical and numerical advances in entropic optimal transport with the constrained case, allowing practitioners to derive approximate transport plans in complex scenarios.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2401.12253

Accelerating Sinkhorn Algorithm with Sparse Newton Iterations

Authors: Xun Tang, Michael Shavlovsky, Holakou Rahmanian, Elisa Tardini, Kiran Koshy Thekumparampil, Tesi Xiao, Lexing Ying

Abstract: Computing the optimal transport distance between statistical distributions is a fundamental task in machine learning. One remarkable recent advancement is entropic regularization and the Sinkhorn algorithm, which utilizes only matrix scaling and guarantees an approximated solution with near-linear runtime. Despite the success of the Sinkhorn algorithm, its runtime may still be slow due to the potentially large number of iterations needed for convergence. To achieve possibly super-exponential convergence, we present Sinkhorn-Newton-Sparse (SNS), an extension to the Sinkhorn algorithm, by introducing early stopping for the matrix scaling steps and a second stage featuring a Newton-type subroutine. Adopting the variational viewpoint that the Sinkhorn algorithm maximizes a concave Lyapunov potential, we offer the insight that the Hessian matrix of the potential function is approximately sparse. Sparsification of the Hessian results in a fast $O(n^2)$ per-iteration complexity, the same as the Sinkhorn algorithm. In terms of total iteration count, we observe that the SNS algorithm converges orders of magnitude faster across a wide range of practical cases, including optimal transportation between empirical distributions and calculating the Wasserstein $W_1, W_2$ distance of discretized densities. The empirical performance is corroborated by a rigorous bound on the approximate sparsity of the Hessian matrix.

Submitted 20 January, 2024; originally announced January 2024.

Comments: In ICLR 2024
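
For readers unfamiliar with the matrix scaling that the abstract above builds on, the following is a minimal sketch of the classical Sinkhorn iteration only. It is not the SNS method, and it has no early stopping or Newton stage. The function name `sinkhorn`, the parameter `eps` (regularization strength, with the kernel taken as exp(-cost / eps)), the fixed iteration count, and the toy 2x2 problem are all assumptions made for illustration.

```python
import math


def sinkhorn(cost, r, c, eps=0.5, iters=200):
    """Classical Sinkhorn matrix scaling for entropic OT (illustrative sketch).

    cost: n x m list of lists, r: row marginals, c: column marginals.
    eps and iters are hypothetical defaults, not values from any paper.
    """
    n, m = len(r), len(c)
    # Gibbs kernel K_ij = exp(-cost_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Rescale rows so the row sums of diag(u) K diag(v) match r
        u = [r[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so the column sums match c
        v = [c[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy problem (assumed data): two source bins, two target bins.
cost = [[0.0, 1.0], [1.0, 0.0]]
r = [0.3, 0.7]
c = [0.6, 0.4]
plan = sinkhorn(cost, r, c)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(r))) for j in range(len(c))]
```

Each pass costs one sweep over the n x m kernel, which is the per-iteration cost the abstract compares against. On this toy problem `row_sums` and `col_sums` should agree with `r` and `c` up to floating-point tolerance.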

arXiv:2308.00177

Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity

Authors: Charlie Hou, Kiran Koshy Thekumparampil, Michael Shavlovsky, Giulia Fanti, Yesh Dattatreya, Sujay Sanghavi

Abstract: On tabular data, a significant body of literature has shown that current deep learning (DL) models perform at best similarly to Gradient Boosted Decision Trees (GBDTs), while significantly underperforming them on outlier data. However, these works often study idealized problem settings which may fail to capture complexities of real-world scenarios. We identify a natural tabular data setting where DL models can outperform GBDTs: tabular Learning-to-Rank (LTR) under label scarcity. Tabular LTR applications, including search and recommendation, often have an abundance of unlabeled data, and scarce labeled data. We show that DL rankers can utilize unsupervised pretraining to exploit this unlabeled data. In extensive experiments over both public and proprietary datasets, we show that pretrained DL rankers consistently outperform GBDT rankers on ranking metrics -- sometimes by as much as 38% -- both overall and on outliers.

Submitted 25 June, 2024; v1 submitted 31 July, 2023; originally announced August 2023.

Comments: ICML-MFPL 2023 Workshop Oral, SPIGM@ICML2024

arXiv:1608.07886

Incentives for Truthful Evaluations

Authors: Luca de Alfaro, Marco Faella, Vassilis Polychronopoulos, Michael Shavlovsky

Abstract: We consider crowdsourcing problems where the users are asked to provide evaluations for items; the user evaluations are then used directly, or aggregated into a consensus value. Lacking an incentive scheme, users have no motive in making effort in completing the evaluations, providing inaccurate answers instead. We propose incentive schemes that are truthful and cheap: truthful as the optimal user behavior consists in providing accurate evaluations, and cheap because the truthfulness is achieved with little overhead cost. We consider both discrete evaluation tasks, where an evaluation can be done either correctly, or incorrectly, with no degrees of approximation in between, and quantitative evaluation tasks, where evaluations are real numbers, and the error is measured as distance from the correct value. For both types of tasks, we propose hierarchical incentive schemes that can be effected with a small amount of additional evaluations, and that scale to arbitrarily large crowd sizes: they have the property that the strength of the incentive does not weaken with increasing hierarchy depth. Interestingly, we show that for these schemes to work, the only requisite is that workers know their place in the hierarchy in advance.

Submitted 5 May, 2017; v1 submitted 28 August, 2016; originally announced August 2016.

Report number: UCSC Technical Report UCSC-SOE-16-14

arXiv:1604.03178

Incentives for Truthful Peer Grading

Authors: Luca de Alfaro, Michael Shavlovsky, Vassilis Polychronopoulos

Abstract: Peer grading systems work well only if users have incentives to grade truthfully. An example of non-truthful grading, that we observed in classrooms, consists in students assigning the maximum grade to all submissions. With a naive grading scheme, such as averaging the assigned grades, all students would receive the maximum grade. In this paper, we develop three grading schemes that provide incentives for truthful peer grading. In the first scheme, the instructor grades a fraction p of the submissions, and penalizes students whose grade deviates from the instructor grade. We provide lower bounds on p to ensure truthfulness, and conclude that these schemes work only for moderate class sizes, up to a few hundred students. To overcome this limitation, we propose a hierarchical extension of this supervised scheme, and we show that it can handle classes of any size with bounded (and little) instructor work, and is therefore applicable to Massive Open Online Courses (MOOCs). Finally, we propose unsupervised incentive schemes, in which the student incentive is based on statistical properties of the grade distribution, without any grading required by the instructor. We show that the proposed unsupervised schemes provide incentives to truthful grading, at the price of being possibly unfair to individual students.

Submitted 11 April, 2016; originally announced April 2016.

Comments: 26 pages

Report number: UCSC-SOE-15-19

arXiv:1308.5273

CrowdGrader: Crowdsourcing the Evaluation of Homework Assignments

Authors: Luca de Alfaro, Michael Shavlovsky

Abstract: Crowdsourcing offers a practical method for ranking and scoring large amounts of items. To investigate the algorithms and incentives that can be used in crowdsourcing quality evaluations, we built CrowdGrader, a tool that lets students submit and collaboratively grade solutions to homework assignments. We present the algorithms and techniques used in CrowdGrader, and we describe our results and experience in using the tool for several computer-science assignments. CrowdGrader combines the student-provided grades into a consensus grade for each submission using a novel crowdsourcing algorithm that relies on a reputation system. The algorithm iteratively refines inter-dependent estimates of the consensus grades, and of the grading accuracy of each student. On synthetic data, the algorithm performs better than alternatives not based on reputation. On our preliminary experimental data, the performance seems dependent on the nature of review errors, with errors that can be ascribed to the reviewer being more tractable than those arising from random external events. To provide an incentive for reviewers, the grade each student receives in an assignment is a combination of the consensus grade received by their submissions, and of a reviewing grade capturing their reviewing effort and accuracy. This incentive worked well in practice.

Submitted 23 August, 2013; originally announced August 2013.

Comments: Technical Report UCSC-SOE-13-11
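
The idea of jointly refining consensus grades and per-reviewer accuracy, as described in the abstract above, can be illustrated with a generic alternating scheme. This sketch is not CrowdGrader's actual algorithm, whose details the abstract does not give. The function name `consensus_grades`, the inverse-mean-squared-error weighting, the `floor` constant, and the sample grades are all assumptions chosen for illustration.

```python
def consensus_grades(grades, rounds=20, floor=1e-6):
    """Generic reputation-weighted consensus (illustrative sketch only).

    grades: dict mapping (reviewer, submission) -> numeric grade.
    Returns (consensus per submission, weight per reviewer).
    """
    reviewers = {r for r, _ in grades}
    submissions = {s for _, s in grades}
    weight = {r: 1.0 for r in reviewers}  # start by trusting everyone equally
    consensus = {}
    for _ in range(rounds):
        # Step 1: consensus = weighted mean of the grades a submission received
        for s in submissions:
            pairs = [(weight[r], g) for (r, s2), g in grades.items() if s2 == s]
            total = sum(w for w, _ in pairs)
            consensus[s] = sum(w * g for w, g in pairs) / total
        # Step 2: reviewer weight = inverse mean squared error vs. consensus;
        # floor avoids division by zero for reviewers who match it exactly
        for r in reviewers:
            errs = [(g - consensus[s]) ** 2
                    for (r2, s), g in grades.items() if r2 == r]
            weight[r] = 1.0 / (sum(errs) / len(errs) + floor)
    return consensus, weight


# Assumed sample data: reviewers "a" and "b" agree, "c" gives everything a 10.
grades = {
    ("a", "x"): 8.0, ("b", "x"): 8.0, ("c", "x"): 10.0,
    ("a", "y"): 5.0, ("b", "y"): 5.0, ("c", "y"): 10.0,
}
consensus, weight = consensus_grades(grades)
```

In this toy run the reviewer who assigns the maximum grade to everything ends up with a much lower weight than the two who agree, so the consensus moves away from the plain average and toward the grades of the consistent reviewers.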

Showing 1–6 of 6 results for author: Shavlovsky, M