Approximation Algorithms for School Assignment: Group Fairness and Multi-criteria Optimization

Santhini K. A (Indian Institute of Technology, Madras), Kamesh Munagala (Duke University), Meghana Nasre (Indian Institute of Technology, Madras), Govind S. Sankar (Duke University)
{cs18d013,meghana}@cse.iitm.ac.in, {munagala,govind.s.sankar}@duke.edu
Kamesh Munagala and Govind S. Sankar are supported by NSF grant CCF-2113798.

Abstract

We consider the problem of assigning students to schools, when students have different utilities for schools and schools have capacity. There are additional group fairness considerations over students that can be captured either by concave objectives, or additional constraints on the groups. We present approximation algorithms for this problem via convex program rounding that achieve various trade-offs between utility violation, capacity violation, and running time. We also show that our techniques easily extend to the setting where there are arbitrary covering constraints on the feasible assignment, capturing multi-criteria and ranking optimization.

1 Introduction

In this paper, we consider a general model of assignment with a group fairness objective or constraints, motivated by school choice. In the school assignment problem, we are required to assign students to schools, subject to: (1) matching every student, (2) respecting the capacities of schools, and (3) being fair on the utilities to a pre-defined set of $g$ groups of students, which can potentially overlap. These groups could capture demographic features like income, race, etc. Each student has arbitrary cardinal utilities bounded in $[0,1]$ over schools, and the group fairness could either be captured in the objective defined over the utilities obtained by each group, or as a set of constraints capturing the same.

Our main contribution is a set of approximation algorithms for this problem. For arbitrary fairness objectives, we present two algorithms based on rounding a natural convex programming relaxation, which yield somewhat different guarantees:

•

A polynomial time algorithm where each group achieves at least as much utility as in the relaxation. However, school $v$ ’s capacity is violated by $1+\delta_{v}$ , where $\sum_{v}\delta_{v}\leq 2g$ . This algorithm is a careful adaptation of the rounding algorithm in [7].
•

A $n^{O(g)}$ time algorithm where each group receives at least its utility in the relaxation, while each school $v$ ’s capacity is violated by $\delta_{v}$ , where $\sum_{v}\delta_{v}=O(g^{2})$ .

For practical school choice scenarios [1], the number of students significantly outweighs the number of available schools, while the number of groups $g$ is typically small, even constant. The above violations are therefore quite mild.

General Covering Constraints.

We subsequently present an extension of our framework to handle assignments with more general covering constraints. As one application, suppose the utility of a student for a school is multi-dimensional, capturing aspects like academic excellence, or location, or diversity of student body. The goal is to achieve at least a certain total utility value in each dimension. Such multi-objective optimization can be modeled using covering constraints.

As another application, we study the assignment with ranks problem first considered in [6]. Here, each student ordinally ranks the schools with possible ties. Given an input signature $\vec{\rho}$ of length $r$, the goal is to find an assignment where, for all $k\leq r$, the number of students who are assigned one of their first $k$ choices is at least $\sum_{j=1}^{k}\rho_{j}$. Using the same ideas as before, we give a polynomial time algorithm that finds such a matching, while violating the capacity of school $v$ by $1+\delta_{v}$, where $\sum_{v}\delta_{v}\leq 2r$. We also give an $n^{O(r)}$ time algorithm that finds such a matching, while violating the capacity of school $v$ by $\delta_{v}$, where $\sum_{v}\delta_{v}=O(r^{2})$. In comparison to the algorithm in [6] that runs in time $n^{O(r^{2})}$ and uses multivariate polynomial interpolation, our algorithms present substantial improvements in both runtime and ease of implementation at the cost of a small capacity violation, when the number of ranks $r$ is small.

1.1 Related Work and Techniques

GAP Rounding.

Our algorithm uses LP rounding, and borrows ideas from the seminal Generalized Assignment Problem (GAP) rounding technique of Lenstra, Shmoys, and Tardos [7, 11]. Their iterative rounding procedure involves the observation that the number of fractional variables in a vertex solution to a linear programming relaxation is bounded. We build on this idea and apply it to a linear program written on paths and cycles instead of assignments, enabling us to combine it with recent methods for finding proportional allocations mentioned below.

In addition to GAP rounding, closely related rounding techniques include that for the bounded degree minimum spanning tree problem [4, 12], which outputs a spanning tree whose cost is at most the optimal solution and whose degree constraints are violated by small additive constants.

School Assignment and Matching.

The school assignment problem as formulated above was first considered recently in Procaccia, Robinson, and Tucker-Foltz [10]. This work only considers the objective of proportionality — in the assignment, each of the $g$ groups is required to achieve at least $1/g$ fraction of the utility it could have achieved had it been the only group in the system. They present an algorithm (not based on rounding) tailored to proportionality by applying a theorem of Stromquist and Woodall [14], which they call “cake frosting”. The theorem in [14] is a consequence of the celebrated ham-sandwich theorem [13], and is hence non-constructive. Using this technique, the algorithm for proportional school choice in [10] loses an additive $O(g\log g)$ on utilities (hence achieving approximate proportionality), in addition to violating the total capacity by $O(g\log g)$ . In contrast, we use convex programming relaxation to handle arbitrary fairness objectives, for instance, proportional fairness, Pareto-optimality, and leximin fairness, hence vastly generalizing the space of objectives. Our method only loses $O(g^{2})$ on the total capacity, while preserving utilities from the fractional relaxation. For instance, our method would achieve exact proportionality.

At a high level, our main technical contribution is to show how the cake frosting method can be applied to certain types of fractional solutions, in particular, a vertex solution constructed via GAP rounding of the convex programming relaxation. We hence showcase the full power of the technique in [14]. In addition, as discussed above, our techniques extend smoothly to handle arbitrary covering constraints on the allocations, which is motivated by multi-objective optimization and rank optimization.

We note that the idea of using cake frosting to round fractional solutions has appeared before for packing problems in Grandoni et al. [5], where the authors develop a PTAS for matchings in general graphs with $O(1)$ budget constraints on the set of chosen edges. At a high level, all these approaches – the ones in [10, 5] and our work – apply cake frosting to decompose paths and cycles to approximately preserve constraints, but differ in the details of how the paths and cycles are constructed from the integer or fractional solutions. For instance, in contrast to [10] that defines the frosting function based on schools, we define it based on students, hence avoiding an additive violation on the utility. Further, since [5] consider packing problems, their reduction to cake frosting is entirely different in the technical details.

Matching with Violations.

Recently, several papers have considered assignment problems with small capacity violations. These papers mostly fall into two categories - those that try to directly optimize the capacity violations in some form while achieving a set goal like stability or perfectness [2, 3, 6] and those that optimize some other objective like fairness with provably small capacity violations [5, 9, 10]. Our model falls in the latter category – we want to find a matching that satisfies some notion of fairness, while violating capacities by as little as possible.

1.2 Roadmap

We present the group fairness problem in school choice, along with the statement of our main result (Theorem 1) in Section 2. We present the two approximation algorithms that prove Theorem 1 in Section 3. We generalize the model to covering constraints in Section 4, specializing it to the ranking problem mentioned above in Section 4.3.

2 Preliminaries and Main Result

We first present the school redistricting problem introduced in [10]. There is a set $S$ of $n$ students divided into $g$ possibly overlapping groups $S_{1},S_{2},\ldots,S_{g}$. There is a set $T$ of schools, and school $j$ has capacity $C_{j}$.

An assignment of students to schools is a feasible solution $\vec{y}\in\mathcal{P}$ , where $\mathcal{P}$ is defined by:

\begin{array}[]{rcll}\sum_{j\in T}y_{ij}&=&1&\forall i\in S\\ \sum_{i\in S}y_{ij}&\leq&C_{j}&\forall j\in T\\ y_{ij}&\in&\{0,1\}&\forall i\in S,j\in T\end{array}

Utility and Objective.

The utility of assigning student $i$ to school $j$ is $u_{ij}\in[0,1]$ . Given an assignment $\vec{y}\in\mathcal{P}$ , we define the utility of each group $S_{k}$ as

U_{k}(\vec{y})=\sum_{i\in S_{k}}\sum_{j}u_{ij}y_{ij}.

Let $\vec{U}=\langle U_{1},U_{2},\ldots,U_{g}\rangle$ . The goal is to find an assignment of each student to a school, in order to maximize some fairness function on the utilities perceived by the $g$ groups. Let $f(\cdot)$ be a non-decreasing concave function. Then the goal is to maximize

f(\vec{U})=\sum_{k}f\left(U_{k}(\vec{y})\right).

As an example, the celebrated proportional fairness objective sets $f=\log$ , and the optimal solution $\vec{U^{*}}$ has the following property: For any other feasible utility vector $\vec{U^{\prime}}$ :

\frac{1}{g}\sum_{k}\frac{U^{\prime}_{k}}{U^{*}_{k}}\leq 1.

Such an allocation is not only proportional to the groups: $U^{\prime}_{k}\leq g\cdot U^{*}_{k}$ , but also proportional to various subsets of groups.
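As a small illustration of the definitions above, the following Python sketch computes the group utilities $U_{k}$ of an integral assignment and evaluates the left-hand side of the proportional fairness inequality. The function names and the dictionary-based data layout are assumptions made for this example; they do not come from the paper.

```python
from fractions import Fraction

def group_utilities(assignment, utility, groups):
    """Return [U_1, ..., U_g] for an integral assignment.

    assignment: dict student -> school
    utility:    dict (student, school) -> utility in [0, 1]
    groups:     list of sets of students (groups may overlap)
    """
    return [sum(utility[(i, assignment[i])] for i in group) for group in groups]

def proportionality_ratio(u_other, u_star):
    """Evaluate (1/g) * sum_k U'_k / U*_k (assumes every U*_k is positive)."""
    g = len(u_star)
    return sum(Fraction(a) / Fraction(b) for a, b in zip(u_other, u_star)) / g
```

A value of `proportionality_ratio` above one for some feasible `u_other` would certify that `u_star` is not the proportionally fair optimum.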

2.1 Main Result

Our main result is the following:

Theorem 1.

Let $\vec{U^{*}}$ be the optimal solution for any concave fairness function. Then, we can compute an assignment $\vec{y}\in\mathcal{P}$ that satisfies relaxed school capacities $\vec{C^{\prime}}$ and yields utilities $\vec{U^{\prime}}$ with $U^{\prime}_{k}\geq U^{*}_{k}$ for all groups $k$ , with one of the following guarantees on capacity:

1.

A polynomial time algorithm (throughout the paper, this means polynomial in both $n$ and $g$) that yields $C^{\prime}_{j}\leq C_{j}+1+\delta_{j}$, where $\sum_{j}\delta_{j}\leq 2g$.
2.

A $n^{O(g)}$ running time algorithm that yields $C^{\prime}_{j}\leq C_{j}+\delta_{j}$ with $\sum_{j}\delta_{j}=O(g^{2})$ .

Note that the latter algorithm is slower, but yields better violation of capacities if $g\ll|T|$ .

2.2 Hardness Results

The hardness results below motivate the need to violate capacities in Theorem 1 if we want to preserve utilities exactly. We first show that the problem under the max-min fairness objective is strongly NP-Hard when the number of groups $g$ is part of the input.

Theorem 2.

Suppose the number of groups $g$ is part of the input, and the objective is to decide if the minimum utility received by any group is at least one. Then the school redistricting problem is NP-Complete even when there are only two schools.

Proof.

We reduce from Set Cover with a collection $\mathcal{C}$ of sets, and a universe $U$ of elements. Suppose the goal is to decide if a set cover instance has $k$ sets that cover $U$ . Then each element becomes a group, and each set a student. A student belongs to a group if the corresponding set covers the corresponding element. There are two schools $s_{1}$ and $s_{2}$ . The former school has capacity $k$ and the latter has capacity $\infty$ . Each student has utility $1$ for $s_{1}$ and $0$ for $s_{2}$ . Then the goal of matching $k$ students to $s_{1}$ to give each group utility at least one is exactly the same as finding a set cover of size $k$ , completing the proof. ∎
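The reduction above can be sketched in Python as follows. This is only an illustrative check on tiny instances: the brute-force search is exponential, and the function names and the finite stand-in for the infinite capacity of $s_{2}$ are assumptions of this example.

```python
from itertools import combinations

def reduce_set_cover(sets, universe, k):
    """Build the school instance: one student per set, one group per element."""
    groups = {e: {s for s, members in enumerate(sets) if e in members}
              for e in universe}
    # s2 only needs to hold everyone, so len(sets) stands in for infinity.
    capacities = {"s1": k, "s2": len(sets)}
    return groups, capacities

def every_group_gets_one(groups, capacities, num_students):
    """Brute force: can min(k, n) students be sent to s1 so each group has one?"""
    k = min(capacities["s1"], num_students)
    for chosen in combinations(range(num_students), k):
        chosen = set(chosen)
        # A group has utility >= 1 exactly when one of its students is in s1.
        if all(chosen & members for members in groups.values()):
            return True
    return False
```

Sending more students to $s_{1}$ never lowers any group's utility, which is why the search only tries subsets of the maximum allowed size.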

We next show that the school redistricting problem under the max-min fairness (or proportionality) objective is weakly NP-Hard even when the number of groups $g=2$ .

Theorem 3.

Suppose the number of groups $g=2$ , and the objective is to decide if an exactly proportional allocation exists. Then the school redistricting problem is weakly NP-Complete.

Proof.

We reduce from Partition. Given a set of numbers $x_{1},\ldots,x_{n}$, the goal is to decide if there is a subset of sum exactly $X/2$ where $X=\sum_{i}x_{i}$.

For every number $x_{i}$ , create two students $p_{i}$ and $q_{i}$ , and one school $S_{i}$ of capacity 1. There is also a dummy school $S_{0}$ of capacity $n$ . The students $p_{i}$ and $q_{i}$ have edges only to $S_{i}$ and $S_{0}$ , where $p_{i}$ and $q_{i}$ have utility $x_{i}$ for $S_{i}$ and 0 for $S_{0}$ .

All the $p_{i}$ students belong to group 1 and all the $q_{i}$ students belong to group 2. We want to find a matching that gives utility $X/2$ to both groups, which is the proportional share.

Suppose there is a subset $T$ of the numbers that sums to exactly $X/2$ . Then for every $x_{i}\in T$ , we assign $p_{i}$ to $S_{i}$ and $q_{i}$ to $S_{0}$ . For every $x_{i}\notin T$ , we assign $q_{i}$ to $S_{i}$ and $p_{i}$ to $S_{0}$ . Both groups get utility $X/2$ each. The reverse direction is similar, completing the proof. ∎
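A brute-force sketch of this reduction, for intuition only: for each number $x_{i}$, school $S_{i}$ has capacity 1, so at most one of $p_{i}$, $q_{i}$ collects utility $x_{i}$ and the other goes to the dummy school. The function name is hypothetical, and the numbers are used directly as utilities, as in the proof, without rescaling.

```python
from itertools import product

def proportional_assignment_exists(xs):
    """Check whether both groups can receive utility exactly X/2.

    For each i, choice "p" sends p_i to S_i, "q" sends q_i to S_i,
    and "none" sends both students to the dummy school S_0.
    """
    total = sum(xs)
    for choice in product(("p", "q", "none"), repeat=len(xs)):
        u1 = sum(x for x, c in zip(xs, choice) if c == "p")  # group 1
        u2 = sum(x for x, c in zip(xs, choice) if c == "q")  # group 2
        if 2 * u1 == total and 2 * u2 == total:
            return True
    return False
```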

3 Proof of Theorem 1

In this section, we prove Theorem 1. We begin with a convex programming relaxation to the problem, and then present two rounding schemes that yield the two guarantees in the theorem.

3.1 Convex Program Relaxation

The first step is to write the following convex programming relaxation:

\mbox{Maximize}\sum_{k}f(U_{k})

\begin{array}[]{rcll}\sum_{i\in S_{k}}\sum_{j\in T}u_{ij}y_{ij}&\geq&U_{k}&\forall k\\ \sum_{j\in T}y_{ij}&=&1&\forall i\in S\\ \sum_{i\in S}y_{ij}&\leq&C_{j}&\forall j\in T\\ y_{ij}&\geq&0&\forall i\in S,j\in T\\ U_{k}&\geq&0&\forall\mbox{ groups }k\end{array}

This can clearly be solved in polynomial time. Let the optimal solution to the convex program yield utility vector $\vec{U^{*}}$ . We now need to round the following set of constraints, so that the $\{y_{ij}\}$ values are integer. Denote this formulation as (LP1).

\begin{array}[]{rcll}\sum_{i\in S_{k}}\sum_{j\in T}u_{ij}y_{ij}&\geq&U^{*}_{k}&\forall k\\ \sum_{j\in T}y_{ij}&=&1&\forall i\in S\\ \sum_{i\in S}y_{ij}&\leq&C_{j}&\forall j\in T\\ y_{ij}&\geq&0&\forall i\in S,j\in T\end{array}

We now present two rounding algorithms that yield the corresponding guarantees in Theorem 1.

3.2 Generalized Assignment Rounding

The first rounding algorithm is similar to rounding for generalized assignment (GAP) [7, 11], and is presented in Algorithm 1. We remark that the iterative procedure in Steps 1-6 is not necessary; the same can be achieved with a single LP solution. We present it this way for ease of exposition. We show that it achieves the following guarantee, yielding the first part of Theorem 1.

Theorem 4.

Algorithm 1 runs in polynomial time and finds an integer solution $\vec{y}\in\mathcal{P}$ that satisfies relaxed school capacities $\vec{C^{\prime}}$ and yields utilities $\vec{U^{\prime}}$ , where

1.

$U^{\prime}_{k}\geq U^{*}_{k}$ for all groups $k$ , and
2.

$C^{\prime}_{j}\leq C_{j}+1+\delta_{j}$ , where $\sum_{j}\delta_{j}\leq 2g$ .

Algorithm 1 GAP Rounding

1: repeat
2:   Obtain a vertex solution $\vec{y}$ to (LP1).
3:   for all $y_{ij}=b\in\{0,1\}$
4:     Fix $y_{ij}=b$ and remove this variable from (LP1), updating the constraints as needed.
5:   end for
6: until (LP1) is not modified.
7: For each remaining student $i$ (these have degree more than $1$), assign this student to $\mbox{argmax}_{j}\{u_{ij}|y_{ij}>0\}$
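The final rounding step of Algorithm 1 (Step 7) can be sketched in Python as below. The sketch assumes the fractional vertex solution is already available as a dictionary; solving (LP1) is outside its scope, and the function name and data layout are illustrative assumptions.

```python
def round_remaining_students(y, utility):
    """Assign each student to its highest-utility school among those with y_ij > 0.

    y:       dict (student, school) -> fractional value in [0, 1]
    utility: dict (student, school) -> utility in [0, 1]
    Returns a dict student -> school.
    """
    best = {}
    for (i, j), value in y.items():
        if value <= 0:
            continue  # only schools in the fractional support are candidates
        if i not in best or utility[(i, j)] > utility[(i, best[i])]:
            best[i] = j
    return best
```

Since each student's fractional values sum to one, the utility of the chosen school is at least the student's fractional utility, which is the observation used at the start of the proof.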

Proof of Theorem 4.

We denote the number of incident edges with $y_{ij}>0$ as the “degree” of a vertex. First note that since each student $i$ is assigned to $\mbox{argmax}_{j}\{u_{ij}|y_{ij}>0\}$ , the utility in the integer solution is at least that in (LP1).

We now argue the capacity violation. We note that Step 4 cannot violate any capacities, since $\vec{y}$ was feasible for (LP1). Thus, it only remains to argue that Step 7 does not incur too many capacity violations. Let $E$ be the set of remaining edges with $y_{ij}\in(0,1)$ . Since $\vec{y}$ was an extreme point solution to (LP1), $|E|$ constraints of (LP1) must be tight.

At the beginning of Step 7, let $T^{\prime}$ be the set of remaining schools and among these, let $\hat{T}$ be those whose capacity constraints are tight. Let $S^{\prime}$ be the set of remaining students. Note that each student has a tight constraint associated with it. Suppose $g^{\prime}$ of the $g$ constraints corresponding to the groups are tight. Since we have a vertex solution,

\displaystyle 2|E|=2(|S^{\prime}|+|\hat{T}|+g^{\prime})

(1)

From the Handshaking lemma, we also have

\sum_{v\in T^{\prime}\setminus\hat{T}}\mathrm{deg}(v)+\sum_{v\in S^{\prime}\cup\hat{T}}\mathrm{deg}(v)=2|E|.

Combining this with Eq. 1, we have

\begin{array}{rcl}\sum_{v\in T^{\prime}\setminus\hat{T}}\mathrm{deg}(v)+\sum_{v\in S^{\prime}\cup\hat{T}}\mathrm{deg}(v)-2|S^{\prime}|-2|\hat{T}|&=&2g^{\prime}\leq 2g\\ \implies\sum_{v\in T^{\prime}\setminus\hat{T}}\mathrm{deg}(v)+\sum_{v\in S^{\prime}\cup\hat{T}}(\mathrm{deg}(v)-2)&\leq&2g\end{array}

We know that each school in $\hat{T}$ has degree at least 2, since the capacity is an integer, and all assignment variables are strict fractions. Similarly, each student in $S^{\prime}$ has degree at least $2$ . Therefore, every term in the above summation is non-negative. Let school $v\in\hat{T}$ and student $i\in S^{\prime}$ have degree $2+\delta_{v}$ , while school $v\in T^{\prime}\setminus\hat{T}$ have degree $\delta_{v}$ . We will refer to these $\delta_{v}$ terms as the excess degrees. Then the above implies

\sum_{v\in T^{\prime}\cup S^{\prime}}\delta_{v}\leq 2g.

(2)

To bound the capacity violation in Step 7, we just observe that

•

If a school in $\hat{T}$ had degree 2, then it must have had capacity at least 1, and in the worst case, both students with edges to it will match to it. This leads to a violation of 1 in this school’s capacity.
•

For any other school $v\in\hat{T}$ , it again has capacity at least 1 and has $2+\delta_{v}$ students applying to it. In the worst case, this leads to a capacity violation of at most $1+\delta_{v}$ .
•

For schools $v\in T^{\prime}\setminus\hat{T}$ , since the degree is $\delta_{v}$ , this leads to a violation of at most $\delta_{v}$ .

In total, this leads to a violation of 1 per school and the excess degrees $\sum_{v}\delta_{v}$ lead to an additional $2g$ violation overall. This completes the proof. ∎

3.3 Improved Capacity Violation via Cake Cutting

In this part, we show the following theorem, corresponding to the second part of Theorem 1. The main advantage is that we don’t need extra capacity at each school, albeit at the expense of a worse running time.

Theorem 5.

There is a $n^{O(g)}$ time algorithm that computes an integer assignment $\vec{y}\in\mathcal{P}$ that satisfies relaxed school capacities $\vec{C^{\prime}}$ and yields utilities $\vec{U^{\prime}}$ , such that:

1.

$U^{\prime}_{k}\geq U^{*}_{k}$ for all groups $k$ ; and
2.

$C^{\prime}_{j}\leq C_{j}+\delta_{j}$ with $\sum_{j}\delta_{j}=O(g^{2})$ .

We prove the above theorem by replacing Step 7 in Algorithm 1 with a more involved cake cutting process, building on the work of [10].

Paths and Cycles.

Let $G$ be the graph at the beginning of Step 7 in Algorithm 1. Recall that the maximum degree in $G$ was 2, save for some vertices with excess degrees in Eq. 2. We will process the graph into a graph of maximum degree 2, with some additional properties, in Algorithm 2.

Algorithm 2 Graph Modification

1: for each student $i$ with degree strictly more than $2$
2:   Add a capacity of one to $j^{*}=\mbox{argmax}_{j}\{u_{ij}|y_{ij}>0\}$
3:   Fix $y_{ij^{*}}=1$ and remove this student.
4:   Decrease the capacity of $j^{*}$ correspondingly.
5: end for
6: $S_{1}=\{j|j\in\hat{T},\mbox{degree}(j)>2\}$
7: $S_{2}=\{j|j\in\hat{T},\mbox{degree}(j)=2,C_{j}=2\}$
8: for each school $j\in S_{1}\cup S_{2}\cup(T^{\prime}\setminus\hat{T})$
9:   $d=\mbox{degree}(j)$
10:   Create $d$ copies of $j$, each with capacity one.
11:   Assign (add an edge from) each $i$ with $y_{ij}>0$ to a distinct copy of $j$
12:   $\triangleright$ Each new school has degree one.
13: end for
14: If a school has degree one, reduce its capacity to one.

At the end of the process, let $G(V,E)$ be the resulting graph on fractional edges. Any vertex has degree at most two, and hence we get a graph with the following structure:

•

Every connected component is a path or a cycle.
•

Every student has degree exactly two, and is an internal node of a path or cycle.
•

Every school $j\in T^{\prime}$ has capacity one and degree at most two.
•

School $j\in T^{\prime}\setminus\hat{T}$ has degree one and capacity one, and is therefore a leaf of a path.

Lemma 6.

Algorithm 2 violates the total capacity by an additional $4g$ .

Proof.

For $j\in S_{1}$ , let $\mbox{degree}(j)=2+\delta_{j}>2$ , implying $\delta_{j}\geq 1$ . We increase the capacity by $1+\delta_{j}\leq 2\delta_{j}$ . For $j\in S_{2}$ , the new capacity is the same as the original capacity. For $j\in T^{\prime}\setminus\hat{T}$ , suppose $\mbox{degree}(j)=\delta_{j}$ , then we increase the capacity by $\delta_{j}$ . By Eq. 2, the total increase is therefore at most $4g$ . ∎

In the graph $G$ , suppose $e=(i,j)$ ; then we denote $x_{e}=y_{ij}$ . This graph is a collection of paths and cycles. For non-leaf school or student $v$ , the above conditions imply $x_{e_{1}}+x_{e_{2}}=1$ if the two edges incident on $v$ are $e_{1}$ and $e_{2}$ . This follows because a degree-two school must belong to $\hat{T}$ and corresponds to a tight constraint, and any student is associated with a tight constraint. This implies the following claim:

Claim 7.

For every component (path or cycle) $C$ of $G$ , there is some $\alpha\in(0,1)$ such that every even edge $e$ in the component has $x_{e}=\alpha$ and every odd edge has $x_{e}=1-\alpha$ .
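The reasoning behind Claim 7 is a simple propagation: once one edge value is known, the constraint $x_{e_{1}}+x_{e_{2}}=1$ at every internal vertex determines all the others. A minimal sketch, in which the function name and the convention that the first edge has index 0 are assumptions:

```python
from fractions import Fraction

def alternating_values(first_value, num_edges):
    """Propagate x_{e+1} = 1 - x_e along a path with num_edges >= 1 edges."""
    values = [Fraction(first_value)]
    for _ in range(num_edges - 1):
        # The shared vertex between consecutive edges has a tight constraint.
        values.append(1 - values[-1])
    return values
```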

Bounding the number of fractional components.

We view this fractional solution in the following way. For a component $C$, set $z_{C}=\alpha$ if $x_{e}=\alpha$ for every even edge. Let $u_{C}^{even}(\ell)$ be the utility that group $\ell$ gets in the assignment that selects all even edges (and none of the odd edges) of component $C$. Let $u_{C}^{odd}(\ell)$ be the utility that group $\ell$ gets in the assignment that selects all odd edges of component $C$. We modify (LP1) to the following, where $\hat{U}^{*}_{\ell}$ is the modified utility after removing all the integral variables.

\begin{array}[]{rcll}\sum_{C}\left(z_{C}\cdot u_{C}^{even}(\ell)+(1-z_{C})\cdot u_{C}^{odd}(\ell)\right)&\geq&\hat{U}^{*}_{\ell}&\forall\mbox{ groups }\ell\\ z_{C}&\in&[0,1]&\forall\mbox{ components }C\end{array}

Let $s$ denote the number of variables. In any extreme point solution, at least $s-g$ of the constraints $z_{C}\in[0,1]$ are tight, which means at most $g$ of the $z_{C}$ variables can be fractional, in $(0,1)$ . For all integral $z_{C}$ , we select the even matching if $z_{C}=1$ or the odd matching if $z_{C}=0$ . Remove these variables and rewrite the above LP just on the fractional variables.

Algorithm 3 Cake Frosting Rounding

1: for every student $i$
2:   if $[\frac{i-1}{r},\frac{i}{r})\subseteq X$ then
3:     Choose the edge from the even matching for student $i$, and include $i$ in set $T_{1}$
4:   else if $[\frac{i-1}{r},\frac{i}{r})\subseteq[0,1]\setminus X$ then
5:     Choose the edge from the odd matching for student $i$ and include $i$ in set $T_{2}$
6:   else
7:     Assign $i$ to $\mbox{argmax}_{j^{\prime}}\{u_{ij^{\prime}}|y_{ij^{\prime}}>0\}$
8:   end if
9: end for

Reduction to Cake Frosting.

For the at most $g$ components with fractional $z_{C}$, we need to find an integral solution that approximately preserves the utilities. This would be achieved if we could ‘interpolate’ $z_{C}$ fraction of the way from the odd matching to the even matching. We can view this as a cake-frosting problem as in [10] where the $g$ groups are the players. First, we convert each cycle into a path as follows: Pick some student $i$ on this cycle, assign $i$ to $\mbox{argmax}_{j}\{u_{ij}|y_{ij}>0\}$, and delete this student. This step increases the capacity of at most $g$ schools by one, and reduces each cycle to a path that begins and ends at a school.

Before proceeding, we recall the “cake frosting” lemma first presented in [14] and generalized in [5, 10].

Lemma 8 (Cake Frosting Lemma).

Given $g$ piecewise constant functions $f_{\ell},\ell=1,2,\ldots,g$ with domain $[0,1]$ , and given any $\alpha\in(0,1)$ , there is a ‘perfect frosting’ $X\subseteq[0,1]$ written as a union of at most $2g-1$ intervals such that for all $\ell$ :

\int_{X}f_{\ell}(x)dx=\alpha\cdot\int_{0}^{1}f_{\ell}(x)dx.
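To build intuition for the lemma, the special case $g=1$ can be worked out directly: the prefix integral $t\mapsto\int_{0}^{t}f_{1}(x)dx$ is continuous, so by the intermediate value theorem a single interval $X=[0,t]$ suffices, matching the bound of $2g-1=1$ intervals. The sketch below handles only this special case for a piecewise constant function on $r$ equal pieces; it is not the general construction of [14], and the function name is an assumption.

```python
from fractions import Fraction

def prefix_frosting(values, alpha):
    """Find t with integral_0^t f = alpha * integral_0^1 f, for a single f.

    values: f's constant value on each of the r equal pieces of [0, 1]
    alpha:  target fraction in [0, 1]
    """
    r = len(values)
    target = Fraction(alpha) * sum(Fraction(v) for v in values) / r
    cumulative = Fraction(0)  # integral of f over the pieces scanned so far
    for i, v in enumerate(values):
        v = Fraction(v)
        piece = v / r
        low, high = sorted((cumulative, cumulative + piece))
        if low <= target <= high:
            if v == 0:
                return Fraction(i, r)
            # Inside piece i the prefix integral is linear with slope v.
            return Fraction(i, r) + (target - cumulative) / v
        cumulative += piece
    raise ValueError("no crossing found; alpha must lie in [0, 1]")
```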

We now show how to apply the above lemma in a fashion similar to [10, 5]. Fix a path $C$. Let $z_{C}=\alpha$. Let there be $r$ students in $C$, indexed from $1$ to $r$. We divide the interval $[0,1]$ into $r$ parts where $[\frac{i-1}{r},\frac{i}{r})$ belongs to the $i^{th}$ student. (This is in contrast to the method in [10], which defines intervals based on schools.) Define for every group $\ell$,

u_{even}(\ell,i)=\begin{cases}u_{ij}&\mbox{ if the even matching assigns student $i$ to school $j$ and student $i$ is in group $\ell$}\\ 0&\mbox{ otherwise}\end{cases}

u_{odd}(\ell,i)=\begin{cases}u_{ij}&\mbox{ if the odd matching assigns student $i$ to school $j$ and student $i$ is in group $\ell$}\\ 0&\mbox{ otherwise}\end{cases}

For $x\in[\frac{i-1}{r},\frac{i}{r})$ , define $f_{\ell}(x)$ as

f_{\ell}(x)=r(u_{even}(\ell,i)-u_{odd}(\ell,i)).
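The scaling by $r$ in the definition is chosen so that the integral of $f_{\ell}$ over student $i$'s interval, which has length $1/r$, equals $u_{even}(\ell,i)-u_{odd}(\ell,i)$, and hence $\int_{0}^{1}f_{\ell}(x)dx=u_{C}^{even}(\ell)-u_{C}^{odd}(\ell)$. A small sketch that checks this identity numerically; the list-based representation of one group's utilities along a path is an assumption of the example.

```python
from fractions import Fraction

def frosting_function(u_even, u_odd):
    """Constant values of f_l on the r student intervals of a path."""
    r = len(u_even)
    return [r * (Fraction(e) - Fraction(o)) for e, o in zip(u_even, u_odd)]

def integral_over_unit_interval(values):
    """Integrate a piecewise constant function with len(values) equal pieces."""
    return sum(values, Fraction(0)) / len(values)
```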

Rounding Procedure.

For path $C$, we now apply the cake frosting lemma to the functions $f_{\ell}$ as defined above, with $\alpha=z_{C}$, to find the perfect frosting $X$ that is a union of at most $2g-1$ intervals. Given $X$, we construct the assignment as in Algorithm 3.

The final algorithm applies this procedure separately to each of the $g$ fractional paths. Note that the $\alpha$ value depends on the path.
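The case distinction of Algorithm 3 can be sketched as follows, assuming the frosting $X$ is already given as a list of disjoint half-open intervals (computing $X$ itself is not shown). The function name and the string labels are hypothetical.

```python
from fractions import Fraction

def classify_students(r, frosting):
    """Label each student 1..r as 'even' (T_1), 'odd' (T_2) or 'argmax' (T_3).

    frosting: list of disjoint intervals (lo, hi) whose union is X.
    """
    labels = []
    for i in range(1, r + 1):
        lo, hi = Fraction(i - 1, r), Fraction(i, r)
        # Total length of the student's interval that is covered by X.
        covered = sum((min(hi, b) - max(lo, a)
                       for a, b in frosting if min(hi, b) > max(lo, a)),
                      Fraction(0))
        if covered == hi - lo:
            labels.append("even")    # interval lies inside X
        elif covered == 0:
            labels.append("odd")     # interval lies outside X
        else:
            labels.append("argmax")  # interval contains a boundary of X
    return labels
```

Only students whose interval contains an endpoint of some interval of $X$ receive the label `argmax`, which is why the number of such students per path is $O(g)$.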

Analysis.

We first bound the utility of each group $\ell$ in path $C$ . Define $T_{3}:=[r]\setminus(T_{1}\cup T_{2})$ , i.e. the set of students in $C$ not in $T_{1}$ or $T_{2}$ . The utility of group $\ell$ in the integer solution is:

\begin{array}{cl}&\sum_{i\in T_{1}}u_{even}(\ell,i)+\sum_{i\in T_{2}}u_{odd}(\ell,i)+\sum_{i\in T_{3}}\max(u_{even}(\ell,i),u_{odd}(\ell,i))\\ =&\sum_{i\in T_{1}}(u_{even}(\ell,i)-u_{odd}(\ell,i))+\sum_{i\in[r]}u_{odd}(\ell,i)+\sum_{i\in T_{3}}\left(\max(u_{even}(\ell,i),u_{odd}(\ell,i))-u_{odd}(\ell,i)\right)\\ \geq&\sum_{i\in T_{1}}(u_{even}(\ell,i)-u_{odd}(\ell,i))+\sum_{i\in[r]}u_{odd}(\ell,i)+\sum_{i\in T_{3}}r\cdot\left|\left[\frac{i-1}{r},\frac{i}{r}\right)\cap X\right|\cdot(u_{even}(\ell,i)-u_{odd}(\ell,i))\\ =&\sum_{i\in T_{1}}\int_{x\in[\frac{i-1}{r},\frac{i}{r})}f_{\ell}(x)dx+u_{C}^{odd}(\ell)+\sum_{i\in T_{3}}\int_{x\in[\frac{i-1}{r},\frac{i}{r})\cap X}f_{\ell}(x)dx\\ =&\int_{x\in X}f_{\ell}(x)dx+u_{C}^{odd}(\ell)=\alpha\cdot\int_{x\in[0,1]}f_{\ell}(x)dx+u_{C}^{odd}(\ell)=\alpha\cdot(u_{C}^{even}(\ell)-u_{C}^{odd}(\ell))+u_{C}^{odd}(\ell)\\ =&\alpha\cdot u_{C}^{even}(\ell)+(1-\alpha)\cdot u_{C}^{odd}(\ell).\end{array}

The first equality follows by adding and subtracting $\sum_{i\in T_{1}\cup T_{3}}u_{odd}(\ell,i)$. The next line, which is the only inequality, follows from the observations that $r\cdot\left|\left[\frac{i-1}{r},\frac{i}{r}\right)\cap X\right|\leq 1$ and $\max(u_{even}(\ell,i),u_{odd}(\ell,i))-u_{odd}(\ell,i)\geq 0$. The following equality uses the definition of $f_{\ell}$, which integrates to $u_{even}(\ell,i)-u_{odd}(\ell,i)$ over the interval of student $i$. The next one follows from the structure of $X$ and $T_{1},T_{3}$, since the intervals of students in $T_{2}$ are disjoint from $X$, and the one after that follows from the cake frosting lemma together with $\int_{0}^{1}f_{\ell}(x)dx=u_{C}^{even}(\ell)-u_{C}^{odd}(\ell)$. The above chain shows that the integer solution for path $C$ gives each group $\ell$ at least its utility in the fractional solution.

To bound the total capacity violation, note that Algorithm 3 violates the capacity by one at every interval boundary. By the Cake Frosting lemma, this is an additional violation of $O(g)$ per path, and hence $O(g^{2})$ overall. This completes the proof of Theorem 5, and hence Theorem 1.

4 Generalization to Covering Constraints

We now generalize the setting in Theorems 4 and 5. As before, we are given a set $T$ of schools, where school $j$ has capacity $C_{j}$ , a set $S$ of students, and a bipartite graph $G=(S\cup T,E)$ between the students and schools. The objective is to find an integral assignment $\vec{y}$ of all the students that satisfies an additional set of $r$ covering constraints with non-negative coefficients.

In other words, we wish to solve the following integer program:

(IP)

\begin{array}{rcll}\sum_{j}y_{ij}&=&1&\forall\mbox{ students }i\qquad(3)\\ \sum_{i}y_{ij}&\leq&C_{j}&\forall\mbox{ schools }j\qquad(4)\\ \sum_{i,j}q_{ij}^{\ell}y_{ij}&\geq&Q_{\ell}&\forall\ell\in\{1,2,\ldots,r\}\qquad(5)\\ y_{ij}&\in&\{0,1\}&\forall i,j\qquad(6)\end{array}

As motivation, each $q_{ij}^{\ell}$ could capture a different type of utility student $i$ has for school $j$ , for instance, academic excellence, closeness, diversity of student body, etc, and we want a solution that achieves high average utility in all dimensions.
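For a given integral assignment, the covering constraints in Eq. 5 are easy to evaluate. The sketch below reports, for each constraint $\ell$, the achieved value minus the target $Q_{\ell}$, so a negative entry measures the additive shortfall. The function name and the data layout are assumptions for illustration.

```python
def covering_slack(assignment, q, targets):
    """Return achieved minus target for every covering constraint.

    assignment: dict student -> school (an integral solution)
    q:          list of dicts, q[l][(student, school)] = coefficient q_ij^l >= 0
    targets:    list of targets Q_l, one per constraint
    """
    return [sum(q_l[(i, j)] for i, j in assignment.items()) - target
            for q_l, target in zip(q, targets)]
```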

Define LP to be the linear relaxation of IP obtained by relaxing Eq. 6 to $y_{ij}\in[0,1]$. Unlike the previous section, a given $y_{ij}$ variable can appear in the constraints in an arbitrary way. We show that Theorems 4 and 5 generalize to arbitrary covering constraints at the cost of incurring an additive loss proportional to $Q_{\mathrm{max}}:=\max_{i,j,\ell}q_{ij}^{\ell}$ and $r$, the number of covering constraints.

4.1 Generalizing Theorem 4

Our algorithm runs in the following steps, which build on Algorithm 1.

1.

Solve the linear programming relaxation, fix and remove the integral variables, and find a vertex solution. Let $E$ be the set of fractional variables, and $S^{\prime}$ be the remaining students.
2.

Rewrite LP on the variables $E$ and without the capacity constraints Eq. 4.
3.

Keep eliminating integer variables, stopping at a vertex solution where all variables are fractional. Let $E^{\prime}$ be the remaining variables and $S^{\prime\prime}$ be the remaining students.
4.

Set an arbitrary $y_{ij}>0$ to $1$ for each $i\in S^{\prime\prime}$ .

Theorem 9.

For arbitrary covering constraints, when the linear programming relaxation has a feasible solution, there is a polynomial time algorithm that outputs an integer matching and that achieves the following guarantee:

•

The constraints Eq. 5 are preserved up to an additive $r\cdot Q_{\mathrm{max}}$ ; and
•

If each school is given one unit extra capacity, the total violation in Eq. 4 over this is $2r$ .

Proof.

First, the proof of Theorem 4 shows that regardless of how the students in $S^{\prime}$ are assigned, if each school is given one extra unit of capacity, then the total violation in capacity is at most $2r$ .

We can therefore focus on assigning the students so that the constraints Eq. 5 are not violated significantly. In Step (2), since any student in $S^{\prime\prime}$ has degree at least $2$ , we have $|E^{\prime}|\geq 2|S^{\prime\prime}|$ . Further, any extreme point in Step (3) has exactly $|E^{\prime}|$ tight constraints. But since the number of potential tight constraints is at most $|S^{\prime\prime}|+r$ , we obtain $|S^{\prime\prime}|\leq r$ . Therefore Step (4) violates each constraint by an additive $r\cdot Q_{\mathrm{max}}$ , completing the proof. ∎

4.2 Generalizing Theorem 5

We next generalize Theorem 5. We first apply Algorithm 1 to the linear programming relaxation. We then follow the procedure in Section 3.3 and sequentially apply Algorithms 2 and 3 to the fractional solution. To set up the cake frosting game for Algorithm 3, we view each of the $r$ constraints in Eq. 5 as a player of the cake frosting instance, with the function $u_{even}(\ell,i):=q_{ij}^{\ell}$, where $(i,j)$ is the even matching edge adjacent to $i$; $u_{odd}(\ell,i)$ is defined similarly. The only steps that differ are the assignment steps: Step 3 in Algorithm 2 and Step 7 in Algorithm 3. Here, we perform an arbitrary assignment of the students to the schools. For completeness, the full procedure is given in Algorithm 4.

Algorithm 4 Algorithm for Theorem 10

1: repeat
2:   Obtain a vertex solution $\vec{y}$ to (LP1).
3:   for all $y_{ij}=b\in\{0,1\}$
4:     Fix $y_{ij}=b$ and remove this variable from (LP1), updating the constraints as needed.
5:   end for
6: until (LP1) is not modified.
7: for each student $i$ with degree strictly more than $2$
8:   Add a capacity of one to an arbitrary school $j$ with $y_{ij}>0$
9:   Fix $y_{ij}=1$ and remove this student.
10:   Decrease the capacity of $j$ correspondingly.
11: end for
12: $S_{1}=\{j \mid j\in\hat{T},\ \mbox{degree}(j)>2\}$
13: $S_{2}=\{j \mid j\in\hat{T},\ \mbox{degree}(j)=2,\ C_{j}=2\}$
14: for each school $j\in S_{1}\cup S_{2}\cup(T^{\prime}\setminus\hat{T})$
15:   $d=\mbox{degree}(j)$
16:   Create $d$ copies of $j$, each with capacity one.
17:   Assign (add an edge from) each $i$ with $y_{ij}>0$ to a distinct copy of $j$
18: end for
19: If a school has degree one, reduce its capacity to one.
20: Set up the Cake Frosting instance as described in the text. Let $X$ be a perfect frosting.
21: for every student $i$
22:   if $[\frac{i-1}{r},\frac{i}{r})\subseteq X$ then
23:     Choose the edge from the even matching for student $i$
24:   else if $[\frac{i-1}{r},\frac{i}{r})\subseteq[0,1]\setminus X$ then
25:     Choose the edge from the odd matching for student $i$
26:   else
27:     Assign $i$ to some $j^{\prime}$ with $y_{ij^{\prime}}>0$
28:   end if
29: end for
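The selection loop in Lines 21 to 29 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the frosting $X$ is assumed to be given as a list of disjoint half-open intervals with rational endpoints, the parameter `cells` stands for the denominator in the interval expression of Lines 22 and 24, and the helper names are hypothetical. The sketch only decides, for each student, which of the three branches applies.

```python
from fractions import Fraction
from typing import List, Tuple

Interval = Tuple[Fraction, Fraction]  # half-open interval [a, b)

def covered_length(cell: Interval, frosting: List[Interval]) -> Fraction:
    """Length of the part of `cell` covered by X (a union of disjoint intervals)."""
    lo, hi = cell
    total = Fraction(0)
    for a, b in frosting:
        total += max(Fraction(0), min(hi, b) - max(lo, a))
    return total

def choose_edges(cells: int, frosting: List[Interval]) -> List[str]:
    """For student i = 1..cells, report which branch of Lines 22-27 is taken."""
    choices = []
    for i in range(1, cells + 1):
        cell = (Fraction(i - 1, cells), Fraction(i, cells))
        inside = covered_length(cell, frosting)
        if inside == cell[1] - cell[0]:
            choices.append("even")       # cell lies entirely inside X
        elif inside == 0:
            choices.append("odd")        # cell lies entirely outside X
        else:
            choices.append("arbitrary")  # cell is cut by a boundary of X
    return choices

# Hypothetical frosting X = [0, 1/4) ∪ [5/8, 1) with four cells.
frosting = [(Fraction(0), Fraction(1, 4)), (Fraction(5, 8), Fraction(1))]
choices = choose_edges(4, frosting)
```

Only cells cut by an endpoint of $X$ fall into the "arbitrary" branch; in this toy instance that is the third cell alone.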

Theorem 10.

When the linear programming relaxation has a feasible solution, if $r$ is the number of constraints Eq. 5, there is an $n^{O(r)}$-time algorithm that outputs an integer matching and achieves the following guarantee:

• The constraints Eq. 5 are preserved up to an additive $O(r^{2}\cdot Q_{\mathrm{max}})$; and
• The total violation in Eq. 4 is $O(r^{2})$.

Proof.

We first argue about the violation in Eq. 5. The only steps that affect the constraints are the assignment steps – Steps 9 and 27 in Algorithm 4. In Step 9, the number of students assigned is $O(r)$ from Eq. 2, while that in Step 27 is $O(r^{2})$ . If these students are arbitrarily assigned, each assignment loses an additive $Q_{\mathrm{max}}$ in the constraint. Therefore, the overall additive loss is $O(r^{2}\cdot Q_{\mathrm{max}})$ . Note that the capacity bound violation follows from the proof of Theorem 5 and holds even when these students are arbitrarily assigned. ∎

Using Theorem 4.12 in [5], we can strengthen Theorem 10. The idea is to guess the $8r^{2}/\epsilon$ chosen edges with highest utility for each group, and then apply Algorithms 2 and 3 in sequence. This yields the following corollary; we omit the standard details.

Corollary 11.

When the linear programming relaxation has a feasible solution, if $r$ is the number of constraints Eq. 5, then for any constant $\epsilon>0$, there is an $n^{O(r^{3}/\epsilon)}$-time algorithm that outputs an integer matching and achieves the following guarantee:

• The constraints Eq. 5 are preserved up to a multiplicative factor of $(1-\epsilon)$; and
• The total violation in Eq. 4 is $O(r^{2})$.

4.3 Better Bounds for Monotonic Constraints

We next show that if the constraints $Q$ have additional monotonicity structure, then we can generalize Theorem 4 without the additive loss in the constraints. We say that $Q$ satisfies “monotonicity” if for each student $i$ , there is an ordering $\succeq_{i}$ of the schools $j_{1}\succeq_{i}j_{2}\succeq_{i}\ldots\succeq_{i}j_{m}$ such that for all $\ell\in\{1,2,\ldots,r\}$ and $k\in\{1,2,\ldots,m-1\}$ , we have $q_{ij_{k}}^{\ell}\geq q_{ij_{k+1}}^{\ell}$ .
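The monotonicity condition can be tested student by student. The following Python sketch is one way to do so; the function name and input layout are assumptions made here, not part of the paper. It relies on the observation that a valid ordering exists exactly when the vectors $(q_{ij}^{1},\ldots,q_{ij}^{r})$ of the schools form a chain under coordinate-wise comparison, in which case sorting them lexicographically in decreasing order recovers such an ordering.

```python
from typing import List, Optional, Sequence

def monotone_order(q_i: Sequence[Sequence[float]]) -> Optional[List[int]]:
    """q_i[l][j] stands for q_{ij}^l for one fixed student i.

    Returns school indices from most to least preferred if a common
    ordering exists for all constraints l, and None otherwise.
    """
    m = len(q_i[0])
    # One vector per school, collecting its coefficient in every constraint.
    vectors = [tuple(row[j] for row in q_i) for j in range(m)]
    # Candidate ordering: lexicographically decreasing vectors.
    order = sorted(range(m), key=lambda j: vectors[j], reverse=True)
    # Consecutive schools must be coordinate-wise non-increasing;
    # transitivity then covers all other pairs.
    for a, b in zip(order, order[1:]):
        if any(x < y for x, y in zip(vectors[a], vectors[b])):
            return None
    return order

# Hypothetical coefficients for one student, r = 2 constraints, 3 schools.
good = monotone_order([[0.9, 0.2, 0.5], [1.0, 0.0, 0.5]])
bad = monotone_order([[1.0, 0.0], [0.0, 1.0]])
```

In the first instance school 0 dominates school 2, which dominates school 1, in both constraints. In the second, the two constraints rank the two schools in opposite ways, so no common ordering exists.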

Theorem 12.

If the constraints $Q$ are monotone, when the linear programming relaxation has a feasible solution, there is a polynomial time algorithm that outputs an integer matching and that achieves the following guarantee:

• The constraints Eq. 5 are preserved; and
• If each school is given one unit of extra capacity, the total violation in Eq. 4 beyond this is at most $2r$.

Theorem 13.

If the constraints $Q$ are monotone, when the linear programming relaxation has a feasible solution, there is an $n^{O(r)}$-time algorithm that outputs an integer matching and achieves the following guarantee:

• The constraints Eq. 5 are preserved; and
• The total violation in Eq. 4 is $O(r^{2})$.

Proof of Theorems 12 and 13.

We proceed as in Theorems 4 and 5. At each point where the algorithm assigns a student $i$ to $j^{*}=\arg\max_{j\in X}u_{ij}$ for some set $X$, we instead assign $i$ to the school $j_{k}\in X$ with the smallest index $k$, that is, to its most preferred school in $X$ according to $\succeq_{i}$. By monotonicity, this school maximizes $q_{ij}^{\ell}$ over $j\in X$ for every $\ell$ simultaneously, so the $r$ constraints are also preserved. ∎
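The final step of this argument can be written out as a short inequality. This is a sketch under an assumption that is not stated explicitly above: the fractional values $y_{ij}$ of student $i$ are supported on $X$, are non-negative, and sum to one. The symbol $j^{\dagger}$, denoting the most preferred school of $i$ in $X$, is notation introduced here for illustration.

```latex
% For every constraint \ell \in \{1,\ldots,r\}: the first equality uses monotonicity,
% the inequality holds because a maximum dominates any convex combination.
q_{ij^{\dagger}}^{\ell}
  \;=\; \max_{j\in X} q_{ij}^{\ell}
  \;\geq\; \sum_{j\in X} y_{ij}\, q_{ij}^{\ell},
\qquad \text{where } \sum_{j\in X} y_{ij}=1,\; y_{ij}\geq 0 .
```

Under that assumption, replacing the fractional row of student $i$ by the single edge $(i,j^{\dagger})$ does not decrease the left-hand side of any of the $r$ covering constraints.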

Application: Weak Dominance of Ranks.

As a special case, we consider the setting in [6]. Here, every student ranks the schools it has an edge to, and this ranking may have ties. Let $r$ be the largest rank any student has, which can be much smaller than the number of schools. Given a matching, the rank of edge $(p,q)$ is the rank of school $q$ in student $p$’s ranking. A matching $M$ has signature $\sigma=(\sigma_{1},\sigma_{2},\ldots,\sigma_{r})$ if it has $\sigma_{t}$ rank $t$ edges for every $t\in[r]$. We say that signature $\sigma$ weakly dominates signature $\rho$ (the authors of [6] use ‘cumulatively better than’), denoted $\sigma\succ\rho$, if

$$\forall\, t\in[r],\quad \sum_{t^{\prime}=1}^{t}\sigma_{t^{\prime}}\geq\sum_{t^{\prime}=1}^{t}\rho_{t^{\prime}}. \tag{7}$$
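The definitions of signature and weak dominance translate directly into a prefix-sum check. The Python sketch below is illustrative only; the function names and the representation of ranks as a dictionary keyed by (student, school) edges are assumptions made here.

```python
from itertools import accumulate
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (student, school)

def signature(matching: List[Edge], rank: Dict[Edge, int], r: int) -> List[int]:
    """sigma[t-1] counts the matched edges of rank t, for t = 1..r."""
    sigma = [0] * r
    for edge in matching:
        sigma[rank[edge] - 1] += 1
    return sigma

def weakly_dominates(sigma: List[int], rho: List[int]) -> bool:
    """Eq. 7: every prefix sum of sigma is at least that of rho."""
    return all(s >= p for s, p in zip(accumulate(sigma), accumulate(rho)))

# Hypothetical instance: three students, largest rank r = 3.
rank = {("p1", "a"): 1, ("p2", "b"): 3, ("p3", "a"): 2}
sigma = signature(list(rank), rank, 3)            # [1, 1, 1]
dominates = weakly_dominates(sigma, [0, 2, 1])    # prefix sums (1,2,3) vs (0,2,3)
reverse = weakly_dominates([0, 2, 1], sigma)      # fails at t = 1
```

Note that weak dominance compares cumulative counts, so $(1,1,1)$ dominates $(0,2,1)$ even though it has fewer rank-2 edges.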

Given an input signature $\rho$, the goal is to find a matching whose signature weakly dominates $\rho$. We term this the matching with ranking problem. We have the following theorems, which follow directly from the observation that the constraints satisfy the monotonicity assumption. At each step where, for some set $X$, we assign $i$ to $\arg\max_{j\in X}{u_{ij}}$, we instead assign it to its most preferred school from $X$. This preserves the signature of any fractional assignment. Our approach yields faster, deterministic $n^{O(r)}$-time algorithms at the cost of small violations in capacities, whereas the algorithm in [6] is randomized and takes $n^{O(r^{2})}$ time. Their approach essentially uses the algorithm for Exact Matchings [8], and it has been a longstanding open question whether that algorithm can be derandomized.

Theorem 14.

If there is a feasible solution to matching with ranking with input signature $\rho$, then there is an algorithm that runs in time $\mathrm{poly}(n,r)$ and finds a matching with signature $\sigma\succ\rho$ that satisfies relaxed capacities $C^{\prime}$ where $C^{\prime}_{j}\leq C_{j}+1+\delta_{j}$, and $\sum_{j}\delta_{j}\leq 2r$.

Theorem 15.

If there is a feasible solution to matching with ranking with input signature $\rho$ , then there is an algorithm that runs in time $n^{O(r)}$ and finds a matching with signature $\sigma\succ\rho$ that satisfies relaxed capacities $C^{\prime}$ where $C^{\prime}_{j}\leq C_{j}+\delta_{j}$ , and $\sum_{j}\delta_{j}=O(r^{2})$ .

References

[1] Atila Abdulkadiroğlu and Tayfun Sönmez. School choice: A mechanism design approach. American economic review, 93(3):729–747, 2003.
[2] Federico Bobbio, Margarida Carvalho, Andrea Lodi, and Alfredo Torrico. Capacity variation in the many-to-one stable matching, 2022.
[3] Jiehua Chen and Gergely Csáji. Optimal capacity modification for many-to-one matching problems. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, AAMAS ’23, page 2880–2882, Richland, SC, 2023. International Foundation for Autonomous Agents and Multiagent Systems.
[4] Michel X Goemans. Minimum bounded degree spanning trees. In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), pages 273–282. IEEE, 2006.
[5] Fabrizio Grandoni, R. Ravi, Mohit Singh, and Rico Zenklusen. New approaches to multi-objective optimization. Math. Program., 146(1-2):525–554, 2014.
[6] Santhini K. A., Govind S. Sankar, and Meghana Nasre. Optimal matchings with one-sided preferences: Fixed and cost-based quotas. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, AAMAS ’22, page 696–704, Richland, SC, 2022. International Foundation for Autonomous Agents and Multiagent Systems.
[7] Jan Karel Lenstra, David B. Shmoys, and Éva Tardos. Approximation algorithms for scheduling unrelated parallel machines. In 28th Annual Symposium on Foundations of Computer Science (sfcs 1987), pages 217–224, 1987.
[8] Ketan Mulmuley, Umesh V Vazirani, and Vijay V Vazirani. Matching is as easy as matrix inversion. In Proceedings of the nineteenth annual ACM symposium on Theory of computing, pages 345–354, 1987.
[9] Thanh Nguyen and Rakesh Vohra. Near-feasible stable matchings with couples. American Economic Review, 108(11):3154–3169, 2018.
[10] Ariel D. Procaccia, Isaac Robinson, and Jamie Tucker-Foltz. School redistricting: Wiping unfairness off the map. In Proc. ACM-SIAM SODA, 2024.
[11] David B. Shmoys and Éva Tardos. An approximation algorithm for the generalized assignment problem. Math. Program., 62:461–474, 1993.
[12] Mohit Singh and Lap Chi Lau. Approximating minimum bounded degree spanning trees to within one of optimal. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’07, page 661–670, New York, NY, USA, 2007. Association for Computing Machinery.
[13] A. H. Stone and J. W. Tukey. Generalized “sandwich” theorems. Duke Mathematical Journal, 9(2):356 – 359, 1942.
[14] Walter Stromquist and D.R Woodall. Sets on which several measures agree. Journal of Mathematical Analysis and Applications, 108(1):241–248, 1985.