-
Multiway Cuts with a Choice of Representatives
Authors:
Kristóf Bérczi,
Tamás Király,
Daniel P. Szabo
Abstract:
In this paper, we study several generalizations of multiway cut where the terminals can be chosen as \emph{representatives} from sets of \emph{candidates} $T_1,\ldots,T_q$. In this setting, one is allowed to choose these representatives so that the minimum-weight cut separating these sets \emph{via their representatives} is as small as possible. We distinguish different cases depending on (A) whether the representative of a candidate set has to be separated from the other candidate sets completely or only from the representatives, and (B) whether there is a single representative for each candidate set or the choice of representative is independent for each pair of candidate sets. For fixed $q$, we give approximation algorithms for each of these problems that match the best known approximation guarantee for multiway cut. Our technical contribution is a new extension of the CKR relaxation that preserves approximation guarantees. For general $q$, we show $o(\log q)$-inapproximability for all cases where the choice of representatives may depend on the pair of candidate sets, as well as for the case where the goal is to separate a fixed node from a single representative from each candidate set. As a positive result, we give a $2$-approximation algorithm for the case where we need to choose a single representative from each candidate set. This is a generalization of the $(2-2/k)$-approximation for k-cut, and we can solve it by relating the tree case to optimization over a gammoid.
Submitted 4 July, 2024;
originally announced July 2024.
-
A Uniformly Random Solution to Algorithmic Redistricting
Authors:
Jin-Yi Cai,
Jacob Kruse,
Kenneth Mayer,
Daniel P. Szabo
Abstract:
The process of drawing electoral district boundaries is known as political redistricting. Within this context, gerrymandering is the practice of drawing these boundaries such that they unfairly favor a particular political party, often leading to unequal representation and skewed electoral outcomes. One of the few ways to detect gerrymandering is by algorithmically sampling redistricting plans. Previous methods mainly focus on sampling from some neighborhood of ``realistic'' districting plans, rather than a uniform sample of the entire space. We present a deterministic subexponential time algorithm to uniformly sample from the space of all possible $ k $-partitions of a bounded degree planar graph, and with this construct a sample of the entire space of redistricting plans. We also give a way to restrict this sample space to plans that match certain compactness and population constraints at the cost of added complexity. The algorithm runs in $ 2^{O(\sqrt{n}\log n)} $ time, although we only give a heuristic implementation. Our method generalizes an algorithm to count self-avoiding walks on a square to count paths that split general planar graphs into $ k $ regions, and uses this to sample from the space of all $ k $-partitions of a planar graph.
Submitted 21 February, 2024;
originally announced February 2024.
-
Diameter Reduction Via Flipping Arcs
Authors:
Panna Gehér,
Max Kölbl,
Lydia Mirabel Mendoza-Cadena,
Daniel P. Szabo
Abstract:
The diameter of a directed graph is a fundamental parameter defined as the maximum distance realized among the pairs of vertices. As graphs of small diameter are of interest in many applications, we study the following problem: for a given directed graph and a positive integer $d$, what is the minimum number of arc flips (also known as arc reversal) required to obtain a graph with diameter at most $d$? It is a generalization of the well-known problem \textsc{Oriented Diameter}, first studied by Chvátal and Thomassen. Here we investigate variants of the above problem, considering the number of flips and the target diameter as parameters. We prove that most of the related questions of this type are hard. Special cases of graphs are also considered, as planar and cactus graphs, where we give polynomial time algorithms.
Submitted 9 February, 2024;
originally announced February 2024.
-
Holey graphs: very large Betti numbers are testable
Authors:
Dániel Szabó,
Simon Apers
Abstract:
We show that the graph property of having a (very) large $k$-th Betti number $β_k$ for constant $k$ is testable with a constant number of queries in the dense graph model. More specifically, we consider a clique complex defined by an underlying graph and prove that for any $\varepsilon>0$, there exists $δ(\varepsilon,k)>0$ such that testing whether $β_k \geq (1-δ) d_k$ for $δ\leq δ(\varepsilon,k)$ reduces to tolerantly testing $(k+2)$-clique-freeness, which is known to be testable. This complements a result by Elek (2010) showing that Betti numbers are testable in the bounded-degree model. Our result combines the Euler characteristic, matroid theory and the graph removal lemma.
Submitted 11 January, 2024;
originally announced January 2024.
-
Bringing Spatial Interaction Measures into Multi-Criteria Assessment of Redistricting Plans Using Interactive Web Mapping
Authors:
Jacob Kruse,
Song Gao,
Yuhan Ji,
Daniel P. Szabo,
Kenneth Mayer
Abstract:
Redistricting is the process by which electoral district boundaries are drawn, and a common normative assumption in this process is that districts should be drawn so as to capture coherent communities of interest (COIs). While states rely on various proxies for community illustration, such as compactness metrics and municipal split counts, to guide redistricting, recent legal challenges and scholarly works have shown the failings of such proxy measures and the difficulty of balancing multiple criteria in district plan creation. To address these issues, we propose the use of spatial interaction communities to directly quantify the degree to which districts capture the underlying COIs. Using large-scale human mobility flow data, we condense spatial interaction community capture for a set of districts into a single number, the interaction ratio (IR), which can be used for redistricting plan evaluation. To compare the IR to traditional redistricting criteria (compactness and fairness), and to explore the range of IR values found in valid districting plans, we employ a Markov chain-based regionalization algorithm (ReCom) to produce ensembles of valid plans, and calculate the degree to which they capture spatial interaction communities. Furthermore, we propose two methods for biasing the ReCom algorithm towards different IR values. We perform a multi-criteria assessment of the space of valid maps, and present the results in an interactive web map. The experiments on Wisconsin congressional districting plans demonstrate the effectiveness of our methods for biasing sampling towards higher or lower IR values. Furthermore, the analysis of the districts produced with these methods suggests that districts with higher IR and compactness values tend to produce district plans that are more proportional with regards to seats allocated to each of the two major parties.
Submitted 23 September, 2023;
originally announced September 2023.
-
Constructing Highly Symmetric Compact Manifolds and Algebraic Varieties
Authors:
Dávid R. Szabó
Abstract:
For every algebraically closed field and natural number $r$, we construct an algebraic variety (over the field) whose birational automorphism group contains every finite nilpotent group of class at most $2$ whose rank is at most $r$ and whose order is coprime to the characteristic of the field. This construction is sharp in characteristic $0$, i.e. up to bounded extension, the set of groups from the statement cannot be replaced by a larger one.
Using similar main steps, for every $r$, we construct several compact manifolds whose diffeomorphism groups contain every finite nilpotent group of class at most $2$ whose rank is at most $r$. This result answers a question of Mundet i Riera affirmatively and is conjecturally sharp up to bounded extension.
Submitted 20 April, 2023;
originally announced April 2023.
-
Finite Class 2 Nilpotent and Heisenberg Groups
Authors:
Dávid R. Szabó
Abstract:
We present a structural description of finite nilpotent groups of class at most $2$ using a specified number of subdirect and central products of $2$-generated such groups. As a corollary, we show that all of these groups are isomorphic to a subgroup of a Heisenberg group satisfying certain properties. The motivation for these results is of topological nature as they can be used to give lower bounds to the nilpotently Jordan property of the birational automorphism group of varieties and the homeomorphism group of compact manifolds.
Submitted 13 April, 2023; v1 submitted 4 January, 2023;
originally announced January 2023.
-
A (simple) classical algorithm for estimating Betti numbers
Authors:
Simon Apers,
Sander Gribling,
Sayantan Sen,
Dániel Szabó
Abstract:
We describe a simple algorithm for estimating the $k$-th normalized Betti number of a simplicial complex over $n$ elements using the path integral Monte Carlo method. For a general simplicial complex, the running time of our algorithm is $n^{O\left(\frac{1}{\sqrtγ}\log\frac{1}{\varepsilon}\right)}$ with $γ$ measuring the spectral gap of the combinatorial Laplacian and $\varepsilon \in (0,1)$ the additive precision. In the case of a clique complex, the running time of our algorithm improves to $\left(n/λ_{\max}\right)^{O\left(\frac{1}{\sqrtγ}\log\frac{1}{\varepsilon}\right)}$ with $λ_{\max} \geq k$, where $λ_{\max}$ is the maximum eigenvalue of the combinatorial Laplacian. Our algorithm provides a classical benchmark for a line of quantum algorithms for estimating Betti numbers. On clique complexes it matches their running time when, for example, $γ\in Ω(1)$ and $k \in Ω(n)$.
Submitted 5 December, 2023; v1 submitted 17 November, 2022;
originally announced November 2022.
-
End-to-End Annotator Bias Approximation on Crowdsourced Single-Label Sentiment Analysis
Authors:
Gerhard Johann Hagerer,
David Szabo,
Andreas Koch,
Maria Luisa Ripoll Dominguez,
Christian Widmer,
Maximilian Wich,
Hannah Danner,
Georg Groh
Abstract:
Sentiment analysis is often a crowdsourcing task prone to subjective labels given by many annotators. It is not yet fully understood how the annotation bias of each annotator can be modeled correctly with state-of-the-art methods. However, resolving annotator bias precisely and reliably is the key to understand annotators' labeling behavior and to successfully resolve corresponding individual misconceptions and wrongdoings regarding the annotation task. Our contribution is an explanation and improvement for precise neural end-to-end bias modeling and ground truth estimation, which reduces an undesired mismatch in that regard of the existing state-of-the-art. Classification experiments show that it has potential to improve accuracy in cases where each sample is annotated only by one single annotator. We provide the whole source code publicly and release an own domain-specific sentiment dataset containing 10,000 sentences discussing organic food products. These are crawled from social media and are singly labeled by 10 non-expert annotators.
Submitted 24 July, 2023; v1 submitted 3 November, 2021;
originally announced November 2021.
-
Does home advantage without crowd exist in American football?
Authors:
Dávid Zoltán Szabó,
Diego Andrés Pérez
Abstract:
It is well-known that home team has an inherent advantage against visiting teams when playing team sports. One of the most obvious underlying reasons, the presence of supporting fans has mostly disappeared in major leagues with the emergence of COVID-19 pandemic. This paper investigates with the help of historical National Football League (NFL) data, how much effect spectators have on the game outcome. Our findings reveal that under no allowance of spectators the home teams' performance is substantially lower than under normal circumstances, even performing slightly worse than the visiting teams. On the other hand, when a limited amount of spectators are allowed to the game, the home teams' performance is no longer significantly different than what we observe with full stadiums. This suggests that from a psychological point of view the effect of crowd support is already induced by a fraction of regular fans.
Submitted 22 April, 2021;
originally announced April 2021.
-
Quantum Inspired Adaptive Boosting
Authors:
Bálint Daróczy,
Katalin Friedl,
László Kabódi,
Attila Pereszlényi,
Dániel Szabó
Abstract:
Building on the quantum ensemble based classifier algorithm of Schuld and Petruccione [arXiv:1704.02146v1], we devise equivalent classical algorithms which show that this quantum ensemble method does not have advantage over classical algorithms. Essentially, we simplify their algorithm until it is intuitive to come up with an equivalent classical version. One of the classical algorithms is extremely simple and runs in constant time for each input to be classified. We further develop the idea and, as the main contribution of the paper, we propose methods inspired by combining the quantum ensemble method with adaptive boosting. The algorithms were tested and found to be comparable to the AdaBoost algorithm on publicly available data sets.
Submitted 1 February, 2021;
originally announced February 2021.
-
Pristine annotations-based multi-modal trained artificial intelligence solution to triage chest X-ray for COVID-19
Authors:
Tao Tan,
Bipul Das,
Ravi Soni,
Mate Fejes,
Sohan Ranjan,
Daniel Attila Szabo,
Vikram Melapudi,
K S Shriram,
Utkarsh Agrawal,
Laszlo Rusko,
Zita Herczeg,
Barbara Darazs,
Pal Tegzes,
Lehel Ferenczi,
Rakesh Mullick,
Gopal Avinash
Abstract:
The COVID-19 pandemic continues to spread and impact the well-being of the global population. The front-line modalities including computed tomography (CT) and X-ray play an important role for triaging COVID patients. Considering the limited access of resources (both hardware and trained personnel) and decontamination considerations, CT may not be ideal for triaging suspected subjects. Artificial intelligence (AI) assisted X-ray based applications for triaging and monitoring require experienced radiologists to identify COVID patients in a timely manner and to further delineate the disease region boundary are seen as a promising solution. Our proposed solution differs from existing solutions by industry and academic communities, and demonstrates a functional AI model to triage by inferencing using a single x-ray image, while the deep-learning model is trained using both X-ray and CT data. We report on how such a multi-modal training improves the solution compared to X-ray only training. The multi-modal solution increases the AUC (area under the receiver operating characteristic curve) from 0.89 to 0.93 and also positively impacts the Dice coefficient (0.59 to 0.62) for localizing the pathology. To the best of our knowledge, it is the first X-ray solution by leveraging multi-modal information for the development.
Submitted 10 November, 2020;
originally announced November 2020.
-
Discrete element model for high strain rate deformations of snow
Authors:
Thiemo Theile,
Denes Szabo,
Carolin Willibald,
Martin Schneebeli
Abstract:
In engineering applications snow often undergoes large and fast deformations. During these deformations the snow transforms from a sintered porous material into a granular material. In order to capture the fundamental mechanical behavior of this process a discrete element (DE) model is the physically most appropriate. It explicitly includes all the relevant components: the snow microstructure, consisting of bonded grains, the breaking of the bonds and the following rearrangement and interaction of the loose grains. We developed and calibrated a DE snow model based on the open source DE code liggghts. In the model snow grains are represented by randomly distributed elastic spheres connected by elastic-brittle bonds. This bonded structure corresponds to sintered snow. After applying external forces, the stresses in the bonds might exceed their strength, the bonds break, and we obtain loose particles, corresponding to granular snow. Model parameters can be divided into temperature dependent material parameters and snow type dependent microstructure parameters. The model was calibrated by angle of repose experiments and several high strain rate mechanical tests, performed in a cold laboratory. We demonstrate the performance of the DE snow model by the simulation of a combined compression and shear deformation of different snow types with large strains. The model successfully reproduces the experiments. Most characteristics of the mechanical snow behavior are captured by the model, like the fracture behavior, the differences between low and high density snow, the granular shear flow or the densification of low density snow. The model is promising to simulate arbitrary high strain rate processes for a wide range of snow types, and thus seems useful to be applied to different snow engineering problems.
Submitted 3 July, 2020;
originally announced July 2020.
-
An Automated Approach for the Discovery of Interoperability
Authors:
Duygu Sap,
Daniel P. Szabo
Abstract:
In this article, we present an automated approach that would test for and discover the interoperability of CAD systems based on the approximately-invariant shape properties of their models. We further show that exchanging models in standard format does not guarantee the preservation of shape properties. Our analysis is based on utilizing queries in deriving the shape properties and constructing the proxy models of the given CAD models [1]. We generate template files to accommodate the information necessary for the property computations and proxy model constructions, and implement an interoperability discovery program called DTest to execute the interoperability testing. We posit that our method could be extended to interoperability testing on CAD-to-CAE and/or CAD-to-CAM interactions by modifying the set of property checks and providing the additional requirements that may emerge in CAE or CAM applications.
Submitted 26 January, 2020;
originally announced January 2020.
-
Special $p$-groups acting on compact manifolds
Authors:
Dávid R. Szabó
Abstract:
Riera proved at arXiv:1412.6964 that the diffeomorphism group of particular compact manifolds are not Jordan by exhibiting subgroups isomorphic to extra-special $p$-groups of exponent $p$ for primes $p$ satisfying some conditions. Generalising the methods of that paper, we construct a compact connected smooth real manifold for every natural number $r$ whose diffeomorphism group contains not only every extra-special $p$-group, but also every special $p$-group of order $p^r$ independently of its exponent for every prime $p$. We obtain a similar statement about finite Heisenberg groups as well as we display a very explicit counterexample to the conjecture of Ghys about Jordan property of diffeomorphism group of compact manifolds.
Submitted 23 December, 2019; v1 submitted 22 January, 2019;
originally announced January 2019.
-
Network Coding as a Service
Authors:
Dávid Szabó,
Attila Csoma,
Péter Megyesi,
András Gulyás,
Frank H. P. Fitzek
Abstract:
Network Coding (NC) shows great potential in various communication scenarios through changing the packet forwarding principles of current networks. It can improve not only throughput, latency, reliability and security but also alleviates the need of coordination in many cases. However, it is still controversial due to widespread misunderstandings on how to exploit the advantages of it. The aim of the paper is to facilitate the usage of NC by $(i)$ explaining how it can improve the performance of the network (regardless the existence of any butterfly in the network), $(ii)$ showing how Software Defined Networking (SDN) can resolve the crucial problems of deployment and orchestration of NC elements, and $(iii)$ providing a prototype architecture with measurement results on the performance of our network coding capable software router implementation compared by fountain codes.
Submitted 13 January, 2016;
originally announced January 2016.
-
Deductive Way of Reasoning about the Internet AS Level Topology
Authors:
Dávid Szabó,
Attila Kőrösi,
József Bíró,
András Gulyás
Abstract:
Our current understanding about the AS level topology of the Internet is based on measurements and inductive-type models which set up rules describing the behavior (node and edge dynamics) of the individual ASes and generalize the consequences of these individual actions for the complete AS ecosystem using induction. In this paper we suggest a third, deductive approach in which we have premises for the whole AS system and the consequences of these premises are determined through deductive reasoning. We show that such a deductive approach can give complementary insights into the topological properties of the AS graph. While inductive models can mostly reflect high level statistics (e.g. degree distribution, clustering, diameter), deductive reasoning can identify omnipresent subgraphs and peering likelihood. We also propose a model, called YEAS, incorporating our deductive analytical findings that produces topologies contain both traditional and novel metrics for the AS level Internet.
Submitted 10 December, 2015;
originally announced December 2015.
-
Comparing dealing methods with repeating cards
Authors:
Marton Balazs,
David Zoltan Szabo
Abstract:
In a recent work Conger and Howald derived asymptotic formulas for the randomness, after shuffling, of decks with repeating cards or all-distinct decks dealt into hands. In the latter case the deck does not need to be fully randomized: the order of cards received by a player is indifferent. They called these cases the "fixed source" and the "fixed target" case, respectively, and treated them separately. We build on their results and mix these two cases: we obtain asymptotic formulas for the randomness of a deck of repeating cards which is shuffled and then dealt into hands of players. We confirm that switching from ordered to cyclic dealing, or from cyclic to back-and-forth dealing improves randomness in a similar fashion than in the non-repeating "fixed target" case. Our formulas allow to improve even the back-and-forth dealing when the deck only contains two types of cards.
Submitted 18 January, 2014; v1 submitted 3 August, 2012;
originally announced August 2012.