
Hypergraph regularity and random sampling

Authors: Felix Joos, Jaehoon Kim, Daniela Kühn, Deryk Osthus

Abstract: Suppose a $k$-uniform hypergraph $H$ satisfies a certain regularity instance (that is, there is a partition of $H$ given by the hypergraph regularity lemma into a bounded number of quasirandom subhypergraphs of prescribed densities). We prove that with high probability a large enough uniform random sample of the vertex set of $H$ also admits the same regularity instance. Here the crucial feature is that the error term measuring the quasirandomness of the subhypergraphs requires only an arbitrarily small additive correction. This has applications to combinatorial property testing. The graph case of the sampling result was proved by Alon, Fischer, Newman and Shapira.

Submitted 11 August, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

Comments: 49 pages; we split our paper arXiv:1707.03303 into two, this one and the new version of arXiv:1707.03303. Final version, to appear in Random Structures and Algorithms

arXiv:2109.11438 [pdf, ps, other]

A special case of Vu's conjecture: Coloring nearly disjoint graphs of bounded maximum degree

Authors: Tom Kelly, Daniela Kühn, Deryk Osthus

Abstract: A collection of graphs is \textit{nearly disjoint} if every pair of them intersects in at most one vertex. We prove that if $G_1, \dots, G_m$ are nearly disjoint graphs of maximum degree at most $D$, then the following holds. For every fixed $C$, if each vertex $v \in \bigcup_{i=1}^m V(G_i)$ is contained in at most $C$ of the graphs $G_1, \dots, G_m$, then the (list) chromatic number of $\bigcup_{i=1}^m G_i$ is at most $D + o(D)$. This result confirms a special case of a conjecture of Vu and generalizes Kahn's bound on the list chromatic index of linear uniform hypergraphs of bounded maximum degree. In fact, this result holds for the correspondence (or DP) chromatic number and thus implies a recent result of Molloy, and we derive this result from a more general list coloring result in the setting of `color degrees' that also implies a result of Reed and Sudakov.

Submitted 28 October, 2023; v1 submitted 23 September, 2021; originally announced September 2021.

Comments: 16 pages with one-page appendix; final version, to appear in Combinatorics, Probability, and Computing

arXiv:2106.13733 [pdf, other]

Graph and hypergraph colouring via nibble methods: A survey

Authors: Dong Yeap Kang, Tom Kelly, Daniela Kühn, Abhishek Methuku, Deryk Osthus

Abstract: This paper provides a survey of methods, results, and open problems on graph and hypergraph colourings, with a particular emphasis on semi-random `nibble' methods. We also give a detailed sketch of some aspects of the recent proof of the Erdős-Faber-Lovász conjecture.

Submitted 16 November, 2021; v1 submitted 25 June, 2021; originally announced June 2021.

Comments: Final version, to appear in the proceedings of the 8th European Congress of Mathematics; 33 pages, 3 figures

arXiv:1901.03677 [pdf, other]

doi 10.1371/journal.pcbi.1007165

Estimating influenza incidence using search query deceptiveness and generalized ridge regression

Authors: Reid Priedhorsky, Ashlynn R. Daughton, Martha Barnard, Fiona O'Connell, Dave Osthus

Abstract: Seasonal influenza is a sometimes surprisingly impactful disease, causing thousands of deaths per year along with much additional morbidity. Timely knowledge of the outbreak state is valuable for managing an effective response. The current state of the art is to gather this knowledge using in-person patient contact. While accurate, this is time-consuming and expensive. This has motivated inquiry into new approaches using internet activity traces, based on the theory that lay observations of health status lead to informative features in internet data. These approaches risk being deceived by activity traces having a coincidental, rather than informative, relationship to disease incidence; to our knowledge, this risk has not yet been quantitatively explored. We evaluated both simulated and real activity traces of varying deceptiveness for influenza incidence estimation using linear regression. We found that deceptiveness knowledge does reduce error in such estimates, that it may help automatically-selected features perform as well as or better than features that require human curation, and that a semantic distance measure derived from the Wikipedia article category tree serves as a useful proxy for deceptiveness. This suggests that disease incidence estimation models should incorporate not only data about how internet features map to incidence but also additional data to estimate feature deceptiveness. By doing so, we may gain one more step along the path to accurate, reliable disease incidence estimation using internet data. This capability would improve public health by decreasing the cost and increasing the timeliness of such estimates.

Submitted 11 January, 2019; originally announced January 2019.

Comments: 27 pages, 8 figures

Report number: LA-UR 18-24467

arXiv:1711.06241 [pdf, other]

Deceptiveness of internet data for disease surveillance

Authors: Reid Priedhorsky, Dave Osthus, Ashlynn R. Daughton, Kelly R. Moran, Aron Culotta

Abstract: Quantifying how many people are or will be sick, and where, is a critical ingredient in reducing the burden of disease because it helps the public health system plan and implement effective outbreak response. This process of disease surveillance is currently based on data gathering using clinical and laboratory methods; this distributed human contact and resulting bureaucratic data aggregation yield expensive procedures that lag real time by weeks or months. The promise of new surveillance approaches using internet data, such as web event logs or social media messages, is to achieve the same goal but faster and cheaper. However, prior work in this area lacks a rigorous model of information flow, making it difficult to assess the reliability of both specific approaches and the body of work as a whole. We model disease surveillance as a Shannon communication. This new framework lets any two disease surveillance approaches be compared using a unified vocabulary and conceptual model. Using it, we describe and compare the deficiencies suffered by traditional and internet-based surveillance, introduce a new risk metric called deceptiveness, and offer mitigations for some of these deficiencies. This framework also makes the rich tools of information theory applicable to disease surveillance. This better understanding will improve the decision-making of public health practitioners by helping to leverage internet-based surveillance in a way complementary to the strengths of traditional surveillance.

Submitted 31 July, 2018; v1 submitted 16 November, 2017; originally announced November 2017.

Comments: 26 pages, 6 figures

Report number: LA-UR 17-24564

ACM Class: H.1.1; J.3; H.2.8; H.3.5

arXiv:1707.03303 [pdf, ps, other]

A characterization of testable hypergraph properties

Authors: Felix Joos, Jaehoon Kim, Daniela Kühn, Deryk Osthus

Abstract: We provide a combinatorial characterization of all testable properties of $k$-uniform hypergraphs ($k$-graphs for short). Here, a $k$-graph property $P$ is testable if there is a randomized algorithm which makes a bounded number of edge queries and distinguishes with probability $2/3$ between $k$-graphs that satisfy $P$ and those that are far from satisfying $P$. For the $2$-graph case, such a combinatorial characterization was obtained by Alon, Fischer, Newman and Shapira. Our results for the $k$-graph setting are in contrast to those of Austin and Tao, who showed that for the somewhat stronger concept of local repairability, the testability results for graphs do not extend to the $3$-graph setting. Our proof relies on a random subhypergraph sampling result proved in a companion paper.

Submitted 5 October, 2021; v1 submitted 11 July, 2017; originally announced July 2017.

Comments: 39 pages; we split the paper into two parts; the second part is arXiv:2110.01570

arXiv:1401.4931 [pdf, ps, other]

A domination algorithm for $\{0,1\}$-instances of the travelling salesman problem

Authors: Daniela Kühn, Deryk Osthus, Viresh Patel

Abstract: We present an approximation algorithm for $\{0,1\}$-instances of the travelling salesman problem which performs well with respect to combinatorial dominance. More precisely, we give a polynomial-time algorithm which has domination ratio $1-n^{-1/29}$. In other words, given a $\{0,1\}$-edge-weighting of the complete graph $K_n$ on $n$ vertices, our algorithm outputs a Hamilton cycle $H^*$ of $K_n$ with the following property: the proportion of Hamilton cycles of $K_n$ whose weight is smaller than that of $H^*$ is at most $n^{-1/29}$. Our analysis is based on a martingale approach. Previously, the best result in this direction was a polynomial-time algorithm with domination ratio $1/2-o(1)$ for arbitrary edge-weights. We also prove a hardness result showing that, if the Exponential Time Hypothesis holds, there exists a constant $C$ such that $n^{-1/29}$ cannot be replaced by $\exp(-(\log n)^C)$ in the result above.

Submitted 26 May, 2015; v1 submitted 20 January, 2014; originally announced January 2014.

Comments: 29 pages (final version to appear in Random Structures and Algorithms)
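
Illustration (not from the paper): the abstract above measures a Hamilton cycle $H^*$ by the proportion of Hamilton cycles of $K_n$ whose weight is smaller than that of $H^*$. The sketch below computes that proportion by brute force on a tiny instance. It only illustrates the definition and is not the authors' polynomial-time algorithm; the function names and the edge-weight dictionary layout are assumptions made for this example.

```python
from itertools import permutations


def hamilton_cycles(n):
    """Yield each undirected Hamilton cycle of K_n (n >= 3) exactly once."""
    for perm in permutations(range(1, n)):
        # Fix vertex 0 as the start and keep one of the two orientations.
        if perm[0] < perm[-1]:
            yield (0,) + perm


def cycle_weight(cycle, weight):
    """Sum the weights of the edges of a cycle given as a vertex tuple."""
    n = len(cycle)
    return sum(
        weight[frozenset((cycle[i], cycle[(i + 1) % n]))] for i in range(n)
    )


def fraction_strictly_better(cycle, weight):
    """Proportion of Hamilton cycles whose weight is smaller than that of `cycle`."""
    target = cycle_weight(cycle, weight)
    weights = [cycle_weight(c, weight) for c in hamilton_cycles(len(cycle))]
    return sum(w < target for w in weights) / len(weights)


# Hypothetical {0,1}-instance on K_5: the pentagon edges cost 0, all others cost 1.
n = 5
pentagon = {frozenset((i, (i + 1) % n)) for i in range(n)}
weight = {
    frozenset((i, j)): 0 if frozenset((i, j)) in pentagon else 1
    for i in range(n)
    for j in range(i + 1, n)
}

print(fraction_strictly_better((0, 1, 2, 3, 4), weight))  # the pentagon itself
print(fraction_strictly_better((0, 2, 4, 1, 3), weight))  # the pentagram
```

Enumerating all $(n-1)!/2$ Hamilton cycles is feasible only for very small $n$; the sketch is meant to make the quantity bounded by $n^{-1/29}$ concrete, nothing more.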

arXiv:1202.6219 [pdf, other]

Hamilton decompositions of regular expanders: a proof of Kelly's conjecture for large tournaments

Authors: Daniela Kühn, Deryk Osthus

Abstract: A long-standing conjecture of Kelly states that every regular tournament on n vertices can be decomposed into (n-1)/2 edge-disjoint Hamilton cycles. We prove this conjecture for large n. In fact, we prove a far more general result, based on our recent concept of robust expansion and a new method for decomposing graphs. We show that every sufficiently large regular digraph G on n vertices whose degree is linear in n and which is a robust outexpander has a decomposition into edge-disjoint Hamilton cycles. This enables us to obtain numerous further results, e.g. as a special case we confirm a conjecture of Erdős on packing Hamilton cycles in random tournaments. As corollaries to the main result, we also obtain several results on packing Hamilton cycles in undirected graphs, giving e.g. the best known result on a conjecture of Nash-Williams. We also apply our result to solve a problem on the domination ratio of the Asymmetric Travelling Salesman problem, which was raised e.g. by Glover and Punnen as well as Alon, Gutin and Krivelevich.

Submitted 10 May, 2013; v1 submitted 28 February, 2012; originally announced February 2012.

Comments: new version includes a standalone version of the `robust decomposition lemma' for application in subsequent papers

MSC Class: 05C45; 05C70; 05C85; 05C35; 05C38; 05C20

Journal ref: Advances in Mathematics 237 (2013), 62-146

Showing 1–8 of 8 results for author: Osthus, D