Discovering modular solutions that generalize compositionally

Authors: Simon Schug, Seijin Kobayashi, Yassir Akram, Maciej Wołczyk, Alexandra Proca, Johannes von Oswald, Razvan Pascanu, João Sacramento, Angelika Steger

Abstract: Many complex tasks can be decomposed into simpler, independent parts. Discovering such underlying compositional structure has the potential to enable compositional generalization. Despite progress, our most powerful systems struggle to compose flexibly. It therefore seems natural to make models more modular to help capture the compositional nature of many tasks. However, it is unclear under which circumstances modular systems can discover hidden compositional structure. To shed light on this question, we study a teacher-student setting with a modular teacher where we have full control over the composition of ground truth modules. This allows us to relate the problem of compositional generalization to that of identification of the underlying modules. In particular we study modularity in hypernetworks representing a general class of multiplicative interactions. We show theoretically that identification up to linear transformation purely from demonstrations is possible without having to learn an exponential number of module combinations. We further demonstrate empirically that under the theoretically identified conditions, meta-learning from finite data can discover modular policies that generalize compositionally in a number of complex environments.

Submitted 25 March, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: Published as a conference paper at ICLR 2024; Code available at https://github.com/smonsays/modular-hyperteacher

arXiv:2309.01775 [pdf, other]

Gated recurrent neural networks discover attention

Authors: Nicolas Zucchet, Seijin Kobayashi, Yassir Akram, Johannes von Oswald, Maxime Larcher, Angelika Steger, João Sacramento

Abstract: Recent architectural developments have enabled recurrent neural networks (RNNs) to reach and even surpass the performance of Transformers on certain sequence modeling tasks. These modern RNNs feature a prominent design pattern: linear recurrent layers interconnected by feedforward paths with multiplicative gating. Here, we show how RNNs equipped with these two design elements can exactly implement (linear) self-attention, the main building block of Transformers. By reverse-engineering a set of trained RNNs, we find that gradient descent in practice discovers our construction. In particular, we examine RNNs trained to solve simple in-context learning tasks on which Transformers are known to excel and find that gradient descent instills in our RNNs the same attention-based in-context learning algorithm used by Transformers. Our findings highlight the importance of multiplicative interactions in neural networks and suggest that certain RNNs might be unexpectedly implementing attention under the hood.

Submitted 7 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

arXiv:2209.07509 [pdf, other]

Random initialisations performing above chance and how to find them

Authors: Frederik Benzing, Simon Schug, Robert Meier, Johannes von Oswald, Yassir Akram, Nicolas Zucchet, Laurence Aitchison, Angelika Steger

Abstract: Neural networks trained with stochastic gradient descent (SGD) starting from different random initialisations typically find functionally very similar solutions, raising the question of whether there are meaningful differences between different SGD solutions. Entezari et al. recently conjectured that despite different initialisations, the solutions found by SGD lie in the same loss valley after taking into account the permutation invariance of neural networks. Concretely, they hypothesise that any two solutions found by SGD can be permuted such that the linear interpolation between their parameters forms a path without significant increases in loss. Here, we use a simple but powerful algorithm to find such permutations that allows us to obtain direct empirical evidence that the hypothesis is true in fully connected networks. Strikingly, we find that two networks already live in the same loss valley at the time of initialisation and averaging their random, but suitably permuted initialisation performs significantly above chance. In contrast, for convolutional architectures, our evidence suggests that the hypothesis does not hold. Especially in a large learning rate regime, SGD seems to discover diverse modes.

Submitted 7 November, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

Comments: NeurIPS 2022, 14th Annual Workshop on Optimization for Machine Learning (OPT2022)

arXiv:2012.02551 [pdf, other]

An O(n) time algorithm for finding Hamilton cycles with high probability

Authors: Rajko Nenadov, Angelika Steger, Pascal Su

Abstract: We design a randomized algorithm that finds a Hamilton cycle in $\mathcal{O}(n)$ time with high probability in a random graph $G_{n,p}$ with edge probability $p\ge C \log n / n$. This closes a gap left open in a seminal paper by Angluin and Valiant from 1979.

Submitted 4 December, 2020; originally announced December 2020.

arXiv:2002.05121 [pdf, ps, other]

An Optimal Decentralized $(Δ+ 1)$-Coloring Algorithm

Authors: Daniel Bertschinger, Johannes Lengler, Anders Martinsson, Robert Meier, Angelika Steger, Miloš Trujić, Emo Welzl

Abstract: Consider the following simple coloring algorithm for a graph on $n$ vertices. Each vertex chooses a color from $\{1, \dotsc, Δ(G) + 1\}$ uniformly at random. While there exists a conflicted vertex, choose one such vertex uniformly at random and recolor it with a randomly chosen color. This algorithm was introduced by Bhartia et al. [MOBIHOC'16] for channel selection in WIFI-networks. We show that this algorithm always converges to a proper coloring in expected $O(n \log Δ)$ steps, which is optimal and proves a conjecture of Chakrabarty and Supinski [SOSA'20].

Submitted 3 May, 2021; v1 submitted 12 February, 2020; originally announced February 2020.
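
The recoloring process described in the abstract above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function name `random_recoloring`, the adjacency-dictionary input, and the `seed` parameter are assumptions, and the sketch assumes the recolored vertex draws its new color uniformly from the same palette $\{1, \dotsc, Δ(G) + 1\}$. Recomputing the conflicted set in every step keeps the code short but is not efficient.

```python
import random

def random_recoloring(adj, seed=0):
    """Run the random recoloring process on an undirected graph.

    adj maps each vertex to the set of its neighbours (hypothetical input
    format). Returns (coloring, number_of_recoloring_steps).
    """
    rng = random.Random(seed)
    max_deg = max((len(nb) for nb in adj.values()), default=0)
    palette = range(1, max_deg + 2)  # colours 1 .. Δ(G) + 1

    # Every vertex starts with a uniformly random colour.
    color = {v: rng.choice(palette) for v in adj}
    steps = 0
    while True:
        # A vertex is conflicted if it shares its colour with a neighbour.
        conflicted = [v for v in adj
                      if any(color[u] == color[v] for u in adj[v])]
        if not conflicted:
            return color, steps
        # Recolour one uniformly chosen conflicted vertex.
        # Assumption: the new colour is uniform over the whole palette.
        v = rng.choice(conflicted)
        color[v] = rng.choice(palette)
        steps += 1

# Usage: a cycle on 5 vertices has Δ = 2, so the palette is {1, 2, 3}.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring, steps = random_recoloring(cycle, seed=1)
```

The loop only returns once no vertex is conflicted, so any returned coloring is proper; the abstract's result concerns how many recoloring steps this takes in expectation.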

arXiv:1910.05268 [pdf, other]

Improving Gradient Estimation in Evolutionary Strategies With Past Descent Directions

Authors: Florian Meier, Asier Mujika, Marcelo Matheus Gauy, Angelika Steger

Abstract: Evolutionary Strategies (ES) are known to be an effective black-box optimization technique for deep neural networks when the true gradients cannot be computed, such as in Reinforcement Learning. We continue a recent line of research that uses surrogate gradients to improve the gradient estimation of ES. We propose a novel method to optimally incorporate surrogate gradient information. Our approach, unlike previous work, needs no information about the quality of the surrogate gradients and is always guaranteed to find a descent direction that is better than the surrogate gradient. This allows the previous gradient estimate to be used iteratively as the surrogate gradient for the current search point. We theoretically prove that this yields fast convergence to the true gradient for linear functions and show under simplifying assumptions that it significantly improves gradient estimates for general functions. Finally, we evaluate our approach empirically on MNIST and reinforcement learning tasks and show that it considerably improves the gradient estimation of ES at no extra computational cost.

Submitted 11 October, 2019; originally announced October 2019.

arXiv:1910.05245 [pdf, other]

Decoupling Hierarchical Recurrent Neural Networks With Locally Computable Losses

Authors: Asier Mujika, Felix Weissenberger, Angelika Steger

Abstract: Learning long-term dependencies is a key long-standing challenge of recurrent neural networks (RNNs). Hierarchical recurrent neural networks (HRNNs) have been considered a promising approach as long-term dependencies are resolved through shortcuts up and down the hierarchy. Yet, the memory requirements of Truncated Backpropagation Through Time (TBPTT) still prevent training them on very long sequences. In this paper, we empirically show that in (deep) HRNNs, propagating gradients back from higher to lower levels can be replaced by locally computable losses, without harming the learning capability of the network, over a wide range of tasks. This decoupling by local losses reduces the memory requirements of training by a factor exponential in the depth of the hierarchy in comparison to standard TBPTT.

Submitted 11 October, 2019; originally announced October 2019.

arXiv:1902.03993 [pdf, other]

Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning

Authors: Frederik Benzing, Marcelo Matheus Gauy, Asier Mujika, Anders Martinsson, Angelika Steger

Abstract: One of the central goals of Recurrent Neural Networks (RNNs) is to learn long-term dependencies in sequential data. Nevertheless, the most popular training method, Truncated Backpropagation through Time (TBPTT), categorically forbids learning dependencies beyond the truncation horizon. In contrast, the online training algorithm Real Time Recurrent Learning (RTRL) provides untruncated gradients, with the disadvantage of impractically large computational costs. Recently published approaches reduce these costs by providing noisy approximations of RTRL. We present a new approximation algorithm of RTRL, Optimal Kronecker-Sum Approximation (OK). We prove that OK is optimal for a class of approximations of RTRL, which includes all approaches published so far. Additionally, we show that OK has empirically negligible noise: Unlike previous algorithms it matches TBPTT in a real world task (character-level Penn TreeBank) and can exploit online parameter updates to outperform TBPTT in a synthetic string memorization task. Code available on GitHub.

Submitted 17 May, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

Comments: ICML 2019 camera ready version; new version includes additional plots in the appendix

arXiv:1808.05566 [pdf, ps, other]

The linear hidden subset problem for the (1+1) EA with scheduled and adaptive mutation rates

Authors: Hafsteinn Einarsson, Marcelo Matheus Gauy, Johannes Lengler, Florian Meier, Asier Mujika, Angelika Steger, Felix Weissenberger

Abstract: We study unbiased $(1+1)$ evolutionary algorithms on linear functions with an unknown number $n$ of bits with non-zero weight. Static algorithms achieve an optimal runtime of $O(n (\ln n)^{2+ε})$; however, it remained unclear whether more dynamic parameter policies could yield better runtime guarantees. We consider two setups: one where the mutation rate follows a fixed schedule, and one where it may be adapted depending on the history of the run. For the first setup, we give a schedule that achieves a runtime of $(1\pm o(1))βn \ln n$, where $β\approx 3.552$, which is an asymptotic improvement over the runtime of the static setup. Moreover, we show that no schedule admits a better runtime guarantee and that the optimal schedule is essentially unique. For the second setup, we show that the runtime can be further improved to $(1\pm o(1)) e n \ln n$, which matches the performance of algorithms that know $n$ in advance. Finally, we study the related model of initial segment uncertainty with static position-dependent mutation rates, and derive asymptotically optimal lower bounds. This answers a question by Doerr, Doerr, and Kötzing.

Submitted 16 August, 2018; originally announced August 2018.

arXiv:1808.01137 [pdf, ps, other]

When Does Hillclimbing Fail on Monotone Functions: An entropy compression argument

Authors: Johannes Lengler, Anders Martinsson, Angelika Steger

Abstract: Hillclimbing is an essential part of any optimization algorithm. An important benchmark for hillclimbing algorithms on pseudo-Boolean functions $f: \{0,1\}^n \to \mathbb{R}$ are (strictly) monotone functions, on which a surprising number of hillclimbers fail to be efficient. For example, the $(1+1)$-Evolutionary Algorithm is a standard hillclimber which flips each bit independently with probability $c/n$ in each round. Perhaps surprisingly, this algorithm shows a phase transition: it optimizes any monotone pseudo-Boolean function in quasilinear time if $c<1$, but there are monotone functions for which the algorithm needs exponential time if $c>2.2$. But so far it was unclear whether the threshold is at $c=1$. In this paper we show how Moser's entropy compression argument can be adapted to this situation, that is, we show that a long runtime would allow us to encode the random steps of the algorithm with fewer bits than their entropy. Thus there exists a $c_0 > 1$ such that for all $0<c\le c_0$ the $(1+1)$-Evolutionary Algorithm with rate $c/n$ finds the optimum in $O(n \log^2 n)$ steps in expectation.

Submitted 3 August, 2018; originally announced August 2018.

Comments: 14 pages, no figures

MSC Class: 68W40; 68W20; 60J10

arXiv:1805.10842 [pdf, other]

Approximating Real-Time Recurrent Learning with Random Kronecker Factors

Authors: Asier Mujika, Florian Meier, Angelika Steger

Abstract: Despite all the impressive advances of recurrent neural networks, sequential data is still in need of better modelling. Truncated backpropagation through time (TBPTT), the learning algorithm most widely used in practice, suffers from the truncation bias, which drastically limits its ability to learn long-term dependencies. The Real-Time Recurrent Learning algorithm (RTRL) addresses this issue, but its high computational requirements make it infeasible in practice. The Unbiased Online Recurrent Optimization algorithm (UORO) approximates RTRL with a smaller runtime and memory cost, but with the disadvantage of obtaining noisy gradients that also limit its practical applicability. In this paper we propose the Kronecker Factored RTRL (KF-RTRL) algorithm that uses a Kronecker product decomposition to approximate the gradients for a large class of RNNs. We show that KF-RTRL is an unbiased and memory efficient online learning algorithm. Our theoretical analysis shows that, under reasonable assumptions, the noise introduced by our algorithm is not only stable over time but also asymptotically much smaller than the one of the UORO algorithm. We also confirm these theoretical results experimentally. Further, we show empirically that the KF-RTRL algorithm captures long-term dependencies and almost matches the performance of TBPTT on real world tasks by training Recurrent Highway Networks on a synthetic string memorization task and on the Penn TreeBank task, respectively. These results indicate that RTRL based approaches might be a promising future alternative to TBPTT.

Submitted 5 December, 2018; v1 submitted 28 May, 2018; originally announced May 2018.

arXiv:1801.07193 [pdf, ps, other]

Even flying cops should think ahead

Authors: Anders Martinsson, Florian Meier, Patrick Schnider, Angelika Steger

Abstract: We study the entanglement game, which is a version of cops and robbers, on sparse graphs. While the minimum degree of a graph G is a lower bound for the number of cops needed to catch a robber in G, we show that the required number of cops can be much larger, even for graphs with small maximum degree. In particular, we show that there are 3-regular graphs where a linear number of cops are needed.

Submitted 22 January, 2018; originally announced January 2018.

arXiv:1705.08639 [pdf, ps, other]

Fast-Slow Recurrent Neural Networks

Authors: Asier Mujika, Florian Meier, Angelika Steger

Abstract: Processing sequential data of variable length is a major challenge in a wide range of applications, such as speech recognition, language modeling, generative image modeling and machine translation. Here, we address this challenge by proposing a novel recurrent neural network (RNN) architecture, the Fast-Slow RNN (FS-RNN). The FS-RNN incorporates the strengths of both multiscale RNNs and deep transition RNNs as it processes sequential data on different timescales and learns complex transition functions from one time step to the next. We evaluate the FS-RNN on two character level language modeling data sets, Penn Treebank and Hutter Prize Wikipedia, where we improve state of the art results to $1.19$ and $1.25$ bits-per-character (BPC), respectively. In addition, an ensemble of two FS-RNNs achieves $1.20$ BPC on Hutter Prize Wikipedia outperforming the best known compression algorithm with respect to the BPC measure. We also present an empirical investigation of the learning and network dynamics of the FS-RNN, which explains the improved performance compared to other RNN architectures. Our approach is general as any kind of RNN cell is a possible building block for the FS-RNN architecture, and thus can be flexibly applied to different tasks.

Submitted 9 June, 2017; v1 submitted 24 May, 2017; originally announced May 2017.

Comments: Corrected minor typos in Figure 1 and Zoneout citation

arXiv:1610.01753 [pdf, ps, other]

A general lower bound for collaborative tree exploration

Authors: Yann Disser, Frank Mousset, Andreas Noever, Nemanja Škorić, Angelika Steger

Abstract: We consider collaborative graph exploration with a set of $k$ agents. All agents start at a common vertex of an initially unknown graph and need to collectively visit all other vertices. We assume agents are deterministic, vertices are distinguishable, moves are simultaneous, and we allow agents to communicate globally. For this setting, we give the first non-trivial lower bounds that bridge the gap between small ($k \leq \sqrt n$) and large ($k \geq n$) teams of agents. Remarkably, our bounds tightly connect to existing results in both domains. First, we significantly extend a lower bound of $Ω(\log k / \log\log k)$ by Dynia et al. on the competitive ratio of a collaborative tree exploration strategy to the range $k \leq n \log^c n$ for any $c \in \mathbb{N}$. Second, we provide a tight lower bound on the number of agents needed for any competitive exploration algorithm. In particular, we show that any collaborative tree exploration algorithm with $k = Dn^{1+o(1)}$ agents has a competitive ratio of $ω(1)$, while Dereniowski et al. gave an algorithm with $k = Dn^{1+\varepsilon}$ agents and competitive ratio $O(1)$, for any $\varepsilon > 0$ and with $D$ denoting the diameter of the graph. Lastly, we show that, for any exploration algorithm using $k = n$ agents, there exist trees of arbitrarily large height $D$ that require $Ω(D^2)$ rounds, and we provide a simple algorithm that matches this bound for all trees.

Submitted 6 October, 2016; originally announced October 2016.

arXiv:1608.06451 [pdf, other]

Failure Detection for Facial Landmark Detectors

Authors: Andreas Steger, Radu Timofte, Luc Van Gool

Abstract: Most face applications depend heavily on the accuracy of the face and facial landmark detectors employed. Prediction of attributes such as gender, age, and identity usually fails completely when the faces are badly aligned due to inaccurate facial landmark detection. Despite the impressive recent advances in face and facial landmark detection, little work has addressed the detection of, and recovery from, failures or inaccurate predictions. In this work we study two top recent facial landmark detectors and devise confidence models for their outputs. We validate our failure detection approaches on standard benchmarks (AFLW, HELEN) and correctly identify more than 40% of the failures in the outputs of the landmark detectors. Moreover, with our failure detection we can achieve a 12% error reduction on a gender estimation application at the cost of a small increase in computation.

Submitted 23 August, 2016; originally announced August 2016.

arXiv:1608.03226 [pdf, ps, other]

Drift Analysis and Evolutionary Algorithms Revisited

Authors: Johannes Lengler, Angelika Steger

Abstract: One of the easiest randomized greedy optimization algorithms is the following evolutionary algorithm which aims at maximizing a boolean function $f:\{0,1\}^n \to {\mathbb R}$. The algorithm starts with a random search point $ξ\in \{0,1\}^n$, and in each round it flips each bit of $ξ$ with probability $c/n$ independently at random, where $c>0$ is a fixed constant. The thus created offspring $ξ'$ replaces $ξ$ if and only if $f(ξ') \ge f(ξ)$. The analysis of the runtime of this simple algorithm on monotone and on linear functions turned out to be highly non-trivial. In this paper we review known results and provide new and self-contained proofs of partly stronger results.

Submitted 15 November, 2017; v1 submitted 10 August, 2016; originally announced August 2016.

Comments: minor changes to improve readability

MSC Class: 60G40; 60J10; 68W20; 68W40

ACM Class: G.3
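
The evolutionary algorithm described in the abstract above is short enough to write out. The following is a minimal sketch, not code from the paper: the function name `one_plus_one_ea` and the `max_rounds`, `target`, and `seed` parameters are assumptions added so that the loop can stop. The abstract itself specifies only the random start, the per-bit flip probability $c/n$, and the acceptance rule $f(ξ') \ge f(ξ)$.

```python
import random

def one_plus_one_ea(f, n, c=1.0, max_rounds=100_000, target=None, seed=0):
    """Maximise f over bit strings of length n with the (1+1) EA.

    Returns (best_point, rounds_used). `target` is an optional fitness value
    at which to stop early (an assumption; the abstract gives no stopping rule).
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]  # random initial search point
    fx = f(x)
    for t in range(max_rounds):
        if target is not None and fx >= target:
            return x, t
        # Offspring: flip each bit independently with probability c/n.
        y = [b ^ 1 if rng.random() < c / n else b for b in x]
        fy = f(y)
        # The offspring replaces the parent iff it is at least as good.
        if fy >= fx:
            x, fx = y, fy
    return x, max_rounds

# Usage on OneMax (the number of one-bits), a simple linear function.
best, rounds = one_plus_one_ea(sum, n=20, target=20)
```

Accepting offspring of equal fitness (`>=` rather than `>`) follows the acceptance rule as stated in the abstract.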

arXiv:1607.05212 [pdf, other]

Polynomial Lower Bound for Distributed Graph Coloring in a Weak LOCAL Model

Authors: Dan Hefetz, Fabian Kuhn, Yannic Maus, Angelika Steger

Abstract: We show an $Ω\big(Δ^{\frac{1}{3}-\frac{η}{3}}\big)$ lower bound on the runtime of any deterministic distributed $\mathcal{O}\big(Δ^{1+η}\big)$-graph coloring algorithm in a weak variant of the LOCAL model. In particular, given a network graph $G=(V,E)$, in the weak LOCAL model nodes communicate in synchronous rounds and they can use unbounded local computation. We assume that the nodes have no identifiers, but that instead, the computation starts with an initial valid vertex coloring. A node can broadcast a single message of unbounded size to its neighbors and receives the set of messages sent to it by its neighbors. That is, if two neighbors of a node $v\in V$ send the same message to $v$, $v$ will receive this message only a single time; without any further knowledge, $v$ cannot know whether a received message was sent by only one or more than one neighbor. Neighborhood graphs have been essential in the proof of lower bounds for distributed coloring algorithms, e.g., [linial92, Kuhn2006On]. Our proof analyzes the recursive structure of the neighborhood graph of the respective model to devise an $Ω\big(Δ^{\frac{1}{3}-\frac{η}{3}}\big)$ lower bound on the runtime for any deterministic distributed $\mathcal{O}\big(Δ^{1+η}\big)$-graph coloring algorithm. Furthermore, we hope that the proof technique improves the understanding of neighborhood graphs in general and that it will help towards finding a lower (runtime) bound for distributed graph coloring in the standard LOCAL model. Our proof technique works for one-round algorithms in the standard LOCAL model and provides a simpler and more intuitive proof for an existing $Ω(Δ^2)$ lower bound.

Submitted 14 September, 2016; v1 submitted 18 July, 2016; originally announced July 2016.

arXiv:1605.03043 [pdf, ps, other]

Unique reconstruction threshold for random jigsaw puzzles

Authors: Rajko Nenadov, Pascal Pfister, Angelika Steger

Abstract: A random jigsaw puzzle is constructed by arranging $n^2$ square pieces into an $n \times n$ grid and assigning to each edge of a piece one of $q$ available colours uniformly at random, with the restriction that touching edges receive the same colour. We show that if $q = o(n)$ then with high probability such a puzzle does not have a unique solution, while if $q \ge n^{1 + \varepsilon}$ for any constant $\varepsilon > 0$ then the solution is unique. This solves a conjecture of Mossel and Ross (Shotgun assembly of labeled graphs, arXiv:1504.07682).

Submitted 11 May, 2016; v1 submitted 10 May, 2016; originally announced May 2016.

arXiv:1104.1309 [pdf, other]

Explosive Percolation in Erdös-Rényi-Like Random Graph Processes

Authors: Konstantinos Panagiotou, Reto Spöhel, Angelika Steger, Henning Thomas

Abstract: The evolution of the largest component has been studied intensely in a variety of random graph processes, starting in 1960 with the Erdös-Rényi process. It is well known that this process undergoes a phase transition at n/2 edges when, asymptotically almost surely, a linear-sized component appears. Moreover, this phase transition is continuous, i.e., in the limit the function f(c) denoting the fraction of vertices in the largest component in the process after cn edge insertions is continuous. A variation of the Erdös-Rényi process are the so-called Achlioptas processes in which in every step a random pair of edges is drawn, and a fixed edge-selection rule selects one of them to be included in the graph while the other is put back. Recently, Achlioptas, D'Souza and Spencer (2009) gave strong numerical evidence that a variety of edge-selection rules exhibit a discontinuous phase transition. However, Riordan and Warnke (2011) very recently showed that all Achlioptas processes have a continuous phase transition. In this work we prove discontinuous phase transitions for a class of Erdös-Rényi-like processes in which in every step we connect two vertices, one chosen randomly from all vertices, and one chosen randomly from a restricted set of vertices.

Submitted 7 April, 2011; originally announced April 2011.

arXiv:1006.1231 [pdf, ps, other]

On the Insertion Time of Cuckoo Hashing

Authors: Nikolaos Fountoulakis, Konstantinos Panagiotou, Angelika Steger

Abstract: Cuckoo hashing is an efficient technique for creating large hash tables with high space utilization and guaranteed constant access times. There, each item can be placed in a location given by any one out of k different hash functions. In this paper we investigate further the random walk heuristic for inserting in an online fashion new items into the hash table. Provided that k > 2 and that the number of items in the table is below (but arbitrarily close to) the theoretically achievable load threshold, we show a polylogarithmic bound for the maximum insertion time that holds with high probability.

Submitted 10 October, 2013; v1 submitted 7 June, 2010; originally announced June 2010.

Comments: 27 pages, final version accepted by the SIAM Journal on Computing

ACM Class: E.2; G.2.2
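
For readers unfamiliar with the random walk heuristic mentioned in the abstract above, here is a minimal sketch of one common formulation. It is an illustration under stated assumptions, not the paper's exact model: the class name `CuckooTable`, the salted use of Python's built-in `hash` as a stand-in for k independent hash functions, the `max_steps` cap, and the restriction to distinct integer keys are all hypothetical choices.

```python
import random

class CuckooTable:
    """Cuckoo hash table with k candidate slots per item (sketch)."""

    def __init__(self, size, k=3, seed=0):
        self.size = size
        self.slots = [None] * size
        self.rng = random.Random(seed)
        # Assumption: k salted hashes stand in for k independent hash functions.
        self.salts = [self.rng.getrandbits(64) for _ in range(k)]

    def _locations(self, item):
        return [hash((salt, item)) % self.size for salt in self.salts]

    def lookup(self, item):
        # Constant access time: at most k slots are inspected.
        return any(self.slots[i] == item for i in self._locations(item))

    def insert(self, item, max_steps=1000):
        """Insert a new item; returns the number of evictions performed."""
        for step in range(max_steps):
            locs = self._locations(item)
            for i in locs:
                if self.slots[i] is None:
                    self.slots[i] = item
                    return step
            # All k candidate slots are full: evict the occupant of a random
            # one and continue the walk with the evicted item.
            i = self.rng.choice(locs)
            item, self.slots[i] = self.slots[i], item
        raise RuntimeError("insertion did not finish within max_steps")

# Usage: fill a table of 100 slots to half its capacity.
table = CuckooTable(size=100, k=3)
evictions = [table.insert(key) for key in range(50)]
```

The value returned by `insert` counts the evictions of a single insertion, which is the quantity the abstract's polylogarithmic bound concerns. The cap `max_steps` is only a safeguard for the sketch.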

Showing 1–20 of 20 results for author: Steger, A