Search | arXiv e-print repository

The NFA Acceptance Hypothesis: Non-Combinatorial and Dynamic Lower Bounds

Authors: Karl Bringmann, Allan Grønlund, Marvin Künnemann, Kasper Green Larsen

Abstract: We pose the fine-grained hardness hypothesis that the textbook algorithm for the NFA Acceptance problem is optimal up to subpolynomial factors, even for dense NFAs and fixed alphabets. We show that this barrier appears in many variations throughout the algorithmic literature by introducing a framework of Colored Walk problems. These yield fine-grained equivalent formulations of the NFA Acceptance problem as problems concerning detection of an $s$-$t$-walk with a prescribed color sequence in a given edge- or node-colored graph. For NFA Acceptance on sparse NFAs (or equivalently, Colored Walk in sparse graphs), a tight lower bound under the Strong Exponential Time Hypothesis has been rediscovered several times in recent years. We show that our hardness hypothesis, which concerns dense NFAs, has several interesting implications:
- It gives a tight lower bound for Context-Free Language Reachability. This proves conditional optimality for the class of 2NPDA-complete problems, explaining the cubic bottleneck of interprocedural program analysis.
- It gives a tight $(n+nm^{1/3})^{1-o(1)}$ lower bound for the Word Break problem on strings of length $n$ and dictionaries of total size $m$.
- It implies the popular OMv hypothesis. Since the NFA Acceptance problem is a static (i.e., non-dynamic) problem, this provides a static reason for the hardness of many dynamic problems.
Thus, a proof of the NFA Acceptance hypothesis would resolve several interesting barriers. Conversely, a refutation of the NFA Acceptance hypothesis may lead the way to attacking the current barriers observed for Context-Free Language Reachability, the Word Break problem and the growing list of dynamic problems proven hard under the OMv hypothesis.

Submitted 16 November, 2023; originally announced November 2023.

Comments: 31 pages, Accepted at ITCS
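
As an illustration of the "textbook algorithm" that the NFA Acceptance abstract above refers to, the following is a minimal sketch of state-set simulation. It assumes an NFA without epsilon transitions, and the names `nfa_accepts` and `transitions` as well as the dictionary encoding are hypothetical choices made for this example, not taken from the paper. The running time is roughly the word length times the number of transitions.

```python
def nfa_accepts(transitions, start, accepting, word):
    """Simulate an NFA (no epsilon moves) by tracking the set of reachable states.

    transitions: dict mapping (state, symbol) -> set of successor states
                 (hypothetical encoding chosen for this sketch).
    """
    current = {start}
    for symbol in word:
        nxt = set()
        for state in current:
            # Follow every transition labelled with the current symbol.
            nxt |= transitions.get((state, symbol), set())
        current = nxt
        if not current:
            # No state is reachable any more, so the word cannot be accepted.
            return False
    return bool(current & accepting)


# Example NFA over {a, b} accepting exactly the words that end in "ab".
ENDS_IN_AB = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
```

For instance, `nfa_accepts(ENDS_IN_AB, 0, {2}, "aab")` follows the state sets {0}, {0, 1}, {0, 1}, {0, 2} and accepts.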

arXiv:2307.06113 [pdf, ps, other]

Sublinear Time Shortest Path in Expander Graphs

Authors: Noga Alon, Allan Grønlund, Søren Fuglede Jørgensen, Kasper Green Larsen

Abstract: Computing a shortest path between two nodes in an undirected unweighted graph is among the most basic algorithmic tasks. Breadth first search solves this problem in linear time, which is clearly also a lower bound in the worst case. However, several works have shown how to solve this problem in sublinear time in expectation when the input graph is drawn from one of several classes of random graphs. In this work, we extend these results by giving sublinear time shortest path (and short path) algorithms for expander graphs. We thus identify a natural deterministic property of a graph (that is satisfied by typical random regular graphs) which suffices for sublinear time shortest paths. The algorithms are very simple, involving only bidirectional breadth first search and short random walks. We also complement our new algorithms by near-matching lower bounds.

Submitted 31 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.
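
The expander-graph abstract above names bidirectional breadth first search as one of its two building blocks. The sketch below shows plain level-by-level bidirectional BFS only; it is not the paper's algorithm (it omits the random walks and carries no sublinear-time guarantee), and the function name and adjacency-dict encoding are assumptions made for illustration.

```python
def bidirectional_bfs(adj, s, t):
    """Return the s-t distance in an undirected unweighted graph, or None.

    adj: dict mapping each node to an iterable of its neighbours
         (hypothetical encoding chosen for this sketch).
    """
    if s == t:
        return 0
    dist_s, dist_t = {s: 0}, {t: 0}
    frontier_s, frontier_t = [s], [t]
    while frontier_s and frontier_t:
        # Always grow the smaller frontier by one full level.
        grow_s = len(frontier_s) <= len(frontier_t)
        frontier, dist, other = (
            (frontier_s, dist_s, dist_t) if grow_s else (frontier_t, dist_t, dist_s)
        )
        best = None
        nxt = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v in other:
                    # The two searches meet along the edge (u, v).
                    cand = dist[u] + 1 + other[v]
                    if best is None or cand < best:
                        best = cand
                if v not in dist:
                    dist[v] = dist[u] + 1
                    nxt.append(v)
        if best is not None:
            # Finish the whole level before answering, then report the minimum.
            return best
        if grow_s:
            frontier_s = nxt
        else:
            frontier_t = nxt
    return None


# A 6-cycle 0-1-2-3-4-5-0 plus an isolated node 9.
CYCLE = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
CYCLE[9] = []
```

Growing the smaller side first keeps the two explored regions balanced, which is the usual reason bidirectional search touches fewer nodes than a single BFS from `s`.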

arXiv:2106.07989

Compression Implies Generalization

Authors: Allan Grønlund, Mikael Høgsgaard, Lior Kamma, Kasper Green Larsen

Abstract: Explaining the surprising generalization performance of deep neural networks is an active and important line of research in theoretical machine learning. Influential work by Arora et al. (ICML'18) showed that noise stability properties of deep nets occurring in practice can be used to provably compress model representations. They then argued that the small representations of compressed networks imply good generalization performance, albeit only of the compressed nets. Extending their compression framework to yield generalization bounds for the original uncompressed networks remains elusive. Our main contribution is the establishment of a compression-based framework for proving generalization bounds. The framework is simple and powerful enough to extend the generalization bounds by Arora et al. to also hold for the original network. To demonstrate the flexibility of the framework, we also show that it allows us to give simple proofs of the strongest known generalization bounds for other popular machine learning models, namely Support Vector Machines and Boosting.

Submitted 1 July, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: There is a bug in the proof and we are working on fixing it. No replacement paper at this point.

arXiv:2105.12385 [pdf, other]

Learning to Detect Fortified Areas

Authors: Allan Grønlund, Jonas Tranberg

Abstract: High resolution data models like grid terrain models made from LiDAR data are a prerequisite for modern day Geographic Information Systems applications. Besides providing the foundation for the very accurate digital terrain models, LiDAR data is also extensively used to classify which parts of the considered surface comprise relevant elements like water, buildings and vegetation. In this paper we consider the problem of classifying which areas of a given surface are fortified by, for instance, roads, sidewalks, parking spaces, paved driveways and terraces. We consider using LiDAR data and orthophotos, combined and alone, to show how well the modern machine learning algorithms Gradient Boosted Trees and Convolutional Neural Networks are able to detect fortified areas on large real world data. The LiDAR data features that we consider in this project, in particular the intensity feature that measures the signal strength of the return, are heavily dependent on the actual LiDAR sensor that made the measurement. This is highly problematic, in particular for the generalisation capability of pattern matching algorithms, as this means that data features for test data may be very different from the data the model is trained on. We propose an algorithmic solution to this problem by designing a neural net embedding architecture that transforms data from all the different sensor systems into a new common representation that works as well as if the training data and test data originated from the same sensor. The final algorithm result has an accuracy above 96 percent, and an AUC score above 0.99.

Submitted 26 May, 2021; originally announced May 2021.

arXiv:2105.06680 [pdf]

Desperately seeking the impact of learning analytics in education at scale: Marrying data analysis with teaching and learning

Authors: Olga Viberg, Ake Gronlund

Abstract: Learning analytics (LA) is argued to be able to improve learning outcomes, learner support and teaching. However, despite an increasingly expanding amount of student (digital) data accessible from various online education and learning platforms and the growing interest in LA worldwide as well as considerable research efforts already made, there is still little empirical evidence of impact on practice that shows the effectiveness of LA in education settings. Based on a selection of theoretical and empirical research, this chapter provides a critical discussion about the possibilities of collecting and using student data as well as barriers and challenges to overcome in providing data-informed support to educators' everyday teaching practices. We argue that in order to increase the impact of data-driven decision-making aimed at students' improved learning in education at scale, we need to better understand educators' needs, their teaching practices and the context in which these practices occur, and how to support them in developing relevant knowledge, strategies and skills to facilitate the data-informed process of digitalization of education.

Submitted 14 May, 2021; originally announced May 2021.

arXiv:2011.04998 [pdf, other]

Margins are Insufficient for Explaining Gradient Boosting

Authors: Allan Grønlund, Lior Kamma, Kasper Green Larsen

Abstract: Boosting is one of the most successful ideas in machine learning, achieving great practical performance with little fine-tuning. The success of boosted classifiers is most often attributed to improvements in margins. The focus on margin explanations was pioneered in the seminal work by Schapire et al. (1998) and has culminated in the $k$'th margin generalization bound by Gao and Zhou (2013), which was recently proved to be near-tight for some data distributions (Grønlund et al. 2019). In this work, we first demonstrate that the $k$'th margin bound is inadequate in explaining the performance of state-of-the-art gradient boosters. We then explain the shortcomings of the $k$'th margin bound and prove a stronger and more refined margin-based generalization bound for boosted classifiers that indeed succeeds in explaining the performance of modern gradient boosters. Finally, we improve upon the recent generalization lower bound by Grønlund et al. (2019).

Submitted 10 November, 2020; originally announced November 2020.

arXiv:2006.02175 [pdf, ps, other]

Near-Tight Margin-Based Generalization Bounds for Support Vector Machines

Authors: Allan Grønlund, Lior Kamma, Kasper Green Larsen

Abstract: Support Vector Machines (SVMs) are among the most fundamental tools for binary classification. In its simplest formulation, an SVM produces a hyperplane separating two classes of data using the largest possible margin to the data. The focus on maximizing the margin has been well motivated through numerous generalization bounds. In this paper, we revisit and improve the classic generalization bounds in terms of margins. Furthermore, we complement our new generalization bound by a nearly matching lower bound, thus almost settling the generalization performance of SVMs in terms of margins.

Submitted 3 June, 2020; originally announced June 2020.

arXiv:2003.05808 [pdf, other]

Explaining the poor performance of the KASS algorithm implementation

Authors: Allan Grønlund

Abstract: By investigating the code for the KASS algorithm implementation used in the paper "Exploring the quantum speed limit with computer games" [1, arXiv:1506.09091] by Sørensen et al. (provided by the authors), we describe how the poor performance of the KASS algorithm reported in [1] is entirely caused by a simple sign error in a derivative calculation. Changing only this one sign in the KASS implementation, we show that the algorithm provides results comparable to all other algorithms considered for the problem [2,7], and performs better than all player solutions of [1]. Furthermore, we show that the player solutions were optimized with a different algorithm before being compared to the results from the KASS algorithm. The authors of [1] have acknowledged both findings. Finally, we show that in contrast to the claims in [1], the players did not explore two different strategies. In fact, all the players followed the same strategy.

Submitted 12 March, 2020; originally announced March 2020.

arXiv:1909.12518 [pdf, ps, other]

Margin-Based Generalization Lower Bounds for Boosted Classifiers

Authors: Allan Grønlund, Lior Kamma, Kasper Green Larsen, Alexander Mathiasen, Jelani Nelson

Abstract: Boosting is one of the most successful ideas in machine learning. The most well-accepted explanations for the low generalization error of boosting algorithms such as AdaBoost stem from margin theory. The study of margins in the context of boosting algorithms was initiated by Schapire, Freund, Bartlett and Lee (1998) and has inspired numerous boosting algorithms and generalization bounds. To date, the strongest known generalization (upper bound) is the $k$th margin bound of Gao and Zhou (2013). Despite the numerous generalization upper bounds that have been proved over the last two decades, nothing is known about the tightness of these bounds. In this paper, we give the first margin-based lower bounds on the generalization error of boosted classifiers. Our lower bounds nearly match the $k$th margin bound and thus almost settle the generalization performance of boosted classifiers in terms of margins.

Submitted 7 May, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

arXiv:1909.07685 [pdf, other]

Learning to Find Hydrological Corrections

Authors: Lars Arge, Allan Grønlund, Svend Christian Svendsen, Jonas Tranberg

Abstract: High resolution Digital Elevation models, such as the (Big) grid terrain model of Denmark with more than 200 billion measurements, are a basic requirement for water flow modelling and flood risk analysis. However, a large number of modifications often need to be made to even very accurate terrain models, such as the Danish model, before they can be used in realistic flow modeling. These modifications include removal of bridges, which otherwise will act as dams in flow modeling, and inclusion of culverts that transport water underneath roads. In fact, the Danish model is accompanied by a detailed set of hydrological corrections for the digital elevation model. However, producing these hydrological corrections is a very slow and expensive process, since it is to a large extent done manually and often with local input. This also means that corrections can be of varying quality. In this paper we propose a new algorithmic approach based on machine learning and convolutional neural networks for automatically detecting hydrological corrections for such large terrain data. Our model is able to detect most hydrological corrections known for the Danish model and quite a few more that should have been included in the original list.

Submitted 17 September, 2019; originally announced September 2019.

Comments: 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2019)

arXiv:1904.01008 [pdf, other]

Algorithms Clearly Beat Gamers at Quantum Moves. A Verification

Authors: Allan Grønlund

Abstract: The paper [Sørensen et al., Nature 532] considers how human players compare to algorithms for solving the Quantum Moves game BringHomeWater and designs new algorithms based on the intuition extracted from players. The claim by [Sørensen et al., Nature 532] is that players outperform widely used algorithms, in particular the KASS algorithm, based on the Krotov algorithm, and that player intuition is crucial to develop improved methods. However, as initially discussed by D. Sels [D. Sels, Phys. Rev. A 97], a standard Coordinate Ascent algorithm outperforms all players by a large margin. Although D. Sels only compares to player solutions, the simple algorithm outperforms all algorithms based on player solutions and Krotov, and it does so using much less time and fewer iterations. In this paper we elaborate on the methods discussed by D. Sels and verify that the presented algorithm solves the problem better than all players and algorithms derived from player solutions in [Sørensen et al., Nature 532]. We also verify the theoretical analysis presented by D. Sels, which gives a theoretically derived protocol that outperforms all players. We add a comparison with gradient ascent, or GRAPE. Starting from uniform random values, GRAPE outperforms all players by a large margin. GRAPE works at least as well as the methods from [Sørensen et al., Nature 532] initialized with player solutions. A standard analysis of the results from GRAPE provides a starting point for GRAPE that outperforms all algorithms from [Sørensen et al., Nature 532]. We compare with a basic Krotov algorithm, and get results similar to GRAPE, clearly outperforming players and the KASS algorithm. These experiments verify and underline the result in [D. Sels, Phys. Rev. A 97] that the conclusions from [Sørensen et al., Nature 532] regarding algorithms are untenable. In fact the opposite conclusions are true.

Submitted 2 July, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

arXiv:1901.10789 [pdf, other]

Optimal Minimal Margin Maximization with Boosting

Authors: Allan Grønlund, Kasper Green Larsen, Alexander Mathiasen

Abstract: Boosting algorithms produce a classifier by iteratively combining base hypotheses. It has been observed experimentally that the generalization error keeps improving even after achieving zero training error. One popular explanation attributes this to improvements in margins. A common goal in a long line of research is to maximize the smallest margin using as few base hypotheses as possible, culminating with the AdaBoostV algorithm by Rätsch and Warmuth [JMLR'04]. The AdaBoostV algorithm was later conjectured to yield an optimal trade-off between number of hypotheses trained and the minimal margin over all training points (Nie et al. [JMLR'13]). Our main contribution is a new algorithm refuting this conjecture. Furthermore, we prove a lower bound which implies that our new algorithm is optimal.

Submitted 30 January, 2019; originally announced January 2019.

arXiv:1802.06545 [pdf, ps, other]

Upper and lower bounds for dynamic data structures on strings

Authors: Raphael Clifford, Allan Grønlund, Kasper Green Larsen, Tatiana Starikovskaya

Abstract: We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give both conditional and unconditional lower bounds for variants of exact matching with wildcards, inner product, and Hamming distance computation via a sequence of reductions. As an example, we show that there does not exist an $O(m^{1/2-\varepsilon})$ time algorithm for a large range of these problems unless the online Boolean matrix-vector multiplication conjecture is false. We also provide nearly matching upper bounds for most of the problems we consider.

Submitted 19 February, 2018; originally announced February 2018.

Comments: Accepted at STACS'18

arXiv:1701.07204 [pdf, other]

Fast Exact k-Means, k-Medians and Bregman Divergence Clustering in 1D

Authors: Allan Grønlund, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, Mingzhou Song

Abstract: The $k$-Means clustering problem on $n$ points is NP-Hard for any dimension $d\ge 2$; however, for the 1D case there exist exact polynomial time algorithms. Previous literature reported an $O(kn^2)$ time dynamic programming algorithm that uses $O(kn)$ space. It turns out that the problem was considered under a different name more than twenty years ago. We present all the existing work that had been overlooked and compare the various solutions theoretically. Moreover, we show how to reduce the space usage for some of them, as well as generalize them to data structures that can quickly report an optimal $k$-Means clustering for any $k$. Finally we also generalize all the algorithms to work for the absolute distance and to work for any Bregman Divergence. We complement our theoretical contributions by experiments that compare the practical performance of the various algorithms.

Submitted 25 April, 2018; v1 submitted 25 January, 2017; originally announced January 2017.
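
The 1D clustering abstract above mentions an $O(kn^2)$ time, $O(kn)$ space dynamic program from earlier literature. A minimal sketch of such a baseline follows; it is not one of the faster algorithms the paper presents, and the function name, the prefix-sum cost trick and the table layout are assumptions made for this example.

```python
def kmeans_1d_cost(points, k):
    """Optimal k-means cost (sum of squared distances to cluster means) in 1D.

    Baseline O(k n^2) dynamic program over sorted points; in 1D an optimal
    clustering can be taken to consist of contiguous runs in sorted order.
    """
    xs = sorted(points)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= number of points")

    # Prefix sums of x and x^2 give the cost of any contiguous cluster in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(i, j):
        # Cost of clustering xs[i:j] (j > i) around its mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # dp[c][j] = best cost of splitting the first j points into c clusters.
    dp = [[inf] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            dp[c][j] = min(dp[c - 1][i] + cost(i, j) for i in range(c - 1, j))
    return dp[k][n]
```

The table has $(k+1)(n+1)$ entries and each entry scans at most $n$ split points, which matches the stated time and space bounds.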

arXiv:1611.00918 [pdf, other]

A Dichotomy for Regular Expression Membership Testing

Authors: Karl Bringmann, Allan Grønlund, Kasper Green Larsen

Abstract: We study regular expression membership testing: Given a regular expression of size $m$ and a string of size $n$, decide whether the string is in the language described by the regular expression. Its classic $O(nm)$ algorithm is one of the big success stories of the 70s, which allowed pattern matching to develop into the standard tool that it is today. Many special cases of pattern matching have been studied that can be solved faster than in quadratic time. However, a systematic study of tractable cases was made possible only recently, with the first conditional lower bounds reported by Backurs and Indyk [FOCS'16]. Restricted to any "type" of homogeneous regular expressions of depth 2 or 3, they either presented a near-linear time algorithm or a quadratic conditional lower bound, with one exception known as the Word Break problem. In this paper we complete their work as follows: 1) We present two almost-linear time algorithms that generalize all known almost-linear time algorithms for special cases of regular expression membership testing. 2) We classify all types, except for the Word Break problem, into almost-linear time or quadratic time assuming the Strong Exponential Time Hypothesis. This extends the classification from depth 2 and 3 to any constant depth. 3) For the Word Break problem we give an improved $\tilde{O}(n m^{1/3} + m)$ algorithm. Surprisingly, we also prove a matching conditional lower bound for combinatorial algorithms. This establishes Word Break as the only intermediate problem. In total, we prove matching upper and lower bounds for any type of bounded-depth homogeneous regular expressions, which yields a full dichotomy for regular expression membership testing.

Submitted 7 November, 2016; v1 submitted 3 November, 2016; originally announced November 2016.

arXiv:1504.01836 [pdf, ps, other]

New Unconditional Hardness Results for Dynamic and Online Problems

Authors: Raphael Clifford, Allan Grønlund, Kasper Green Larsen

Abstract: There has been a resurgence of interest in lower bounds whose truth rests on the conjectured hardness of well known computational problems. These conditional lower bounds have become important and popular due to the painfully slow progress on proving strong unconditional lower bounds. Nevertheless, the long term goal is to replace these conditional bounds with unconditional ones. In this paper we make progress in this direction by studying the cell probe complexity of two conjectured to be hard problems of particular importance: matrix-vector multiplication and a version of dynamic set disjointness known as Patrascu's Multiphase Problem. We give improved unconditional lower bounds for these problems as well as introducing new proof techniques of independent interest. These include a technique capable of proving strong threshold lower bounds of the following form: If we insist on having a very fast query time, then the update time has to be slow enough to compute a lookup table with the answer to every possible query. This is the first time a lower bound of this type has been proven.

Submitted 8 April, 2015; originally announced April 2015.

arXiv:1411.0644 [pdf, ps, other]

Towards Tight Lower Bounds for Range Reporting on the RAM

Authors: Allan Grønlund, Kasper Green Larsen

Abstract: In the orthogonal range reporting problem, we are to preprocess a set of $n$ points with integer coordinates on a $U \times U$ grid. The goal is to support reporting all $k$ points inside an axis-aligned query rectangle. This is one of the most fundamental data structure problems in databases and computational geometry. Despite the importance of the problem, its complexity remains unresolved in the word-RAM. On the upper bound side, three best tradeoffs exist: (1.) Query time $O(\lg \lg n + k)$ with $O(n \lg^{\varepsilon} n)$ words of space for any constant $\varepsilon>0$. (2.) Query time $O((1 + k) \lg \lg n)$ with $O(n \lg \lg n)$ words of space. (3.) Query time $O((1+k)\lg^{\varepsilon} n)$ with optimal $O(n)$ words of space. However, the only known query time lower bound is $Ω(\log \log n +k)$, even for linear space data structures. All three current best upper bound tradeoffs are derived by reducing range reporting to a ball-inheritance problem. Ball-inheritance is a problem that essentially encapsulates all previous attempts at solving range reporting in the word-RAM. In this paper we make progress towards closing the gap between the upper and lower bounds for range reporting by proving cell probe lower bounds for ball-inheritance. Our lower bounds are tight for a large range of parameters, excluding any further progress for range reporting using the ball-inheritance reduction.

Submitted 3 November, 2014; originally announced November 2014.

arXiv:1407.2907 [pdf, ps, other]

Approximate Range Emptiness in Constant Time and Optimal Space

Authors: Mayank Goswami, Allan Grønlund, Kasper Green Larsen, Rasmus Pagh

Abstract: This paper studies the \emph{$\varepsilon$-approximate range emptiness} problem, where the task is to represent a set $S$ of $n$ points from $\{0,\ldots,U-1\}$ and answer emptiness queries of the form "$[a ; b]\cap S \neq \emptyset$ ?" with a probability of \emph{false positives} allowed. This generalizes the functionality of \emph{Bloom filters} from single point queries to any interval length $L$. Setting the false positive rate to $\varepsilon/L$ and performing $L$ queries, Bloom filters yield a solution to this problem with space $O(n \lg(L/\varepsilon))$ bits, false positive probability bounded by $\varepsilon$ for intervals of length up to $L$, using query time $O(L \lg(L/\varepsilon))$. Our first contribution is to show that the space/error trade-off cannot be improved asymptotically: Any data structure for answering approximate range emptiness queries on intervals of length up to $L$ with false positive probability $\varepsilon$, must use space $Ω(n \lg(L/\varepsilon)) - O(n)$ bits. On the positive side we show that the query time can be improved greatly, to constant time, while matching our space lower bound up to a lower order additive term. This result is achieved through a succinct data structure for (non-approximate 1d) range emptiness/reporting queries, which may be of independent interest.

Submitted 10 July, 2014; originally announced July 2014.
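
The Bloom filter baseline described in the abstract of "Approximate Range Emptiness in Constant Time and Optimal Space" (set the per-point false positive rate to $\varepsilon/L$ and probe every point of the interval) can be sketched as follows. This is a minimal illustration only, not the paper's constant-time structure. The names `BloomFilter`, `build_range_filter`, and `range_maybe_nonempty` are hypothetical, the SHA-256 based hashing is an illustrative choice, and the sizing uses the standard textbook Bloom filter formulas.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter sized with the standard textbook formulas (an assumption)."""

    def __init__(self, n_items, fp_rate):
        # m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions.
        self.m = max(1, math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, x):
        # Derive k bit positions from salted SHA-256 digests (illustrative choice).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, x):
        for pos in self._positions(x):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, x):
        return all((self.bits[pos // 8] >> (pos % 8)) & 1 for pos in self._positions(x))


def build_range_filter(points, max_len, eps):
    """Store the points with per-point false positive rate eps / max_len."""
    bf = BloomFilter(len(points), eps / max_len)
    for p in points:
        bf.add(p)
    return bf


def range_maybe_nonempty(bf, a, b):
    """Answer "[a; b] intersects S?" by probing each point: up to L point queries."""
    return any(bf.might_contain(x) for x in range(a, b + 1))


bf = build_range_filter([10, 500, 9000], max_len=16, eps=0.01)
print(range_maybe_nonempty(bf, 5, 12))  # the interval contains 10, so never a false negative
```

By a union bound over at most $L$ point probes, each with false positive rate $\varepsilon/L$, an empty interval is reported as non-empty with probability at most $\varepsilon$; the cost is the linear-in-$L$ query time that the paper improves to constant.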

arXiv:1404.0799 [pdf, other]

Threesomes, Degenerates, and Love Triangles

Authors: Allan Grønlund, Seth Pettie

Abstract: The 3SUM problem is to decide, given a set of $n$ real numbers, whether any three sum to zero. It is widely conjectured that a trivial $O(n^2)$-time algorithm is optimal and over the years the consequences of this conjecture have been revealed. This 3SUM conjecture implies $\Omega(n^2)$ lower bounds on numerous problems in computational geometry and a variant of the conjecture implies strong lower bounds on triangle enumeration, dynamic graph algorithms, and string matching data structures. In this paper we refute the 3SUM conjecture. We prove that the decision tree complexity of 3SUM is $O(n^{3/2}\sqrt{\log n})$ and give two subquadratic 3SUM algorithms, a deterministic one running in $O(n^2 / (\log n/\log\log n)^{2/3})$ time and a randomized one running in $O(n^2 (\log\log n)^2 / \log n)$ time with high probability. Our results lead directly to improved bounds for $k$-variate linear degeneracy testing for all odd $k\ge 3$. The problem is to decide, given a linear function $f(x_1,\ldots,x_k) = \alpha_0 + \sum_{1\le i\le k} \alpha_i x_i$ and a set $A \subset \mathbb{R}$, whether $0\in f(A^k)$. We show the decision tree complexity of this problem is $O(n^{k/2}\sqrt{\log n})$. Finally, we give a subcubic algorithm for a generalization of the $(\min,+)$-product over real-valued matrices and apply it to the problem of finding zero-weight triangles in weighted graphs. We give a depth-$O(n^{5/2}\sqrt{\log n})$ decision tree for this problem, as well as an algorithm running in time $O(n^3 (\log\log n)^2/\log n)$.

Submitted 30 May, 2014; v1 submitted 3 April, 2014; originally announced April 2014.
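
For reference, the quadratic baseline that the abstract of "Threesomes, Degenerates, and Love Triangles" calls trivial can be sketched with the familiar sort-and-two-pointers scan. This is a minimal sketch of that textbook $O(n^2)$ method, not of the paper's subquadratic algorithms. The function name `has_three_sum` is hypothetical, and the sketch assumes the three numbers must come from distinct positions and that sums compare exactly (integers or exact rationals rather than floating-point values).

```python
def has_three_sum(values):
    """Return True if entries at three distinct positions sum to zero.

    Sorting costs O(n log n); each outer iteration does a linear
    two-pointer scan, for O(n^2) time overall.
    """
    a = sorted(values)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return True
            if s < 0:
                lo += 1  # sum too small: move to a larger middle element
            else:
                hi -= 1  # sum too large: move to a smaller top element
    return False


print(has_three_sum([-5, 1, 4, 10]))  # -5 + 1 + 4 == 0
print(has_three_sum([1, 2, 3]))       # all positive, so no zero triple
```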

arXiv:1205.0505 [pdf, other]

doi 10.1371/journal.pone.0033960

Fractal Profit Landscape of the Stock Market

Authors: Andreas Gronlund, Il Gu Yi, Beom Jun Kim

Abstract: We investigate the structure of the profit landscape obtained from the most basic, fluctuation-based trading strategy applied to daily stock price data. The strategy is parameterized by only two variables, p and q. Stocks are sold and bought if the log return is bigger than p and less than -q, respectively. Repetition of this simple strategy for a long time gives the profit defined in the underlying two-dimensional parameter space of p and q. It is revealed that the local maxima in the profit landscape are spread in the form of a fractal structure. The fractal structure implies that successful strategies are neither localized to any region of the profit landscape nor spaced evenly throughout it, which makes the optimization notoriously hard and hypersensitive to partial or limited information. The concrete implication of this property is demonstrated by showing that optimization of one stock for future values or other stocks yields a worse profit than a strategy that ignores fluctuations, i.e., a long-term buy-and-hold strategy.

Submitted 2 May, 2012; originally announced May 2012.

Comments: 12 pages, 4 figures

Journal ref: PLoS One (7)4 : e33960 (2012)

arXiv:0904.0522 [pdf, ps, other]

doi 10.1209/0295-5075/86/24001

Evolution of Rogue Waves in Interacting Wave Systems

Authors: A. Grönlund, B. Eliasson, M. Marklund

Abstract: Large amplitude water waves on deep water have long been known in the seafaring community, and are a cause of great concern for, e.g., oil platform constructions. The concept of such freak waves is nowadays, thanks to satellite and radar measurements, well established within the scientific community. There are a number of important models and approaches for the theoretical description of such waves. By analyzing the scaling behavior of freak wave formation in a model of two interacting waves, described by two coupled nonlinear Schroedinger equations, we show that there are two different dynamical scaling behaviors above and below a critical angle theta_c of the direction of the interacting waves. Below theta_c, all wave systems evolve and display statistics similar to a wave system of non-interacting waves. The results equally apply to other systems described by the nonlinear Schroedinger equations, and should be of interest when designing optical wave guides.

Submitted 3 April, 2009; originally announced April 2009.

Comments: 5 pages, 2 figures, to appear in Europhysics Letters

arXiv:q-bio/0605043 [pdf, ps, other]

doi 10.1209/0295-5075/81/28003

Dynamic scaling regimes of collective decision making

Authors: Andreas Gronlund, Petter Holme, Petter Minnhagen

Abstract: We investigate a social system of agents faced with a binary choice. We assume there is a correct, or beneficial, outcome of this choice. Furthermore, we assume agents are influenced by others in making their decision, and that the agents can obtain information that may guide them towards making a correct decision. The dynamic model we propose is of nonequilibrium type, converging to a final decision. We run it on random graphs and scale-free networks. On random graphs, we find two distinct regions in terms of the "finalizing time" -- the time until all agents have finalized their decisions. On scale-free networks, on the other hand, there do not seem to be any such distinct scaling regions.

Submitted 24 August, 2007; v1 submitted 25 May, 2006; originally announced May 2006.

Journal ref: Europhys. Lett. 81, (2008) 28003

arXiv:physics/0505050 [pdf, ps, other]

doi 10.1142/S0219525905000439

A network-based threshold model for the spreading of fads in society and markets

Authors: Andreas Gronlund, Petter Holme

Abstract: We investigate the behavior of a threshold model for the spreading of fads and similar phenomena in society. The model gives the fad dynamics and is intended to be confined to an underlying network structure. We investigate the whole parameter space of the fad dynamics on three types of network models. The dynamics we discover are rich and highly dependent on the underlying network structure. For some range of the parameter space, for all types of substrate networks, there is a great variety of sizes and life-lengths of the fads -- what one sees in real-world social and economic systems.

Submitted 6 May, 2005; originally announced May 2005.

Journal ref: Advances in Complex Systems 8, 261-273 (2005)

arXiv:cond-mat/0505400 [pdf, ps, other]

doi 10.1103/PhysRevE.72.046117

Searchability of Networks

Authors: M. Rosvall, A. Gronlund, P. Minnhagen, K. Sneppen

Abstract: We investigate the searchability of complex systems in terms of their interconnectedness. Associating searchability with the number and size of branch points along the paths between the nodes, we find that scale-free networks are relatively difficult to search, and thus that the abundance of scale-free networks in nature and society may reflect an attempt to protect local areas in a highly interconnected network from nonrelated communication. In fact, starting from a random node, real-world networks with higher order organization like modular or hierarchical structure are even more difficult to navigate than random scale-free networks. The searchability at the node level opens the possibility for a generalized hierarchy measure that captures both the hierarchy in the usual terms of trees as in military structures, and the intrinsic hierarchical nature of topological hierarchies for scale-free networks as in the Internet.

Submitted 7 December, 2005; v1 submitted 17 May, 2005; originally announced May 2005.

Comments: 9 pages, 10 figures

Journal ref: Phys. Rev. E 72, 046117 (2005)

arXiv:physics/0504181 [pdf, ps, other]

Modelling the dynamics of youth subcultures

Authors: Petter Holme, Andreas Gronlund

Abstract: What are the dynamics behind youth subcultures such as punk, hippie, or hip-hop cultures? How does the global dynamics of these subcultures relate to the individual's search for a personal identity? We propose a simple dynamical model to address these questions and find that only a few assumptions of the individual's behaviour are necessary to regenerate known features of youth culture.

Submitted 25 April, 2005; originally announced April 2005.

Comments: To appear in JASSS

Journal ref: Journal of Artificial Societies and Social Simulation 8, (2005) 3

arXiv:cond-mat/0406268 [pdf, ps, other]

doi 10.1103/PhysRevE.70.061908

Networking genetic regulation and neural computation: Directed network topology and its effect on the dynamics

Authors: Andreas Grönlund

Abstract: Two different types of directed networks are investigated, transcriptional regulation networks and neural networks. The directed network structures are studied and also shown to reflect the different processes taking place on the networks. The distribution of influence, identified as the number of downstream vertices, is used as a tool for investigating random vertex removal. In the transcriptional regulation networks we observe that only a small number of vertices have a large influence. The small influences of most vertices limit the effect of a random removal to, in most cases, only a small fraction of vertices in the network. The neural network has a rather different topology with respect to the influence, which is large for most vertices. To further investigate the effect of vertex removal we simulate the biological processes taking place on the networks. Contrary to the presumed large effect of random vertex removal in the neural network, the high density of edges in conjunction with the dynamics used makes the change in the state of the system highly localized around the removed vertex.

Submitted 5 October, 2004; v1 submitted 11 June, 2004; originally announced June 2004.

Comments: 7 figures, 1 table

arXiv:cond-mat/0401537 [pdf, ps, other]

doi 10.1088/0031-8949/71/6/018

Correlations in Networks associated to Preferential Growth

Authors: Andreas Gronlund, Kim Sneppen, Petter Minnhagen

Abstract: Combinations of random and preferential growth for both on-growing and stationary networks are studied and a hierarchical topology is observed. Thus, for real-world scale-free networks which do not exhibit hierarchical features, preferential growth is probably not the main ingredient in the growth process. An example of such a real-world network is the protein-protein interaction network in yeast, which exhibits pronounced anti-hierarchical features.

Submitted 2 December, 2004; v1 submitted 27 January, 2004; originally announced January 2004.

Comments: 4 pages, 4 figures

arXiv:cond-mat/0312506 [pdf, ps, other]

doi 10.1103/PhysRevB.69.064515

Scaling determination of the nonlinear I-V characteristics for 2D superconducting networks

Authors: Petter Minnhagen, Beom Jun Kim, A. Gronlund

Abstract: It is shown from computer simulations that the current-voltage ($I$-$V$) characteristics for the two-dimensional XY model with resistively-shunted Josephson junction dynamics and Monte Carlo dynamics obey a finite-size scaling form from which the nonlinear $I$-$V$ exponent $a$ can be determined to good precision. This determination supports the conclusion $a=z+1$, where $z$ is the dynamic critical exponent. The results are discussed in the light of the contrary conclusion reached by Tang and Chen [Phys. Rev. B {\bf 67}, 024508 (2003)] and the possibility of a breakdown of scaling suggested by Bormann [Phys. Rev. Lett. {\bf 78}, 4324 (1997)].

Submitted 18 December, 2003; originally announced December 2003.

Comments: 6 pages, to appear in PRB

Journal ref: Phys. Rev. B 69, 064515 (2004)

arXiv:cond-mat/0312010 [pdf, ps, other]

doi 10.1103/PhysRevE.70.036108

The networked seceder model: Group formation in social and economic systems

Authors: Andreas Gronlund, Petter Holme

Abstract: The seceder model illustrates how the desire to be different than the average can lead to formation of groups in a population. We turn the original, agent-based seceder model into a model of network evolution. We find that the structural characteristics of our model closely match those of empirical social networks. Statistics for the dynamics of group formation are also given. Extensions of the model to networks of companies are also discussed.

Submitted 1 December, 2003; originally announced December 2003.

Journal ref: Phys. Rev. E 70, 036108 (2004)

Showing 1–29 of 29 results for author: Grønlund, A