
Universal Topological Regularities of Syntactic Structures: Decoupling Efficiency from Optimization

Authors: Fermín Moscoso del Prado Martín

Abstract: Human syntactic structures are usually represented as graphs. Much research has focused on the mapping between such graphs and linguistic sequences, but less attention has been paid to the shapes of the graphs themselves: their topologies. This study investigates how the topologies of syntactic graphs reveal traces of the processes that led to their emergence. I report a new universal regularity in syntactic structures: Their topology is communicatively efficient above chance. The pattern holds, without exception, for all 124 languages studied, across linguistic families and modalities (spoken, written, and signed). This pattern can arise from a process optimizing for communicative efficiency or, alternatively, by construction, as a by-effect of a sublinear preferential attachment process reflecting language production mechanisms known from psycholinguistics. This dual explanation shows how communicative efficiency, per se, does not require optimization. Among the two options, efficiency without optimization offers the better explanation for the new pattern.

Submitted 31 January, 2023; originally announced February 2023.

Comments: 30 pages, 7 figures

arXiv:2105.06166 [pdf, ps, other]

The Dynamic k-Mismatch Problem

Authors: Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, Przemysław Uznański

Abstract: The text-to-pattern Hamming distances problem asks to compute the Hamming distances between a given pattern of length $m$ and all length-$m$ substrings of a given text of length $n\ge m$. We focus on the $k$-mismatch version of the problem, where a distance needs to be returned only if it does not exceed a threshold $k$. We assume $n\le 2m$ (in general, one can partition the text into overlapping blocks). In this work, we show data structures for the dynamic version of this problem supporting two operations: An update performs a single-letter substitution in the pattern or the text, and a query, given an index $i$, returns the Hamming distance between the pattern and the text substring starting at position $i$, or reports that it exceeds $k$. First, we show a data structure with $\tilde{O}(1)$ update and $\tilde{O}(k)$ query time. Then we show that $\tilde{O}(k)$ update and $\tilde{O}(1)$ query time is also possible. These two provide an optimal trade-off for the dynamic $k$-mismatch problem with $k \le \sqrt{n}$: we prove that, conditioned on the strong 3SUM conjecture, one cannot simultaneously achieve $k^{1-\Omega(1)}$ time for all operations. For $k\ge \sqrt{n}$, we give another lower bound, conditioned on the Online Matrix-Vector conjecture, that excludes algorithms taking $n^{1/2-\Omega(1)}$ time per operation. This is tight for constant-sized alphabets: Clifford et al. (STACS 2018) achieved $\tilde{O}(\sqrt{n})$ time per operation in that case, but with $\tilde{O}(n^{3/4})$ time per operation for large alphabets.
We improve and extend this result with an algorithm that, given $1\le x\le k$, achieves update time $\tilde{O}(\frac{n}{k} +\sqrt{\frac{nk}{x}})$ and query time $\tilde{O}(x)$. In particular, for $k\ge \sqrt{n}$, an appropriate choice of $x$ yields $\tilde{O}(\sqrt[3]{nk})$ time per operation, which is $\tilde{O}(n^{2/3})$ when no threshold $k$ is provided.
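
The two operations described in this abstract can be illustrated with a naive baseline. The sketch below is not the data structure from the paper: it stores the strings directly, so an update takes constant time and a query scans the whole alignment. The class name `NaiveDynamicKMismatch` and its method names are hypothetical, chosen only for illustration.

```python
from typing import List, Optional


class NaiveDynamicKMismatch:
    """Naive baseline for the dynamic k-mismatch interface (illustrative only)."""

    def __init__(self, pattern: str, text: str, k: int) -> None:
        if len(text) < len(pattern):
            raise ValueError("text must be at least as long as the pattern")
        self.pattern: List[str] = list(pattern)
        self.text: List[str] = list(text)
        self.k = k

    def update_pattern(self, pos: int, letter: str) -> None:
        # Single-letter substitution in the pattern.
        self.pattern[pos] = letter

    def update_text(self, pos: int, letter: str) -> None:
        # Single-letter substitution in the text.
        self.text[pos] = letter

    def query(self, i: int) -> Optional[int]:
        """Hamming distance between the pattern and text[i:i+m], or None if it exceeds k."""
        m = len(self.pattern)
        if i < 0 or i + m > len(self.text):
            raise IndexError("alignment out of range")
        mismatches = 0
        for a, b in zip(self.pattern, self.text[i:i + m]):
            if a != b:
                mismatches += 1
                if mismatches > self.k:
                    return None  # distance exceeds the threshold k
        return mismatches


if __name__ == "__main__":
    ds = NaiveDynamicKMismatch("abc", "abcabd", k=1)
    print(ds.query(0), ds.query(3), ds.query(1))
    ds.update_text(5, "c")
    print(ds.query(3))
```

The scan stops as soon as the count passes $k$, but in the worst case it still reads all $m$ positions, which is the cost the bounds in the abstract avoid.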

Submitted 28 March, 2022; v1 submitted 13 May, 2021; originally announced May 2021.

arXiv:1906.08362 [pdf, other]

Trepan Reloaded: A Knowledge-driven Approach to Explaining Artificial Neural Networks

Authors: Roberto Confalonieri, Tillman Weyde, Tarek R. Besold, Fermín Moscoso del Prado Martín

Abstract: Explainability in Artificial Intelligence has been revived as a topic of active research by the need of conveying safety and trust to users in the 'how' and 'why' of automated decision-making. Whilst a plethora of approaches have been developed for post-hoc explainability, only a few focus on how to use domain knowledge, and how this influences the understandability of global explanations from the users' perspective. In this paper, we show how ontologies help the understandability of global post-hoc explanations, presented in the form of symbolic models. In particular, we build on Trepan, an algorithm that explains artificial neural networks by means of decision trees, and we extend it to include ontologies modeling domain knowledge in the process of generating explanations. We present the results of a user study that measures the understandability of decision trees using a syntactic complexity measure, and through time and accuracy of responses as well as reported user confidence and understandability. The user study considers domains where explanations are critical, namely, in finance and medicine. The results show that decision trees generated with our algorithm, taking into account domain knowledge, are more understandable than those generated by standard Trepan without the use of ontologies.

Submitted 21 November, 2019; v1 submitted 19 June, 2019; originally announced June 2019.

arXiv:1905.01254 [pdf, ps, other]

RLE edit distance in near optimal time

Authors: Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, Przemysław Uznański

Abstract: We show that the edit distance between two run-length encoded strings of compressed lengths $m$ and $n$ respectively, can be computed in $\mathcal{O}(mn\log(mn))$ time. This improves the previous record by a factor of $\mathcal{O}(n/\log(mn))$. The running time of our algorithm is within subpolynomial factors of being optimal, subject to the standard SETH-hardness assumption. This effectively closes a line of algorithmic research first started in 1993.
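
For orientation, the problem statement can be made concrete with a simple baseline: decode both run-length encoded strings and apply the textbook dynamic program. This is not the algorithm from the paper; its running time depends on the uncompressed lengths rather than on the numbers of runs $m$ and $n$. The run representation (a list of `(character, count)` pairs) and the function names are assumptions made for this sketch.

```python
from typing import List, Tuple

Run = Tuple[str, int]  # (character, repeat count); assumed representation


def rle_decode(runs: List[Run]) -> str:
    """Expand a run-length encoded string into its plain form."""
    return "".join(ch * count for ch, count in runs)


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, keeping only two rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rle_edit_distance_baseline(x: List[Run], y: List[Run]) -> int:
    """Baseline: decode both inputs, then run the uncompressed DP."""
    return edit_distance(rle_decode(x), rle_decode(y))


if __name__ == "__main__":
    x = [("a", 3), ("b", 2)]  # "aaabb"
    y = [("a", 2), ("b", 3)]  # "aabbb"
    print(rle_edit_distance_baseline(x, y))
```

Long runs make the decoded strings much larger than their encodings, which is why an algorithm whose cost depends only on the compressed lengths is of interest.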

Submitted 3 May, 2019; originally announced May 2019.

arXiv:1709.05542 [pdf]

CIRCE: The Canarias InfraRed Camera Experiment for the Gran Telescopio Canarias

Authors: Stephen S. Eikenberry, Miguel Charcos, Michelle L. Edwards, Alan Garner, Nestor Lasso-Cabrera, Richard D. Stelter, Antonio Marin-Franch, S. Nicholas Raines, Kendall Ackley, John G. Bennett, Javier A. Cenarro, Brian Chinn, H. Veronica Donoso, Raymond Frommeyer, Kevin Hanna, Michael D. Herlevich, Jeff Julian, Paola Miller, Scott Mullin, Charles H. Murphey, Chris Packham, Frank Varosi, Claudia Vega, Craig Warner, A. N. Ramaprakash , et al. (29 additional authors not shown)

Abstract: The Canarias InfraRed Camera Experiment (CIRCE) is a near-infrared (1-2.5 micron) imager, polarimeter and low-resolution spectrograph operating as a visitor instrument for the Gran Telescopio Canarias 10.4-meter telescope. It was designed and built largely by graduate students and postdocs, with help from the UF astronomy engineering group, and is funded by the University of Florida and the U.S. National Science Foundation. CIRCE is intended to help fill the gap in near-infrared capabilities prior to the arrival of EMIR to the GTC, and will also provide the following scientific capabilities to complement EMIR after its arrival: high-resolution imaging, narrowband imaging, high-time-resolution photometry, imaging polarimetry, low resolution spectroscopy. In this paper, we review the design, fabrication, integration, lab testing, and on-sky performance results for CIRCE. These include a novel approach to the opto-mechanical design, fabrication, and alignment.

Submitted 16 September, 2017; originally announced September 2017.

Comments: 41 pages, 18 figures

arXiv:1709.00553 [pdf, ps, other]

Dynamic Shortest Path and Transitive Closure Algorithms: A Survey

Authors: Daniel P. Martin

Abstract: Algorithms which compute properties over graphs have always been of interest in computer science, with some of the fundamental algorithms, such as Dijkstra's algorithm, dating back to the 50s. Since the 70s there has been interest in computing over graphs which are constantly changing, in a way which is more efficient than simply recomputing each time the graph changes. In this paper we provide a survey of both the foundational, and the state of the art, algorithms which solve either shortest path or transitive closure problems in either fully or partially dynamic graphs. We balance this with the known conditional lower bounds.

Submitted 29 October, 2017; v1 submitted 2 September, 2017; originally announced September 2017.

Comments: 17 pages, 3 figures

arXiv:1304.7359 [pdf, ps, other]

doi 10.1088/1742-5468/2013/07/L07001

Constant conditional entropy and related hypotheses

Authors: Ramon Ferrer-i-Cancho, Łukasz Dębowski, Fermín Moscoso del Prado Martín

Abstract: Constant entropy rate (conditional entropies must remain constant as the sequence length increases) and uniform information density (conditional probabilities must remain constant as the sequence length increases) are two information theoretic principles that are argued to underlie a wide range of linguistic phenomena. Here we revise the predictions of these principles in the light of Hilberg's law on the scaling of conditional entropy in language and related laws. We show that constant entropy rate (CER) and two interpretations for uniform information density (UID), full UID and strong UID, are inconsistent with these laws. Strong UID implies CER but the reverse is not true. Full UID, a particular case of UID, leads to costly uncorrelated sequences that are totally unrealistic. We conclude that CER and its particular cases are incomplete hypotheses about the scaling of conditional entropies.

Submitted 23 May, 2013; v1 submitted 27 April, 2013; originally announced April 2013.

Comments: introduction improved; typos corrected

Journal ref: Journal of Statistical Mechanics, L07001 (2013)

arXiv:1209.1751 [pdf, ps, other]

doi 10.1088/1742-5468/2011/12/L12002

Information content versus word length in random typing

Authors: Ramon Ferrer-i-Cancho, Fermín Moscoso del Prado Martín

Abstract: Recently, it has been claimed that a linear relationship between a measure of information content and word length is expected from word length optimization and it has been shown that this linearity is supported by a strong correlation between information content and word length in many languages (Piantadosi et al. 2011, PNAS 108, 3825-3826). Here, we study in detail some connections between this measure and standard information theory. The relationship between the measure and word length is studied for the popular random typing process where a text is constructed by pressing keys at random from a keyboard containing letters and a space behaving as a word delimiter. Although this random process does not optimize word lengths according to information content, it exhibits a linear relationship between information content and word length. The exact slope and intercept are presented for three major variants of the random typing process. A strong correlation between information content and word length can simply arise from the units making a word (e.g., letters) and not necessarily from the interplay between a word and its context as proposed by Piantadosi et al. In itself, the linear relation does not entail the results of any optimization process.
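
The linearity claim can be checked numerically in a toy setting. The sketch below makes several assumptions not stated in the abstract: all letters are equally likely, the space key has probability `p_space`, empty words are allowed, and information content is taken as $-\log_2$ of a word's probability. The slope and intercept it yields belong to this toy setup and are not necessarily those derived in the paper for its three variants.

```python
import math


def word_probability(length: int, n_letters: int = 26, p_space: float = 0.2) -> float:
    """Probability of typing one specific word of the given length, then a space.

    Assumes equiprobable letters; the space key is pressed with probability p_space.
    """
    p_letter = (1.0 - p_space) / n_letters
    return (p_letter ** length) * p_space


def information_content(length: int, n_letters: int = 26, p_space: float = 0.2) -> float:
    """Information content in bits, taken here as -log2 of the word's probability."""
    return -math.log2(word_probability(length, n_letters, p_space))


if __name__ == "__main__":
    # In this toy model the relation is exactly linear in word length:
    # I(l) = l * log2(N / (1 - p_space)) - log2(p_space).
    slope = math.log2(26 / (1.0 - 0.2))
    intercept = -math.log2(0.2)
    for length in range(1, 6):
        predicted = slope * length + intercept
        print(length, round(information_content(length), 4), round(predicted, 4))
```

No step in this process optimizes anything, yet each additional letter adds the same number of bits, which is the point the abstract makes about linearity and optimization.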

Submitted 8 September, 2012; originally announced September 2012.

Journal ref: Journal of Statistical Mechanics, L12002 (2011)

arXiv:0911.2381 [pdf, ps, other]

Analytical Determination of Fractal Structure in Stochastic Time Series

Authors: Fermín Moscoso del Prado Martín

Abstract: Current methods for determining whether a time series exhibits fractal structure (FS) rely on subjective assessments on estimators of the Hurst exponent (H). Here, I introduce the Bayesian Assessment of Scaling, an analytical framework for drawing objective and accurate inferences on the FS of time series. The technique exploits the scaling property of the diffusion associated with a time series. The resulting criterion is simple to compute and represents an accurate characterization of the evidence supporting different hypotheses on the scaling regime of a time series. Additionally, a closed-form Maximum Likelihood estimator of H is derived from the criterion, and this estimator outperforms the best available estimators.

Submitted 12 November, 2009; originally announced November 2009.

Comments: 9 pages, 4 figures

arXiv:0908.3432 [pdf, other]

The baseline for response latency distributions

Authors: Fermín Moscoso del Prado Martín

Abstract: Response latency -- the time taken to initiate or complete an action or task -- is one of the principal measures used to investigate the mechanisms subserving human and animal cognitive processes. The right tails of response latency distributions have received little attention in experimental psychology. This is because such very long latencies have traditionally been considered irrelevant for psychological processes; instead, they are expected to reflect 'contingent' neural events unrelated to the experimental question. Most current theories predict the right tail of response latency distributions to decrease exponentially. In consequence, current standard practice recommends discarding very long response latencies as 'outliers'. Here, I show that the right tails of response latency distributions always follow a power-law with a slope of exactly two. This entails that the very late responses cannot be considered outliers. Rather they provide crucial information that falsifies most current theories of cognitive processing with respect to their exponential tail predictions. This exponent constitutes a fundamental constant of the cognitive system that groups behavioral measures with a variety of physical phenomena.

Submitted 24 August, 2009; originally announced August 2009.

Comments: Submitted manuscript

arXiv:0908.3170 [pdf, other]

The thermodynamics of human reaction times

Authors: Fermín Moscoso del Prado Martín

Abstract: I present a new approach for the interpretation of reaction time (RT) data from behavioral experiments. From a physical perspective, the entropy of the RT distribution provides a model-free estimate of the amount of processing performed by the cognitive system. In this way, the focus is shifted from the conventional interpretation of individual RTs being either long or short, into their distribution being more or less complex in terms of entropy. The new approach enables the estimation of the cognitive processing load without reference to the informational content of the stimuli themselves, thus providing a more appropriate estimate of the cognitive impact of different sources of information that are carried by experimental stimuli or tasks. The paper introduces the formulation of the theory, followed by an empirical validation using a database of human RTs in lexical tasks (visual lexical decision and word naming). The results show that this new interpretation of RTs is more powerful than the traditional one. The method provides theoretical estimates of the processing loads elicited by individual stimuli. These loads sharply distinguish the responses from different tasks. In addition, it provides upper-bound estimates for the speed at which the system processes information. Finally, I argue that the theoretical proposal, and the associated empirical evidence, provide strong arguments for an adaptive system that systematically adjusts its operational processing speed to the particular demands of each stimulus.
This finding is in contradiction with Hick's law, which posits a relatively constant processing speed within an experimental context.

Submitted 21 August, 2009; originally announced August 2009.

Comments: Submitted manuscript

Showing 1–11 of 11 results for author: Martin, D P