-
Rauzy dimension and finite-state dimension
Authors:
Verónica Becher,
Olivier Carton,
Santiago Figueira
Abstract:
In a paper of 1976, Rauzy studied two complexity notions, $\underlineβ$ and $\overlineβ$, for infinite sequences over a finite alphabet. The function $\underlineβ$ is maximum exactly in the Borel normal sequences and $\overlineβ$ is minimum exactly in the sequences that, when added to any Borel normal sequence, the result is also Borel normal. Although the definition of $\underlineβ$ and…
▽ More
In a paper of 1976, Rauzy studied two complexity notions, $\underlineβ$ and $\overlineβ$, for infinite sequences over a finite alphabet. The function $\underlineβ$ is maximum exactly in the Borel normal sequences and $\overlineβ$ is minimum exactly in the sequences that, when added to any Borel normal sequence, the result is also Borel normal. Although the definition of $\underlineβ$ and $\overlineβ$ do not involve finite-state automata, we establish some connections between them and the lower $\underline{\rm dim}$ and upper $\overline{\rm dim}$ finite-state dimension (or other equivalent notions like finite-state compression ratio, aligned-entropy or cumulative log-loss of finite-state predictors). We show tight lower and upper bounds on $\underline{\rm dim}$ and $\overline{\rm dim}$ as functions of $\underlineβ$ and $\overlineβ$, respectively. In particular this implies that sequences with $\overline{\rm dim}$ zero are exactly the ones that that, when added to any Borel normal sequence, the result is also Borel normal. We also show that the finite-state dimensions $\underline{\rm dim}$ and $\overline{\rm dim}$ are essentially subadditive. We need two technical tools that are of independent interest. One is the family of local finite-state automata, which are automata whose memory consists of the last $k$ read symbols for some fixed integer $k$. We show that compressors based on local finite-state automata are as good as standard finite-state compressors. The other one is a notion of finite-state relational (non-deterministic) compressor, which can compress an input in several ways provided the input can always be recovered from any of its outputs. We show that such compressors cannot compress more than standard (deterministic) finite-state compressors.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
A Modal Logic with n-ary Relations Over Paths: Comonadic Semantics and Expressivity
Authors:
Santiago Figueira,
Gabriel Goren Roig
Abstract:
Game comonads give a categorical semantics for comparison games in Finite Model Theory, thus providing an abstract characterisation of logical equivalence for a wide range of logics, each one captured through a specific choice of comonad. However, data-aware logics such as CoreDataXPath present sophisticated notions of bisimulation which defy a straightforward comonadic encoding. In this work we b…
▽ More
Game comonads give a categorical semantics for comparison games in Finite Model Theory, thus providing an abstract characterisation of logical equivalence for a wide range of logics, each one captured through a specific choice of comonad. However, data-aware logics such as CoreDataXPath present sophisticated notions of bisimulation which defy a straightforward comonadic encoding. In this work we begin the comonadic treatment of data-aware logics by introducing a generalisation of Modal Logic that allows relation symbols of arbitrary arity as atoms of the syntax, which we call Path Predicate Modal Logic or PPML. We motivate this logic as arising from a shift in perspective on a already studied restricted version of CoreDataXPath, called DataGL, and prove that PPML recovers DataGL for a specific choice of signature. We argue that this shift in perspective allows the capturing and designing of new data-aware logics. On the other hand, PPML enjoys an intrinsic motivation in that it extends Modal Logic to predicate over more general models. Having defined the simulation and bisimulation games for PPML and having proven a Hennessy-Milner-type theorem, we define the PPML comonad and prove that it captures these games, following analogous results in the literature. Our treatment is novel in that we explicitly prove that our comonad satisfies the axioms of arboreal categories and arboreal covers. Using the comonadic machinery, we immediately obtain a tree-model property for PPML. Finally, we define a translation functor from relational structures into Kripke structures and use its properties to prove a series of polynomial-time reductions from PPML problems to their Basic Modal Logic counterparts. Our results explain precisely in what sense PPML lets us view general relational structures through the modal lens.
△ Less
Submitted 18 July, 2023;
originally announced July 2023.
-
PDL on Steroids: on Expressive Extensions of PDL with Intersection and Converse
Authors:
Diego Figueira,
Santiago Figueira,
Edwin Pin
Abstract:
We introduce CPDL+, a family of expressive logics rooted in Propositional Dynamic Logic (PDL). In terms of expressive power, CPDL+ strictly contains PDL extended with intersection and converse (a.k.a. ICPDL) as well as Conjunctive Queries (CQ), Conjunctive Regular Path Queries (CRPQ), or some known extensions thereof (Regular Queries and CQPDL). We investigate the expressive power, characterizatio…
▽ More
We introduce CPDL+, a family of expressive logics rooted in Propositional Dynamic Logic (PDL). In terms of expressive power, CPDL+ strictly contains PDL extended with intersection and converse (a.k.a. ICPDL) as well as Conjunctive Queries (CQ), Conjunctive Regular Path Queries (CRPQ), or some known extensions thereof (Regular Queries and CQPDL). We investigate the expressive power, characterization of bisimulation, satisfiability, and model checking for CPDL+.
We argue that natural subclasses of CPDL+ can be defined in terms of the tree-width of the underlying graphs of the formulas. We show that the class of CPDL+ formulas of tree-width 2 is equivalent to ICPDL, and that it also coincides with CPDL+ formulas of tree-width 1. However, beyond tree-width 2, incrementing the tree-width strictly increases the expressive power. We characterize the expressive power for every class of fixed tree-width formulas in terms of a bisimulation game with pebbles. Based on this characterization, we show that CPDL+ has a tree-like model property. We prove that the satisfiability problem is decidable in 2ExpTime on fixed tree-width formulas, coinciding with the complexity of ICPDL. We also exhibit classes for which satisfiability is reduced to ExpTime. Finally, we establish that the model checking problem for fixed tree-width formulas is in \ptime, contrary to the full class CPDL+.
△ Less
Submitted 20 April, 2023;
originally announced April 2023.
-
Towards a more flexible Language of Thought: Bayesian grammar updates after each concept exposure
Authors:
Pablo Tano,
Sergio Romano,
Mariano Sigman,
Alejo Salles,
Santiago Figueira
Abstract:
Recent approaches to human concept learning have successfully combined the power of symbolic, infinitely productive rule systems and statistical learning to explain our ability to learn new concepts from just a few examples. The aim of most of these studies is to reveal the underlying language structuring these representations and providing a general substrate for thought. However, describing a mo…
▽ More
Recent approaches to human concept learning have successfully combined the power of symbolic, infinitely productive rule systems and statistical learning to explain our ability to learn new concepts from just a few examples. The aim of most of these studies is to reveal the underlying language structuring these representations and providing a general substrate for thought. However, describing a model of thought that is fixed once trained is against the extensive literature that shows how experience shapes concept learning. Here, we ask about the plasticity of these symbolic descriptive languages. We perform a concept learning experiment that demonstrates that humans can change very rapidly the repertoire of symbols they use to identify concepts, by compiling expressions which are frequently used into new symbols of the language. The pattern of concept learning times is accurately described by a Bayesian agent that rationally updates the probability of compiling a new expression according to how useful it has been to compress concepts so far. By portraying the Language of Thought as a flexible system of rules, we also highlight the difficulties to pin it down empirically.
△ Less
Submitted 26 September, 2019; v1 submitted 17 May, 2018;
originally announced May 2018.
-
Axiomatizations for downward XPath on Data Trees
Authors:
Sergio Abriola,
María Emilia Descotte,
Raul Fervari,
Santiago Figueira
Abstract:
We give sound and complete axiomatizations for XPath with data tests by "equality" or "inequality", and containing the single "child" axis. This data-aware logic predicts over data trees, which are tree-like structures whose every node contains a label from a finite alphabet and a data value from an infinite domain. The language allows us to compare data values of two nodes but cannot access the d…
▽ More
We give sound and complete axiomatizations for XPath with data tests by "equality" or "inequality", and containing the single "child" axis. This data-aware logic predicts over data trees, which are tree-like structures whose every node contains a label from a finite alphabet and a data value from an infinite domain. The language allows us to compare data values of two nodes but cannot access the data values themselves (i.e. there is no comparison by constants). Our axioms are in the style of equational logic, extending the axiomatization of data-oblivious XPath, by B. ten Cate, T. Litak and M. Marx. We axiomatize the full logic with tests by "equality" and "inequality", and also a simpler fragment with "equality" tests only. Our axiomatizations apply both to node expressions and path expressions. The proof of completeness relies on a novel normal form theorem for XPath with data tests.
△ Less
Submitted 13 March, 2017; v1 submitted 13 May, 2016;
originally announced May 2016.
-
Normality in non-integer bases and polynomial time randomness
Authors:
Javier Almarza,
Santiago Figueira
Abstract:
It is known that if $x\in[0,1]$ is polynomial time random (i.e. no polynomial time computable martingale succeeds on the binary fractional expansion of $x$) then $x$ is normal in any integer base greater than one. We show that if $x$ is polynomial time random and $β>1$ is Pisot, then $x$ is "normal in base $β$", in the sense that the sequence $(xβ^n)_{n\in\mathbb{N}}$ is uniformly distributed modu…
▽ More
It is known that if $x\in[0,1]$ is polynomial time random (i.e. no polynomial time computable martingale succeeds on the binary fractional expansion of $x$) then $x$ is normal in any integer base greater than one. We show that if $x$ is polynomial time random and $β>1$ is Pisot, then $x$ is "normal in base $β$", in the sense that the sequence $(xβ^n)_{n\in\mathbb{N}}$ is uniformly distributed modulo one. We work with the notion of "$P$-martingale", a generalization of martingales to non-uniform distributions, and show that a sequence over a finite alphabet is distributed according to an irreducible, invariant Markov measure~$P$ if an only if no $P$-martingale whose betting factors are computed by a deterministic finite automaton succeeds on it. This is a generalization of Schnorr and Stimm's characterization of normal sequences in integer bases. Our results use tools and techniques from symbolic dynamics, together with automata theory and algorithmic randomness.
△ Less
Submitted 30 October, 2014;
originally announced October 2014.
-
LT^2C^2: A language of thought with Turing-computable Kolmogorov complexity
Authors:
Sergio Romano,
Mariano Sigman,
Santiago Figueira
Abstract:
In this paper, we present a theoretical effort to connect the theory of program size to psychology by implementing a concrete language of thought with Turing-computable Kolmogorov complexity (LT^2C^2) satisfying the following requirements: 1) to be simple enough so that the complexity of any given finite binary sequence can be computed, 2) to be based on tangible operations of human reasoning (pri…
▽ More
In this paper, we present a theoretical effort to connect the theory of program size to psychology by implementing a concrete language of thought with Turing-computable Kolmogorov complexity (LT^2C^2) satisfying the following requirements: 1) to be simple enough so that the complexity of any given finite binary sequence can be computed, 2) to be based on tangible operations of human reasoning (printing, repeating,...), 3) to be sufficiently powerful to generate all possible sequences but not too powerful as to identify regularities which would be invisible to humans. We first formalize LT^2C^2, giving its syntax and semantics and defining an adequate notion of program size. Our setting leads to a Kolmogorov complexity function relative to LT^2C^2 which is computable in polynomial time, and it also induces a prediction algorithm in the spirit of Solomonoff's inductive inference theory. We then prove the efficacy of this language by investigating regularities in strings produced by participants attempting to generate random strings. Participants had a profound understanding of randomness and hence avoided typical misconceptions such as exaggerating the number of alternations. We reasoned that remaining regularities would express the algorithmic nature of human thoughts, revealed in the form of specific patterns. Kolmogorov complexity relative to LT^2C^2 passed three expected tests examined here: 1) human sequences were less complex than control PRNG sequences, 2) human sequences were not stationary, showing decreasing values of complexity resulting from fatigue, 3) each individual showed traces of algorithmic stability since fitting of partial sequences was more effective to predict subsequent sequences than average fits. This work extends on previous efforts to combine notions of Kolmogorov complexity theory and algorithmic information theory to psychology, by explicitly ...
△ Less
Submitted 4 March, 2013;
originally announced March 2013.
-
Ackermannian and Primitive-Recursive Bounds with Dickson's Lemma
Authors:
Diego Figueira,
Santiago Figueira,
Sylvain Schmitz,
Philippe Schnoebelen
Abstract:
Dickson's Lemma is a simple yet powerful tool widely used in termination proofs, especially when dealing with counters or related data structures. However, most computer scientists do not know how to derive complexity upper bounds from such termination proofs, and the existing literature is not very helpful in these matters.
We propose a new analysis of the length of bad sequences over (N^k,\leq…
▽ More
Dickson's Lemma is a simple yet powerful tool widely used in termination proofs, especially when dealing with counters or related data structures. However, most computer scientists do not know how to derive complexity upper bounds from such termination proofs, and the existing literature is not very helpful in these matters.
We propose a new analysis of the length of bad sequences over (N^k,\leq) and explain how one may derive complexity upper bounds from termination proofs. Our upper bounds improve earlier results and are essentially tight.
△ Less
Submitted 19 July, 2011; v1 submitted 18 July, 2010;
originally announced July 2010.
-
The Question of Expressiveness in the Generation of Referring Expressions
Authors:
Carlos Areces,
Santiago Figueira,
Daniel Gorín
Abstract:
We study the problem of generating referring expressions modulo different notions of expressive power. We define the notion of $\+L$-referring expression, for a formal language $\+L$ equipped with a semantics in terms of relational models. We show that the approach is independent of the particular algorithm used to generate the referring expression by providing examples using the frameworks of \ci…
▽ More
We study the problem of generating referring expressions modulo different notions of expressive power. We define the notion of $\+L$-referring expression, for a formal language $\+L$ equipped with a semantics in terms of relational models. We show that the approach is independent of the particular algorithm used to generate the referring expression by providing examples using the frameworks of \cite{AKS08} and \cite{Krahmer2003}. We provide some new complexity bounds, discuss the issue of the length of the generated descriptions, and propose ways in which the two approaches can be combined.
△ Less
Submitted 23 June, 2010;
originally announced June 2010.