
Showing 1–4 of 4 results for author: De Gregorio, J

  1. arXiv:2403.18430  [pdf, other]

    cs.CL physics.data-an physics.soc-ph stat.AP

    Exploring language relations through syntactic distances and geographic proximity

    Authors: Juan De Gregorio, Raúl Toral, David Sánchez

    Abstract: Languages are grouped into families that share common linguistic traits. While this approach has been successful in understanding genetic relations between diverse languages, more analyses are needed to accurately quantify their relatedness, especially in less studied linguistic levels such as syntax. Here, we explore linguistic distances using series of parts of speech (POS) extracted from the Un…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 36 pages

  2. arXiv:2310.07547  [pdf, other]

    cond-mat.stat-mech nlin.CD physics.data-an

    Entropy estimators for Markovian sequences: A comparative analysis

    Authors: Juan De Gregorio, David Sanchez, Raul Toral

    Abstract: Entropy estimation is a fundamental problem in information theory that has applications in various fields, including physics, biology, and computer science. Estimating the entropy of discrete sequences can be challenging due to limited data and the lack of unbiased estimators. Most existing entropy estimators are designed for sequences of independent events and their performance varies depending on…

    Submitted 17 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 19 pages, 9 figures

    Journal ref: Entropy 2024, 26(1), 79

  3. arXiv:2208.11175  [pdf, other]

    cs.CL cond-mat.stat-mech physics.soc-ph

    Ordinal analysis of lexical patterns

    Authors: David Sanchez, Luciano Zunino, Juan De Gregorio, Raul Toral, Claudio Mirasso

    Abstract: Words are fundamental linguistic units that connect thoughts and things through meaning. However, words do not appear independently in a text sequence. The existence of syntactic rules induces correlations among neighboring words. Using an ordinal pattern approach, we present an analysis of lexical statistical connections for 11 major languages. We find that the diverse manners that languages util…

    Submitted 14 March, 2023; v1 submitted 23 August, 2022; originally announced August 2022.

    Comments: 9 pages, 12 figures, 2 tables; v2: the section on universality has been removed because previous results were affected by spurious correlations. Published version

    Journal ref: Chaos 33, 033121 (2023)

  4. arXiv:2205.11931  [pdf, other]

    cond-mat.stat-mech physics.data-an

    An improved estimator of Shannon entropy with applications to systems with memory

    Authors: Juan De Gregorio, David Sanchez, Raul Toral

    Abstract: We investigate the memory properties of discrete sequences built upon a finite number of states. We find that the block entropy can reliably determine the memory for systems modeled as Markov chains of arbitrary finite order. Further, we provide an entropy estimator that gives remarkably accurate results when correlations are present. To illustrate our findings, we calculate the memory of daily pr…

    Submitted 18 November, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Journal ref: Chaos, Solitons and Fractals 165, 112797 (2022)