-
Blockchain Metrics and Indicators in Cryptocurrency Trading
Authors:
Juan C. King,
Roberto Dale,
José M. Amigó
Abstract:
The objective of this paper is the construction of new indicators that can be useful to operate in the cryptocurrency market. These indicators are based on public data obtained from the blockchain network, specifically from the nodes that make up Bitcoin mining. Therefore, our analysis is unique to that network. The results obtained with numerical simulations of algorithmic trading and prediction via statistical models and Machine Learning demonstrate the importance of variables such as the hash rate, the difficulty of mining or the cost per transaction when it comes to trading Bitcoin assets or predicting the direction of price. Variables obtained from the blockchain network will be called here blockchain metrics. The corresponding indicators (inspired by the "Hash Ribbon") perform well in locating buy signals. From our results, we conclude that such blockchain indicators allow obtaining information with a statistical advantage in the highly volatile cryptocurrency market.
Submitted 11 February, 2024;
originally announced March 2024.
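The abstract above mentions indicators inspired by the "Hash Ribbon" without giving their definition. As a rough illustration only, the sketch below flags a buy signal when a short moving average of some blockchain metric (for example, the hash rate) crosses above a long one. The function names, the window lengths, and the toy series are assumptions made for this example; they are not taken from the paper.

```python
def moving_average(values, window):
    """Trailing simple moving average; None until `window` points are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_buy_signals(metric, short_window, long_window):
    """Indices where the short average crosses from at-or-below to above the long one.

    Assumption: this crossover rule is only a generic stand-in for a
    Hash-Ribbon-style indicator, not the paper's actual construction.
    """
    short = moving_average(metric, short_window)
    long_ = moving_average(metric, long_window)
    signals = []
    for t in range(1, len(metric)):
        if None in (short[t - 1], long_[t - 1], short[t], long_[t]):
            continue  # not enough history yet
        if short[t - 1] <= long_[t - 1] and short[t] > long_[t]:
            signals.append(t)
    return signals


# Toy metric: a decline followed by a recovery (made-up numbers).
toy_metric = [10, 9, 8, 7, 6, 5, 6, 7, 8, 9, 10, 11]
signals = crossover_buy_signals(toy_metric, short_window=2, long_window=4)
```

On real data the windows would be much longer (days or weeks of daily observations), and any signal would still need to be validated against prices.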
-
(Mis)align: A Simple Dynamic Framework for Modeling Interpersonal Coordination
Authors:
Grace Qiyuan Miao,
Rick Dale,
Alexia Galati
Abstract:
As people coordinate in daily interactions, they engage in different patterns of behavior to achieve successful outcomes. This includes both synchrony - the temporal coordination of the same behaviors at the same time - and complementarity - the coordination of the same or different behaviors that may occur at different relative times. Using computational methods, we develop a simple framework to describe the interpersonal dynamics of behavioral synchrony and complementarity over time, and explore their task dependence. A key feature of this framework is the inclusion of a task context that mediates interactions, and consists of active, inactive, and inhibitory constraints on communication. Initial simulation results show that these task constraints can be a robust predictor of simulated agents' behaviors over time. We also show that the framework can reproduce some general patterns observed in human interaction data. We describe preliminary theoretical implications from these results, and relate them to broader proposals of synergistic self-organization in communication.
Submitted 30 August, 2023;
originally announced August 2023.
-
A Statistical Model of Word Rank Evolution
Authors:
Alex John Quijano,
Rick Dale,
Suzanne Sindi
Abstract:
The availability of large linguistic data sets enables data-driven approaches to study linguistic change. The Google Books corpus unigram frequency data set is used to investigate the word rank dynamics in eight languages. We observed the rank changes of the unigrams from 1900 to 2008 and compared it to a Wright-Fisher inspired model that we developed for our analysis. The model simulates a neutral evolutionary process with the restriction of having no disappearing and added words. This work explains the mathematical framework of the model - written as a Markov Chain with multinomial transition probabilities - to show how frequencies of words change in time. From our observations in the data and our model, word rank stability shows two types of characteristics: (1) the increase/decrease in ranks are monotonic, or (2) the rank stays the same. Based on our model, high-ranked words tend to be more stable while low-ranked words tend to be more volatile. Some words change in ranks in two ways: (a) by an accumulation of small increasing/decreasing rank changes in time and (b) by shocks of increase/decrease in ranks. Most words in all of the languages we have looked at are rank stable, but not as stable as a neutral model would predict. The stopwords and Swadesh words are observed to be rank stable across eight languages indicating linguistic conformity in established languages. These signatures suggest unigram frequencies in all languages have changed in a manner inconsistent with a purely neutral evolutionary process.
Submitted 14 February, 2022; v1 submitted 21 July, 2021;
originally announced July 2021.
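The abstract above describes a Wright-Fisher inspired neutral model, written as a Markov chain with multinomial transition probabilities, in which no words disappear and none are added. The following is a minimal sketch of one such generation step. How the paper enforces the "no disappearing words" restriction is not stated in the abstract; reserving one token per word, as done here, is an assumption of this example, and the toy counts are made up.

```python
import random
from collections import Counter


def wright_fisher_step(counts, rng):
    """Resample the corpus for one generation from the current word frequencies.

    Assumption: one token is reserved for every word so that the vocabulary
    never shrinks; the remaining tokens are drawn multinomially.
    """
    words = list(counts)
    total = sum(counts.values())
    free = total - len(words)  # tokens left after reserving one per word
    draws = rng.choices(words, weights=[counts[w] for w in words], k=free)
    sampled = Counter(draws)
    return {w: 1 + sampled[w] for w in words}


def ranks(counts):
    """Rank 1 is the most frequent word; ties are broken alphabetically."""
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {w: i + 1 for i, w in enumerate(ordered)}


rng = random.Random(0)
counts = {"the": 50, "of": 30, "cat": 15, "zebra": 5}  # toy corpus
history = [ranks(counts)]
for _ in range(10):
    counts = wright_fisher_step(counts, rng)
    history.append(ranks(counts))
```

Tracking `history` over many generations gives simulated rank trajectories that could be compared with observed ones.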
-
Complexity-based permutation entropies: from deterministic time series to white noise
Authors:
J. M. Amigó,
R. Dale,
P. Tempesta
Abstract:
This is a paper in the intersection of time series analysis and complexity theory that presents new results on permutation complexity in general and permutation entropy in particular. In this context, permutation complexity refers to the characterization of time series by means of ordinal patterns (permutations), entropic measures, decay rates of missing ordinal patterns, and more. Since the inception of this "ordinal" methodology, its practical application to any type of scalar time series and real-valued processes has proven to be simple and useful. However, the theoretical aspects have remained limited to noiseless deterministic series and dynamical systems, the main obstacle being the super-exponential growth of visible permutations with length when randomness (also in the form of observational noise) is present in the data. To overcome this difficulty, we take a new approach through complexity classes, which are precisely defined by the growth of visible permutations with length, regardless of the deterministic or noisy nature of the data. We consider three major classes: exponential, sub-factorial and factorial. The next step is to adapt the concept of Z-entropy to each of those classes, which we call permutation entropy because it coincides with the conventional permutation entropy on the exponential class. Z-entropies are a family of group entropies, each of them extensive on a given complexity class. The result is a unified approach to the ordinal analysis of deterministic and random processes, from dynamical systems to white noise, with new concepts and tools. Numerical simulations show that permutation entropy discriminates time series from all complexity classes.
Submitted 5 November, 2021; v1 submitted 5 March, 2021;
originally announced March 2021.
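For readers unfamiliar with the ordinal methodology mentioned in the abstract above, the sketch below computes ordinal patterns and the conventional (Shannon-type) permutation entropy of a scalar series. It does not implement the Z-entropies introduced in the paper. The function names and the `white_noise` helper are assumptions of this example.

```python
import math
import random
from collections import Counter


def ordinal_patterns(series, order):
    """Ordinal pattern of each sliding window: the indices sorted by value."""
    patterns = []
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        patterns.append(tuple(sorted(range(order), key=lambda k: window[k])))
    return patterns


def permutation_entropy(series, order):
    """Shannon entropy (natural log) of the ordinal-pattern distribution."""
    counts = Counter(ordinal_patterns(series, order))
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def visible_patterns(series, order):
    """Number of distinct ordinal patterns that actually occur in the series."""
    return len(set(ordinal_patterns(series, order)))


def white_noise(n, seed):
    """Hypothetical helper: n independent uniform random values."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]
```

A strictly increasing series shows a single pattern and therefore zero entropy, whereas a long white-noise series is expected to show all `order!` patterns with entropy close to `log(order!)`.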
-
Dynamic Natural Language Processing with Recurrence Quantification Analysis
Authors:
Rick Dale,
Nicholas D. Duran,
Moreno Coco
Abstract:
Writing and reading are dynamic processes. As an author composes a text, a sequence of words is produced. This sequence is one that, the author hopes, causes a revisitation of certain thoughts and ideas in others. These processes of composition and revisitation by readers are ordered in time. This means that text itself can be investigated under the lens of dynamical systems. A common technique for analyzing the behavior of dynamical systems, known as recurrence quantification analysis (RQA), can be used as a method for analyzing sequential structure of text. RQA treats text as a sequential measurement, much like a time series, and can thus be seen as a kind of dynamic natural language processing (NLP). The extension has several benefits. Because it is part of a suite of time series analysis tools, many measures can be extracted in one common framework. Secondly, the measures have a close relationship with some commonly used measures from natural language processing. Finally, using recurrence analysis offers an opportunity to expand analysis of text by developing theoretical descriptions derived from complex dynamic systems. We showcase an example analysis on 8,000 texts from the Gutenberg Project, compare it to well-known NLP approaches, and describe an R package (crqanlp) that can be used in conjunction with R library crqa.
Submitted 19 March, 2018;
originally announced March 2018.
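To make the idea of treating text as a sequential measurement concrete, the sketch below computes two common recurrence measures on a word sequence: the recurrence rate and the determinism (the share of recurrent points lying on diagonal lines). It is written in plain Python for illustration and does not use or reproduce the API of the crqa or crqanlp packages. The function names and the toy sentence are assumptions of this example.

```python
def recurrence_points(tokens):
    """All (i, j) with i != j where the same token occurs at both positions."""
    positions = {}
    for i, tok in enumerate(tokens):
        positions.setdefault(tok, []).append(i)
    points = set()
    for idxs in positions.values():
        for i in idxs:
            for j in idxs:
                if i != j:
                    points.add((i, j))
    return points


def recurrence_rate(tokens):
    """Fraction of off-diagonal cells of the recurrence plot that are recurrent."""
    n = len(tokens)
    if n < 2:
        return 0.0
    return len(recurrence_points(tokens)) / (n * (n - 1))


def determinism(tokens, min_len=2):
    """Share of recurrent points on diagonal lines of length >= min_len."""
    points = recurrence_points(tokens)
    if not points:
        return 0.0
    on_lines = 0
    for (i, j) in points:
        if (i - 1, j - 1) in points:
            continue  # only start counting at the first point of a line
        length = 1
        while (i + length, j + length) in points:
            length += 1
        if length >= min_len:
            on_lines += length
    return on_lines / len(points)


tokens = "the cat sat on the mat the cat sat".split()
rr = recurrence_rate(tokens)
det = determinism(tokens)
```

In the toy sentence, the repeated phrase "the cat sat" produces the diagonal lines that drive determinism, while isolated repetitions of "the" contribute to the recurrence rate only.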
-
Timescales of Massive Human Entrainment
Authors:
Riccardo Fusaroli,
Marcus Perlman,
Alan Mislove,
Alexandra Paxton,
Teenie Matlock,
Rick Dale
Abstract:
The past two decades have seen an upsurge of interest in the collective behaviors of complex systems composed of many agents entrained to each other and to external events. In this paper, we extend concepts of entrainment to the dynamics of human collective attention. We conducted a detailed investigation of the unfolding of human entrainment - as expressed by the content and patterns of hundreds of thousands of messages on Twitter - during the 2012 US presidential debates. By time locking these data sources, we quantify the impact of the unfolding debate on human attention. We show that collective social behavior covaries second-by-second to the interactional dynamics of the debates: A candidate speaking induces rapid increases in mentions of his name on social media and decreases in mentions of the other candidate. Moreover, interruptions by an interlocutor increase the attention received. We also highlight a distinct time scale for the impact of salient moments in the debate: Mentions in social media start within 5-10 seconds after the moment; peak at approximately one minute; and slowly decay in a consistent fashion across well-known events during the debates. Finally, we show that public attention after an initial burst slowly decays through the course of the debates. Thus we demonstrate that large-scale human entrainment may hold across a number of distinct scales, in an exquisitely time-locked fashion. The methods and results pave the way for careful study of the dynamics and mechanisms of large-scale human entrainment.
Submitted 11 January, 2015; v1 submitted 29 October, 2014;
originally announced October 2014.
-
Cross-Recurrence Quantification Analysis of Categorical and Continuous Time Series: an R package
Authors:
Moreno I. Coco,
Rick Dale
Abstract:
This paper describes the R package crqa to perform cross-recurrence quantification analysis of two time series of either a categorical or continuous nature. Streams of behavioral information, from eye movements to linguistic elements, unfold over time. When two people interact, such as in conversation, they often adapt to each other, leading these behavioral levels to exhibit recurrent states. In dialogue, for example, interlocutors adapt to each other by exchanging interactive cues: smiles, nods, gestures, choice of words, and so on. In order for us to capture closely the goings-on of dynamic interaction, and uncover the extent of coupling between two individuals, we need to quantify how much recurrence is taking place at these levels. Methods available in crqa would allow researchers in cognitive science to pose such questions as how much are two people recurrent at some level of analysis, what is the characteristic lag time for one person to maximally match another, or whether one person is leading another. First, we set the theoretical ground to understand the difference between 'correlation' and 'co-visitation' when comparing two time series, using an aggregative or cross-recurrence approach. Then, we describe more formally the principles of cross-recurrence, and show with the current package how to carry out analyses applying them. We end the paper by comparing computational efficiency, and results' consistency, of crqa R package, with the benchmark MATLAB toolbox crptoolbox. We show perfect comparability between the two libraries on both levels.
Submitted 3 October, 2013; v1 submitted 1 October, 2013;
originally announced October 2013.
-
B(eo)W(u)LF: Facilitating recurrence analysis on multi-level language
Authors:
A. Paxton,
R. Dale
Abstract:
Discourse analysis may seek to characterize not only the overall composition of a given text but also the dynamic patterns within the data. This technical report introduces a data format intended to facilitate multi-level investigations, which we call the by-word long-form or B(eo)W(u)LF. Inspired by the long-form data format required for mixed-effects modeling, B(eo)W(u)LF structures linguistic data into an expanded matrix encoding any number of researcher-specified markers, making it ideal for recurrence-based analyses. While we do not necessarily claim to be the first to use methods along these lines, we have created a series of tools utilizing Python and MATLAB to enable such discourse analyses and demonstrate them using 319 lines of the Old English epic poem, Beowulf, translated into modern English.
Submitted 12 August, 2013;
originally announced August 2013.
-
Random Sentences from a Generalized Phrase-Structure Grammar Interpreter
Authors:
Rick Dale
Abstract:
In numerous domains in cognitive science it is often useful to have a source for randomly generated corpora. These corpora may serve as a foundation for artificial stimuli in a learning experiment (e.g., Ellefson & Christiansen, 2000), or as input into computational models (e.g., Christiansen & Dale, 2001). The following compact and general C program interprets a phrase-structure grammar specified in a text file. It follows parameters set at a Unix or Unix-based command-line and generates a corpus of random sentences from that grammar.
Submitted 14 February, 2007;
originally announced February 2007.
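The abstract above describes a C program that reads a phrase-structure grammar from a text file and generates random sentences. The sketch below illustrates the same general idea in Python. It is not the author's program: the grammar, the in-memory dictionary format, and the function names are all hypothetical, and the file-reading and command-line handling of the original are left out.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of productions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N": [["dog"], ["cat"]],
    "V": [["sees"], ["sleeps"]],
}


def expand(symbol, grammar, rng):
    """Recursively rewrite a symbol until only terminal words remain."""
    if symbol not in grammar:
        return [symbol]  # terminals are any symbols without a rule
    production = rng.choice(grammar[symbol])
    words = []
    for part in production:
        words.extend(expand(part, grammar, rng))
    return words


def random_corpus(grammar, n, seed=None, start="S"):
    """Generate n random sentences from the grammar's start symbol."""
    rng = random.Random(seed)
    return [" ".join(expand(start, grammar, rng)) for _ in range(n)]


corpus = random_corpus(GRAMMAR, 20, seed=1)
```

Productions are chosen uniformly here; a recursive grammar would additionally need a depth limit or production weights to guarantee termination.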
-
Selectional Restrictions in HPSG
Authors:
Ion Androutsopoulos,
Robert Dale
Abstract:
Selectional restrictions are semantic sortal constraints imposed on the participants of linguistic constructions to capture contextually-dependent constraints on interpretation. Despite their limitations, selectional restrictions have proven very useful in natural language applications, where they have been used frequently in word sense disambiguation, syntactic disambiguation, and anaphora resolution. Given their practical value, we explore two methods to incorporate selectional restrictions in the HPSG theory, assuming that the reader is familiar with HPSG. The first method employs HPSG's Background feature and a constraint-satisfaction component pipe-lined after the parser. The second method uses subsorts of referential indices, and blocks readings that violate selectional restrictions during parsing. While theoretically less satisfactory, we have found the second method particularly useful in the development of practical systems.
Submitted 23 August, 2000;
originally announced August 2000.
-
The Role of the Gricean Maxims in the Generation of Referring Expressions
Authors:
Robert Dale,
Ehud Reiter
Abstract:
Grice's maxims of conversation [Grice 1975] are framed as directives to be followed by a speaker of the language. This paper argues that, when considered from the point of view of natural language generation, such a characterisation is rather misleading, and that the desired behaviour falls out quite naturally if we view language generation as a goal-oriented process. We argue this position with particular regard to the generation of referring expressions.
Submitted 18 April, 1996;
originally announced April 1996.
-
Generating One-Anaphoric Expressions: Where Does the Decision Lie?
Authors:
Robert Dale
Abstract:
Most natural language generation systems embody mechanisms for choosing whether to subsequently refer to an already-introduced entity by means of a pronoun or a definite noun phrase. Relatively few systems, however, consider referring to entities by means of one-anaphoric expressions such as "the small green one". This paper looks at what is involved in generating referring expressions of this type. Consideration of how to fit this capability into a standard algorithm for referring expression generation leads us to suggest a revision of some of the assumptions that underlie existing approaches. We demonstrate the usefulness of our approach to one-anaphora generation in the context of a simple database interface application, and make some observations about the impact of this approach on referring expression generation more generally.
Submitted 9 May, 1995;
originally announced May 1995.
-
Computational Interpretations of the Gricean Maxims in the Generation of Referring Expressions
Authors:
Robert Dale,
Ehud Reiter
Abstract:
We examine the problem of generating definite noun phrases that are appropriate referring expressions; i.e., noun phrases that (1) successfully identify the intended referent to the hearer whilst (2) not conveying to her any false conversational implicatures (Grice, 1975). We review several possible computational interpretations of the conversational implicature maxims, with different computational costs, and argue that the simplest may be the best, because it seems to be closest to what human speakers do. We describe our recommended algorithm in detail, along with a specification of the resources a host system must provide in order to make use of the algorithm, and an implementation used in the natural language generation component of the IDAS system.
This paper will appear in the April-June 1995 issue of Cognitive Science, and is made available on cmp-lg with the permission of Ablex, the publishers of that journal.
Submitted 26 April, 1995;
originally announced April 1995.