-
Forecasting: theory and practice
Authors:
Fotios Petropoulos,
Daniele Apiletti,
Vassilios Assimakopoulos,
Mohamed Zied Babai,
Devon K. Barrow,
Souhaib Ben Taieb,
Christoph Bergmeir,
Ricardo J. Bessa,
Jakub Bijak,
John E. Boylan,
Jethro Browell,
Claudio Carnevale,
Jennifer L. Castle,
Pasquale Cirillo,
Michael P. Clements,
Clara Cordeiro,
Fernando Luiz Cyrino Oliveira,
Shari De Baets,
Alexander Dokumentov,
Joanne Ellison,
Piotr Fiszeder,
Philip Hans Franses,
David T. Frazier,
Michael Gilliland,
M. Sinan Gönül
, et al. (55 additional authors not shown)
Abstract:
Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts.
We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases.
Submitted 5 January, 2022; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Understanding interdependency through complex information sharing
Authors:
Fernando Rosas,
Vasilis Ntranos,
Christopher J. Ellison,
Sofie Pollin,
Marian Verhelst
Abstract:
The interactions between three or more random variables are often nontrivial, poorly understood, and yet, are paramount for future advances in fields such as network information theory, neuroscience, genetics and many others. In this work, we propose to analyze these interactions as different modes of information sharing. Towards this end, we introduce a novel axiomatic framework for decomposing the joint entropy, which characterizes the various ways in which random variables can share information. The key contribution of our framework is to distinguish between interdependencies where the information is shared redundantly, and synergistic interdependencies where the sharing structure exists in the whole but not between the parts. We show that our axioms determine unique formulas for all the terms of the proposed decomposition for a number of cases of interest. Moreover, we show how these results can be applied to several network information theory problems, providing a more intuitive understanding of their fundamental limits.
Submitted 15 September, 2015;
originally announced September 2015.
-
Intersection Information based on Common Randomness
Authors:
Virgil Griffith,
Edwin K. P. Chong,
Ryan G. James,
Christopher J. Ellison,
James P. Crutchfield
Abstract:
The introduction of the partial information decomposition generated a flurry of proposals for defining an intersection information that quantifies how much of "the same information" two or more random variables specify about a target random variable. As of yet, none is wholly satisfactory. A palatable measure of intersection information would provide a principled way to quantify slippery concepts, such as synergy. Here, we introduce an intersection information measure based on the Gács-Körner common random variable that is the first to satisfy the coveted target monotonicity property. Our measure is imperfect, too, and we suggest directions for improvement.
Submitted 10 June, 2015; v1 submitted 6 October, 2013;
originally announced October 2013.
-
Exact Complexity: The Spectral Decomposition of Intrinsic Computation
Authors:
James P. Crutchfield,
Christopher J. Ellison,
Paul M. Riechers
Abstract:
We give exact formulae for a wide family of complexity measures that capture the organization of hidden nonlinear processes. The spectral decomposition of operator-valued functions leads to closed-form expressions involving the full eigenvalue spectrum of the mixed-state presentation of a process's epsilon-machine causal-state dynamic. Measures include correlation functions, power spectra, past-future mutual information, transient and synchronization informations, and many others. As a result, a direct and complete analysis of intrinsic computation is now available for the temporal organization of finitary hidden Markov models and nonlinear dynamical systems with generating partitions and for the spatial organization in one-dimensional systems, including spin systems, cellular automata, and complex materials via chaotic crystallography.
Submitted 15 September, 2013;
originally announced September 2013.
-
How Hidden are Hidden Processes? A Primer on Crypticity and Entropy Convergence
Authors:
John R. Mahoney,
Christopher J. Ellison,
Ryan G. James,
James P. Crutchfield
Abstract:
We investigate a stationary process's crypticity---a measure of the difference between its hidden state information and its observed information---using the causal states of computational mechanics. Here, we motivate crypticity and cryptic order as physically meaningful quantities that monitor how hidden a hidden process is. This is done by recasting previous results on the convergence of block entropy and block-state entropy in a geometric setting, one that is more intuitive and that leads to a number of new results. For example, we connect crypticity to how an observer synchronizes to a process. We show that the block-causal-state entropy is a convex function of block length. We give a complete analysis of spin chains. We present a classification scheme that surveys stationary processes in terms of their possible cryptic and Markov orders. We illustrate related entropy convergence behaviors using a new form of foliated information diagram. Finally, along the way, we provide a variety of interpretations of crypticity and cryptic order to establish their naturalness and pervasiveness. Hopefully, these will inspire new applications in spatially extended and network dynamical systems.
Submitted 6 August, 2011;
originally announced August 2011.
-
Information Symmetries in Irreversible Processes
Authors:
Christopher J. Ellison,
John R. Mahoney,
Ryan G. James,
James P. Crutchfield,
Joerg Reichardt
Abstract:
We study dynamical reversibility in stationary stochastic processes from an information theoretic perspective. Extending earlier work on the reversibility of Markov chains, we focus on finitary processes with arbitrarily long conditional correlations. In particular, we examine stationary processes represented or generated by edge-emitting, finite-state hidden Markov models. Surprisingly, we find pervasive temporal asymmetries in the statistics of such stationary processes with the consequence that the computational resources necessary to generate a process in the forward and reverse temporal directions are generally not the same. In fact, an exhaustive survey indicates that most stationary processes are irreversible. We study the ensuing relations between model topology in different representations, the process's statistical properties, and its reversibility in detail. A process's temporal asymmetry is efficiently captured using two canonical unifilar representations of the generating model, the forward-time and reverse-time epsilon-machines. We analyze example irreversible processes whose epsilon-machine presentations change size under time reversal, including one which has a finite number of recurrent causal states in one direction, but an infinite number in the opposite. From the forward-time and reverse-time epsilon-machines, we are able to construct a symmetrized, but nonunifilar, generator of a process---the bidirectional machine. Using the bidirectional machine, we show how to directly calculate a process's fundamental information properties, many of which are otherwise only poorly approximated via process samples. The tools we introduce and the insights we offer provide a better understanding of the many facets of reversibility and irreversibility in stochastic processes.
Submitted 11 July, 2011;
originally announced July 2011.
-
Anatomy of a Bit: Information in a Time Series Observation
Authors:
Ryan G. James,
Christopher J. Ellison,
James P. Crutchfield
Abstract:
Appealing to several multivariate information measures---some familiar, some new here---we analyze the information embedded in discrete-valued stochastic time series. We dissect the uncertainty of a single observation to demonstrate how the measures' asymptotic behavior sheds structural and semantic light on the generating process's internal information dynamics. The measures scale with the length of time window, which captures both intensive (rates of growth) and subextensive components. We provide interpretations for the components, developing explicit relationships between them. We also identify the informational component shared between the past and the future that is not contained in a single observation. The existence of this component directly motivates the notion of a process's effective (internal) states and indicates why one must build models.
Submitted 15 May, 2011;
originally announced May 2011.
-
The Past and the Future in the Present
Authors:
James P. Crutchfield,
Christopher J. Ellison
Abstract:
We show how the shared information between the past and future---the excess entropy---derives from the components of directional information stored in the present---the predictive and retrodictive causal states. A detailed proof allows us to highlight a number of the subtle problems in estimation and analysis that impede accurate calculation of the excess entropy.
Submitted 1 December, 2010;
originally announced December 2010.
-
Enumerating Finitary Processes
Authors:
B. D. Johnson,
J. P. Crutchfield,
C. J. Ellison,
C. S. McTague
Abstract:
We show how to efficiently enumerate a class of finite-memory stochastic processes using the causal representation of epsilon-machines. We characterize epsilon-machines in the language of automata theory and adapt a recent algorithm for generating accessible deterministic finite automata, pruning this over-large class down to that of epsilon-machines. As an application, we exactly enumerate topological epsilon-machines up to eight states and six-letter alphabets.
Submitted 15 December, 2012; v1 submitted 29 October, 2010;
originally announced November 2010.
-
Many Roads to Synchrony: Natural Time Scales and Their Algorithms
Authors:
Ryan G. James,
John R. Mahoney,
Christopher J. Ellison,
James P. Crutchfield
Abstract:
We consider two important time scales---the Markov and cryptic orders---that monitor how an observer synchronizes to a finitary stochastic process. We show how to compute these orders exactly and that they are most efficiently calculated from the epsilon-machine, a process's minimal unifilar model. Surprisingly, though the Markov order is a basic concept from stochastic process theory, it is not a probabilistic property of a process. Rather, it is a topological property and, moreover, it is not computable from any finite-state model other than the epsilon-machine. Via an exhaustive survey, we close by demonstrating that infinite Markov and infinite cryptic orders are a dominant feature in the space of finite-memory processes. We draw out the roles played in statistical mechanical spin systems by these two complementary length scales.
Submitted 20 December, 2013; v1 submitted 26 October, 2010;
originally announced October 2010.
-
Synchronization and Control in Intrinsic and Designed Computation: An Information-Theoretic Analysis of Competing Models of Stochastic Computation
Authors:
James P. Crutchfield,
Christopher J. Ellison,
Ryan G. James,
John R. Mahoney
Abstract:
We adapt tools from information theory to analyze how an observer comes to synchronize with the hidden states of a finitary, stationary stochastic process. We show that synchronization is determined by both the process's internal organization and by an observer's model of it. We analyze these components using the convergence of state-block and block-state entropies, comparing them to the previously known convergence properties of the Shannon block entropy. Along the way, we introduce a hierarchy of information quantifiers as derivatives and integrals of these entropies, which parallels a similar hierarchy introduced for block entropy. We also draw out the duality between synchronization properties and a process's controllability. The tools lead to a new classification of a process's alternative representations in terms of minimality, synchronizability, and unifilarity.
Submitted 29 July, 2010;
originally announced July 2010.
-
Prediction, Retrodiction, and The Amount of Information Stored in the Present
Authors:
Christopher J. Ellison,
John R. Mahoney,
James P. Crutchfield
Abstract:
We introduce an ambidextrous view of stochastic dynamical systems, comparing their forward-time and reverse-time representations and then integrating them into a single time-symmetric representation. The perspective is useful theoretically, computationally, and conceptually. Mathematically, we prove that the excess entropy--a familiar measure of organization in complex systems--is the mutual information not only between the past and future, but also between the predictive and retrodictive causal states. Practically, we exploit the connection between prediction and retrodiction to directly calculate the excess entropy. Conceptually, these lead one to discover new system invariants for stochastic dynamical systems: crypticity (information accessibility) and causal irreversibility. Ultimately, we introduce a time-symmetric representation that unifies all these quantities, compressing the two directional representations into one. The resulting compression offers a new conception of the amount of information stored in the present.
Submitted 21 May, 2009;
originally announced May 2009.
-
Optimal Causal Inference: Estimating Stored Information and Approximating Causal Architecture
Authors:
Susanne Still,
James P. Crutchfield,
Christopher J. Ellison
Abstract:
We introduce an approach to inferring the causal architecture of stochastic dynamical systems that extends rate distortion theory to use causal shielding---a natural principle of learning. We study two distinct cases of causal inference: optimal causal filtering and optimal causal estimation.
Filtering corresponds to the ideal case in which the probability distribution of measurement sequences is known, giving a principled method to approximate a system's causal structure at a desired level of representation. We show that, in the limit in which a model complexity constraint is relaxed, filtering finds the exact causal architecture of a stochastic dynamical system, known as the causal-state partition. From this, one can estimate the amount of historical information the process stores. More generally, causal filtering finds a graded model-complexity hierarchy of approximations to the causal architecture. Abrupt changes in the hierarchy, as a function of approximation, capture distinct scales of structural organization.
For nonideal cases with finite data, we show how the correct number of underlying causal states can be found by optimal causal estimation. A previously derived model complexity control term allows us to correct for the effect of statistical fluctuations in probability estimates and thereby avoid over-fitting.
Submitted 19 August, 2010; v1 submitted 11 August, 2007;
originally announced August 2007.