Search | arXiv e-print repository

Is it possible to know cosmological fine-tuning?

Authors: Daniel Andrés Díaz-Pachón, Ola Hössjer, Calvin Mathew

Abstract: Fine-tuning studies whether some physical parameters, or relevant ratios between them, are located within so-called life-permitting intervals of small probability outside of which carbon-based life would not be possible. Recent developments have found estimates of these probabilities that circumvent previous concerns of measurability and selection bias. However, the question remains if fine-tuning can indeed be known. Using a mathematization of the epistemological concepts of learning and knowledge acquisition, we argue that most examples that have been touted as fine-tuned cannot be formally assessed as such. Nevertheless, fine-tuning can be known when the physical parameter is seen as a random variable and it is supported in the nonnegative real line, provided the size of the life-permitting interval is small in relation to the observed value of the parameter.

Submitted 1 March, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

Comments: Accepted version. Minor changes: a sentence removed at the end of Section 5.2, and more comprehensive keywords

MSC Class: 85A40 (Primary) 68T30; 94A15 (Secondary)

arXiv:2304.10752 [pdf, other]

Algorithmic Information Forecastability

Authors: Glauco Amigo, Daniel Andrés Díaz-Pachón, Robert J. Marks, Charles Baylis

Abstract: The outcome of all time series cannot be forecast, e.g. the flipping of a fair coin. Others, like the repeated {01} sequence {010101...} can be forecast exactly. Algorithmic information theory can provide a measure of forecastability that lies between these extremes. The degree of forecastability is a function of only the data. For prediction (or classification) of labeled data, we propose three categories for forecastability: oracle forecastability for predictions that are always exact, precise forecastability for errors up to a bound, and probabilistic forecastability for any other predictions. Examples are given in each case.

Submitted 1 December, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

arXiv:2208.13828 [pdf, other]

doi 10.3390/e24101323

Assessing, testing and estimating the amount of fine-tuning by means of active information

Authors: Daniel Andrés Díaz-Pachón, Ola Hössjer

Abstract: A general framework is introduced to estimate how much external information has been infused into a search algorithm, the so-called active information. This is rephrased as a test of fine-tuning, where tuning corresponds to the amount of pre-specified knowledge that the algorithm makes use of in order to reach a certain target. A function $f$ quantifies specificity for each possible outcome $x$ of a search, so that the target of the algorithm is a set of highly specified states, whereas fine-tuning occurs if it is much more likely for the algorithm to reach the target than by chance. The distribution of a random outcome $X$ of the algorithm involves a parameter $θ$ that quantifies how much background information has been infused. A simple choice of this parameter is to use $θf$ in order to exponentially tilt the distribution of the outcome of the search algorithm under the null distribution of no tuning, so that an exponential family of distributions is obtained. Such algorithms are obtained by iterating a Metropolis-Hastings type of Markov chain, and this makes it possible to compute their active information under equilibrium and non-equilibrium of the Markov chain, with or without stopping when the targeted set of fine-tuned states has been reached. Other choices of tuning parameters $θ$ are discussed as well. Nonparametric and parametric estimators of active information and tests of fine-tuning are developed when repeated and independent outcomes of the algorithm are available. The theory is illustrated with examples from cosmology, student learning, reinforcement learning, a Moran type model of population genetics, and evolutionary programming.

Submitted 29 August, 2022; originally announced August 2022.

Comments: 28 pages, 3 figures

MSC Class: 94A15 94A17 60J10 69J70

Journal ref: Entropy 2022, 24(10), 1323
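
Illustration (not from the paper): the abstract above describes exponentially tilting a null distribution by $θf$ and measuring the resulting active information. A minimal sketch of that idea on a finite toy space follows. It assumes the usual definition of active information as the log-ratio of the probability of the target under the tuned distribution to its probability under the null, with natural logarithms. The ten-point space, the specificity function f(x) = x, and the target threshold are all hypothetical choices made for this example, and the paper's Metropolis-Hastings construction is not reproduced.

```python
import math

def tilt(p0, f, theta):
    """Exponentially tilt the null distribution p0 by theta * f, then renormalize."""
    weights = [p * math.exp(theta * fx) for p, fx in zip(p0, f)]
    total = sum(weights)
    return [w / total for w in weights]

def active_information(p0, p, target):
    """Assumed definition: log( P(target) / P0(target) ), natural log."""
    p0_target = sum(p0[i] for i in target)
    p_target = sum(p[i] for i in target)
    return math.log(p_target / p0_target)

# Hypothetical toy setting: 10 outcomes, uniform null, specificity f(x) = x.
n = 10
p0 = [1.0 / n] * n
f = list(range(n))
# Target: the "highly specified" states, here those with f(x) >= 8 (arbitrary threshold).
target = [i for i in range(n) if f[i] >= 8]

for theta in (0.0, 0.5, 1.0):
    p = tilt(p0, f, theta)
    print(theta, round(active_information(p0, p, target), 4))
```

With theta = 0 the tilted distribution equals the null, so the active information is zero; larger theta shifts mass toward the target and the active information grows.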

arXiv:2206.05120 [pdf, other]

Active information, missing data and prevalence estimation

Authors: Ola Hössjer, Daniel Andrés Díaz-Pachón, Chen Zhao, J. Sunil Rao

Abstract: The topic of this paper is prevalence estimation from the perspective of active information. Prevalence among tested individuals has an upward bias under the assumption that individuals' willingness to be tested for the disease increases with the strength of their symptoms. Active information due to testing bias quantifies the degree to which the willingness to be tested correlates with infection status. Interpreting incomplete testing as a missing data problem, the missingness mechanism impacts the degree to which the bias of the original prevalence estimate can be removed. The reduction in prevalence, when testing bias is adjusted for, translates into an active information due to bias correction, with opposite sign to active information due to testing bias. Prevalence and active information estimates are asymptotically normal, a behavior also illustrated through simulations.

Submitted 10 June, 2022; originally announced June 2022.

Comments: 18 pages, 5 tables, 2 figures

MSC Class: 62D10; 94A17; 62B10; 62F12; 62P10; 92B15; 94A17; 94A16; 94A20

arXiv:2204.11780 [pdf, ps, other]

Sometimes size does not matter

Authors: Daniel Andrés Díaz-Pachón, Ola Hössjer, Robert J. Marks II

Abstract: Cosmological fine-tuning has traditionally been associated with the narrowness of the intervals in which the parameters of the physical models must be located to make life possible. A more thorough approach focuses on the probability of the interval, not on its size. Most attempts to measure the probability of the life-permitting interval for a given parameter rely on a Bayesian statistical approach for which the prior distribution of the parameter is uniform. However, the parameters in these models often take values in spaces of infinite size, so that a uniformity assumption is not possible. This is known as the normalization problem. This paper explains a framework to measure tuning that, among others, deals with normalization, assuming that the prior distribution belongs to a class of maximum entropy (maxent) distributions. By analyzing an upper bound of the tuning probability for this class of distributions the method solves the so-called weak anthropic principle, and offers a solution, at least in this context, to the well-known lack of invariance of maxent distributions. The implication of this approach is that, since all mathematical models need parameters, tuning is not only a question of natural science, but also a problem of mathematical modeling. Cosmological tuning is thus a particular instantiation of a more general scenario. Therefore, whenever a mathematical model is used to describe nature, not only in physics but in all of science, tuning is present. And the question of whether the tuning is fine or coarse for a given parameter -- if the interval in which the parameter is located has low or high probability, respectively -- depends crucially not only on the interval but also on the assumed class of prior distributions. Novel upper bounds for tuning probabilities are presented.

Submitted 25 April, 2022; originally announced April 2022.

Comments: 31 pages, 1 table

MSC Class: 85A40; 62A01; 62B10; 62C10; 94A15

arXiv:2202.08928 [pdf, other]

"Back to the future" projections for COVID-19 surges

Authors: J. Sunil Rao, Tianhao Liu, Daniel Andrés Díaz-Pachón

Abstract: We argue that information from countries who had earlier COVID-19 surges can be used to inform another country's current model, then generating what we call back-to-the-future (BTF) projections. We show that these projections can be used to accurately predict future COVID-19 surges prior to an inflection point of the daily infection curve. We show, across 12 different countries from all populated continents around the world, that our method can often predict future surges in scenarios where the traditional approaches would always predict no future surges. However, as expected, BTF projections cannot accurately predict a surge due to the emergence of a new variant. To generate BTF projections, we make use of a matching scheme for asynchronous time series combined with a response coaching SIR model.

Submitted 17 February, 2022; originally announced February 2022.

Comments: 21 pages, 7 figures

MSC Class: 92D25 (Primary) 92C60 92B15 62P10 62M10 (Secondary)

arXiv:2111.06909 [pdf, ps, other]

doi 10.5048/BIO-C.2020.4

Active information requirements for fixation on the Wright-Fisher model of population genetics

Authors: Daniel Andrés Díaz-Pachón, Robert J. Marks II

Abstract: In the context of population genetics, active information can be extended to measure the change of information of a given event (e.g., fixation of an allele) from a neutral model in which only genetic drift is taken into account to a non-neutral model that includes other sources of frequency variation (e.g., selection and mutation). In this paper we illustrate active information in population genetics through the Wright-Fisher model.

Submitted 12 November, 2021; originally announced November 2021.

Comments: 15 pages

MSC Class: 92D25; 60J90

Journal ref: BIO-Complexity 2020(4):1-6

arXiv:2111.06865 [pdf, other]

doi 10.5048/BIO-C.2020.3

Generalized active information: extensions to unbounded domains

Authors: Daniel Andrés Díaz-Pachón, Robert J. Marks II

Abstract: In the last three decades, several measures of complexity have been proposed. Up to this point, most of such measures have only been developed for finite spaces. In these scenarios the baseline distribution is uniform. This makes sense because, among other things, the uniform distribution is the measure of maximum entropy over the relevant space. Active information traditionally assumes a finite interval universe of discourse but can be extended to other cases where maximum entropy is defined. Illustrating this is the purpose of this paper. Disequilibrium from maximum entropy, measured as active information, can be evaluated from baselines with unbounded support.

Submitted 12 November, 2021; originally announced November 2021.

Comments: 15 pages, 1 figure

MSC Class: 94A17

Journal ref: BIO-Complexity 2020(3):1-6

arXiv:2104.05400 [pdf, other]

doi 10.1088/1475-7516/2021/07/020

Is Cosmological Tuning Fine or Coarse?

Authors: Daniel Andrés Díaz-Pachón, Ola Hössjer, Robert J. Marks II

Abstract: The fine-tuning of the universe for life, the idea that the constants of nature (or ratios between them) must belong to very small intervals in order for life to exist, has been debated by scientists for several decades. Several criticisms have emerged concerning probabilistic measurement of life-permitting intervals. Herein, a Bayesian statistical approach is used to assign an upper bound for the probability of tuning, which is invariant with respect to change of physical units, and under certain assumptions it is small whenever the life-permitting interval is small on a relative scale. The computation of the upper bound of the tuning probability is achieved by first assuming that the prior is chosen by the principle of maximum entropy (MaxEnt). The unknown parameters of this MaxEnt distribution are then handled in such a way that the weak anthropic principle is not violated. The MaxEnt assumption is "maximally noncommittal with regard to missing information." This approach is sufficiently general to be applied to constants of current cosmological models, or to other constants possibly under different models. Application of the MaxEnt model reveals, for example, that the ratio of the universal gravitational constant to the square of the Hubble constant is finely tuned in some cases, whereas the amplitude of primordial fluctuations is not.

Submitted 13 July, 2021; v1 submitted 5 March, 2021; originally announced April 2021.

Comments: 19 pages, 1 figure. Substantial reorganization and expansion to make it more clear

MSC Class: 85A40 (Primary) 94A17 92F05 60E05 85-10 (Secondary)

Journal ref: JCAP07(2021)020

arXiv:2101.04288 [pdf, other]

High Dimensional Mode Hunting Using Pettiest Components Analysis

Authors: Tianhao Liu, Daniel Andrés Díaz-Pachón, J. Sunil Rao, Jean-Eudes Dazard

Abstract: Principal components analysis has been used to reduce the dimensionality of datasets for a long time. In this paper, we will demonstrate that in mode detection the components of smallest variance, the pettiest components, are more important. We prove that for a multivariate normal or Laplace distribution, we obtain boxes of optimal volume by implementing "pettiest component analysis", in the sense that their volume is minimal over all possible boxes with the same number of dimensions and fixed probability. This reduction in volume produces an information gain that is measured using active information. We illustrate our results with a simulation and a search for modal patterns of digitized images of hand-written numbers using the famous MNIST database; in both cases pettiest components work better than their competitors. In fact, we show that modes obtained with pettiest components generate better written digits for MNIST than principal components.

Submitted 29 July, 2022; v1 submitted 11 January, 2021; originally announced January 2021.

Comments: 13 pages, 4 tables, 12 figures

MSC Class: 62H25 15A18 68T09 68T10 94A08 94A15

arXiv:2011.05794 [pdf, ps, other]

doi 10.1002/asmb.2430

Mode hunting through active information

Authors: Daniel Andrés Díaz-Pachón, Juan Pablo Sáenz, J. Sunil Rao, Jean-Eudes Dazard

Abstract: We propose a new method to find modes based on active information. We develop an algorithm that, when applied to the whole space, will say whether there are any modes present \textit{and} where they are; this algorithm will reduce the dimensionality without resorting to Principal Components; and more importantly, population-wise, will not detect modes when they are not present.

Submitted 9 November, 2020; originally announced November 2020.

Comments: 12 pages

MSC Class: 62R07

Journal ref: Applied Stochastic Models in Business and Industry 35(2), pp. 376-393, 2019

arXiv:2011.04834 [pdf, ps, other]

doi 10.1016/j.spl.2020.108742

Hypothesis testing with active information

Authors: Daniel Andrés Díaz-Pachón, Juan Pablo Sáenz, J. Sunil Rao

Abstract: We develop hypothesis testing for active information, the averaged quantity in the Kullback-Leibler divergence. To our knowledge, this is the first paper to derive exact probabilities of type-I errors for hypothesis testing in the area.

Submitted 12 November, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

Comments: Typo changed in one of the names in the Metadata, and a reference to an equation from the paper in the Supplement

MSC Class: 62B10

Journal ref: Statistics and Probability Letters 161, June 2020, 108742
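
Illustration (not from the paper): one reading of the phrase "the averaged quantity in the Kullback-Leibler divergence" is that the pointwise log-ratio between an alternative and a baseline distribution, averaged under the alternative, gives the Kullback-Leibler divergence. The sketch below checks that identity numerically. The two three-point distributions are hypothetical, natural logarithms are assumed, and nothing here reproduces the paper's tests or its type-I error calculations.

```python
import math

def pointwise_log_ratio(baseline, alternative):
    """log( q(x) / p(x) ) for each outcome x; both distributions must be strictly positive."""
    return [math.log(q / p) for p, q in zip(baseline, alternative)]

def kl_divergence(alternative, baseline):
    """KL(Q || P), computed independently as cross-entropy minus entropy."""
    cross_entropy = -sum(q * math.log(p) for p, q in zip(baseline, alternative))
    entropy = -sum(q * math.log(q) for q in alternative)
    return cross_entropy - entropy

# Hypothetical distributions on three outcomes.
baseline = [1 / 3, 1 / 3, 1 / 3]   # P: uniform baseline
alternative = [0.6, 0.3, 0.1]      # Q: alternative distribution

ratios = pointwise_log_ratio(baseline, alternative)
averaged = sum(q * r for q, r in zip(alternative, ratios))

print(averaged, kl_divergence(alternative, baseline))
```

The two printed numbers agree up to floating-point error.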

arXiv:2007.07426 [pdf, ps, other]

doi 10.1016/j.jtbi.2020.110556

A simple correction for COVID-19 sampling bias

Authors: Daniel Andrés Díaz-Pachón, J Sunil Rao

Abstract: COVID-19 testing has become a standard approach for estimating prevalence which then assists in public health decision making to contain and mitigate the spread of the disease. The sampling designs used are often biased in that they do not reflect the true underlying populations. For instance, individuals with strong symptoms are more likely to be tested than those with no symptoms. This results in biased estimates of prevalence (too high). Typical post-sampling corrections are not always possible. Here we present a simple bias correction methodology derived and adapted from a correction for publication bias in meta-analysis studies. The methodology is general enough to allow a wide variety of customization making it more useful in practice. Implementation is easily done using already collected information. Via a simulation and two real datasets, we show that the bias corrections can provide dramatic reductions in estimation error.

Submitted 11 January, 2021; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 14 pages. Title changed. The whole Section 7 with information from Lombardy, Italy, was added (another real dataset). Some typos were corrected. In spite of several lengthy additions, no substantial changes were done to the paper. The goal of the additions was more to clarify than to correct

MSC Class: 62D99

Journal ref: Journal of Theoretical Biology, Volume 512, 7 March 2021, 110556

arXiv:1507.07466 [pdf, ps, other]

$F$ tests for the strip-split plot design

Authors: Daniel Andrés Díaz-Pachón, Francisco J. P. Zimmermann, Luis Alberto López

Abstract: In this article we present the structure of the $F$ tests, the variance components and the approximate degrees of freedom for each of the eight possible mixed models of the strip-split plot design. We present an example to illustrate the model and compare it to more traditional settings like a three-way factorial design and a split-split plot model.

Submitted 27 July, 2015; originally announced July 2015.

Comments: 22 pages

MSC Class: 62K99

Journal ref: REVISTA BRASILEIRA DE BIOMETRIA, [S.I.], v. 34, n. 2, p. 279-303, June 2016

arXiv:1409.8630 [pdf, other]

Unsupervised Bump Hunting Using Principal Components

Authors: Daniel A Díaz-Pachón, Jean-Eudes Dazard, J. Sunil Rao

Abstract: Principal Components Analysis is a widely used technique for dimension reduction and characterization of variability in multivariate populations. Our interest lies in studying when and why the rotation to principal components can be used effectively within a response-predictor set relationship in the context of mode hunting. Specifically focusing on the Patient Rule Induction Method (PRIM), we first develop a fast version of this algorithm (fastPRIM) under normality which facilitates the theoretical studies to follow. Using basic geometrical arguments, we then demonstrate how the PC rotation of the predictor space alone can in fact generate improved mode estimators. Simulation results are used to illustrate our findings.

Submitted 30 September, 2014; originally announced September 2014.

Comments: 24 pages, 9 figures

MSC Class: 65C60

arXiv:1404.4917 [pdf, other]

On the explanatory power of principal components

Authors: Daniel A. Diaz-Pachon, J. Sunil Rao, Jean-Eudes Dazard

Abstract: We show that if we have an orthogonal base ($u_1,\ldots,u_p$) in a $p$-dimensional vector space, and select $p+1$ vectors $v_1,\ldots, v_p$ and $w$ such that the vectors traverse the origin, then the probability of $w$ being closer to all the vectors in the base than to $v_1,\ldots, v_p$ is at least 1/2 and converges as $p$ increases to infinity to a normal distribution on the interval [-1,1]; i.e., $Φ(1)-Φ(-1)\approx0.6826$. This result has relevant consequences for Principal Components Analysis in the context of regression and other learning settings, if we take the orthogonal base as the direction of the principal components.

Submitted 19 April, 2014; originally announced April 2014.

Comments: 10 pages, 3 figures

MSC Class: 60D05; 62H25
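
Illustration (not from the paper): the limiting value $Φ(1)-Φ(-1)$ quoted in the abstract above is the mass a standard normal distribution places on [-1,1]. A minimal check using only the Python standard library is sketched below. The helper name `std_normal_cdf` is introduced here for the example, and the identity between the standard normal CDF and the error function is the only fact relied on.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF via the error function: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Mass of a standard normal on the interval [-1, 1].
mass = std_normal_cdf(1.0) - std_normal_cdf(-1.0)
print(mass)
```

The result agrees with the 0.6826 quoted in the abstract when truncated to four decimal places.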

arXiv:0909.5325 [pdf, ps, other]

doi 10.1080/17442508.2011.651215

Percolation for the stable marriage of Poisson and Lebesgue with random appetites

Authors: Daniel Andrés Díaz-Pachón

Abstract: Let $Ξ$ be a set of centers chosen according to a Poisson point process in $\mathbb R^d$. Consider the allocation of $\mathbb R^d$ to $Ξ$ which is stable in the sense of the Gale-Shapley marriage problem, with the additional feature that every center $ξ\inΞ$ has a random appetite $αV$, where $α$ is a nonnegative scale constant and $V$ is a nonnegative random variable. Generalizing previous results by Freire, Popov and Vachkovskaia, we show the absence of percolation when $α$ is small enough, depending on certain characteristics of the moment of $V$.

Submitted 12 November, 2021; v1 submitted 29 September, 2009; originally announced September 2009.

Comments: 13 pages. The arXiv version appeared twice, it is fixed now

MSC Class: 60D05

Journal ref: Stochastics, Vol. 85, Issue 2, pp. 252-261 (2013)

Showing 1–17 of 17 results for author: Díaz-Pachón, D A