Search | arXiv e-print repository

doi 10.48550/arXiv.2206.12179

How is model-related uncertainty quantified and reported in different disciplines?

Authors: Emily G. Simmonds, Kwaku Peprah Adjei, Christoffer Wold Andersen, Janne Cathrin Hetle Aspheim, Claudia Battistin, Nicola Bulso, Hannah Christensen, Benjamin Cretois, Ryan Cubero, Ivan A. Davidovich, Lisa Dickel, Benjamin Dunn, Etienne Dunn-Sigouin, Karin Dyrstad, Sigurd Einum, Donata Giglio, Haakon Gjerlow, Amelie Godefroidt, Ricardo Gonzalez-Gil, Soledad Gonzalo Cogno, Fabian Grosse, Paul Halloran, Mari F. Jensen, John James Kennedy, Peter Egge Langsaether, et al. (18 additional authors not shown)

Abstract: How do we know how much we know? Quantifying uncertainty associated with our modelling work is the only way we can answer how much we know about any phenomenon. With quantitative science now highly influential in the public sphere and the results from models translating into action, we must support our conclusions with sufficient rigour to produce useful, reproducible results. Incomplete consideration of model-based uncertainties can lead to false conclusions with real world impacts. Despite these potentially damaging consequences, uncertainty consideration is incomplete both within and across scientific fields. We take a unique interdisciplinary approach and conduct a systematic audit of model-related uncertainty quantification from seven scientific fields, spanning the biological, physical, and social sciences. Our results show no single field is achieving complete consideration of model uncertainties, but together we can fill the gaps. We propose opportunities to improve the quantification of uncertainty through use of a source framework for uncertainty consideration, model type specific guidelines, improved presentation, and shared best practice. We also identify shared outstanding challenges (uncertainty in input data, balancing trade-offs, error propagation, and defining how much uncertainty is required). Finally, we make nine concrete recommendations for current practice (following good practice guidelines and an uncertainty checklist, presenting uncertainty numerically, and propagating model-related uncertainty into conclusions), future research priorities (uncertainty in input data, quantifying uncertainty in complex models, and the importance of missing uncertainty in different contexts), and general research standards across the sciences (transparency about study limitations and dedicated uncertainty sections of manuscripts).

Submitted 1 July, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

Comments: 40 Pages (including supporting information), 3 Figures, 2 Boxes, 1 Table

arXiv:1906.05141 [pdf]

doi 10.1021/acs.nanolett.9b02521

Energy transfer and interference by collective electromagnetic coupling

Authors: Mayte Gómez-Castaño, Andrés Redondo Cubero, Lionel Buisson, Jose Luis Pau, Agustin Mihi, Serge Ravaine, Renaud A. L. Vallée, Abraham Nitzan, Maxim Sukharev

Abstract: The physics of collective optical response of molecular assemblies, pioneered by Dicke in 1954, has long been at the center of theoretical and experimental scrutiny. The influence of the environment on such phenomena is also of great interest due to various important applications in e.g. energy conversion devices. In this manuscript we demonstrate both experimentally and theoretically the spatial modulations of the collective decay rates of molecules placed in proximity to a metal interface. We show in a very simple framework how the cooperative optical response can be analyzed in terms of intermolecular correlations causing interference between the response of different molecules and the polarization induced on a nearby metallic boundary and predict similar collective interference phenomena in excitation energy transfer between molecular aggregates.

Submitted 23 June, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

arXiv:1809.00652 [pdf, other]

doi 10.3390/e20100755

Minimum Description Length codes are critical

Authors: Ryan John Cubero, Matteo Marsili, Yasser Roudi

Abstract: In the Minimum Description Length (MDL) principle, learning from the data is equivalent to an optimal coding problem. We show that the codes that achieve optimal compression in MDL are critical in a very precise sense. First, when they are taken as generative models of samples, they generate samples with broad empirical distributions and with a high value of the relevance, defined as the entropy of the empirical frequencies. These results are derived for different statistical models (Dirichlet model, independent and pairwise dependent spin models, and restricted Boltzmann machines). Second, MDL codes sit precisely at a second order phase transition point where the symmetry between the sampled outcomes is spontaneously broken. The order parameter controlling the phase transition is the coding cost of the samples. The phase transition is a manifestation of the optimality of MDL codes, and it arises because codes that achieve a higher compression do not exist. These results suggest a clear interpretation of the widespread occurrence of statistical criticality as a characterization of samples which are maximally informative on the underlying generative process.

Submitted 2 October, 2018; v1 submitted 3 September, 2018; originally announced September 2018.

Comments: 23 pages, 5 figures; Corrected the author name, revised Section 2.2 (Large Deviations of the Universal Codes Exhibit Phase Transitions), corrected Eq. (89)

Journal ref: Entropy 2018, 20(10)

arXiv:1808.00249 [pdf, other]

doi 10.1088/1742-5468/ab16c8

Statistical Criticality arises in Most Informative Representations

Authors: Ryan John Cubero, Junghyo Jo, Matteo Marsili, Yasser Roudi, Juyong Song

Abstract: We show that statistical criticality, i.e. the occurrence of power law frequency distributions, arises in samples that are maximally informative about the underlying generating process. In order to reach this conclusion, we first identify the frequency with which different outcomes occur in a sample, as the variable carrying useful information on the generative process. The entropy of the frequency, that we call relevance, provides an upper bound to the number of informative bits. This differs from the entropy of the data, that we take as a measure of resolution. Samples that maximise relevance at a given resolution - that we call maximally informative samples - exhibit statistical criticality. In particular, Zipf's law arises at the optimal trade-off between resolution (i.e. compression) and relevance. As a byproduct, we derive a bound of the maximal number of parameters that can be estimated from a dataset, in the absence of prior knowledge on the generative model. Furthermore, we relate criticality to the statistical properties of the representation of the data generating process. We show that, as a consequence of the concentration property of the Asymptotic Equipartition Property, representations that are maximally informative about the data generating process are characterised by an exponential distribution of energy levels. This arises from a principle of minimal entropy, that is conjugate of the maximum entropy principle in statistical mechanics. This explains why statistical criticality requires no parameter fine tuning in maximally informative samples.

Submitted 8 July, 2019; v1 submitted 1 August, 2018; originally announced August 2018.

Comments: 21 pages, 3 figures; Corrected Appendix A; Updated journal reference

Journal ref: J. Stat. Mech. (2019) 063402

arXiv:1802.10354 [pdf, other]

Multiscale relevance and informative encoding in neuronal spike trains

Authors: Ryan John Cubero, Matteo Marsili, Yasser Roudi

Abstract: Neuronal responses to complex stimuli and tasks can encompass a wide range of time scales. Understanding these responses requires measures that characterize how the information on these response patterns are represented across multiple temporal resolutions. In this paper we propose a metric -- which we call multiscale relevance (MSR) -- to capture the dynamical variability of the activity of single neurons across different time scales. The MSR is a non-parametric, fully featureless indicator in that it uses only the time stamps of the firing activity without resorting to any a priori covariate or invoking any specific structure in the tuning curve for neural activity. When applied to neural data from the mEC and from the ADn and PoS regions of freely-behaving rodents, we found that neurons having low MSR tend to have low mutual information and low firing sparsity across the correlates that are believed to be encoded by the region of the brain where the recordings were made. In addition, neurons with high MSR contain significant information on spatial navigation and allow to decode spatial position or head direction as efficiently as those neurons whose firing activity has high mutual information with the covariate to be decoded and significantly better than the set of neurons with high local variations in their interspike intervals. Given these results, we propose that the MSR can be used as a measure to rank and select neurons for their information content without the need to appeal to any a priori covariate.

Submitted 20 December, 2019; v1 submitted 28 February, 2018; originally announced February 2018.

Comments: 38 pages, 16 figures

Showing 1–5 of 5 results for author: Cubero, R