-
Genealogical processes of non-neutral population models under rapid mutation
Authors:
Jere Koskela,
Paul A. Jenkins,
Adam M. Johansen,
Dario Spano
Abstract:
We show that genealogical trees arising from a broad class of non-neutral models of population evolution converge to the Kingman coalescent under a suitable rescaling of time. As well as non-neutral biological evolution, our results apply to genetic algorithms encompassing the prominent class of sequential Monte Carlo (SMC) methods. The time rescaling we need differs slightly from that used in classical results for convergence to the Kingman coalescent, which has implications for the performance of different resampling schemes in SMC algorithms. In addition, our work substantially simplifies earlier proofs of convergence to the Kingman coalescent, and corrects an error common to several earlier results.
Submitted 24 June, 2024;
originally announced June 2024.
-
Excursion theory for the Wright-Fisher diffusion
Authors:
Paul A. Jenkins,
Jere Koskela,
Jaromir Sant,
Dario Spano,
Ivana Valentic
Abstract:
In this work, we develop excursion theory for the Wright-Fisher diffusion with recurrent mutation. Our construction is intermediate between the classical excursion theory where all excursions begin and end at a single point and the more general approach considering excursions of processes from general sets. Since the Wright-Fisher diffusion has two boundary points, it is natural to construct excursions which start from a specified boundary point, and end at one of two boundary points which determine the next starting point. In order to do this we study the killed Wright-Fisher diffusion, which is sent to a cemetery state whenever it hits either endpoint. We then construct a marked Poisson process of such killed paths which, when concatenated, produce a pathwise construction of the Wright-Fisher diffusion.
Submitted 28 September, 2023;
originally announced September 2023.
-
Bernoulli factories and duality in Wright-Fisher and Allen-Cahn models of population genetics
Authors:
Jere Koskela,
Krzysztof Łatuszyński,
Dario Spanò
Abstract:
Mathematical models of genetic evolution often come in pairs, connected by a so-called duality relation. The most seminal example are the Wright-Fisher diffusion and the Kingman coalescent, where the former describes the stochastic evolution of neutral allele frequencies in a large population forwards in time, and the latter describes the genetic ancestry of randomly sampled individuals from the population backwards in time. As well as providing a richer description than either model in isolation, duality often yields equations satisfied by quantities of interest. We employ the so-called Bernoulli factory - a celebrated tool in simulation-based computing - to derive duality relations for broad classes of genetics models. As concrete examples, we present Wright-Fisher diffusions with general drift functions, and Allen-Cahn equations with general, nonlinear forcing terms. The drift and forcing functions can be interpreted as the action of frequency-dependent selection. To our knowledge, this work is the first time a connection has been drawn between Bernoulli factories and duality in models of population genetics.
Submitted 1 February, 2024; v1 submitted 6 June, 2023;
originally announced June 2023.
-
EWF : simulating exact paths of the Wright--Fisher diffusion
Authors:
Jaromir Sant,
Paul A. Jenkins,
Jere Koskela,
Dario Spanò
Abstract:
The Wright--Fisher diffusion is important in population genetics in modelling the evolution of allele frequencies over time subject to the influence of biological phenomena such as selection, mutation, and genetic drift. Simulating paths of the process is challenging due to the form of the transition density. We present EWF, a robust and efficient sampler which returns exact draws for the diffusion and diffusion bridge processes, accounting for general models of selection including those with frequency-dependence. Given a configuration of selection, mutation, and endpoints, EWF returns draws at the requested sampling times from the law of the corresponding Wright--Fisher process. Output was validated by comparison to approximations of the transition density via the Kolmogorov--Smirnov test and QQ plots. All software is available at https://github.com/JaroSant/EWF
Submitted 13 January, 2023;
originally announced January 2023.
-
Characterizing Qubit Traffic of a Quantum Intranet aiming at Modular Quantum Computers
Authors:
Santiago Rodrigo,
Domenico Spanò,
Medina Bandic,
Sergi Abadal,
Hans van Someren,
Anabel Ovide,
Sebastian Feld,
Carmen G. Almudever,
Eduard Alarcón
Abstract:
Quantum many-core processors are envisioned as the ultimate solution for the scalability of quantum computers. Based upon Noisy Intermediate-Scale Quantum (NISQ) chips interconnected in a sort of quantum intranet, they enable large algorithms to be executed on current and close future technology. In order to optimize such architectures, it is crucial to develop tools that allow specific design space explorations. To this aim, in this paper we present a technique to perform a spatio-temporal characterization of quantum circuits running in multi-chip quantum computers. Specifically, we focus on the analysis of the qubit traffic resulting from operations that involve qubits residing in different cores, and hence quantum communication across chips, while also giving importance to the amount of intra-core operations that occur in between those communications. Using specific multi-core performance metrics and a complete set of benchmarks, our analysis showcases the opportunities that the proposed approach may provide to guide the design of multi-core quantum computers and their interconnects.
Submitted 31 August, 2022;
originally announced September 2022.
-
Dual process in the two-parameter Poisson-Dirichlet diffusion
Authors:
Robert C. Griffiths,
Matteo Ruggiero,
Dario Spano,
Youzhou Zhou
Abstract:
The two-parameter Poisson-Dirichlet diffusion takes values in the infinite ordered simplex and extends the celebrated infinitely-many-neutral-alleles model, with a two-parameter Poisson-Dirichlet stationary distribution. Here we identify a dual process for this diffusion and obtain its transition probabilities. The dual is shown to be given by Kingman's coalescent with mutation, conditional on a given configuration of leaves. Interestingly, the dual does not depend on the additional parameter of the stationary distribution. After discussing the sampling probabilities of a two-parameter Poisson-Dirichlet partition drawn conditionally on another partition, we use these together with the dual process to derive the transition density of the diffusion. Our derivation provides a new probabilistic proof of this result, leveraging on an extension of Pitman's Pólya urn scheme whereby the urn is split after a finite sequence and two urns are run independently onwards. The proof strategy exemplifies the power of duality and could be exported to other models where a dual is available.
Submitted 10 January, 2024; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Diffusion Limits at Small Times for Coalescent Processes with Mutation and Selection
Authors:
Philip A. Hanson,
Paul A. Jenkins,
Jere Koskela,
Dario Spanò
Abstract:
The Ancestral Selection Graph (ASG) is an important genealogical process which extends the well-known Kingman coalescent to incorporate natural selection. We show that the number of lineages of the ASG with and without mutation is asymptotic to $2/t$ as $t\to 0$, in agreement with the limiting behaviour of the Kingman coalescent. We couple these processes on the same probability space using a Poisson random measure construction that allows us to precisely compare their hitting times. These comparisons enable us to characterise the speed of coming down from infinity of the ASG as well as its fluctuations in a functional central limit theorem. This extends similar results for the Kingman coalescent.
Submitted 22 December, 2020; v1 submitted 18 December, 2020;
originally announced December 2020.
-
Satellite Communications in the New Space Era: A Survey and Future Challenges
Authors:
O. Kodheli,
E. Lagunas,
N. Maturo,
S. K. Sharma,
B. Shankar,
J. F. Mendoza Montoya,
J. C. Merlano Duncan,
D. Spano,
S. Chatzinotas,
S. Kisseleff,
J. Querol,
L. Lei,
T. X. Vu,
G. Goussetis
Abstract:
Satellite communications have recently entered a period of renewed interest motivated by technological advances and nurtured through private investment and ventures. The present survey aims at capturing the state of the art in SatComs, while highlighting the most promising open research topics. Firstly, the main innovation drivers are motivated, such as new constellation types, on-board processing capabilities, nonterrestrial networks and space-based data collection/processing. Secondly, the most promising applications are described i.e. 5G integration, space communications, Earth observation, aeronautical and maritime tracking and communication. Subsequently, an in-depth literature review is provided across five axes: i) system aspects, ii) air interface, iii) medium access, iv) networking, v) testbeds & prototyping. Finally, a number of future challenges and the respective open research topics are described.
Submitted 2 March, 2020; v1 submitted 20 February, 2020;
originally announced February 2020.
-
Convergence of Likelihood Ratios and Estimators for Selection in non-neutral Wright-Fisher Diffusions
Authors:
Jaromir Sant,
Paul A. Jenkins,
Jere Koskela,
Dario Spano
Abstract:
A number of discrete time, finite population size models in genetics describing the dynamics of allele frequencies are known to converge (subject to suitable scaling) to a diffusion process in the infinite population limit, termed the Wright-Fisher diffusion. In this article we show that the diffusion is ergodic uniformly in the selection and mutation parameters, and that the measures induced by the solution to the stochastic differential equation are uniformly locally asymptotically normal. Subsequently these two results are used to analyse the statistical properties of the Maximum Likelihood and Bayesian estimators for the selection parameter, when both selection and mutation are acting on the population. In particular, it is shown that these estimators are uniformly over compact sets consistent, display uniform in the selection parameter asymptotic normality and convergence of moments over compact sets, and are asymptotically efficient for a suitable class of loss functions.
Submitted 13 September, 2021; v1 submitted 10 January, 2020;
originally announced January 2020.
-
Interference Exploitation via Symbol-Level Precoding: Overview, State-of-the-Art and Future Directions
Authors:
Ang Li,
Danilo Spano,
Jevgenij Krivochiza,
Stavros Domouchtsidis,
Christos G. Tsinos,
Christos Masouros,
Symeon Chatzinotas,
Yonghui Li,
Branka Vucetic,
Björn Ottersten
Abstract:
Interference is traditionally viewed as a performance limiting factor in wireless communication systems, which is to be minimized or mitigated. Nevertheless, a recent line of work has shown that by manipulating the interfering signals such that they add up constructively at the receiver side, known interference can be made beneficial and further improve the system performance in a variety of wireless scenarios, achieved by symbol-level precoding (SLP). This paper aims to provide a tutorial on interference exploitation techniques from the perspective of precoding design in a multi-antenna wireless communication system, by beginning with the classification of constructive interference (CI) and destructive interference (DI). The definition for CI is presented and the corresponding mathematical characterization is formulated for popular modulation types, based on which optimization-based precoding techniques are discussed. In addition, the extension of CI precoding to other application scenarios as well as for hardware efficiency is also described. Proof-of-concept testbeds are demonstrated for the potential practical implementation of CI precoding, and finally a list of open problems and practical challenges are presented to inspire and motivate further research directions in this area.
Submitted 11 July, 2019;
originally announced July 2019.
-
Precoded Cluster Hopping in Multi-Beam High Throughput Satellite Systems
Authors:
Mirza Golam Kibria,
Eva Lagunas,
Nicola Maturo,
Danilo Spano,
Symeon Chatzinotas
Abstract:
Beam-Hopping (BH) and precoding are two trending technologies for the satellite community. While BH enables flexibility to adapt the offered capacity to the heterogeneous demand, precoding aims at boosting the spectral efficiency. In this paper, we consider a high throughput satellite (HTS) system that employs BH in conjunction with precoding. In particular, we propose the concept of Cluster-Hopping (CH) that seamlessly combines the BH and precoding paradigms and utilize their individual competencies. The cluster is defined as a set of adjacent beams that are simultaneously illuminated. In addition, we propose an efficient time-space illumination pattern design, where we determine the set of clusters that can be illuminated simultaneously at each hopping event along with the illumination duration. We model the CH time-space illumination pattern design as an integer programming problem which can be efficiently solved. Supporting results based on numerical simulations are provided which validate the effectiveness of the proposed CH concept and time-space illumination pattern design.
Submitted 3 May, 2019;
originally announced May 2019.
-
Carrier Aggregation in Multi-Beam High Throughput Satellite Systems
Authors:
Mirza Golam Kibria,
Eva Lagunas,
Nicola Maturo,
Danilo Spano,
Hayder Al-Hraishawi,
Symeon Chatzinotas
Abstract:
Carrier Aggregation (CA) is an integral part of current terrestrial networks. Its ability to enhance the peak data rate, to efficiently utilize the limited available spectrum resources and to satisfy the demand for data-hungry applications has drawn large attention from different wireless network communities. Given the benefits of CA in the terrestrial wireless environment, it is of great interest to analyze and evaluate the potential impact of CA in the satellite domain. In this paper, we study CA in multibeam high throughput satellite systems. We consider both inter-transponder and intra-transponder CA at the satellite payload level of the communication stack, and we address the problem of carrier-user assignment assuming that multiple users can be multiplexed in each carrier. The transmission parameters of different carriers are generated considering the transmission characteristics of carriers in different transponders. In particular, we propose a flexible carrier allocation approach for a CA-enabled multibeam satellite system targeting a proportionally fair user demand satisfaction. Simulation results and analysis shed some light on this rather unexplored scenario and demonstrate the feasibility of the CA in satellite communication systems.
Submitted 3 May, 2019;
originally announced May 2019.
-
The effective strength of selection in random environment
Authors:
Adrián González Casanova,
Dario Spanò,
Maite Wilke-Berenguer
Abstract:
We analyse a family of two-types Wright-Fisher models with selection in a random environment and skewed offspring distribution. We provide a calculable criterion to quantify the impact of different shapes of selection on the fate of the weakest allele, and thus compare them. The main mathematical tool is duality, which we prove to hold, also in presence of random environment (quenched and in some cases annealed), between the population's allele frequencies and genealogy, both in the case of finite population size and in the scaling limit for large size. Duality also yields new insight on properties of branching-coalescing processes in random environment, such as their long term behaviour.
Submitted 8 February, 2023; v1 submitted 28 March, 2019;
originally announced March 2019.
-
Asymptotic genealogies of interacting particle systems with an application to sequential Monte Carlo
Authors:
Jere Koskela,
Paul A. Jenkins,
Adam M. Johansen,
Dario Spano
Abstract:
We study weighted particle systems in which new generations are resampled from current particles with probabilities proportional to their weights. This covers a broad class of sequential Monte Carlo (SMC) methods, widely-used in applied statistics and cognate disciplines. We consider the genealogical tree embedded into such particle systems, and identify conditions, as well as an appropriate time-scaling, under which they converge to the Kingman n-coalescent in the infinite system size limit in the sense of finite-dimensional distributions. Thus, the tractable n-coalescent can be used to predict the shape and size of SMC genealogies, as we illustrate by characterising the limiting mean and variance of the tree height. SMC genealogies are known to be connected to algorithm performance, so that our results are likely to have applications in the design of new methods as well. Our conditions for convergence are strong, but we show by simulation that they do not appear to be necessary.
Submitted 16 July, 2021; v1 submitted 5 April, 2018;
originally announced April 2018.
-
Symbol-level and Multicast Precoding for Multiuser Multiantenna Downlink: A Survey, Classification and Challenges
Authors:
Maha Alodeh,
Danilo Spano,
Ashkan Kalantari,
Christos Tsinos,
Dimitrios Christopoulos,
Symeon Chatzinotas,
Björn Ottersten
Abstract:
Precoding has been conventionally considered as an effective means of mitigating the interference and efficiently exploiting the available in the multiantenna downlink channel, where multiple users are simultaneously served with independent information over the same channel resources. The early works in this area were focused on transmitting an individual information stream to each user by constructing weighted linear combinations of symbol blocks (codewords). However, more recent works have moved beyond this traditional view by: i) transmitting distinct data streams to groups of users and ii) applying precoding on a symbol-per-symbol basis. In this context, the current survey presents a unified view and classification of precoding techniques with respect to two main axes: i) the switching rate of the precoding weights, leading to the classes of block- and symbol-level precoding, ii) the number of users that each stream is addressed to, hence unicast-/multicast-/broadcast- precoding. Furthermore, the classified techniques are compared through representative numerical results to demonstrate their relative performance and uncover fundamental insights. Finally, a list of open theoretical problems and practical challenges are presented to inspire further research in this area.
Submitted 10 March, 2017;
originally announced March 2017.
-
Wright-Fisher diffusion bridges
Authors:
Robert Griffiths,
Paul A. Jenkins,
Dario Spanò
Abstract:
The trajectory of the frequency of an allele which begins at $x$ at time $0$ and is known to have frequency $z$ at time $T$ can be modelled by the bridge process of the Wright-Fisher diffusion. Bridges when $x=z=0$ are particularly interesting because they model the trajectory of the frequency of an allele which appears at a time, then is lost by random drift or mutation after a time $T$. The coalescent genealogy back in time of a population in a neutral Wright-Fisher diffusion process is well understood. In this paper we obtain a new interpretation of the coalescent genealogy of the population in a bridge from a time $t\in (0,T)$. In a bridge with allele frequencies of 0 at times 0 and $T$ the coalescence structure is that the population coalesces in two directions from $t$ to $0$ and $t$ to $T$ such that there is just one lineage of the allele under consideration at times $0$ and $T$. The genealogy in Wright-Fisher diffusion bridges with selection is more complex than in the neutral model, but still with the property of the population branching and coalescing in two directions from time $t\in (0,T)$. The density of the frequency of an allele at time $t$ is expressed in a way that shows coalescence in the two directions. A new algorithm for exact simulation of a neutral Wright-Fisher bridge is derived. This follows from knowing the density of the frequency in a bridge and exact simulation from the Wright-Fisher diffusion. The genealogy of the neutral Wright-Fisher bridge is also modelled by branching Pólya urns, extending a representation in a Wright-Fisher diffusion. This is a new very interesting representation that relates Wright-Fisher bridges to classical urn models in a Bayesian setting.
Submitted 21 August, 2017; v1 submitted 1 March, 2017;
originally announced March 2017.
-
Duality and Fixation in $Ξ$-Wright-Fisher processes with frequency-dependent selection
Authors:
Adrián González Casanova,
Dario Spanò
Abstract:
A two-types, discrete-time population model with finite, constant size is constructed, allowing for a general form of frequency-dependent selection and skewed offspring distribution. Selection is defined based on the idea that individuals first choose a (random) number of $\textit{potential}$ parents from the previous generation and then, from the selected pool, they inherit the type of the fittest parent. The probability distribution function of the number of potential parents per individual thus parametrises entirely the selection mechanism. Using sampling- and moment-duality, weak convergence is then proved both for the allele frequency process of the selectively weak type and for the population's ancestral process. The scaling limits are, respectively, a two-types $Ξ$-Fleming-Viot jump-diffusion process with frequency-dependent selection, and a branching-coalescing process with general branching and simultaneous multiple collisions. Duality also leads to a characterisation of the probability of extinction of the selectively weak allele, in terms of the ancestral process' ergodic properties.
Submitted 12 April, 2017; v1 submitted 15 December, 2016;
originally announced December 2016.
-
Poisson Random Fields for Dynamic Feature Models
Authors:
Valerio Perrone,
Paul A. Jenkins,
Dario Spano,
Yee Whye Teh
Abstract:
We present the Wright-Fisher Indian buffet process (WF-IBP), a probabilistic model for time-dependent data assumed to have been generated by an unknown number of latent features. This model is suitable as a prior in Bayesian nonparametric feature allocation models in which the features underlying the observed data exhibit a dependency structure over time. More specifically, we establish a new framework for generating dependent Indian buffet processes, where the Poisson random field model from population genetics is used as a way of constructing dependent beta processes. Inference in the model is complex, and we describe a sophisticated Markov Chain Monte Carlo algorithm for exact posterior simulation. We apply our construction to develop a nonparametric focused topic model for collections of time-stamped text documents and test it on the full corpus of NIPS papers published from 1987 to 2015.
Submitted 22 November, 2016;
originally announced November 2016.
-
Conjugacy properties of time-evolving Dirichlet and gamma random measures
Authors:
Omiros Papaspiliopoulos,
Matteo Ruggiero,
Dario Spanò
Abstract:
We extend classic characterisations of posterior distributions under Dirichlet process and gamma random measures priors to a dynamic framework. We consider the problem of learning, from indirect observations, two families of time-dependent processes of interest in Bayesian nonparametrics: the first is a dependent Dirichlet process driven by a Fleming-Viot model, and the data are random samples from the process state at discrete times; the second is a collection of dependent gamma random measures driven by a Dawson-Watanabe model, and the data are collected according to a Poisson point process with intensity given by the process state at discrete times. Both driving processes are diffusions taking values in the space of discrete measures whose support varies with time, and are stationary and reversible with respect to Dirichlet and gamma priors respectively. A common methodology is developed to obtain in closed form the time-marginal posteriors given past and present data. These are shown to belong to classes of finite mixtures of Dirichlet processes and gamma random measures for the two models respectively, yielding conjugacy of these classes to the type of data we consider. We provide explicit results on the parameters of the mixture components and on the mixing weights, which are time-varying and drive the mixtures towards the respective priors in absence of further data. Explicit algorithms are provided to recursively compute the parameters of the mixtures. Our results are based on the projective properties of the signals and on certain duality properties of their projections.
Submitted 30 August, 2016; v1 submitted 11 July, 2016;
originally announced July 2016.
-
Inference and rare event simulation for stopped Markov processes via reverse-time sequential Monte Carlo
Authors:
Jere Koskela,
Dario Spano,
Paul A. Jenkins
Abstract:
We present a sequential Monte Carlo algorithm for Markov chain trajectories with proposals constructed in reverse time, which is advantageous when paths are conditioned to end in a rare set. The reverse time proposal distribution is constructed by approximating the ratio of Green's functions in Nagasawa's formula. Conditioning arguments can be used to interpret these ratios as low-dimensional conditional sampling distributions of some coordinates of the process given the others. Hence the difficulty in designing SMC proposals in high dimension is greatly reduced. We illustrate our method on estimating an overflow probability in a queueing model, the probability that a diffusion follows a narrowing corridor, and the initial location of an infection in an epidemic model on a network.
Submitted 2 January, 2017; v1 submitted 9 March, 2016;
originally announced March 2016.
-
Canonical correlations for dependent gamma processes
Authors:
Dario Spanò,
Antonio Lijoi
Abstract:
The present paper provides a characterisation of exchangeable pairs of random measures $(\widetilde{\mu}_1,\widetilde{\mu}_2)$ whose identical margins are fixed to coincide with the distribution of a gamma completely random measure, and whose dependence structure is given in terms of canonical correlations. It is first shown that canonical correlation sequences for the finite-dimensional distributions of $(\widetilde{\mu}_1,\widetilde{\mu}_2)$ are moments of means of a Dirichlet process having random base measure. Necessary and sufficient conditions are further given for canonically correlated gamma completely random measures to have independent joint increments. Finally, time-homogeneous Feller processes with gamma reversible measure and canonical autocorrelations are characterised as Dawson--Watanabe diffusions with independent homogeneous immigration, time-changed via an independent subordinator. It is thus shown that Dawson--Watanabe diffusions subordinated by pure drift are the only processes in this class whose time-finite-dimensional distributions have, jointly, independent increments.
Submitted 22 January, 2016;
originally announced January 2016.
-
Wright-Fisher construction of the two-parameter Poisson-Dirichlet diffusion
Authors:
Cristina Costantini,
Pierpaolo De Blasi,
Stewart N. Ethier,
Matteo Ruggiero,
Dario Spano
Abstract:
The two-parameter Poisson--Dirichlet diffusion, introduced in 2009 by Petrov, extends the infinitely-many-neutral-alleles diffusion model, related to Kingman's one-parameter Poisson--Dirichlet distribution and to certain Fleming--Viot processes. The additional parameter has been shown to regulate the clustering structure of the population, but is yet to be fully understood in the way it governs the reproductive process. Here we shed some light on these dynamics by formulating a $K$-allele Wright--Fisher model for a population of size $N$, involving a uniform mutation pattern and a specific state-dependent migration mechanism. Suitably scaled, this process converges in distribution to a $K$-dimensional diffusion process as $N\to\infty$. Moreover, the descending order statistics of the $K$-dimensional diffusion converge in distribution to the two-parameter Poisson--Dirichlet diffusion as $K\to\infty$. The choice of the migration mechanism depends on a delicate balance between reinforcement and redistributive effects. The proof of convergence to the infinite-dimensional diffusion is nontrivial because the generators do not converge on a core. Our strategy for overcoming this complication is to prove \textit{a priori} that in the limit there is no "loss of mass", i.e., that, for each limit point of the sequence of finite-dimensional diffusions (after a reordering of components by size), allele frequencies sum to one.
Submitted 12 December, 2016; v1 submitted 22 January, 2016;
originally announced January 2016.
-
Bayesian non-parametric inference for $Λ$-coalescents: consistency and a parametric method
Authors:
Jere Koskela,
Paul A. Jenkins,
Dario Spanò
Abstract:
We investigate Bayesian non-parametric inference of the $Λ$-measure of $Λ$-coalescent processes with recurrent mutation, parametrised by probability measures on the unit interval. We give verifiable criteria on the prior for posterior consistency when observations form a time series, and prove that any non-trivial prior is inconsistent when all observations are contemporaneous. We then show that the likelihood given a data set of size $n \in \mathbb{N}$ is constant across $Λ$-measures whose leading $n - 2$ moments agree, and focus on inferring truncated sequences of moments. We provide a large class of functionals which can be extremised using finite computation given a credible region of posterior truncated moment sequences, and a pseudo-marginal Metropolis-Hastings algorithm for sampling the posterior. Finally, we compare the efficiency of the exact and noisy pseudo-marginal algorithms with and without delayed acceptance acceleration using a simulation study.
Submitted 23 January, 2017; v1 submitted 3 December, 2015;
originally announced December 2015.
-
Exact simulation of the Wright-Fisher diffusion
Authors:
Paul A. Jenkins,
Dario Spano
Abstract:
The Wright-Fisher family of diffusion processes is a widely used class of evolutionary models. However, simulation is difficult because there is no known closed-form formula for its transition function. In this article we demonstrate that it is in fact possible to simulate exactly from a broad class of Wright-Fisher diffusion processes and their bridges. For those diffusions corresponding to reversible, neutral evolution, our key idea is to exploit an eigenfunction expansion of the transition function; this approach even applies to its infinite-dimensional analogue, the Fleming-Viot process. We then develop an exact rejection algorithm for processes with more general drift functions, including those modelling natural selection, using ideas from retrospective simulation. Our approach also yields methods for exact simulation of the moment dual of the Wright-Fisher diffusion, the ancestral process of an infinite-leaf Kingman coalescent tree. We believe our new perspective on diffusion simulation holds promise for other models admitting a transition eigenfunction expansion.
Submitted 29 September, 2023; v1 submitted 23 June, 2015;
originally announced June 2015.
-
Consistency of Bayesian nonparametric inference for discretely observed jump diffusions
Authors:
Jere Koskela,
Dario Spano,
Paul A. Jenkins
Abstract:
We introduce verifiable criteria for weak posterior consistency of identifiable Bayesian nonparametric inference for jump diffusions with unit diffusion coefficient and uniformly Lipschitz drift and jump coefficients in arbitrary dimension. The criteria are expressed in terms of coefficients of the SDEs describing the process, and do not depend on intractable quantities such as transition densities. We also show that products of discrete net and Dirichlet mixture model priors satisfy our conditions, again under an identifiability assumption. This generalises known results by incorporating jumps into previous work on unit diffusions with uniformly Lipschitz drift coefficients.
Submitted 14 September, 2018; v1 submitted 15 June, 2015;
originally announced June 2015.
-
Filtering hidden Markov measures
Authors:
Omiros Papaspiliopoulos,
Matteo Ruggiero,
Dario Spanò
Abstract:
We consider the problem of learning two families of time-evolving random measures from indirect observations. In the first model, the signal is a Fleming--Viot diffusion, which is reversible with respect to the law of a Dirichlet process, and the data is a sequence of random samples from the state at discrete times. In the second model, the signal is a Dawson--Watanabe diffusion, which is reversible with respect to the law of a gamma random measure, and the data is a sequence of Poisson point configurations whose intensity is given by the state at discrete times. A common methodology is developed to obtain the filtering distributions in a computable form, which is based on the projective properties of the signals and duality properties of their projections. The filtering distributions take the form of mixtures of Dirichlet processes and gamma random measures for each of the two families respectively, and an explicit algorithm is provided to compute the parameters of the mixtures. Hence, our results extend classic characterisations of the posterior distribution under Dirichlet process and gamma random measures priors to a dynamic framework.
Submitted 18 November, 2014;
originally announced November 2014.
-
Computational inference beyond Kingman's coalescent
Authors:
Jere Koskela,
Paul A. Jenkins,
Dario Spano
Abstract:
Full likelihood inference under Kingman's coalescent is a computationally challenging problem to which importance sampling (IS) and the product of approximate conditionals (PAC) method have been applied successfully. Both methods can be expressed in terms of families of intractable conditional sampling distributions (CSDs), and rely on principled approximations for accurate inference. Recently, more general $Λ$- and $Ξ$-coalescents have been observed to provide better modelling fits to some genetic data sets. We derive families of approximate CSDs for finite sites $Λ$- and $Ξ$-coalescents, and use them to obtain "approximately optimal" IS and PAC algorithms for $Λ$-coalescents, yielding substantial gains in efficiency over existing methods.
Submitted 16 December, 2015; v1 submitted 22 November, 2013;
originally announced November 2013.
-
The ancestral process of long term seed bank models
Authors:
Jochen Blath,
Adrian González Casanova,
Noemi Kurt,
Dario Spanò
Abstract:
We present a new model for seed banks, where direct ancestors of individuals may have lived in the near as well as the very far past. The classical Wright-Fisher model, as well as a seed bank model with bounded age distribution considered by Kaj, Krone and Lascoux (2001) are special cases of our model. We discern three parameter regimes of the seed bank age distribution, which lead to substantially different behaviour in terms of genetic variability, in particular with respect to fixation of types and time to the most recent common ancestor. We prove that for age distributions with finite mean, the ancestral process converges to a time-changed Kingman coalescent, while in the case of infinite mean, ancestral lineages might not merge at all with positive probability. Further, we present a construction of the forward in time process in equilibrium. The mathematical methods are based on renewal theory, the urn process introduced by Kaj et al., as well as on a paper by Hammond and Sheffield (2011).
Submitted 1 July, 2013; v1 submitted 23 March, 2012;
originally announced March 2012.
-
Orthogonal polynomial kernels and canonical correlations for Dirichlet measures
Authors:
Robert C. Griffiths,
Dario Spanò
Abstract:
We consider a multivariate version of the so-called Lancaster problem of characterizing canonical correlation coefficients of symmetric bivariate distributions with identical marginals and orthogonal polynomial expansions. The marginal distributions examined in this paper are the Dirichlet and the Dirichlet multinomial distribution, respectively, on the continuous and the N-discrete d-dimensional simplex. Their infinite-dimensional limit distributions, respectively, the Poisson-Dirichlet distribution and Ewens's sampling formula, are considered as well. We study, in particular, the possibility of mapping canonical correlations on the d-dimensional continuous simplex (i) to canonical correlation sequences on the d+1-dimensional simplex and/or (ii) to canonical correlations on the discrete simplex, and vice versa. Driven by this motivation, the first half of the paper is devoted to providing a full characterization and probabilistic interpretation of n-orthogonal polynomial kernels (i.e., sums of products of orthogonal polynomials of the same degree n) with respect to the mentioned marginal distributions. We establish several identities and some integral representations which are multivariate extensions of important results known for the case d=2 since the 1970s. These results, along with a common interpretation of the mentioned kernels in terms of dependent Polya urns, are shown to be key features leading to several non-trivial solutions to Lancaster's problem, many of which can be extended naturally to the limit as $d\rightarrow\infty$.
Submitted 20 March, 2013; v1 submitted 26 March, 2010;
originally announced March 2010.
-
Diffusion processes and coalescent trees
Authors:
Robert C. Griffiths,
Dario Spanò
Abstract:
We dedicate this paper to Sir John Kingman on his 70th Birthday. In modern mathematical population genetics the ancestral history of a population of genes back in time is described by John Kingman's coalescent tree. Classical and modern approaches model gene frequencies by diffusion processes. This paper, which is partly a review, discusses how coalescent processes are dual to diffusion processes in an analytic and probabilistic sense. Bochner (1954) and Gasper (1972) were interested in characterizations of processes with Beta stationary distributions and Jacobi polynomial eigenfunctions. We discuss the connection with Wright--Fisher diffusions and the characterization of these processes. Subordinated Wright--Fisher diffusions are of this type. An Inverse Gaussian subordinator is interesting and important in subordinated Wright--Fisher diffusions and is related to the Jacobi Poisson Kernel in orthogonal polynomial theory. A related time-subordinated forest of non-mutant edges in the Kingman coalescent is novel.
Submitted 24 March, 2010;
originally announced March 2010.
-
Multivariate Jacobi and Laguerre polynomials, infinite-dimensional extensions, and their probabilistic connections with multivariate Hahn and Meixner polynomials
Authors:
Robert C. Griffiths,
Dario Spanó
Abstract:
Multivariate versions of classical orthogonal polynomials such as Jacobi, Hahn, Laguerre and Meixner are reviewed and their connection explored by adopting a probabilistic approach. Hahn and Meixner polynomials are interpreted as posterior mixtures of Jacobi and Laguerre polynomials, respectively. By using known properties of gamma point processes and related transformations, a new infinite-dimensional version of Jacobi polynomials is constructed with respect to the size-biased version of the Poisson--Dirichlet weight measure and to the law of the gamma point process from which it is derived.
Submitted 18 July, 2011; v1 submitted 9 September, 2008;
originally announced September 2008.
-
Fragmenting random permutations
Authors:
Christina Goldschmidt,
James B. Martin,
Dario Spanò
Abstract:
Problem 1.5.7 from Pitman's Saint-Flour lecture notes: Does there exist for each $n$ a fragmentation process $(\Pi_{n,k}, 1 \leq k \leq n)$ taking values in the space of partitions of $\{1,2,\ldots,n\}$ such that $\Pi_{n,k}$ is distributed like the partition generated by cycles of a uniform random permutation of $\{1,2,\ldots,n\}$ conditioned to have $k$ cycles? We show that the answer is yes. We also give a partial extension to general exchangeable Gibbs partitions.
Submitted 4 December, 2007;
originally announced December 2007.
-
Record indices and age-ordered frequencies in Gibbs random partitions
Authors:
Robert C. Griffiths,
Dario Spanó
Abstract:
The distribution of age-ordered frequencies arising from an exchangeable Gibbs partition is studied in relation with the distribution of the positions at which new mutations appear in a sample.
Submitted 9 July, 2007; v1 submitted 30 January, 2007;
originally announced January 2007.