Showing 1–18 of 18 results for author: Nicholls, G K

  1. arXiv:2311.00541  [pdf, other]

    cs.CL stat.ME

    An Embedded Diachronic Sense Change Model with a Case Study from Ancient Greek

    Authors: Schyan Zafar, Geoff K. Nicholls

    Abstract: Word meanings change over time, and word senses evolve, emerge or die out in the process. For ancient languages, where the corpora are often small and sparse, modelling such changes accurately proves challenging, and quantifying uncertainty in sense-change estimates consequently becomes important. GASC (Genre-Aware Semantic Change) and DiSC (Diachronic Sense Change) are existing generative models…

    Submitted 25 June, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

  2. arXiv:2308.09060  [pdf, other]

    stat.CO

    TraitLab: a Matlab package for fitting and simulating binary tree-like data

    Authors: Luke J. Kelly, Geoff K. Nicholls, Robin J. Ryder, David Welch

    Abstract: TraitLab is a software package for simulating, fitting and analysing tree-like binary data under a stochastic Dollo model of evolution. The model also allows for rate heterogeneity through catastrophes, evolutionary events where many traits are simultaneously lost while new ones arise, and borrowing, whereby traits transfer laterally between species as well as through ancestral relationships. The…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Manual describing the TraitLab software for phylogenetic inference

    MSC Class: 62-04

  3. arXiv:2306.15827  [pdf, other]

    stat.ME

    Bayesian Inference for Vertex-Series-Parallel Partial Orders

    Authors: Chuxuan Jiang, Geoff K. Nicholls, Jeong Eun Lee

    Abstract: Partial orders are a natural model for the social hierarchies that may constrain "queue-like" rank-order data. However, the computational cost of counting the linear extensions of a general partial order on a ground set with more than a few tens of elements is prohibitive. Vertex-series-parallel partial orders (VSPs) are a subclass of partial orders which admit rapid counting and represent the sor…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: 9 pages, 8 figures, to be published in UAI 2023

  4. arXiv:2306.15622  [pdf, other]

    stat.ME stat.AP

    Biclustering random matrix partitions with an application to classification of forensic body fluids

    Authors: Chieh-Hsi Wu, Amy D. Roeder, Geoff K. Nicholls

    Abstract: Classification of unlabeled data is usually achieved by supervised learning from labeled samples. Although there exist many sophisticated supervised machine learning methods that can predict the missing labels with a high level of accuracy, they often lack the required transparency in situations where it is important to provide interpretable results and meaningful measures of confidence. Body flui…

    Submitted 14 October, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: 23 pages and 4 figures (paper); 22 pages and 6 figures (supplement); revision adds model comparisons

    MSC Class: 62F15 (Primary) 62P10 (Secondary)

  5. arXiv:2212.05524  [pdf, other]

    stat.ME stat.AP

    Bayesian inference for partial orders from random linear extensions: power relations from 12th Century Royal Acta

    Authors: Geoff K. Nicholls, Jeong Eun Lee, Nicholas Karn, David Johnson, Rukuang Huang, Alexis Muir-Watt

    Abstract: We give a new class of models for time series data in which actors are listed in order of precedence. We model the lists as a realisation of a queue in which queue-position is constrained by an underlying social hierarchy. We model the hierarchy as a partial order so that the lists are random linear extensions. We account for noise via a random queue-jumping process. We give a marginally consisten…

    Submitted 1 August, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

    Comments: 57 pages, 37 figures and 2 tables including appendices

    MSC Class: 62M05 (Primary) 06A06; 62P25 (Secondary)

  6. arXiv:2204.00296  [pdf, other]

    stat.ML cs.LG stat.CO

    Scalable Semi-Modular Inference with Variational Meta-Posteriors

    Authors: Chris U. Carmona, Geoff K. Nicholls

    Abstract: The Cut posterior and related Semi-Modular Inference are Generalised Bayes methods for Modular Bayesian evidence combination. Analysis is broken up over modular sub-models of the joint posterior distribution. Model-misspecification in multi-modular models can be hard to fix by model elaboration alone and the Cut posterior and SMI offer a way round this. Information entering the analysis from missp…

    Submitted 1 April, 2022; originally announced April 2022.

    Comments: 41 pages including bibliography, 9 figures. Supplement 18 pages. Code reproducing results https://github.com/chriscarmona/modularbayes

    MSC Class: 62F15 (Primary) 62C10; 62-08 (Secondary)

  7. arXiv:2201.09706  [pdf, other]

    stat.ME

    Valid belief updates for prequentially additive loss functions arising in Semi-Modular Inference

    Authors: Geoff K. Nicholls, Jeong Eun Lee, Chieh-Hsi Wu, Chris U. Carmona

    Abstract: Model-based Bayesian evidence combination leads to models with multiple parametric modules. In this setting the effects of model misspecification in one of the modules may in some cases be ameliorated by cutting the flow of information from the misspecified module. Semi-Modular Inference (SMI) is a framework allowing partial cuts which modulate but do not completely cut the flow of information be…

    Submitted 24 January, 2022; originally announced January 2022.

    Comments: 39 pages including supplement, 6 figures

    MSC Class: 62C10 (Primary) 62F35; 65C05 (Secondary)

  8. arXiv:2012.13837  [pdf, other]

    stat.ME

    Tree based credible set estimation

    Authors: Jeong Eun Lee, Geoff K. Nicholls

    Abstract: Estimating a joint Highest Posterior Density credible set for a multivariate posterior density is challenging as dimension gets larger. Credible intervals for univariate marginals are usually presented for ease of computation and visualisation. There are often two layers of approximation, as we may need to compute a credible set for a target density which is itself only an approximation to the tru…

    Submitted 27 May, 2021; v1 submitted 26 December, 2020; originally announced December 2020.

  9. arXiv:2006.11228  [pdf, other]

    stat.CO

    Distortion estimates for approximate Bayesian inference

    Authors: Hanwen Xing, Geoff K. Nicholls, Jeong Eun Lee

    Abstract: Current literature on posterior approximation for Bayesian inference offers many alternative methods. Does our chosen approximation scheme work well on the observed data? The best existing generic diagnostic tools treat this kind of question by looking at performance averaged over data space, or otherwise lack diagnostic detail. However, if the approximation is bad for most data, but good at th…

    Submitted 19 June, 2020; originally announced June 2020.

  10. arXiv:2003.06804  [pdf, other]

    stat.ME math.ST stat.ML

    Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components

    Authors: Chris U. Carmona, Geoff K. Nicholls

    Abstract: Bayesian statistical inference loses predictive optimality when generative models are misspecified. Working within an existing coherent loss-based generalisation of Bayesian inference, we show existing Modular/Cut-model inference is coherent, and write down a new family of Semi-Modular Inference (SMI) schemes, indexed by an influence parameter, with Bayesian inference and Cut-models as special c…

    Submitted 15 March, 2020; originally announced March 2020.

    Comments: for associated R package to reproduce results, see https://github.com/christianu7/aistats2020smi

  11. arXiv:2002.04704  [pdf, other]

    stat.ML cs.LG stat.CO

    Large Scale Tensor Regression using Kernels and Variational Inference

    Authors: Robert Hu, Geoff K. Nicholls, Dino Sejdinovic

    Abstract: We outline an inherent weakness of tensor factorization models when latent factors are expressed as a function of side information and propose a novel method to mitigate this weakness. We coin our method Kernel Fried Tensor (KFT) and present it as a large scale forecasting tool for high dimensional data. Our results show superior performance against LightGBM and Field Awar…

    Submitted 11 February, 2020; originally announced February 2020.

  12. arXiv:1810.06433  [pdf, other]

    stat.CO stat.ME

    Calibration procedures for approximate Bayesian credible sets

    Authors: Jeong Eun Lee, Geoff K. Nicholls, Robin J. Ryder

    Abstract: We develop and apply two calibration procedures for checking the coverage of approximate Bayesian credible sets including intervals estimated using Monte Carlo methods. The user has an ideal prior and likelihood, but generates a credible set for an approximate posterior which is not proportional to the product of ideal likelihood and prior. We estimate the realised posterior coverage achieved by t…

    Submitted 8 April, 2019; v1 submitted 15 October, 2018; originally announced October 2018.

    Comments: 28 pages, 6 Figures, 1 Table, 4 Algorithm boxes. Revision improves clarity of presentation and adds relevant citations

    Journal ref: Bayesian Anal. 14(4): 1245-1269 (December 2019)

  13. arXiv:1601.07931  [pdf, other]

    stat.AP q-bio.PE

    Lateral transfer in Stochastic Dollo models

    Authors: Luke J. Kelly, Geoff K. Nicholls

    Abstract: Lateral transfer, a process whereby species exchange evolutionary traits through non-ancestral relationships, is a frequent source of model misspecification in phylogenetic inference. Lateral transfer obscures the phylogenetic signal in the data as the histories of affected traits are mosaics of the overall phylogeny. We control for the effect of lateral transfer in a Stochastic Dollo model and a…

    Submitted 16 March, 2017; v1 submitted 28 January, 2016; originally announced January 2016.

    Comments: Improvements suggested by reviewers

    MSC Class: 62M05

  14. arXiv:1503.08066  [pdf, other]

    stat.CO

    Scalable Bayesian Inference for the Inverse Temperature of a Hidden Potts Model

    Authors: Matthew T. Moores, Geoff K. Nicholls, Anthony N. Pettitt, Kerrie Mengersen

    Abstract: The inverse temperature parameter of the Potts model governs the strength of spatial cohesion and therefore has a major influence over the resulting model fit. A difficulty arises from the dependence of an intractable normalising constant on the value of this parameter and thus there is no closed-form solution for sampling from the posterior distribution directly. There are a variety of computatio…

    Submitted 17 August, 2018; v1 submitted 27 March, 2015; originally announced March 2015.

  15. arXiv:1205.6857  [pdf, ps, other]

    stat.CO

    Coupled MCMC with a randomized acceptance probability

    Authors: Geoff K. Nicholls, Colin Fox, Alexis Muir Watt

    Abstract: We consider Metropolis Hastings MCMC in cases where the log of the ratio of target distributions is replaced by an estimator. The estimator is based on m samples from an independent online Monte Carlo simulation. Under some conditions on the distribution of the estimator the process resembles Metropolis Hastings MCMC with a randomized transition kernel. When this is the case there is a correction…

    Submitted 30 May, 2012; originally announced May 2012.

    Comments: 20 pages, 5 graphs in 3 figures

  16. arXiv:1006.5575  [pdf, ps, other]

    stat.AP

    On building and fitting a spatio-temporal change-point model for settlement and growth at Bourewa, Fiji Islands

    Authors: Geoff K. Nicholls, Patrick D. Nunn

    Abstract: The Bourewa beach site on the Rove Peninsula of Viti Levu is the earliest known human settlement in the Fiji Islands. How did the settlement at Bourewa develop in space and time? We have radiocarbon dates on sixty specimens, found in association with evidence for human presence, taken from pits across the site. Owing to the lack of diagnostic stratigraphy, there is no direct archaeological evidenc…

    Submitted 29 June, 2010; originally announced June 2010.

    Comments: 25 pages, 10 figures

  17. arXiv:0908.1735  [pdf, ps, other]

    stat.AP

    Missing data in a stochastic Dollo model for cognate data, and its application to the dating of Proto-Indo-European

    Authors: Robin J. Ryder, Geoff K. Nicholls

    Abstract: Nicholls and Gray (2008) describe a phylogenetic model for trait data. They use their model to estimate branching times on Indo-European language trees from lexical data. Alekseyenko et al. (2008) extended the model and gave applications in genetics. In this paper we extend the inference to handle data missing at random. When trait data are gathered, traits are thinned in a way that depends on b…

    Submitted 12 August, 2009; originally announced August 2009.

  18. arXiv:0711.1874  [pdf, ps, other]

    stat.ME stat.AP

    Dated ancestral trees from binary trait data and its application to the diversification of languages

    Authors: Geoff K. Nicholls, Russell D. Gray

    Abstract: Binary trait data record the presence or absence of distinguishing traits in individuals. We treat the problem of estimating ancestral trees with time depth from binary trait data. Simple analysis of such data is problematic. Each homology class of traits has a unique birth event on the tree, and the birth event of a trait visible at the leaves is biased towards the leaves. We propose a model-ba…

    Submitted 12 November, 2007; originally announced November 2007.

    Comments: The definitive version of this manuscript is available in the Journal of the Royal Statistical Society at http://www3.interscience.wiley.com/

    Journal ref: Journal of the Royal Statistical Society. Series B: Statistical Methodology 70 (3), pp. 545-566 (2008)