Showing 1–10 of 10 results for author: Hobolth, A

  1. arXiv:2405.07879  [pdf, other]

    stat.AP cs.LG

    On the Relation Between Autoencoders and Non-negative Matrix Factorization, and Their Application for Mutational Signature Extraction

    Authors: Ida Egendal, Rasmus Froberg Brøndum, Marta Pelizzola, Asger Hobolth, Martin Bøgsted

    Abstract: The aim of this study is to provide a foundation to understand the relationship between non-negative matrix factorization (NMF) and non-negative autoencoders enabling proper interpretation and understanding of autoencoder-based alternatives to NMF. Since its introduction, NMF has been a popular tool for extracting interpretable, low-dimensional representations of high-dimensional data. However, re…

    Submitted 13 May, 2024; originally announced May 2024.

  2. arXiv:2207.02677  [pdf, other]

    stat.AP

    A flexible model-based framework for robust estimation of mutational signatures

    Authors: Ragnhild Laursen, Lasse Maretty, Asger Hobolth

    Abstract: Somatic mutations in cancer can be viewed as a mixture distribution of several mutational signatures, which can be inferred using non-negative matrix factorization (NMF). Mutational signatures have previously been parametrized using either simple mono-nucleotide interaction models or general tri-nucleotide interaction models. We describe a flexible and novel framework for identifying biologically…

    Submitted 6 July, 2022; originally announced July 2022.

  3. arXiv:2206.03257  [pdf, other]

    stat.ME stat.AP

    Model selection and robust inference of mutational signatures using Negative Binomial non-negative matrix factorization

    Authors: Marta Pelizzola, Ragnhild Laursen, Asger Hobolth

    Abstract: The spectrum of mutations in a collection of cancer genomes can be described by a mixture of a few mutational signatures. The mutational signatures can be found using non-negative matrix factorization (NMF). To extract the mutational signatures we have to assume a distribution for the observed mutational counts and a number of mutational signatures. In most applications, the mutational counts are…

    Submitted 1 November, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

  4. arXiv:2101.07526  [pdf, other]

    stat.AP

    A sampling algorithm to compute the set of feasible solutions for non-negative matrix factorization with an arbitrary rank

    Authors: Ragnhild Laursen, Asger Hobolth

    Abstract: Non-negative Matrix Factorization (NMF) is a useful method to extract features from multivariate data, but an important and sometimes neglected concern is that NMF can result in non-unique solutions. Often, there exists a Set of Feasible Solutions (SFS), which makes it more difficult to interpret the factorization. This problem is especially ignored in cancer genomics, where NMF is used to infer in…

    Submitted 19 January, 2021; originally announced January 2021.

    Comments: 18 pages, 8 figures, 1 algorithm

    MSC Class: 15A23; 62P10; 62-04

  5. arXiv:2101.04941  [pdf, other]

    stat.ME

    Multivariate phase-type theory for the site frequency spectrum

    Authors: Asger Hobolth, Mogens Bladt, Lars Nørvang Andersen

    Abstract: Linear functions of the site frequency spectrum (SFS) play a major role for understanding and investigating genetic diversity. Estimators of the mutation rate (e.g. based on the total number of segregating sites or average of the pairwise differences) and tests for neutrality (e.g. Tajima's D) are perhaps the most well-known examples. The distribution of linear functions of the SFS is important fo…

    Submitted 13 January, 2021; originally announced January 2021.

    MSC Class: 60J90 (Primary) 60J27; 60J28; 60J95; 92D15 (Secondary)

  6. arXiv:1806.01416  [pdf, other]

    q-bio.PE q-bio.QM stat.CO

    Phase-type distributions in population genetics

    Authors: Asger Hobolth, Arno Siri-Jégousse, Mogens Bladt

    Abstract: Probability modelling for DNA sequence evolution is well established and provides a rich framework for understanding genetic variation between samples of individuals from one or more populations. We show that both classical and more recent models for coalescence (with or without recombination) can be described in terms of the so-called phase-type theory, where complicated and tedious calculations…

    Submitted 4 June, 2018; originally announced June 2018.

  7. arXiv:1501.02847  [pdf, other]

    q-bio.PE

    The SMC' is a highly accurate approximation to the ancestral recombination graph

    Authors: Peter R. Wilton, Shai Carmi, Asger Hobolth

    Abstract: Two sequentially Markov coalescent models (SMC and SMC') are available as tractable approximations to the ancestral recombination graph (ARG). We present a Markov process describing coalescence at two fixed points along a pair of sequences evolving under the SMC'. Using our Markov process, we derive a number of new quantities related to the pairwise SMC', thereby analytically quantifying for the f…

    Submitted 4 March, 2015; v1 submitted 12 January, 2015; originally announced January 2015.

    Comments: Revised manuscript

  8. arXiv:1402.5790  [pdf]

    q-bio.PE

    Strong selective sweeps associated with ampliconic regions in great ape X chromosomes

    Authors: Kiwoong Nam, Kasper Munch, Asger Hobolth, Julien Y. Dutheil, Krishna Veeramah, August Woerner, Michael F. Hammer, Great Ape Genome Diversity Project, Thomas Mailund, Mikkel H. Schierup

    Abstract: The unique inheritance pattern of X chromosomes makes them preferential targets of adaptive evolution. We here investigate natural selection on the X chromosome in all species of great apes. We find that diversity is more strongly reduced around genes on the X compared with autosomes, and that a higher proportion of substitutions results from positive selection. Strikingly, the X exhibits several…

    Submitted 5 March, 2014; v1 submitted 24 February, 2014; originally announced February 2014.

    Comments: This is the resubmitted version, with supplementary

  9. Simulation from endpoint-conditioned, continuous-time Markov chains on a finite state space, with applications to molecular evolution

    Authors: Asger Hobolth, Eric A. Stone

    Abstract: Analyses of serially-sampled data often begin with the assumption that the observations represent discrete samples from a latent continuous-time stochastic process. The continuous-time Markov chain (CTMC) is one such generative model whose popularity extends to a variety of disciplines ranging from computational finance to human genetics and genomics. A common theme among these diverse applicati…

    Submitted 9 October, 2009; originally announced October 2009.

    Comments: Published at http://dx.doi.org/10.1214/09-AOAS247 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS247

    Journal ref: Annals of Applied Statistics 2009, Vol. 3, No. 3, 1204-1231

  10. arXiv:q-bio/0511034  [pdf, ps, other]

    q-bio.QM

    Maximum likelihood estimation of phylogenetic tree and substitution rates via generalized neighbor-joining and the EM algorithm

    Authors: Asger Hobolth, Ruriko Yoshida

    Abstract: A central task in the study of molecular sequence data from present-day species is the reconstruction of the ancestral relationships. The most established approach to tree reconstruction is the maximum likelihood (ML) method. In this method, evolution is described in terms of a discrete-state continuous-time Markov process on a phylogenetic tree. The substitution rate matrix that determines the…

    Submitted 19 November, 2005; originally announced November 2005.

    Comments: 12 pages. To appear in Algebraic Biology 2005