Showing 1–13 of 13 results for author: Bacallado, S

  1. arXiv:2403.02979  [pdf, other]

    stat.ME math.ST stat.AP

    Regularised Canonical Correlation Analysis: graphical lasso, biplots and beyond

    Authors: Lennie Wells, Kumar Thurimella, Sergio Bacallado

    Abstract: Recent developments in regularized Canonical Correlation Analysis (CCA) promise powerful methods for high-dimensional, multiview data analysis. However, justifying the structural assumptions behind many popular approaches remains a challenge, and features of realistic biological datasets pose practical difficulties that are seldom discussed. We propose a novel CCA estimator rooted in an assumption…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 83 pages, 27 figures

    MSC Class: 62H20 (Primary) 62H12; 62P10 (Secondary)

    ACM Class: G.3

  2. arXiv:2306.14809  [pdf, other]

    cs.LG

    Tanimoto Random Features for Scalable Molecular Machine Learning

    Authors: Austin Tripp, Sergio Bacallado, Sukriti Singh, José Miguel Hernández-Lobato

    Abstract: The Tanimoto coefficient is commonly used to measure the similarity between molecules represented as discrete fingerprints, either as a distance metric or a positive definite kernel. While many kernel methods can be accelerated using random feature approximations, at present there is a lack of such approximations for the Tanimoto kernel. In this paper we propose two kinds of novel random features…

    Submitted 13 November, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: Camera-ready version presented at NeurIPS 2023. Updates include: notation changes, better description of features in section 4, updated experiments, link to code

  3. arXiv:2210.09211  [pdf, other]

    stat.ML cs.LG

    Conditional Neural Processes for Molecules

    Authors: Miguel Garcia-Ortegon, Andreas Bender, Sergio Bacallado

    Abstract: Neural processes (NPs) are models for transfer learning with properties reminiscent of Gaussian Processes (GPs). They are adept at modelling data consisting of few observations of many related functions on the same input space and are trained by minimizing a variational objective, which is computationally much less expensive than the Bayesian updating required by GPs. So far, most studies of NPs h…

    Submitted 23 February, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

  4. arXiv:2110.15486  [pdf, other]

    stat.ML cs.LG q-bio.BM

    DOCKSTRING: easy molecular docking yields better benchmarks for ligand design

    Authors: Miguel García-Ortegón, Gregor N. C. Simm, Austin J. Tripp, José Miguel Hernández-Lobato, Andreas Bender, Sergio Bacallado

    Abstract: The field of machine learning for drug discovery is witnessing an explosion of novel methods. These methods are often benchmarked on simple physicochemical properties such as solubility or general druglikeness, which can be readily computed. However, these properties are poor representatives of objective functions in drug design, mainly because they do not depend on the candidate's interaction wit…

    Submitted 28 October, 2021; originally announced October 2021.

  5. arXiv:2102.08984  [pdf, ps, other]

    math.PR

    The *-Edge-Reinforced Random Walk

    Authors: Sergio Bacallado, Christophe Sabot, Pierre Tarrès

    Abstract: We define a linearly reinforced process called the *-Edge-Reinforced Random Walk (*-ERRW) which can be seen as a Yaglom reversible, hence non-reversible, extension of the Edge-Reinforced Random Walk (ERRW) introduced by Coppersmith and Diaconis in 1986. This family of processes also generalizes the r-dependent ERRW introduced by Bacallado (2009). Under some assumptions on the initial weights, the…

    Submitted 29 November, 2023; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: 20 pages

  6. arXiv:2004.07743  [pdf, other]

    stat.AP stat.ME

    BETS: The dangers of selection bias in early analyses of the coronavirus disease (COVID-19) pandemic

    Authors: Qingyuan Zhao, Nianqiao Ju, Sergio Bacallado, Rajen D. Shah

    Abstract: The coronavirus disease 2019 (COVID-19) has quickly grown from a regional outbreak in Wuhan, China to a global pandemic. Early estimates of the epidemic growth and incubation period of COVID-19 may have been biased due to sample selection. Using detailed case reports from 14 locations in and outside mainland China, we obtained 378 Wuhan-exported cases who left Wuhan before an abrupt travel quarant…

    Submitted 24 September, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: 33 pages, 8 figures, 5 tables; Accepted for publication in The Annals of Applied Statistics on 24th September, 2020

    MSC Class: 62P10; 62F15

  7. arXiv:1806.11370  [pdf, other]

    stat.AP

    Bayesian Uncertainty Directed Trial Designs

    Authors: Steffen Ventz, Matteo Cellamare, Sergio Bacallado, Lorenzo Trippa

    Abstract: Most Bayesian response-adaptive designs unbalance randomization rates towards the most promising arms with the goal of increasing the number of positive treatment outcomes during the study, even though the primary aim of the trial is different. We discuss Bayesian uncertainty directed designs (BUD), a class of Bayesian designs in which the investigator specifies an information measure tailored to…

    Submitted 29 June, 2018; originally announced June 2018.

  8. arXiv:1711.01241  [pdf, other]

    stat.ME stat.AP

    Bayesian Mixed Effects Models for Zero-inflated Compositions in Microbiome Data Analysis

    Authors: Boyu Ren, Sergio Bacallado, Stefano Favaro, Tommi Vatanen, Curtis Huttenhower, Lorenzo Trippa

    Abstract: Detecting associations between microbial compositions and sample characteristics is one of the most important tasks in microbiome studies. Most of the existing methods apply univariate models to single microbial species separately, with adjustments for multiple hypothesis testing. We propose a Bayesian analysis for a generalized mixed effects linear model tailored to this application. The marginal…

    Submitted 24 August, 2019; v1 submitted 3 November, 2017; originally announced November 2017.

  9. arXiv:1710.08045  [pdf, other]

    cs.IR cs.LG stat.ML

    Sequential Matrix Completion

    Authors: Annie Marsden, Sergio Bacallado

    Abstract: We propose a novel algorithm for sequential matrix completion in a recommender system setting, where the $(i,j)$th entry of the matrix corresponds to a user $i$'s rating of product $j$. The objective of the algorithm is to provide a sequential policy for user-product pair recommendation which will yield the highest possible ratings after a finite time horizon. The algorithm uses a Gamma process fa…

    Submitted 22 October, 2017; originally announced October 2017.

    Comments: 10 pages, 6 figures

  10. Bayesian Nonparametric Ordination for the Analysis of Microbial Communities

    Authors: Boyu Ren, Sergio Bacallado, Stefano Favaro, Susan Holmes, Lorenzo Trippa

    Abstract: Human microbiome studies use sequencing technologies to measure the abundance of bacterial species or Operational Taxonomic Units (OTUs) in samples of biological material. Typically the data are organized in contingency tables with OTU counts across heterogeneous biological samples. In the microbial ecology community, ordination methods are frequently used to investigate latent factors or clusters…

    Submitted 20 January, 2017; v1 submitted 19 January, 2016; originally announced January 2016.

  11. Looking-backward probabilities for Gibbs-type exchangeable random partitions

    Authors: Sergio Bacallado, Stefano Favaro, Lorenzo Trippa

    Abstract: Gibbs-type random probability measures and the exchangeable random partitions they induce represent the subject of a rich and active literature. They provide a probabilistic framework for a wide range of theoretical and applied problems that are typically referred to as species sampling problems. In this paper, we consider the class of looking-backward species sampling problems introduced in Lijoi…

    Submitted 3 April, 2015; originally announced April 2015.

    Comments: Published at http://dx.doi.org/10.3150/13-BEJ559 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    Report number: IMS-BEJ-BEJ559

    Journal ref: Bernoulli 2015, Vol. 21, No. 1, 1-37

  12. Bayesian nonparametric analysis of reversible Markov chains

    Authors: Sergio Bacallado, Stefano Favaro, Lorenzo Trippa

    Abstract: We introduce a three-parameter random walk with reinforcement, called the $(θ,α,β)$ scheme, which generalizes the linearly edge reinforced random walk to uncountable spaces. The parameter $β$ smoothly tunes the $(θ,α,β)$ scheme between this edge reinforced random walk and the classical exchangeable two-parameter Hoppe urn scheme, while the parameters $α$ and $θ$ modulate how many states are typica…

    Submitted 6 June, 2013; originally announced June 2013.

    Comments: Published at http://dx.doi.org/10.1214/13-AOS1102 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1102

    Journal ref: Annals of Statistics 2013, Vol. 41, No. 2, 870-896

  13. Bayesian analysis of variable-order, reversible Markov chains

    Authors: Sergio Bacallado

    Abstract: We define a conjugate prior for the reversible Markov chain of order $r$. The prior arises from a partially exchangeable reinforced random walk, in the same way that the Beta distribution arises from the exchangeable Pólya urn. An extension to variable-order Markov chains is also derived. We show the utility of this prior in testing the order and estimating the parameters of a reversible Markov mo…

    Submitted 13 May, 2011; originally announced May 2011.

    Comments: Published at http://dx.doi.org/10.1214/10-AOS857 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS857

    Journal ref: Annals of Statistics 2011, Vol. 39, No. 2, 838-864