Showing 1–18 of 18 results for author: Martin, A

Searching in archive stat.

  1. arXiv:2206.09076  [pdf, other]

    stat.ML cs.LG stat.ME

    Fair Generalized Linear Models with a Convex Penalty

    Authors: Hyungrok Do, Preston Putzel, Axel Martin, Padhraic Smyth, Judy Zhong

    Abstract: Despite recent advances in algorithmic fairness, methodologies for achieving fairness with generalized linear models (GLMs) have yet to be explored in general, despite GLMs being widely used in practice. In this paper we introduce two fairness criteria for GLMs based on equalizing expected outcomes or log-likelihoods. We prove that for GLMs both criteria can be achieved via a convex penalty term b…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: Accepted for publication in ICML 2022

  2. arXiv:2206.03619  [pdf, other]

    stat.CO

    Bayesian additive regression trees for probabilistic programming

    Authors: Miriana Quiroga, Pablo G Garay, Juan M. Alonso, Juan Martin Loyola, Osvaldo A Martin

    Abstract: Bayesian additive regression trees (BART) is a non-parametric method to approximate functions. It is a black-box method based on the sum of many trees where priors are used to regularize inference, mainly by restricting trees' learning capacity so that no individual tree is able to explain the data, but rather the sum of trees. We discuss BART in the context of probabilistic programming languages…

    Submitted 15 August, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: 22 pages, 17 figures

  3. arXiv:2112.01380  [pdf, other]

    stat.ME

    Prior knowledge elicitation: The past, present, and future

    Authors: Petrus Mikkola, Osvaldo A. Martin, Suyog Chandramouli, Marcelo Hartmann, Oriol Abril Pla, Owen Thomas, Henri Pesonen, Jukka Corander, Aki Vehtari, Samuel Kaski, Paul-Christian Bürkner, Arto Klami

    Abstract: Specification of the prior distribution for a Bayesian model is a central part of the Bayesian workflow for data analysis, but it is often difficult even for statistical experts. In principle, prior elicitation transforms domain knowledge of various kinds into well-defined prior distributions, and offers a solution to the prior specification problem. In practice, however, we are still fairly far f…

    Submitted 9 May, 2023; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: 69 pages, 1 figure

  4. arXiv:2107.04126  [pdf, other]

    stat.ML cs.LG

    Many Objective Bayesian Optimization

    Authors: Lucia Asencio Martín, Eduardo C. Garrido-Merchán

    Abstract: Some real problems require the evaluation of expensive and noisy objective functions. Moreover, the analytical expression of these objective functions may be unknown. These functions are known as black-boxes, for example, estimating the generalization error of a machine learning algorithm and computing its prediction time in terms of its hyper-parameters. Multi-objective Bayesian optimization (MOB…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: arXiv admin note: text overlap with arXiv:2101.08061

  5. arXiv:2012.10754  [pdf, other]

    stat.CO

    Bambi: A simple interface for fitting Bayesian linear models in Python

    Authors: Tomás Capretto, Camen Piho, Ravin Kumar, Jacob Westfall, Tal Yarkoni, Osvaldo A. Martin

    Abstract: The popularity of Bayesian statistical methods has increased dramatically in recent years across many research areas and industrial applications. This is the result of a variety of methodological advances with faster and cheaper hardware as well as the development of new software tools. Here we introduce an open source Python package named Bambi (BAyesian Model Building Interface) that is built on…

    Submitted 11 January, 2022; v1 submitted 19 December, 2020; originally announced December 2020.

    Comments: 25 pages, 10 figures, to be published in the Journal of Statistical Software

  6. arXiv:2009.09358  [pdf, ps, other]

    cs.LG stat.ML

    Out-Of-Bag Anomaly Detection

    Authors: Egor Klevak, Sangdi Lin, Andy Martin, Ondrej Linda, Eric Ringger

    Abstract: Data anomalies are ubiquitous in real world datasets, and can have an adverse impact on machine learning (ML) systems, such as automated home valuation. Detecting anomalies could make ML applications more responsible and trustworthy. However, the lack of labels for anomalies and the complex nature of real-world datasets make anomaly detection a challenging unsupervised learning problem. In this pa…

    Submitted 20 September, 2020; originally announced September 2020.

    Comments: 13 pages, 4 figures, KDD 2020 TrueFact Workshop: Making a Credible Web for Tomorrow

  7. arXiv:2007.08620  [pdf, other]

    cs.LG cs.AI stat.ML

    The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction

    Authors: Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin

    Abstract: This paper introduces the Sequential Monte Carlo Transformer, an original approach that naturally captures the observations distribution in a transformer architecture. The keys, queries, values and attention vectors of the network are considered as the unobserved stochastic states of its hidden structure. This generative model is such that at each time step the received observation is a random fun…

    Submitted 15 December, 2020; v1 submitted 15 July, 2020; originally announced July 2020.

  8. arXiv:2002.05438  [pdf, other]

    stat.AP stat.ME

    Backward importance sampling for online estimation of state space models

    Authors: Alice Martin, Marie-Pierre Etienne, Pierre Gloaguen, Sylvain Le Corff, Jimmy Olsson

    Abstract: This paper proposes a new Sequential Monte Carlo algorithm to perform online estimation in the context of state space models when either the transition density of the latent state or the conditional likelihood of an observation given a state is intractable. In this setting, obtaining low variance estimators of expectations under the posterior distributions of the unobserved states given the observ…

    Submitted 7 May, 2021; v1 submitted 13 February, 2020; originally announced February 2020.

  9. arXiv:2001.08049  [pdf, other]

    stat.ML cs.LG

    On Last-Layer Algorithms for Classification: Decoupling Representation from Uncertainty Estimation

    Authors: Nicolas Brosse, Carlos Riquelme, Alice Martin, Sylvain Gelly, Éric Moulines

    Abstract: Uncertainty quantification for deep learning is a challenging open problem. Bayesian statistics offer a mathematically grounded framework to reason about uncertainties; however, approximate posteriors for modern neural networks still require prohibitive computational costs. We propose a family of algorithms which split the classification task into two stages: representation learning and uncertaint…

    Submitted 22 January, 2020; originally announced January 2020.

  10. arXiv:1909.10678  [pdf, other]

    stat.ME stat.CO

    Approximate Bayesian inference of directed acyclic graphs in biology with flexible priors on edge states

    Authors: Evan A Martin, Audrey Qiuyan Fu

    Abstract: Graphical models or networks describe the statistical dependence among multiple variables and are widely used in biology (e.g., gene regulatory networks). Under appropriate assumptions, directed edges may represent causal relationships. A key feature of a biological network is sparsity, defined by how likely an edge is present, of which we often have some knowledge. However, most existing Bayesian…

    Submitted 27 November, 2023; v1 submitted 23 September, 2019; originally announced September 2019.

    Comments: Contains manuscript and supplementary materials

  11. arXiv:1810.12997  [pdf, other]

    math.OC cs.AI cs.LG stat.ML

    An Online-Learning Approach to Inverse Optimization

    Authors: Andreas Bärmann, Alexander Martin, Sebastian Pokutta, Oskar Schneider

    Abstract: In this paper, we demonstrate how to learn the objective function of a decision-maker while only observing the problem input data and the decision-maker's corresponding decisions over multiple rounds. We present exact algorithms for this online version of inverse optimization which converge at a rate of $\mathcal{O}(1/\sqrt{T})$ in the number of observations $T$ and compare their further propert…

    Submitted 28 March, 2020; v1 submitted 30 October, 2018; originally announced October 2018.

    MSC Class: 68Q32; 68T05; 90C90; 90C11

  12. Grand Challenge: Real-time Destination and ETA Prediction for Maritime Traffic

    Authors: Oleh Bodunov, Florian Schmidt, André Martin, Andrey Brito, Christof Fetzer

    Abstract: In this paper, we present our approach for solving the DEBS Grand Challenge 2018. The challenge asks to provide a prediction for (i) a destination and the (ii) arrival time of ships in a streaming-fashion using Geo-spatial data in the maritime context. Novel aspects of our approach include the use of ensemble learning based on Random Forest, Gradient Boosting Decision Trees (GBDT), XGBoost Trees a…

    Submitted 12 October, 2018; originally announced October 2018.

  13. arXiv:1806.01899  [pdf, other]

    stat.ML cs.LG

    MRPC: An R package for accurate inference of causal graphs

    Authors: Md. Bahadur Badsha, Evan A Martin, Audrey Qiuyan Fu

    Abstract: We present MRPC, an R package that learns causal graphs with improved accuracy over existing packages, such as pcalg and bnlearn. Our algorithm builds on the powerful PC algorithm, the canonical algorithm in computer science for learning directed acyclic graphs. The improvement in accuracy results from online control of the false discovery rate (FDR) that reduces false positive edges, a more accur…

    Submitted 5 June, 2018; originally announced June 2018.

  14. arXiv:1804.07580  [pdf]

    cs.LG q-bio.QM stat.ML

    Robust And Scalable Learning Of Complex Dataset Topologies Via Elpigraph

    Authors: Luca Albergante, Evgeny M. Mirkes, Huidong Chen, Alexis Martin, Louis Faure, Emmanuel Barillot, Luca Pinello, Alexander N. Gorban, Andrei Zinovyev

    Abstract: Large datasets represented by multidimensional data point clouds often possess non-trivial distributions with branching trajectories and excluded regions, with the recent single-cell transcriptomic studies of developing embryo being notable examples. Reducing the complexity and producing compact and interpretable representations of such data remains a challenging task. Most of the existing computa…

    Submitted 20 June, 2018; v1 submitted 20 April, 2018; originally announced April 2018.

    Comments: 32 pages, 14 figures

    Journal ref: Entropy 22, no. 3: 296, 2020

  15. Dynamic time warping distance for message propagation classification in Twitter

    Authors: Siwar Jendoubi, Arnaud Martin, Ludovic Liétard, Boutheina Ben Yaghlane, Hend Ben Hadji

    Abstract: Social messages classification is a research domain that has attracted the attention of many researchers in these last years. Indeed, the social message is different from ordinary text because it has some special characteristics like its shortness. Then the development of new approaches for the processing of the social message is now essential to make its classification more efficient. In this pap…

    Submitted 26 January, 2017; originally announced January 2017.

    Comments: 10 pages, 1 figure ECSQARU 2015, Proceedings of the 13th European Conferences on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, 2015

  16. Evidential-EM Algorithm Applied to Progressively Censored Observations

    Authors: Kuang Zhou, Arnaud Martin, Quan Pan

    Abstract: Evidential-EM (E2M) algorithm is an effective approach for computing maximum likelihood estimations under finite mixture models, especially when there is uncertain information about data. In this paper we present an extension of the E2M method in a particular case of incomplete data, where the loss of information is due to both mixture models and censored observations. The prior uncertain informa…

    Submitted 7 January, 2015; originally announced January 2015.

    Journal ref: 15th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Jul 2014, Montpellier, France. pp.180 - 189

  17. arXiv:1403.4700   

    physics.soc-ph stat.AP

    Kinetic modeling of opinion formation of peoples via multiple political parties

    Authors: Ryosuke Yano, Arnaud Martin

    Abstract: We investigate the opinion formation among the peoples and multiple political parties using the one dimensional relativistic Boltzmann-Vlasov equation for multi-components. A political party is constituted of politicians. The opinion formation depends on self-thinkings of peoples and politicians, and the constraint of the political party over opinions of politicians, when we restrict ourselves to…

    Submitted 16 May, 2014; v1 submitted 19 March, 2014; originally announced March 2014.

    Comments: Further considerations are essential for binary exchange of opinions. Present version is insufficient

  18. Opinion formation with upper and lower bounds

    Authors: Ryosuke Yano, Arnaud Martin

    Abstract: We investigate the opinion formation with upper and lower bounds. We formulate the binary exchange of opinions between two individuals, and effects of the self-thinking and political party using the relativistic Boltzmann-Vlasov type equation with the randomly perturbed motion. The convergent form of the distribution function is determined by the balance between the cooling rate via the binary exc…

    Submitted 23 November, 2015; v1 submitted 28 February, 2014; originally announced February 2014.

    Comments: We revised in April 2014