Showing 1–8 of 8 results for author: Havasi, M

Searching in archive cs.
  1. arXiv:2406.16829  [pdf, other]

    cs.CL cs.AI cs.LG

    Understanding and Mitigating Tokenization Bias in Language Models

    Authors: Buu Phan, Marton Havasi, Matthew Muckley, Karen Ullrich

    Abstract: State-of-the-art language models are autoregressive and operate on subword units known as tokens. Specifically, one must encode the conditioning string into a list of tokens before passing to the language models for next-token prediction. We show that, for encoding schemes such as maximum prefix matching, tokenization induces a sampling bias that cannot be mitigated with more training or data. To…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2402.12737  [pdf, other]

    cs.LG

    Guarantee Regions for Local Explanations

    Authors: Marton Havasi, Sonali Parbhoo, Finale Doshi-Velez

    Abstract: Interpretability methods that utilise local surrogate models (e.g. LIME) are very good at describing the behaviour of the predictive model at a point of interest, but they are not guaranteed to extrapolate to the local region surrounding the point. However, overfitting to the local curvature of the predictive model and malicious tampering can significantly limit extrapolation. We propose an anchor…

    Submitted 20 February, 2024; originally announced February 2024.

  3. arXiv:2211.05667  [pdf, ps, other]

    cs.LG

    What Makes a Good Explanation?: A Harmonized View of Properties of Explanations

    Authors: Zixi Chen, Varshini Subhash, Marton Havasi, Weiwei Pan, Finale Doshi-Velez

    Abstract: Interpretability provides a means for humans to verify aspects of machine learning (ML) models and empower human+ML teaming in situations where the task cannot be fully automated. Different contexts require explanations with different properties. For example, the kind of explanation required to determine if an early cardiac arrest warning system is ready to be integrated into a care setting is ver…

    Submitted 2 December, 2022; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: Short version accepted at NeurIPS 2022 workshops on Progress and Challenges in Building Trustworthy Embodied AI and Trustworthy and Socially Responsible Machine Learning

  4. arXiv:2106.04015  [pdf, other]

    cs.LG

    Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

    Authors: Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, et al. (1 additional author not shown)

    Abstract: High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compu…

    Submitted 5 January, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  5. arXiv:2010.06610  [pdf, other]

    cs.LG cs.CV stat.ML

    Training independent subnetworks for robust prediction

    Authors: Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran

    Abstract: Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network. However, these methods still require multiple forward passes for prediction, leading to a significant computational cost. In this work, we show a surprising result: the benefits of using multiple pred…

    Submitted 4 August, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: Updated to the ICLR camera ready version, added reference to Soflaei et al. 2020

  6. arXiv:2010.01185  [pdf, other]

    cs.IT eess.IV stat.ML

    Compressing Images by Encoding Their Latent Representations with Relative Entropy Coding

    Authors: Gergely Flamich, Marton Havasi, José Miguel Hernández-Lobato

    Abstract: Variational Autoencoders (VAEs) have seen widespread use in learned image compression. They are used to learn expressive latent representations on which downstream compression methods can operate with high efficiency. Recently proposed 'bits-back' methods can indirectly encode the latent representation of images with codelength close to the relative entropy between the latent posterior and the pri…

    Submitted 19 April, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

    Comments: Accepted at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

    MSC Class: 94A08 (Primary); 94A34 (Secondary)

    ACM Class: E.4; G.3; H.1.1

  7. arXiv:1810.00440  [pdf, other]

    stat.ML cs.LG

    Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters

    Authors: Marton Havasi, Robert Peharz, José Miguel Hernández-Lobato

    Abstract: While deep neural networks are a highly successful model class, their large memory footprint puts considerable strain on energy consumption, communication bandwidth, and storage requirements. Consequently, model size reduction has become an utmost goal in deep learning. A typical approach is to train a set of deterministic weights, while applying certain techniques such as pruning and quantization…

    Submitted 30 September, 2018; originally announced October 2018.

    Comments: Under review as a conference paper at ICLR 2019

  8. arXiv:1806.05490  [pdf, other]

    stat.ML cs.LG

    Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo

    Authors: Marton Havasi, José Miguel Hernández-Lobato, Juan José Murillo-Fuentes

    Abstract: Deep Gaussian Processes (DGPs) are hierarchical generalizations of Gaussian Processes that combine well calibrated uncertainty estimates with the high flexibility of multilayer models. One of the biggest challenges with these models is that exact inference is intractable. The current state-of-the-art inference method, Variational Inference (VI), employs a Gaussian approximation to the posterior di…

    Submitted 12 November, 2018; v1 submitted 14 June, 2018; originally announced June 2018.