Showing 1–29 of 29 results for author: Cemgil, A T

  1. arXiv:2405.01563  [pdf, other]

    cs.LG cs.AI cs.CL

    Mitigating LLM Hallucinations via Conformal Abstention

    Authors: Yasin Abbasi Yadkori, Ilja Kuzborskij, David Stutz, András György, Adam Fisch, Arnaud Doucet, Iuliya Beloshapka, Wei-Hung Weng, Yao-Yuan Yang, Csaba Szepesvári, Ali Taylan Cemgil, Nenad Tomasev

    Abstract: We develop a principled procedure for determining when a large language model (LLM) should abstain from responding (e.g., by saying "I don't know") in a general domain, instead of resorting to possibly "hallucinating" a nonsensical or incorrect answer. Building on earlier approaches that use self-consistency as a more reliable measure of model confidence, we propose using the LLM itself to self-e…

    Submitted 4 April, 2024; originally announced May 2024.

  2. arXiv:2307.09302  [pdf, other]

    cs.LG cs.CV stat.ME stat.ML

    Conformal prediction under ambiguous ground truth

    Authors: David Stutz, Abhijit Guha Roy, Tatiana Matejovicova, Patricia Strachan, Ali Taylan Cemgil, Arnaud Doucet

    Abstract: Conformal Prediction (CP) allows one to perform rigorous uncertainty quantification by constructing a prediction set $C(X)$ satisfying $\mathbb{P}(Y \in C(X))\geq 1-α$ for a user-chosen $α\in [0,1]$ by relying on calibration data $(X_1,Y_1),...,(X_n,Y_n)$ from $\mathbb{P}=\mathbb{P}^{X} \otimes \mathbb{P}^{Y|X}$. It is typically implicitly assumed that $\mathbb{P}^{Y|X}$ is the "true" posterior label…

    Submitted 24 October, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

  3. arXiv:2307.02191  [pdf, other]

    cs.LG cs.CV stat.ME stat.ML

    Evaluating AI systems under uncertain ground truth: a case study in dermatology

    Authors: David Stutz, Ali Taylan Cemgil, Abhijit Guha Roy, Tatiana Matejovicova, Melih Barsbey, Patricia Strachan, Mike Schaekermann, Jan Freyberg, Rajeev Rikhye, Beverly Freeman, Javier Perez Matos, Umesh Telang, Dale R. Webster, Yuan Liu, Greg S. Corrado, Yossi Matias, Pushmeet Kohli, Yun Liu, Arnaud Doucet, Alan Karthikesalingam

    Abstract: For safety, AI systems in health undergo thorough evaluations before deployment, validating their predictions against a ground truth that is assumed certain. However, this is actually not the case and the ground truth may be uncertain. Unfortunately, this is largely ignored in standard evaluation of AI models but can have severe consequences such as overestimating the future performance. To avoid…

    Submitted 5 July, 2023; originally announced July 2023.

  4. arXiv:2302.00049  [pdf, other]

    cs.LG

    Transformers Meet Directed Graphs

    Authors: Simon Geisler, Yujia Li, Daniel Mankowitz, Ali Taylan Cemgil, Stephan Günnemann, Cosmin Paduraru

    Abstract: Transformers were originally proposed as a sequence-to-sequence model for text but have become vital for a wide range of modalities, including images, audio, video, and undirected graphs. However, transformers for directed graphs are a surprisingly underexplored topic, despite their applicability to ubiquitous domains, including source code and logic circuits. In this work, we propose two directio…

    Submitted 31 August, 2023; v1 submitted 31 January, 2023; originally announced February 2023.

    Comments: 29 pages

  5. arXiv:2209.02270  [pdf, other]

    cs.LG cs.CV cs.NI eess.SY

    An Indoor Localization Dataset and Data Collection Framework with High Precision Position Annotation

    Authors: F. Serhan Daniş, A. Teoman Naskali, A. Taylan Cemgil, Cem Ersoy

    Abstract: We introduce a novel technique and an associated high-resolution dataset that aim to precisely evaluate wireless-signal-based indoor positioning algorithms. The technique implements an augmented reality (AR) based positioning system that is used to annotate the wireless signal parameter data samples with high-precision position data. We track the position of a practical and low-cost navigable set…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: 30 pages

    Journal ref: F. Serhan Daniş, A. Teoman Naskali, A. Taylan Cemgil, Cem Ersoy, "An indoor localization dataset and data collection framework with high precision position annotation", Pervasive and Mobile Computing, Volume 81, 101554, 2022

  6. arXiv:2110.09192  [pdf, other]

    cs.LG cs.CV stat.ME stat.ML

    Learning Optimal Conformal Classifiers

    Authors: David Stutz, Krishnamurthy Dvijotham, Ali Taylan Cemgil, Arnaud Doucet

    Abstract: Modern deep learning based classifiers show very high accuracy on test data but this does not provide sufficient guarantees for safe deployment, especially in high-stakes AI applications such as medical diagnosis. Usually, predictions are obtained without a reliable uncertainty estimate or a formal guarantee. Conformal prediction (CP) addresses these issues by using the classifier's predictions, e.…

    Submitted 6 May, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

    Comments: ICLR 2022

  7. arXiv:2012.12862  [pdf, other]

    cs.IR cs.LG stat.ML

    Towards Fair Personalization by Avoiding Feedback Loops

    Authors: Gökhan Çapan, Özge Bozal, İlker Gündoğdu, Ali Taylan Cemgil

    Abstract: Self-reinforcing feedback loops are both cause and effect of over- and/or under-presentation of some content in interactive recommender systems. This leads to erroneous user preference estimates, namely, overestimation of over-presented content while violating the right to be presented of each alternative, the contrary of which we define as a fair system. We consider two models that explicitly incorpor…

    Submitted 20 December, 2020; originally announced December 2020.

    Comments: NeurIPS 2019 Workshop on Human-Centric Machine Learning

  8. arXiv:2012.03715  [pdf, other]

    cs.LG stat.ML

    Autoencoding Variational Autoencoder

    Authors: A. Taylan Cemgil, Sumedh Ghaisas, Krishnamurthy Dvijotham, Sven Gowal, Pushmeet Kohli

    Abstract: Does a Variational AutoEncoder (VAE) consistently encode typical samples generated from its decoder? This paper shows that the perhaps surprising answer to this question is 'No'; a (nominally trained) VAE does not necessarily amortize inference for typical samples that it is capable of generating. We study the implications of this behaviour on the learned representations and also the consequences…

    Submitted 7 December, 2020; originally announced December 2020.

    Comments: NeurIPS 2020

  9. arXiv:2010.01550  [pdf, other]

    cs.LG stat.AP stat.ML

    Intermittent Demand Forecasting with Renewal Processes

    Authors: Ali Caner Turkmen, Tim Januschowski, Yuyang Wang, Ali Taylan Cemgil

    Abstract: Intermittency is a common and challenging problem in demand forecasting. We introduce a new, unified framework for building intermittent demand forecasting models, which incorporates existing methods and allows them to be generalized in several directions. Our framework is based on extensions of well-established model-based methods to discrete-time renewal processes, which can parsimoniously account for pa…

    Submitted 4 October, 2020; originally announced October 2020.

  10. arXiv:1909.03280  [pdf, ps, other]

    physics.optics physics.comp-ph

    Gaussian Process and Design of Experiments for Surrogate Modeling of Optical Properties of Fractal Aggregates

    Authors: Ozan Burak Ericok, Atay Kaan Ozbek, Ali Taylan Cemgil, Hakan Erturk

    Abstract: A systematic approach based on the principles of supervised learning and design of experiments concepts is introduced to build a surrogate model for estimating the optical properties of fractal aggregates. The surrogate model is built on Gaussian process (GP) regression, and the input points for the GP regression are sampled with an adaptive sequential design algorithm. The covariance functions us…

    Submitted 7 September, 2019; originally announced September 2019.

    Comments: 19 pages, 8 figures

  11. arXiv:1908.05640  [pdf, other]

    stat.ML cs.LG

    A Bayesian Choice Model for Eliminating Feedback Loops

    Authors: Gökhan Çapan, Ilker Gündoğdu, Ali Caner Türkmen, Çağrı Sofuoğlu, Ali Taylan Cemgil

    Abstract: Self-reinforcing feedback loops in personalization systems are typically caused by users choosing from a limited set of alternatives presented systematically based on previous choices. We propose a Bayesian choice model built on Luce axioms that explicitly accounts for users' limited exposure to alternatives. Our model is fair: it does not impose negative bias towards unpresented alternatives, an…

    Submitted 21 August, 2019; v1 submitted 15 August, 2019; originally announced August 2019.

  12. arXiv:1903.04478  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Bayesian Allocation Model: Inference by Sequential Monte Carlo for Nonnegative Tensor Factorizations and Topic Models using Polya Urns

    Authors: Ali Taylan Cemgil, Mehmet Burak Kurutmaz, Sinan Yildirim, Melih Barsbey, Umut Simsekli

    Abstract: We introduce a dynamic generative model, the Bayesian allocation model (BAM), which establishes explicit connections between nonnegative tensor factorization (NTF), graphical models of discrete probability distributions and their Bayesian extensions, and topic models such as latent Dirichlet allocation. BAM is based on a Poisson process, whose events are marked by using a Bayesian network, whe…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: 70 pages, 16 figures

  13. arXiv:1812.01502  [pdf, other]

    stat.CO

    Parallelising Particle Filters with Butterfly Interactions

    Authors: Kari Heine, Nick Whiteley, A. Taylan Cemgil

    Abstract: The bootstrap particle filter (BPF) is the cornerstone of many popular algorithms used for solving inference problems involving time series that are observed through noisy measurements in a non-linear and non-Gaussian context. The long-term stability of the BPF arises from particle interactions, which in the context of modern parallel computing systems typically means that particle information needs to be…

    Submitted 4 December, 2018; originally announced December 2018.

    Comments: 35 pages, 4 figures

  14. arXiv:1810.13104  [pdf, other]

    cs.SD cs.LG eess.AS

    Audio Source Separation Using Variational Autoencoders and Weak Class Supervision

    Authors: Ertuğ Karamatlı, Ali Taylan Cemgil, Serap Kırbız

    Abstract: In this paper, we propose a source separation method that is trained by observing the mixtures and the class labels of the sources present in the mixture without any access to isolated sources. Since our method does not require source class labels for every time-frequency bin but only a single label for each source constituting the mixture signal, we call this scenario weak class supervision. W…

    Submitted 4 August, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

    Comments: Accepted version

    Journal ref: IEEE Signal Processing Letters 26 (2019) 1349-1353

  15. arXiv:1806.02617  [pdf, other]

    stat.ML cs.LG

    Asynchronous Stochastic Quasi-Newton MCMC for Non-Convex Optimization

    Authors: Umut Şimşekli, Çağatay Yıldız, Thanh Huy Nguyen, Gaël Richard, A. Taylan Cemgil

    Abstract: Recent studies have illustrated that stochastic gradient Markov Chain Monte Carlo techniques have a strong potential in non-convex optimization, where local and global convergence guarantees can be shown under certain conditions. Building on this recent theory, in this study, we develop an asynchronous-parallel stochastic L-BFGS algorithm for non-convex optimization. The proposed algorithm i…

    Submitted 7 June, 2018; originally announced June 2018.

    Comments: Published in the International Conference on Machine Learning (ICML 2018)

  16. arXiv:1712.02629  [pdf, ps, other]

    stat.ML cs.LG

    Differentially Private Variational Dropout

    Authors: Beyza Ermis, Ali Taylan Cemgil

    Abstract: Deep neural networks with their large number of parameters are highly flexible learning systems. The high flexibility in such networks brings with it some serious problems such as overfitting, and regularization is used to address this problem. A currently popular and effective regularization technique for controlling overfitting is dropout. Often, large data collections required for neural netwo…

    Submitted 16 December, 2017; v1 submitted 30 November, 2017; originally announced December 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1712.01665

  17. arXiv:1712.01665  [pdf, ps, other]

    stat.ML cs.LG

    Differentially Private Dropout

    Authors: Beyza Ermis, Ali Taylan Cemgil

    Abstract: Large data collections required for the training of neural networks often contain sensitive information such as the medical histories of patients, and the privacy of the training data must be preserved. In this paper, we introduce a dropout technique that provides an elegant Bayesian interpretation to dropout, and show that the intrinsic noise added, with the primary goal of regularization, can be…

    Submitted 30 November, 2017; originally announced December 2017.

    Comments: arXiv admin note: text overlap with arXiv:1611.00340 by other authors

  18. arXiv:1602.03442  [pdf, other]

    stat.ML

    Stochastic Quasi-Newton Langevin Monte Carlo

    Authors: Umut Şimşekli, Roland Badeau, A. Taylan Cemgil, Gaël Richard

    Abstract: Recently, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods have been proposed for scaling up Monte Carlo computations to large data problems. Whilst these approaches have proven useful in many applications, vanilla SG-MCMC might suffer from poor mixing rates when random variables exhibit strong couplings under the target densities or big scale differences. In this study, we propose a…

    Submitted 12 December, 2016; v1 submitted 10 February, 2016; originally announced February 2016.

    Comments: Published in ICML 2016, International Conference on Machine Learning 2016, New York, NY, USA

  19. arXiv:1509.01698  [pdf, other]

    stat.ML cs.LG

    HAMSI: A Parallel Incremental Optimization Algorithm Using Quadratic Approximations for Solving Partially Separable Problems

    Authors: Kamer Kaya, Figen Öztoprak, Ş. İlker Birbil, A. Taylan Cemgil, Umut Şimşekli, Nurdan Kuru, Hazal Koptagel, M. Kaan Öztürk

    Abstract: We propose HAMSI (Hessian Approximated Multiple Subsets Iteration), which is a provably convergent, second-order incremental algorithm for solving large-scale partially separable optimization problems. The algorithm is based on a local quadratic approximation, and hence, allows incorporating curvature information to speed up convergence. HAMSI is inherently parallel and it scales nicely with t…

    Submitted 4 August, 2017; v1 submitted 5 September, 2015; originally announced September 2015.

    Comments: The software is available at https://github.com/spartensor/hamsi-mf

  20. arXiv:1506.01418  [pdf, other]

    stat.ML

    Parallel Stochastic Gradient Markov Chain Monte Carlo for Matrix Factorisation Models

    Authors: Umut Şimşekli, Hazal Koptagel, Hakan Güldaş, A. Taylan Cemgil, Figen Öztoprak, Ş. İlker Birbil

    Abstract: For large matrix factorisation problems, we develop a distributed Markov Chain Monte Carlo (MCMC) method based on stochastic gradient Langevin dynamics (SGLD) that we call Parallel SGLD (PSGLD). PSGLD has very favourable scaling properties with increasing data size and is comparable in terms of computational requirements to optimisation methods based on stochastic gradient descent. PSGLD achieves…

    Submitted 28 September, 2015; v1 submitted 3 June, 2015; originally announced June 2015.

    Comments: 10 pages, 6 figures

  21. arXiv:1411.5876  [pdf, ps, other]

    stat.ME

    Butterfly resampling: asymptotics for particle filters with constrained interactions

    Authors: Kari Heine, Nick Whiteley, A. Taylan Cemgil, Hakan Guldas

    Abstract: We generalize the elementary mechanism of sampling with replacement $N$ times from a weighted population of size $N$, by introducing auxiliary variables and constraints on conditional independence characterised by modular congruence relations. Motivated by considerations of parallelism, a convergence study reveals how sparsity of the mechanism's conditional independence graph is related to fluctua…

    Submitted 21 November, 2014; originally announced November 2014.

    Comments: 29 pages, supplementary material (46 pages)

    MSC Class: 60F05; 60F99; 60G35

  22. arXiv:1410.6830  [pdf, ps, other]

    cs.CL cs.LG

    Clustering Words by Projection Entropy

    Authors: Işık Barış Fidaner, Ali Taylan Cemgil

    Abstract: We apply entropy agglomeration (EA), a recently introduced algorithm, to cluster the words of a literary text. EA is a greedy agglomerative procedure that minimizes projection entropy (PE), a function that can quantify the segmentedness of an element set. To apply it, the text is reduced to a feature allocation, a combinatorial object to represent the word occurrences in the text's paragraphs. The…

    Submitted 24 October, 2014; originally announced October 2014.

    Comments: Accepted to NIPS 2014 Modern ML+NLP Workshop: http://www.cs.cmu.edu/~apparikh/nips2014ml-nlp/

  23. arXiv:1409.8276  [pdf, other]

    cs.LG math.NA stat.ML

    A Bayesian Tensor Factorization Model via Variational Inference for Link Prediction

    Authors: Beyza Ermis, A. Taylan Cemgil

    Abstract: Probabilistic approaches for tensor factorization aim to extract meaningful structure from incomplete data by postulating low rank constraints. Recently, variational Bayesian (VB) inference techniques have successfully been applied to large scale models. This paper presents full Bayesian inference via VB on both single and coupled tensor factorization models. Our method can be run even for very la…

    Submitted 29 September, 2014; originally announced September 2014.

    Comments: arXiv admin note: substantial text overlap with arXiv:1409.8083

  24. arXiv:1409.8083  [pdf, other]

    stat.CO math.NA

    Variational Inference For Probabilistic Latent Tensor Factorization with KL Divergence

    Authors: Beyza Ermis, Y. Kenan Yılmaz, A. Taylan Cemgil, Evrim Acar

    Abstract: Probabilistic Latent Tensor Factorization (PLTF) is a recently proposed probabilistic framework for modelling multi-way data. Not only the common tensor factorization models but also any arbitrary tensor factorization structure can be realized by the PLTF framework. This paper presents full Bayesian inference via variational Bayes that facilitates more powerful modelling and allows more sophistica…

    Submitted 29 September, 2014; originally announced September 2014.

  25. arXiv:1401.2490  [pdf, ps, other]

    cs.LG stat.CO stat.ML

    An Online Expectation-Maximisation Algorithm for Nonnegative Matrix Factorisation Models

    Authors: Sinan Yildirim, A. Taylan Cemgil, Sumeetpal S. Singh

    Abstract: In this paper we formulate the nonnegative matrix factorisation (NMF) problem as a maximum likelihood estimation problem for hidden Markov models and propose online expectation-maximisation (EM) algorithms to estimate the NMF and the other unknown static parameters. We also propose a sequential Monte Carlo approximation of our online EM algorithm. We show the performance of the proposed method wit…

    Submitted 10 January, 2014; originally announced January 2014.

    Comments: 6 pages, 3 figures

    Journal ref: 16th IFAC Symposium on System Identification, 2012, Volume 16, Part 1

  26. arXiv:1310.0509  [pdf, ps, other]

    cs.LG stat.ML

    Summary Statistics for Partitionings and Feature Allocations

    Authors: Işık Barış Fidaner, Ali Taylan Cemgil

    Abstract: Infinite mixture models are commonly used for clustering. One can sample from the posterior of mixture assignments by Monte Carlo methods or find its maximum a posteriori solution by optimization. However, in some problems the posterior is diffuse and it is hard to interpret the sampled partitionings. In this paper, we introduce novel statistics based on block sizes for representing sample sets of…

    Submitted 25 November, 2013; v1 submitted 1 October, 2013; originally announced October 2013.

    Comments: Accepted to NIPS 2013: https://nips.cc/Conferences/2013/Program/event.php?ID=3763

  27. arXiv:1209.4280  [pdf, ps, other]

    stat.ML cs.IT math.ST

    Alpha/Beta Divergences and Tweedie Models

    Authors: Y. Kenan Yilmaz, A. Taylan Cemgil

    Abstract: We describe the underlying probabilistic interpretation of alpha and beta divergences. We first show that beta divergences are inherently tied to Tweedie distributions, a particular type of exponential family, known as exponential dispersion models. Starting from the variance function of a Tweedie model, we outline how to get alpha and beta divergences as special cases of Csiszár's $f$ and Bregman…

    Submitted 19 September, 2012; originally announced September 2012.

  28. arXiv:1208.6231  [pdf, other]

    cs.LG

    Link Prediction via Generalized Coupled Tensor Factorisation

    Authors: Beyza Ermiş, Evrim Acar, A. Taylan Cemgil

    Abstract: This study deals with the missing link prediction problem: the problem of predicting the existence of missing connections between entities of interest. We address link prediction using coupled analysis of relational datasets represented as heterogeneous data, i.e., datasets in the form of matrices and higher-order tensors. We propose to use an approach based on probabilistic interpretation of tens…

    Submitted 30 August, 2012; originally announced August 2012.

  29. Monte Carlo Methods for Tempo Tracking and Rhythm Quantization

    Authors: A. T. Cemgil, B. Kappen

    Abstract: We present a probabilistic generative model for timing deviations in expressive music performance. The structure of the proposed model is equivalent to a switching state space model. The switch variables correspond to discrete note locations as in a musical score. The continuous hidden variables denote the tempo. We formulate two well-known music recognition problems, namely tempo tracking and aut…

    Submitted 23 June, 2011; originally announced June 2011.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 18, pages 45-81, 2003