Showing 1–21 of 21 results for author: Kotlowski, W

  1. arXiv:2406.14743  [pdf, other]

    cs.LG stat.ML

    A General Online Algorithm for Optimizing Complex Performance Metrics

    Authors: Wojciech Kotłowski, Marek Wydmuch, Erik Schultheis, Rohit Babbar, Krzysztof Dembczyński

    Abstract: We consider sequential maximization of performance metrics that are general functions of a confusion matrix of a classifier (such as precision, F-measure, or G-mean). Such metrics are, in general, non-decomposable over individual instances, making their optimization very challenging. While they have been extensively studied under different frameworks in the batch setting, their analysis in the onl…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: This is the authors' version of the work accepted to ICML 2024

  2. arXiv:2403.02697  [pdf, other]

    stat.ML cs.LG

    Noise misleads rotation invariant algorithms on sparse targets

    Authors: Manfred K. Warmuth, Wojciech Kotłowski, Matt Jones, Ehsan Amid

    Abstract: It is well known that the class of rotation invariant algorithms are suboptimal even for learning sparse linear problems when the number of examples is below the "dimension" of the problem. This class includes any gradient descent trained neural net with a fully-connected input layer (initialized with a rotationally symmetric distribution). The simplest sparse problem is learning a single feature…

    Submitted 5 March, 2024; originally announced March 2024.

  3. arXiv:2401.16594  [pdf, other]

    cs.LG

    Consistent algorithms for multi-label classification with macro-at-$k$ metrics

    Authors: Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński

    Abstract: We consider the optimization of complex performance metrics in multi-label classification under the population utility framework. We mainly focus on metrics linearly decomposable into a sum of binary classification utilities applied separately to each label with an additional requirement of exactly $k$ labels predicted for each instance. These "macro-at-$k$" metrics possess desired properties for…

    Submitted 29 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: This is the authors' version of the work accepted to ICLR 2024; the final version of the paper, errors and typos corrected, and minor modifications to improve clarity

  4. arXiv:2311.05081  [pdf, other]

    cs.LG

    Generalized test utilities for long-tail performance in extreme multi-label classification

    Authors: Erik Schultheis, Marek Wydmuch, Wojciech Kotłowski, Rohit Babbar, Krzysztof Dembczyński

    Abstract: Extreme multi-label classification (XMLC) is the task of selecting a small subset of relevant labels from a very large set of possible labels. As such, it is characterized by long-tail labels, i.e., most labels have very few positive instances. With standard performance measures such as precision@k, a classifier can ignore tail labels and still report good performance. However, it is often argued…

    Submitted 17 January, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: This is the authors' version of the work accepted to NeurIPS 2023; the final version of the paper, errors and typos corrected, and minor modifications to improve clarity

  5. arXiv:2308.15395  [pdf, other]

    cs.LG q-bio.MN q-bio.QM

    The CausalBench challenge: A machine learning contest for gene network inference from single-cell perturbation data

    Authors: Mathieu Chevalley, Jacob Sackett-Sanders, Yusuf Roohani, Pascal Notin, Artemy Bakulin, Dariusz Brzezinski, Kaiwen Deng, Yuanfang Guan, Justin Hong, Michael Ibrahim, Wojciech Kotlowski, Marcin Kowiel, Panagiotis Misiakos, Achille Nazaret, Markus Püschel, Chris Wendler, Arash Mehrjou, Patrick Schwab

    Abstract: In drug discovery, mapping interactions between genes within cellular systems is a crucial early step. This helps formulate hypotheses regarding molecular mechanisms that could potentially be targeted by future medicines. The CausalBench Challenge was an initiative to invite the machine learning community to advance the state of the art in constructing gene-gene interaction networks. These network…

    Submitted 29 August, 2023; originally announced August 2023.

  6. arXiv:2202.06438  [pdf, other]

    cs.LG

    Learning from Randomly Initialized Neural Network Features

    Authors: Ehsan Amid, Rohan Anil, Wojciech Kotłowski, Manfred K. Warmuth

    Abstract: We present the surprising result that randomly initialized neural networks are good feature extractors in expectation. These random features correspond to finite-sample realizations of what we call Neural Network Prior Kernel (NNPK), which is inherently infinite-dimensional. We conduct ablations across multiple architectures of varying sizes as well as initializations and activation functions. Our…

    Submitted 13 February, 2022; originally announced February 2022.

  7. arXiv:2107.01881  [pdf, ps, other]

    cs.LG

    Robust Online Convex Optimization in the Presence of Outliers

    Authors: Tim van Erven, Sarah Sachs, Wouter M. Koolen, Wojciech Kotłowski

    Abstract: We consider online convex optimization when a number k of data points are outliers that may be corrupted. We model this by introducing the notion of robust regret, which measures the regret only on rounds that are not outliers. The aim for the learner is to achieve small robust regret, without knowing where the outliers are. If the outliers are chosen adversarially, we show that a simple filtering…

    Submitted 5 July, 2021; originally announced July 2021.

    Journal ref: Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:4174-4194, 2021

  8. arXiv:2010.08625  [pdf, other]

    cs.LG stat.ML

    A case where a spindly two-layer linear network whips any neural network with a fully connected input layer

    Authors: Manfred K. Warmuth, Wojciech Kotłowski, Ehsan Amid

    Abstract: It was conjectured that any neural network of any structure and arbitrary differentiable transfer functions at the nodes cannot learn the following problem sample efficiently when trained with gradient descent: The instances are the rows of a $d$-dimensional Hadamard matrix and the target is one of the features, i.e. very sparse. We essentially prove this conjecture: We show that after receiving a…

    Submitted 16 October, 2020; originally announced October 2020.

  9. arXiv:1905.12781  [pdf, other]

    cs.LG stat.ML

    Learning to Crawl

    Authors: Utkarsh Upadhyay, Robert Busa-Fekete, Wojciech Kotlowski, David Pal, Balazs Szorenyi

    Abstract: Web crawling is the problem of keeping a cache of webpages fresh, i.e., having the most recent copy available when a page is requested. This problem is usually coupled with the natural restriction that the bandwidth available to the web crawler is limited. The corresponding optimization problem was solved optimally by Azar et al. [2018] under the assumption that, for each webpage, both the elapsed…

    Submitted 22 November, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: Published at AAAI 2020

  10. arXiv:1902.07528  [pdf, other]

    cs.LG stat.ML

    Adaptive scale-invariant online algorithms for learning linear models

    Authors: Michał Kempka, Wojciech Kotłowski, Manfred K. Warmuth

    Abstract: We consider online learning with linear models, where the algorithm predicts on sequentially revealed instances (feature vectors), and is compared against the best linear function (comparator) in hindsight. Popular algorithms in this framework, such as Online Gradient Descent (OGD), have parameters (learning rates), which ideally should be tuned based on the scales of the features and the optimal…

    Submitted 20 February, 2019; originally announced February 2019.

  11. arXiv:1902.03035  [pdf, ps, other]

    cs.LG stat.ML

    Bandit Principal Component Analysis

    Authors: Wojciech Kotłowski, Gergely Neu

    Abstract: We consider a partial-feedback variant of the well-studied online PCA problem where a learner attempts to predict a sequence of $d$-dimensional vectors in terms of a quadratic loss, while only having limited feedback about the environment's choices. We focus on a natural notion of bandit feedback where the learner only observes the loss associated with its own prediction. Based on the classical ob…

    Submitted 8 February, 2019; originally announced February 2019.

  12. arXiv:1802.07543  [pdf, ps, other]

    stat.ML cs.LG

    The Many Faces of Exponential Weights in Online Learning

    Authors: Dirk van der Hoeven, Tim van Erven, Wojciech Kotłowski

    Abstract: A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods. Here we explore the alternative approach of putting Exponential Weights (EW) first. We show that many standard methods and their regret bounds then follow as a special case by plugging in suitabl…

    Submitted 5 June, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Journal ref: Proceedings of the 31st Conference On Learning Theory, PMLR 75:2067-2092, 2018

  13. arXiv:1708.07042  [pdf, ps, other]

    cs.LG stat.ML

    Scale-invariant unconstrained online learning

    Authors: Wojciech Kotłowski

    Abstract: We consider a variant of online convex optimization in which both the instances (input vectors) and the comparator (weight vector) are unconstrained. We exploit a natural scale invariance symmetry in our unconstrained setting: the predictions of the optimal comparator are invariant under any linear transformation of the instances. Our goal is to design online algorithms which also enjoy this prope…

    Submitted 23 August, 2017; originally announced August 2017.

    Comments: To appear in Proc. of the 28th International Conference on Algorithmic Learning Theory (ALT) 2017

  14. arXiv:1603.04190  [pdf, ps, other]

    cs.LG stat.ML

    Online Isotonic Regression

    Authors: Wojciech Kotłowski, Wouter M. Koolen, Alan Malek

    Abstract: We consider the online version of the isotonic regression problem. Given a set of linearly ordered points (e.g., on the real line), the learner must predict labels sequentially at adversarially chosen positions and is evaluated by her total squared loss compared against the best isotonic (non-decreasing) function in hindsight. We survey several standard online learning algorithms and show that non…

    Submitted 7 October, 2016; v1 submitted 14 March, 2016; originally announced March 2016.

    Comments: 25 pages

  15. arXiv:1506.04855  [pdf, ps, other]

    cs.LG stat.ML

    PCA with Gaussian perturbations

    Authors: Wojciech Kotłowski, Manfred K. Warmuth

    Abstract: Most of machine learning deals with vector parameters. Ideally we would like to take higher order information into account and make use of matrix or even tensor parameters. However the resulting algorithms are usually inefficient. Here we address on-line learning with matrix parameters. It is often easy to obtain online algorithm with good generalization performance if you eigendecompose the curre…

    Submitted 23 July, 2015; v1 submitted 16 June, 2015; originally announced June 2015.

  16. arXiv:1504.07272  [pdf, other]

    cs.LG

    Surrogate regret bounds for generalized classification performance metrics

    Authors: Wojciech Kotłowski, Krzysztof Dembczyński

    Abstract: We consider optimization of generalized performance metrics for binary classification by means of surrogate losses. We focus on a class of metrics, which are linear-fractional functions of the false positive and false negative rates (examples of which include $F_β$-measure, Jaccard similarity coefficient, AM measure, and many others). Our analysis concerns the following two-step procedure. First,…

    Submitted 7 October, 2016; v1 submitted 27 April, 2015; originally announced April 2015.

    Comments: 22 pages

  17. arXiv:1412.2106  [pdf, ps, other]

    cs.LG

    Consistent optimization of AMS by logistic loss minimization

    Authors: Wojciech Kotłowski

    Abstract: In this paper, we theoretically justify an approach popular among participants of the Higgs Boson Machine Learning Challenge to optimize approximate median significance (AMS). The approach is based on the following two-stage procedure. First, a real-valued function is learned by minimizing a surrogate loss for binary classification, such as logistic loss, on the training sample. Then, a threshold…

    Submitted 5 December, 2014; originally announced December 2014.

    Comments: 9 pages, HEPML workshop at NIPS 2014

  18. arXiv:1306.3895  [pdf, ps, other]

    cs.LG

    On-line PCA with Optimal Regrets

    Authors: Jiazhong Nie, Wojciech Kotlowski, Manfred K. Warmuth

    Abstract: We carefully investigate the on-line version of PCA, where in each trial a learning algorithm plays a k-dimensional subspace, and suffers the compression loss on the next instance when projected into the chosen subspace. In this setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and Exponentiated Gradient (EG). We show that both algorithms are essentially optimal in the wors…

    Submitted 9 May, 2014; v1 submitted 17 June, 2013; originally announced June 2013.

  19. arXiv:1305.4324  [pdf, ps, other]

    cs.LG stat.ML

    Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families

    Authors: Peter Bartlett, Peter Grunwald, Peter Harremoes, Fares Hedayati, Wojciech Kotlowski

    Abstract: We study online learning under logarithmic loss with regular parametric models. Hedayati and Bartlett (2012b) showed that a Bayesian prediction strategy with Jeffreys prior and sequential normalized maximum likelihood (SNML) coincide and are optimal if and only if the latter is exchangeable, and if and only if the optimal strategy can be calculated without knowing the time horizon in advance. They…

    Submitted 19 May, 2013; originally announced May 2013.

    Comments: 23 pages

  20. arXiv:1206.6401  [pdf]

    cs.LG stat.ML

    Consistent Multilabel Ranking through Univariate Losses

    Authors: Krzysztof Dembczynski, Wojciech Kotlowski, Eyke Huellermeier

    Abstract: We consider the problem of rank loss minimization in the setting of multilabel classification, which is usually tackled by means of convex surrogate losses defined on pairs of labels. Very recently, this approach was put into question by a negative result showing that commonly used pairwise surrogate losses, such as exponential and logistic losses, are inconsistent. In this paper, we show a positi…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

  21. arXiv:1002.0757  [pdf, ps, other]

    cs.IT cs.LG math.ST

    Prequential Plug-In Codes that Achieve Optimal Redundancy Rates even if the Model is Wrong

    Authors: Peter Grünwald, Wojciech Kotłowski

    Abstract: We analyse the prequential plug-in codes relative to one-parameter exponential families M. We show that if data are sampled i.i.d. from some distribution outside M, then the redundancy of any plug-in prequential code grows at rate larger than 1/2 ln(n) in the worst case. This means that plug-in codes, such as the Rissanen-Dawid ML code, may behave inferior to other important universal codes such…

    Submitted 3 February, 2010; originally announced February 2010.