Showing 1–17 of 17 results for author: Berent, J

Searching in archive cs.
  1. arXiv:2402.15307  [pdf, other]

    cs.CV cs.AI cs.LG

    Representing Online Handwriting for Recognition in Large Vision-Language Models

    Authors: Anastasiia Fadeeva, Philippe Schlattner, Andrii Maksai, Mark Collier, Efi Kokiopoulou, Jesse Berent, Claudiu Musat

    Abstract: The adoption of tablets with touchscreens and styluses is increasing, and a key feature is converting handwriting to text, enabling search, indexing, and AI assistance. Meanwhile, vision-language models (VLMs) are now the go-to solution for image understanding, thanks to both their state-of-the-art performance across a variety of tasks and the simplicity of a unified approach to training, fine-tun…

    Submitted 23 February, 2024; originally announced February 2024.

  2. arXiv:2402.05804  [pdf, other]

    cs.CV cs.AI

    InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write

    Authors: Blagoj Mitrevski, Arina Rak, Julian Schnitzler, Chengkun Li, Andrii Maksai, Jesse Berent, Claudiu Musat

    Abstract: Digital note-taking is gaining popularity, offering a durable, editable, and easily indexable way of storing notes in the vectorized form, known as digital ink. However, a substantial gap remains between this way of note-taking and traditional pen-and-paper note-taking, a practice still favored by a vast majority. Our work, InkSight, aims to bridge the gap by empowering physical note-takers to eff…

    Submitted 20 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  3. arXiv:2305.16999  [pdf, other]

    cs.CV cs.AI cs.LG

    Three Towers: Flexible Contrastive Learning with Pretrained Image Models

    Authors: Jannik Kossen, Mark Collier, Basil Mustafa, Xiao Wang, Xiaohua Zhai, Lucas Beyer, Andreas Steiner, Jesse Berent, Rodolphe Jenatton, Efi Kokiopoulou

    Abstract: We introduce Three Towers (3T), a flexible method to improve the contrastive learning of vision-language models by incorporating pretrained image classifiers. While contrastive models are usually trained from scratch, LiT (Zhai et al., 2022) has recently shown performance gains from using pretrained classifier embeddings. However, LiT directly replaces the image tower with the frozen embeddings, e…

    Submitted 30 October, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted for publication at NeurIPS 2023

  4. arXiv:2303.01806  [pdf, other]

    cs.LG cs.CV

    When does Privileged Information Explain Away Label Noise?

    Authors: Guillermo Ortiz-Jimenez, Mark Collier, Anant Nawalgaria, Alexander D'Amour, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

    Abstract: Leveraging privileged information (PI), or features available during training but not at test time, has recently been shown to be an effective method for addressing label noise. However, the reasons for its effectiveness are not well understood. In this study, we investigate the role played by different properties of the PI in explaining away label noise. Through experiments on multiple datasets w…

    Submitted 1 June, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted ICML 2023, Honolulu

  5. arXiv:2301.12860  [pdf, other]

    cs.LG stat.ML

    Massively Scaling Heteroscedastic Classifiers

    Authors: Mark Collier, Rodolphe Jenatton, Basil Mustafa, Neil Houlsby, Jesse Berent, Effrosyni Kokiopoulou

    Abstract: Heteroscedastic classifiers, which learn a multivariate Gaussian distribution over prediction logits, have been shown to perform well on image classification problems with hundreds to thousands of classes. However, compared to standard classifiers, they introduce extra parameters that scale linearly with the number of classes. This makes them infeasible to apply to larger-scale problems. In additi…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted to ICLR 2023

  6. arXiv:2202.13794  [pdf, other]

    cs.AI

    Inkorrect: Online Handwriting Spelling Correction

    Authors: Andrii Maksai, Henry Rowley, Jesse Berent, Claudiu Musat

    Abstract: We introduce Inkorrect, a data- and label-efficient approach for online handwriting (Digital Ink) spelling correction - DISC. Unlike previous work, the proposed method does not require multiple samples from the same writer, or access to character level segmentation. We show that existing automatic evaluation metrics do not fully capture and are not correlated with the human perception of the quali…

    Submitted 28 February, 2022; originally announced February 2022.

  7. arXiv:2202.09244  [pdf, other]

    cs.LG

    Transfer and Marginalize: Explaining Away Label Noise with Privileged Information

    Authors: Mark Collier, Rodolphe Jenatton, Efi Kokiopoulou, Jesse Berent

    Abstract: Supervised learning datasets often have privileged information, in the form of features which are available at training time but are not available at test time, e.g. the ID of the annotator that provided the label. We argue that privileged information is useful for explaining away label noise, thereby reducing the harmful impact of noisy labels. We develop a simple and efficient method for supervis…

    Submitted 15 June, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: Accepted at ICML 2022, Baltimore

  8. arXiv:2110.02609  [pdf, other]

    stat.ML cs.LG

    Deep Classifiers with Label Noise Modeling and Distance Awareness

    Authors: Vincent Fortuin, Mark Collier, Florian Wenzel, James Allingham, Jeremiah Liu, Dustin Tran, Balaji Lakshminarayanan, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

    Abstract: Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications. While there have been many proposed methods that either focus on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distribution calibration, both of these types of uncert…

    Submitted 8 August, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Published in TMLR

  9. arXiv:2105.10305  [pdf, other]

    cs.LG cs.CV stat.ML

    Correlated Input-Dependent Label Noise in Large-Scale Image Classification

    Authors: Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

    Abstract: Large scale image classification datasets often contain noisy labels. We take a principled probabilistic approach to modelling input-dependent, also known as heteroscedastic, label noise in these datasets. We place a multivariate Normal distributed latent variable on the final hidden layer of a neural network classifier. The covariance matrix of this latent variable, models the aleatoric uncertain…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: Accepted as Oral at CVPR 2021

  10. arXiv:2009.04381  [pdf, other]

    cs.LG stat.ML

    Routing Networks with Co-training for Continual Learning

    Authors: Mark Collier, Efi Kokiopoulou, Andrea Gesmundo, Jesse Berent

    Abstract: The core challenge with continual learning is catastrophic forgetting, the phenomenon that when neural networks are trained on a sequence of tasks they rapidly forget previously learned tasks. It has been observed that catastrophic forgetting is most severe when tasks are dissimilar to each other. We propose the use of sparse routing networks for continual learning. For each input, these network a…

    Submitted 9 September, 2020; originally announced September 2020.

    Comments: Presented at ICML Workshop on Continual Learning 2020

  11. arXiv:2003.06778  [pdf, other]

    cs.LG stat.ML

    A Simple Probabilistic Method for Deep Classification under Input-Dependent Label Noise

    Authors: Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

    Abstract: Datasets with noisy labels are a common occurrence in practical applications of classification methods. We propose a simple probabilistic method for training deep classifiers under input-dependent (heteroscedastic) label noise. We assume an underlying heteroscedastic generative process for noisy labels. To make gradient based training feasible we use a temperature parameterized softmax as a smooth…

    Submitted 12 November, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

  12. arXiv:1911.11481  [pdf, other]

    cs.LG stat.ML

    Ranking architectures using meta-learning

    Authors: Alina Dubatovka, Efi Kokiopoulou, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent

    Abstract: Neural architecture search has recently attracted lots of research efforts as it promises to automate the manual design of neural networks. However, it requires a large amount of computing resources and in order to alleviate this, a performance prediction network has been recently proposed that enables efficient architecture search by forecasting the performance of candidate architectures, instead…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: NeurIPS 2019 Meta-Learning workshop

  13. arXiv:1910.04915  [pdf, other]

    cs.LG stat.ML

    Flexible Multi-task Networks by Learning Parameter Allocation

    Authors: Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent

    Abstract: This paper proposes a novel learning method for multi-task applications. Multi-task neural networks can learn to transfer knowledge across different tasks by using parameter sharing. However, sharing parameters between unrelated tasks can hurt performance. To address this issue, we propose a framework to learn fine-grained patterns of parameter sharing. Assuming that the network is composed of sev…

    Submitted 18 July, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

  14. arXiv:1907.10164  [pdf, other]

    cs.CV

    Cap2Det: Learning to Amplify Weak Caption Supervision for Object Detection

    Authors: Keren Ye, Mingda Zhang, Adriana Kovashka, Wei Li, Danfeng Qin, Jesse Berent

    Abstract: Learning to localize and name object instances is a fundamental problem in vision, but state-of-the-art approaches rely on expensive bounding box supervision. While weakly supervised detection (WSOD) methods relax the need for boxes to that of image-level annotations, even cheaper supervision is naturally available in the form of unstructured textual descriptions that users may freely provide when…

    Submitted 16 August, 2019; v1 submitted 23 July, 2019; originally announced July 2019.

    Comments: To appear in ICCV 2019

  15. arXiv:1902.05781  [pdf, other]

    cs.LG stat.ML

    Fast Task-Aware Architecture Inference

    Authors: Efi Kokiopoulou, Anja Hauth, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent

    Abstract: Neural architecture search has been shown to hold great promise towards the automation of deep learning. However, in spite of its potential, neural architecture search remains quite costly. To this point, we propose a novel gradient-based framework for efficient architecture search by sharing information across several tasks. We start by training many model architectures on several related (trainin…

    Submitted 15 February, 2019; originally announced February 2019.

  16. arXiv:1811.10080  [pdf, other]

    cs.CV

    Learning to discover and localize visual objects with open vocabulary

    Authors: Keren Ye, Mingda Zhang, Wei Li, Danfeng Qin, Adriana Kovashka, Jesse Berent

    Abstract: To alleviate the cost of obtaining accurate bounding boxes for training today's state-of-the-art object detection models, recent weakly supervised detection work has proposed techniques to learn from image-level labels. However, requiring discrete image-level labels is both restrictive and suboptimal. Real-world "supervision" usually consists of more unstructured text, such as captions. In this wo…

    Submitted 25 November, 2018; originally announced November 2018.

  17. arXiv:1705.05640  [pdf, other]

    cs.CV

    WebVision Challenge: Visual Learning and Understanding With Web Data

    Authors: Wen Li, Limin Wang, Wei Li, Eirikur Agustsson, Jesse Berent, Abhinav Gupta, Rahul Sukthankar, Luc Van Gool

    Abstract: We present the 2017 WebVision Challenge, a public image recognition challenge designed for deep learning based on web images without instance-level human annotation. Following the spirit of previous vision challenges, such as ILSVRC, Places2 and PASCAL VOC, which have played critical roles in the development of computer vision by contributing to the community with large scale annotated data for mo…

    Submitted 16 May, 2017; originally announced May 2017.

    Comments: project page: http://www.vision.ee.ethz.ch/webvision/