Showing 1–17 of 17 results for author: Weissenborn, D

  1. arXiv:2205.06230 [pdf, other]

    cs.CV

    Simple Open-Vocabulary Object Detection with Vision Transformers

    Authors: Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby

    Abstract: Combining simple architectures with large-scale pre-training has led to massive improvements in image classification. For object detection, pre-training and scaling approaches are less well established, especially in the long-tailed and open-vocabulary setting, where training data is relatively scarce. In this paper, we propose a strong recipe for transferring image-text models to open-vocabulary…

    Submitted 20 July, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: ECCV 2022 camera-ready version

  2. arXiv:2104.03059 [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Differentiable Patch Selection for Image Recognition

    Authors: Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner

    Abstract: Neural Networks require large amounts of memory and compute to process high resolution images, even when only a small part of the image is actually informative for the task at hand. We propose a method based on a differentiable Top-K operator to select the most relevant parts of the input to efficiently process high resolution images. Our method may be interfaced with any downstream neural network…

    Submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021. Code available at https://github.com/google-research/google-research/tree/master/ptopk_patch_selection/

  3. arXiv:2102.04432 [pdf, other]

    cs.CV cs.AI cs.LG

    Colorization Transformer

    Authors: Manoj Kumar, Dirk Weissenborn, Nal Kalchbrenner

    Abstract: We present the Colorization Transformer, a novel approach for diverse high fidelity image colorization based on self-attention. Given a grayscale image, the colorization proceeds in three steps. We first use a conditional autoregressive transformer to produce a low resolution coarse coloring of the grayscale image. Our architecture adopts conditional transformer layers to effectively condition gra…

    Submitted 7 March, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: ICLR 2021 Camera Ready. See https://openreview.net/forum?id=5NA1PinlGFu for more details

  4. arXiv:2010.11929 [pdf, other]

    cs.CV cs.AI cs.LG

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Authors: Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby

    Abstract: While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not nece…

    Submitted 3 June, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Fine-tuning code and pre-trained models are available at https://github.com/google-research/vision_transformer. ICLR camera-ready version with 2 small modifications: 1) Added a discussion of CLS vs GAP classifier in the appendix, 2) Fixed an error in exaFLOPs computation in Figure 5 and Table 6 (relative performance of models is basically not affected)

  5. arXiv:2006.15055 [pdf, other]

    cs.LG cs.CV stat.ML

    Object-Centric Learning with Slot Attention

    Authors: Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

    Abstract: Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep learning approaches learn distributed representations that do not capture the compositional properties of natural scenes. In this paper, we present the Slot Attention module, an architectural component that interfaces with pe…

    Submitted 14 October, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020. Code available at https://github.com/google-research/google-research/tree/master/slot_attention

  6. arXiv:1912.12180 [pdf, other]

    cs.CV

    Axial Attention in Multidimensional Transformers

    Authors: Jonathan Ho, Nal Kalchbrenner, Dirk Weissenborn, Tim Salimans

    Abstract: We propose Axial Transformers, a self-attention-based autoregressive model for images and other data organized as high dimensional tensors. Existing autoregressive models either suffer from excessively large computational resource requirements for high dimensional data, or make compromises in terms of distribution expressiveness or ease of implementation in order to decrease resource requirements.…

    Submitted 20 December, 2019; originally announced December 2019.

    Comments: 10 pages

  7. arXiv:1906.02634 [pdf, other]

    cs.CV cs.AI cs.LG

    Scaling Autoregressive Video Models

    Authors: Dirk Weissenborn, Oscar Täckström, Jakob Uszkoreit

    Abstract: Due to the statistical complexity of video, the high degree of inherent stochasticity, and the sheer amount of data, generating natural video remains a challenging task. State-of-the-art video generation models often attempt to address these issues by combining sometimes complex, usually video-specific neural network architectures, latent variable models, adversarial training and a range of other…

    Submitted 10 February, 2020; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: International Conference on Learning Representations (ICLR) 2020

  8. arXiv:1806.08727 [pdf, ps, other]

    cs.CL cs.LG stat.ML

    Jack the Reader - A Machine Reading Framework

    Authors: Dirk Weissenborn, Pasquale Minervini, Tim Dettmers, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Pontus Stenetorp, Sebastian Riedel

    Abstract: Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions. Providing a set of useful primitives operating in a single framework of relat…

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2018), System Demonstrations

  9. arXiv:1805.01646 [pdf, ps, other]

    cs.CL

    Cross-lingual Candidate Search for Biomedical Concept Normalization

    Authors: Roland Roller, Madeleine Kittner, Dirk Weissenborn, Ulf Leser

    Abstract: Biomedical concept normalization links concept mentions in texts to a semantically equivalent concept in a biomedical knowledge base. This task is challenging as concepts can have different expressions in natural languages, e.g. paraphrases, which are not necessarily all present in the knowledge base. Concept normalization of non-English biomedical text is even more challenging as non-English reso…

    Submitted 4 May, 2018; originally announced May 2018.

  10. arXiv:1706.08568 [pdf, other]

    cs.CL cs.AI cs.NE

    Neural Question Answering at BioASQ 5B

    Authors: Georg Wiese, Dirk Weissenborn, Mariana Neves

    Abstract: This paper describes our submission to the 2017 BioASQ challenge. We participated in Task B, Phase B which is concerned with biomedical question answering (QA). We focus on factoid and list question, using an extractive QA model, that is, we restrict our system to output substrings of the provided text snippets. At the core of our system, we use FastQA, a state-of-the-art neural QA system. We exte…

    Submitted 26 June, 2017; originally announced June 2017.

  11. arXiv:1706.03610 [pdf, other]

    cs.CL cs.AI cs.NE

    Neural Domain Adaptation for Biomedical Question Answering

    Authors: Georg Wiese, Dirk Weissenborn, Mariana Neves

    Abstract: Factoid question answering (QA) has recently benefited from the development of deep learning (DL) systems. Neural network models outperform traditional approaches in domains where large datasets exist, such as SQuAD (ca. 100,000 questions) for Wikipedia articles. However, these systems have not yet been applied to QA in more specific domains, such as biomedicine, because datasets are generally too…

    Submitted 15 June, 2017; v1 submitted 12 June, 2017; originally announced June 2017.

  12. arXiv:1706.02596 [pdf, other]

    cs.CL cs.AI cs.NE

    Dynamic Integration of Background Knowledge in Neural NLU Systems

    Authors: Dirk Weissenborn, Tomáš Kočiský, Chris Dyer

    Abstract: Common-sense and background knowledge is required to understand natural language, but in most neural natural language understanding (NLU) systems, this knowledge must be acquired from training corpora during learning, and then it is static at test time. We introduce a new architecture for the dynamic integration of explicit background knowledge in NLU models. A general-purpose reading module reads…

    Submitted 21 August, 2018; v1 submitted 8 June, 2017; originally announced June 2017.

  13. arXiv:1703.04816 [pdf, other]

    cs.CL cs.AI cs.NE

    Making Neural QA as Simple as Possible but not Simpler

    Authors: Dirk Weissenborn, Georg Wiese, Laura Seiffe

    Abstract: Recent development of large-scale question answering (QA) datasets triggered a substantial amount of research into end-to-end neural architectures for QA. Increasingly complex systems have been conceived without comparison to simpler neural baseline systems that would justify their complexity. In this work, we propose a simple heuristic that guides the development of neural baseline systems for th…

    Submitted 8 June, 2017; v1 submitted 14 March, 2017; originally announced March 2017.

  14. arXiv:1609.00626 [pdf, other]

    cs.CL stat.AP

    SynsetRank: Degree-adjusted Random Walk for Relation Identification

    Authors: Shinichi Nakajima, Sebastian Krause, Dirk Weissenborn, Sven Schmeier, Nico Goernitz, Feiyu Xu

    Abstract: In relation extraction, a key process is to obtain good detectors that find relevant sentences describing the target relation. To minimize the necessity of labeled data for refining detectors, previous work successfully made use of BabelNet, a semantic graph structure expressing relationships between synsets, as side information or prior knowledge. The goal of this paper is to enhance the use of g…

    Submitted 15 September, 2016; v1 submitted 2 September, 2016; originally announced September 2016.

  15. arXiv:1607.03316 [pdf, other]

    cs.CL cs.NE

    Separating Answers from Queries for Neural Reading Comprehension

    Authors: Dirk Weissenborn

    Abstract: We present a novel neural architecture for answering queries, designed to optimally leverage explicit support in the form of query-answer memories. Our model is able to refine and update a given query while separately accumulating evidence for predicting the answer. Its architecture reflects this separation with dedicated embedding matrices and loosely connected information pathways (modules) for…

    Submitted 27 September, 2016; v1 submitted 12 July, 2016; originally announced July 2016.

  16. arXiv:1606.03864 [pdf, other]

    cs.NE cs.AI cs.CL cs.LG

    Neural Associative Memory for Dual-Sequence Modeling

    Authors: Dirk Weissenborn

    Abstract: Many important NLP problems can be posed as dual-sequence or sequence-to-sequence modeling tasks. Recent advances in building end-to-end neural architectures have been highly successful in solving such tasks. In this work we propose a new architecture for dual-sequence modeling that is based on associative memory. We derive AM-RNNs, a recurrent associative memory (AM) which augments generic recurr…

    Submitted 14 June, 2016; v1 submitted 13 June, 2016; originally announced June 2016.

    Comments: To appear in RepL4NLP at ACL 2016

  17. arXiv:1606.03002 [pdf, other]

    cs.NE cs.AI cs.CL

    MuFuRU: The Multi-Function Recurrent Unit

    Authors: Dirk Weissenborn, Tim Rocktäschel

    Abstract: Recurrent neural networks such as the GRU and LSTM found wide adoption in natural language processing and achieve state-of-the-art results for many tasks. These models are characterized by a memory state that can be written to and read from by applying gated composition operations to the current input and the previous state. However, they only cover a small subset of potentially useful composition…

    Submitted 9 June, 2016; originally announced June 2016.