Showing 1–11 of 11 results for author: Alaniz, S

  1. arXiv:2312.03759 [pdf, ps, other]

    cs.CL cs.AI cs.CY cs.DL

    How should the advent of large language models affect the practice of science?

    Authors: Marcel Binz, Stephan Alaniz, Adina Roskies, Balazs Aczel, Carl T. Bergstrom, Colin Allen, Daniel Schad, Dirk Wulff, Jevin D. West, Qiong Zhang, Richard M. Shiffrin, Samuel J. Gershman, Ven Popov, Emily M. Bender, Marco Marelli, Matthew M. Botvinick, Zeynep Akata, Eric Schulz

    Abstract: Large language models (LLMs) are being increasingly incorporated into scientific workflows. However, we have yet to fully grasp the implications of this integration. How should the advent of large language models affect the practice of science? For this opinion piece, we have invited four diverse groups of scientists to reflect on this query, sharing their perspectives and engaging in debate. Schu…

    Submitted 5 December, 2023; originally announced December 2023.

  2. arXiv:2309.03173 [pdf, other]

    cs.CV

    PDiscoNet: Semantically consistent part discovery for fine-grained recognition

    Authors: Robert van der Klis, Stephan Alaniz, Massimiliano Mancini, Cassio F. Dantas, Dino Ienco, Zeynep Akata, Diego Marcos

    Abstract: Fine-grained classification often requires recognizing specific object parts, such as beak shape and wing patterns for birds. Encouraging a fine-grained classification model to first detect such parts and then using them to infer the class could help us gauge whether the model is indeed looking at the right details better than with interpretability methods that provide a single attribution map. We…

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: 9 pages, 8 figures, ICCV

  3. arXiv:2309.02102 [pdf, other]

    cs.CV cs.AI cs.LG

    Iterative Superquadric Recomposition of 3D Objects from Multiple Views

    Authors: Stephan Alaniz, Massimiliano Mancini, Zeynep Akata

    Abstract: Humans are good at recomposing novel objects, i.e. they can identify commonalities between unknown objects from general structure to finer detail, an ability difficult to replicate by machines. We propose a framework, ISCO, to recompose an object using 3D superquadrics as semantic parts directly from 2D views without training a model that uses 3D supervision. To achieve this, we optimize the super…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted at ICCV 2023

  4. arXiv:2309.01617 [pdf, other]

    cs.CV cs.AI cs.LG

    DeViL: Decoding Vision features into Language

    Authors: Meghal Dani, Isabel Rio-Torto, Stephan Alaniz, Zeynep Akata

    Abstract: Post-hoc explanation methods have often been criticised for abstracting away the decision-making process of deep neural networks. In this work, we would like to provide natural language descriptions for what different layers of a vision backbone have learned. Our DeViL method decodes vision features into language, not only highlighting the attribution locations but also generating textual descript…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: Accepted at GCPR 2023 (Oral)

  5. arXiv:2305.14930 [pdf, other]

    cs.AI cs.CL cs.LG

    In-Context Impersonation Reveals Large Language Models' Strengths and Biases

    Authors: Leonard Salewski, Stephan Alaniz, Isabel Rio-Torto, Eric Schulz, Zeynep Akata

    Abstract: In everyday conversations, humans can take on different roles and adapt their vocabulary to their chosen roles. We explore whether LLMs can take on, that is impersonate, different roles when they generate text in-context. We ask LLMs to assume different personas before solving vision and language tasks. We do this by prefixing the prompt with a persona that is associated either with a social ident…

    Submitted 26 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Published in NeurIPS 2023 (Spotlight)

  6. arXiv:2209.02536 [pdf, other]

    cs.CV cs.AI

    Semantic Image Synthesis with Semantically Coupled VQ-Model

    Authors: Stephan Alaniz, Thomas Hummel, Zeynep Akata

    Abstract: Semantic image synthesis enables control over unconditional image generation by allowing guidance on what is being generated. We conditionally synthesize the latent space from a vector quantized model (VQ-model) pre-trained to autoencode images. Instead of training an autoregressive Transformer on separately learned conditioning latents and image latents, we find that jointly learning the conditio…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: ICLR 2022 DGM4HSD

  7. arXiv:2207.13543 [pdf, other]

    cs.CV cs.AI

    Abstracting Sketches through Simple Primitives

    Authors: Stephan Alaniz, Massimiliano Mancini, Anjan Dutta, Diego Marcos, Zeynep Akata

    Abstract: Humans show high-level of abstraction capabilities in games that require quickly communicating object information. They decompose the message content into multiple parts and communicate them in an interpretable protocol. Toward equipping machines with such capabilities, we propose the Primitive-based Sketch Abstraction task where the goal is to represent sketches using a fixed set of drawing primi…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: European Conference on Computer Vision (ECCV) 2022

  8. arXiv:2206.06404 [pdf, other]

    cs.CV cs.AI cs.LG

    Compositional Mixture Representations for Vision and Text

    Authors: Stephan Alaniz, Marco Federici, Zeynep Akata

    Abstract: Learning a common representation space between vision and language allows deep networks to relate objects in the image to the corresponding semantic meaning. We present a model that learns a shared Gaussian mixture representation imposing the compositionality of the text onto the visual domain without having explicit location supervision. By combining the spatial transformer with a representation…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: Workshop on Learning with Limited Labelled Data for Image and Video Understanding (L3D-IVU), CVPR 2022

  9. arXiv:1910.04872 [pdf, other]

    cs.AI

    Modeling Conceptual Understanding in Image Reference Games

    Authors: Rodolfo Corona, Stephan Alaniz, Zeynep Akata

    Abstract: An agent who interacts with a wide population of other agents needs to be aware that there may be variations in their understanding of the world. Furthermore, the machinery which they use to perceive may be inherently different, as is the case between humans and machines. In this work, we present both an image reference game between a speaker and a population of listeners where reasoning about the…

    Submitted 19 November, 2019; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: Published in NeurIPS 2019

  10. arXiv:1902.01780 [pdf, other]

    cs.LG cs.AI

    Learning Decision Trees Recurrently Through Communication

    Authors: Stephan Alaniz, Diego Marcos, Bernt Schiele, Zeynep Akata

    Abstract: Integrated interpretability without sacrificing the prediction accuracy of decision making algorithms has the potential of greatly improving their value to the user. Instead of assigning a label to an image directly, we propose to learn iterative binary sub-decisions, inducing sparsity and transparency in the decision making process. The key aspect of our model is its ability to build a decision t…

    Submitted 11 April, 2021; v1 submitted 5 February, 2019; originally announced February 2019.

    Comments: Accepted in IEEE CVPR 2021

  11. arXiv:1803.08456 [pdf, other]

    cs.AI cs.LG stat.ML

    Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft

    Authors: Stephan Alaniz

    Abstract: Deep reinforcement learning has been successfully applied to several visual-input tasks using model-free methods. In this paper, we propose a model-based approach that combines learning a DNN-based transition model with Monte Carlo tree search to solve a block-placing task in Minecraft. Our learned transition model predicts the next frame and the rewards one step ahead given the last four frames o…

    Submitted 22 March, 2018; originally announced March 2018.

    Comments: The 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) 2017