Skip to main content

Showing 1–21 of 21 results for author: Mitrović, J

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.16674  [pdf, other

    cs.CL

    Computational Approaches to the Detection of Lesser-Known Rhetorical Figures: A Systematic Survey and Research Challenges

    Authors: Ramona Kühn, Jelena Mitrović, Michael Granitzer

    Abstract: Rhetorical figures play a major role in our everyday communication as they make text more interesting, more memorable, or more persuasive. Therefore, it is important to computationally detect rhetorical figures to fully understand the meaning of a text. We provide a comprehensive overview of computational approaches to lesser-known rhetorical figures. We explore the linguistic and computational pe… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Submitted to ACM Computing Surveys. 35 pages

  2. arXiv:2403.07750  [pdf, other

    cs.CV cs.AI

    Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings

    Authors: Sahand Sharifzadeh, Christos Kaplanis, Shreya Pathak, Dharshan Kumaran, Anastasija Ilic, Jovana Mitrovic, Charles Blundell, Andrea Banino

    Abstract: The creation of high-quality human-labeled image-caption datasets presents a significant bottleneck in the development of Visual-Language Models (VLMs). In this work, we investigate an approach that leverages the strengths of Large Language Models (LLMs) and image generation models to create synthetic image-text pairs for efficient and effective VLM training. Our method employs a pretrained text-t… ▽ More

    Submitted 7 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: 9 pages, 6 figures

  3. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  4. arXiv:2401.09865  [pdf, other

    cs.CV cs.AI cs.LG

    Improving fine-grained understanding in image-text pre-training

    Authors: Ioana Bica, Anastasija Ilić, Matthias Bauer, Goker Erdogan, Matko Bošnjak, Christos Kaplanis, Alexey A. Gritsenko, Matthias Minderer, Charles Blundell, Razvan Pascanu, Jovana Mitrović

    Abstract: We introduce SPARse Fine-grained Contrastive Alignment (SPARC), a simple method for pretraining more fine-grained multimodal representations from image-text pairs. Given that multiple image patches often correspond to single words, we propose to learn a grou** of image patches for every token in the caption. To achieve this, we use a sparse similarity metric between image patches and language to… ▽ More

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: 26 pages

  5. Technical Report: Impact of Position Bias on Language Models in Token Classification

    Authors: Mehdi Ben Amor, Michael Granitzer, Jelena Mitrović

    Abstract: Language Models (LMs) have shown state-of-the-art performance in Natural Language Processing (NLP) tasks. Downstream tasks such as Named Entity Recognition (NER) or Part-of-Speech (POS) tagging are known to suffer from data imbalance issues, particularly regarding the ratio of positive to negative examples and class disparities. This paper investigates an often-overlooked issue of encoder models,… ▽ More

    Submitted 11 April, 2024; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: Updated content of the preprint

  6. A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents

    Authors: Norman Meuschke, Apurva Jagdale, Timo Spinde, Jelena Mitrović, Bela Gipp

    Abstract: Extracting information from academic PDF documents is crucial for numerous indexing, retrieval, and analysis use cases. Choosing the best tool to extract specific content elements is difficult because many, technically diverse tools are available, but recent performance benchmarks are rare. Moreover, such benchmarks typically cover only a few content elements like header metadata or bibliographic… ▽ More

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: iConference 2023

  7. German BERT Model for Legal Named Entity Recognition

    Authors: Harshil Darji, Jelena Mitrović, Michael Granitzer

    Abstract: The use of BERT, one of the most popular language models, has led to improvements in many Natural Language Processing (NLP) tasks. One such task is Named Entity Recognition (NER) i.e. automatic identification of named entities such as location, person, organization, etc. from a given text. It is also an important base step for many NLP tasks such as information extraction and argumentation mining.… ▽ More

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: Presented at ICAART 2023

    Journal ref: Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART (2023) 723-728

  8. arXiv:2302.10258  [pdf, other

    cs.LG cs.AI stat.ME

    Neural Algorithmic Reasoning with Causal Regularisation

    Authors: Beatrice Bevilacqua, Kyriacos Nikiforou, Borja Ibarz, Ioana Bica, Michela Paganini, Charles Blundell, Jovana Mitrovic, Petar Veličković

    Abstract: Recent work on neural algorithmic reasoning has investigated the reasoning capabilities of neural networks, effectively demonstrating they can learn to execute classical algorithms on unseen data coming from the train distribution. However, the performance of existing neural reasoners significantly degrades on out-of-distribution (OOD) test data, where inputs have larger sizes. In this work, we ma… ▽ More

    Submitted 3 July, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: ICML 2023, Camera Ready; 17 pages, 7 figures

  9. arXiv:2301.05158  [pdf, other

    cs.CV cs.AI cs.LG

    SemPPL: Predicting pseudo-labels for better contrastive representations

    Authors: Matko Bošnjak, Pierre H. Richemond, Nenad Tomasev, Florian Strub, Jacob C. Walker, Felix Hill, Lars Holger Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic

    Abstract: Learning from large amounts of unsupervised data and a small amount of supervision is an important open problem in computer vision. We propose a new semi-supervised learning method, Semantic Positives via Pseudo-Labels (SemPPL), that combines labelled and unlabelled data to learn informative representations. Our method extends self-supervised contrastive learning -- where representations are shape… ▽ More

    Submitted 10 January, 2024; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: Published as a conference paper at ICLR 2023. For checkpoints and source code see https://github.com/google-deepmind/semppl

  10. Exploiting Transformer-based Multitask Learning for the Detection of Media Bias in News Articles

    Authors: Timo Spinde, Jan-David Krieger, Terry Ruas, Jelena Mitrović, Franz Götz-Hahn, Akiko Aizawa, Bela Gipp

    Abstract: Media has a substantial impact on the public perception of events. A one-sided or polarizing perspective on any topic is usually described as media bias. One of the ways how bias in news articles can be introduced is by altering word choice. Biased word choices are not always obvious, nor do they exhibit high context-dependency. Hence, detecting bias is often difficult. We propose a Transformer-ba… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

    Journal ref: Proceedings of the iConference 2022

  11. arXiv:2201.05119  [pdf, other

    cs.CV cs.LG stat.ML

    Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?

    Authors: Nenad Tomasev, Ioana Bica, Brian McWilliams, Lars Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic

    Abstract: Despite recent progress made by self-supervised methods in representation learning with residual networks, they still underperform supervised learning on the ImageNet classification benchmark, limiting their applicability in performance-critical settings. Building on prior theoretical insights from ReLIC [Mitrovic et al., 2021], we include additional inductive biases into self-supervised learning.… ▽ More

    Submitted 3 November, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

  12. arXiv:2107.05431  [pdf, other

    cs.LG cs.AI

    CoBERL: Contrastive BERT for Reinforcement Learning

    Authors: Andrea Banino, Adrià Puidomenech Badia, Jacob Walker, Tim Scholtes, Jovana Mitrovic, Charles Blundell

    Abstract: Many reinforcement learning (RL) agents require a large amount of experience to solve tasks. We propose Contrastive BERT for RL (CoBERL), an agent that combines a new contrastive loss and a hybrid LSTM-transformer architecture to tackle the challenge of improving data efficiency. CoBERL enables efficient, robust learning from pixels across a wide range of domains. We use bidirectional masked predi… ▽ More

    Submitted 22 February, 2022; v1 submitted 12 July, 2021; originally announced July 2021.

    Comments: 9 pages, 2 figures, 6 tables

  13. arXiv:2105.13327  [pdf, other

    cs.LG

    Encoders and Ensembles for Task-Free Continual Learning

    Authors: Murray Shanahan, Christos Kaplanis, Jovana Mitrović

    Abstract: We present an architecture that is effective for continual learning in an especially demanding setting, where task boundaries do not exist or are unknown, and where classes have to be learned online (with each example presented only once). To obtain good performance under these constraints, while mitigating catastrophic forgetting, we exploit recent advances in contrastive, self-supervised learnin… ▽ More

    Submitted 7 October, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

  14. arXiv:2010.12472  [pdf, other

    cs.CL

    HateBERT: Retraining BERT for Abusive Language Detection in English

    Authors: Tommaso Caselli, Valerio Basile, Jelena Mitrović, Michael Granitzer

    Abstract: In this paper, we introduce HateBERT, a re-trained BERT model for abusive language detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit comments in English from communities banned for being offensive, abusive, or hateful that we have collected and made available to the public. We present the results of a detailed comparison between a general pre-trained language mo… ▽ More

    Submitted 4 February, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

  15. arXiv:2010.07922  [pdf, other

    cs.LG cs.CV stat.ML

    Representation Learning via Invariant Causal Mechanisms

    Authors: Jovana Mitrovic, Brian McWilliams, Jacob Walker, Lars Buesing, Charles Blundell

    Abstract: Self-supervised learning has emerged as a strategy to reduce the reliance on costly supervised signal by pretraining representations only using unlabeled data. These methods combine heuristic proxy classification tasks with data augmentations and have achieved significant success, but our theoretical understanding of this success remains limited. In this paper we analyze self-supervised representa… ▽ More

    Submitted 15 October, 2020; originally announced October 2020.

  16. arXiv:2008.09301  [pdf, other

    stat.ML cs.LG

    Amortized learning of neural causal representations

    Authors: Nan Rosemary Ke, Jane. X. Wang, Jovana Mitrovic, Martin Szummer, Danilo J. Rezende

    Abstract: Causal models can compactly and efficiently encode the data-generating process under all interventions and hence may generalize better under changes in distribution. These models are often represented as Bayesian networks and learning them scales poorly with the number of variables. Moreover, these approaches cannot leverage previously learned knowledge to help with learning new causal models. In… ▽ More

    Submitted 21 August, 2020; originally announced August 2020.

    Comments: ICLR 2020 causal learning for decision making workshop

  17. arXiv:2002.02836  [pdf, other

    cs.LG cs.AI stat.ML

    Causally Correct Partial Models for Reinforcement Learning

    Authors: Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing

    Abstract: In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this pa… ▽ More

    Submitted 7 February, 2020; originally announced February 2020.

  18. arXiv:1901.08162  [pdf, other

    cs.LG cs.AI stat.ML

    Causal Reasoning from Meta-reinforcement Learning

    Authors: Ishita Dasgupta, Jane Wang, Silvia Chiappa, Jovana Mitrovic, Pedro Ortega, David Raposo, Edward Hughes, Peter Battaglia, Matthew Botvinick, Zeb Kurth-Nelson

    Abstract: Discovering and exploiting the causal structure in the environment is a crucial challenge for intelligent agents. Here we explore whether causal reasoning can emerge via meta-reinforcement learning. We train a recurrent network with model-free reinforcement learning to solve a range of problems that each contain causal structure. We find that the trained agent can perform causal reasoning in novel… ▽ More

    Submitted 23 January, 2019; originally announced January 2019.

  19. arXiv:1804.04622  [pdf, other

    cs.LG stat.ME stat.ML

    Causal Inference via Kernel Deviance Measures

    Authors: Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh

    Abstract: Discovering the causal structure among a set of variables is a fundamental problem in many areas of science. In this paper, we propose Kernel Conditional Deviance for Causal Inference (KCDC) a fully nonparametric causal discovery method based on purely observational data. From a novel interpretation of the notion of asymmetry between cause and effect, we derive a corresponding asymmetry measure us… ▽ More

    Submitted 12 April, 2018; originally announced April 2018.

  20. arXiv:1802.01071  [pdf, other

    stat.ML cs.LG

    Hierarchical Adversarially Learned Inference

    Authors: Mohamed Ishmael Belghazi, Sai Rajeswar, Olivier Mastropietro, Negar Rostamzadeh, Jovana Mitrovic, Aaron Courville

    Abstract: We propose a novel hierarchical generative model with a simple Markovian structure and a corresponding inference model. Both the generative and inference model are trained using the adversarial learning paradigm. We demonstrate that the hierarchical structure supports the learning of progressively more abstract representations as well as providing semantically meaningful reconstructions with diffe… ▽ More

    Submitted 3 February, 2018; originally announced February 2018.

    Comments: 18 pages, 7 figures

  21. arXiv:1602.04805  [pdf, other

    stat.ML cs.LG stat.CO stat.ME

    DR-ABC: Approximate Bayesian Computation with Kernel-Based Distribution Regression

    Authors: Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh

    Abstract: Performing exact posterior inference in complex generative models is often difficult or impossible due to an expensive to evaluate or intractable likelihood function. Approximate Bayesian computation (ABC) is an inference framework that constructs an approximation to the true likelihood based on the similarity between the observed and simulated data as measured by a predefined set of summary stati… ▽ More

    Submitted 15 February, 2016; originally announced February 2016.