Showing 1–25 of 25 results for author: Marrese-Taylor, E

  1. arXiv:2405.17139  [pdf, other]

    cs.CV cs.AI cs.LG

    Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling

    Authors: Cristian Rodriguez-Opazo, Ehsan Abbasnejad, Damien Teney, Edison Marrese-Taylor, Hamed Damirchi, Anton van den Hengel

    Abstract: Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning. Various architectures, from vision transformers (ViTs) to convolutional networks (ResNets), have been trained with CLIP to serve as general solutions to diverse vision tasks. This paper explores the differences across various CLIP-trained vision backbones. Despite using the same data an…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2312.14400

  2. arXiv:2403.05881  [pdf, other]

    cs.CL

    KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques

    Authors: Rui Yang, Haoran Liu, Edison Marrese-Taylor, Qingcheng Zeng, Yu He Ke, Wanxin Li, Lechao Cheng, Qingyu Chen, James Caverlee, Yutaka Matsuo, Irene Li

    Abstract: Large Language Models (LLMs) have significantly advanced healthcare innovation on generation capabilities. However, their application in real clinical settings is challenging due to potential deviations from medical facts and inherent biases. In this work, we develop an augmented LLM framework, KG-Rank, which leverages a medical knowledge graph (KG) with ranking and re-ranking techniques, aiming t…

    Submitted 18 March, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  3. arXiv:2312.14400  [pdf, other]

    cs.CV

    Unveiling Backbone Effects in CLIP: Exploring Representational Synergies and Variances

    Authors: Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Ehsan Abbasnejad, Hamed Damirchi, Ignacio M. Jara, Felipe Bravo-Marquez, Anton van den Hengel

    Abstract: Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning. Various neural architectures, spanning Transformer-based models like Vision Transformers (ViTs) to Convolutional Networks (ConvNets) like ResNets, are trained with CLIP and serve as universal backbones across diverse vision tasks. Despite utilizing the same data and training objectives…

    Submitted 21 December, 2023; originally announced December 2023.

  4. arXiv:2311.13105  [pdf, other]

    cs.CL

    Perceptual Structure in the Absence of Grounding for LLMs: The Impact of Abstractedness and Subjectivity in Color Language

    Authors: Pablo Loyola, Edison Marrese-Taylor, Andres Hoyos-Idobro

    Abstract: The need for grounding in language understanding is an active research topic. Previous work has suggested that color perception and color language appear as a suitable test bed to empirically study the problem, given its cognitive significance and showing that there is considerable alignment between a defined color space and the feature space defined by a language model. To further study this issu…

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 Findings

  5. arXiv:2310.02778  [pdf, other]

    cs.CL cs.AI

    Integrating UMLS Knowledge into Large Language Models for Medical Question Answering

    Authors: Rui Yang, Edison Marrese-Taylor, Yuhe Ke, Lechao Cheng, Qingyu Chen, Irene Li

    Abstract: Large language models (LLMs) have demonstrated powerful text generation capabilities, bringing unprecedented innovation to the healthcare field. While LLMs hold immense promise for applications in healthcare, applying them to real clinical scenarios presents significant challenges, as these models may generate content that deviates from established medical facts and even exhibit potential biases.…

    Submitted 13 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: 12 pages, 3 figures

  6. arXiv:2310.01138  [pdf, other]

    cs.CL

    Target-Aware Contextual Political Bias Detection in News

    Authors: Iffat Maab, Edison Marrese-Taylor, Yutaka Matsuo

    Abstract: Media bias detection requires comprehensive integration of information derived from multiple news sources. Sentence-level political bias detection in news is no exception, and has proven to be a challenging task that requires an understanding of bias in consideration of the context. Inspired by the fact that humans exhibit varying degrees of writing styles, resulting in a diverse range of statemen…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 11 pages, 3 figures, conference paper accepted in IJCNLP-AACL 2023 but will get published after Nov 4th Bali conference

  7. arXiv:2209.13359  [pdf, other]

    cs.CV cs.CL

    Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding

    Authors: Erica K. Shimomoto, Edison Marrese-Taylor, Hiroya Takamura, Ichiro Kobayashi, Hideki Nakayama, Yusuke Miyao

    Abstract: This paper explores the task of Temporal Video Grounding (TVG) where, given an untrimmed video and a natural language sentence query, the goal is to recognize and determine temporal boundaries of action instances in the video described by the query. Recent works tackled this task by improving query inputs with large pre-trained language models (PLM) at the cost of more expensive training. However,…

    Submitted 25 May, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted for Findings of ACL2023

  8. arXiv:2112.10066  [pdf, other]

    cs.CV

    LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach

    Authors: Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Basura Fernando, Hiroya Takamura, Qi Wu

    Abstract: We propose LocFormer, a Transformer-based model for video grounding which operates at a constant memory footprint regardless of the video length, i.e. number of frames. LocFormer is designed for tasks where it is necessary to process the entire long video and at its core lie two main contributions. First, our model incorporates a new sampling technique that splits the input feature sequence into a…

    Submitted 19 December, 2021; originally announced December 2021.

  9. arXiv:2101.00234  [pdf, other]

    cs.CL cs.LG

    Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers

    Authors: Machel Reid, Edison Marrese-Taylor, Yutaka Matsuo

    Abstract: Transformers have shown improved performance when compared to previous architectures for sequence processing such as RNNs. Despite their sizeable performance gains, as recently suggested, the model is computationally expensive to train and with a high parameter budget. In light of this, we explore parameter-sharing methods in Transformers with a specific focus on generative models. We perform an a…

    Submitted 8 September, 2021; v1 submitted 1 January, 2021; originally announced January 2021.

    Comments: EMNLP 2021 Findings

  10. arXiv:2010.06260  [pdf, other]

    cs.CV

    DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video

    Authors: Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Basura Fernando, Hongdong Li, Stephen Gould

    Abstract: This paper studies the task of temporal moment localization in a long untrimmed video using natural language query. Given a query sentence, the goal is to determine the start and end of the relevant segment within the video. Our key innovation is to learn a video feature embedding through a language-conditioned message-passing algorithm suitable for temporal moment localization which captures the…

    Submitted 13 October, 2020; originally announced October 2020.

  11. arXiv:2010.03124  [pdf, other]

    cs.CL cs.LG

    VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling

    Authors: Machel Reid, Edison Marrese-Taylor, Yutaka Matsuo

    Abstract: In this paper, we tackle the task of definition modeling, where the goal is to learn to generate definitions of words and phrases. Existing approaches for this task are discriminative, combining distributional and lexical semantics in an implicit rather than direct way. To tackle this issue we propose a generative model for the task, introducing a continuous latent variable to explicitly model the…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020, 10 Pages

  12. arXiv:2005.13362  [pdf, other]

    cs.CL cs.CV

    A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews

    Authors: Edison Marrese-Taylor, Cristian Rodriguez-Opazo, Jorge A. Balazs, Stephen Gould, Yutaka Matsuo

    Abstract: Despite the recent advances in opinion mining for written reviews, few works have tackled the problem on other sources of reviews. In light of this issue, we propose a multi-modal approach for mining fine-grained opinions from video reviews that is able to determine the aspects of the item under review that are being discussed and the sentiment orientation towards them. Our approach works at the s…

    Submitted 27 May, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: Second Grand Challenge and Workshop on Multimodal Language ACL 2020

  13. arXiv:2004.09143  [pdf, other]

    cs.CL

    Variational Inference for Learning Representations of Natural Language Edits

    Authors: Edison Marrese-Taylor, Machel Reid, Yutaka Matsuo

    Abstract: Document editing has become a pervasive component of the production of information, with version control systems enabling edits to be efficiently stored and applied. In light of this, the task of learning distributed representations of edits has been recently proposed. With this in mind, we propose a novel approach that employs variational inference to learn a continuous latent space of vector rep…

    Submitted 3 January, 2021; v1 submitted 20 April, 2020; originally announced April 2020.

    Comments: Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

  14. arXiv:2003.04419  [pdf, ps, other]

    cs.CL cs.LG

    Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages

    Authors: Machel Reid, Edison Marrese-Taylor, Yutaka Matsuo

    Abstract: The contrast between the need for large amounts of data for current Natural Language Processing (NLP) techniques, and the lack thereof, is accentuated in the case of African languages, most of which are considered low-resource. To help circumvent this issue, we explore techniques exploiting the qualities of morphologically rich languages (MRLs), while leveraging pretrained word vectors in well-res…

    Submitted 21 April, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

    Comments: Accepted to the "AfricaNLP - Unlocking Local Languages" workshop at ICLR 2020

  15. arXiv:1909.08880  [pdf, other]

    cs.CL

    An Edit-centric Approach for Wikipedia Article Quality Assessment

    Authors: Edison Marrese-Taylor, Pablo Loyola, Yutaka Matsuo

    Abstract: We propose an edit-centric approach to assess Wikipedia article quality as a complementary alternative to current full document-based techniques. Our model consists of a main classifier equipped with an auxiliary generative module which, for a given edit, jointly provides an estimation of its quality and generates a description in natural language. We performed an empirical study to assess the fea…

    Submitted 19 September, 2019; originally announced September 2019.

    Comments: Accepted at the W-NUT Workshop, EMNLP 2019

  16. arXiv:1908.07236  [pdf, other]

    cs.CV

    Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention

    Authors: Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Fatemeh Sadat Saleh, Hongdong Li, Stephen Gould

    Abstract: This paper studies the problem of temporal moment localization in a long untrimmed video using natural language as the query. Given an untrimmed video and a sentence as the query, the goal is to determine the start and the end of the relevant visual moment in the video that corresponds to the query sentence. While previous works have tackled this task by a propose-and-rank approach, we in…

    Submitted 12 March, 2020; v1 submitted 20 August, 2019; originally announced August 2019.

    Comments: Winter Conference on Applications of Computer Vision 2020

  17. arXiv:1809.09795  [pdf, other]

    cs.CL

    Deep contextualized word representations for detecting sarcasm and irony

    Authors: Suzana Ilić, Edison Marrese-Taylor, Jorge A. Balazs, Yutaka Matsuo

    Abstract: Predicting context-dependent and non-literal utterances like sarcastic and ironic expressions still remains a challenging task in NLP, as it goes beyond linguistic patterns, encompassing common sense and shared knowledge as crucial components. To capture complex morpho-syntactic features that can usually serve as indicators for irony or sarcasm across dynamic contexts, we propose a model that uses…

    Submitted 25 September, 2018; originally announced September 2018.

    Comments: To appear in WASSA 2018

  18. arXiv:1808.08672  [pdf, other]

    cs.CL

    IIIDYT at IEST 2018: Implicit Emotion Classification With Deep Contextualized Word Representations

    Authors: Jorge A. Balazs, Edison Marrese-Taylor, Yutaka Matsuo

    Abstract: In this paper we describe our system designed for the WASSA 2018 Implicit Emotion Shared Task (IEST), which obtained 2nd place out of 26 teams with a test macro F1 score of 0.710. The system is composed of a single pre-trained ELMo layer for encoding words, a Bidirectional Long Short-Term Memory network (BiLSTM) for enriching word representations with context, a max-pooling operation for cr…

    Submitted 1 September, 2018; v1 submitted 26 August, 2018; originally announced August 2018.

    Comments: Accepted as a system description paper for the Implicit Emotion Shared Task of WASSA 2018 (EMNLP)

  19. arXiv:1806.04524  [pdf, other]

    cs.CL

    Learning to Automatically Generate Fill-In-The-Blank Quizzes

    Authors: Edison Marrese-Taylor, Ai Nakajima, Yutaka Matsuo, Ono Yuichi

    Abstract: In this paper we formalize the problem of automatic fill-in-the-blank question generation using two standard NLP machine learning schemes, proposing concrete deep learning models for each. We present an empirical study based on data obtained from a language learning platform showing that both of our proposed settings offer promising results.

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: 5 pages

    Journal ref: 5th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA), collocated with ACL 2018

  20. arXiv:1804.08094  [pdf, other]

    cs.CL

    IIIDYT at SemEval-2018 Task 3: Irony detection in English tweets

    Authors: Edison Marrese-Taylor, Suzana Ilic, Jorge A. Balazs, Yutaka Matsuo, Helmut Prendinger

    Abstract: In this paper we introduce our system for the task of Irony detection in English tweets, a part of SemEval 2018. We propose a representation learning approach that relies on a multi-layered bidirectional LSTM, without using external features that provide additional semantic information. Although our model is able to outperform the baseline in the validation set, our results show limited generalizati…

    Submitted 22 April, 2018; originally announced April 2018.

    Comments: 4 pages

  21. arXiv:1708.05521  [pdf, other]

    cs.CL

    EmoAtt at EmoInt-2017: Inner attention sentence embedding for Emotion Intensity

    Authors: Edison Marrese-Taylor, Yutaka Matsuo

    Abstract: In this paper we describe a deep learning system that has been designed and built for the WASSA 2017 Emotion Intensity Shared Task. We introduce a representation learning approach based on inner attention on top of an RNN. Results show that our model offers good capabilities and is able to successfully identify emotion-bearing words to predict intensity without leveraging on lexicons, obtaining th…

    Submitted 18 August, 2017; originally announced August 2017.

    Comments: WASSA 2017 Shared Task on Emotion Intensity

  22. arXiv:1708.02420  [pdf, other]

    cs.CL

    Mining fine-grained opinions on closed captions of YouTube videos with an attention-RNN

    Authors: Edison Marrese-Taylor, Jorge A. Balazs, Yutaka Matsuo

    Abstract: Video reviews are the natural evolution of written product reviews. In this paper we target this phenomenon and introduce the first dataset created from closed captions of YouTube product review videos as well as a new attention-RNN model for aspect extraction and joint aspect extraction and sentiment classification. Our model provides state-of-the-art performance on aspect extraction without requ…

    Submitted 8 August, 2017; originally announced August 2017.

    Comments: 8th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA)

  23. arXiv:1707.03103  [pdf, other]

    cs.CL

    Refining Raw Sentence Representations for Textual Entailment Recognition via Attention

    Authors: Jorge A. Balazs, Edison Marrese-Taylor, Pablo Loyola, Yutaka Matsuo

    Abstract: In this paper we present the model used by the team Rivercorners for the 2017 RepEval shared task. First, our model separately encodes a pair of sentences into variable-length representations by using a bidirectional LSTM. Then, it creates fixed-length raw representations by means of simple aggregation functions, which are then refined using an attention mechanism. Finally, it combines the refined…

    Submitted 12 July, 2017; v1 submitted 10 July, 2017; originally announced July 2017.

  24. arXiv:1704.04856  [pdf, other]

    cs.CL

    A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes

    Authors: Pablo Loyola, Edison Marrese-Taylor, Yutaka Matsuo

    Abstract: We propose a model to automatically describe changes introduced in the source code of a program using natural language. Our method receives as input a set of code commits, which contains both the modifications and message introduced by a user. These two modalities are used to train an encoder-decoder architecture. We evaluated our approach on twelve real world open source projects from four diffe…

    Submitted 16 April, 2017; originally announced April 2017.

    Comments: Accepted at ACL 2017

  25. arXiv:1701.01565  [pdf, ps, other]

    cs.CL

    Replication issues in syntax-based aspect extraction for opinion mining

    Authors: Edison Marrese-Taylor, Yutaka Matsuo

    Abstract: Reproducing experiments is an important instrument to validate previous work and build upon existing approaches. It has been tackled numerous times in different areas of science. In this paper, we introduce an empirical replicability study of three well-known algorithms for syntactic centric aspect-based opinion mining. We show that reproducing results continues to be a difficult endeavor, mainly…

    Submitted 6 January, 2017; originally announced January 2017.

    Comments: Accepted in the EACL 2017 SRW