Showing 1–7 of 7 results for author: Mogadala, A

  1. arXiv:2010.15251  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Fusion Models for Improved Visual Captioning

    Authors: Marimuthu Kalimuthu, Aditya Mogadala, Marius Mosbach, Dietrich Klakow

    Abstract: Visual captioning aims to generate textual descriptions given images or videos. Traditionally, image captioning models are trained on human annotated datasets such as Flickr30k and MS-COCO, which are limited in size and diversity. This limitation hinders the generalization capabilities of these models while also rendering them liable to making mistakes. Language models can, however, be trained on…

    Submitted 4 December, 2020; v1 submitted 28 October, 2020; originally announced October 2020.

    Comments: Accepted at "Multi-Modal Deep Learning: Challenges and Applications" (MMDLCA), International Conference on Pattern Recognition (ICPR)-2020, Milano, Italia

    Journal ref: Springer LNCS, volume 12666, 2021

  2. arXiv:2007.11690  [pdf, other]

    cs.CV cs.CL cs.LG

    Integrating Image Captioning with Rule-based Entity Masking

    Authors: Aditya Mogadala, Xiaoyu Shen, Dietrich Klakow

    Abstract: Given an image, generating its natural language description (i.e., caption) is a well studied problem. Approaches proposed to address this problem usually rely on image features that are difficult to interpret. Particularly, these image features are subdivided into global and local features, where global features are extracted from the global representation of the image, while local features are e…

    Submitted 22 July, 2020; originally announced July 2020.

  3. arXiv:2007.06077  [pdf, other]

    cs.CV cs.CL cs.LG

    Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation

    Authors: Aditya Mogadala, Marius Mosbach, Dietrich Klakow

    Abstract: Generating longer textual sequences when conditioned on the visual information is an interesting problem to explore. The challenge here proliferate over the standard vision conditioned sentence-level generation (e.g., image or video captioning) as it requires to produce a brief and coherent story describing the visual content. In this paper, we mask this Vision-to-Sequence as Graph-to-Sequence lea…

    Submitted 12 July, 2020; originally announced July 2020.

    Comments: International Conference on Machine Learning (ICML) 2020 Workshop (https://logicalreasoninggnn.github.io/)

  4. arXiv:1912.07478  [pdf, other]

    cs.CV cs.CL cs.LG

    Image Manipulation with Natural Language using Two-sided Attentive Conditional Generative Adversarial Network

    Authors: Dawei Zhu, Aditya Mogadala, Dietrich Klakow

    Abstract: Altering the content of an image with photo editing tools is a tedious task for an inexperienced user. Especially, when modifying the visual attributes of a specific object in an image without affecting other constituents such as background etc. To simplify the process of image manipulation and to provide more control to users, it is better to utilize a simpler interface like natural language. The…

    Submitted 16 December, 2019; originally announced December 2019.

    Comments: Submitted to Journal

  5. arXiv:1907.09358  [pdf, other]

    cs.CV cs.CL cs.LG

    Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods

    Authors: Aditya Mogadala, Marimuthu Kalimuthu, Dietrich Klakow

    Abstract: Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as machine learning, computer vision, and natural language processing. Much of the growth in these fields has been made possible with deep learning, a sub-area of machine learning that uses art…

    Submitted 31 December, 2021; v1 submitted 22 July, 2019; originally announced July 2019.

    Comments: Published at the Journal of Artificial Intelligence Research (JAIR); 135 pages

    Journal ref: Journal of Artificial Intelligence Research, Vol. 71, 2021

  6. arXiv:1710.09137  [pdf, other]

    cs.CL

    Linking Tweets with Monolingual and Cross-Lingual News using Transformed Word Embeddings

    Authors: Aditya Mogadala, Dominik Jung, Achim Rettinger

    Abstract: Social media platforms have grown into an important medium to spread information about an event published by the traditional media, such as news articles. Grouping such diverse sources of information that discuss the same topic in varied perspectives provide new insights. But the gap in word usage between informal social media content such as tweets and diligently written content (e.g. news articl…

    Submitted 25 October, 2017; originally announced October 2017.

    Comments: Presented at CICLing 2017 (18th International Conference on Intelligent Text Processing and Computational Linguistics). To appear in International Journal of Computational Linguistics and Applications (IJLCA)

  7. arXiv:1710.06303  [pdf, other]

    cs.CV cs.CL

    Describing Natural Images Containing Novel Objects with Knowledge Guided Assitance

    Authors: Aditya Mogadala, Umanga Bista, Lexing Xie, Achim Rettinger

    Abstract: Images in the wild encapsulate rich knowledge about varied abstract concepts and cannot be sufficiently described with models built only using image-caption pairs containing selected objects. We propose to handle such a task with the guidance of a knowledge base that incorporate many abstract concepts. Our method is a two-step process where we first build a multi-entity-label image recognition mod…

    Submitted 17 October, 2017; originally announced October 2017.

    Comments: 10 pages, 5 figures