Showing 1–7 of 7 results for author: Sampat, S K

  1. arXiv:2212.03866  [pdf, other]

    cs.CV

    Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task

    Authors: Shailaja Keyur Sampat, Pratyay Banerjee, Yezhou Yang, Chitta Baral

    Abstract: 'Actions' play a vital role in how humans interact with the world. Thus, autonomous agents that would assist us in everyday tasks also require the capability to perform 'Reasoning about Actions & Change' (RAC). This has been an important research direction in Artificial Intelligence (AI) in general, but the study of RAC with visual and linguistic inputs is relatively recent. The CLEVR_HYP (Sampat…

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: 11 pages, 9 figures; Accepted at Findings of EMNLP 2022. arXiv admin note: substantial text overlap with arXiv:2212.03433

  2. arXiv:2212.03433  [pdf, other]

    cs.CV

    Learning Action-Effect Dynamics from Pairs of Scene-graphs

    Authors: Shailaja Keyur Sampat, Pratyay Banerjee, Yezhou Yang, Chitta Baral

    Abstract: 'Actions' play a vital role in how humans interact with the world. Thus, autonomous agents that would assist us in everyday tasks also require the capability to perform 'Reasoning about Actions & Change' (RAC). Recently, there has been growing interest in the study of RAC with visual and linguistic inputs. Graphs are often used to represent semantic structure of the visual content (i.e. objects, t…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: 5 pages, 6 figures; Accepted at the 3rd Workshop on Graphs and more Complex structures for Learning and Reasoning (GCLR), AAAI 2023

  3. arXiv:2207.07568  [pdf, other]

    cs.CL

    Reasoning about Actions over Visual and Linguistic Modalities: A Survey

    Authors: Shailaja Keyur Sampat, Maitreya Patel, Subhasish Das, Yezhou Yang, Chitta Baral

    Abstract: 'Actions' play a vital role in how humans interact with the world and enable them to achieve desired goals. As a result, most common sense (CS) knowledge for humans revolves around actions. While 'Reasoning about Actions & Change' (RAC) has been widely studied in the Knowledge Representation community, it has recently piqued the interest of NLP and computer vision researchers. This paper surveys e…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: 7 pages, 3 figures; This survey will be periodically updated with the latest works in this area

  4. arXiv:2204.07705  [pdf, other]

    cs.CL cs.AI

    Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

    Authors: Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, et al. (15 additional authors not shown)

    Abstract: How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting,…

    Submitted 24 October, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: Accepted to EMNLP 2022, 25 pages

  5. arXiv:2104.05981  [pdf, other]

    cs.CV

    CLEVR_HYP: A Challenge Dataset and Baselines for Visual Question Answering with Hypothetical Actions over Images

    Authors: Shailaja Keyur Sampat, Akshay Kumar, Yezhou Yang, Chitta Baral

    Abstract: Most existing research on visual question answering (VQA) is limited to information explicitly present in an image or a video. In this paper, we take visual understanding to a higher level where systems are challenged to answer questions that involve mentally simulating the hypothetical consequences of performing specific actions in a given scenario. Towards that end, we formulate a vision-languag…

    Submitted 13 April, 2021; originally announced April 2021.

    Comments: 16 pages, 11 figures, Accepted as a Long Paper at NAACL-HLT 2021

  6. arXiv:2103.15022  [pdf, other]

    cs.CL

    'Just because you are right, doesn't mean I am wrong': Overcoming a Bottleneck in the Development and Evaluation of Open-Ended Visual Question Answering (VQA) Tasks

    Authors: Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, Chitta Baral

    Abstract: GQA (Hudson and Manning, 2019) is a dataset for real-world visual reasoning and compositional question answering. We found that many answers predicted by the best vision-language models on the GQA dataset do not match the ground-truth answer but still are semantically meaningful and correct in the given context. In fact, this is the case with most existing visual question answering (VQA) datasets where…

    Submitted 31 May, 2022; v1 submitted 27 March, 2021; originally announced March 2021.

    Comments: Accepted to EACL 2021

  7. arXiv:2005.00330  [pdf, other]

    cs.CV cs.AI cs.CL

    Visuo-Linguistic Question Answering (VLQA) Challenge

    Authors: Shailaja Keyur Sampat, Yezhou Yang, Chitta Baral

    Abstract: Understanding images and text together is an important aspect of cognition and building advanced Artificial Intelligence (AI) systems. As a community, we have achieved good benchmarks over language and vision domains separately; however, joint reasoning is still a challenge for state-of-the-art computer vision and natural language processing (NLP) systems. We propose a novel task to derive joint in…

    Submitted 18 November, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: Findings of EMNLP 2020 (22 pages, 13 figures)