Showing 1–19 of 19 results for author: Akula, R

  1. arXiv:2305.18373  [pdf, other]

    cs.CV cs.CL

    KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models

    Authors: Zhiwei Jia, Pradyumna Narayana, Arjun R. Akula, Garima Pruthi, Hao Su, Sugato Basu, Varun Jampani

    Abstract: Image ad understanding is a crucial task with wide real-world applications. Although highly challenging with the involvement of diverse atypical scenes, real-world entities, and reasoning over scene-texts, how to interpret image ads is relatively under-explored, especially in the era of foundational vision-language models (VLMs) featuring impressive generalizability and adaptability. In this paper…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  2. arXiv:2212.09898  [pdf, other]

    cs.CV

    MetaCLUE: Towards Comprehensive Visual Metaphors Research

    Authors: Arjun R. Akula, Brendan Driscoll, Pradyumna Narayana, Soravit Changpinyo, Zhiwei Jia, Suyash Damle, Garima Pruthi, Sugato Basu, Leonidas Guibas, William T. Freeman, Yuanzhen Li, Varun Jampani

    Abstract: Creativity is an indispensable part of human cognition and also an inherent part of how we make sense of the world. Metaphorical abstraction is fundamental in communicating creative ideas through nuanced relationships between abstract concepts such as feelings. While computer vision benchmarks and approaches predominantly focus on understanding and generating literal interpretations of images, met…

    Submitted 2 June, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: Accepted in CVPR 2023. Project page: https://metaclue.github.io/ , Video summary: https://youtu.be/V3TmeNETL-o

  3. arXiv:2201.11194  [pdf, other]

    cs.HC cs.LG

    Attention cannot be an Explanation

    Authors: Arjun R Akula, Song-Chun Zhu

    Abstract: Attention based explanations (viz. saliency maps), by providing interpretability to black box models such as deep neural networks, are assumed to improve human trust and reliance in the underlying models. Recently, it has been shown that attention weights are frequently uncorrelated with gradient-based measures of feature importance. Motivated by this, we ask a follow-up question: "Assuming that w…

    Submitted 26 January, 2022; originally announced January 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2109.01401, arXiv:1909.06907

  4. arXiv:2201.09639  [pdf, other]

    cs.CV

    Question Generation for Evaluating Cross-Dataset Shifts in Multi-modal Grounding

    Authors: Arjun R. Akula

    Abstract: Visual question answering (VQA) is the multi-modal task of answering natural language questions about an input image. Through cross-dataset adaptation methods, it is possible to transfer knowledge from a source dataset with larger train samples to a target dataset where training set is limited. Suppose a VQA model trained on one dataset train set fails in adapting to another, it is hard to identif…

    Submitted 24 January, 2022; originally announced January 2022.

  5. arXiv:2201.06207  [pdf, other]

    cs.CV

    Discourse Analysis for Evaluating Coherence in Video Paragraph Captions

    Authors: Arjun R Akula, Song-Chun Zhu

    Abstract: Video paragraph captioning is the task of automatically generating a coherent paragraph description of the actions in a video. Previous linguistic studies have demonstrated that coherence of a natural language text is reflected by its discourse structure and relations. However, existing video captioning methods evaluate the coherence of generated paragraphs by comparing them merely against human p…

    Submitted 16 January, 2022; originally announced January 2022.

  6. arXiv:2109.01401  [pdf, other]

    cs.AI cs.CV cs.LG

    CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models

    Authors: Arjun R. Akula, Keze Wang, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, Sinisa Todorovic, Joyce Chai, Song-Chun Zhu

    Abstract: We propose CX-ToM, short for counterfactual explanations with theory-of-mind, a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN). In contrast to the current methods in XAI that generate explanations as a single shot response, we pose explanation as an iterative communication process, i.e. dialog, between the machine and human user. More…

    Submitted 2 December, 2021; v1 submitted 3 September, 2021; originally announced September 2021.

    Comments: Accepted by iScience Cell Press Journal 2021. arXiv admin note: text overlap with arXiv:1909.06907

  7. arXiv:2107.14046  [pdf, other]

    cs.CY

    Audit and Assurance of AI Algorithms: A framework to ensure ethical algorithmic practices in Artificial Intelligence

    Authors: Ramya Akula, Ivan Garibay

    Abstract: Algorithms are becoming more widely used in business, and businesses are becoming increasingly concerned that their algorithms will cause significant reputational or financial damage. We should emphasize that any of these damages stem from situations in which the United States lacks strict legislative prohibitions or specified protocols for measuring damages. As a result, governments are enacting…

    Submitted 14 July, 2021; originally announced July 2021.

    Journal ref: International Conference on Human-Computer Interaction 2021

  8. arXiv:2107.14044  [pdf, other]

    cs.CY cs.AI

    Ethical AI for Social Good

    Authors: Ramya Akula, Ivan Garibay

    Abstract: The concept of AI for Social Good (AI4SG) is gaining momentum in both information societies and the AI community. Through all the advancement of AI-based solutions, it can solve societal issues effectively. To date, however, there is only a rudimentary grasp of what constitutes AI socially beneficial in principle, what constitutes AI4SG in reality, and what are the policies and regulations needed t…

    Submitted 14 July, 2021; originally announced July 2021.

    Journal ref: International Conference on Human-Computer Interaction, 2021

  9. arXiv:2101.05875  [pdf, other]

    cs.CL cs.AI cs.SI

    Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media

    Authors: Ramya Akula, Ivan Garibay

    Abstract: Sarcasm is a linguistic expression often used to communicate the opposite of what is said, usually something that is very unpleasant with an intention to insult or ridicule. Inherent ambiguity in sarcastic expressions makes sarcasm detection very difficult. In this work, we focus on detecting sarcasm in textual conversations from various social networking platforms and online media. To this end, w…

    Submitted 14 January, 2021; originally announced January 2021.

  10. arXiv:2005.01655  [pdf, other]

    cs.CL cs.CV

    Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions

    Authors: Arjun R Akula, Spandana Gella, Yaser Al-Onaizan, Song-Chun Zhu, Siva Reddy

    Abstract: Visual referring expression recognition is a challenging task that requires natural language understanding in the context of an image. We critically examine RefCOCOg, a standard benchmark for this task, using a human study and show that 83.7% of test instances do not require reasoning on linguistic structure, i.e., words are enough to identify the target object, the word order doesn't matter. To m…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  11. arXiv:1911.11642  [pdf, other]

    cs.PF cs.AR

    System Performance with varying L1 Instruction and Data Cache Sizes: An Empirical Analysis

    Authors: Ramya Akula, Kartik Jain, Deep Jigar Kotecha

    Abstract: In this project, we investigate the fluctuations in performance caused by changing the Instruction (I-cache) size and the Data (D-cache) size in the L1 cache. We employ the Gem5 framework to simulate a system with varying specifications on a single host machine. We utilize the FreqMine benchmark available under the PARSEC suite as the workload program to benchmark our simulated system. The Out-ord…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: 5 Figures and 3 Tables

  12. arXiv:1910.12589  [pdf, other]

    cs.LG stat.ML

    Forecasting the Success of Television Series using Machine Learning

    Authors: Ramya Akula, Zachary Wieselthier, Laura Martin, Ivan Garibay

    Abstract: Television is an ever-evolving multi billion dollar industry. The success of a television show in an increasingly technological society is a vast multi-variable formula. The art of success is not just something that happens, but is studied, replicated, and applied. Hollywood can be unpredictable regarding success, as many movies and sitcoms that are hyped up and promise to be a hit end up being bo…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: 9 Pages, 10 Figures and 2 Tables

  13. arXiv:1910.09356  [pdf, other]

    cs.LG stat.ML

    Supervised Machine Learning based Ensemble Model for Accurate Prediction of Type 2 Diabetes

    Authors: Ramya Akula, Ni Nguyen, Ivan Garibay

    Abstract: According to the American Diabetes Association (ADA), 30.3 million people in the United States have diabetes, but only 7.2 million may be undiagnosed and unaware of their condition. Type 2 diabetes is usually diagnosed for most patients later on in life whereas the less common Type 1 diabetes is diagnosed early on in life. People can live healthy and happy lives while living with diabetes, but earl…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: 9 Pages, # Tables and 8 Figures

  14. arXiv:1910.07999  [pdf]

    cs.SI cs.AI cs.LG cs.MA

    DeepFork: Supervised Prediction of Information Diffusion in GitHub

    Authors: Ramya Akula, Niloofar Yousefi, Ivan Garibay

    Abstract: Information spreads on complex social networks extremely fast, in other words, a piece of information can go viral within no time. Often it is hard to barricade this diffusion prior to the significant occurrence of chaos, be it a social media or an online coding platform. GitHub is one such trending online focal point for any business to reach their potential contributors and customers, simultaneo…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: 12 Pages, 7 Figures, 2 Tables

  15. arXiv:1909.06907  [pdf, other]

    cs.AI cs.CV cs.HC cs.LG

    X-ToM: Explaining with Theory-of-Mind for Gaining Justified Human Trust

    Authors: Arjun R. Akula, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, Sinisa Todorovic, Joyce Y. Chai, Song-Chun Zhu

    Abstract: We present a new explainable AI (XAI) framework aimed at increasing justified human trust and reliance in the AI machine through explanations. We pose explanation as an iterative communication process, i.e. dialog, between the machine and human user. More concretely, the machine generates a sequence of explanations in a dialog which takes into account three important aspects at each dialog turn: (a)…

    Submitted 15 September, 2019; originally announced September 2019.

    Comments: A short version of this was presented at CVPR 2019 Workshop on Explainable AI

  16. arXiv:1903.05720  [pdf, other]

    cs.AI

    Natural Language Interaction with Explainable AI Models

    Authors: Arjun R Akula, Sinisa Todorovic, Joyce Y Chai, Song-Chun Zhu

    Abstract: This paper presents an explainable AI (XAI) system that provides explanations for its predictions. The system consists of two key components -- namely, the prediction And-Or graph (AOG) model for recognizing and localizing concepts of interest in input data, and the XAI model for providing explanations to the user about the AOG's predictions. In this work, we focus on the XAI model specified to in…

    Submitted 7 July, 2019; v1 submitted 13 March, 2019; originally announced March 2019.

    Journal ref: CVPR 2019 Workshop on Explainable AI

  17. arXiv:1903.02252  [pdf, other]

    cs.CV

    Discourse Parsing in Videos: A Multi-modal Appraoch

    Authors: Arjun R. Akula, Song-Chun Zhu

    Abstract: Text-level discourse parsing aims to unmask how two sentences in the text are related to each other. We propose the task of Visual Discourse Parsing, which requires understanding discourse relations among scenes in a video. Here we use the term scene to refer to a subset of video frames that can better summarize the video. In order to collect a dataset for learning discourse cues from videos, one…

    Submitted 22 January, 2022; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: Accepted in CVPR 2019 Workshop on Language and Vision (Oral Presentation)

    Journal ref: CVPR 2019 Workshop on Language and Vision (Oral Presentation)

  18. arXiv:1902.04003  [pdf, other]

    cs.CE math.NA

    Stabilized MorteX method for mesh tying along embedded interfaces

    Authors: Basava Raju Akula, Julien Vignollet, Vladislav A. Yastrebov

    Abstract: We present a unified framework to tie overlapping meshes in solid mechanics applications. This framework is a combination of the X-FEM method and the mortar method, which uses Lagrange multipliers to fulfill the tying constraints. As known, mixed formulations are prone to mesh locking which manifests itself by the emergence of spurious oscillations in the vicinity of the tying interface. To overco…

    Submitted 3 February, 2019; originally announced February 2019.

    Comments: 32 pages, 36 figures, 64 references

  19. arXiv:1902.04000  [pdf, other]

    cs.CE math.NA

    MorteX method for contact along real and embedded surfaces: coupling X-FEM with the Mortar method

    Authors: Basava Raju Akula, Julien Vignollet, Vladislav A. Yastrebov

    Abstract: A method to treat frictional contact problems along embedded surfaces in the finite element framework is developed. Arbitrarily shaped embedded surfaces, cutting through finite element meshes, are handled by the X-FEM. The frictional contact problem is solved using the monolithic augmented Lagrangian method within the mortar framework which was adapted for handling embedded surfaces. We report tha…

    Submitted 3 February, 2019; originally announced February 2019.

    Comments: 30 pages, 28 figures, 58 references