Showing 1–13 of 13 results for author: Shetty, R

  1. arXiv:2110.01428  [pdf, other]

    cs.CV

    Seeking Similarities over Differences: Similarity-based Domain Alignment for Adaptive Object Detection

    Authors: Farzaneh Rezaeianaran, Rakshith Shetty, Rahaf Aljundi, Daniel Olmeda Reino, Shanshan Zhang, Bernt Schiele

    Abstract: In order to robustly deploy object detectors across a wide range of scenarios, they should be adaptable to shifts in the input distribution without the need to constantly annotate new data. This has motivated research in Unsupervised Domain Adaptation (UDA) algorithms for detection. UDA methods learn to adapt from labeled source domains to unlabeled target domains, by inducing alignment between de…

    Submitted 4 October, 2021; originally announced October 2021.

    Comments: Accepted in ICCV 2021

  2. arXiv:1912.07538  [pdf, other]

    cs.CV cs.CL cs.LG

    Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing

    Authors: Vedika Agarwal, Rakshith Shetty, Mario Fritz

    Abstract: Despite significant success in Visual Question Answering (VQA), VQA models have been shown to be notoriously brittle to linguistic variations in the questions. Due to deficiencies in models and datasets, today's models often rely on correlations rather than predictions that are causal w.r.t. data. In this paper, we propose a novel way to analyze and measure the robustness of the state of the art m…

    Submitted 29 May, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

    Comments: 16 pages

  3. arXiv:1812.06707  [pdf, other]

    cs.CV cs.AI stat.ML

    Not Using the Car to See the Sidewalk: Quantifying and Controlling the Effects of Context in Classification and Segmentation

    Authors: Rakshith Shetty, Bernt Schiele, Mario Fritz

    Abstract: Importance of visual context in scene understanding tasks is well recognized in the computer vision community. However, to what extent the computer vision models for image classification and semantic segmentation are dependent on the context to make their predictions is unclear. A model overly relying on context will fail when encountering objects in context distributions different from training d…

    Submitted 17 December, 2018; originally announced December 2018.

    Comments: 14 pages (12 figures)

  4. arXiv:1809.03707  [pdf, other]

    cs.CV cs.CL cs.LG

    Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions

    Authors: M. Wagner, H. Basevi, R. Shetty, W. Li, M. Malinowski, M. Fritz, A. Leonardis

    Abstract: In-depth scene descriptions and question answering tasks have greatly increased the scope of today's definition of scene understanding. While such tasks are in principle open ended, current formulations primarily focus on describing only the current state of the scenes under consideration. In contrast, in this paper, we focus on the future states of the scenes which are also conditioned on actions…

    Submitted 21 November, 2018; v1 submitted 11 September, 2018; originally announced September 2018.

    Comments: Paper: 18 pages, 5 figures, 5 tables. Supplementary material: 3 pages, 1 figure, 1 table. To be published in VLEASE ECCV 2018 workshop

    MSC Class: 68

  5. arXiv:1806.01911  [pdf, other]

    cs.CV cs.AI stat.ML

    Adversarial Scene Editing: Automatic Object Removal from Weak Supervision

    Authors: Rakshith Shetty, Mario Fritz, Bernt Schiele

    Abstract: While great progress has been made recently in automatic image manipulation, it has been limited to object centric images like faces or structured scene datasets. In this work, we take a step towards general scene-level image editing by developing an automatic interaction-free object removal model. Our model learns to find and remove objects from general scene images using image-level labels and u…

    Submitted 5 June, 2018; originally announced June 2018.

  6. arXiv:1711.01921  [pdf, other]

    cs.CR cs.CL cs.CY cs.SI stat.ML

    $A^{4}NT$: Author Attribute Anonymity by Adversarial Training of Neural Machine Translation

    Authors: Rakshith Shetty, Bernt Schiele, Mario Fritz

    Abstract: Text-based analysis methods can reveal privacy-relevant author attributes such as the gender, age and identity of the text's author. Such methods can compromise the privacy of an anonymous author even when the author tries to remove privacy sensitive content. In this paper, we propose an automatic method, called Adversarial Author Attribute Anonymity Neural Translation ($A^4NT$), to combat such t…

    Submitted 19 February, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

    Comments: 16 pages, 10 figures and 8 tables

  7. arXiv:1704.07434  [pdf, other]

    cs.CV cs.AI

    Paying Attention to Descriptions Generated by Image Captioning Models

    Authors: Hamed R. Tavakoli, Rakshith Shetty, Ali Borji, Jorma Laaksonen

    Abstract: To bridge the gap between humans and machines in image understanding and describing, we need further insight into how people describe a perceived scene. In this paper, we study the agreement between bottom-up saliency-based visual attention and object referrals in scene description constructs. We investigate the properties of human-written descriptions and machine-generated ones. We then propose a…

    Submitted 4 August, 2017; v1 submitted 24 April, 2017; originally announced April 2017.

    Comments: To appear in ICCV 2017

  8. arXiv:1703.10476  [pdf, other]

    cs.CV cs.AI cs.CL

    Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training

    Authors: Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele

    Abstract: While strong progress has been made in image captioning over the last years, machine and human captions are still quite distinct. A closer look reveals that this is due to the deficiencies in the generated word distribution, vocabulary size, and strong bias in the generators towards frequent captions. Furthermore, humans -- rightfully so -- generate multiple, diverse captions, due to the inherent…

    Submitted 6 November, 2017; v1 submitted 30 March, 2017; originally announced March 2017.

    Comments: 16 pages, Published in ICCV 2017

  9. Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation

    Authors: Rakshith Shetty, Jorma Laaksonen

    Abstract: We present our submission to the Microsoft Video to Language Challenge of generating short captions describing videos in the challenge dataset. Our model is based on the encoder--decoder pipeline, popular in image and video captioning systems. We propose to utilize two different kinds of video features, one to capture the video content in terms of objects and attributes, and the other to capture t…

    Submitted 17 August, 2016; originally announced August 2016.

  10. arXiv:1512.02949  [pdf, other]

    cs.CV

    Video captioning with recurrent networks based on frame- and video-level features and visual content classification

    Authors: Rakshith Shetty, Jorma Laaksonen

    Abstract: In this paper, we describe the system for generating textual descriptions of short video clips using recurrent neural networks (RNN), which we used while participating in the Large Scale Movie Description Challenge 2015 in ICCV 2015. Our work builds on static image captioning systems with RNN based language models and extends this framework to videos utilizing both static image features and video-…

    Submitted 9 December, 2015; originally announced December 2015.

  11. arXiv:1510.07830  [pdf]

    cs.NI

    Automation of Smartphone Traffic Generation in a Virtualized Environment

    Authors: Tanya Jha, Rashmi Shetty

    Abstract: Scalable and comprehensive analysis of rapidly evolving mobile device application traffic is extremely important but a challenging problem for the Deep Packet Inspection (DPI) engines to perform effective policy management. We present a test framework in which a test driver can automate/orchestrate traffic generation by invoking appropriate method (intent) of real mobile applications (as opposed t…

    Submitted 27 October, 2015; originally announced October 2015.

  12. arXiv:1510.05577  [pdf]

    cs.LG

    Application of Machine Learning Techniques in Human Activity Recognition

    Authors: Jitenkumar Babubhai Rana, Rashmi Shetty, Tanya Jha

    Abstract: Human activity detection has seen tremendous growth in the last decade, playing a major role in the field of pervasive computing. This emerging popularity can be attributed to its myriad real-life applications, primarily dealing with human-centric problems like healthcare and elder care. Many research attempts with data mining and machine learning techniques have been under way to accurately d…

    Submitted 19 October, 2015; originally announced October 2015.

  13. arXiv:1208.1880  [pdf]

    cs.CV cs.MM cs.SD

    Stereo Acoustic Perception based on Real Time Video Acquisition for Navigational Assistance

    Authors: Supreeth K. Rao, Arpitha Prasad B., Anushree R. Shetty, Chinmai, R. Bhakthavathsalam, Rajeshwari Hegde

    Abstract: A smart navigation system (an Electronic Travel Aid) based on an object detection mechanism has been designed to detect the presence of obstacles that immediately impede the path, by means of real time video processing. The algorithm can be used for any general purpose navigational aid. This paper is discussed, keeping in mind the navigation of the visually impaired, and is not limited to the same…

    Submitted 9 August, 2012; originally announced August 2012.

    Comments: 12 pages, 8 figures, 1 table, SIPM-2012, pp. 97-108, 2012; http://airccj.org/CSCP/vol2/csit2311.pdf