Showing 1–37 of 37 results for author: Karanam, S

Searching in archive cs.
  1. arXiv:2407.02389  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation

    Authors: Sayan Nag, Koustava Goswami, Srikrishna Karanam

    Abstract: Referring Expression Segmentation (RES) aims to provide a segmentation mask of the target object in an image referred to by the text (i.e., referring expression). Existing methods require large-scale mask annotations. Moreover, such approaches do not generalize well to unseen/zero-shot scenarios. To address the aforementioned issues, we propose a weakly-supervised bootstrapping architecture for RE…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV 2024

  2. arXiv:2406.18893  [pdf, other]

    cs.CV

    AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image Models

    Authors: Aishwarya Agarwal, Srikrishna Karanam, Balaji Vasan Srinivasan

    Abstract: We consider the problem of customizing text-to-image diffusion models with user-supplied reference images. Given new prompts, the existing methods can capture the key concept from the reference images but fail to align the generated image with the prompt. In this work, we seek to address this key issue by proposing new methods that can easily be used in conjunction with existing customization meth…

    Submitted 27 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 10 pages, 9 figures

  3. arXiv:2406.10197  [pdf, other]

    cs.CV cs.AI cs.LG

    Crafting Parts for Expressive Object Composition

    Authors: Harsh Rangwani, Aishwarya Agarwal, Kuldeep Kulkarni, R. Venkatesh Babu, Srikrishna Karanam

    Abstract: Text-to-image generation from large generative models like Stable Diffusion, DALLE-2, etc., have become a common base for various tasks due to their superior quality and extensive knowledge bases. As image composition and generation are creative processes, the artists need control over various parts of the images being generated. We find that just adding details about parts in the base text prompt…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project Page Will Be Here: https://rangwani-harsh.github.io/PartCraft

  4. arXiv:2405.01040  [pdf, other]

    cs.CV cs.CL eess.IV

    Few Shot Class Incremental Learning using Vision-Language models

    Authors: Anurag Kumar, Chinmay Bharti, Saikat Dutta, Srikrishna Karanam, Biplab Banerjee

    Abstract: Recent advancements in deep learning have demonstrated remarkable performance comparable to human capabilities across various supervised computer vision tasks. However, the prevalent assumption of having an extensive pool of training data encompassing all classes prior to model training often diverges from real-world scenarios, where limited data availability for novel classes is the norm. The cha…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: under review at Pattern Recognition Letters

  5. arXiv:2403.11026  [pdf, other]

    cs.CV

    EfficientMorph: Parameter-Efficient Transformer-Based Architecture for 3D Image Registration

    Authors: Abu Zahid Bin Aziz, Mokshagna Sai Teja Karanam, Tushar Kataria, Shireen Y. Elhabian

    Abstract: Transformers have emerged as the state-of-the-art architecture in medical image registration, outperforming convolutional neural networks (CNNs) by addressing their limited receptive fields and overcoming gradient instability in deeper models. Despite their success, transformer-based models require substantial resources for training, including data, memory, and computational power, which may restr…

    Submitted 16 March, 2024; originally announced March 2024.

  6. arXiv:2312.04429  [pdf, other]

    cs.CV

    Approximate Caching for Efficiently Serving Diffusion Models

    Authors: Shubham Agarwal, Subrata Mitra, Sarthak Chakraborty, Srikrishna Karanam, Koyel Mukherjee, Shiv Saini

    Abstract: Text-to-image generation using diffusion models has seen explosive popularity owing to their ability in producing high quality images adhering to text prompts. However, production-grade diffusion model serving is a resource intensive task that not only requires high-end GPUs which are expensive but also incurs considerable latency. In this paper, we introduce a technique called approximate-caching…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted at NSDI'24

  7. arXiv:2311.11919  [pdf, other]

    cs.CV

    An Image is Worth Multiple Words: Multi-attribute Inversion for Constrained Text-to-Image Synthesis

    Authors: Aishwarya Agarwal, Srikrishna Karanam, Tripti Shukla, Balaji Vasan Srinivasan

    Abstract: We consider the problem of constraining diffusion model outputs with a user-supplied reference image. Our key objective is to extract multiple attributes (e.g., color, object, layout, style) from this single reference image, and then generate new samples with them. One line of existing work proposes to invert the reference images into a single textual conditioning vector, enabling generation of ne…

    Submitted 20 November, 2023; originally announced November 2023.

  8. arXiv:2309.00613  [pdf, other]

    cs.CV cs.AI cs.LG

    Iterative Multi-granular Image Editing using Diffusion Models

    Authors: K J Joseph, Prateksha Udhayanan, Tripti Shukla, Aishwarya Agarwal, Srikrishna Karanam, Koustava Goswami, Balaji Vasan Srinivasan

    Abstract: Recent advances in text-guided image synthesis have dramatically changed how creative professionals generate artistic and aesthetically pleasing visual assets. To fully support such creative endeavors, the process should possess the ability to: 1) iteratively edit the generations and 2) control the spatial reach of desired changes (global, local or anything in between). We formalize this pragmatic…

    Submitted 28 October, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024

  9. arXiv:2308.16649  [pdf, other]

    cs.CV

    Learning with Multi-modal Gradient Attention for Explainable Composed Image Retrieval

    Authors: Prateksha Udhayanan, Srikrishna Karanam, Balaji Vasan Srinivasan

    Abstract: We consider the problem of composed image retrieval that takes an input query consisting of an image and a modification text indicating the desired changes to be made on the image and retrieves images that match these changes. Current state-of-the-art techniques that address this problem use global features for the retrieval, resulting in incorrect localization of the regions of interest to be mod…

    Submitted 31 August, 2023; originally announced August 2023.

  10. arXiv:2307.03273  [pdf, other]

    cs.CV eess.IV

    ADASSM: Adversarial Data Augmentation in Statistical Shape Models From Images

    Authors: Mokshagna Sai Teja Karanam, Tushar Kataria, Krithika Iyer, Shireen Elhabian

    Abstract: Statistical shape models (SSM) have been well-established as an excellent tool for identifying variations in the morphology of anatomy across the underlying population. Shape models use consistent shape representation across all the samples in a given cohort, which helps to compare shapes and identify the variations that can detect pathologies and help in formulating treatment plans. In medical im…

    Submitted 21 August, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

  11. arXiv:2307.00910  [pdf, other]

    cs.CV cs.AI

    CoPL: Contextual Prompt Learning for Vision-Language Understanding

    Authors: Koustava Goswami, Srikrishna Karanam, Prateksha Udhayanan, K J Joseph, Balaji Vasan Srinivasan

    Abstract: Recent advances in multimodal learning have resulted in powerful vision-language models, whose representations are generalizable across a variety of downstream tasks. Recently, their generalization ability has been further extended by incorporating trainable prompts, borrowed from the natural language processing literature. While such prompt learning techniques have shown impressive results, we ide…

    Submitted 12 December, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: Accepted at AAAI 2024

  12. arXiv:2306.14603  [pdf, other]

    cs.CV

    Learning with Difference Attention for Visually Grounded Self-supervised Representations

    Authors: Aishwarya Agarwal, Srikrishna Karanam, Balaji Vasan Srinivasan

    Abstract: Recent works in self-supervised learning have shown impressive results on single-object images, but they struggle to perform well on complex multi-object images as evidenced by their poor visual grounding. To demonstrate this concretely, we propose visual difference attention (VDA) to compute visual attention maps in an unsupervised fashion by comparing an image with its salient-regions-masked-out…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: 15 pages, 14 figures

  13. arXiv:2306.14544  [pdf, other]

    cs.CV

    A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis

    Authors: Aishwarya Agarwal, Srikrishna Karanam, K J Joseph, Apoorv Saxena, Koustava Goswami, Balaji Vasan Srinivasan

    Abstract: While recent developments in text-to-image generative models have led to a suite of high-performing methods capable of producing creative imagery from free-form text, there are several limitations. By analyzing the cross-attention representations of these models, we notice two key issues. First, for text prompts that contain multiple concepts, there is a significant amount of pixel-space overlap (…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: 15 pages, 16 figures

  14. arXiv:2306.02270  [pdf, other]

    cs.CR

    Crypto-ransomware Detection through Quantitative API-based Behavioral Profiling

    Authors: Wenjia Song, Sanjula Karanam, Ya Xiao, Jingyuan Qi, Nathan Dautenhahn, Na Meng, Elena Ferrari, Danfeng Yao

    Abstract: Crypto-ransomware has caused an unprecedented scope of impact in recent years with an evolving level of sophistication. We are in urgent need to pinpoint the security gap and improve the effectiveness of defenses by identifying new detection approaches. In this paper, we quantitatively characterized the runtime behaviors of 54 ransomware samples from 35 distinct families, with a focus on the core…

    Submitted 9 October, 2023; v1 submitted 4 June, 2023; originally announced June 2023.

  15. arXiv:2302.14757  [pdf, other]

    cs.MM cs.IR cs.SD eess.AS

    Audio Retrieval for Multimodal Design Documents: A New Dataset and Algorithms

    Authors: Prachi Singh, Srikrishna Karanam, Sumit Shekhar

    Abstract: We consider and propose a new problem of retrieving audio files relevant to multimodal design document inputs comprising both textual elements and visual imagery, e.g., birthday/greeting cards. In addition to enhancing user experience, integrating audio that matches the theme/style of these inputs also helps improve the accessibility of these documents (e.g., visually impaired people can listen to…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: 5 pages including references

  16. arXiv:2209.04599  [pdf, other]

    cs.CR cs.CV cs.LG

    Preserving Privacy in Federated Learning with Ensemble Cross-Domain Knowledge Distillation

    Authors: Xuan Gong, Abhishek Sharma, Srikrishna Karanam, Ziyan Wu, Terrence Chen, David Doermann, Arun Innanje

    Abstract: Federated Learning (FL) is a machine learning paradigm where local nodes collaboratively train a central model while the training data remains decentralized. Existing FL methods typically share model parameters or employ co-distillation to address the issue of unbalanced data distribution. However, they suffer from communication bottlenecks. More importantly, they risk privacy leakage. In this wor…

    Submitted 10 September, 2022; originally announced September 2022.

    Comments: Accepted by AAAI2022

  17. arXiv:2209.04596  [pdf, other]

    cs.CV cs.AI

    Self-supervised Human Mesh Recovery with Cross-Representation Alignment

    Authors: Xuan Gong, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, David Doermann, Ziyan Wu

    Abstract: Fully supervised human mesh recovery methods are data-hungry and have poor generalizability due to the limited availability and diversity of 3D-annotated benchmark datasets. Recent progress in self-supervised human mesh recovery has been made using synthetic-data-driven training paradigms where the model is trained from synthetic paired 2D representation (e.g., 2D keypoints and segmentation masks)…

    Submitted 10 September, 2022; originally announced September 2022.

    Comments: Accepted ECCV2022

  18. arXiv:2207.05282  [pdf, other]

    cs.CV

    PseudoClick: Interactive Image Segmentation with Click Imitation

    Authors: Qin Liu, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, Marc Niethammer, Ziyan Wu

    Abstract: The goal of click-based interactive image segmentation is to obtain precise object segmentation masks with limited user interaction, i.e., by a minimal number of user clicks. Existing methods require users to provide all the clicks: by first inspecting the segmentation mask and then providing points on mislabeled regions, iteratively. We ask the question: can our model directly predict where to cl…

    Submitted 26 July, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 18 pages, 6 figures, 7 tables. ECCV 2022

  19. Learning Hierarchical Attention for Weakly-supervised Chest X-Ray Abnormality Localization and Diagnosis

    Authors: Xi Ouyang, Srikrishna Karanam, Ziyan Wu, Terrence Chen, Jiayu Huo, Xiang Sean Zhou, Qian Wang, Jie-Zhi Cheng

    Abstract: We consider the problem of abnormality localization for clinical applications. While deep learning has driven much recent progress in medical imaging, many clinical challenges are not fully addressed, limiting its broader usage. While recent methods report high diagnostic accuracies, physicians have concerns trusting these algorithm results for diagnostic decision-making purposes because of a gene…

    Submitted 22 December, 2021; originally announced December 2021.

    Journal ref: IEEE Transactions on Medical Imaging 2021

  20. arXiv:2107.12847  [pdf, other]

    cs.CV cs.LG cs.RO

    Learning Local Recurrent Models for Human Mesh Recovery

    Authors: Runze Li, Srikrishna Karanam, Ren Li, Terrence Chen, Bir Bhanu, Ziyan Wu

    Abstract: We consider the problem of estimating frame-level full human body meshes given a video of a person with natural motion dynamics. While much progress in this field has been in single image-based mesh estimation, there has been a recent uptick in efforts to infer mesh dynamics from video given its role in alleviating issues such as depth ambiguity and occlusions. However, a key limitation of existin…

    Submitted 27 July, 2021; originally announced July 2021.

    Comments: 10 pages, 6 figures, 2 tables

  21. arXiv:2107.11878  [pdf, other]

    cs.CV

    Spatio-Temporal Representation Factorization for Video-based Person Re-Identification

    Authors: Abhishek Aich, Meng Zheng, Srikrishna Karanam, Terrence Chen, Amit K. Roy-Chowdhury, Ziyan Wu

    Abstract: Despite much recent progress in video-based person re-identification (re-ID), the current state-of-the-art still suffers from common real-world challenges such as appearance similarity among various people, occlusions, and frame misalignment. To alleviate these problems, we propose Spatio-Temporal Representation Factorization (STRF), a flexible new computational unit that can be used in conjunctio…

    Submitted 14 August, 2021; v1 submitted 25 July, 2021; originally announced July 2021.

    Comments: Accepted at IEEE ICCV 2021, Includes Supplementary Material

  22. arXiv:2107.06239  [pdf, other]

    cs.CV cs.GR cs.LG cs.RO stat.ML

    Everybody Is Unique: Towards Unbiased Human Mesh Recovery

    Authors: Ren Li, Meng Zheng, Srikrishna Karanam, Terrence Chen, Ziyan Wu

    Abstract: We consider the problem of obese human mesh recovery, i.e., fitting a parametric human mesh to images of obese people. Despite obese person mesh fitting being an important problem with numerous applications (e.g., healthcare), much recent progress in mesh recovery has been restricted to images of non-obese people. In this work, we identify this crucial gap in the current literature by presenting a…

    Submitted 13 July, 2021; originally announced July 2021.

    Comments: 10 pages, 5 figures, 4 tables

  23. arXiv:2105.00290  [pdf, other]

    cs.CV cs.AI cs.LG

    A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

    Authors: Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, Ziyan Wu

    Abstract: Despite substantial progress in applying neural networks (NN) to a wide variety of areas, they still largely suffer from a lack of transparency and interpretability. While recent developments in explainable artificial intelligence attempt to bridge this gap (e.g., by visualizing the correlation between input pixels and final outputs), these approaches are limited to explaining low-level relationsh…

    Submitted 1 May, 2021; originally announced May 2021.

    Comments: CVPR 2021

  24. arXiv:2008.06035  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Towards Visually Explaining Similarity Models

    Authors: Meng Zheng, Srikrishna Karanam, Terrence Chen, Richard J. Radke, Ziyan Wu

    Abstract: We consider the problem of visually explaining similarity models, i.e., explaining why a model predicts two images to be similar in addition to producing a scalar score. While much recent work in visual model interpretability has focused on gradient-based attention, these methods rely on a classification module to generate visual explanations. Consequently, they cannot readily explain other kinds…

    Submitted 13 October, 2020; v1 submitted 13 August, 2020; originally announced August 2020.

    Comments: 13 pages, 10 figures, 4 tables. arXiv admin note: substantial text overlap with arXiv:1911.07381

  25. arXiv:2003.04232  [pdf, other]

    cs.CV cs.LG cs.RO

    Hierarchical Kinematic Human Mesh Recovery

    Authors: Georgios Georgakis, Ren Li, Srikrishna Karanam, Terrence Chen, Jana Kosecka, Ziyan Wu

    Abstract: We consider the problem of estimating a parametric model of 3D human mesh from a single image. While there has been substantial recent progress in this area with direct regression of model parameters, these methods only implicitly exploit the human body kinematic structure, leading to sub-optimal use of the model prior. In this work, we address this gap by proposing a new technique for regression…

    Submitted 14 July, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

    Comments: 17 pages, 8 figures, 5 tables, ECCV 2020

  26. arXiv:1911.07389  [pdf, other]

    cs.CV cs.LG

    Towards Visually Explaining Variational Autoencoders

    Authors: Wenqian Liu, Runze Li, Meng Zheng, Srikrishna Karanam, Ziyan Wu, Bir Bhanu, Richard J. Radke, Octavia Camps

    Abstract: Recent advances in Convolutional Neural Network (CNN) model interpretability have led to impressive progress in visualizing and understanding model predictions. In particular, gradient-based visual attention methods have driven much recent effort in using visual attention maps as a means for visual explanations. A key problem, however, is these methods are designed for classification and categoriz…

    Submitted 14 April, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: 10 pages, 9 figures, 2 tables, CVPR 2020

  27. arXiv:1911.07383  [pdf, other]

    cs.CV cs.LG cs.RO

    Towards Robust RGB-D Human Mesh Recovery

    Authors: Ren Li, Changjiang Cai, Georgios Georgakis, Srikrishna Karanam, Terrence Chen, Ziyan Wu

    Abstract: We consider the problem of human pose estimation. While much recent work has focused on the RGB domain, these techniques are inherently under-constrained since there can be many 3D configurations that explain the same 2D projection. To this end, we propose a new method that uses RGB-D data to estimate a parametric human mesh model. Our key innovations include (a) the design of a new dynamic data f…

    Submitted 17 November, 2019; originally announced November 2019.

    Comments: 10 pages, 4 figures, 4 tables

  28. arXiv:1911.07381  [pdf, other]

    cs.CV cs.LG

    Visual Similarity Attention

    Authors: Meng Zheng, Srikrishna Karanam, Terrence Chen, Richard J. Radke, Ziyan Wu

    Abstract: While there has been substantial progress in learning suitable distance metrics, these techniques in general lack transparency and decision reasoning, i.e., explaining why the input set of images is similar or dissimilar. In this work, we solve this key problem by proposing the first method to generate generic visual similarity explanations with gradient-based attention. We demonstrate that our te…

    Submitted 3 May, 2022; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: 10 pages, 7 figures, 4 tables

  29. arXiv:1811.12297  [pdf, other]

    cs.CV cs.LG

    Incremental Scene Synthesis

    Authors: Benjamin Planche, Xuejian Rong, Ziyan Wu, Srikrishna Karanam, Harald Kosch, YingLi Tian, Jan Ernst, Andreas Hutter

    Abstract: We present a method to incrementally generate complete 2D or 3D scenes with the following properties: (a) it is globally consistent at each step according to a learned scene prior, (b) real observations of a scene can be incorporated while observing global consistency, (c) unobserved regions can be hallucinated locally in consistence with previous observations, hallucinations and global priors, an…

    Submitted 13 November, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

    Journal ref: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  30. arXiv:1811.07487  [pdf, other]

    cs.CV cs.LG

    Re-Identification with Consistent Attentive Siamese Networks

    Authors: Meng Zheng, Srikrishna Karanam, Ziyan Wu, Richard J. Radke

    Abstract: We propose a new deep architecture for person re-identification (re-id). While re-id has seen much recent progress, spatial localization and view-invariant representation learning for robust cross-view matching remain key, unsolved problems. We address these questions by means of a new attention-driven Siamese learning architecture, called the Consistent Attentive Siamese Network. Our key innovati…

    Submitted 11 April, 2019; v1 submitted 18 November, 2018; originally announced November 2018.

    Comments: 10 pages, 8 figures, 3 tables, to appear in CVPR 2019

  31. arXiv:1811.07484  [pdf, other]

    cs.CV cs.LG

    Sharpen Focus: Learning with Attention Separability and Consistency

    Authors: Lezi Wang, Ziyan Wu, Srikrishna Karanam, Kuan-Chuan Peng, Rajat Vikram Singh, Bo Liu, Dimitris N. Metaxas

    Abstract: Recent developments in gradient-based attention modeling have seen attention maps emerge as a powerful tool for interpreting convolutional neural networks. Despite good localization for an individual class of interest, these techniques produce attention maps with substantially overlapping responses among different classes, leading to the problem of visual confusion and the need for discriminative…

    Submitted 7 August, 2019; v1 submitted 18 November, 2018; originally announced November 2018.

    Comments: This paper is accepted to ICCV 2019. The supplementary material (appendix) can be found after the main paper

  32. arXiv:1811.07249  [pdf, other]

    cs.CV cs.LG cs.RO

    Learning Local RGB-to-CAD Correspondences for Object Pose Estimation

    Authors: Georgios Georgakis, Srikrishna Karanam, Ziyan Wu, Jana Kosecka

    Abstract: We consider the problem of 3D object pose estimation. While much recent work has focused on the RGB domain, the reliance on accurately annotated images limits their generalizability and scalability. On the other hand, the easily available CAD models of objects are rich sources of data, providing a large number of synthetically rendered images. In this paper, we solve this key problem of existing m…

    Submitted 31 July, 2019; v1 submitted 17 November, 2018; originally announced November 2018.

    Comments: 10 pages, 6 figures, 4 tables, ICCV 2019

  33. arXiv:1808.05499  [pdf, other]

    cs.CV

    Measuring the Temporal Behavior of Real-World Person Re-Identification

    Authors: Meng Zheng, Srikrishna Karanam, Richard J. Radke

    Abstract: Designing real-world person re-identification (re-id) systems requires attention to operational aspects not typically considered in academic research. Typically, the probe image or image sequence is matched to a gallery set with a fixed candidate list. On the other hand, in real-world applications of re-id, we would search for a person of interest in a gallery set that is continuously populated by…

    Submitted 16 August, 2018; originally announced August 2018.

    Comments: 14 pages, 14 figures

  34. arXiv:1802.07869  [pdf, other]

    cs.CV

    End-to-end learning of keypoint detector and descriptor for pose invariant 3D matching

    Authors: Georgios Georgakis, Srikrishna Karanam, Ziyan Wu, Jan Ernst, Jana Kosecka

    Abstract: Finding correspondences between images or 3D scans is at the heart of many computer vision and image retrieval applications and is often enabled by matching local keypoint descriptors. Various learning approaches have been applied in the past to different stages of the matching pipeline, considering detector, descriptor, or metric learning objectives. These objectives were typically addressed sepa…

    Submitted 9 May, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: 9 pages, 9 figures, 3 tables, CVPR 2018

  35. arXiv:1711.06148  [pdf, other]

    cs.CV cs.LG

    Learning Compositional Visual Concepts with Mutual Consistency

    Authors: Yunye Gong, Srikrishna Karanam, Ziyan Wu, Kuan-Chuan Peng, Jan Ernst, Peter C. Doerschuk

    Abstract: Compositionality of semantic concepts in image synthesis and analysis is appealing as it can help in decomposing known and generatively recomposing unknown data. For instance, we may learn concepts of changing illumination, geometry or albedo of a scene, and try to recombine them to generate physically meaningful, but unseen data for training and testing. In practice however we often do not have s…

    Submitted 28 March, 2018; v1 submitted 16 November, 2017; originally announced November 2017.

    Comments: 10 pages, 8 figures, 4 tables, CVPR 2018

  36. arXiv:1706.00553  [pdf, other]

    cs.CV

    Rank Persistence: Assessing the Temporal Performance of Real-World Person Re-Identification

    Authors: Srikrishna Karanam, Eric Lam, Richard J. Radke

    Abstract: Designing useful person re-identification systems for real-world applications requires attention to operational aspects not typically considered in academic research. Here, we focus on the temporal aspect of re-identification; that is, instead of finding a match to a probe person of interest in a fixed candidate gallery, we consider the more realistic scenario in which the gallery is continuously…

    Submitted 4 June, 2017; v1 submitted 2 June, 2017; originally announced June 2017.

    Comments: 8 pages, 7 figures

  37. arXiv:1605.09653  [pdf, other]

    cs.CV

    A Systematic Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets

    Authors: Srikrishna Karanam, Mengran Gou, Ziyan Wu, Angels Rates-Borras, Octavia Camps, Richard J. Radke

    Abstract: Person re-identification (re-id) is a critical problem in video analytics applications such as security and surveillance. The public release of several datasets and code for vision algorithms has facilitated rapid progress in this area over the last few years. However, directly comparing re-id algorithms reported in the literature has become difficult since a wide variety of features, experimental…

    Submitted 14 February, 2018; v1 submitted 31 May, 2016; originally announced May 2016.

    Comments: Preliminary work on person Re-Id benchmark. S. Karanam and M. Gou contributed equally. 14 pages, 6 figures, 4 tables. For supplementary material, see http://robustsystems.coe.neu.edu/sites/robustsystems.coe.neu.edu/files/systems/supmat/ReID_benchmark_supp.zip