Showing 1–50 of 50 results for author: Ahuja, N

Searching in archive cs.
  1. arXiv:2406.06512  [pdf, other]

    cs.CV cs.AI

    Merlin: A Vision Language Foundation Model for 3D Computed Tomography

    Authors: Louis Blankemeier, Joseph Paul Cohen, Ashwin Kumar, Dave Van Veen, Syed Jamal Safdar Gardezi, Magdalini Paschali, Zhihong Chen, Jean-Benoit Delbrouck, Eduardo Reis, Cesar Truyts, Christian Bluethgen, Malte Engmann Kjeldskov Jensen, Sophie Ostmeier, Maya Varma, Jeya Maria Jose Valanarasu, Zhongnan Fang, Zepeng Huo, Zaid Nabulsi, Diego Ardila, Wei-Hung Weng, Edson Amaro Junior, Neera Ahuja, Jason Fries, Nigam H. Shah, Andrew Johnston, et al. (6 additional authors not shown)

    Abstract: Over 85 million computed tomography (CT) scans are performed annually in the US, of which approximately one quarter focus on the abdomen. Given the current radiologist shortage, there is a large impetus to use artificial intelligence to alleviate the burden of interpreting these complex imaging studies. Prior state-of-the-art approaches for automated medical image interpretation leverage vision la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 18 pages, 7 figures

  2. arXiv:2406.01042  [pdf, other]

    cs.CV

    Self-Calibrating 4D Novel View Synthesis from Monocular Videos Using Gaussian Splatting

    Authors: Fang Li, Hao Zhang, Narendra Ahuja

    Abstract: Gaussian Splatting (GS) has significantly elevated scene reconstruction efficiency and novel view synthesis (NVS) accuracy compared to Neural Radiance Fields (NeRF), particularly for dynamic scenes. However, current 4D NVS methods, whether based on GS or NeRF, primarily rely on camera parameters provided by COLMAP and even utilize sparse point clouds generated by COLMAP for initialization, which l…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: GitHub Page: https://github.com/fangli333/SC-4DGS

  3. arXiv:2405.18560  [pdf, other]

    cs.CV cs.AI cs.IR cs.LG eess.IV

    Potential Field Based Deep Metric Learning

    Authors: Shubhang Bhatnagar, Narendra Ahuja

    Abstract: Deep metric learning (DML) involves training a network to learn a semantically meaningful representation space. Many current approaches mine n-tuples of examples and model interactions within each tuplets. We present a novel, compositional DML model, inspired by electrostatic fields in physics that, instead of in tuples, represents the influence of each example (embedding) by a continuous potentia…

    Submitted 28 May, 2024; originally announced May 2024.

  4. arXiv:2405.14017  [pdf, other]

    cs.CV

    MagicPose4D: Crafting Articulated Models with Appearance and Motion Control

    Authors: Hao Zhang, Di Chang, Fang Li, Mohammad Soleymani, Narendra Ahuja

    Abstract: With the success of 2D and 3D visual generative models, there is growing interest in generating 4D content. Existing methods primarily rely on text prompts to produce 4D content, but they often fall short of accurately defining complex or rare motions. To address this limitation, we propose MagicPose4D, a novel framework for refined control over both appearance and motion in 4D generation. Unlike…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Project Page: https://boese0601.github.io/magicpose4d

  5. arXiv:2405.12607  [pdf, other]

    cs.CV

    S3O: A Dual-Phase Approach for Reconstructing Dynamic Shape and Skeleton of Articulated Objects from Single Monocular Video

    Authors: Hao Zhang, Fang Li, Samyak Rawlekar, Narendra Ahuja

    Abstract: Reconstructing dynamic articulated objects from a singular monocular video is challenging, requiring joint estimation of shape, motion, and camera parameters from limited views. Current methods typically demand extensive computational resources and training time, and require additional human annotations such as predefined parametric models, camera poses, and key points, limiting their generalizabi…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  6. arXiv:2404.16193  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM eess.IV

    Improving Multi-label Recognition using Class Co-Occurrence Probabilities

    Authors: Samyak Rawlekar, Shubhang Bhatnagar, Vishnuvardhan Pogunulu Srinivasulu, Narendra Ahuja

    Abstract: Multi-label Recognition (MLR) involves the identification of multiple objects within an image. To address the additional complexity of this problem, recent works have leveraged information from vision-language models (VLMs) trained on large text-images datasets for the task. These methods learn an independent classifier for each object (class), overlooking correlations in their occurrences. Such c…

    Submitted 24 April, 2024; originally announced April 2024.

  7. arXiv:2403.14977  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    Piecewise-Linear Manifolds for Deep Metric Learning

    Authors: Shubhang Bhatnagar, Narendra Ahuja

    Abstract: Unsupervised deep metric learning (UDML) focuses on learning a semantic representation space using only unlabeled data. This challenging problem requires accurately estimating the similarity between data points, which is used to supervise a deep network. For this purpose, we propose to model the high-dimensional data manifold using a piecewise-linear approximation, with each low-dimensional linear…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Accepted at CPAL 2024 (Oral)

  8. arXiv:2401.08809  [pdf, other]

    cs.CV

    Learning Implicit Representation for Reconstructing Articulated Objects

    Authors: Hao Zhang, Fang Li, Samyak Rawlekar, Narendra Ahuja

    Abstract: 3D Reconstruction of moving articulated objects without additional information about object structure is a challenging problem. Current methods overcome such challenges by employing category-specific skeletal models. Consequently, they do not generalize well to articulated objects in the wild. We treat an articulated object as an unknown, semi-rigid skeletal structure surrounded by nonrigid materi…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR 2024. Code: https://github.com/haoz19/LIMR

  9. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  10. arXiv:2312.05538  [pdf, other]

    cs.CV

    CSL: Class-Agnostic Structure-Constrained Learning for Segmentation Including the Unseen

    Authors: Hao Zhang, Fang Li, Lu Qi, Ming-Hsuan Yang, Narendra Ahuja

    Abstract: Addressing Out-Of-Distribution (OOD) Segmentation and Zero-Shot Semantic Segmentation (ZS3) is challenging, necessitating segmenting unseen classes. Existing strategies adapt the class-agnostic Mask2Former (CA-M2F) tailored to specific tasks. However, these methods cater to singular tasks, demand training from scratch, and we demonstrate certain deficiencies in CA-M2F, which affect performance. We…

    Submitted 8 February, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  11. arXiv:2310.16383  [pdf, other]

    cs.CV

    Open-NeRF: Towards Open Vocabulary NeRF Decomposition

    Authors: Hao Zhang, Fang Li, Narendra Ahuja

    Abstract: In this paper, we address the challenge of decomposing Neural Radiance Fields (NeRF) into objects from an open vocabulary, a critical task for object manipulation in 3D reconstruction and view synthesis. Current techniques for NeRF decomposition involve a trade-off between the flexibility of processing open-vocabulary queries and the accuracy of 3D segmentation. We present, Open-vocabulary Embedde…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted by WACV 2024

  12. Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization

    Authors: Dave Van Veen, Cara Van Uden, Louis Blankemeier, Jean-Benoit Delbrouck, Asad Aali, Christian Bluethgen, Anuj Pareek, Malgorzata Polacin, Eduardo Pontes Reis, Anna Seehofnerova, Nidhi Rohatgi, Poonam Hosamani, William Collins, Neera Ahuja, Curtis P. Langlotz, Jason Hom, Sergios Gatidis, John Pauly, Akshay S. Chaudhari

    Abstract: Analyzing vast textual data and summarizing key information from electronic health records imposes a substantial burden on how clinicians allocate their time. Although large language models (LLMs) have shown promise in natural language processing (NLP), their effectiveness on a diverse range of clinical summarization tasks remains unproven. In this study, we apply adaptation methods to eight LLMs,…

    Submitted 11 April, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: 27 pages, 19 figures

    Journal ref: Nature Medicine, 2024

  13. arXiv:2308.04643  [pdf, other]

    cs.CV cs.HC cs.RO eess.IV

    Long-Distance Gesture Recognition using Dynamic Neural Networks

    Authors: Shubhang Bhatnagar, Sharath Gopal, Narendra Ahuja, Liu Ren

    Abstract: Gestures form an important medium of communication between humans and machines. An overwhelming majority of existing gesture recognition methods are tailored to a scenario where humans and machines are located very close to each other. This short-distance assumption does not hold true for several types of interactions, for example gesture-based interactions with a floor cleaning robot or with a dr…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)

    Journal ref: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 2023, pp. 1307-1312

  14. arXiv:2305.17295  [pdf, other]

    eess.IV cs.IT

    Rate-Distortion Theory in Coding for Machines and its Application

    Authors: Alon Harell, Yalda Foroutan, Nilesh Ahuja, Parual Datta, Bhavya Kanzariya, V. Srinivasa Somayazulu, Omesh Tickoo, Anderson de Andrade, Ivan V. Bajic

    Abstract: Recent years have seen a tremendous growth in both the capability and popularity of automatic machine analysis of images and video. As a result, a growing need for efficient compression methods optimized for machine vision, rather than human vision, has emerged. To meet this growing demand, several methods have been developed for image and video coding for machines. Unfortunately, while there is a…

    Submitted 26 May, 2023; originally announced May 2023.

  15. arXiv:2302.10718  [pdf, other]

    cs.CV

    Effects of Architectures on Continual Semantic Segmentation

    Authors: Tobias Kalb, Niket Ahuja, Jingxing Zhou, Jürgen Beyerer

    Abstract: Research in the field of Continual Semantic Segmentation is mainly investigating novel learning algorithms to overcome catastrophic forgetting of neural networks. Most recent publications have focused on improving learning algorithms without distinguishing effects caused by the choice of neural architecture. Therefore, we study how the choice of neural network architecture affects catastrophic forg…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Currently under Review

  16. arXiv:2211.12650  [pdf, other]

    cs.CV cs.LG

    FRE: A Fast Method For Anomaly Detection And Segmentation

    Authors: Ibrahima Ndiour, Nilesh Ahuja, Utku Genc, Omesh Tickoo

    Abstract: This paper presents a fast and principled approach for solving the visual anomaly detection and segmentation problem. In this setup, we have access to only anomaly-free training data and want to detect and identify anomalies of an arbitrary nature on test data. We propose the application of linear statistical dimensionality reduction techniques on the intermediate features produced by a pretrained…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2203.10422

  17. arXiv:2210.16472  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation

    Authors: Moitreya Chatterjee, Narendra Ahuja, Anoop Cherian

    Abstract: There exists an unequivocal distinction between the sound produced by a static source and that produced by a moving one, especially when the source moves towards or away from the microphone. In this paper, we propose to use this connection between audio and visual dynamics for solving two challenging tasks simultaneously, namely: (i) separating audio sources from a mixture using visual cues, and (…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022

  18. arXiv:2210.06540  [pdf]

    cs.CR

    Blockchain for Unmanned Underwater Drones: Research Issues, Challenges, Trends and Future Directions

    Authors: Neelu Jyoti Ahuja, Adarsh Kumar, Monika Thapliyal, Sarthika Dutt, Tanesh Kumar, Diego Augusto De Jesus Pacheco, Charalambos Konstantinou, Kim-Kwang Raymond Choo

    Abstract: Underwater drones have found a place in oceanography, oceanic research, bathymetric surveys, military, surveillance, monitoring, undersea exploration, mining, commercial diving, photography and several other activities. Drones housed with several sensors and complex propulsion systems help oceanographic scientists and undersea explorers to map the seabed, study waves, view dead zones, analyze fish…

    Submitted 12 October, 2022; originally announced October 2022.

  19. arXiv:2208.11596  [pdf, other]

    cs.LG cs.IT stat.ML

    A Low-Complexity Approach to Rate-Distortion Optimized Variable Bit-Rate Compression for Split DNN Computing

    Authors: Parual Datta, Nilesh Ahuja, V. Srinivasa Somayazulu, Omesh Tickoo

    Abstract: Split computing has emerged as a recent paradigm for implementation of DNN-based AI workloads, wherein a DNN model is split into two parts, one of which is executed on a mobile/client device and the other on an edge-server (or cloud). Data compression is applied to the intermediate tensor from the DNN that needs to be transmitted, addressing the challenge of optimizing the rate-accuracy-complexity…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: ICPR 2022

  20. arXiv:2206.07988  [pdf, other]

    cs.AI

    PreCogIIITH at HinglishEval : Leveraging Code-Mixing Metrics & Language Model Embeddings To Estimate Code-Mix Quality

    Authors: Prashant Kodali, Tanmay Sachan, Akshay Goindani, Anmol Goel, Naman Ahuja, Manish Shrivastava, Ponnurangam Kumaraguru

    Abstract: Code-Mixing is a phenomenon of mixing two or more languages in a speech event and is prevalent in multilingual societies. Given the low-resource nature of Code-Mixing, machine generation of code-mixed text is a prevalent approach for data augmentation. However, evaluating the quality of such machine generated code-mixed text is an open problem. In our submission to HinglishEval, a shared-task coll…

    Submitted 16 June, 2022; originally announced June 2022.

  21. arXiv:2203.10422  [pdf, other]

    cs.LG

    Subspace Modeling for Fast Out-Of-Distribution and Anomaly Detection

    Authors: Ibrahima J. Ndiour, Nilesh A. Ahuja, Omesh Tickoo

    Abstract: This paper presents a fast, principled approach for detecting anomalous and out-of-distribution (OOD) samples in deep neural networks (DNN). We propose the application of linear statistical dimensionality reduction techniques on the semantic features produced by a DNN, in order to capture the low-dimensional subspace truly spanned by said features. We show that the "feature reconstruction error" (…

    Submitted 19 March, 2022; originally announced March 2022.

    Comments: arXiv admin note: text overlap with arXiv:2012.04250

  22. arXiv:2202.08341  [pdf, other]

    cs.CV cs.LG

    Anomalib: A Deep Learning Library for Anomaly Detection

    Authors: Samet Akcay, Dick Ameln, Ashwin Vaidya, Barath Lakshmanan, Nilesh Ahuja, Utku Genc

    Abstract: This paper introduces anomalib, a novel library for unsupervised anomaly detection and localization. With reproducibility and modularity in mind, this open-source library provides algorithms from the literature and a set of tools to design custom anomaly detection algorithms via a plug-and-play approach. Anomalib comprises state-of-the-art anomaly detection algorithms that achieve top performance…

    Submitted 16 February, 2022; originally announced February 2022.

  23. arXiv:2201.06741  [pdf]

    cs.CL

    HashSet -- A Dataset For Hashtag Segmentation

    Authors: Prashant Kodali, Akshala Bhatnagar, Naman Ahuja, Manish Shrivastava, Ponnurangam Kumaraguru

    Abstract: Hashtag segmentation is the task of breaking a hashtag into its constituent tokens. Hashtags often encode the essence of user-generated posts, along with information like topic and sentiment, which are useful in downstream tasks. Hashtags prioritize brevity and are written in unique ways -- transliterating and mixing languages, spelling variations, creative named entities. Benchmark datasets used…

    Submitted 17 January, 2022; originally announced January 2022.

  24. arXiv:2111.10971  [pdf, other]

    cs.CV

    Tracking Grow-Finish Pigs Across Large Pens Using Multiple Cameras

    Authors: Aniket Shirke, Aziz Saifuddin, Achleshwar Luthra, Jiangong Li, Tawni Williams, Xiaodan Hu, Aneesh Kotnana, Okan Kocabalkanli, Narendra Ahuja, Angela Green-Miller, Isabella Condotta, Ryan N. Dilger, Matthew Caesar

    Abstract: Increasing demand for meat products combined with farm labor shortages has resulted in a need to develop new real-time solutions to monitor animals effectively. Significant progress has been made in continuously locating individual pigs using tracking-by-detection methods. However, these methods fail for oblong pens because a single fixed camera does not cover the entire floor at adequate resoluti…

    Submitted 21 November, 2021; originally announced November 2021.

    Comments: 6 pages, 4 figures, Accepted at the CVPR 2021 CV4Animals workshop

  25. arXiv:2110.03446  [pdf, other]

    cs.CV cs.AI cs.LG

    A Hierarchical Variational Neural Uncertainty Model for Stochastic Video Prediction

    Authors: Moitreya Chatterjee, Narendra Ahuja, Anoop Cherian

    Abstract: Predicting the future frames of a video is a challenging task, in part due to the underlying stochastic real-world phenomena. Prior approaches to solve this task typically estimate a latent prior characterizing this stochasticity, however do not account for the predictive uncertainty of the (deep learning) model. Such approaches often derive the training signal from the mean-squared error (MSE) be…

    Submitted 5 October, 2021; originally announced October 2021.

    Comments: Accepted at ICCV 2021 (Oral)

  26. arXiv:2109.11955  [pdf, other]

    cs.CV cs.AI cs.LG cs.SD eess.AS

    Visual Scene Graphs for Audio Source Separation

    Authors: Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian

    Abstract: State-of-the-art approaches for visually-guided audio source separation typically assume sources that have characteristic sounds, such as musical instruments. These approaches often ignore the visual context of these sound sources or avoid modeling object interactions that may be useful to better characterize the sources, especially when the same object class may produce varied sounds from distinc…

    Submitted 24 September, 2021; originally announced September 2021.

    Comments: Accepted at ICCV 2021

  27. arXiv:2109.09166  [pdf, other]

    cs.CV

    Unsupervised 3D Pose Estimation for Hierarchical Dance Video Recognition

    Authors: Xiaodan Hu, Narendra Ahuja

    Abstract: Dance experts often view dance as a hierarchy of information, spanning low-level (raw images, image sequences), mid-levels (human poses and bodypart movements), and high-level (dance genre). We propose a Hierarchical Dance Video Recognition framework (HDVR). HDVR estimates 2D pose sequences, tracks dancers, and then simultaneously estimates corresponding 3D poses and 3D-to-2D imaging parameters, w…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: To appear in ICCV2021

  28. arXiv:2109.06873  [pdf, other]

    cs.LG cs.AI

    Robust Contrastive Active Learning with Feature-guided Query Strategies

    Authors: Ranganath Krishnan, Nilesh Ahuja, Alok Sinha, Mahesh Subedar, Omesh Tickoo, Ravi Iyer

    Abstract: We introduce supervised contrastive active learning (SCAL) and propose efficient query strategies in active learning based on the feature similarity (featuresim) and principal component analysis based feature-reconstruction error (fre) to select informative data samples with diverse feature representations. We demonstrate our proposed method achieves state-of-the-art accuracy, model calibration an…

    Submitted 14 August, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: 20 pages with appendix. arXiv admin note: text overlap with arXiv:2109.06321

  29. arXiv:2109.06321  [pdf, other]

    cs.LG

    Mitigating Sampling Bias and Improving Robustness in Active Learning

    Authors: Ranganath Krishnan, Alok Sinha, Nilesh Ahuja, Mahesh Subedar, Omesh Tickoo, Ravi Iyer

    Abstract: This paper presents simple and efficient methods to mitigate sampling bias in active learning while achieving state-of-the-art accuracy and model robustness. We introduce supervised contrastive active learning by leveraging the contrastive loss for active learning under a supervised setting. We propose an unbiased query strategy that selects informative data samples of diverse feature representati…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Human in the Loop Learning workshop at International Conference on Machine Learning (ICML 2021)

  30. arXiv:2105.03270  [pdf, other]

    cs.LG cs.CV

    Energy-Based Anomaly Detection and Localization

    Authors: Ergin Utku Genc, Nilesh Ahuja, Ibrahima J Ndiour, Omesh Tickoo

    Abstract: This brief sketches initial progress towards a unified energy-based solution for the semi-supervised visual anomaly detection and localization problem. In this setup, we have access to only anomaly-free training data and want to detect and identify anomalies of an arbitrary nature on test data. We employ the density estimates from the energy-based model (EBM) as normalcy scores that can be used to…

    Submitted 7 May, 2021; originally announced May 2021.

    Comments: 9 pages, 3 figures, as submitted to EBM ICLR 2021 workshop

  31. arXiv:2012.04250  [pdf, other]

    cs.LG

    Out-Of-Distribution Detection With Subspace Techniques And Probabilistic Modeling Of Features

    Authors: Ibrahima Ndiour, Nilesh Ahuja, Omesh Tickoo

    Abstract: This paper presents a principled approach for detecting out-of-distribution (OOD) samples in deep neural networks (DNN). Modeling probability distributions on deep features has recently emerged as an effective, yet computationally cheap method to detect OOD samples in DNN. However, the features produced by a DNN at any given layer do not fully occupy the corresponding high-dimensional feature spac…

    Submitted 8 December, 2020; originally announced December 2020.

  32. arXiv:2007.12130  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    Sound2Sight: Generating Visual Dynamics from Sound and Context

    Authors: Anoop Cherian, Moitreya Chatterjee, Narendra Ahuja

    Abstract: Learning associations across modalities is critical for robust multimodal reasoning, especially when a modality may be missing during inference. In this paper, we study this problem in the context of audio-conditioned visual synthesis -- a task that is important, for example, in occlusion reasoning. Specifically, our goal is to generate future video frames and their motion dynamics conditioned on…

    Submitted 23 July, 2020; originally announced July 2020.

    Comments: Accepted at ECCV 2020

  33. arXiv:1912.08434  [pdf, other]

    stat.ML cs.LG stat.CO

    Tree pyramidal adaptive importance sampling

    Authors: Javier Felip, Nilesh Ahuja, Omesh Tickoo

    Abstract: This paper introduces Tree-Pyramidal Adaptive Importance Sampling (TP-AIS), a novel iterated sampling method that outperforms state-of-the-art approaches like deterministic mixture population Monte Carlo (DM-PMC), mixture population Monte Carlo (M-PMC) and layered adaptive importance sampling (LAIS). TP-AIS iteratively builds a proposal distribution parameterized by a tree pyramid, where each tree…

    Submitted 23 March, 2020; v1 submitted 18 December, 2019; originally announced December 2019.

    Comments: 20 pages + 13 pages of additional result plots and evaluation details

  34. arXiv:1912.01206  [pdf, ps, other]

    cs.LG stat.ML

    Deep Probabilistic Models to Detect Data Poisoning Attacks

    Authors: Mahesh Subedar, Nilesh Ahuja, Ranganath Krishnan, Ibrahima J. Ndiour, Omesh Tickoo

    Abstract: Data poisoning attacks compromise the integrity of machine-learning models by introducing malicious training samples to influence the results during test time. In this work, we investigate backdoor data poisoning attack on deep neural networks (DNNs) by inserting a backdoor pattern in the training images. The resulting attack will misclassify poisoned test samples while maintaining high accuracies…

    Submitted 3 December, 2019; originally announced December 2019.

    Comments: To appear in Bayesian Deep Learning Workshop at NeurIPS 2019

  35. arXiv:1909.11786  [pdf, other]

    stat.ML cs.LG

    Probabilistic Modeling of Deep Features for Out-of-Distribution and Adversarial Detection

    Authors: Nilesh A. Ahuja, Ibrahima Ndiour, Trushant Kalyanpur, Omesh Tickoo

    Abstract: We present a principled approach for detecting out-of-distribution (OOD) and adversarial samples in deep neural networks. Our approach consists in modeling the outputs of the various layers (deep features) with parametric probability distributions once training is completed. At inference, the likelihoods of the deep features w.r.t the previously learnt distributions are calculated and used to deri…

    Submitted 25 September, 2019; originally announced September 2019.

  36. arXiv:1905.13307  [pdf, other]

    cs.CV cs.LG stat.ML

    Real-time Approximate Bayesian Computation for Scene Understanding

    Authors: Javier Felip, Nilesh Ahuja, David Gómez-Gutiérrez, Omesh Tickoo, Vikash Mansinghka

    Abstract: Consider scene understanding problems such as predicting where a person is probably reaching, or inferring the pose of 3D objects from depth images, or inferring the probable street crossings of pedestrians at a busy intersection. This paper shows how to solve these problems using Approximate Bayesian Computation. The underlying generative models are built from realistic simulation software, wrapp…

    Submitted 22 May, 2019; originally announced May 2019.

  37. arXiv:1807.09810  [pdf, other]

    cs.CV cs.LG

    Coreset-Based Neural Network Compression

    Authors: Abhimanyu Dubey, Moitreya Chatterjee, Narendra Ahuja

    Abstract: We propose a novel Convolutional Neural Network (CNN) compression algorithm based on coreset representations of filters. We exploit the redundancies extant in the space of CNN weights and neuronal activations (across samples) in order to obtain compression. Our method requires no retraining, is easy to implement, and obtains state-of-the-art compression performance across a wide variety of CNN arc…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: Camera-Ready version for ECCV 2018

  38. arXiv:1804.00650  [pdf, other]

    cs.CV

    DeepMVS: Learning Multi-view Stereopsis

    Authors: Po-Han Huang, Kevin Matzen, Johannes Kopf, Narendra Ahuja, Jia-Bin Huang

    Abstract: We present DeepMVS, a deep convolutional neural network (ConvNet) for multi-view stereo reconstruction. Taking an arbitrary number of posed images as input, we first produce a set of plane-sweep volumes and use the proposed DeepMVS network to predict high-quality disparity maps. The key contributions that enable these results are (1) supervised pretraining on a photorealistic synthetic dataset, (2…

    Submitted 2 April, 2018; originally announced April 2018.

    Comments: CVPR 2018. Project page: https://phuang17.github.io/DeepMVS/ Code: https://github.com/phuang17/DeepMVS

  39. arXiv:1710.04200  [pdf, other]

    cs.CV

    Joint Image Filtering with Deep Convolutional Networks

    Authors: Yijun Li, Jia-Bin Huang, Narendra Ahuja, Ming-Hsuan Yang

    Abstract: Joint image filters leverage the guidance image as a prior and transfer the structural details from the guidance image to the target image for suppressing noise or enhancing spatial resolution. Existing methods either rely on various explicit filter constructions or hand-designed objective functions, thereby making it difficult to understand, improve, and accelerate these filters in a coherent fra…

    Submitted 2 January, 2019; v1 submitted 11 October, 2017; originally announced October 2017.

    Comments: Accepted by TPAMI

  40. arXiv:1710.02139  [pdf, other]

    cs.CV

    Tracking Persons-of-Interest via Unsupervised Representation Adaptation

    Authors: Shun Zhang, Jia-Bin Huang, Jongwoo Lim, Yihong Gong, Jinjun Wang, Narendra Ahuja, Ming-Hsuan Yang

    Abstract: Multi-face tracking in unconstrained videos is a challenging problem as faces of one person often appear drastically different in multiple shots due to significant variations in scale, pose, expression, illumination, and make-up. Existing multi-target tracking methods often use low-level features which are not sufficiently discriminative for identifying faces with such large appearance variations.…

    Submitted 5 October, 2017; originally announced October 2017.

    Comments: Project page: http://vllab1.ucmerced.edu/~szhang/FaceTracking/

  41. arXiv:1710.01992  [pdf, other]

    cs.CV

    Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks

    Authors: Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, Ming-Hsuan Yang

    Abstract: Convolutional neural networks have recently demonstrated high-quality reconstruction for single image super-resolution. However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution Network for fast and accurate…

    Submitted 9 August, 2018; v1 submitted 4 October, 2017; originally announced October 2017.

    Comments: The code and datasets are available at http://vllab.ucmerced.edu/wlai24/LapSRN/

  42. arXiv:1704.03915  [pdf, other]

    cs.CV

    Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution

    Authors: Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, Ming-Hsuan Yang

    Abstract: Convolutional neural networks have recently demonstrated high-quality reconstruction for single-image super-resolution. In this paper, we propose the Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively reconstruct the sub-band residuals of high-resolution images. At each pyramid level, our model takes coarse-resolution feature maps as input, predicts the high-frequency residuals,…

    Submitted 9 October, 2017; v1 submitted 12 April, 2017; originally announced April 2017.

    Comments: This work is accepted in CVPR 2017. The code and datasets are available on http://vllab.ucmerced.edu/wlai24/LapSRN/

  43. arXiv:1605.06325  [pdf, other]

    cs.CV

    Superpixel Hierarchy

    Authors: Xing Wei, Qingxiong Yang, Yihong Gong, Ming-Hsuan Yang, Narendra Ahuja

    Abstract: Superpixel segmentation is becoming ubiquitous in computer vision. In practice, an object can either be represented by a number of segments in finer levels of detail or included in a surrounding region at coarser levels of detail, and thus a superpixel segmentation hierarchy is useful for applications that require different levels of image segmentation detail depending on the particular image obje…

    Submitted 20 May, 2016; originally announced May 2016.

  44. arXiv:1504.07846  [pdf, other]

    math.OC cs.DS cs.NE

    Incorporating Road Networks into Territory Design

    Authors: Nitin Ahuja, Matthias Bender, Peter Sanders, Christian Schulz, Andreas Wagner

    Abstract: Given a set of basic areas, the territory design problem asks to create a predefined number of territories, each containing at least one basic area, such that an objective function is optimized. Desired properties of territories often include a reasonable balance, compact form, contiguity and small average journey times which are usually encoded in the objective function or formulated as constrain…

    Submitted 5 May, 2015; v1 submitted 29 April, 2015; originally announced April 2015.

  45. arXiv:1409.1412  [pdf]

    cs.NI

    MEEP: Multihop Energy Efficient Protocol For Heterogeneous Wireless Sensor Network

    Authors: Surender Kumar, Manish Prateek, N. J. Ahuja, Bharat Bhushan

    Abstract: Energy conservation of sensor nodes for increasing the network life is the most crucial design goal while developing efficient routing protocol for wireless sensor networks. Recent technological advances help in the development of wide variety of sensor nodes. Heterogeneity takes the advantage of different types of sensor nodes and improves the energy efficiency and network life. Generally sensors…

    Submitted 2 September, 2014; originally announced September 2014.

    Comments: 9 Pages, 10 Figures, http://www.ijcst.org/Volume5/Issue3/2014. arXiv admin note: substantial text overlap with arXiv:1408.3110

  46. MEECDA: Multihop Energy Efficient Clustering and Data Aggregation Protocol for HWSN

    Authors: Surender Kumar, Manish Prateek, N. J. Ahuja, Bharat Bhushan

    Abstract: Wireless sensor network consists of large number of inexpensive tiny sensors which are connected with low power wireless communications. Most of the routing and data dissemination protocols of WSN assume a homogeneous network architecture, in which all sensors have the same capabilities in terms of battery power, communication, sensing, storage, and processing. However the continued advances in mi…

    Submitted 13 August, 2014; originally announced August 2014.

    Comments: 8 pages, 11 figures. available at http://ijcaonline.org/2014. arXiv admin note: substantial text overlap with arXiv:1408.2914

  47. DE-LEACH: Distance and Energy Aware LEACH

    Authors: Surender Kumar, Manish Prateek, N. J. Ahuja, Bharat Bhushan

    Abstract: Wireless sensor network consists of large number of tiny sensor nodes which are usually deployed in a harsh environment. Self configuration and infrastructure less are the two fundamental properties of sensor networks. Sensor nodes are highly energy constrained devices because they are battery operated devices and due to harsh environment deployment it is impossible to change or recharge their bat…

    Submitted 13 August, 2014; originally announced August 2014.

    Comments: 7 pages, 5 figures. available online at http://ijcaonline.org/2014

  48. arXiv:1109.2389  [pdf, other]

    cs.CV cs.LG

    A Probabilistic Framework for Discriminative Dictionary Learning

    Authors: Bernard Ghanem, Narendra Ahuja

    Abstract: In this paper, we address the problem of discriminative dictionary learning (DDL), where sparse linear representation and classification are combined in a probabilistic framework. As such, a single discriminative dictionary and linear binary classifiers are learned jointly. By encoding sparse representation and discriminative classification models in a MAP setting, we propose a general optimizatio…

    Submitted 12 September, 2011; originally announced September 2011.

    Comments: 10 pages, 4 figures, conference, dictionary learning, sparse coding

  49. arXiv:1109.2388  [pdf, other]

    cs.LG cs.CV

    MIS-Boost: Multiple Instance Selection Boosting

    Authors: Emre Akbas, Bernard Ghanem, Narendra Ahuja

    Abstract: In this paper, we present a new multiple instance learning (MIL) method, called MIS-Boost, which learns discriminative instance prototypes by explicit instance selection in a boosting framework. Unlike previous instance selection based MIL methods, we do not restrict the prototypes to a discrete set of training instances but allow them to take arbitrary values in the instance feature space. We als…

    Submitted 12 September, 2011; originally announced September 2011.

  50. arXiv:1102.1292  [pdf, other]

    cs.CV

    Modeling Dynamic Swarms

    Authors: Bernard Ghanem, Narendra Ahuja

    Abstract: This paper proposes the problem of modeling video sequences of dynamic swarms (DS). We define DS as a large layout of stochastically repetitive spatial configurations of dynamic objects (swarm elements) whose motions exhibit local spatiotemporal interdependency and stationarity, i.e., the motions are similar in any small spatiotemporal neighborhood. Examples of DS abound in nature, e.g., herds of…

    Submitted 7 February, 2011; originally announced February 2011.

    Comments: 11 pages, 17 figures, conference paper, computer vision