Showing 1–48 of 48 results for author: Ravi, N

Searching in archive cs.
  1. arXiv:2406.17005  [pdf, other]

    cs.CV

    PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

    Authors: Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, Yaowei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo, et al. (12 additional authors not shown)

    Abstract: Pixel-level Video Understanding in the Wild Challenge (PVUW) focuses on complex video understanding. In this CVPR 2024 workshop, we add two new tracks, Complex Video Object Segmentation Track based on MOSE dataset and Motion Expression guided Video Segmentation track based on MeViS dataset. In the two new tracks, we provide additional videos and annotations that feature challenging elements, such as…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: MOSE Challenge: https://henghuiding.github.io/MOSE/ChallengeCVPR2024, MeViS Challenge: https://henghuiding.github.io/MeViS/ChallengeCVPR2024

  2. arXiv:2404.08668  [pdf, other]

    cs.IR cs.AI

    A Comprehensive Survey on AI-based Methods for Patents

    Authors: Homaira Huda Shomee, Zhu Wang, Sathya N. Ravi, Sourav Medya

    Abstract: Recent advancements in Artificial Intelligence (AI) and machine learning have demonstrated transformative capabilities across diverse domains. This progress extends to the field of patent analysis and innovation, where AI-based tools present opportunities to streamline and enhance important tasks in the patent cycle such as classification, retrieval, and valuation prediction. This not only acceler…

    Submitted 18 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  3. arXiv:2403.02324  [pdf, other]

    eess.SP cs.CR

    Differentially Private Communication of Measurement Anomalies in the Smart Grid

    Authors: Nikhil Ravi, Anna Scaglione, Sean Peisert, Parth Pradhan

    Abstract: In this paper, we present a framework based on differential privacy (DP) for querying electric power measurements to detect system anomalies or bad data. Our DP approach conceals consumption and system matrix data, while simultaneously enabling an untrusted third party to test hypotheses of anomalies, such as the presence of bad data, by releasing a randomized sufficient statistic for hypothesis-t…

    Submitted 22 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 13 pages, 5 figures

  4. arXiv:2401.03251  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    TeLeS: Temporal Lexeme Similarity Score to Estimate Confidence in End-to-End ASR

    Authors: Nagarathna Ravi, Thishyan Raj T, Vipul Arora

    Abstract: Confidence estimation of predictions from an End-to-End (E2E) Automatic Speech Recognition (ASR) model benefits ASR's downstream and upstream tasks. Class-probability-based confidence scores do not accurately represent the quality of overconfident ASR predictions. An ancillary Confidence Estimation Model (CEM) calibrates the predictions. State-of-the-art (SOTA) solutions use binary target scores f…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing

  5. arXiv:2310.04515  [pdf, other]

    cs.LG cs.AI

    Utilizing Free Clients in Federated Learning for Focused Model Enhancement

    Authors: Aditya Narayan Ravi, Ilan Shomorony

    Abstract: Federated Learning (FL) is a distributed machine learning approach to learn models on decentralized heterogeneous data, without the need for clients to share their data. Many existing FL approaches assume that all clients have equal importance and construct a global objective based on all clients. We consider a version of FL we call Prioritized FL, where the goal is to learn a weighted mean object…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: 26 pages, 6 figures

  6. arXiv:2310.03890  [pdf, other]

    cs.LG cs.AI cs.CV

    Accelerated Neural Network Training with Rooted Logistic Objectives

    Authors: Zhu Wang, Praveen Raj Veluswami, Harsh Mishra, Sathya N. Ravi

    Abstract: Many neural networks deployed in real-world scenarios are trained using cross entropy based loss functions. From the optimization perspective, it is known that the behavior of first order methods such as gradient descent crucially depends on the separability of datasets. In fact, even in the simplest case of binary classification, the rate of convergence depends on two factors: (1) conditi…

    Submitted 5 October, 2023; originally announced October 2023.

  7. arXiv:2309.00035  [pdf, other]

    cs.CV cs.AI

    FACET: Fairness in Computer Vision Evaluation Benchmark

    Authors: Laura Gustafson, Chloe Rolland, Nikhila Ravi, Quentin Duval, Aaron Adcock, Cheng-Yang Fu, Melissa Hall, Candace Ross

    Abstract: Computer vision models have known performance disparities across attributes such as gender and skin tone. This means during tasks such as classification and detection, model performance differs for certain classes based on the demographics of the people in the image. These disparities have been shown to exist, but until now there has not been a unified approach to measure these differences for com…

    Submitted 31 August, 2023; originally announced September 2023.

  8. arXiv:2306.05578  [pdf, other]

    eess.SP cs.CR

    Differential Privacy for Class-based Data: A Practical Gaussian Mechanism

    Authors: Raksha Ramakrishna, Anna Scaglione, Tong Wu, Nikhil Ravi, Sean Peisert

    Abstract: In this paper, we present a notion of differential privacy (DP) for data that comes from different classes. Here, the class-membership is private information that needs to be protected. The proposed method is an output perturbation mechanism that adds noise to the release of query response such that the analyst is unable to infer the underlying class-label. The proposed DP method is capable of not…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Under review in IEEE Transactions on Information Forensics & Security

  9. arXiv:2304.02643  [pdf, other]

    cs.CV cs.AI cs.LG

    Segment Anything

    Authors: Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick

    Abstract: We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billion masks on 11M licensed and privacy respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: Project web-page: https://segment-anything.com

  10. arXiv:2302.05865  [pdf, other]

    cs.LG cs.DC

    Flag Aggregator: Scalable Distributed Training under Failures and Augmented Losses using Convex Optimization

    Authors: Hamidreza Almasi, Harsh Mishra, Balajee Vamanan, Sathya N. Ravi

    Abstract: Modern ML applications increasingly rely on complex deep learning models and large datasets. There has been an exponential growth in the amount of computation needed to train the largest models. Therefore, to scale computation and data, these models are inevitably trained in a distributed manner in clusters of nodes, and their updates are aggregated before being applied to the model. However, a di…

    Submitted 24 September, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

  11. arXiv:2302.05608  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis

    Authors: Zhu Wang, Sourav Medya, Sathya N. Ravi

    Abstract: Often, deep network models are purely inductive during training and while performing inference on unseen data. Thus, when such models are used for predictions, it is well known that they often fail to capture the semantic information and implicit dependencies that exist among objects (or concepts) on a population level. Moreover, it is still unclear how domain or prior modal knowledge can be speci…

    Submitted 11 February, 2023; originally announced February 2023.

  12. arXiv:2302.02336  [pdf, other]

    cs.LG cs.CV

    Using Intermediate Forward Iterates for Intermediate Generator Optimization

    Authors: Harsh Mishra, Jurijs Nazarovs, Manmohan Dogra, Sathya N. Ravi

    Abstract: Score-based models have recently been introduced as a richer framework to model distributions in high dimensions and are generally more suitable for generative tasks. In score-based models, a generative task is formulated using a parametric model (such as a neural network) to directly learn the gradient of such high dimensional distributions, instead of the density functions themselves, as is done…

    Submitted 5 February, 2023; originally announced February 2023.

  13. arXiv:2211.01338  [pdf, other]

    eess.AS cs.CL cs.MM cs.SD eess.IV

    Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages

    Authors: Anusha Prakash, Arun Kumar, Ashish Seth, Bhagyashree Mukherjee, Ishika Gupta, Jom Kuriakose, Jordan Fernandes, K V Vikram, Mano Ranjith Kumar M, Metilda Sagaya Mary, Mohammad Wajahat, Mohana N, Mudit Batra, Navina K, Nihal John George, Nithya Ravi, Pruthwik Mishra, Sudhanshu Srivastava, Vasista Sai Lodagala, Vandan Mujadia, Kada Sai Venkata Vineeth, Vrunda Sukhadia, Dipti Sharma, Hema Murthy, Pushpak Bhattacharya, et al. (2 additional authors not shown)

    Abstract: Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, text-to-speech synthesis followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages…

    Submitted 1 November, 2022; originally announced November 2022.

  14. arXiv:2207.10660  [pdf, other]

    cs.CV

    Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild

    Authors: Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari

    Abstract: Recognizing scenes and objects in 3D from a single image is a longstanding goal of computer vision with applications in robotics and AR/VR. For 2D recognition, large datasets and scalable solutions have led to unprecedented advances. In 3D, existing benchmarks are small in size and approaches specialize in few object categories and specific domains, e.g. urban driving scenes. Motivated by the succ…

    Submitted 23 March, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: CVPR 2023, Project website: https://omni3d.garrickbrazil.com/

  15. arXiv:2207.00611  [pdf, other]

    cs.AI cond-mat.mtrl-sci cs.LG

    FAIR principles for AI models with a practical application for accelerated high energy diffraction microscopy

    Authors: Nikil Ravi, Pranshu Chaturvedi, E. A. Huerta, Zhengchun Liu, Ryan Chard, Aristana Scourtas, K. J. Schmidt, Kyle Chard, Ben Blaiszik, Ian Foster

    Abstract: A concise and measurable set of FAIR (Findable, Accessible, Interoperable and Reusable) principles for scientific data is transforming the state-of-practice for data management and stewardship, supporting and enabling discovery and innovation. Learning from this initiative, and acknowledging the impact of artificial intelligence (AI) in the practice of science and engineering, we introduce a set o…

    Submitted 21 December, 2022; v1 submitted 1 July, 2022; originally announced July 2022.

    Comments: 11 pages, 3 figures; Accepted to Scientific Data; for press release see https://www.anl.gov/article/argonne-scientists-promote-fair-standards-for-managing-artificial-intelligence-models and https://www.ncsa.illinois.edu/ncsa-student-researchers-lead-authors-on-award-winning-paper; Received 2022 HPCwire Readers' Choice Award on Best Use of High Performance Data Analytics & Artificial Intelligence

    MSC Class: 68T01; 68T05 ACM Class: I.2; J.2

    Journal ref: Scientific Data 9, 657 (2022)

  16. arXiv:2206.07028  [pdf, other]

    cs.CV

    Learning 3D Object Shape and Layout without 3D Supervision

    Authors: Georgia Gkioxari, Nikhila Ravi, Justin Johnson

    Abstract: A 3D scene consists of a set of objects, each with a shape and a layout giving their position in space. Understanding 3D scenes from 2D images is an important goal, with applications in robotics and graphics. While there have been recent advances in predicting 3D shape and layout from a single image, most approaches rely on 3D ground truth for training which is expensive to collect at scale. We ov…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: CVPR 2022, project page: https://gkioxari.github.io/usl/

  17. arXiv:2204.07655  [pdf, other]

    cs.CV cs.LG

    Deep Unlearning via Randomized Conditionally Independent Hessians

    Authors: Ronak Mehta, Sourav Pal, Vikas Singh, Sathya N. Ravi

    Abstract: Recent legislation has led to interest in machine unlearning, i.e., removing specific training samples from a predictive model as if they never existed in the training dataset. Unlearning may also be required due to corrupted/adversarial data or simply a user's updated privacy requirement. For models which require no training (k-NN), simply deleting the closest original sample can be effective. Bu…

    Submitted 13 July, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: CVPR 2022. Supplement appended to end of main paper (total 15 pages). Ronak Mehta and Sourav Pal equal contribution

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 10422-10431

  18. arXiv:2203.15234  [pdf, other]

    cs.LG cs.AI cs.CV

    Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets

    Authors: Vishnu Suresh Lokhande, Rudrasis Chakraborty, Sathya N. Ravi, Vikas Singh

    Abstract: Pooling multiple neuroimaging datasets across institutions often enables improvements in statistical power when evaluating associations (e.g., between risk factors and disease outcomes) that may otherwise be too weak to detect. When there is only a single source of variability (e.g., different scanners), domain adaptation and matching the distributions of representations may suffice in many…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted at 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

  19. arXiv:2202.09463  [pdf, other]

    cs.LG cs.AI stat.AP

    Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data

    Authors: Jurijs Nazarovs, Rudrasis Chakraborty, Songwong Tasneeyapant, Sathya N. Ravi, Vikas Singh

    Abstract: Panel data involving longitudinal measurements of the same set of participants taken over multiple time points is common in studies to understand childhood development and disease modeling. Deep hybrid models that marry the predictive power of neural networks with physical simulators such as differential equations, are starting to drive advances in such applications. The task of modeling not just…

    Submitted 18 February, 2022; originally announced February 2022.

    Journal ref: Proceedings of Machine Learning Research; PMLR 161:107-117, 2021

  20. arXiv:2201.08377  [pdf, other]

    cs.CV cs.AI cs.IR cs.LG

    Omnivore: A Single Model for Many Visual Modalities

    Authors: Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra

    Abstract: Prior work has studied different visual modalities in isolation and developed separate architectures for recognition of images, videos, and 3D data. Instead, in this paper, we propose a single model which excels at classifying images, videos, and single-view 3D data using exactly the same model parameters. Our 'Omnivore' model leverages the flexibility of transformer-based architectures and is tra…

    Submitted 30 March, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

    Comments: Accepted at CVPR 2022 (Oral Presentation)

  21. arXiv:2112.01520  [pdf, other]

    cs.CV

    Recognizing Scenes from Novel Viewpoints

    Authors: Shengyi Qian, Alexander Kirillov, Nikhila Ravi, Devendra Singh Chaplot, Justin Johnson, David F. Fouhey, Georgia Gkioxari

    Abstract: Humans can perceive scenes in 3D from a handful of 2D views. For AI agents, the ability to recognize a scene from any viewpoint given only a few images enables them to efficiently interact with the scene and its objects. In this work, we attempt to endow machines with this ability. We propose a model which takes as input a few RGB images of a new scene and recognizes the scene from novel viewpoint…

    Submitted 2 December, 2021; originally announced December 2021.

  22. arXiv:2111.11661  [pdf, other]

    cs.CR cs.DS eess.SY

    Optimum Noise Mechanism for Differentially Private Queries in Discrete Finite Sets

    Authors: Sachin Kadam, Anna Scaglione, Nikhil Ravi, Sean Peisert, Brent Lunghino, Aram Shumavon

    Abstract: The Differential Privacy (DP) literature often centers on meeting privacy constraints by introducing noise to the query, typically using a pre-specified parametric distribution model with one or two degrees of freedom. However, this emphasis tends to neglect the crucial considerations of response accuracy and utility, especially in the context of categorical or discrete numerical database queries,…

    Submitted 8 April, 2024; v1 submitted 23 November, 2021; originally announced November 2021.

    Comments: Accepted for publication in the journal Cybersecurity (https://cybersecurity.springeropen.com/)

  23. arXiv:2111.09887  [pdf, other]

    cs.CV cs.LG

    PyTorchVideo: A Deep Learning Library for Video Understanding

    Authors: Haoqi Fan, Tullie Murrell, Heng Wang, Kalyan Vasudev Alwala, Yanghao Li, Yilei Li, Bo Xiong, Nikhila Ravi, Meng Li, Haichuan Yang, Jitendra Malik, Ross Girshick, Matt Feiszli, Aaron Adcock, Wan-Yen Lo, Christoph Feichtenhofer

    Abstract: We introduce PyTorchVideo, an open-source deep-learning library that provides a rich set of modular, efficient, and reproducible components for a variety of video understanding tasks, including classification, detection, self-supervised learning, and low-level processing. The library covers a full stack of video understanding tools including multimodal data loading, transformations, and models tha…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: Technical report

  24. arXiv:2111.09714  [pdf, other]

    cs.LG cs.CL

    You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling

    Authors: Zhanpeng Zeng, Yunyang Xiong, Sathya N. Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh

    Abstract: Transformer-based models are widely used in natural language processing (NLP). Central to the transformer model is the self-attention mechanism, which captures the interactions of token pairs in the input sequences and depends quadratically on the sequence length. Training such models on longer sequences is expensive. In this paper, we show that a Bernoulli sampling attention mechanism based on Lo…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: Proceedings of the 38th ICML (2021)

  25. arXiv:2111.07850  [pdf, other]

    cs.CR eess.SP

    Colored Noise Mechanism for Differentially Private Clustering

    Authors: Nikhil Ravi, Anna Scaglione, Sean Peisert

    Abstract: The goal of this paper is to propose and analyze a differentially private randomized mechanism for the $K$-means query. The goal is to ensure that the information received about the cluster-centroids is differentially private. The method consists in adding Gaussian noise with an optimum covariance. The main result of the paper is the analytical solution for the optimum covariance as a function of…

    Submitted 15 November, 2021; originally announced November 2021.

    Comments: 5 pages, 3 figures, preprint

  26. arXiv:2110.02868  [pdf, other]

    cs.IT stat.AP

    Coded Shotgun Sequencing

    Authors: Aditya Narayan Ravi, Alireza Vahid, Ilan Shomorony

    Abstract: Most DNA sequencing technologies are based on the shotgun paradigm: many short reads are obtained from random unknown locations in the DNA sequence. A fundamental question, studied in arXiv:1203.6233, is what read length and coverage depth (i.e., the total number of reads) are needed to guarantee reliable sequence reconstruction. Motivated by DNA-based storage, we study the coded version of this p…

    Submitted 7 February, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: 35 pages, 4 figures, 8 appendices

  27. arXiv:2109.10472  [pdf, other]

    physics.med-ph cs.CV eess.IV physics.bio-ph

    Rotor Localization and Phase Mapping of Cardiac Excitation Waves using Deep Neural Networks

    Authors: Jan Lebert, Namita Ravi, Flavio Fenton, Jan Christoph

    Abstract: The analysis of electrical impulse phenomena in cardiac muscle tissue is important for the diagnosis of heart rhythm disorders and other cardiac pathophysiology. Cardiac mapping techniques acquire local temporal measurements and combine them to visualize the spread of electrophysiological wave phenomena across the heart surface. However, low spatial resolution, sparse measurement locations, noise…

    Submitted 8 November, 2021; v1 submitted 21 September, 2021; originally announced September 2021.

    Journal ref: Front. Physiol. 12 (2021) 782176

  28. arXiv:2108.08891  [pdf, other]

    cs.CV cs.LG

    Neural TMDlayer: Modeling Instantaneous flow of features via SDE Generators

    Authors: Zihang Meng, Vikas Singh, Sathya N. Ravi

    Abstract: We study how stochastic differential equation (SDE) based ideas can inspire new modifications to existing algorithms for a set of problems in computer vision. Loosely speaking, our formulation is related to both explicit and implicit strategies for data augmentation and group equivariance, but is derived from new results in the SDE literature on estimating infinitesimal generators of a class of st…

    Submitted 19 August, 2021; originally announced August 2021.

  29. arXiv:2102.08343  [pdf, other]

    cs.LG cs.AI cs.CV

    Learning Invariant Representations using Inverse Contrastive Loss

    Authors: Aditya Kumar Akash, Vishnu Suresh Lokhande, Sathya N. Ravi, Vikas Singh

    Abstract: Learning invariant representations is a critical first step in a number of machine learning tasks. A common approach corresponds to the so-called information bottleneck principle in which an application dependent function of mutual information is carefully chosen and optimized. Unfortunately, in practice, these functions are not suitable for optimization purposes since these losses are agnostic of…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: Accepted to AAAI-21

  30. arXiv:2012.09854  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG stat.ML

    Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image

    Authors: Ronghang Hu, Nikhila Ravi, Alexander C. Berg, Deepak Pathak

    Abstract: We present Worldsheet, a method for novel view synthesis using just a single RGB image as input. The main insight is that simply shrink-wrapping a planar mesh sheet onto the input image, consistent with the learned intermediate depth, captures underlying geometry sufficient to generate photorealistic unseen views with large viewpoint changes. To operationalize this, we propose a novel differentiab…

    Submitted 18 August, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: ICCV 2021; 17 pages

  31. arXiv:2009.08765  [pdf, ps, other]

    cs.IT

    On the Capacity Enlargement of Gaussian Broadcast Channels with Passive Noisy Feedback

    Authors: Aditya Narayan Ravi, Sibi Raj B. Pillai, Vinod Prabhakaran, Michèle Wigger

    Abstract: It is well known that the capacity region of an average transmit power constrained Gaussian Broadcast Channel (GBC) with independent noise realizations at the receivers is enlarged by the presence of causal noiseless feedback. Capacity region enlargement is also known to be possible by using only passive noisy feedback, when the GBC has identical noise variances at the receivers. The last fact rem…

    Submitted 18 September, 2020; originally announced September 2020.

    Comments: 23 single column pages, 4 Figures

  32. arXiv:2007.08501  [pdf, other]

    cs.CV cs.GR cs.LG

    Accelerating 3D Deep Learning with PyTorch3D

    Authors: Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, Georgia Gkioxari

    Abstract: Deep learning has significantly improved 2D image recognition. Extending into 3D may advance many new applications including autonomous vehicles, virtual and augmented reality, authoring 3D content, and even improving 2D recognition. However despite growing interest, 3D deep learning remains relatively underexplored. We believe that some of this disparity is due to the engineering challenges invol…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: tech report

  33. arXiv:2004.14539  [pdf, other]

    cs.LG cs.CV stat.ML

    Physarum Powered Differentiable Linear Programming Layers and Applications

    Authors: Zihang Meng, Sathya N. Ravi, Vikas Singh

    Abstract: Consider a learning algorithm, which involves an internal call to an optimization routine such as a generalized eigenvalue problem, a cone programming problem or even sorting. Integrating such a method as a layer(s) within a trainable deep neural network (DNN) in an efficient and numerically stable way is not straightforward -- for instance, only recently, strategies have emerged for eigendecompos…

    Submitted 10 May, 2021; v1 submitted 29 April, 2020; originally announced April 2020.

  34. arXiv:2004.01355  [pdf, other]

    cs.CV cs.LG

    FairALM: Augmented Lagrangian Method for Training Fair Models with Little Regret

    Authors: Vishnu Suresh Lokhande, Aditya Kumar Akash, Sathya N. Ravi, Vikas Singh

    Abstract: Algorithmic decision making based on computer vision and machine learning technologies continues to permeate our lives. But issues related to biases of these models and the extent to which they treat certain segments of the population unfairly, have led to concern in the general public. It is now accepted that because of biases in the datasets we present to the models, a fairness-oblivious training…

    Submitted 23 June, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

  35. arXiv:2003.03808  [pdf, other]

    cs.CV cs.LG eess.IV

    PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models

    Authors: Sachit Menon, Alexandru Damian, Shijia Hu, Nikhil Ravi, Cynthia Rudin

    Abstract: The primary aim of single-image super-resolution is to construct high-resolution (HR) images from corresponding low-resolution (LR) inputs. In previous approaches, which have generally been supervised, the training objective typically measures a pixel-wise average distance between the super-resolved (SR) and HR images. Optimizing such metrics often leads to blurring, especially in high variance (d…

    Submitted 20 July, 2020; v1 submitted 8 March, 2020; originally announced March 2020.

    Comments: Sachit Menon and Alexandru Damian contributed equally. Computer Vision and Pattern Recognition (CVPR) 2020

  36. arXiv:1911.06239  [pdf, other]

    stat.ML cs.LG stat.AP

    Unreliable Multi-Armed Bandits: A Novel Approach to Recommendation Systems

    Authors: Aditya Narayan Ravi, Pranav Poduval, Dr. Sharayu Moharir

    Abstract: We use a novel modification of Multi-Armed Bandits to create a new model for recommendation systems. We model the recommendation system as a bandit seeking to maximize reward by pulling on arms with unknown rewards. The catch however is that this bandit can only access these arms through an unreliable intermediate that has some level of autonomy while choosing its arms. For example, in a streaming…

    Submitted 14 November, 2019; originally announced November 2019.

    Comments: 4 pages, 4 figures, Aditya Narayan Ravi and Pranav Poduval have equal contribution

  37. arXiv:1909.12398  [pdf, other]

    cs.CV cs.LG

    Optimizing Nondecomposable Data Dependent Regularizers via Lagrangian Reparameterization offers Significant Performance and Efficiency Gains

    Authors: Sathya N. Ravi, Abhay Venkatesh, Glenn Moo Fung, Vikas Singh

    Abstract: Data dependent regularization is known to benefit a wide variety of problems in machine learning. Often, these regularizers cannot be easily decomposed into a sum over a finite number of terms, e.g., a sum over individual example-wise terms. The $F_β$ measure, Area under the ROC curve (AUCROC) and Precision at a fixed recall (P@R) are some prominent examples that are used in many applications. We…

    Submitted 26 September, 2019; originally announced September 2019.

  38. arXiv:1909.05479  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Generating Accurate Pseudo-labels in Semi-Supervised Learning and Avoiding Overconfident Predictions via Hermite Polynomial Activations

    Authors: Vishnu Suresh Lokhande, Songwong Tasneeyapant, Abhay Venkatesh, Sathya N. Ravi, Vikas Singh

    Abstract: Rectified Linear Units (ReLUs) are among the most widely used activation functions in a broad variety of tasks in vision. Recent theoretical results suggest that despite their excellent practical performance, in various cases, a substitution with basis expansions (e.g., polynomials) can yield significant benefits from both the optimization and generalization perspective. Unfortunately, the existing…

    Submitted 31 March, 2020; v1 submitted 12 September, 2019; originally announced September 2019.

    Comments: Accepted at 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

  39. arXiv:1909.02533  [pdf, other]

    cs.CV cs.GR

    C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion

    Authors: David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedaldi

    Abstract: We propose C3DPO, a method for extracting 3D models of deformable objects from 2D keypoint annotations in unconstrained images. We do so by learning a deep network that reconstructs a 3D object from a single view at a time, accounting for partial occlusions, and explicitly factoring the effects of viewpoint changes and object deformations. In order to achieve this factorization, we introduce a nov…

    Submitted 15 October, 2019; v1 submitted 5 September, 2019; originally announced September 2019.

    Comments: Added a link to the source code into the abstract

    Journal ref: IEEE/CVF International Conference on Computer Vision 2019

  40. arXiv:1805.03383  [pdf, other]

    cs.CV

    New Techniques for Preserving Global Structure and Denoising with Low Information Loss in Single-Image Super-Resolution

    Authors: Yijie Bei, Alex Damian, Shijia Hu, Sachit Menon, Nikhil Ravi, Cynthia Rudin

    Abstract: This work identifies and addresses two important technical challenges in single-image super-resolution: (1) how to upsample an image without magnifying noise and (2) how to preserve large scale structure when upsampling. We summarize the techniques we developed for our second place entry in Track 1 (Bicubic Downsampling), seventh place entry in Track 2 (Realistic Adverse Conditions), and seventh p…

    Submitted 15 June, 2018; v1 submitted 9 May, 2018; originally announced May 2018.

    Comments: 8 pages, CVPR workshop 2018

  41. arXiv:1803.08137  [pdf, other]

    cs.CV cs.AI math.NA stat.ML

    Robust Blind Deconvolution via Mirror Descent

    Authors: Sathya N. Ravi, Ronak Mehta, Vikas Singh

    Abstract: We revisit the Blind Deconvolution problem with a focus on understanding its robustness and convergence properties. Provable robustness to noise and other perturbations is receiving recent interest in vision, from obtaining immunity to adversarial attacks to assessing and describing failure modes of algorithms in mission critical applications. Further, many blind deconvolution methods based on dee…

    Submitted 21 March, 2018; originally announced March 2018.

  42. arXiv:1803.06453  [pdf, other]

    cs.LG cs.CV stat.ML

    Constrained Deep Learning using Conditional Gradient and Applications in Computer Vision

    Authors: Sathya N. Ravi, Tuan Dinh, Vishnu Lokhande, Vikas Singh

    Abstract: A number of results have recently demonstrated the benefits of incorporating various constraints when training deep architectures in vision and machine learning. The advantages range from guarantees for statistical generalization to better accuracy to compression. But support for general constraints within widely used libraries remains scarce and their broader deployment within many applications t…

    Submitted 16 March, 2018; originally announced March 2018.

  43. arXiv:1702.08670  [pdf, other]

    cs.LG math.OC stat.ML

    On architectural choices in deep learning: From network structure to gradient convergence and parameter estimation

    Authors: Vamsi K Ithapu, Sathya N Ravi, Vikas Singh

    Abstract: We study mechanisms to characterize how the asymptotic convergence of backpropagation in deep architectures, in general, is related to the network structure, and how it may be influenced by other design choices including activation type, denoising and dropout rate. We seek to analyze whether network architecture and input data statistics may guide the choices of learning parameters and vice versa.…

    Submitted 28 February, 2017; originally announced February 2017.

    Comments: 87 Pages; 14 figures; Under review

  44. arXiv:1511.05297  [pdf, other]

    cs.LG stat.ML

    On the interplay of network structure and gradient convergence in deep learning

    Authors: Vamsi K Ithapu, Sathya N Ravi, Vikas Singh

    Abstract: The regularization and output consistency behavior of dropout and layer-wise pretraining for learning deep networks have been fairly well studied. However, our understanding of how the asymptotic convergence of backpropagation in deep architectures is related to the structural properties of the network and other design choices (like denoising and dropout rate) is less clear at this time. An intere…

    Submitted 22 February, 2017; v1 submitted 17 November, 2015; originally announced November 2015.

    Comments: 54th Allerton Conference on Communication, Control and Computing 2016; pgs 488-495

  45. arXiv:1506.03412   

    cs.LG cs.CV cs.NE math.OC stat.ML

    Convergence rates for pretraining and dropout: Guiding learning parameters using network structure

    Authors: Vamsi K. Ithapu, Sathya Ravi, Vikas Singh

    Abstract: Unsupervised pretraining and dropout have been well studied, especially with respect to regularization and output consistency. However, our understanding about the explicit convergence rates of the parameter estimates, and their dependence on the learning (like denoising and dropout rate) and structural (like depth and layer lengths) aspects of the network is less mature. An interesting question i…

    Submitted 22 February, 2017; v1 submitted 10 June, 2015; originally announced June 2015.

    Comments: This manuscript is now superseded by arXiv:1511.05297 and the corresponding accepted paper in 54th Allerton Conference on Communication, Control and Computing (2017)

  46. A survey on data and transaction management in mobile databases

    Authors: D. Roselin Selvarani, T. N. Ravi

    Abstract: The popularity of the Mobile Database is increasing day by day as people need information even on the move in the fast changing world. This database technology permits employees using mobile devices to connect to their corporate networks, hoard the needed data, work in the disconnected mode and reconnect to the network to synchronize with the corporate database. In this scenario, the data is being…

    Submitted 23 November, 2012; originally announced November 2012.

    Comments: 20 Pages; International Journal of Database Management Systems (IJDMS) Vol.4, No.5, October 2012. arXiv admin note: text overlap with arXiv:0908.0076, arXiv:1005.1747, arXiv:1108.6195 by other authors

  47. arXiv:1206.6855  [pdf]

    cs.GT

    An Efficient Optimal-Equilibrium Algorithm for Two-player Game Trees

    Authors: Michael L. Littman, Nishkam Ravi, Arjun Talwar, Martin Zinkevich

    Abstract: Two-player complete-information game trees are perhaps the simplest possible setting for studying general-sum games and the computational problem of finding equilibria. These games admit a simple bottom-up algorithm for finding subgame perfect Nash equilibria efficiently. However, such an algorithm can fail to identify optimal equilibria, such as those that maximize social welfare. The reason is t…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

    Report number: UAI-P-2006-PG-298-305

  48. arXiv:1111.7258  [pdf]

    cs.AR

    A New Design for Array Multiplier with Trade off in Power and Area

    Authors: Nirlakalla Ravi, A. Satish, T. Jayachandra Prasad, T. Subba Rao

    Abstract: In this paper, a low-power, low-area array multiplier with a carry save adder is proposed. Unlike the conventional parallel array multiplier, the proposed adder eliminates the final addition stage of the multiplier. Both the conventional and the proposed multipliers are synthesized with a 16-T full adder. Among Transmission Gate, Transmission Function Adder, 14-T, 16-T full adder shows energy efficiency. In…

    Submitted 30 November, 2011; originally announced November 2011.

    Comments: 5 pages, 6 figures; IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 2, May 2011