Showing 1–33 of 33 results for author: Naik, N

Searching in archive cs.
  1. arXiv:2402.15002  [pdf, other]

    cs.CL cs.CV

    CommVQA: Situating Visual Question Answering in Communicative Contexts

    Authors: Nandita Shankar Naik, Christopher Potts, Elisa Kreiss

    Abstract: Current visual question answering (VQA) models tend to be trained and evaluated on image-question pairs in isolation. However, the questions people ask are dependent on their informational needs and prior knowledge about the image content. To evaluate how situating images within naturalistic contexts shapes visual questions, we introduce CommVQA, a VQA dataset consisting of images, image descripti…

    Submitted 22 February, 2024; originally announced February 2024.

  2. arXiv:2401.13974  [pdf, other]

    cs.CV cs.AI cs.GR

    BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models

    Authors: Senthil Purushwalkam, Akash Gokul, Shafiq Joty, Nikhil Naik

    Abstract: Recent text-to-image generation models have demonstrated incredible success in generating images that faithfully follow input prompts. However, the requirement of using words to describe a desired concept provides limited control over the appearance of the generated concepts. In this work, we address this shortcoming by proposing an approach to enable personalization capabilities in existing text-…

    Submitted 25 January, 2024; originally announced January 2024.

  3. arXiv:2311.12908  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Diffusion Model Alignment Using Direct Preference Optimization

    Authors: Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik

    Abstract: Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences. In contrast to LLMs, human preference learning has not been widely explored in text-to-image diffusion models; the best existing approach is to fine-tune a pretrained model using carefully curated high quality im…

    Submitted 21 November, 2023; originally announced November 2023.

  4. arXiv:2311.07191  [pdf, other]

    cs.AI cs.LG stat.AP

    Applying Large Language Models for Causal Structure Learning in Non Small Cell Lung Cancer

    Authors: Narmada Naik, Ayush Khandelwal, Mohit Joshi, Madhusudan Atre, Hollis Wright, Kavya Kannan, Scott Hill, Giridhar Mamidipudi, Ganapati Srinivasa, Carlo Bifulco, Brian Piening, Kevin Matlock

    Abstract: Causal discovery is becoming a key part in medical AI research. These methods can enhance healthcare by identifying causal links between biomarkers, demographics, treatments and outcomes. They can aid medical professionals in choosing more impactful treatments and strategies. In parallel, Large Language Models (LLMs) have shown great potential in identifying patterns and generating insights from t…

    Submitted 13 November, 2023; originally announced November 2023.

  5. arXiv:2311.05230  [pdf, other]

    cs.CV

    ConRad: Image Constrained Radiance Fields for 3D Generation from a Single Image

    Authors: Senthil Purushwalkam, Nikhil Naik

    Abstract: We present a novel method for reconstructing 3D objects from a single RGB image. Our method leverages the latest image generation models to infer the hidden 3D structure while remaining faithful to the input image. While existing methods obtain impressive results in generating 3D models from text prompts, they do not provide an easy approach for conditioning on input RGB data. Naïve extensions of…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS 2023)

  6. arXiv:2308.07309  [pdf, other]

    cs.CR cs.DC cs.ET

    Reinforcing Security and Usability of Crypto-Wallet with Post-Quantum Cryptography and Zero-Knowledge Proof

    Authors: Yathin Kethepalli, Rony Joseph, Sai Raja Vajrala, Jashwanth Vemula, Nenavath Srinivas Naik

    Abstract: Crypto-wallets or digital asset wallets are a crucial aspect of managing cryptocurrencies and other digital assets such as NFTs. However, these wallets are not immune to security threats, particularly from the growing risk of quantum computing. The use of traditional public-key cryptography systems in digital asset wallets makes them vulnerable to attacks from quantum computers, which may increase…

    Submitted 29 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

  7. arXiv:2307.15745  [pdf, other]

    cs.CL cs.CV

    Context-VQA: Towards Context-Aware and Purposeful Visual Question Answering

    Authors: Nandita Naik, Christopher Potts, Elisa Kreiss

    Abstract: Visual question answering (VQA) has the potential to make the Internet more accessible in an interactive way, allowing people who cannot see images to ask questions about them. However, multiple studies have shown that people who are blind or have low-vision prefer image explanations that incorporate the context in which an image appears, yet current VQA datasets focus on images in isolation. We a…

    Submitted 30 August, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Proceedings of ICCV 2023 Workshop on Closing the Loop Between Vision and Language

  8. arXiv:2303.13703  [pdf, other]

    cs.CV cs.AI cs.LG

    End-to-End Diffusion Latent Optimization Improves Classifier Guidance

    Authors: Bram Wallace, Akash Gokul, Stefano Ermon, Nikhil Naik

    Abstract: Classifier guidance -- using the gradients of an image classifier to steer the generations of a diffusion model -- has the potential to dramatically expand the creative control over image generation and editing. However, currently classifier guidance requires either training new noise-aware models to obtain accurate gradients or using a one-step denoising approximation of the final generation, whi…

    Submitted 31 May, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

  9. arXiv:2211.12446  [pdf, other]

    cs.CV cs.AI cs.LG

    EDICT: Exact Diffusion Inversion via Coupled Transformations

    Authors: Bram Wallace, Akash Gokul, Nikhil Naik

    Abstract: Finding an initial noise vector that produces an input image when fed into the diffusion process (known as inversion) is an important problem in denoising diffusion models (DDMs), with applications for real image editing. The state-of-the-art approach for real image editing with inversion uses denoising diffusion implicit models (DDIMs) to deterministically noise the image to the intermediate stat…

    Submitted 22 December, 2022; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: 24 pages, 22 figures. Code now available

  10. arXiv:2206.13517  [pdf, other]

    cs.LG q-bio.QM

    ProGen2: Exploring the Boundaries of Protein Language Models

    Authors: Erik Nijkamp, Jeffrey Ruffolo, Eli N. Weinstein, Nikhil Naik, Ali Madani

    Abstract: Attention-based models trained on protein sequences have demonstrated incredible success at classification and generation tasks relevant for artificial intelligence-driven protein design. However, we lack a sufficient understanding of how very large-scale models and data play a role in effective protein model development. We introduce a suite of protein language models, named ProGen2, that are sca…

    Submitted 27 June, 2022; originally announced June 2022.

  11. arXiv:2204.11122  [pdf, other]

    cs.CV cs.LG

    Can domain adaptation make object recognition work for everyone?

    Authors: Viraj Prabhu, Ramprasaath R. Selvaraju, Judy Hoffman, Nikhil Naik

    Abstract: Despite the rapid progress in deep visual recognition, modern computer vision datasets significantly overrepresent the developed world and models trained on such datasets underperform on images from unseen geographies. We investigate the effectiveness of unsupervised domain adaptation (UDA) of such models across geographies at closing this performance gap. To do so, we first curate two shifts from…

    Submitted 23 April, 2022; originally announced April 2022.

    Comments: Published at the L3D-IVU workshop at CVPR 2022

  12. arXiv:2112.07133  [pdf, other]

    cs.CV

    CLIP-Lite: Information Efficient Visual Representation Learning with Language Supervision

    Authors: Aman Shrivastava, Ramprasaath R. Selvaraju, Nikhil Naik, Vicente Ordonez

    Abstract: We propose CLIP-Lite, an information efficient method for visual representation learning by feature alignment with textual annotations. Compared to the previously proposed CLIP model, CLIP-Lite requires only one negative image-text sample pair for every positive image-text sample during the optimization of its contrastive learning objective. We accomplish this by taking advantage of an information…

    Submitted 11 May, 2023; v1 submitted 13 December, 2021; originally announced December 2021.

  13. arXiv:2112.00804  [pdf, other]

    cs.CV

    PreViTS: Contrastive Pretraining with Video Tracking Supervision

    Authors: Brian Chen, Ramprasaath R. Selvaraju, Shih-Fu Chang, Juan Carlos Niebles, Nikhil Naik

    Abstract: Videos are a rich source for self-supervised learning (SSL) of visual representations due to the presence of natural temporal transformations of objects. However, current methods typically randomly sample video clips for learning, which results in an imperfect supervisory signal. In this work, we propose PreViTS, an SSL framework that utilizes an unsupervised tracking signal for selecting clips co…

    Submitted 27 September, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: To be presented at WACV 2023

  14. arXiv:2110.04282  [pdf, other]

    cs.CV cs.AI

    Field Extraction from Forms with Unlabeled Data

    Authors: Mingfei Gao, Zeyuan Chen, Nikhil Naik, Kazuma Hashimoto, Caiming Xiong, Ran Xu

    Abstract: We propose a novel framework to conduct field extraction from forms with unlabeled data. To bootstrap the training process, we develop a rule-based method for mining noisy pseudo-labels from unlabeled forms. Using the supervisory signal from the pseudo-labels, we extract a discriminative token representation from a transformer-based model by modeling the interaction between text in the form. To pr…

    Submitted 11 April, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: Spa-NLP@ACL2022

  15. arXiv:2107.02968  [pdf, other]

    cs.LG cs.CL q-bio.QM

    Deep Extrapolation for Attribute-Enhanced Generation

    Authors: Alvin Chan, Ali Madani, Ben Krause, Nikhil Naik

    Abstract: Attribute extrapolation in sample generation is challenging for deep neural networks operating beyond the training distribution. We formulate a new task for extrapolation in sequence generation, focusing on natural language and proteins, and propose GENhance, a generative framework that enhances attributes through a learned latent space. Trained on movie reviews and a computed protein stability da…

    Submitted 25 October, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

    Comments: NeurIPS 2021

  16. arXiv:2101.04013  [pdf]

    cs.LG

    Contrastive Learning Improves Critical Event Prediction in COVID-19 Patients

    Authors: Tingyi Wanyan, Hossein Honarvar, Suraj K. Jaladanki, Chengxi Zang, Nidhi Naik, Sulaiman Somani, Jessica K. De Freitas, Ishan Paranjpe, Akhil Vaid, Riccardo Miotto, Girish N. Nadkarni, Marinka Zitnik, Ariful Azad, Fei Wang, Ying Ding, Benjamin S. Glicksberg

    Abstract: Machine Learning (ML) models typically require large-scale, balanced training data to be robust, generalizable, and effective in the context of healthcare. This has been a major issue for developing ML models for the coronavirus-disease 2019 (COVID-19) pandemic where data is highly imbalanced, particularly within electronic health records (EHR) research. Conventional approaches in ML use cross-ent…

    Submitted 11 January, 2021; originally announced January 2021.

  17. arXiv:2012.04630  [pdf, other]

    cs.CV cs.AI cs.LG

    CASTing Your Model: Learning to Localize Improves Self-Supervised Representations

    Authors: Ramprasaath R. Selvaraju, Karan Desai, Justin Johnson, Nikhil Naik

    Abstract: Recent advances in self-supervised learning (SSL) have largely closed the gap with supervised ImageNet pretraining. Despite their success these methods have been primarily applied to unlabeled ImageNet images, and show marginal gains when trained on larger sets of uncurated images. We hypothesize that current SSL methods perform best on iconic images, and struggle on complex scene images with many…

    Submitted 8 December, 2020; originally announced December 2020.

  18. arXiv:2010.14785  [pdf, other]

    cs.LG cs.AI

    Designing Interpretable Approximations to Deep Reinforcement Learning

    Authors: Nathan Dahlin, Krishna Chaitanya Kalagarla, Nikhil Naik, Rahul Jain, Pierluigi Nuzzo

    Abstract: In an ever expanding set of research and application areas, deep neural networks (DNNs) set the bar for algorithm performance. However, depending upon additional constraints such as processing power and execution time limits, or requirements such as verifiable safety guarantees, it may not be feasible to actually use such high-performing DNNs in practice. Many techniques have been developed in rec…

    Submitted 19 June, 2021; v1 submitted 28 October, 2020; originally announced October 2020.

  19. arXiv:2006.14234  [pdf]

    cs.CR

    Consortium Blockchain for Security and Privacy-Preserving in E-government Systems

    Authors: Noe Elisa, Longzhi Yang, Honglei Li, Fei Chao, Nitin Naik

    Abstract: Since its inception as a solution for secure cryptocurrencies sharing in 2008, the blockchain technology has now become one of the core technologies for secure data sharing and storage over trustless and decentralised peer-to-peer systems. E-government is amongst the systems that stores sensitive information about citizens, businesses and other affiliates, and therefore becomes the target of cyber…

    Submitted 25 June, 2020; originally announced June 2020.

    Comments: 9 pages

  20. arXiv:2004.13332  [pdf, other]

    econ.GN cs.LG stat.ML

    The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies

    Authors: Stephan Zheng, Alexander Trott, Sunil Srinivasa, Nikhil Naik, Melvin Gruesbeck, David C. Parkes, Richard Socher

    Abstract: Tackling real-world socio-economic challenges requires designing and testing economic policies. However, this is hard in practice, due to a lack of appropriate (micro-level) economic data and limited opportunity to experiment. In this work, we train social planners that discover tax policies in dynamic economies that can effectively trade-off economic equality and productivity. We propose a two-le…

    Submitted 28 April, 2020; originally announced April 2020.

    Comments: 46 pages, 21 figures

  21. arXiv:2004.03497  [pdf, other]

    q-bio.BM cs.LG stat.ML

    ProGen: Language Modeling for Protein Generation

    Authors: Ali Madani, Bryan McCann, Nikhil Naik, Nitish Shirish Keskar, Namrata Anand, Raphael R. Eguchi, Po-Ssu Huang, Richard Socher

    Abstract: Generative modeling for protein engineering is key to solving fundamental problems in synthetic biology, medicine, and material science. We pose protein engineering as an unsupervised sequence generation problem in order to leverage the exponentially growing set of proteins that lack costly, structural annotations. We train a 1.2B-parameter language model, ProGen, on ~280M protein sequences condit…

    Submitted 7 March, 2020; originally announced April 2020.

  22. arXiv:2003.13525  [pdf, other]

    cs.CV cs.LG

    Improving out-of-distribution generalization via multi-task self-supervised pretraining

    Authors: Isabela Albuquerque, Nikhil Naik, Junnan Li, Nitish Keskar, Richard Socher

    Abstract: Self-supervised feature representations have been shown to be useful for supervised classification, few-shot learning, and adversarial robustness. We show that features obtained using self-supervised learning are comparable to, or better than, supervised learning for domain generalization in computer vision. We introduce a new self-supervised pretext task of predicting responses to Gabor filter ba…

    Submitted 30 March, 2020; originally announced March 2020.

  23. arXiv:1809.05934  [pdf, other]

    cs.CV cs.LG

    Maximum-Entropy Fine-Grained Classification

    Authors: Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, Nikhil Naik

    Abstract: Fine-Grained Visual Classification (FGVC) is an important computer vision problem that involves small diversity within the different classes, and often requires expert annotators to collect data. Utilizing this notion of small visual diversity, we revisit Maximum-Entropy learning in the context of fine-grained classification, and provide a training routine that maximizes the entropy of the output…

    Submitted 20 September, 2018; v1 submitted 16 September, 2018; originally announced September 2018.

    Comments: Camera-ready, accepted to NIPS 2018, v2 has minor typo updates and small changes in text

  24. arXiv:1705.10823  [pdf, other]

    cs.LG cs.CV cs.NE

    Accelerating Neural Architecture Search using Performance Prediction

    Authors: Bowen Baker, Otkrist Gupta, Ramesh Raskar, Nikhil Naik

    Abstract: Methods for neural network hyperparameter optimization and meta-modeling are computationally expensive due to the need to train a large number of model configurations. In this paper, we show that standard frequentist regression models can predict the final performance of partially trained model configurations using features based on network architectures, hyperparameters, and time-series validatio…

    Submitted 8 November, 2017; v1 submitted 30 May, 2017; originally announced May 2017.

    Comments: Submitted to International Conference on Learning Representations, (2018)

  25. arXiv:1705.08016  [pdf, other]

    cs.CV

    Pairwise Confusion for Fine-Grained Visual Classification

    Authors: Abhimanyu Dubey, Otkrist Gupta, Pei Guo, Ramesh Raskar, Ryan Farrell, Nikhil Naik

    Abstract: Fine-Grained Visual Classification (FGVC) datasets contain small sample sizes, along with significant intra-class variation and inter-class similarity. While prior work has addressed intra-class variation using localization and segmentation techniques, inter-class similarity may also affect feature learning and reduce classification performance. In this work, we address this problem using a novel…

    Submitted 25 July, 2018; v1 submitted 22 May, 2017; originally announced May 2017.

    Comments: Camera-Ready version for ECCV 2018

  26. arXiv:1704.08526  [pdf]

    cs.AR

    An Efficient Reconfigurable FIR Digital Filter Using Modified Distribute Arithmetic Technique

    Authors: Naveen S Naik, Kiran A Gupta

    Abstract: This paper provides modified Distributed Arithmetic based technique to compute sum of products saving appreciable number of Multiply And accumulation blocks and this consecutively reduces circuit size. In this technique multiplexer based structure is used to reuse the blocks so as to reduce the required memory locations. In this technique a Carry Look Ahead based adder tree is used to have better…

    Submitted 27 April, 2017; originally announced April 2017.

    Comments: 5 pages,4 figures, journal 2015

  27. arXiv:1611.02167  [pdf, other]

    cs.LG

    Designing Neural Network Architectures using Reinforcement Learning

    Authors: Bowen Baker, Otkrist Gupta, Nikhil Naik, Ramesh Raskar

    Abstract: At present, designing convolutional neural network (CNN) architectures requires both human expertise and labor. New architectures are handcrafted by careful experimentation or modified from a handful of existing networks. We introduce MetaQNN, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing CNN architectures for a given learning task. The learnin…

    Submitted 22 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

  28. arXiv:1608.01769  [pdf, other]

    cs.CV

    Deep Learning the City : Quantifying Urban Perception At A Global Scale

    Authors: Abhimanyu Dubey, Nikhil Naik, Devi Parikh, Ramesh Raskar, César A. Hidalgo

    Abstract: Computer vision methods that quantify the perception of urban environment are increasingly being used to study the relationship between a city's physical appearance and the behavior and health of its residents. Yet, the throughput of current methods is too limited to quantify the perception of cities across the world. To tackle this challenge, we introduce a new crowdsourced dataset containing 110…

    Submitted 12 September, 2016; v1 submitted 5 August, 2016; originally announced August 2016.

    Comments: 23 pages, 8 figures. Accepted to the European Conference on Computer Vision (ECCV), 2016

  29. arXiv:1608.00462  [pdf, other]

    cs.CY cs.SI physics.soc-ph

    Are Safer Looking Neighborhoods More Lively? A Multimodal Investigation into Urban Life

    Authors: Marco De Nadai, Radu L. Vieriu, Gloria Zen, Stefan Dragicevic, Nikhil Naik, Michele Caraviello, Cesar A. Hidalgo, Nicu Sebe, Bruno Lepri

    Abstract: Policy makers, urban planners, architects, sociologists, and economists are interested in creating urban areas that are both lively and safe. But are the safety and liveliness of neighborhoods independent characteristics? Or are they just two sides of the same coin? In a world where people avoid unsafe looking places, neighborhoods that look unsafe will be less lively, and will fail to harness the…

    Submitted 1 August, 2016; originally announced August 2016.

    Comments: To appear in the Proceedings of ACM Multimedia Conference (MM), 2016. October 15 - 19, 2016, Amsterdam, Netherlands

  30. arXiv:1511.06147  [pdf, other]

    cs.CV cs.LG

    Coreset-Based Adaptive Tracking

    Authors: Abhimanyu Dubey, Nikhil Naik, Dan Raviv, Rahul Sukthankar, Ramesh Raskar

    Abstract: We propose a method for learning from streaming visual data using a compact, constant size representation of all the data that was seen until a given moment. Specifically, we construct a 'coreset' representation of streaming data using a parallelized algorithm, which is an approximation of a set with relation to the squared distances between this set and all other points in its ambient space. We l…

    Submitted 19 November, 2015; originally announced November 2015.

    Comments: 8 pages, 5 figures, In submission to IEEE TPAMI (Transactions on Pattern Analysis and Machine Intelligence)

  31. Robust real time face recognition and tracking on gpu using fusion of rgb and depth image

    Authors: Narmada Naik, G. N Rathna

    Abstract: This paper presents a real-time face recognition system using kinect sensor. The algorithm is implemented on GPU using opencl and significant speed improvements are observed. We use kinect depth image to increase the robustness and reduce computational cost of conventional LBP based face recognition. The main objective of this paper was to perform robust, high speed fusion based face recognition a…

    Submitted 8 April, 2015; originally announced April 2015.

  32. arXiv:1501.04878   

    cs.CV

    A Light Transport Model for Mitigating Multipath Interference in TOF Sensors

    Authors: Nikhil Naik, Achuta Kadambi, Christoph Rhemann, Shahram Izadi, Ramesh Raskar, Sing Bing Kang

    Abstract: Continuous-wave Time-of-flight (TOF) range imaging has become a commercially viable technology with many applications in computer vision and graphics. However, the depth images obtained from TOF cameras contain scene dependent errors due to multipath interference (MPI). Specifically, MPI occurs when multiple optical reflections return to a single spatial location on the imaging sensor. Many prior…

    Submitted 30 January, 2015; v1 submitted 20 January, 2015; originally announced January 2015.

    Comments: This paper has been withdrawn by the submitter as the submission was made due to a miscommunication

  33. arXiv:0912.3970  [pdf]

    cs.NI cs.CR

    Penetration Testing: A Roadmap to Network Security

    Authors: Nitin A. Naik, Gajanan D. Kurundkar, Santosh D. Khamitkar, Namdeo V. Kalyankar

    Abstract: Network penetration testing identifies the exploits and vulnerabilities those exist within computer network infrastructure and help to confirm the security measures. The objective of this paper is to explain methodology and methods behind penetration testing and illustrate remedies over it, which will provide substantial value for network security. Penetration testing should model real world atta…

    Submitted 26 December, 2009; v1 submitted 19 December, 2009; originally announced December 2009.

    Journal ref: Journal of Computing, Volume 1, Issue 1, pp 187-190, December 2009