Showing 1–50 of 104 results for author: Nair, S

  1. arXiv:2406.09246  [pdf, other]

    cs.RO cs.LG

    OpenVLA: An Open-Source Vision-Language-Action Model

    Authors: Moo Jin Kim, Karl Pertsch, Siddharth Karamcheti, Ted Xiao, Ashwin Balakrishna, Suraj Nair, Rafael Rafailov, Ethan Foster, Grace Lam, Pannag Sanketi, Quan Vuong, Thomas Kollar, Benjamin Burchfiel, Russ Tedrake, Dorsa Sadigh, Sergey Levine, Percy Liang, Chelsea Finn

    Abstract: Large policies pretrained on a combination of Internet-scale vision-language data and diverse robot demonstrations have the potential to change how we teach robots new skills: rather than training new behaviors from scratch, we can fine-tune such vision-language-action (VLA) models to obtain robust, generalizable policies for visuomotor control. Yet, widespread adoption of VLAs for robotics has be…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Website: https://openvla.github.io/

  2. arXiv:2406.03407  [pdf, other]

    cs.LG cs.SD eess.AS physics.comp-ph

    Physics and geometry informed neural operator network with application to acoustic scattering

    Authors: Siddharth Nair, Timothy F. Walsh, Greg Pickrell, Fabio Semperlotti

    Abstract: In this paper, we introduce a physics and geometry informed neural operator network with application to the forward simulation of acoustic scattering. The development of geometry informed deep learning models capable of learning a solution operator for different computational domains is a problem of general importance for a variety of engineering applications. To this end, we propose a physics-inf…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 20 pages of main text, 9 figures

  3. arXiv:2405.11511  [pdf, other]

    cs.CV

    Online Action Representation using Change Detection and Symbolic Programming

    Authors: Vishnu S Nair, Sneha Sree, Jayaraj Joseph, Mohanasankar Sivaprakasam

    Abstract: This paper addresses the critical need for online action representation, which is essential for various applications like rehabilitation, surveillance, etc. The task can be defined as representation of actions as soon as they happen in a streaming video without access to video frames in the future. Most of the existing methods use predefined window sizes for video segments, which is a restrictive…

    Submitted 19 May, 2024; originally announced May 2024.

  4. arXiv:2405.01114  [pdf, other]

    cs.LG cs.RO

    Continual Imitation Learning for Prosthetic Limbs

    Authors: Sharmita Dey, Benjamin Paassen, Sarath Ravindran Nair, Sabri Boughorbel, Arndt F. Schilling

    Abstract: Lower limb amputations and neuromuscular impairments severely restrict mobility, necessitating advancements beyond conventional prosthetics. Motorized bionic limbs offer promise, but their utility depends on mimicking the evolving synergy of human movement in various settings. In this context, we present a novel model for bionic prostheses' application that leverages camera-based motion capture an…

    Submitted 2 May, 2024; originally announced May 2024.

  5. arXiv:2404.18797  [pdf, other]

    cs.IR

    Efficiency-Effectiveness Tradeoff of Probabilistic Structured Queries for Cross-Language Information Retrieval

    Authors: Eugene Yang, Suraj Nair, Dawn Lawrie, James Mayfield, Douglas W. Oard, Kevin Duh

    Abstract: Probabilistic Structured Queries (PSQ) is a cross-language information retrieval (CLIR) method that uses translation probabilities statistically derived from aligned corpora. PSQ is a strong baseline for efficient CLIR using sparse indexing. It is, therefore, useful as the first stage in a cascaded neural CLIR system whose second stage is more effective but too inefficient to be used on its own to…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 11 pages, 5 figures

  6. arXiv:2403.12945  [pdf, other]

    cs.RO

    DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    Authors: Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, Peter David Fagan, Joey Hejna, Masha Itkina, Marion Lepert, Yecheng Jason Ma, Patrick Tree Miller, Jimmy Wu, Suneel Belkhale, Shivin Dass, Huy Ha, Arhan Jain, Abraham Lee, Youngwoon Lee, Marius Memmel, Sungjae Park, et al. (74 additional authors not shown)

    Abstract: The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a resu…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Project website: https://droid-dataset.github.io/

  7. arXiv:2403.06569  [pdf, other]

    cs.LG cs.RO

    Enhancing Joint Motion Prediction for Individuals with Limb Loss Through Model Reprogramming

    Authors: Sharmita Dey, Sarath R. Nair

    Abstract: Mobility impairment caused by limb loss is a significant challenge faced by millions of individuals worldwide. The development of advanced assistive technologies, such as prosthetic devices, has the potential to greatly improve the quality of life for amputee patients. A critical component in the design of such technologies is the accurate prediction of reference joint motion for the missing limb.…

    Submitted 12 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Journal ref: ICLR 2024 Workshop: Learning from Time Series for Health

  8. arXiv:2403.00122  [pdf]

    physics.soc-ph cs.CY cs.ET quant-ph

    Quantum Readiness in Healthcare and Public Health: Building a Quantum Literate Workforce

    Authors: Jonathan B VanGeest, Kieran J Fogarty, William G Hervey, Robert A Hanson, Suresh Nair, Timothy A Akers

    Abstract: Quantum technologies, including quantum computing, cryptography, and sensing, among others, are set to revolutionize sectors ranging from materials science to drug discovery. Despite their significant potential, the implications for public health have been largely overlooked, highlighting a critical gap in recognition and preparation. This oversight necessitates immediate action, as public health…

    Submitted 29 February, 2024; originally announced March 2024.

    Comments: 13 pages, 1 table

  9. arXiv:2402.07865  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

    Authors: Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna, Percy Liang, Thomas Kollar, Dorsa Sadigh

    Abstract: Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning; adoption that has fueled a wealth of new models such as LLaVa, InstructBLIP, and PaLI-3. Despite the volume of new releases, key design decisions around image preprocessing, architecture, and optimization are under-explored, making it chall…

    Submitted 30 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Published at ICML 2024. 22 pages, 11 figures. Training code and models: https://github.com/TRI-ML/prismatic-vlms. Evaluation code: https://github.com/TRI-ML/vlm-evaluation

  10. arXiv:2402.07158  [pdf, other]

    cs.SE cs.LG

    Effort and Size Estimation in Software Projects with Large Language Model-based Intelligent Interfaces

    Authors: Claudionor N. Coelho Jr, Hanchen Xiong, Tushar Karayil, Sree Koratala, Rex Shang, Jacob Bollinger, Mohamed Shabar, Syam Nair

    Abstract: The advancement of Large Language Models (LLM) has also resulted in an equivalent proliferation in its applications. Software design, being one, has gained tremendous benefits in using LLMs as an interface component that extends fixed user stories. However, inclusion of LLM-based AI agents in software design often poses unexpected challenges, especially in the estimation of development efforts. Th…

    Submitted 28 June, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  11. arXiv:2402.01116  [pdf, other]

    cs.RO cs.LG eess.SY

    Scalable Multi-modal Model Predictive Control via Duality-based Interaction Predictions

    Authors: Hansung Kim, Siddharth H. Nair, Francesco Borrelli

    Abstract: We propose a hierarchical architecture designed for scalable real-time Model Predictive Control (MPC) in complex, multi-modal traffic scenarios. This architecture comprises two key components: 1) RAID-Net, a novel attention-based Recurrent Neural Network that predicts relevant interactions along the MPC prediction horizon between the autonomous vehicle and the surrounding vehicles using Lagrangian…

    Submitted 2 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted at IEEE Intelligent Vehicles Symposium 2024

  12. arXiv:2401.08598  [pdf, other]

    cs.CV

    NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation

    Authors: Chi-en Amy Tai, Saeejith Nair, Olivia Markham, Matthew Keller, Yifan Wu, Yuhao Chen, Alexander Wong

    Abstract: Dietary intake estimation plays a crucial role in understanding the nutritional habits of individuals and populations, aiding in the prevention and management of diet-related health issues. Accurate estimation requires comprehensive datasets of food scenes, including images, segmentation masks, and accompanying dietary intake metadata. In this paper, we introduce NutritionVerse-Real, an open acces…

    Submitted 20 November, 2023; originally announced January 2024.

  13. arXiv:2312.14115  [pdf, other]

    cs.RO cs.AI cs.CV

    LingoQA: Video Question Answering for Autonomous Driving

    Authors: Ana-Maria Marcu, Long Chen, Jan Hünermann, Alice Karnsund, Benoit Hanotte, Prajwal Chidananda, Saurabh Nair, Vijay Badrinarayanan, Alex Kendall, Jamie Shotton, Elahe Arani, Oleg Sinavski

    Abstract: Autonomous driving has long faced a challenge with public acceptance due to the lack of explainability in the decision-making process. Video question-answering (QA) in natural language provides the opportunity for bridging this gap. Nonetheless, evaluating the performance of Video QA models has proved particularly tough due to the absence of comprehensive benchmarks. To fill this gap, we introduce…

    Submitted 19 March, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Benchmark and dataset are available at https://github.com/wayveai/LingoQA/

  14. arXiv:2312.06192  [pdf, other]

    cs.CV

    NutritionVerse-Synth: An Open Access Synthetically Generated 2D Food Scene Dataset for Dietary Intake Estimation

    Authors: Saeejith Nair, Chi-en Amy Tai, Yuhao Chen, Alexander Wong

    Abstract: Manually tracking nutritional intake via food diaries is error-prone and burdensome. Automated computer vision techniques show promise for dietary monitoring but require large and diverse food image datasets. To address this need, we introduce NutritionVerse-Synth (NV-Synth), a large-scale synthetic food image dataset. NV-Synth contains 84,984 photorealistic meal images rendered from 7,082 dynamic…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 6 pages

  15. arXiv:2312.05171  [pdf, other]

    cs.AI cs.NE

    DARLEI: Deep Accelerated Reinforcement Learning with Evolutionary Intelligence

    Authors: Saeejith Nair, Mohammad Javad Shafiee, Alexander Wong

    Abstract: We present DARLEI, a framework that combines evolutionary algorithms with parallelized reinforcement learning for efficiently training and evolving populations of UNIMAL agents. Our approach utilizes Proximal Policy Optimization (PPO) for individual agent learning and pairs it with a tournament selection-based generational learning mechanism to foster morphological evolution. By building on Nvidia…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 9 pages

  16. arXiv:2310.20561  [pdf, other]

    cs.RO eess.SY math.OC

    Predictive Control for Autonomous Driving with Uncertain, Multi-modal Predictions

    Authors: Siddharth H. Nair, Hotae Lee, Eunhyek Joa, Yan Wang, H. Eric Tseng, Francesco Borrelli

    Abstract: We propose a Stochastic MPC (SMPC) formulation for path planning with autonomous vehicles in scenarios involving multiple agents with multi-modal predictions. The multi-modal predictions capture the uncertainty of urban driving in distinct modes/maneuvers (e.g., yield, keep speed) and driving trajectories (e.g., speed, turning radius), which are incorporated for multi-modal collision avoidance cha…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: The first three authors contributed equally

  17. arXiv:2310.17774  [pdf, other]

    cs.CL

    Words, Subwords, and Morphemes: What Really Matters in the Surprisal-Reading Time Relationship?

    Authors: Sathvik Nair, Philip Resnik

    Abstract: An important assumption that comes with using LLMs on psycholinguistic data has gone unverified. LLM-based predictions are based on subword tokenization, not decomposition of words into morphemes. Does that matter? We carefully test this by comparing surprisal estimates using orthographic, morphological, and BPE tokenization against reading time data. Our results replicate previous findings and pr…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023; 10 pages, 5 figures

  18. From Propeller Damage Estimation and Adaptation to Fault Tolerant Control: Enhancing Quadrotor Resilience

    Authors: Jeffrey Mao, Jennifer Yeom, Suraj Nair, Giuseppe Loianno

    Abstract: Aerial robots are required to remain operational even in the event of system disturbances, damages, or failures to ensure resilient and robust task completion and safety. One common failure case is propeller damage, which presents a significant challenge in both quantification and compensation. We propose a novel adaptive control scheme capable of detecting and compensating for multi-rotor propell…

    Submitted 14 March, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: 8 Pages, 8 Figures

    Report number: ras.ral.23-2753.d1c6d6ca

    Journal ref: IEEE Robotics and Automation Letters (2024) Vol. 9 Issue 5

  19. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  20. arXiv:2309.14293  [pdf, other]

    cs.CV cs.AI cs.LG

    NAS-NeRF: Generative Neural Architecture Search for Neural Radiance Fields

    Authors: Saeejith Nair, Yuhao Chen, Mohammad Javad Shafiee, Alexander Wong

    Abstract: Neural radiance fields (NeRFs) enable high-quality novel view synthesis, but their high computational complexity limits deployability. While existing neural-based solutions strive for efficiency, they use one-size-fits-all architectures regardless of scene complexity. The same architecture may be unnecessarily large for simple scenes but insufficient for complex ones. Thus, there is a need to dyna…

    Submitted 11 December, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: 8 pages

  21. arXiv:2309.07704  [pdf, other]

    cs.CV cs.AI

    NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches

    Authors: Chi-en Amy Tai, Matthew Keller, Saeejith Nair, Yuhao Chen, Yifan Wu, Olivia Markham, Krish Parmar, Pengcheng Xi, Heather Keller, Sharon Kirkpatrick, Alexander Wong

    Abstract: Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating, as malnutrition has been directly linked to decreased quality of life. However self-reporting methods such as food diaries suffer from substantial bias. Other conventional dietary assessment techniques and emerging alternative approaches such as mobile applications incur high time costs an…

    Submitted 14 September, 2023; originally announced September 2023.

  22. arXiv:2308.11421  [pdf, other]

    cs.CV cs.AI cs.LG

    TurboViT: Generating Fast Vision Transformers via Generative Architecture Search

    Authors: Alexander Wong, Saad Abbasi, Saeejith Nair

    Abstract: Vision transformers have shown unprecedented levels of performance in tackling various visual perception tasks in recent years. However, the architectural and computational complexity of such network architectures have made them challenging to deploy in real-world applications with high-throughput, low-memory requirements. As such, there has been significant research recently on the design of effi…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: 5 pages

  23. arXiv:2305.00331  [pdf, other]

    cs.IR

    Synthetic Cross-language Information Retrieval Training Data

    Authors: James Mayfield, Eugene Yang, Dawn Lawrie, Samuel Barham, Orion Weller, Marc Mason, Suraj Nair, Scott Miller

    Abstract: A key stumbling block for neural cross-language information retrieval (CLIR) systems has been the paucity of training data. The appearance of the MS MARCO monolingual training set led to significant advances in the state of the art in neural monolingual retrieval. By translating the MS MARCO documents into other languages using machine translation, this resource has been made useful to the CLIR co…

    Submitted 29 April, 2023; originally announced May 2023.

    Comments: 11 pages, 4 figures

  24. arXiv:2304.11196  [pdf, other]

    cs.CV cs.AI cs.LG

    Fast GraspNeXt: A Fast Self-Attention Neural Network Architecture for Multi-task Learning in Computer Vision Tasks for Robotic Grasping on the Edge

    Authors: Alexander Wong, Yifan Wu, Saad Abbasi, Saeejith Nair, Yuhao Chen, Mohammad Javad Shafiee

    Abstract: Multi-task learning has shown considerable promise for improving the performance of deep learning-driven vision systems for the purpose of robotic grasping. However, high architectural and computational complexity can result in poor suitability for deployment on embedded devices that are typically leveraged in robotic arms for real-world manufacturing and warehouse environments. As such, the desig…

    Submitted 21 April, 2023; originally announced April 2023.

    Comments: Accepted at CVPR-NAS 2023 Workshop

  25. arXiv:2304.08742  [pdf, other]

    cs.RO cs.AI cs.LG

    Behavior Retrieval: Few-Shot Imitation Learning by Querying Unlabeled Datasets

    Authors: Maximilian Du, Suraj Nair, Dorsa Sadigh, Chelsea Finn

    Abstract: Enabling robots to learn novel visuomotor skills in a data-efficient manner remains an unsolved problem with myriad challenges. A popular paradigm for tackling this problem is through leveraging large unlabeled datasets that have many behaviors in them and then adapting a policy to a specific task using a small amount of task-specific human supervision (i.e. interventions or demonstrations). Howev…

    Submitted 12 May, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

  26. arXiv:2304.05620  [pdf, other]

    cs.CV

    NutritionVerse-Thin: An Optimized Strategy for Enabling Improved Rendering of 3D Thin Food Models

    Authors: Chi-en Amy Tai, Jason Li, Sriram Kumar, Saeejith Nair, Yuhao Chen, Pengcheng Xi, Alexander Wong

    Abstract: With the growth in capabilities of generative models, there has been growing interest in using photo-realistic renders of common 3D food items to improve downstream tasks such as food printing, nutrition prediction, or management of food wastage. Despite 3D modelling capabilities being more accessible than ever due to the success of NeRF based view-synthesis, such rendering methods still struggle…

    Submitted 12 April, 2023; originally announced April 2023.

  27. arXiv:2304.05619  [pdf, other]

    cs.CV

    NutritionVerse-3D: A 3D Food Model Dataset for Nutritional Intake Estimation

    Authors: Chi-en Amy Tai, Matthew Keller, Mattie Kerrigan, Yuhao Chen, Saeejith Nair, Pengcheng Xi, Alexander Wong

    Abstract: 77% of adults over 50 want to age in place today, presenting a major challenge to ensuring adequate nutritional intake. It has been reported that one in four older adults that are 65 years or older are malnourished and given the direct link between malnutrition and decreased quality of life, there have been numerous studies conducted on how to efficiently track nutritional intake of food. Recent a…

    Submitted 12 April, 2023; originally announced April 2023.

  28. arXiv:2303.17951  [pdf, other]

    cs.LG

    FP8 versus INT8 for efficient deep learning inference

    Authors: Mart van Baalen, Andrey Kuzmin, Suparna S Nair, Yuwei Ren, Eric Mahurin, Chirag Patel, Sundar Subramanian, Sanghyuk Lee, Markus Nagel, Joseph Soriaga, Tijmen Blankevoort

    Abstract: Recently, the idea of using FP8 as a number format for neural network training has been floating around the deep learning world. Given that most training is currently conducted with entire networks in FP32, or sometimes FP16 with mixed-precision, the step to having some parts of a network run in FP8 with 8-bit weights is an appealing potential speed-up for the generally costly and time-intensive t…

    Submitted 15 June, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

  29. arXiv:2303.06381  [pdf, other]

    eess.SP cs.IT cs.LG

    Learning to Precode for Integrated Sensing and Communications Systems

    Authors: R. S. Prasobh Sankar, Sidharth S. Nair, Siddhant Doshi, Sundeep Prabhakar Chepuri

    Abstract: In this paper, we present an unsupervised learning neural model to design transmit precoders for integrated sensing and communication (ISAC) systems to maximize the worst-case target illumination power while ensuring a minimum signal-to-interference-plus-noise ratio (SINR) for all the users. The problem of learning transmit precoders from uplink pilots and echoes can be viewed as a parameterized f…

    Submitted 11 March, 2023; originally announced March 2023.

  30. arXiv:2302.12766  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV cs.LG

    Language-Driven Representation Learning for Robotics

    Authors: Siddharth Karamcheti, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, Percy Liang

    Abstract: Recent work in visual representation learning for robotics demonstrates the viability of learning from large video datasets of humans performing everyday tasks. Leveraging methods such as masked autoencoding and contrastive learning, these representations exhibit strong transfer to policy learning for visuomotor control. But, robot learning encompasses a diverse set of problems beyond control incl…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: 30 Pages, 15 Figures

  31. Lazard-style CAD and Equational Constraints

    Authors: James H. Davenport, Akshar S. Nair, Gregory K. Sankaran, Ali K. Uncu

    Abstract: McCallum-style Cylindrical Algebra Decomposition (CAD) is a major improvement on the original Collins version, and has had many subsequent advances, notably for total or partial equational constraints. But it suffers from a problem with nullification. The recently-justified Lazard-style CAD does not have this problem. However, transporting the equational constraints work to Lazard-style does reint…

    Submitted 7 December, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: 9 pages

    MSC Class: 68W30 ACM Class: I.1.2

    Journal ref: Proceedings of ISSAC'23, 2023

  32. arXiv:2302.00060  [pdf, other]

    cs.RO eess.SY

    Interaction and Decision Making-aware Motion Planning using Branch Model Predictive Control

    Authors: Rui Oliveira, Siddharth H. Nair, Bo Wahlberg

    Abstract: Motion planning for autonomous vehicles sharing the road with human drivers remains challenging. The difficulty arises from three challenging aspects: human drivers are 1) multi-modal, 2) interacting with the autonomous vehicle, and 3) actively making decisions based on the current state of the traffic scene. We propose a motion planning framework based on Branch Model Predictive Control to deal w…

    Submitted 31 January, 2023; originally announced February 2023.

    Comments: 8 pages, 10 figures

  33. arXiv:2301.09268  [pdf, other]

    cs.CV

    PCBDet: An Efficient Deep Neural Network Object Detection Architecture for Automatic PCB Component Detection on the Edge

    Authors: Brian Li, Steven Palayew, Francis Li, Saad Abbasi, Saeejith Nair, Alexander Wong

    Abstract: There can be numerous electronic components on a given PCB, making the task of visual inspection to detect defects very time-consuming and prone to error, especially at scale. There has thus been significant interest in automatic PCB component detection, particularly leveraging deep learning. However, deep neural networks typically require high computational resources, possibly limiting their feas…

    Submitted 22 January, 2023; originally announced January 2023.

    Comments: 7 pages, 6 figures

  34. Physics-informed Neural Networks approach to solve the Blasius function

    Authors: Greeshma Krishna, Malavika S Nair, Pramod P Nair, Anil Lal S

    Abstract: Deep learning techniques with neural networks have been used effectively in computational fluid dynamics (CFD) to obtain solutions to nonlinear differential equations. This paper presents a physics-informed neural network (PINN) approach to solve the Blasius function. This method eliminates the process of changing the non-linear differential equation to an initial value problem. Also, it tackles t…

    Submitted 5 February, 2023; v1 submitted 30 December, 2022; originally announced January 2023.

  35. arXiv:2212.10448  [pdf, other]

    cs.IR cs.CL

    Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters

    Authors: Eugene Yang, Suraj Nair, Dawn Lawrie, James Mayfield, Douglas W. Oard

    Abstract: A popular approach to creating a zero-shot cross-language retrieval model is to substitute a monolingual pretrained language model in the retrieval model with a multilingual pretrained language model such as Multilingual BERT. This multilingual model is fine-tuned to the retrieval task with monolingual data such as English MS MARCO using the same training recipe as the monolingual retrieval model…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: 15 pages, 1 figure

  36. arXiv:2209.08403  [pdf, other]

    cs.LG cs.IR

    Advertising Media and Target Audience Optimization via High-dimensional Bandits

    Authors: Wenjia Ba, J. Michael Harrison, Harikesh S. Nair

    Abstract: We present a data-driven algorithm that advertisers can use to automate their digital ad-campaigns at online publishers. The algorithm enables the advertiser to search across available target audiences and ad-media to find the best possible combination for its campaign via online experimentation. The problem of finding the best audience-ad combination is complicated by a number of distinctive chal…

    Submitted 17 September, 2022; originally announced September 2022.

    Comments: 39 pages, 8 figures

  37. arXiv:2209.03910  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    PixTrack: Precise 6DoF Object Pose Tracking using NeRF Templates and Feature-metric Alignment

    Authors: Prajwal Chidananda, Saurabh Nair, Douglas Lee, Adrian Kaehler

    Abstract: We present PixTrack, a vision based object pose tracking framework using novel view synthesis and deep feature-metric alignment. We follow an SfM-based relocalization paradigm where we use a Neural Radiance Field to canonically represent the tracked object. Our evaluations demonstrate that our method produces highly accurate, robust, and jitter-free 6DoF pose estimates of objects in both monocular…

    Submitted 14 February, 2024; v1 submitted 8 September, 2022; originally announced September 2022.

  38. arXiv:2208.06980  [pdf, other]

    cs.CV eess.IV

    Faster Attention Is What You Need: A Fast Self-Attention Neural Network Backbone Architecture for the Edge via Double-Condensing Attention Condensers

    Authors: Alexander Wong, Mohammad Javad Shafiee, Saad Abbasi, Saeejith Nair, Mahmoud Famouri

    Abstract: With the growing adoption of deep learning for on-device TinyML applications, there has been an ever-increasing demand for efficient neural network backbones optimized for the edge. Recently, the introduction of attention condenser networks have resulted in low-footprint, highly-efficient, self-attention neural networks that strike a strong balance between accuracy and speed. In this study, we int…

    Submitted 3 February, 2023; v1 submitted 14 August, 2022; originally announced August 2022.

  39. arXiv:2208.03529  [pdf, other]

    cs.RO math.OC

    Collision Avoidance for Dynamic Obstacles with Uncertain Predictions using Model Predictive Control

    Authors: Siddharth H. Nair, Eric H. Tseng, Francesco Borrelli

    Abstract: We propose a Model Predictive Control (MPC) for collision avoidance between an autonomous agent and dynamic obstacles with uncertain predictions. The collision avoidance constraints are imposed by enforcing positive distance between convex sets representing the agent and the obstacles, and tractably reformulating them using Lagrange duality. This approach allows for smooth collision avoidance cons…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: Accepted to CDC'22

  40. arXiv:2207.09372  [pdf, other]

    cs.RO cs.AI

    On Decentralizing Federated Reinforcement Learning in Multi-Robot Scenarios

    Authors: Jayprakash S. Nair, Divya D. Kulkarni, Ajitem Joshi, Sruthy Suresh

    Abstract: Federated Learning (FL) allows for collaboratively aggregating learned information across several computing devices and sharing the same amongst them, thereby tackling issues of privacy and the need of huge bandwidth. FL techniques generally use a central server or cloud for aggregating the models received from the devices. Such centralized FL techniques suffer from inherent problems such as failu…

    Submitted 7 September, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Submitted to SEEDA 2022. This arxiv is a preprint and NOT the final version

  41. arXiv:2205.14850  [pdf, other]

    cs.RO cs.LG cs.SD eess.AS

    Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning

    Authors: Maximilian Du, Olivia Y. Lee, Suraj Nair, Chelsea Finn

    Abstract: Humans are capable of completing a range of challenging manipulation tasks that require reasoning jointly over modalities such as vision, touch, and sound. Moreover, many such tasks are partially-observed; for example, taking a notebook out of a backpack will lead to visual occlusion and require reasoning over the history of audio or tactile information. While robust tactile sensing can be costly…

    Submitted 30 May, 2022; originally announced May 2022.

    Journal ref: Robotics Science and Systems (RSS) 2022

  42. Residual-Concatenate Neural Network with Deep Regularization Layers for Binary Classification

    Authors: Abhishek Gupta, Sruthi Nair, Raunak Joshi, Vidya Chitre

    Abstract: Many complex Deep Learning models are used with different variations for various prognostication tasks. The higher learning parameters not necessarily ensure great accuracy. This can be solved by considering changes in very deep models with many regularization based techniques. In this paper we train a deep neural network that uses many regularization layers with residual and concatenation process…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: 7 pages, 5 figures. To appear in the proceedings of 6th International Conference on Intelligent Computing and Control Systems (ICICCS 2022)

  43. arXiv:2204.12950  [pdf, other]

    cs.LG cs.AI cs.CV

    MAPLE-Edge: A Runtime Latency Predictor for Edge Devices

    Authors: Saeejith Nair, Saad Abbasi, Alexander Wong, Mohammad Javad Shafiee

    Abstract: Neural Architecture Search (NAS) has enabled automatic discovery of more efficient neural network architectures, especially for mobile and embedded vision applications. Although recent research has proposed ways of quickly estimating latency on unseen hardware devices with just a few samples, little focus has been given to the challenges of estimating latency on runtimes using optimized graphs, su…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: 9 pages

  44. C3: Continued Pretraining with Contrastive Weak Supervision for Cross Language Ad-Hoc Retrieval

    Authors: Eugene Yang, Suraj Nair, Ramraj Chandradevan, Rebecca Iglesias-Flores, Douglas W. Oard

    Abstract: Pretrained language models have improved effectiveness on numerous tasks, including ad-hoc retrieval. Recent work has shown that continuing to pretrain a language model with auxiliary objectives before fine-tuning on the retrieval task can further improve retrieval effectiveness. Unlike monolingual retrieval, designing an appropriate auxiliary task for cross-language mappings is challenging. To ad…

    Submitted 25 April, 2022; originally announced April 2022.

    Comments: 6 pages, 2 figures, accepted as a SIGIR 2022 Short Paper

  45. arXiv:2204.11766  [pdf, other]

    eess.IV cs.CV cs.LG

    CellDefectNet: A Machine-designed Attention Condenser Network for Electroluminescence-based Photovoltaic Cell Defect Inspection

    Authors: Carol Xu, Mahmoud Famouri, Gautam Bathla, Saeejith Nair, Mohammad Javad Shafiee, Alexander Wong

    Abstract: Photovoltaic cells are electronic devices that convert light energy to electricity, forming the backbone of solar energy harvesting systems. An essential step in the manufacturing process for photovoltaic cells is visual quality inspection using electroluminescence imaging to identify defects such as cracks, finger interruptions, and broken cells. A big challenge faced by industry in photovoltaic…

    Submitted 25 April, 2022; originally announced April 2022.

    Comments: 6 pages

  46. arXiv:2203.12601  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    R3M: A Universal Visual Representation for Robot Manipulation

    Authors: Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, Abhinav Gupta

    Abstract: We study how visual representations pre-trained on diverse human video data can enable data-efficient learning of downstream robotic manipulation tasks. Concretely, we pre-train a visual representation using the Ego4D human video dataset using a combination of time-contrastive learning, video-language alignment, and an L1 penalty to encourage sparse and compact representations. The resulting repre…

    Submitted 18 November, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: Conference on Robot Learning (CoRL) 2022

  47. arXiv:2202.08910  [pdf, other]

    cs.LG

    Combining Varied Learners for Binary Classification using Stacked Generalization

    Authors: Sruthi Nair, Abhishek Gupta, Raunak Joshi, Vidya Chitre

    Abstract: The Machine Learning has various learning algorithms that are better in some or the other aspect when compared with each other but a common error that all algorithms will suffer from is training data with very high dimensional feature set. This usually ends up algorithms into generalization error that deplete the performance. This can be solved using an Ensemble Learning method known as Stacking c…

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: 9 pages, 4 figures, 5 tables, 8 equations

  48. arXiv:2202.03870  [pdf, other]

    cs.LG cs.AI

    Maximum Likelihood Uncertainty Estimation: Robustness to Outliers

    Authors: Deebul S. Nair, Nico Hochgeschwender, Miguel A. Olivares-Mendez

    Abstract: We benchmark the robustness of maximum likelihood based uncertainty estimation methods to outliers in training data for regression tasks. Outliers or noisy labels in training data results in degraded performances as well as incorrect estimation of uncertainty. We propose the use of a heavy-tailed distribution (Laplace distribution) to improve the robustness to outliers. This property is evaluated…

    Submitted 3 February, 2022; originally announced February 2022.

    Comments: 8 Pages, 8 Figures, The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), The AAAI's Workshop on Artificial Intelligence Safety

  49. arXiv:2201.08471  [pdf, other]

    cs.IR cs.CL

    Transfer Learning Approaches for Building Cross-Language Dense Retrieval Models

    Authors: Suraj Nair, Eugene Yang, Dawn Lawrie, Kevin Duh, Paul McNamee, Kenton Murray, James Mayfield, Douglas W. Oard

    Abstract: The advent of transformer-based models such as BERT has led to the rise of neural ranking models. These models have improved the effectiveness of retrieval systems well beyond that of lexical term matching models such as BM25. While monolingual retrieval tasks have benefited from large-scale training collections such as MS MARCO and advances in neural architectures, cross-language retrieval tasks…

    Submitted 20 January, 2022; originally announced January 2022.

    Comments: Accepted at ECIR 2022 (Full paper)

  50. arXiv:2201.01850  [pdf, other]

    cs.CV cs.AI

    On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving

    Authors: Giulio Rossolini, Federico Nesti, Gianluca D'Amico, Saasha Nair, Alessandro Biondi, Giorgio Buttazzo

    Abstract: The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat for the use of deep learning models in safety-critical computer vision tasks such as visual perception in autonomous driving. This paper presents an extensive evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches, including digit…

    Submitted 5 January, 2022; originally announced January 2022.