Showing 1–50 of 65 results for author: Dave, A

Searching in archive cs.
  1. arXiv:2406.16862  [pdf, other]

    cs.RO cs.CV

    Dreamitate: Real-World Visuomotor Policy Learning via Video Generation

    Authors: Junbang Liang, Ruoshi Liu, Ege Ozguroglu, Sruthi Sudhakar, Achal Dave, Pavel Tokmakov, Shuran Song, Carl Vondrick

    Abstract: A key challenge in manipulation is learning a policy that can robustly generalize to diverse visual environments. A promising mechanism for learning robust policies is to leverage video generative models, which are pretrained on large-scale datasets of internet videos. In this paper, we propose a visuomotor policy learning framework that fine-tunes a video diffusion model on human demonstrations o…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Project page: https://dreamitate.cs.columbia.edu/

  2. arXiv:2406.12036  [pdf, other]

    cs.CL cs.AI

    MedCalc-Bench: Evaluating Large Language Models for Medical Calculations

    Authors: Nikhil Khandekar, Qiao Jin, Guangzhi Xiong, Soren Dunn, Serina S Applebaum, Zain Anwar, Maame Sarfo-Gyamfi, Conrad W Safranek, Abid A Anwar, Andrew Zhang, Aidan Gilson, Maxwell B Singer, Amisha Dave, Andrew Taylor, Aidong Zhang, Qingyu Chen, Zhiyong Lu

    Abstract: As opposed to evaluating computation and logic-based reasoning, current benchmarks for evaluating large language models (LLMs) in medicine are primarily focused on question-answering involving domain knowledge and descriptive reasoning. While such qualitative capabilities are vital to medical diagnosis, in real-world scenarios, doctors frequently use clinical calculators that follow quantitative e…

    Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Github link: https://github.com/ncbi-nlp/MedCalc-Bench HuggingFace link: https://huggingface.co/datasets/nsk7153/MedCalc-Bench

  3. arXiv:2406.11794  [pdf, other]

    cs.LG cs.CL

    DataComp-LM: In search of the next generation of training sets for language models

    Authors: Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, et al. (34 additional authors not shown)

    Abstract: We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with dat…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Project page: https://www.datacomp.ai/dclm/

  4. arXiv:2406.10212  [pdf, other]

    cs.CV cs.GR

    NeST: Neural Stress Tensor Tomography by leveraging 3D Photoelasticity

    Authors: Akshat Dave, Tianyi Zhang, Aaron Young, Ramesh Raskar, Wolfgang Heidrich, Ashok Veeraraghavan

    Abstract: Photoelasticity enables full-field stress analysis in transparent objects through stress-induced birefringence. Existing techniques are limited to 2D slices and require destructively slicing the object. Recovering the internal 3D stress distribution of the entire object is challenging as it involves solving a tensor tomography problem and handling phase wrapping ambiguities. We introduce NeST, an…

    Submitted 24 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: Project webpage: https://akshatdave.github.io/nest

  5. arXiv:2405.14868  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis

    Authors: Basile Van Hoorick, Rundi Wu, Ege Ozguroglu, Kyle Sargent, Ruoshi Liu, Pavel Tokmakov, Achal Dave, Changxi Zheng, Carl Vondrick

    Abstract: Accurate reconstruction of complex dynamic scenes from just a single viewpoint continues to be a challenging task in computer vision. Current dynamic novel view synthesis methods typically require videos from many different camera viewpoints, necessitating careful recording setups, and significantly restricting their utility in the wild as well as in terms of embodied AI applications. In this pape…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Project webpage is available at: https://gcd.cs.columbia.edu/

  6. arXiv:2405.06640  [pdf, other]

    cs.CL

    Linearizing Large Language Models

    Authors: Jean Mercat, Igor Vasiljevic, Sedrick Keh, Kushal Arora, Achal Dave, Adrien Gaidon, Thomas Kollar

    Abstract: Linear transformers have emerged as a subquadratic-time alternative to softmax attention and have garnered significant interest due to their fixed-size recurrent state that lowers inference cost. However, their original formulation suffers from poor scaling and underperforms compute-matched transformers. Recent linear models such as RWKV and Mamba have attempted to address these shortcomings by pr…

    Submitted 10 May, 2024; originally announced May 2024.

  7. arXiv:2404.11511  [pdf, other]

    eess.IV cs.CV

    Event Cameras Meet SPADs for High-Speed, Low-Bandwidth Imaging

    Authors: Manasi Muglikar, Siddharth Somasundaram, Akshat Dave, Edoardo Charbon, Ramesh Raskar, Davide Scaramuzza

    Abstract: Traditional cameras face a trade-off between low-light performance and high-speed imaging: longer exposure times to capture sufficient light results in motion blur, whereas shorter exposures result in Poisson-corrupted noisy images. While burst photography techniques help mitigate this tradeoff, conventional cameras are fundamentally limited in their sensor noise characteristics. Event cameras and…

    Submitted 17 April, 2024; originally announced April 2024.

  8. arXiv:2403.13199  [pdf, other]

    cs.CV cs.DC

    DecentNeRFs: Decentralized Neural Radiance Fields from Crowdsourced Images

    Authors: Zaid Tasneem, Akshat Dave, Abhishek Singh, Kushagra Tiwary, Praneeth Vepakomma, Ashok Veeraraghavan, Ramesh Raskar

    Abstract: Neural radiance fields (NeRFs) show potential for transforming images captured worldwide into immersive 3D visual experiences. However, most of this captured visual data remains siloed in our camera rolls as these images contain personal details. Even if made public, the problem of learning 3D representations of billions of scenes captured daily in a centralized manner is computationally intractab…

    Submitted 28 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  9. arXiv:2403.08540  [pdf, other]

    cs.CL cs.LG

    Language models scale reliably with over-training and on downstream tasks

    Authors: Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Jenia Jitsev, Luca Soldaini, Alexandros G. Dimakis, Gabriel Ilharco, Pang Wei Koh, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, Ludwig Schmidt

    Abstract: Scaling laws are useful guides for derisking expensive training runs, as they predict performance of large models using cheaper, small-scale experiments. However, there remain gaps between current scaling studies and how language models are ultimately trained and evaluated. For instance, scaling is usually studied in the compute-optimal training regime (i.e., "Chinchilla optimal" regime). In contr…

    Submitted 14 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  10. arXiv:2403.05742  [pdf, other]

    eess.SY cs.RO

    Safe Merging in Mixed Traffic with Confidence

    Authors: Heeseung Bang, Aditya Dave, Andreas A. Malikopoulos

    Abstract: In this letter, we present an approach for learning human driving behavior, without relying on specific model structures or prior distributions, in a mixed-traffic environment where connected and automated vehicles (CAVs) coexist with human-driven vehicles (HDVs). We employ conformal prediction to obtain theoretical safety guarantees and use real-world traffic data to validate our approach. Then,…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 6 pages, 5 figures

  11. arXiv:2403.05715  [pdf, other]

    eess.SY cs.AI cs.HC cs.LG

    A Framework for Effective AI Recommendations in Cyber-Physical-Human Systems

    Authors: Aditya Dave, Heeseung Bang, Andreas A. Malikopoulos

    Abstract: Many cyber-physical-human systems (CPHS) involve a human decision-maker who may receive recommendations from an artificial intelligence (AI) platform while holding the ultimate responsibility of making decisions. In such CPHS applications, the human decision-maker may depart from an optimal recommended decision and instead implement a different one for various reasons. In this letter, we develop a…

    Submitted 8 March, 2024; originally announced March 2024.

  12. arXiv:2402.06695  [pdf, other]

    cs.AI cs.LG eess.SY

    Integrating LLMs for Explainable Fault Diagnosis in Complex Systems

    Authors: Akshay J. Dave, Tat Nghia Nguyen, Richard B. Vilim

    Abstract: This paper introduces an integrated system designed to enhance the explainability of fault diagnostics in complex systems, such as nuclear power plants, where operator understanding is critical for informed decision-making. By combining a physics-based diagnostic tool with a Large Language Model, we offer a novel solution that not only identifies faults but also provides clear, understandable expl…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 4 pages

  13. arXiv:2401.14398  [pdf, other]

    cs.CV cs.LG

    pix2gestalt: Amodal Segmentation by Synthesizing Wholes

    Authors: Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick

    Abstract: We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions. By capitalizing on large-scale diffusion models and transferring their representations to this task, we learn a conditional diffusion model for reconstructing whole objects in challenging zero-shot cases, incl…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Website: https://gestalt.cs.columbia.edu/

  14. arXiv:2401.13020  [pdf, other]

    eess.SY cs.LG

    A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants

    Authors: Yixuan Sun, Sami Khairy, Richard B. Vilim, Rui Hu, Akshay J. Dave

    Abstract: Traditional control theory-based methods require tailored engineering for each system and constant fine-tuning. In power plant control, one often needs to obtain a precise representation of the system dynamics and carefully design the control scheme accordingly. Model-free Reinforcement learning (RL) has emerged as a promising solution for control tasks due to its ability to learn from trial-and-e…

    Submitted 23 January, 2024; originally announced January 2024.

  15. arXiv:2401.10831  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Understanding Video Transformers via Universal Concept Discovery

    Authors: Matthew Kowal, Achal Dave, Rares Ambrus, Adrien Gaidon, Konstantinos G. Derpanis, Pavel Tokmakov

    Abstract: This paper studies the problem of concept-based interpretability of transformer representations for videos. Concretely, we seek to explain the decision-making process of video transformers based on high-level, spatiotemporal concepts that are automatically discovered. Prior research on concept-based interpretability has concentrated solely on image-level tasks. Comparatively, video models deal wit…

    Submitted 10 April, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: CVPR 2024 (Highlight)

  16. arXiv:2312.16215  [pdf, other]

    cs.CV

    SUNDIAL: 3D Satellite Understanding through Direct, Ambient, and Complex Lighting Decomposition

    Authors: Nikhil Behari, Akshat Dave, Kushagra Tiwary, William Yang, Ramesh Raskar

    Abstract: 3D modeling from satellite imagery is essential in areas of environmental science, urban planning, agriculture, and disaster response. However, traditional 3D modeling techniques face unique challenges in the remote sensing context, including limited multi-view baselines over extensive regions, varying direct, ambient, and complex illumination conditions, and time-varying scene changes across capt…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: 8 pages, 6 figures

  17. arXiv:2312.14184  [pdf]

    cs.CL cs.AI cs.LG

    Large Language Models in Medical Term Classification and Unexpected Misalignment Between Response and Reasoning

    Authors: Xiaodan Zhang, Sandeep Vemulapalli, Nabasmita Talukdar, Sumyeong Ahn, Jiankun Wang, Han Meng, Sardar Mehtab Bin Murtaza, Aakash Ajay Dave, Dmitry Leshchiner, Dimitri F. Joseph, Martin Witteveen-Lane, Dave Chesla, Jiayu Zhou, Bin Chen

    Abstract: This study assesses the ability of state-of-the-art large language models (LLMs) including GPT-3.5, GPT-4, Falcon, and LLaMA 2 to identify patients with mild cognitive impairment (MCI) from discharge summaries and examines instances where the models' responses were misaligned with their reasoning. Utilizing the MIMIC-IV v2.2 database, we focused on a cohort aged 65 and older, verifying MCI diagnos…

    Submitted 19 December, 2023; originally announced December 2023.

  18. arXiv:2312.12433  [pdf, other]

    cs.CV cs.AI cs.LG

    TAO-Amodal: A Benchmark for Tracking Any Object Amodally

    Authors: Cheng-Yen Hsieh, Kaihua Chen, Achal Dave, Tarasha Khurana, Deva Ramanan

    Abstract: Amodal perception, the ability to comprehend complete object structures from partial visibility, is a fundamental skill, even for infants. Its significance extends to applications like autonomous driving, where a clear understanding of heavily occluded objects is essential. However, modern detection and tracking algorithms often overlook this critical capability, perhaps due to the prevalence of \…

    Submitted 2 April, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Project Page: https://tao-amodal.github.io

  19. arXiv:2311.16588  [pdf]

    cs.CL

    Ascle: A Python Natural Language Processing Toolkit for Medical Text Generation

    Authors: Rui Yang, Qingcheng Zeng, Keen You, Yujie Qiao, Lucas Huang, Chia-Chun Hsieh, Benjamin Rosand, Jeremy Goldwasser, Amisha D Dave, Tiarnan D. L. Keenan, Emily Y Chew, Dragomir Radev, Zhiyong Lu, Hua Xu, Qingyu Chen, Irene Li

    Abstract: This study introduces Ascle, a pioneering natural language processing (NLP) toolkit designed for medical text generation. Ascle is tailored for biomedical researchers and healthcare professionals with an easy-to-use, all-in-one solution that requires minimal programming expertise. For the first time, Ascle evaluates and provides interfaces for the latest pre-trained language models, encompassing f…

    Submitted 9 December, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: 5 figures, 4 tables

  20. arXiv:2310.06992  [pdf, other]

    cs.CV

    Zero-Shot Open-Vocabulary Tracking with Large Pre-Trained Models

    Authors: Wen-Hsuan Chu, Adam W. Harley, Pavel Tokmakov, Achal Dave, Leonidas Guibas, Katerina Fragkiadaki

    Abstract: Object tracking is central to robot perception and scene understanding. Tracking-by-detection has long been a dominant paradigm for object tracking of specific object categories. Recently, large-scale pre-trained models have shown promising advances in detecting and segmenting objects and parts in 2D static images in the wild. This begs the question: can we re-purpose these large-scale pre-trained…

    Submitted 25 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Project page available at https://wenhsuanchu.github.io/ovtracktor/

  21. arXiv:2310.03047  [pdf, other]

    physics.chem-ph cs.LG

    Differentiable Modeling and Optimization of Battery Electrolyte Mixtures Using Geometric Deep Learning

    Authors: Shang Zhu, Bharath Ramsundar, Emil Annevelink, Hongyi Lin, Adarsh Dave, Pin-Wen Guan, Kevin Gering, Venkatasubramanian Viswanathan

    Abstract: Electrolytes play a critical role in designing next-generation battery systems, by allowing efficient ion transfer, preventing charge transfer, and stabilizing electrode-electrolyte interfaces. In this work, we develop a differentiable geometric deep learning (GDL) model for chemical mixtures, DiffMix, which is applied in guiding robotic experimentation and optimization towards fast-charging batte…

    Submitted 1 November, 2023; v1 submitted 3 October, 2023; originally announced October 2023.

  22. arXiv:2309.12211  [pdf, other]

    cs.LG eess.SY physics.comp-ph physics.flu-dyn

    Physics-informed State-space Neural Networks for Transport Phenomena

    Authors: Akshay J. Dave, Richard B. Vilim

    Abstract: This work introduces Physics-informed State-space neural network Models (PSMs), a novel solution to achieving real-time optimization, flexibility, and fault tolerance in autonomous systems, particularly in transport-dominated systems such as chemical, biomedical, and power plants. Traditional data-driven methods fall short due to a lack of physical constraints like mass conservation; PSMs address…

    Submitted 18 December, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: 19 pages, 13 figures

  23. A Q-learning Approach for Adherence-Aware Recommendations

    Authors: Ioannis Faros, Aditya Dave, Andreas A. Malikopoulos

    Abstract: In many real-world scenarios involving high-stakes and safety implications, a human decision-maker (HDM) may receive recommendations from an artificial intelligence while holding the ultimate responsibility of making decisions. In this letter, we develop an "adherence-aware Q-learning" algorithm to address this problem. The algorithm learns the "adherence level" that captures the frequency with wh…

    Submitted 12 September, 2023; originally announced September 2023.

    Journal ref: IEEE Control Systems Letters (L-CSS), 2024

  24. arXiv:2304.11489  [pdf, other]

    cs.CR cs.AR

    FVCARE: Formal Verification of Security Primitives in Resilient Embedded SoCs

    Authors: Avani Dave, Nilanjan Banerjee, Chintan Patel

    Abstract: With the increased utilization, the small embedded and IoT devices have become an attractive target for sophisticated attacks that can exploit the devices security critical information and data in malevolent activities. Secure boot and Remote Attestation (RA) techniques verifies the integrity of the devices software state at boot-time and runtime. Correct implementation and formal verification of…

    Submitted 22 April, 2023; originally announced April 2023.

  25. arXiv:2304.07389  [pdf, other]

    cs.CV cs.AI cs.LG

    Shape of You: Precise 3D shape estimations for diverse body types

    Authors: Rohan Sarkar, Achal Dave, Gerard Medioni, Benjamin Biggs

    Abstract: This paper presents Shape of You (SoY), an approach to improve the accuracy of 3D body shape estimation for vision-based clothing recommendation systems. While existing methods have successfully estimated 3D poses, there remains a lack of work in precise shape estimation, particularly for diverse human bodies. To address this gap, we propose two loss functions that can be readily integrated into p…

    Submitted 14 April, 2023; originally announced April 2023.

  26. arXiv:2304.01308  [pdf, other]

    eess.IV cs.CV

    Role of Transients in Two-Bounce Non-Line-of-Sight Imaging

    Authors: Siddharth Somasundaram, Akshat Dave, Connor Henley, Ashok Veeraraghavan, Ramesh Raskar

    Abstract: The goal of non-line-of-sight (NLOS) imaging is to image objects occluded from the camera's field of view using multiply scattered light. Recent works have demonstrated the feasibility of two-bounce (2B) NLOS imaging by scanning a laser and measuring cast shadows of occluded objects in scenes with two relay surfaces. In this work, we study the role of time-of-flight (ToF) measurements, i.e., transie…

    Submitted 3 April, 2023; originally announced April 2023.

  27. arXiv:2304.00397  [pdf, other]

    cs.LG cs.AI eess.SY

    Connected and Automated Vehicles in Mixed-Traffic: Learning Human Driver Behavior for Effective On-Ramp Merging

    Authors: Nishanth Venkatesh, Viet-Anh Le, Aditya Dave, Andreas A. Malikopoulos

    Abstract: Highway merging scenarios featuring mixed traffic conditions pose significant modeling and control challenges for connected and automated vehicles (CAVs) interacting with incoming on-ramp human-driven vehicles (HDVs). In this paper, we present an approach to learn an approximate information state model of CAV-HDV interactions for a CAV to maneuver safely during highway merging. In our approach, th…

    Submitted 1 April, 2023; originally announced April 2023.

  28. arXiv:2303.16321  [pdf, other]

    math.OC cs.AI eess.SY

    Worst-Case Control and Learning Using Partial Observations Over an Infinite Time-Horizon

    Authors: Aditya Dave, Ioannis Faros, Nishanth Venkatesh, Andreas A. Malikopoulos

    Abstract: Safety-critical cyber-physical systems require control strategies whose worst-case performance is robust against adversarial disturbances and modeling uncertainties. In this paper, we present a framework for approximate control and learning in partially observed systems to minimize the worst-case discounted cost over an infinite time horizon. We model disturbances to the system as finite-valued un…

    Submitted 31 March, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

  29. arXiv:2301.05089  [pdf, other]

    eess.SY cs.AI math.OC

    Approximate Information States for Worst-Case Control and Learning in Uncertain Systems

    Authors: Aditya Dave, Nishanth Venkatesh, Andreas A. Malikopoulos

    Abstract: In this paper, we investigate discrete-time decision-making problems in uncertain systems with partially observed states. We consider a non-stochastic model, where uncontrolled disturbances acting on the system take values in bounded sets with unknown distributions. We present a general framework for decision-making in such problems by using the notion of the information state and approximate info…

    Submitted 5 April, 2024; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: Preliminary results related to this article were reported in arXiv:2203.15271

  30. arXiv:2212.12645  [pdf, other]

    cs.CV cs.LG

    HandsOff: Labeled Dataset Generation With No Additional Human Annotations

    Authors: Austin Xu, Mariya I. Vasileva, Achal Dave, Arjun Seshadri

    Abstract: Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of pr…

    Submitted 30 March, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

    Comments: 22 pages, 20 figures. CVPR 2023

  31. arXiv:2212.04531  [pdf, other]

    cs.CV cs.AI

    ORCa: Glossy Objects as Radiance Field Cameras

    Authors: Kushagra Tiwary, Akshat Dave, Nikhil Behari, Tzofi Klinghoffer, Ashok Veeraraghavan, Ramesh Raskar

    Abstract: Reflections on glossy objects contain valuable and hidden information about the surrounding environment. By converting these objects into cameras, we can unlock exciting applications, including imaging beyond the camera's field-of-view and from seemingly impossible vantage points, e.g. from reflections on the human eye. However, this task is challenging because reflections depend jointly on object…

    Submitted 12 December, 2022; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: for more information, see https://ktiwary2.github.io/objectsascam/

  32. arXiv:2211.08691  [pdf, other]

    cs.CV cs.RO

    Towards Long-Tailed 3D Detection

    Authors: Neehar Peri, Achal Dave, Deva Ramanan, Shu Kong

    Abstract: Contemporary autonomous vehicle (AV) benchmarks have advanced techniques for training 3D detectors, particularly on large-scale lidar data. Surprisingly, although semantic class labels naturally follow a long-tailed distribution, contemporary benchmarks focus on only a few common classes (e.g., pedestrian and car) and neglect many rare classes in-the-tail (e.g., debris and stroller). However, AVs…

    Submitted 19 May, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: This work has been accepted to the Conference on Robot Learning (CoRL) 2022

  33. arXiv:2210.01917  [pdf, other]

    cs.CV cs.RO

    Differentiable Raycasting for Self-supervised Occupancy Forecasting

    Authors: Tarasha Khurana, Peiyun Hu, Achal Dave, Jason Ziglar, David Held, Deva Ramanan

    Abstract: Motion planning for safe autonomous driving requires learning how the environment around an ego-vehicle evolves with time. Ego-centric perception of driveable regions in a scene not only changes with the motion of actors in the environment, but also with the movement of the ego-vehicle itself. Self-supervised representations proposed for large-scale planning, such as ego-centric freespace, confoun…

    Submitted 18 October, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: ECCV 2022. Code available at https://github.com/tarashakhurana/emergent-occ-forecasting

  34. arXiv:2209.12118  [pdf, other]

    cs.CV

    BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video

    Authors: Ali Athar, Jonathon Luiten, Paul Voigtlaender, Tarasha Khurana, Achal Dave, Bastian Leibe, Deva Ramanan

    Abstract: Multiple existing benchmarks involve tracking and segmenting objects in video e.g., Video Object Segmentation (VOS) and Multi-Object Tracking and Segmentation (MOTS), but there is little interaction between them due to the use of disparate benchmark datasets and metrics (e.g. J&F, mAP, sMOTSA). As a result, published works usually target a particular benchmark, and are not easily comparable to eac…

    Submitted 22 November, 2022; v1 submitted 24 September, 2022; originally announced September 2022.

  35. Design of a Supervisory Control System for Autonomous Operation of Advanced Reactors

    Authors: Akshay J. Dave, Taeseung Lee, Roberto Ponciroli, Richard B. Vilim

    Abstract: Advanced reactors to be deployed in the coming decades will face deregulated energy markets, and may adopt flexible operation to boost profitability. To aid in the transition from baseload to flexible operation paradigm, autonomous operation is sought. This work focuses on the control aspect of autonomous operation. Specifically, a hierarchical control system is designed to support constraint enfo…

    Submitted 1 November, 2022; v1 submitted 9 September, 2022; originally announced September 2022.

    Comments: 19 pages, 12 figures

  36. Accelerating Material Design with the Generative Toolkit for Scientific Discovery

    Authors: Matteo Manica, Jannis Born, Joris Cadow, Dimitrios Christofidellis, Ashish Dave, Dean Clarke, Yves Gaetan Nana Teukam, Giorgio Giannone, Samuel C. Hoffman, Matthew Buchan, Vijil Chenthamarakshan, Timothy Donovan, Hsiang Han Hsu, Federico Zipoli, Oliver Schilter, Akihiro Kishimoto, Lisa Hamada, Inkit Padhi, Karl Wehden, Lauren McHugh, Alexy Khrabrov, Payel Das, Seiji Takeda, John R. Smith

    Abstract: With the growing availability of data within various scientific domains, generative models hold enormous potential to accelerate scientific discovery. They harness powerful representations learned from datasets to speed up the formulation of novel hypotheses with the potential to impact material discovery broadly. We present the Generative Toolkit for Scientific Discovery (GT4SD). This extensible…

    Submitted 31 January, 2023; v1 submitted 8 July, 2022; originally announced July 2022.

    Comments: 15 pages, 2 figures

    Journal ref: Nature Partner Journals (npj) Computational Materials 9, 69 (2023)

  37. arXiv:2205.01397  [pdf, other]

    cs.CV cs.CL cs.LG

    Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)

    Authors: Alex Fang, Gabriel Ilharco, Mitchell Wortsman, Yuhao Wan, Vaishaal Shankar, Achal Dave, Ludwig Schmidt

    Abstract: Contrastively trained language-image models such as CLIP, ALIGN, and BASIC have demonstrated unprecedented robustness to multiple challenging natural distribution shifts. Since these language-image models differ from previous training approaches in several ways, an important question is what causes the large robustness gains. We answer this question via a systematic experimental investigation. Con…

    Submitted 22 August, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

  38. arXiv:2203.13458  [pdf, other]

    cs.CV cs.GR

    PANDORA: Polarization-Aided Neural Decomposition Of Radiance

    Authors: Akshat Dave, Yongyi Zhao, Ashok Veeraraghavan

    Abstract: Reconstructing an object's geometry and appearance from multiple images, also known as inverse rendering, is a fundamental problem in computer graphics and vision. Inverse rendering is inherently ill-posed because the captured image is an intricate function of unknown lighting conditions, material properties and scene geometry. Recent progress in representing scene properties as coordinate-based n…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: Project webpage: https://akshatdave.github.io/pandora

  39. arXiv:2202.02094  [pdf, other]

    eess.SY cs.LG

    Numerical Demonstration of Multiple Actuator Constraint Enforcement Algorithm for a Molten Salt Loop

    Authors: Akshay J. Dave, Haoyu Wang, Roberto Ponciroli, Richard B. Vilim

    Abstract: To advance the paradigm of autonomous operation for nuclear power plants, a data-driven machine learning approach to control is sought. Autonomous operation for next-generation reactor designs is anticipated to bolster safety and improve economics. However, any algorithms that are utilized need to be interpretable, adaptable, and robust. In this work, we focus on the specific problem of optimal…

    Submitted 25 February, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

    Comments: 4 pages, 6 figures. Submitted to 2022 American Nuclear Society Annual Meeting

  40. arXiv:2112.06280  [pdf, other]

    cs.DC

    In-Memory Indexed Caching for Distributed Data Processing

    Authors: Alexandru Uta, Bogdan Ghit, Ankur Dave, Jan Rellermeyer, Peter Boncz

    Abstract: Powerful abstractions such as dataframes are only as efficient as their underlying runtime system. The de-facto distributed data processing framework, Apache Spark, is poorly suited for the modern cloud-based data-science workloads due to its outdated assumptions: static datasets analyzed using coarse-grained transformations. In this paper, we introduce the Indexed DataFrame, an in-memory cache th…

    Submitted 8 February, 2022; v1 submitted 12 December, 2021; originally announced December 2021.

    Comments: Accepted for publication at IEEE IPDPS 2022

  41. Autonomous optimization of nonaqueous battery electrolytes via robotic experimentation and machine learning

    Authors: Adarsh Dave, Jared Mitchell, Sven Burke, Hongyi Lin, Jay Whitacre, Venkatasubramanian Viswanathan

    Abstract: In this work, we introduce a novel workflow that couples robotics to machine-learning for efficient optimization of a non-aqueous battery electrolyte. A custom-built automated experiment named "Clio" is coupled to Dragonfly - a Bayesian optimization-based experiment planner. Clio autonomously optimizes electrolyte conductivity over a single-salt, ternary solvent design space. Using this workflow,…

    Submitted 22 November, 2021; originally announced November 2021.

    Comments: 26 pages, 5 Figures, 7 Extended Data Figures

  42. arXiv:2108.07973

    eess.IV cs.CV

    Thermal Image Processing via Physics-Inspired Deep Networks

    Authors: Vishwanath Saragadam, Akshat Dave, Ashok Veeraraghavan, Richard Baraniuk

    Abstract: We introduce DeepIR, a new thermal image processing framework that combines physically accurate sensor modeling with deep network-based image representation. Our key enabling observations are that the images captured by thermal sensors can be factored into slowly changing, scene-independent sensor non-uniformities (that can be accurately modeled using physics) and a scene-specific radiance flux (t…

    Submitted 25 August, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

    Comments: Accepted to 2nd ICCV workshop on Learning for Computational Imaging (LCI)

  43. arXiv:2105.14645

    physics.comp-ph cs.LG

    Empirical Models for Multidimensional Regression of Fission Systems

    Authors: Akshay J. Dave, Jiankai Yu, Jarod Wilson, Bren Phillips, Kaichao Sun, Benoit Forget

    Abstract: The development of next-generation autonomous control of fission systems, such as nuclear power plants, will require leveraging advancements in machine learning. For fission systems, accurate prediction of nuclear transport is important to quantify the safety margin and optimize performance. The state-of-the-art approach to this problem is costly Monte Carlo (MC) simulations to approximate solutio…

    Submitted 30 May, 2021; originally announced May 2021.

    Comments: 20 pages, 7 figures

  44. arXiv:2104.11221

    cs.CV

    Opening up Open-World Tracking

    Authors: Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé

    Abstract: Tracking and detecting any object, including ones never-seen-before during model training, is a crucial but elusive capability of autonomous systems. An autonomous agent that is blind to never-seen-before objects poses a safety hazard when operating in the real world - and yet this is how almost all current systems work. One of the main obstacles towards advancing tracking any object is that this…

    Submitted 28 March, 2022; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: CVPR 2022 (Oral). https://openworldtracking.github.io/

  45. arXiv:2104.03702

    cs.SI cs.CY

    Media Cloud: Massive Open Source Collection of Global News on the Open Web

    Authors: Hal Roberts, Rahul Bhargava, Linas Valiukas, Dennis Jen, Momin M. Malik, Cindy Bishop, Emily Ndulue, Aashka Dave, Justin Clark, Bruce Etling, Rob Faris, Anushka Shah, Jasmin Rubinovitz, Alexis Hope, Catherine D'Ignazio, Fernando Bermejo, Yochai Benkler, Ethan Zuckerman

    Abstract: We present the first full description of Media Cloud, an open source platform based on crawling hyperlink structure in operation for over 10 years, that for many uses will be the best way to collect data for studying the media ecosystem on the open web. We document the key choices behind what data Media Cloud collects and stores, how it processes and organizes these data, and its open API access a…

    Submitted 1 May, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: 15 pages, 9 figures, accepted (minus the 3-page, 3-image appendix given here) for publication and forthcoming in Proceedings of the Fifteenth International AAAI Conference on Web and Social Media (ICWSM-2021)

    ACM Class: J.4; H.3.5; J.7; J.5; K.4.1

  46. arXiv:2102.01066

    cs.CV

    Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details

    Authors: Achal Dave, Piotr Dollár, Deva Ramanan, Alexander Kirillov, Ross Girshick

    Abstract: By design, average precision (AP) for object detection aims to treat all classes independently: AP is computed independently per category and averaged. On one hand, this is desirable as it treats all classes equally. On the other hand, it ignores cross-category confidence calibration, a key property in real-world use cases. Unfortunately, under important conditions (i.e., large vocabulary, high in…

    Submitted 15 March, 2022; v1 submitted 1 February, 2021; originally announced February 2021.

  47. arXiv:2101.06362

    cs.CR cs.CY

    SEDAT: Security Enhanced Device Attestation with TPM2.0

    Authors: Avani Dave, Monty Wiseman, David Safford

    Abstract: Remote attestation is one of the ways to verify the state of an untrusted device. Earlier research has attempted remote verification of a device's state using hardware, software, or hybrid approaches. The majority of them have used an Attestation Key as a hardware root of trust, which does not detect hardware modification or counterfeit issues. In addition, they do not have a secure communication channel…

    Submitted 15 January, 2021; originally announced January 2021.

  48. arXiv:2101.06300

    cs.CR

    CARE: Lightweight Attack Resilient Secure Boot Architecture with Onboard Recovery for RISC-V based SOC

    Authors: Avani Dave, Nilanjan Banerjee, Chintan Patel

    Abstract: Recent technological advancements have proliferated the use of small embedded devices for collecting, processing, and transferring security-critical information. The Internet of Things (IoT) has enabled remote access and control of these network-connected devices. Consequently, an attacker can exploit security vulnerabilities and compromise these devices. In this context, the secure boot becom…

    Submitted 15 January, 2021; originally announced January 2021.

  49. arXiv:2101.06148

    cs.CR

    SRACARE: Secure Remote Attestation with Code Authentication and Resilience Engine

    Authors: Avani Dave, Nilanjan Banerjee, Chintan Patel

    Abstract: Recent technological advancements have enabled the widespread use of small embedded and IoT devices for collecting, processing, and transferring security-critical information and user data. This exponential growth in use has acted as a catalyst for sophisticated attacks such as replay, man-in-the-middle, and malicious code modification to slink, leak, tweak or exploit the security-…

    Submitted 15 January, 2021; originally announced January 2021.

  50. arXiv:2012.08419

    cs.CV

    Detecting Invisible People

    Authors: Tarasha Khurana, Achal Dave, Deva Ramanan

    Abstract: Monocular object detection and tracking have improved drastically in recent years, but rely on a key assumption: that objects are visible to the camera. Many offline tracking approaches reason about occluded objects post-hoc, by linking together tracklets after the object re-appears, making use of reidentification (ReID). However, online tracking in embodied robotic agents (such as a self-driving…

    Submitted 15 December, 2020; originally announced December 2020.

    Comments: Project page: http://www.cs.cmu.edu/~tkhurana/invisible.htm