Showing 1–50 of 212 results for author: Johnson, J

Searching in archive cs.
  1. arXiv:2406.19512  [pdf, other]

    cs.CL cs.AI cs.HC

    Captioning Visualizations with Large Language Models (CVLLM): A Tutorial

    Authors: Giuseppe Carenini, Jordon Johnson, Ali Salamatian

    Abstract: Automatically captioning visualizations is not new, but recent advances in large language models (LLMs) open exciting new possibilities. In this tutorial, after providing a brief review of Information Visualization (InfoVis) principles and past work in captioning, we introduce neural models and the transformer architecture used in generic LLMs. We then discuss their recent applications in InfoVis,…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 6 pages, 4 figures

  2. arXiv:2406.01792  [pdf, other]

    cs.PL

    The SemGuS Toolkit

    Authors: Keith J. C. Johnson, Andrew Reynolds, Thomas Reps, Loris D'Antoni

    Abstract: Semantics-Guided Synthesis (SemGuS) is a programmable framework for defining synthesis problems in a domain- and solver-agnostic way. This paper presents the standardized SemGuS format, together with an open-source toolkit that provides a parser, a verifier, and enumerative SemGuS solvers. The paper also describes an initial set of SemGuS benchmarks, which form the basis for comparing SemGuS solve…

    Submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2405.02803  [pdf, other]

    cs.LG cs.DC

    Is Flash Attention Stable?

    Authors: Alicia Golden, Samuel Hsia, Fei Sun, Bilge Acun, Basil Hosmer, Yejin Lee, Zachary DeVito, Jeff Johnson, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

    Abstract: Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads. Recently, many organizations training state-of-the-art Generative AI models have reported cases of instability during training, often taking the form of loss spikes. Numeric deviation has emerged as a potential cause of this training instability, although quantify…

    Submitted 4 May, 2024; originally announced May 2024.

  4. arXiv:2404.19760  [pdf, other]

    cs.CV cs.GR

    Lightplane: Highly-Scalable Components for Neural 3D Fields

    Authors: Ang Cao, Justin Johnson, Andrea Vedaldi, David Novotny

    Abstract: Contemporary 3D research, particularly in reconstruction and generation, heavily relies on 2D images for inputs or supervision. However, current designs for these 2D-3D mappings are memory-intensive, posing a significant bottleneck for existing methods and hindering new applications. In response, we propose a pair of highly scalable components for 3D neural fields: Lightplane Render and Splatter, w…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Project Page: https://lightplane.github.io/ Code: https://github.com/facebookresearch/lightplane

  5. arXiv:2404.14436  [pdf, other]

    cs.LG hep-ex nucl-ex physics.ins-det

    Investigating Resource-efficient Neutron/Gamma Classification ML Models Targeting eFPGAs

    Authors: Jyothisraj Johnson, Billy Boxer, Tarun Prakash, Carl Grace, Peter Sorensen, Mani Tripathi

    Abstract: There has been considerable interest and resulting progress in implementing machine learning (ML) models in hardware over the last several years from the particle and nuclear physics communities. A big driver has been the release of the Python package, hls4ml, which has enabled porting models specified and trained using Python ML libraries to register transfer level (RTL) code. So far, the primary…

    Submitted 19 April, 2024; originally announced April 2024.

  6. arXiv:2404.09357  [pdf, other]

    cs.SE cs.ET

    Service Weaver: A Promising Direction for Cloud-native Systems?

    Authors: Jacoby Johnson, Subash Kharel, Alan Mannamplackal, Amr S. Abdelfattah, Tomas Cerny

    Abstract: Cloud-native and microservice architectures have taken over the development world by storm. While being incredibly scalable and resilient, microservice architectures also come at the cost of increased overhead to build and maintain. Google's Service Weaver aims to simplify the complexities associated with implementing cloud-native systems by introducing the concept of a single modular binary compo…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: This paper is accepted for publication at the International Conference on Cloud Computing and Services Science (CLOSER) 2024

  7. arXiv:2404.08636  [pdf, other]

    cs.CV

    Probing the 3D Awareness of Visual Foundation Models

    Authors: Mohamed El Banani, Amit Raj, Kevis-Kokitsi Maninis, Abhishek Kar, Yuanzhen Li, Michael Rubinstein, Deqing Sun, Leonidas Guibas, Justin Johnson, Varun Jampani

    Abstract: Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their intermediate representations are useful for other visual tasks such as detection and segmentation. Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also repr…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024. Project page: https://github.com/mbanani/probe3d

  8. arXiv:2404.07984  [pdf, other]

    cs.CV

    View Selection for 3D Captioning via Diffusion Ranking

    Authors: Tiange Luo, Justin Johnson, Honglak Lee

    Abstract: Scalable annotation approaches are crucial for constructing extensive 3D-text datasets, facilitating a broader range of applications. However, existing methods sometimes lead to the generation of hallucinated captions, compromising caption quality. This paper explores the issue of hallucination in 3D object captioning, with a focus on the Cap3D method, which renders 3D objects into 2D views for captio…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Dataset link: https://huggingface.co/datasets/tiange/Cap3D

  9. arXiv:2404.03566  [pdf, other]

    cs.CV

    PointInfinity: Resolution-Invariant Point Diffusion Models

    Authors: Zixuan Huang, Justin Johnson, Shoubhik Debnath, James M. Rehg, Chao-Yuan Wu

    Abstract: We present PointInfinity, an efficient family of point cloud diffusion models. Our core idea is to use a transformer-based architecture with a fixed-size, resolution-invariant latent representation. This enables efficient training with low-resolution point clouds, while allowing high-resolution point clouds to be generated during inference. More importantly, we show that scaling the test-time reso…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024, project website at https://zixuanh.com/projects/pointinfinity

  10. arXiv:2403.18819  [pdf, other]

    cs.CV

    Benchmarking Object Detectors with COCO: A New Path Forward

    Authors: Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai

    Abstract: The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of high-performing models, we ask whether these errors of COCO are hindering its utility in reliably benchmarking further progress. In search for an answer,…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Technical report. Dataset website: https://cocorem.xyz and code: https://github.com/kdexd/coco-rem

  11. arXiv:2403.18147  [pdf, other]

    cs.LG

    Divide, Conquer, Combine Bayesian Decision Tree Sampling

    Authors: Jodie A. Cochrane, Adrian Wills, Sarah J. Johnson

    Abstract: Decision trees are commonly used predictive models due to their flexibility and interpretability. This paper is directed at quantifying the uncertainty of decision tree predictions by employing a Bayesian inference approach. This is challenging because these approaches need to explore both the tree structure space and the space of decision parameters associated with each tree structure. This has b…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 38 pages, 5 figures

  12. arXiv:2403.07822  [pdf, other]

    stat.AP cs.LG

    Fusing Climate Data Products using a Spatially Varying Autoencoder

    Authors: Jacob A. Johnson, Matthew J. Heaton, William F. Christensen, Lynsie R. Warr, Summer B. Rupper

    Abstract: Autoencoders are powerful machine learning models used to compress information from multiple data sources. However, autoencoders, like all artificial neural networks, are often unidentifiable and uninterpretable. This research focuses on creating an identifiable and interpretable autoencoder that can be used to meld and combine climate data products. The proposed autoencoder utilizes a Bayesian st…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 13 pages, 7 figures

  13. arXiv:2403.03221  [pdf, other]

    cs.CV

    FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation

    Authors: Chris Rockwell, Nilesh Kulkarni, Linyi Jin, Jeong Joon Park, Justin Johnson, David F. Fouhey

    Abstract: Estimating relative camera poses between images has been a central problem in computer vision. Methods that find correspondences and solve for the fundamental matrix offer high precision in most cases. Conversely, methods predicting pose directly using neural networks are more robust to limited overlap and can infer absolute translation scale, but at the expense of reduced precision. We show how t…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024. Project Page: https://crockwell.github.io/far/

  14. arXiv:2402.03239  [pdf, other]

    cs.DC cs.SI

    Bluesky and the AT Protocol: Usable Decentralized Social Media

    Authors: Martin Kleppmann, Paul Frazee, Jake Gold, Jay Graber, Daniel Holmgren, Devin Ivy, Jeromy Johnson, Bryan Newbold, Jaz Volpert

    Abstract: Bluesky is a new social network built upon the AT Protocol, a decentralized foundation for public social media. It was launched in private beta in February 2023, and has grown to over 3 million registered users in the following year. In this paper we introduce the architecture of Bluesky and the AT Protocol, which is inspired by the web itself, but modernized to include streams of real-time update…

    Submitted 5 February, 2024; originally announced February 2024.

  15. arXiv:2402.01855  [pdf, other]

    stat.ML cs.LG eess.IV

    SPDE priors for uncertainty quantification of end-to-end neural data assimilation schemes

    Authors: Maxime Beauchamp, Nicolas Desassis, J. Emmanuel Johnson, Simon Benaichouche, Pierre Tandeo, Ronan Fablet

    Abstract: The spatio-temporal interpolation of large geophysical datasets has historically been addressed by Optimal Interpolation (OI) and more sophisticated model-based or data-driven DA techniques. In the last ten years, the link established between Stochastic Partial Differential Equations (SPDE) and Gaussian Markov Random Fields (GMRF) opened a new way of handling both large datasets and physically-indu…

    Submitted 2 February, 2024; originally announced February 2024.

  16. arXiv:2402.00399  [pdf, other]

    cs.RO

    Continuous-time Trajectory Estimation: A Comparative Study Between Gaussian Process and Spline-based Approaches

    Authors: Jacob Johnson, Joshua Mangelson, Timothy Barfoot, Randal Beard

    Abstract: Continuous-time trajectory estimation is an attractive alternative to discrete-time batch estimation due to the ability to incorporate high-frequency measurements from asynchronous sensors while keeping the number of optimization parameters bounded. Two types of continuous-time estimation have become prevalent in the literature: Gaussian process regression and spline-based estimation. In this pape…

    Submitted 1 February, 2024; originally announced February 2024.

  17. arXiv:2401.08281  [pdf, other]

    cs.LG cs.CV cs.SE

    The Faiss library

    Authors: Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, Hervé Jégou

    Abstract: Vector databases manage large collections of embedding vectors. As AI applications are growing rapidly, so is the number of embeddings that need to be stored and indexed. The Faiss library is dedicated to vector similarity search, a core functionality of vector databases. Faiss is a toolkit of indexing methods and related primitives used to search, cluster, compress and transform vectors. This pa…

    Submitted 16 January, 2024; originally announced January 2024.

  18. arXiv:2312.14385  [pdf, other]

    cs.DC cs.LG cs.MM

    Generative AI Beyond LLMs: System Implications of Multi-Modal Generation

    Authors: Alicia Golden, Samuel Hsia, Fei Sun, Bilge Acun, Basil Hosmer, Yejin Lee, Zachary DeVito, Jeff Johnson, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

    Abstract: As the development of large-scale Generative AI models evolves beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal information presents unique challenges to quality, performance, and efficiency. We present the first work towards understanding this new system design space for multi-modal text-to-image (TTI) and text-to-video (TTV) generation m…

    Submitted 5 May, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Published at 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

  19. arXiv:2312.01577  [pdf, other]

    cs.LG stat.CO stat.ML

    RJHMC-Tree for Exploration of the Bayesian Decision Tree Posterior

    Authors: Jodie A. Cochrane, Adrian G. Wills, Sarah J. Johnson

    Abstract: Decision trees have found widespread application within the machine learning community due to their flexibility and interpretability. This paper is directed towards learning decision trees from data using a Bayesian approach, which is challenging due to the potentially enormous parameter space required to span all tree models. Several approaches have been proposed to combat this challenge, with on…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: 43 pages, 7 figures

  20. Learning Realistic Joint Space Boundaries for Range of Motion Analysis of Healthy and Impaired Human Arms

    Authors: Shafagh Keyvanian, Michelle J. Johnson, Nadia Figueroa

    Abstract: A realistic human kinematic model that satisfies anatomical constraints is essential for human-robot interaction, biomechanics and robot-assisted rehabilitation. Modeling realistic joint constraints, however, is challenging as human arm motion is constrained by joint limits, inter- and intra-joint dependencies, self-collisions, individual capabilities and muscular or neurological constraints which…

    Submitted 17 November, 2023; originally announced November 2023.

  21. arXiv:2310.20183  [pdf]

    cs.CY

    Thriving in a Pandemic: Lessons Learned from a Resilient University Program Seen Through the CoI Lens

    Authors: Zihui Ma, Lingyao Li, John C. E. Johnson

    Abstract: In March 2020, college campuses underwent a sudden transformation to online learning due to the COVID-19 outbreak. To understand the impact of COVID-19 on students' expectations, this study conducted a three-year survey from ten core courses within the Project Management Center for Excellence at the University of Maryland. The study involved two main steps: 1) a statistical analysis to evaluate st…

    Submitted 31 October, 2023; originally announced October 2023.

  22. arXiv:2310.13781  [pdf, other]

    cs.CL

    How Much Consistency Is Your Accuracy Worth?

    Authors: Jacob K. Johnson, Ana Marasović

    Abstract: Contrast set consistency is a robustness measurement that evaluates the rate at which a model correctly responds to all instances in a bundle of minimally different examples relying on the same knowledge. To draw additional insights, we propose to complement consistency with relative consistency -- the probability that an equally accurate model would surpass the consistency of the proposed model,…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: BlackboxNLP 2023 accepted paper camera-ready version; 6 pages main, 3 pages appendix

  23. arXiv:2309.16778  [pdf, other]

    cs.RO

    Coupled Active Perception and Manipulation Planning for a Mobile Manipulator in Precision Agriculture Applications

    Authors: Shuangyu Xie, Chengsong Hu, Di Wang, Joe Johnson, Muthukumar Bagavathiannan, Dezhen Song

    Abstract: A mobile manipulator often finds itself in an application where it needs to take a close-up view before performing a manipulation task. Naming this the coupled active perception and manipulation (CAPM) problem, we model the uncertainty in the perception process and devise a key state/task planning approach that considers reachability conditions as task constraints of both perception and manipulati…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: submitted to ICRA 2024

  24. arXiv:2309.15599  [pdf, other]

    cs.LG physics.ao-ph

    OceanBench: The Sea Surface Height Edition

    Authors: J. Emmanuel Johnson, Quentin Febvre, Anastasia Gorbunova, Sammy Metref, Maxime Ballarotta, Julien Le Sommer, Ronan Fablet

    Abstract: The ocean profoundly influences human activities and plays a critical role in climate regulation. Our understanding has improved over the last decades with the advent of satellite remote sensing data, allowing us to capture essential quantities over the globe, e.g., sea surface height (SSH). However, ocean satellite data presents challenges for information extraction due to their sparsity and irre…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: J. Emmanuel Johnson and Quentin Febvre contributed equally to this work

  25. arXiv:2309.15272  [pdf, other]

    cs.RO

    Zero-Shot Constrained Motion Planning Transformers Using Learned Sampling Dictionaries

    Authors: Jacob J. Johnson, Ahmed H. Qureshi, Michael C. Yip

    Abstract: Constrained robot motion planning is a ubiquitous need for robots interacting with everyday environments, but it is a notoriously difficult problem to solve. Many sampled points in a sample-based planner need to be rejected as they fall outside the constraint manifold, or require significant iterative effort to correct. Given this, few solutions exist that present a constraint-satisfying trajector…

    Submitted 26 September, 2023; originally announced September 2023.

  26. arXiv:2308.06956  [pdf, ps, other]

    cs.PL

    Modular System Synthesis

    Authors: Kanghee Park, Keith J. C. Johnson, Loris D'Antoni, Thomas Reps

    Abstract: This paper describes a way to improve the scalability of program synthesis by exploiting modularity: larger programs are synthesized from smaller programs. The key issue is to make each "larger-created-from-smaller" synthesis sub-problem be of a similar nature, so that the kind of synthesis sub-problem that needs to be solved--and the size of each search space--has roughly the same character at ea…

    Submitted 14 August, 2023; originally announced August 2023.

  27. arXiv:2307.10063  [pdf, other]

    cs.RO

    Object-centric Representations for Interactive Online Learning with Non-Parametric Methods

    Authors: Nikhil U. Shinde, Jacob Johnson, Sylvia Herbert, Michael C. Yip

    Abstract: Large offline learning-based models have enabled robots to successfully interact with objects for a wide variety of tasks. However, these models rely on fairly consistent structured environments. For more unstructured environments, an online learning component is necessary to gather and estimate information about objects in the environment in order to successfully interact with them. Unfortunately…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Paper Accepted to CASE 2023

  28. arXiv:2307.07511  [pdf, other]

    cs.CV

    NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis

    Authors: Nilesh Kulkarni, Davis Rempe, Kyle Genova, Abhijit Kundu, Justin Johnson, David Fouhey, Leonidas Guibas

    Abstract: We address the problem of generating realistic 3D motions of humans interacting with objects in a scene. Our key idea is to create a neural interaction field attached to a specific object, which outputs the distance to the valid interaction manifold given a human pose as input. This interaction field guides the sampling of an object-conditioned human motion diffusion model, so as to encourage plau…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Project Page with additional results available https://nileshkulkarni.github.io/nifty

  29. arXiv:2306.08671  [pdf, other]

    cs.CV

    Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data

    Authors: Nilesh Kulkarni, Linyi Jin, Justin Johnson, David F. Fouhey

    Abstract: We introduce a method that can learn to predict scene-level implicit functions for 3D reconstruction from posed RGBD data. At test time, our system maps a previously unseen RGB image to a 3D reconstruction of a scene via implicit functions. While implicit functions for 3D reconstruction have often been tied to meshes, we show that we can train one using only a set of posed RGBD images. This settin…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: Project page: https://nileshkulkarni.github.io/d2drdf/

  30. arXiv:2306.07279  [pdf, other]

    cs.CV

    Scalable 3D Captioning with Pretrained Models

    Authors: Tiange Luo, Chris Rockwell, Honglak Lee, Justin Johnson

    Abstract: We introduce Cap3D, an automatic approach for generating descriptive text for 3D objects. This approach utilizes pretrained models from image captioning, image-text alignment, and LLM to consolidate captions from multiple views of a 3D asset, completely side-stepping the time-consuming and costly process of manual annotation. We apply Cap3D to the recently introduced large-scale 3D dataset, Objave…

    Submitted 15 June, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

    Comments: Dataset link: https://huggingface.co/datasets/tiange/Cap3D

  31. arXiv:2306.00851  [pdf, other]

    cs.RO cs.AI

    Learning Sampling Dictionaries for Efficient and Generalizable Robot Motion Planning with Transformers

    Authors: Jacob J Johnson, Ahmed H Qureshi, Michael Yip

    Abstract: Motion planning is integral to robotics applications such as autonomous driving, surgical robots, and industrial manipulators. Existing planning methods lack scalability to higher-dimensional spaces, while recent learning-based planners have shown promise in accelerating sampling-based motion planners (SMP) but lack generalizability to out-of-distribution environments. To address this, we present…

    Submitted 26 September, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

  32. arXiv:2304.09172  [pdf, other]

    cs.CV cs.LG

    Hyperbolic Image-Text Representations

    Authors: Karan Desai, Maximilian Nickel, Tanmay Rajpurohit, Justin Johnson, Ramakrishna Vedantam

    Abstract: Visual and linguistic concepts naturally organize themselves in a hierarchy, where a textual concept "dog" entails all images that contain dogs. Despite being intuitive, current large-scale vision and language models such as CLIP do not explicitly capture such hierarchy. We propose MERU, a contrastive model that yields hyperbolic representations of images and text. Hyperbolic spaces have suitable…

    Submitted 18 January, 2024; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: ICML 2023 (v3: Add link to code in abstract)

  33. arXiv:2304.05237  [pdf]

    cs.CR cs.AR cs.DC cs.PF

    TREBUCHET: Fully Homomorphic Encryption Accelerator for Deep Computation

    Authors: David Bruce Cousins, Yuriy Polyakov, Ahmad Al Badawi, Matthew French, Andrew Schmidt, Ajey Jacob, Benedict Reynwar, Kellie Canida, Akhilesh Jaiswal, Clynn Mathew, Homer Gamil, Negar Neda, Deepraj Soni, Michail Maniatakos, Brandon Reagen, Naifeng Zhang, Franz Franchetti, Patrick Brinich, Jeremy Johnson, Patrick Broderick, Mike Franusich, Bo Zhang, Zeming Cheng, Massoud Pedram

    Abstract: Secure computation is of critical importance to not only the DoD, but across financial institutions, healthcare, and anywhere personally identifiable information (PII) is accessed. Traditional security techniques require data to be decrypted before performing any computation. When processed on untrusted systems the decrypted data is vulnerable to attacks to extract the sensitive information. To ad…

    Submitted 18 April, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: 6 pages, 5 figures and 2 tables

  34. arXiv:2303.17667  [pdf, other]

    cs.CY cs.SI q-fin.CP

    Taureau: A Stock Market Movement Inference Framework Based on Twitter Sentiment Analysis

    Authors: Nicholas Milikich, Joshua Johnson

    Abstract: With the advent of fast-paced information dissemination and retrieval, it has become inherently important to resort to automated means of predicting stock market prices. In this paper, we propose Taureau, a framework that leverages Twitter sentiment analysis for predicting stock market movement. The aim of our research is to determine whether Twitter, which is assumed to be representative of the g…

    Submitted 30 March, 2023; originally announced March 2023.

  35. arXiv:2303.11989  [pdf, other]

    cs.CV

    Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models

    Authors: Lukas Höllein, Ang Cao, Andrew Owens, Justin Johnson, Matthias Nießner

    Abstract: We present Text2Room, a method for generating room-scale textured 3D meshes from a given text prompt as input. To this end, we leverage pre-trained 2D text-to-image models to synthesize a sequence of images from different poses. In order to lift these outputs into a consistent 3D scene representation, we combine monocular depth estimation with a text-conditioned inpainting model. The core idea of…

    Submitted 10 September, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: Accepted to ICCV 2023 (Oral) video: https://youtu.be/fjRnFL91EZc project page: https://lukashoel.github.io/text-to-room/ code: https://github.com/lukasHoel/text2room

  36. arXiv:2302.12248  [pdf, other]

    cs.CV

    Learning Visual Representations via Language-Guided Sampling

    Authors: Mohamed El Banani, Karan Desai, Justin Johnson

    Abstract: Although an object may appear in numerous contexts, we often describe it in a limited number of ways. Language allows us to abstract away visual variation to represent and communicate concepts. Building on this intuition, we propose an alternative approach to visual representation learning: using language similarity to sample semantically similar image pairs for contrastive learning. Our approach…

    Submitted 29 March, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: Accepted to CVPR 2023. v2 is camera-ready version with additional ImageNet evaluations. Project page: https://github.com/mbanani/lgssl

  37. arXiv:2301.11280  [pdf, other]

    cs.CV cs.AI cs.LG

    Text-To-4D Dynamic Scene Generation

    Authors: Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman

    Abstract: We present MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions. Our approach uses a 4D dynamic Neural Radiance Field (NeRF), which is optimized for scene appearance, density, and motion consistency by querying a Text-to-Video (T2V) diffusion-based model. The dynamic video output generated from the provided text can be viewed from any camera locat…

    Submitted 26 January, 2023; originally announced January 2023.

  38. arXiv:2301.10904  [pdf, other]

    cs.CR cs.DC cs.LG

    GPU-based Private Information Retrieval for On-Device Machine Learning Inference

    Authors: Maximilian Lam, Jeff Johnson, Wenjie Xiong, Kiwan Maeng, Udit Gupta, Yang Li, Liangzhen Lai, Ilias Leontiadis, Minsoo Rhu, Hsien-Hsin S. Lee, Vijay Janapa Reddi, Gu-Yeon Wei, David Brooks, G. Edward Suh

    Abstract: On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables each on the or…

    Submitted 25 September, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

  39. arXiv:2301.09632  [pdf, other]

    cs.CV

    HexPlane: A Fast Representation for Dynamic Scenes

    Authors: Ang Cao, Justin Johnson

    Abstract: Modeling and re-rendering dynamic 3D scenes is a challenging task in 3D vision. Prior approaches build on NeRF and rely on implicit representations. This is slow since it requires many MLP evaluations, constraining real-world applications. We show that dynamic 3D scenes can be explicitly represented by six planes of learned features, leading to an elegant solution we call HexPlane. A HexPlane comp…

    Submitted 27 March, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: CVPR 2023, Camera Ready. Project page: https://caoang327.github.io/HexPlane

  40. arXiv:2301.08247  [pdf, other]

    cs.CV

    Multiview Compressive Coding for 3D Reconstruction

    Authors: Chao-Yuan Wu, Justin Johnson, Jitendra Malik, Christoph Feichtenhofer, Georgia Gkioxari

    Abstract: A central goal of visual recognition is to understand objects and scenes from a single image. 2D recognition has witnessed tremendous progress thanks to large-scale learning and general-purpose representations. Comparatively, 3D poses new challenges stemming from occlusions not depicted in the image. Prior works try to overcome these by inferring from multiple views or rely on scarce CAD models an…

    Submitted 19 January, 2023; originally announced January 2023.

    Comments: Project page: https://mcc3d.github.io/

  41. An Empirical Investigation into the Reproduction of Bug Reports for Android Apps

    Authors: Jack Johnson, Junayed Mahmud, Tyler Wendland, Kevin Moran, Julia Rubin, Mattia Fazzini

    Abstract: One of the key tasks related to ensuring mobile app quality is the reporting, management, and resolution of bug reports. As such, researchers have committed considerable resources toward automating various tasks of the bug management process for mobile apps, such as reproduction and triaging. However, the success of these automated approaches is largely dictated by the characteristics and properti…

    Submitted 3 January, 2023; originally announced January 2023.

    Comments: Published in the Proceedings of the 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER'22), Honolulu, Hawaii, March 15-18, 2022, pp. 321-332

  42. arXiv:2212.12952  [pdf, other]

    cs.CV cs.AI

    Neural Shape Compiler: A Unified Framework for Transforming between Text, Point Cloud, and Program

    Authors: Tiange Luo, Honglak Lee, Justin Johnson

    Abstract: 3D shapes have complementary abstractions from low-level geometry to part-based hierarchies to languages, which convey different levels of information. This paper presents a unified framework to translate between pairs of shape abstractions: $\textit{Text}$ $\Longleftrightarrow$ $\textit{Point Cloud}$ $\Longleftrightarrow$ $\textit{Program}$. We propose $\textbf{Neural Shape Compiler}$ to model th…

    Submitted 6 April, 2023; v1 submitted 25 December, 2022; originally announced December 2022.

    Comments: TMLR; project page: https://tiangeluo.github.io/projectpages/shapecompiler.html

  43. arXiv:2212.03236  [pdf, other]

    cs.CV

    Self-Supervised Correspondence Estimation via Multiview Registration

    Authors: Mohamed El Banani, Ignacio Rocco, David Novotny, Andrea Vedaldi, Natalia Neverova, Justin Johnson, Benjamin Graham

    Abstract: Video provides us with the spatio-temporal consistency needed for visual learning. Recent approaches have utilized this signal to learn correspondence estimation from close-by frame pairs. However, by only relying on close-by frame pairs, those approaches miss out on the richer long-range consistency between distant overlapping frames. To address this, we propose a self-supervised approach for cor…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: Accepted to WACV 2023. Project page: https://mbanani.github.io/syncmatch/

  44. arXiv:2211.16421  [pdf, other]

    cs.CV eess.IV

    RGB no more: Minimally-decoded JPEG Vision Transformers

    Authors: Jeongsoo Park, Justin Johnson

    Abstract: Most neural networks for computer vision are designed to infer using RGB images. However, these RGB images are commonly encoded in JPEG before saving to disk; decoding them imposes an unavoidable overhead for RGB networks. Instead, our work focuses on training Vision Transformers (ViT) directly from the encoded features of JPEG. This way, we can avoid most of the decoding overhead, accelerating da…

    Submitted 13 June, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: 17 pages, 6 figures, 6 tables

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 22334-22346

  45. arXiv:2211.10444  [pdf, other]

    physics.ao-ph cs.AI cs.LG

    Neural Fields for Fast and Scalable Interpolation of Geophysical Ocean Variables

    Authors: J. Emmanuel Johnson, Redouane Lguensat, Ronan Fablet, Emmanuel Cosme, Julien Le Sommer

    Abstract: Optimal Interpolation (OI) is a widely used, highly trusted algorithm for interpolation and reconstruction problems in geosciences. With the influx of more satellite missions, we have access to more and more observations and it is becoming more pertinent to take advantage of these observations in applications such as forecasting and reanalysis. With the increase in the volume of available data, sc…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Machine Learning and the Physical Sciences workshop, NeurIPS 2022

  46. arXiv:2209.06392  [pdf, other]

    eess.SP cs.LG cs.NI

    Joint User and Data Detection in Grant-Free NOMA with Attention-based BiLSTM Network

    Authors: Saud Khan, Salman Durrani, Muhammad Basit Shahab, Sarah J. Johnson, Seyit Camtepe

    Abstract: We consider the multi-user detection (MUD) problem in uplink grant-free non-orthogonal multiple access (NOMA), where the access point has to identify the total number and correct identity of the active Internet of Things (IoT) devices and decode their transmitted data. We assume that IoT devices use complex spreading sequences and transmit information in a random-access manner following the burst-…

    Submitted 12 July, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

    Journal ref: IEEE Open Journal of the Communications Society, vol. 4, pp. 1499-1515, 2023

  47. arXiv:2208.08988  [pdf, other]

    cs.CV

    The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs

    Authors: Chris Rockwell, Justin Johnson, David F. Fouhey

    Abstract: We present a simple baseline for directly estimating the relative pose (rotation and translation, including scale) between two images. Deep methods have recently shown strong progress but often require complex or multi-stage architectures. We show that a handful of modifications can be applied to a Vision Transformer (ViT) to bring its computations close to the Eight-Point Algorithm. This inductiv…

    Submitted 23 January, 2023; v1 submitted 18 August, 2022; originally announced August 2022.

    Comments: Accepted to 3DV 2022; Project Page: https://crockwell.github.io/rel_pose/ Revision: Fixed Epipolar Lines in Figure 3, Figure 10

  48. arXiv:2207.10660  [pdf, other]

    cs.CV

    Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild

    Authors: Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari

    Abstract: Recognizing scenes and objects in 3D from a single image is a longstanding goal of computer vision with applications in robotics and AR/VR. For 2D recognition, large datasets and scalable solutions have led to unprecedented advances. In 3D, existing benchmarks are small in size and approaches specialize in few object categories and specific domains, e.g. urban driving scenes. Motivated by the succ…

    Submitted 23 March, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: CVPR 2023, Project website: https://omni3d.garrickbrazil.com/

  49. arXiv:2206.08355  [pdf, other]

    cs.CV

    FWD: Real-time Novel View Synthesis with Forward Warping and Depth

    Authors: Ang Cao, Chris Rockwell, Justin Johnson

    Abstract: Novel view synthesis (NVS) is a challenging task requiring systems to generate photorealistic images of scenes from new viewpoints, where both quality and speed are important for applications. Previous image-based rendering (IBR) methods are fast, but have poor quality when input views are sparse. Recent Neural Radiance Fields (NeRF) and generalizable variants give impressive results but are not r…

    Submitted 5 August, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: CVPR 2022. Project website https://caoang327.github.io/FWD/

  50. arXiv:2206.07028  [pdf, other]

    cs.CV

    Learning 3D Object Shape and Layout without 3D Supervision

    Authors: Georgia Gkioxari, Nikhila Ravi, Justin Johnson

    Abstract: A 3D scene consists of a set of objects, each with a shape and a layout giving their position in space. Understanding 3D scenes from 2D images is an important goal, with applications in robotics and graphics. While there have been recent advances in predicting 3D shape and layout from a single image, most approaches rely on 3D ground truth for training which is expensive to collect at scale. We ov…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: CVPR 2022, project page: https://gkioxari.github.io/usl/