Showing 1–19 of 19 results for author: McGuire, M

  1. arXiv:2406.10325  [pdf, other]

    cs.CL cs.LG eess.AS

    Enhancing Multilingual Voice Toxicity Detection with Speech-Text Alignment

    Authors: Joseph Liu, Mahesh Kumar Nandwana, Janne Pylkkönen, Hannes Heikinheimo, Morgan McGuire

    Abstract: Toxicity classification for voice heavily relies on the semantic content of speech. We propose a novel framework that utilizes cross-modal learning to integrate the semantic embedding of text into a multilabel speech toxicity classifier during training. This enables us to incorporate textual information during training while still requiring only audio during inference. We evaluate this classifier…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024

  2. arXiv:2406.10223  [pdf, other]

    cs.LG cs.SD eess.AS

    Diffusion Synthesizer for Efficient Multilingual Speech to Speech Translation

    Authors: Nameer Hirschkind, Xiao Yu, Mahesh Kumar Nandwana, Joseph Liu, Eloi DuBois, Dao Le, Nicolas Thiebaut, Colin Sinclair, Kyle Spence, Charles Shang, Zoe Abrams, Morgan McGuire

    Abstract: We introduce DiffuseST, a low-latency, direct speech-to-speech translation system capable of preserving the input speaker's voice zero-shot while translating from multiple source languages into English. We experiment with the synthesizer component of the architecture, comparing a Tacotron-based synthesizer to a novel diffusion-based synthesizer. We find the diffusion-based synthesizer to improve M…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Published in Interspeech 2024

  3. arXiv:2404.17718  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Lessons from Deploying CropFollow++: Under-Canopy Agricultural Navigation with Keypoints

    Authors: Arun N. Sivakumar, Mateus V. Gasparino, Michael McGuire, Vitor A. H. Higuti, M. Ugur Akcal, Girish Chowdhary

    Abstract: We present a vision-based navigation system for under-canopy agricultural robots using semantic keypoints. Autonomous under-canopy navigation is challenging due to the tight spacing between the crop rows ($\sim 0.75$ m), degradation in RTK-GPS accuracy due to multipath error, and noise in LiDAR measurements from the excessive clutter. Our system, CropFollow++, introduces modular and interpretable…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted to the IEEE ICRA Workshop on Field Robotics 2024

  4. arXiv:2310.00239  [pdf, other]

    cs.GR cs.AI cs.LG

    AdaptNet: Policy Adaptation for Physics-Based Character Control

    Authors: Pei Xu, Kaixiang Xie, Sheldon Andrews, Paul G. Kry, Michael Neff, Morgan McGuire, Ioannis Karamouzas, Victor Zordan

    Abstract: Motivated by humans' ability to adapt skills in the learning of new ones, this paper presents AdaptNet, an approach for modifying the latent space of existing policies to allow new behaviors to be quickly learned from like tasks in comparison to learning from scratch. Building on top of a given reinforcement learning controller, AdaptNet uses a two-tier hierarchy that augments the original state e…

    Submitted 14 November, 2023; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: SIGGRAPH Asia 2023. Video: https://youtu.be/WxmJSCNFb28. Website: https://motion-lab.github.io/AdaptNet, https://pei-xu.github.io/AdaptNet

    Journal ref: ACM Transactions on Graphics 42, 6, Article 112.1522 (December 2023)

  5. arXiv:2306.01201  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models

    Authors: Liam Dugan, Anshul Wadhawan, Kyle Spence, Chris Callison-Burch, Morgan McGuire, Victor Zordan

    Abstract: Recent work in speech-to-speech translation (S2ST) has focused primarily on offline settings, where the full input utterance is available before any output is given. This, however, is not reasonable in many real-world scenarios. In latency-sensitive applications, rather than waiting for the full utterance, translations should be spoken as soon as the information in the input is present. In this wo…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: To appear at INTERSPEECH 2023

  6. arXiv:2206.07707  [pdf, other]

    cs.CV cs.GR cs.LG cs.MM

    Variable Bitrate Neural Fields

    Authors: Towaki Takikawa, Alex Evans, Jonathan Tremblay, Thomas Müller, Morgan McGuire, Alec Jacobson, Sanja Fidler

    Abstract: Neural approximations of scalar and vector fields, such as signed distance functions and radiance fields, have emerged as accurate, high-quality representations. State-of-the-art results are obtained by conditioning a neural approximation with a lookup from trainable feature grids that take on part of the learning task and allow for smaller, more efficient neural networks. Unfortunately, these fea…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: SIGGRAPH 2022. Project Page: https://nv-tlabs.github.io/vqad/

  7. arXiv:2202.06726  [pdf, other]

    cs.HC cs.GR

    Experimental Augmented Reality User Experience

    Authors: Josef Spjut, Fengyuan Zhu, Xiaolei Huang, Yichen Shou, Ben Boudaoud, Omer Shapira, Morgan McGuire

    Abstract: Augmented Reality (AR) is an emerging field ripe for experimentation, especially when it comes to developing the kinds of applications and experiences that will drive mass adoption of the technology. While we aren't aware of any current consumer product that realizes a wearable, wide Field of View (FoV), AR Head Mounted Display (HMD), such devices will certainly come. In order for these sophisticat…

    Submitted 10 February, 2022; originally announced February 2022.

    Comments: 2 pages, 3 figures, work originally completed in 2019

  8. arXiv:2202.06429  [pdf]

    cs.HC cs.GR

    FirstPersonScience: Quantifying Psychophysics for First Person Shooter Tasks

    Authors: Josef Spjut, Ben Boudaoud, Kamran Binaee, Zander Majercik, Morgan McGuire, Joohwan Kim

    Abstract: In the emerging field of esports research, there is an increasing demand for quantitative results that can be used by players, coaches and analysts to make decisions and present meaningful commentary for spectators. We present FirstPersonScience, a software application intended to fill this need in the esports community by allowing scientists to design carefully controlled experiments and capture…

    Submitted 10 February, 2022; originally announced February 2022.

    Comments: 7 pages, 4 figures, appeared in UCI Esports Conference, October 10, 2019

  9. arXiv:2201.00094  [pdf, other]

    cs.GR

    Wavelet Transparency

    Authors: Maksim Aizenshtein, Niklas Smal, Morgan McGuire

    Abstract: Order-independent transparency schemes rely on low-order approximations of transmittance as a function of depth. We introduce a new wavelet representation of this function and an algorithm for building and evaluating it efficiently on a GPU. We then extend the order-independent Phenomenological Transparency algorithm to our representation and introduce a new phenomenological approximation of chrom…

    Submitted 31 December, 2021; originally announced January 2022.

  10. arXiv:2108.05263  [pdf, other]

    cs.GR

    Dynamic Diffuse Global Illumination Resampling

    Authors: Zander Majercik, Thomas Müller, Alexander Keller, Derek Nowrouzezahrai, Morgan McGuire

    Abstract: Interactive global illumination remains a challenge in radiometrically- and geometrically-complex scenes. Specialized sampling strategies are effective for specular and near-specular transport because the scattering has relatively low directional variance per scattering event. In contrast, the high variance from transport paths comprising multiple rough glossy or diffuse scattering events remains…

    Submitted 11 August, 2021; originally announced August 2021.

  11. arXiv:2107.11505  [pdf, other]

    cs.GR

    Efficient Dataflow Modeling of Peripheral Encoding in the Human Visual System

    Authors: Rachel Brown, Vasha DuTell, Bruce Walter, Ruth Rosenholtz, Peter Shirley, Morgan McGuire, David Luebke

    Abstract: Computer graphics seeks to deliver compelling images, generated within a computing budget, targeted at a specific display device, and ultimately viewed by an individual user. The foveated nature of human vision offers an opportunity to efficiently allocate computation and compression to appropriate areas of the viewer's visual field, especially with the rise of high resolution and wide field-of-vi…

    Submitted 23 July, 2021; originally announced July 2021.

  12. arXiv:2105.10568  [pdf, other]

    cs.RO cs.CV

    High Throughput Soybean Pod-Counting with In-Field Robotic Data Collection and Machine-Vision Based Data Analysis

    Authors: Michael McGuire, Chinmay Soman, Brian Diers, Girish Chowdhary

    Abstract: We report promising results for high-throughput on-field soybean pod count with small mobile robots and machine-vision algorithms. Our results show that the machine-vision based soybean pod counts are strongly correlated with soybean yield. While pod counts have a strong correlation with soybean yield, pod counting is extremely labor intensive, and has been difficult to automate. Our results establ…

    Submitted 27 May, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

  13. arXiv:2103.10031  [pdf, other]

    cs.CV

    Robust Vision-Based Cheat Detection in Competitive Gaming

    Authors: Aditya Jonnalagadda, Iuri Frosio, Seth Schneider, Morgan McGuire, Joohwan Kim

    Abstract: Game publishers and anti-cheat companies have been unsuccessful in blocking cheating in online gaming. We propose a novel, vision-based approach that captures the final state of the frame buffer and detects illicit overlays. To this aim, we train and evaluate a DNN detector on a new dataset, collected using two first-person shooter games and three cheating software. We study the advantages and dis…

    Submitted 27 March, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

    Comments: 17 pages, 4 figures

  14. arXiv:2103.05875  [pdf, other]

    cs.DC cs.GR

    A Distributed, Decoupled System for Losslessly Streaming Dynamic Light Probes to Thin Clients

    Authors: Michael Stengel, Zander Majercik, Benjamin Boudaoud, Morgan McGuire

    Abstract: We present a networked, high performance graphics system that combines dynamic, high quality, ray traced global illumination computed on a server with direct illumination and primary visibility computed on a client. This approach provides many of the image quality benefits of real-time ray tracing on low-power and legacy hardware, while maintaining a low latency response and mobile form factor. Ou…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: 12 pages, 7 figures, 3 tables

  15. arXiv:2101.10994  [pdf, other]

    cs.CV cs.GR

    Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

    Authors: Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler

    Abstract: Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes. State-of-the-art methods typically encode the SDF with a large, fixed-size neural network to approximate complex shapes with implicit surfaces. Rendering with these large networks is, however, computationally expensive since it requires many forward passes through the network for every pixel, making…

    Submitted 26 January, 2021; originally announced January 2021.

  16. arXiv:2011.01437  [pdf, other]

    cs.CV

    Learning Deformable Tetrahedral Meshes for 3D Reconstruction

    Authors: Jun Gao, Wenzheng Chen, Tommy Xiang, Clement Fuji Tsang, Alec Jacobson, Morgan McGuire, Sanja Fidler

    Abstract: 3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics. Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations. We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes…

    Submitted 23 November, 2020; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted to NeurIPS 2020. Webpage: https://nv-tlabs.github.io/DefTet/

  17. arXiv:2009.10796  [pdf, other]

    cs.GR

    Scaling Probe-Based Real-Time Dynamic Global Illumination for Production

    Authors: Zander Majercik, Adam Marrs, Josef Spjut, Morgan McGuire

    Abstract: We contribute several practical extensions to the probe based irradiance-field-with-visibility representation to improve image quality, constant and asymptotic performance, memory efficiency, and artist control. We developed these extensions in the process of incorporating the previous work into the global illumination solutions of the NVIDIA RTXGI SDK, the Unity and Unreal Engine 4 game engines,…

    Submitted 21 June, 2021; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: Supplemental video: https://youtu.be/vbJ2aNI94Ho Journal of Computer Graphics Techniques (published version): http://www.jcgt.org/published/0010/02/01/

  18. arXiv:2004.01353  [pdf, ps, other]

    cs.AR

    Hardware Trojan with Frequency Modulation

    Authors: Ash Luft, Mihai Sima, Michael McGuire

    Abstract: The use of third-party IP cores in implementing applications in FPGAs has given rise to the threat of malicious alterations through the insertion of hardware Trojans. To address this threat, it is important to predict the way hardware Trojans are built and to identify their weaknesses. This paper describes a logic family for implementing robust hardware Trojans, which can evade the two major detec…

    Submitted 2 April, 2020; originally announced April 2020.

  19. arXiv:1904.08500  [pdf, other]

    cs.CV cs.LG eess.IV

    Machine Vision for Natural Gas Methane Emissions Detection Using an Infrared Camera

    Authors: Jingfan Wang, Lyne P. Tchapmi, Arvind P. Ravikumara, Mike McGuire, Clay S. Bell, Daniel Zimmerle, Silvio Savarese, Adam R. Brandt

    Abstract: It is crucial to reduce natural gas methane emissions, which can potentially offset the climate benefits of replacing coal with gas. Optical gas imaging (OGI) is a widely-used method to detect methane leaks, but is labor-intensive and cannot provide leak detection results without operators' judgment. In this paper, we develop a computer vision approach to OGI-based leak detection using convolution…

    Submitted 1 April, 2019; originally announced April 2019.

    Comments: This paper was submitted to Applied Energy