Skip to main content

Showing 1–2 of 2 results for author: Churchill, A

Searching in archive eess. Search in all archives.
.
  1. A Multimodal Approach to Device-Directed Speech Detection with Large Language Models

    Authors: Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

    Abstract: Interactions with virtual assistants typically start with a predefined trigger phrase followed by the user command. To make interactions with the assistant more intuitive, we explore whether it is feasible to drop the requirement that users must begin each command with a trigger phrase. We explore this task in three ways: First, we train classifiers using only acoustic information obtained from th… ▽ More

    Submitted 26 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.03632

  2. arXiv:2312.03632  [pdf, other

    cs.SD cs.LG eess.AS

    Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models

    Authors: Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

    Abstract: Interactions with virtual assistants typically start with a trigger phrase followed by a command. In this work, we explore the possibility of making these interactions more natural by eliminating the need for a trigger phrase. Our goal is to determine whether a user addressed the virtual assistant based on signals obtained from the streaming audio recorded by the device microphone. We address this… ▽ More

    Submitted 6 December, 2023; originally announced December 2023.