Showing 1–3 of 3 results for author: Janin, A

Searching in archive cs.
  1. arXiv:2206.06192  [pdf, ps, other]

    eess.AS cs.CL cs.SD

    Toward Zero Oracle Word Error Rate on the Switchboard Benchmark

    Authors: Arlo Faria, Adam Janin, Korbinian Riedhammer, Sidhi Adkoli

    Abstract: The "Switchboard benchmark" is a very well-known test set in automatic speech recognition (ASR) research, establishing record-setting performance for systems that claim human-level transcription accuracy. This work highlights lesser-known practical considerations of this evaluation, demonstrating major improvements in word error rate (WER) by correcting the reference transcriptions and deviating f…

    Submitted 27 June, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: Submitted to Interspeech 2022

  2. arXiv:1607.04378  [pdf, other]

    cs.SD cs.MM

    DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection

    Authors: Liping Jing, Bo Liu, Jaeyoung Choi, Adam Janin, Julia Bernd, Michael W. Mahoney, Gerald Friedland

    Abstract: This paper presents a novel two-phase method for audio representation, Discriminative and Compact Audio Representation (DCAR), and evaluates its performance at detecting events in consumer-produced videos. In the first phase of DCAR, each audio track is modeled using a Gaussian mixture model (GMM) that includes several components to capture the variability within that track. The second phase takes…

    Submitted 15 July, 2016; originally announced July 2016.

    Comments: An abbreviated version of this paper will be published in ACM Multimedia 2016

    ACM Class: H.5.1

  3. arXiv:1503.04250  [pdf, other]

    cs.MM cs.CL

    The YLI-MED Corpus: Characteristics, Procedures, and Plans

    Authors: Julia Bernd, Damian Borth, Benjamin Elizalde, Gerald Friedland, Heather Gallagher, Luke Gottlieb, Adam Janin, Sara Karabashlieva, Jocelyn Takahashi, Jennifer Won

    Abstract: The YLI Multimedia Event Detection corpus is a public-domain index of videos with annotations and computed features, specialized for research in multimedia event detection (MED), i.e., automatically identifying what's happening in a video by analyzing the audio and visual content. The videos indexed in the YLI-MED corpus are a subset of the larger YLI feature corpus, which is being developed by th…

    Submitted 13 March, 2015; originally announced March 2015.

    Comments: 47 pages; 3 figures; 25 tables. Also published as ICSI Technical Report TR-15-001

    Report number: TR-15-001