Showing 1–11 of 11 results for author: Wan, L

  1. arXiv:2402.02701  [pdf, other]

    cs.LG cs.AI stat.ML

    Understanding What Affects Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence

    Authors: Jiafei Lyu, Le Wan, Xiu Li, Zongqing Lu

    Abstract: Recently, there are many efforts attempting to learn useful policies for continuous control in visual reinforcement learning (RL). In this scenario, it is important to learn a generalizable policy, as the testing environment may differ from the training environment, e.g., there exist distractors during deployment. Many practical algorithms are proposed to handle this problem. However, to the best…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Part of this work is accepted as AAMAS 2024 extended abstract

  2. arXiv:2310.04367  [pdf]

    stat.ML cs.LG

    A Marketplace Price Anomaly Detection System at Scale

    Authors: Akshit Sarpal, Qiwen Kang, Fangping Huang, Yang Song, Lijie Wan

    Abstract: Online marketplaces execute large volume of price updates that are initiated by individual marketplace sellers each day on the platform. This price democratization comes with increasing challenges with data quality. Lack of centralized guardrails that are available for a traditional online retailer causes a higher likelihood for inaccurate prices to get published on the website, leading to poor cu…

    Submitted 9 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: 10 pages, 4 figures, 7 tables

  3. arXiv:2306.08956  [pdf, other]

    cs.SD eess.AS stat.ML

    Multi-Loss Convolutional Network with Time-Frequency Attention for Speech Enhancement

    Authors: Liang Wan, Hongqing Liu, Yi Zhou, Jie Ji

    Abstract: The Dual-Path Convolution Recurrent Network (DPCRN) was proposed to effectively exploit time-frequency domain information. By combining the DPRNN module with Convolution Recurrent Network (CRN), the DPCRN obtained a promising performance in speech separation with a limited model size. In this paper, we explore self-attention in the DPCRN module and design a model called Multi-Loss Convolutional Ne…

    Submitted 15 June, 2023; originally announced June 2023.

  4. arXiv:1910.09687  [pdf, other]

    cs.LG eess.AS stat.ML

    Signal Combination for Language Identification

    Authors: Shengye Wang, Li Wan, Yang Yu, Ignacio Lopez Moreno

    Abstract: Google's multilingual speech recognition system combines low-level acoustic signals with language-specific recognizer signals to better predict the language of an utterance. This paper presents our experience with different signal combination methods to improve overall language identification accuracy. We compare the performance of a lattice-based ensemble model and a deep neural network model to…

    Submitted 4 November, 2019; v1 submitted 21 October, 2019; originally announced October 2019.

  5. arXiv:1909.11532  [pdf, other]

    q-fin.CP cs.LG stat.ML

    Deep Neural Network Framework Based on Backward Stochastic Differential Equations for Pricing and Hedging American Options in High Dimensions

    Authors: Yangang Chen, Justin W. L. Wan

    Abstract: We propose a deep neural network framework for computing prices and deltas of American options in high dimensions. The architecture of the framework is a sequence of neural networks, where each network learns the difference of the price functions between adjacent timesteps. We introduce the least squares residual of the associated backward stochastic differential equation as the loss function. Our…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: 35 pages, 11 figures, 15 tables

  6. arXiv:1908.04284  [pdf, other]

    eess.AS cs.LG stat.ML

    Personal VAD: Speaker-Conditioned Voice Activity Detection

    Authors: Shaojin Ding, Quan Wang, Shuo-yiin Chang, Li Wan, Ignacio Lopez Moreno

    Abstract: In this paper, we propose "personal VAD", a system to detect the voice activity of a target speaker at the frame level. This system is useful for gating the inputs to a streaming on-device speech recognition system, such that it only triggers for the target user, which helps reduce the computational cost and battery consumption, especially in scenarios where a keyword detector is unpreferable. We…

    Submitted 8 April, 2020; v1 submitted 12 August, 2019; originally announced August 2019.

    Comments: Speaker Odyssey 2020

  7. arXiv:1811.12290  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Tuplemax Loss for Language Identification

    Authors: Li Wan, Prashant Sridhar, Yang Yu, Quan Wang, Ignacio Lopez Moreno

    Abstract: In many scenarios of a language identification task, the user will specify a small set of languages which he/she can speak instead of a large set of all possible languages. We want to model such prior knowledge into the way we train our neural networks, by replacing the commonly used softmax loss function with a novel loss function named tuplemax loss. As a matter of fact, a typical language ident…

    Submitted 17 February, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

    Comments: Submitted to ICASSP 2019

  8. arXiv:1801.10123  [pdf, ps, other]

    stat.ML cs.LG

    Links: A High-Dimensional Online Clustering Method

    Authors: Philip Andrew Mansfield, Quan Wang, Carlton Downey, Li Wan, Ignacio Lopez Moreno

    Abstract: We present a novel algorithm, called Links, designed to perform online clustering on unit vectors in a high-dimensional Euclidean space. The algorithm is appropriate when it is necessary to cluster data efficiently as it streams in, and is to be contrasted with traditional batch clustering algorithms that have access to all data at once. For example, Links has been successfully applied to embeddin…

    Submitted 30 January, 2018; originally announced January 2018.

  9. arXiv:1710.10470  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Attention-Based Models for Text-Dependent Speaker Verification

    Authors: F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan

    Abstract: Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning due to their ability to summarize relevant information that expands through the entire length of an input sequence. In this paper, we analyze the usage of attention mechanisms to the problem of sequence summarization in our end-to-end text-dependen…

    Submitted 31 January, 2018; v1 submitted 28 October, 2017; originally announced October 2017.

    Comments: Submitted to ICASSP 2018

  10. arXiv:1710.10468  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Speaker Diarization with LSTM

    Authors: Quan Wang, Carlton Downey, Li Wan, Philip Andrew Mansfield, Ignacio Lopez Moreno

    Abstract: For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications. However, mirroring the rise of deep learning in various domains, neural network based audio embeddings, also known as d-vectors, have consistently demonstrated superior speaker verification performance. In this paper, we build on the success of d-vecto…

    Submitted 23 January, 2022; v1 submitted 28 October, 2017; originally announced October 2017.

    Comments: Published at ICASSP 2018

  11. arXiv:1710.10467  [pdf, other]

    eess.AS cs.CL cs.LG stat.ML

    Generalized End-to-End Loss for Speaker Verification

    Authors: Li Wan, Quan Wang, Alan Papir, Ignacio Lopez Moreno

    Abstract: In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss, which makes the training of speaker verification models more efficient than our previous tuple-based end-to-end (TE2E) loss function. Unlike TE2E, the GE2E loss function updates the network in a way that emphasizes examples that are difficult to verify at each step of the training process. Additionally, the GE…

    Submitted 9 November, 2020; v1 submitted 28 October, 2017; originally announced October 2017.

    Comments: Published at ICASSP 2018