Showing 1–5 of 5 results for author: Hsu, J

  1. arXiv:2310.02971  [pdf, other]

    eess.AS cs.CL eess.SP

    Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model

    Authors: Kai-Wei Chang, Ming-Hsin Chen, Yun-Ping Lin, Jing Neng Hsu, Paul Kuo-Ming Huang, Chien-yu Huang, Shang-Wen Li, Hung-yi Lee

    Abstract: Prompting and adapter tuning have emerged as efficient alternatives to fine-tuning (FT) methods. However, existing studies on speech prompting focused on classification tasks and failed on more complex sequence generation tasks. Besides, adapter tuning is primarily applied with a focus on encoder-only self-supervised models. Our experiments show that prompting on Wav2Seq, a self-supervised encoder…

    Submitted 14 November, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted to IEEE ASRU 2023

  2. arXiv:2106.00497  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Omnizart: A General Toolbox for Automatic Music Transcription

    Authors: Yu-Te Wu, Yin-Jyun Luo, Tsung-Ping Chen, I-Chieh Wei, Jui-Yang Hsu, Yi-Chin Chuang, Li Su

    Abstract: We present and release Omnizart, a new Python library that provides a streamlined solution to automatic music transcription (AMT). Omnizart encompasses modules that construct the life-cycle of deep learning-based AMT, and is designed for ease of use with a compact command-line interface. To the best of our knowledge, Omnizart is the first transcription toolkit which offers models covering a wide c…

    Submitted 1 June, 2021; originally announced June 2021.

  3. arXiv:2009.10858  [pdf, other]

    cs.LG cs.CV eess.IV

    Improving Medical Annotation Quality to Decrease Labeling Burden Using Stratified Noisy Cross-Validation

    Authors: Joy Hsu, Sonia Phene, Akinori Mitani, Jieying Luo, Naama Hammel, Jonathan Krause, Rory Sayres

    Abstract: As machine learning has become increasingly applied to medical imaging data, noise in training labels has emerged as an important challenge. Variability in diagnosis of medical images is well established; in addition, variability in training and attention to task among medical labelers may exacerbate this issue. Methods for identifying and mitigating the impact of low quality labels have been stud…

    Submitted 22 September, 2020; originally announced September 2020.

    Journal ref: ACM Conference on Health, Inference, and Learning, April 02-04, 2020, Toronto, Canada

  4. arXiv:2005.07029  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation

    Authors: Yi-Chen Chen, Jui-Yang Hsu, Cheng-Kuang Lee, Hung-yi Lee

    Abstract: In previous works, only parameter weights of ASR models are optimized under fixed-topology architecture. However, the design of successful model architecture has always relied on human experience and intuition. Besides, many hyperparameters related to model architecture need to be manually tuned. Therefore in this paper, we propose an ASR approach with efficient gradient-based architecture search,…

    Submitted 25 July, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

    Comments: Accepted at INTERSPEECH 2020

  5. arXiv:1910.12094  [pdf, other]

    cs.SD cs.CL eess.AS

    Meta Learning for End-to-End Low-Resource Speech Recognition

    Authors: Jui-Yang Hsu, Yuan-Jui Chen, Hung-yi Lee

    Abstract: In this paper, we proposed to apply meta learning approach for low-resource automatic speech recognition (ASR). We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation on unseen target language, via recently proposed model-agnostic meta learning algorithm (MAML). We evaluated the propose…

    Submitted 26 October, 2019; originally announced October 2019.

    Comments: 5 pages, submitted to ICASSP 2020