-
DMLR: Data-centric Machine Learning Research -- Past, Present and Future
Authors: Luis Oala, Manil Maskey, Lilith Bat-Leah, Alicia Parrish, Nezihe Merve Gürel, Tzu-Sheng Kuo, Yang Liu, Rotem Dror, Danilo Brajovic, Xiaozhe Yao, Max Bartolo, William A Gaviria Rojas, Ryan Hileman, Rainier Aliment, Michael W. Mahoney, Meg Risdal, Matthew Lease, Wojciech Samek, Debojyoti Dutta, Curtis G Northcutt, Cody Coleman, Braden Hancock, Bernard Koch, Girmaw Abebe Tadesse, Bojan Karlaš, et al. (13 additional authors not shown)
Abstract:
Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and meetings prior, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods towards positive scientific, societal and business impact.
Submitted 1 June, 2024; v1 submitted 21 November, 2023;
originally announced November 2023.
-
SEOFP-NET: Compression and Acceleration of Deep Neural Networks for Speech Enhancement Using Sign-Exponent-Only Floating-Points
Authors: Yu-Chen Lin, Cheng Yu, Yi-Te Hsu, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo
Abstract:
Numerous compression and acceleration strategies have achieved outstanding results on classification tasks in various fields, such as computer vision and speech signal processing. Nevertheless, the same strategies have yielded unsatisfactory performance on regression tasks because regression and classification tasks differ in nature. In this paper, a novel sign-exponent-only floating-point network (SEOFP-NET) technique is proposed to compress the model size and accelerate the inference time for speech enhancement, a regression task of speech signal processing. The proposed method compresses the sizes of deep neural network (DNN)-based speech enhancement models by quantizing the fraction bits of single-precision floating-point parameters during training. Before inference, all parameters in the trained SEOFP-NET model are slightly adjusted so that the floating-point multiplier can be replaced with an integer adder, which accelerates inference. To test generalization, the SEOFP-NET technique is applied to different speech enhancement tasks with different model architectures and various corpora. The experimental results indicate that the size of SEOFP-NET models can be compressed by up to 81.249% without noticeably degrading their speech enhancement performance, and the inference time can be accelerated to 1.212x compared with the baseline models. The results also verify that the proposed SEOFP-NET can be combined with other efficiency strategies to achieve a synergistic effect for model compression. In addition, the just noticeable difference (JND) was applied in a user study to statistically analyze the effect of speech enhancement on listening. The results indicate that listeners cannot readily distinguish between the enhanced speech signals processed by the baseline model and those processed by the proposed SEOFP-NET.
Submitted 8 November, 2021;
originally announced November 2021.
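The two ideas named in the SEOFP-NET abstract above (dropping the fraction bits of a single-precision parameter, then replacing a floating-point multiply with an integer add) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the IEEE 754 float32 layout (1 sign bit, 8 exponent bits, 23 fraction bits), uses plain truncation rather than a learned quantizer, and ignores zero, subnormal, infinite, and overflowing values. The function names are hypothetical.

```python
import struct

SIGN_MASK = 0x80000000      # bit 31
SIGN_EXP_MASK = 0xFF800000  # sign bit plus the 8 exponent bits
EXP_SHIFT = 23              # number of fraction bits in float32
EXP_BIAS = 127              # float32 exponent bias


def float_to_bits(x: float) -> int:
    """Return the 32-bit pattern of x stored as an IEEE 754 float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 float32."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def quantize_sign_exponent(w: float) -> float:
    """Zero all 23 fraction bits, leaving +/- 2**e (truncation, an assumption)."""
    return bits_to_float(float_to_bits(w) & SIGN_EXP_MASK)


def multiply_by_quantized(x: float, w_q: float) -> float:
    """Multiply x by w_q = +/- 2**e using only integer operations.

    Valid only for normal, nonzero inputs whose product stays in range.
    """
    xb = float_to_bits(x)
    wb = float_to_bits(w_q)
    sign = (xb ^ wb) & SIGN_MASK                      # product sign via XOR
    w_exp = ((wb >> EXP_SHIFT) & 0xFF) - EXP_BIAS     # unbiased exponent of w_q
    magnitude = (xb & 0x7FFFFFFF) + (w_exp << EXP_SHIFT)  # integer add on exponent field
    return bits_to_float(sign | magnitude)


w_q = quantize_sign_exponent(0.75)     # 0.75 = 1.5 * 2**-1, so w_q is 2**-1
y = multiply_by_quantized(3.0, w_q)    # same result as 3.0 * w_q
```

Because a weight of the form ±2^e has an all-zero fraction field, multiplying by it only shifts the other operand's exponent, which is why a single integer addition on the bit pattern is enough in this sketch.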
-
Speech Recovery for Real-World Self-powered Intermittent Devices
Authors: Yu-Chen Lin, Tsun-An Hsieh, Kuo-Hsuan Hung, Cheng Yu, Harinath Garudadri, Yu Tsao, Tei-Wei Kuo
Abstract:
The incompleteness of speech inputs severely degrades the performance of all related speech signal processing applications. Although many studies have addressed this issue, they simulated the missing-data conditions using self-defined masking lengths or sizes. Moreover, the masking definitions differ across these experimental settings. This paper presents a novel intermittent speech recovery (ISR) system for real-world self-powered intermittent devices. The ISR system reconstructs speech in three stages: interpolation, enhancement, and combination. The experimental results show that our recovery system increases speech quality by up to 591.7%, while increasing speech intelligibility by up to 80.5%. Most importantly, the proposed ISR system improves the WER scores by up to 52.6%. The promising results not only confirm the effectiveness of the reconstruction but also encourage the use of these battery-free wearable/IoT devices.
Submitted 24 January, 2022; v1 submitted 9 June, 2021;
originally announced June 2021.
-
Efficient attention guided 5G power amplifier digital predistortion
Authors: Alexandru Cioba, Alvin Chua, Da-shan Shiu, Ting-Hsun Kuo, Chia-Sheng Peng
Abstract:
We investigate neural network (NN) assisted techniques for compensating the non-linear behaviour and the memory effect of a 5G power amplifier (PA) through digital predistortion (DPD). Traditionally, the most prevalent compensation technique computes the compensation element using a Memory Polynomial Model (MPM). Various neural network proposals have been shown to improve on this performance. However, thus far they mostly come with prohibitive training or inference costs for real-world implementations. In this paper, we propose a DPD architecture that builds upon the practical MPM formulation governed by neural attention. Our approach enables a set of MPM DPD components to individually learn to target different regions of the data space, combining their outputs for a superior overall compensation. Our method achieves performance similar to that of higher-capacity NN models with minimal complexity. Finally, we view our approach as a framework that can be extended to a wide variety of local compensator types.
Submitted 30 March, 2020;
originally announced March 2020.
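For readers unfamiliar with the Memory Polynomial Model mentioned in the abstract above, the following is a minimal sketch of its common textbook form, y(n) = sum over k and m of a[k][m] * x(n-m) * |x(n-m)|^k, with k starting at 0. The abstract does not give the exact formulation or the attention mechanism used in the paper, so the indexing convention, the zero-padding of samples before the start of the signal, and the function name are assumptions made for illustration only.

```python
from typing import List, Sequence


def memory_polynomial(
    x: Sequence[complex],
    coeffs: Sequence[Sequence[complex]],
) -> List[complex]:
    """Apply a memory polynomial to the baseband samples x.

    coeffs[k][m] weights the term x[n-m] * |x[n-m]|**k, so row k
    corresponds to nonlinearity order k + 1 and column m to a delay
    of m samples. Samples before the start of x are treated as zero
    (an assumption of this sketch).
    """
    out: List[complex] = []
    for n in range(len(x)):
        acc = 0j
        for k, row in enumerate(coeffs):
            for m, a in enumerate(row):
                if n - m >= 0:
                    s = x[n - m]
                    acc += a * s * abs(s) ** k
        out.append(acc)
    return out


# One linear tap plus one delayed linear tap: y[n] = x[n] + 0.5 * x[n-1]
y = memory_polynomial([1.0, 2.0, 3.0], [[1.0, 0.5]])
```

A predistorter built this way stays linear in its coefficients, which is one reason the model is cheap to fit compared with a general neural network.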
-
A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN)
Authors: Yi-Te Hsu, Yu-Chen Lin, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo
Abstract:
Numerous studies have investigated the effectiveness of neural network quantization on pattern classification tasks. The present study, for the first time, investigated the performance of speech enhancement (a regression task in speech processing) using a novel exponent-only floating-point quantized neural network (EOFP-QNN). The proposed EOFP-QNN consists of two stages: mantissa-quantization and exponent-quantization. In the mantissa-quantization stage, EOFP-QNN learns how to quantize the mantissa bits of the model parameters while preserving the regression accuracy using the least mantissa precision. In the exponent-quantization stage, the exponent part of the parameters is further quantized without causing any additional performance degradation. We evaluated the proposed EOFP quantization technique on two types of neural networks, namely, bidirectional long short-term memory (BLSTM) and fully convolutional neural network (FCN), on a speech enhancement task. Experimental results showed that the model sizes can be significantly reduced (the model sizes of the quantized BLSTM and FCN models were only 18.75% and 21.89%, respectively, compared to those of the original models) while maintaining satisfactory speech-enhancement performance.
Submitted 30 October, 2018; v1 submitted 17 August, 2018;
originally announced August 2018.