Showing 1–13 of 13 results for author: Irino, T

  1. arXiv:2406.05286  [pdf, other]

    eess.AS cs.SD

    Signal processing algorithm effective for sound quality of hearing loss simulators

    Authors: Toshio Irino, Shintaro Doan, Minami Ishikawa

    Abstract: Hearing loss (HL) simulators, which allow normal hearing (NH) listeners to experience HL, have been used in speech intelligibility experiments, but not in sound quality experiments due to perceptible distortion. If they produced less distortion, they might be useful for NH listeners to evaluate the sound quality of, for example, hearing aids. We conducted perceptual sound quality experiments to co…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted for publication in Interspeech 2024

  2. arXiv:2310.15399  [pdf, ps, other]

    eess.AS cs.SD

    GESI: Gammachirp Envelope Similarity Index for Predicting Intelligibility of Simulated Hearing Loss Sounds

    Authors: Ayako Yamamoto, Toshio Irino, Fuki Miyazaki, Honoka Tamaru

    Abstract: We propose an objective intelligibility measure (OIM), called the Gammachirp Envelope Similarity Index (GESI), which can predict the speech intelligibility (SI) of simulated hearing loss (HL) sounds for normal hearing (NH) listeners. GESI is an intrusive method that computes the SI metric using the gammachirp filterbank (GCFB), the modulation filterbank, and the extended cosine similarity measure.…

    Submitted 13 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: This paper was submitted to JASA on March 14, 2024

  3. arXiv:2306.01522  [pdf, ps, other]

    eess.AS cs.SD

    Auditory Representation Effective for Estimating Vocal Tract Information

    Authors: Toshio Irino, Shintaro Doan

    Abstract: We can estimate the size of the speakers based on their speech sounds alone. We had proposed an auditory computational theory of the Stabilised Wavelet-Mellin Transform (SWMT), which segregates information about the size and shape of the vocal tract and glottal vibration, to explain this observation. It has been shown that the auditory representation or excitation pattern (EP) associated with a we…

    Submitted 14 September, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: This manuscript is a revised version after acceptance for publication in Proc. APSIPA ASC 2023 on August 25, 2023

  4. WHIS: Hearing impairment simulator based on the gammachirp auditory filterbank

    Authors: Toshio Irino

    Abstract: A new version of a hearing impairment simulator (WHIS) was implemented based on a revised version of the gammachirp filterbank (GCFB), which incorporates fast frame-based processing, absolute threshold (AT), an audiogram of a hearing-impaired (HI) listener, and a parameter to control the cochlear input-output (IO) function. The parameter referred to as the compression health $α$ controlled the slo…

    Submitted 28 November, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: This preprint was an original version that was unsuccessfully submitted to Trends in Hearing on June 5, 2022. The revised version has been accepted for publication in IEEE access. See https://doi.org/10.1109/ACCESS.2023.3298673 ( https://ieeexplore.ieee.org/document/10193769 )

    Journal ref: IEEE access, 25 July 2023

  5. Speech intelligibility of simulated hearing loss sounds and its prediction using the Gammachirp Envelope Similarity Index (GESI)

    Authors: Toshio Irino, Honoka Tamaru, Ayako Yamamoto

    Abstract: In the present study, speech intelligibility (SI) experiments were performed using simulated hearing loss (HL) sounds in laboratory and remote environments to clarify the effects of peripheral dysfunction. Noisy speech sounds were processed to simulate the average HL of 70- and 80-year-olds using Wadai Hearing Impairment Simulator (WHIS). These sounds were presented to normal hearing (NH) listener…

    Submitted 28 November, 2023; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: This preprint is a copy of the final version accepted for Interspeech 2022. See https://doi.org/10.21437/Interspeech.2022-211

    Journal ref: Proc. Interspeech 2022

  6. Effective data screening technique for crowdsourced speech intelligibility experiments: Evaluation with IRM-based speech enhancement

    Authors: Ayako Yamamoto, Toshio Irino, Shoko Araki, Kenichi Arai, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani

    Abstract: It is essential to perform speech intelligibility (SI) experiments with human listeners in order to evaluate objective intelligibility measures for developing effective speech enhancement and noise reduction algorithms. Recently, crowdsourced remote testing has become a popular means for collecting a massive amount and variety of data at a relatively small cost and in a short time. However, carefu…

    Submitted 19 August, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: This paper was submitted to APSIPA ASC 2022 (https://www.apsipa2022.org). The original title [v1] was "Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening."

    Journal ref: Proc. APSIPA ASC 2022

  7. arXiv:2109.11594  [pdf, ps, other]

    cs.SD cs.HC eess.AS eess.SP

    Implementation of interactive tools for investigating fundamental frequency response of voiced sounds to auditory stimulation

    Authors: Hideki Kawahara, Toshie Matsui, Kohei Yatabe, Ken-Ichi Sakakibara, Minoru Tsuzaki, Masanori Morise, Toshio Irino

    Abstract: We introduced a measurement procedure for the involuntary response of voice fundamental-frequency to frequency modulated auditory stimulation. This involuntary response plays an essential role in voice fundamental frequency control while less investigated due to technical difficulties. This article introduces an interactive and real-time tool for investigating this response and supporting tools ad…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: Accepted for APSIPA ASC 2021

    MSC Class: 91E45; 92-04; 94A11

  8. Comparison of remote experiments using crowdsourcing and laboratory experiments on speech intelligibility

    Authors: Ayako Yamamoto, Toshio Irino, Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani

    Abstract: Many subjective experiments have been performed to develop objective speech intelligibility measures, but the novel coronavirus outbreak has made it very difficult to conduct experiments in a laboratory. One solution is to perform remote testing using crowdsourcing; however, because we cannot control the listening conditions, it is unclear whether the results are entirely reliable. In this study,…

    Submitted 16 April, 2021; originally announced April 2021.

    Comments: This paper was submitted to Interspeech2021

    Journal ref: Proc. Interspeech 2021

  9. arXiv:2104.01444  [pdf, ps, other]

    cs.SD eess.AS eess.SP

    Mixture of orthogonal sequences made from extended time-stretched pulses enables measurement of involuntary voice fundamental frequency response to pitch perturbation

    Authors: Hideki Kawahara, Toshie Matsui, Kohei Yatabe, Ken-Ichi Sakakibara, Minoru Tsuzaki, Masanori Morise, Toshio Irino

    Abstract: Auditory feedback plays an essential role in the regulation of the fundamental frequency of voiced sounds. The fundamental frequency also responds to auditory stimulation other than the speaker's voice. We propose to use this response of the fundamental frequency of sustained vowels to frequency-modulated test signals for investigating involuntary control of voice pitch. This involuntary response…

    Submitted 3 April, 2021; originally announced April 2021.

    Comments: 5 pages, 9 figures, submitted to Interspeech2021

    MSC Class: 92C55

  10. Frequency domain variant of Velvet noise and its application to acoustic measurements

    Authors: Hideki Kawahara, Ken-Ichi Sakakibara, Mitsunori Mizumachi, Hideki Banno, Masanori Morise, Toshio Irino

    Abstract: We propose a new family of test signals for acoustic measurements such as impulse response, nonlinearity, and the effects of background noise. The proposed family complements difficulties in existing families, the Swept-Sine (SS), pseudo-random noise such as the maximum length sequence (MLS). The proposed family uses the frequency domain variant of the Velvet noise (FVN) as its building block. An…

    Submitted 10 September, 2019; originally announced September 2019.

    Comments: 10 pages, 14 figures, APSIPA ASC 2019. arXiv admin note: text overlap with arXiv:1806.06812

    Journal ref: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Lanzhou, China, 2019, pp. 1523-1532

  11. GEDI: Gammachirp Envelope Distortion Index for Predicting Intelligibility of Enhanced Speech

    Authors: Katsuhiko Yamamoto, Toshio Irino, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani

    Abstract: In this study, we propose a new concept, the gammachirp envelope distortion index (GEDI), based on the signal-to-distortion ratio in the auditory envelope, SDRenv to predict the intelligibility of speech enhanced by nonlinear algorithms. The objective of GEDI is to calculate the distortion between enhanced and clean-speech representations in the domain of a temporal envelope extracted by the gamma…

    Submitted 19 July, 2020; v1 submitted 3 April, 2019; originally announced April 2019.

    Comments: Preprint, 37 pages, 6 tables, 9 figures

    Journal ref: Speech Communication, Vol. 123, pp. 43-58, 2020

  12. arXiv:1806.06812  [pdf, ps, other]

    cs.SD eess.AS eess.SP

    Frequency domain variants of velvet noise and their application to speech processing and synthesis: with appendices

    Authors: Hideki Kawahara, Ken-Ichi Sakakibara, Masanori Morise, Hideki Banno, Tomoki Toda, Toshio Irino

    Abstract: We propose a new excitation source signal for VOCODERs and an all-pass impulse response for post-processing of synthetic sounds and pre-processing of natural sounds for data-augmentation. The proposed signals are variants of velvet noise, which is a sparse discrete signal consisting of a few non-zero (1 or -1) elements and sounds smoother than Gaussian white noise. One of the proposed variants, FV…

    Submitted 18 June, 2018; originally announced June 2018.

    Comments: 11 pages, 20 figures, and 1 table, Interspeech 2018

  13. arXiv:1702.06724  [pdf, ps, other]

    eess.AS cs.SD eess.SP

    A new cosine series antialiasing function and its application to aliasing-free glottal source models for speech and singing synthesis

    Authors: Hideki Kawahara, Ken-Ichi Sakakibara, Hideki Banno, Masanori Morise, Tomoki Toda, Toshio Irino

    Abstract: We formulated and implemented a procedure to generate aliasing-free excitation source signals. It uses a new antialiasing filter in the continuous time domain followed by an IIR digital filter for response equalization. We introduced a cosine-series-based general design procedure for the new antialiasing function. We applied this new procedure to implement the antialiased Fujisaki-Ljungqvist model…

    Submitted 8 June, 2017; v1 submitted 22 February, 2017; originally announced February 2017.

    Comments: Submitted to Interspeech 2017

    Journal ref: Proc. Interspeech 2017, pp.1358-1362