-
Masking Kernel for Learning Energy-Efficient Representations for Speaker Recognition and Mobile Health
Authors: Apiwat Ditthapron, Emmanuel O. Agu, Adam C. Lammert
Abstract:
Modern smartphones possess hardware for audio acquisition and for speech processing tasks such as speaker recognition and health assessment. However, energy consumption remains a concern, especially for resource-intensive DNNs. Prior work has improved DNN energy efficiency by using a compact model or by reducing the dimensions of speech features. Both approaches reduce energy consumption during DNN inference but not during speech acquisition. This paper proposes a masking kernel, integrated into gradient descent during DNN training, that learns the most energy-efficient speech length and sampling rate for windowing, a common step in sample construction. To determine the energy-optimal parameters, a masking function with non-zero derivatives is combined with a low-pass filter. The proposed approach reduces the energy consumption of both data collection and inference by 57% and is competitive with speaker recognition and traumatic brain injury detection baselines.
Submitted 15 August, 2023; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Longitudinal Acoustic Speech Tracking Following Pediatric Traumatic Brain Injury
Authors: Camille Noufi, Adam C. Lammert, Daryush D. Mehta, James R. Williamson, Gregory Ciccarelli, Douglas Sturim, Jordan R. Green, Thomas F. Quatieri, Thomas F. Campbell
Abstract:
Recommendations for common outcome measures following pediatric traumatic brain injury (TBI) support the integration of instrumental measurements alongside perceptual assessment in recovery and treatment plans. A comprehensive set of sensitive, robust and non-invasive measurements is therefore essential in assessing variations in speech characteristics over time following pediatric TBI. In this article, we study the changes in the acoustic speech patterns of a pediatric cohort of ten subjects diagnosed with severe TBI. We extract a diverse set of both well-known and novel acoustic features from child speech recorded throughout the year after the child produced intelligible words. These features are analyzed individually and by speech subsystem, within-subject and across the cohort. As a group, older children exhibit highly significant (p<0.01) increases in pitch variation and phoneme diversity, shortened pause length, and steadying articulation rate variability. Younger children exhibit similar steadied rate variability alongside an increase in formant-based articulation complexity. Correlation analysis of the feature set with age and comparisons to normative developmental data confirm that age at injury plays a significant role in framing the recovery trajectory. Nearly all speech features significantly change (p<0.05) for the cohort as a whole, confirming that acoustic measures supplementing perceptual assessment are needed to identify efficacious treatment targets for speech therapy following TBI.
Submitted 9 September, 2022;
originally announced September 2022.
-
Derivation of Fitts' law from the Task Dynamics model of speech production
Authors: Tanner Sorensen, Adam Lammert, Louis Goldstein, Shrikanth Narayanan
Abstract:
Fitts' law is a linear equation relating movement time to an index of movement difficulty. The recent finding that Fitts' law applies to voluntary movement of the vocal tract raises the question of whether the theory of speech production implies Fitts' law. The present letter establishes a theoretical connection between Fitts' law and the Task Dynamics model of speech production. We derive a variant of Fitts' law where the intercept and slope are functions of the parameters of the Task Dynamics model and the index of difficulty is a product logarithm, or Lambert W function, rather than a logarithm.
Submitted 17 March, 2020; v1 submitted 14 January, 2020;
originally announced January 2020.