-
A pilot protocol and cohort for the investigation of non-pathological variability in speech
Authors:
Nicholas Cummins,
Lauren L. White,
Zahia Rahman,
Catriona Lucas,
Tian Pan,
Ewan Carr,
Faith Matcham,
Johnny Downs,
Richard J. Dobson,
Judith Dineley
Abstract:
Background: Speech-based biomarkers have potential as a means for regular, objective assessment of symptom severity, remotely and in-clinic, in combination with advanced analytical models. However, the complex nature of speech and the often subtle changes associated with health mean that findings are highly dependent on methodological and cohort choices. These are often not reported adequately in studies investigating speech-based health assessment. Objective: To develop and apply an exemplar protocol to generate a pilot dataset of healthy speech with detailed metadata for the assessment of factors in the speech recording-analysis pipeline, including device choice, speech elicitation task and non-pathological variability. Methods: We developed our collection protocol and choice of exemplar speech features based on a thematic literature review. Our protocol includes the elicitation of three different speech types. With a focus on remote applications, we also chose to collect speech with three different microphone types. We developed a pipeline to extract a set of 14 exemplar speech features. Results: We collected speech from 28 individuals three times in one day, repeated at the same times 8-11 weeks later, and from 25 healthy individuals three times in one week. Participant characteristics collected included sex, age, native language status and voice use habits. A preliminary set of 14 speech features covering timing, prosody, voice quality, articulation and spectral moment characteristics was extracted, providing a resource of normative values. Conclusions: There are multiple methodological factors involved in the collection, processing and analysis of speech recordings. Consistent reporting and greater harmonisation of study protocols are urgently required to aid the translation of speech processing into clinical research and practice.
Submitted 11 June, 2024;
originally announced June 2024.
-
Deciphering seasonal depression variations and interplays between weather changes, physical activity, and depression severity in real-world settings: Learnings from RADAR-MDD longitudinal mobile health study
Authors:
Yuezhou Zhang,
Amos A. Folarin,
Yatharth Ranjan,
Nicholas Cummins,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Shaoxiong Sun,
Srinivasan Vairavan,
Faith Matcham,
Carolin Oetzmann,
Sara Siddi,
Femke Lamers,
Sara Simblett,
Til Wykes,
David C. Mohr,
Josep Maria Haro,
Brenda W. J. H. Penninx,
Vaibhav A. Narayan,
Matthew Hotopf,
Richard J. B. Dobson,
Abhishek Pratap,
RADAR-CNS consortium
Abstract:
Prior research has shown that changes in seasons and weather can have a significant impact on depression severity. However, findings are inconsistent across populations, and the interplay between weather, behavior, and depression has not been fully quantified. This study analyzed real-world data from 428 participants (a subset; 68.7% of the cohort) in the RADAR-MDD longitudinal mobile health study to investigate seasonal variations in depression (measured through a remote validated assessment - PHQ-8) and examine the potential interplay between dynamic weather changes, physical activity (monitored via wearables), and depression severity. The clustering of PHQ-8 scores identified four distinct seasonal variations in depression severity: one stable trend and three varying patterns where depression peaks in different seasons. Among these patterns, participants within the stable trend had the oldest average age (p=0.002) and the lowest baseline PHQ-8 score (p=0.003). Mediation analysis assessing the indirect effect of weather on physical activity and depression showed significant differences among participants with different affective responses to weather. These findings illustrate the heterogeneity in individuals' seasonal depression variations and responses to weather, underscoring the necessity for personalized approaches to help understand the impact of environmental factors on the real-world effectiveness of behavioral treatments.
Submitted 17 April, 2024;
originally announced April 2024.
-
Longitudinal Assessment of Seasonal Impacts and Depression Associations on Circadian Rhythm Using Multimodal Wearable Sensing
Authors:
Yuezhou Zhang,
Amos A Folarin,
Shaoxiong Sun,
Nicholas Cummins,
Yatharth Ranjan,
Zulqarnain Rashid,
Callum Stewart,
Pauline Conde,
Heet Sankesara,
Petroula Laiou,
Faith Matcham,
Katie M White,
Carolin Oetzmann,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Srinivasan Vairavan,
Inez Myin-Germeys,
David C. Mohr,
Til Wykes,
Josep Maria Haro,
Peter Annas,
Brenda WJH Penninx,
Vaibhav A Narayan,
Matthew Hotopf
, et al. (2 additional authors not shown)
Abstract:
Objective: This study aimed to explore the associations between depression severity and wearable-measured circadian rhythms, accounting for seasonal impacts and quantifying seasonal changes in circadian rhythms. Materials and Methods: Data used in this study came from a large longitudinal mobile health study. Depression severity (measured biweekly using the 8-item Patient Health Questionnaire [PHQ-8]) and behaviors (monitored by Fitbit) were tracked for up to two years. Twelve features were extracted from Fitbit recordings to approximate circadian rhythms. Three nested linear mixed-effects models were employed for each feature: (1) incorporating the PHQ-8 score as an independent variable; (2) adding the season variable; and (3) adding an interaction term between season and the PHQ-8 score. Results: This study analyzed 10,018 PHQ-8 records with Fitbit data from 543 participants. Upon adjusting for seasonal effects, higher PHQ-8 scores were associated with reduced activity, irregular behaviors, and delayed rhythms. Notably, the negative association with daily step counts was stronger in summer and spring than in winter, and the positive association with the onset of the most active continuous 10-hour period was significant only during summer. Furthermore, participants had shorter and later sleep, more activity, and delayed circadian rhythms in summer compared to winter. Discussion and Conclusions: Our findings underscore the significant seasonal impacts on human circadian rhythms and their associations with depression and indicate that wearable-measured circadian rhythms have the potential to be digital biomarkers of depression.
Submitted 5 December, 2023;
originally announced December 2023.
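The three nested linear mixed-effects models listed in the abstract above can be written out as follows. This is a sketch under stated assumptions: the symbols, the participant-level random intercept $u_i$, and the indicator coding of season are illustrative choices, since the abstract does not specify the random-effects structure or any additional covariates.

```latex
% y_{ij}: one circadian rhythm feature for participant i at PHQ-8 record j (assumed notation)
% u_i ~ N(0, \sigma_u^2): assumed participant-level random intercept
% \varepsilon_{ij} ~ N(0, \sigma^2): residual error
% s ranges over the non-reference seasons
\begin{align}
\text{(1)}\quad y_{ij} &= \beta_0 + \beta_1\,\mathrm{PHQ8}_{ij} + u_i + \varepsilon_{ij} \\
\text{(2)}\quad y_{ij} &= \beta_0 + \beta_1\,\mathrm{PHQ8}_{ij}
  + \sum_{s} \gamma_s\,\mathbb{1}[\mathrm{season}_{ij} = s] + u_i + \varepsilon_{ij} \\
\text{(3)}\quad y_{ij} &= \beta_0 + \beta_1\,\mathrm{PHQ8}_{ij}
  + \sum_{s} \gamma_s\,\mathbb{1}[\mathrm{season}_{ij} = s]
  + \sum_{s} \delta_s\,\mathrm{PHQ8}_{ij}\,\mathbb{1}[\mathrm{season}_{ij} = s]
  + u_i + \varepsilon_{ij}
\end{align}
```

Under this notation, $\beta_1$ in model (2) is the PHQ-8 association adjusted for season, and each $\delta_s$ in model (3) describes how that association differs in season $s$ relative to the reference season.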
-
Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model
Authors:
Yuezhou Zhang,
Amos A Folarin,
Judith Dineley,
Pauline Conde,
Valeria de Angel,
Shaoxiong Sun,
Yatharth Ranjan,
Zulqarnain Rashid,
Callum Stewart,
Petroula Laiou,
Heet Sankesara,
Linglong Qian,
Faith Matcham,
Katie M White,
Carolin Oetzmann,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Björn W. Schuller,
Srinivasan Vairavan,
Til Wykes,
Josep Maria Haro,
Brenda WJH Penninx,
Vaibhav A Narayan,
Matthew Hotopf
, et al. (3 additional authors not shown)
Abstract:
Language use has been shown to correlate with depression, but large-scale validation is needed. Because traditional methods such as clinic studies are expensive, natural language processing has been applied to social media to predict depression, but limitations remain: a lack of validated labels, biased user samples, and no context. Our study identified 29 topics in 3919 smartphone-collected speech recordings from 265 participants using the Whisper tool and BERTopic model. Six topics with a median PHQ-8 greater than or equal to 10 were regarded as risk topics for depression: No Expectations, Sleep, Mental Therapy, Haircut, Studying, and Coursework. To elucidate the topic emergence and associations with depression, we compared behavioral (from wearables) and linguistic characteristics across identified topics. The correlation between topic shifts and changes in depression severity over time was also investigated, indicating the importance of longitudinally monitoring language use. We also tested the BERTopic model on a similar smaller dataset (356 speech recordings from 57 participants), obtaining some consistent results. In summary, our findings demonstrate that specific speech topics may indicate depression severity. The presented data-driven workflow provides a practical approach to collecting and analyzing large-scale speech data from real-world settings for digital health research.
Submitted 5 September, 2023; v1 submitted 22 August, 2023;
originally announced August 2023.
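The risk-topic rule stated in the abstract above (a topic counts as a risk topic when its median PHQ-8 is greater than or equal to 10) can be illustrated with a minimal sketch. The function name, the record layout, the example scores, and the "Holiday" topic are hypothetical; only the threshold of 10 and the topic names "Sleep" and "Coursework" come from the abstract. Topic assignment itself (Whisper transcription followed by BERTopic) is assumed to have happened upstream and is not shown.

```python
from collections import defaultdict
from statistics import median

RISK_THRESHOLD = 10  # median PHQ-8 cut-off stated in the abstract


def find_risk_topics(records, threshold=RISK_THRESHOLD):
    """Return {topic: median PHQ-8} for topics whose median meets the threshold.

    `records` is an iterable of (topic, phq8_score) pairs, one per recording.
    This layout is an assumption made for illustration.
    """
    scores_by_topic = defaultdict(list)
    for topic, phq8 in records:
        scores_by_topic[topic].append(phq8)
    medians = {topic: median(scores) for topic, scores in scores_by_topic.items()}
    return {topic: m for topic, m in medians.items() if m >= threshold}


# Hypothetical recordings: each pair is (assigned topic, PHQ-8 score).
example_records = [
    ("Sleep", 12), ("Sleep", 9), ("Sleep", 15),
    ("Holiday", 4), ("Holiday", 7), ("Holiday", 11),
    ("Coursework", 10), ("Coursework", 10),
]

risk_topics = find_risk_topics(example_records)
print(sorted(risk_topics))
```

Using the median rather than the mean keeps a single very high or very low score from deciding whether a topic is flagged.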
-
Towards robust paralinguistic assessment for real-world mobile health (mHealth) monitoring: an initial study of reverberation effects on speech
Authors:
Judith Dineley,
Ewan Carr,
Faith Matcham,
Johnny Downs,
Richard Dobson,
Thomas F Quatieri,
Nicholas Cummins
Abstract:
Speech is promising as an objective, convenient tool to monitor health remotely over time using mobile devices. Numerous paralinguistic features have been demonstrated to contain salient information related to an individual's health. However, mobile device specification and acoustic environments vary widely, risking the reliability of the extracted features. In an initial step towards quantifying these effects, we report the variability of 13 exemplar paralinguistic features commonly reported in the speech-health literature and extracted from the speech of 42 healthy volunteers recorded consecutively in rooms with low and high reverberation with one budget and two higher-end smartphones and a condenser microphone. Our results show reverberation has a clear effect on several features, in particular voice quality markers. They point to new research directions investigating how best to record and process in-the-wild speech for reliable longitudinal health state assessment.
Submitted 31 May, 2023; v1 submitted 21 May, 2023;
originally announced May 2023.
-
Challenges in Using mHealth Data From Smartphones and Wearable Devices to Predict Depression Symptom Severity: Retrospective Analysis
Authors:
Shaoxiong Sun,
Amos A. Folarin,
Yuezhou Zhang,
Nicholas Cummins,
Rafael Garcia-Dias,
Callum Stewart,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Petroula Laiou,
Heet Sankesara,
Faith Matcham,
Daniel Leightley,
Katie M. White,
Carolin Oetzmann,
Alina Ivan,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Raluca Nica,
Aki Rintala,
David C. Mohr,
Inez Myin-Germeys,
Til Wykes,
Josep Maria Haro
, et al. (6 additional authors not shown)
Abstract:
A number of challenges exist for the analysis of mHealth data: maintaining participant engagement over extended time periods and therefore understanding what constitutes an acceptable threshold of missing data; distinguishing between the cross-sectional and longitudinal relationships for different features to determine their utility in tracking within-individual longitudinal variation or screening individuals at high risk; and understanding the heterogeneity with which depression manifests itself in behavioral patterns quantified by the passive features. From 479 participants with MDD, we extracted 21 features capturing mobility, sleep, and smartphone use. We investigated the impact of the number of days of available data on feature quality using the intraclass correlation coefficient and Bland-Altman analysis. We then examined the nature of the correlation between the 8-item Patient Health Questionnaire (PHQ-8) depression scale (measured every 14 days) and the features using the individual-mean correlation, repeated measures correlation, and linear mixed effects model. Furthermore, we stratified the participants based on their behavioral difference, quantified by the features, between periods of high (depression) and low (no depression) PHQ-8 scores using the Gaussian mixture model. We demonstrated that at least 8 (range 2-12) days were needed for reliable calculation of most of the features in the 14-day time window. We observed that features such as sleep onset time correlated better with PHQ-8 scores cross-sectionally than longitudinally, whereas features such as wakefulness after sleep onset correlated well with PHQ-8 longitudinally but worse cross-sectionally. Finally, we found that participants could be separated into 3 distinct clusters according to their behavioral difference between periods of depression and periods of no depression.
Submitted 14 August, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
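The distinction the abstract above draws between cross-sectional and longitudinal relationships can be illustrated with a small sketch. It contrasts a correlation of per-participant means (cross-sectional) with a correlation of participant-centered values (within-individual). The centering approach is a simplification in the spirit of repeated measures correlation, not the authors' implementation, and all function names and data values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def individual_mean_correlation(data):
    """Cross-sectional view: correlate each participant's mean feature and mean score."""
    mean_feature = [sum(f for f, _ in obs) / len(obs) for obs in data.values()]
    mean_score = [sum(s for _, s in obs) / len(obs) for obs in data.values()]
    return pearson(mean_feature, mean_score)


def within_individual_correlation(data):
    """Longitudinal view: remove each participant's own mean, then correlate."""
    centered_feature, centered_score = [], []
    for obs in data.values():
        mean_f = sum(f for f, _ in obs) / len(obs)
        mean_s = sum(s for _, s in obs) / len(obs)
        centered_feature.extend(f - mean_f for f, _ in obs)
        centered_score.extend(s - mean_s for _, s in obs)
    return pearson(centered_feature, centered_score)


# Hypothetical data: {participant: [(feature value, PHQ-8 score), ...]}.
# Participants with higher feature values have higher scores on average,
# yet within each participant the feature falls as the score rises.
example_data = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(11, 13), (12, 12), (13, 11)],
    "C": [(21, 23), (22, 22), (23, 21)],
}

cross_sectional_r = individual_mean_correlation(example_data)
within_r = within_individual_correlation(example_data)
print(round(cross_sectional_r, 3), round(within_r, 3))
```

In this constructed example the two views have opposite signs, which is the kind of divergence that makes a feature useful for screening but not for tracking change, or the reverse.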
-
Bayesian Networks for the robust and unbiased prediction of depression and its symptoms utilizing speech and multimodal data
Authors:
Salvatore Fara,
Orlaith Hickey,
Alexandra Georgescu,
Stefano Goria,
Emilia Molimpakis,
Nicholas Cummins
Abstract:
Predicting the presence of major depressive disorder (MDD) using behavioural and cognitive signals is a highly non-trivial task. The heterogeneous clinical profile of MDD means that any given speech, facial expression and/or observed cognitive pattern may be associated with a unique combination of depressive symptoms. Conventional discriminative machine learning models potentially lack the complexity to robustly model this heterogeneity. Bayesian networks, however, may instead be well-suited to such a scenario. These networks are probabilistic graphical models that efficiently describe the joint probability distribution over a set of random variables by explicitly capturing their conditional dependencies. This framework provides further advantages over standard discriminative modelling by offering the possibility to incorporate expert opinion in the graphical structure of the models, generating explainable model predictions, informing about the uncertainty of predictions, and naturally handling missing data. In this study, we apply a Bayesian framework to capture the relationships between depression, depression symptoms, and features derived from speech, facial expression and cognitive game data collected at thymia.
Submitted 22 June, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Detecting the Severity of Major Depressive Disorder from Speech: A Novel HARD-Training Methodology
Authors:
Edward L. Campbell,
Judith Dineley,
Pauline Conde,
Faith Matcham,
Femke Lamers,
Sara Siddi,
Laura Docio-Fernandez,
Carmen Garcia-Mateo,
Nicholas Cummins,
the RADAR-CNS Consortium
Abstract:
Major Depressive Disorder (MDD) is a common worldwide mental health issue with high associated socioeconomic costs. The prediction and automatic detection of MDD can, therefore, make a huge impact on society. Speech, as a non-invasive, easy-to-collect signal, is a promising marker to aid the diagnosis and assessment of MDD. In this regard, speech samples were collected as part of the Remote Assessment of Disease and Relapse in Major Depressive Disorder (RADAR-MDD) research programme. RADAR-MDD was an observational cohort study in which speech and other digital biomarkers were collected from a cohort of individuals with a history of MDD in Spain, the United Kingdom and the Netherlands. In this paper, the RADAR-MDD speech corpus was taken as an experimental framework to test the efficacy of a Sequence-to-Sequence model with a local attention mechanism in a two-class depression severity classification paradigm. Additionally, a novel training method, HARD-Training, is proposed. It is a methodology based on the selection of more ambiguous samples for model training, inspired by the curriculum learning paradigm. HARD-Training was found to consistently improve the performance of our classifiers, with an average increment of 8.6%, for both speech elicitation tasks used and for each collection site of the RADAR-MDD speech corpus. With this novel methodology, our Sequence-to-Sequence model was able to effectively detect MDD severity regardless of language. Finally, recognising the need for greater awareness of potential algorithmic bias, we conduct an additional analysis of our results separately for each gender.
Submitted 25 May, 2023; v1 submitted 2 June, 2022;
originally announced June 2022.
-
Speech and the n-Back task as a lens into depression. How combining both may allow us to isolate different core symptoms of depression
Authors:
Salvatore Fara,
Stefano Goria,
Emilia Molimpakis,
Nicholas Cummins
Abstract:
Embedded in any speech signal is a rich combination of cognitive, neuromuscular and physiological information. This richness makes speech a powerful signal in relation to a range of different health conditions, including major depressive disorders (MDD). One pivotal issue in speech-depression research is the assumption that depressive severity is the dominant measurable effect. However, given the heterogeneous clinical profile of MDD, it may actually be the case that speech alterations are more strongly associated with subsets of key depression symptoms. This paper presents strong evidence in support of this argument. First, we present a novel large, cross-sectional, multi-modal dataset collected at Thymia. We then present a set of machine learning experiments that demonstrate that combining speech with features from an n-Back working memory assessment improves classifier performance when predicting the popular eight-item Patient Health Questionnaire depression scale (PHQ-8). Finally, we present a set of experiments that highlight the association between different speech and n-Back markers at the PHQ-8 item level. Specifically, we observe that somatic and psychomotor symptoms are more strongly associated with n-Back performance scores, whilst the other items: anhedonia, depressed mood, change in appetite, feelings of worthlessness and trouble concentrating are more strongly associated with speech changes.
Submitted 30 March, 2022;
originally announced April 2022.
-
Automatic Detection of Expressed Emotion from Five-Minute Speech Samples: Challenges and Opportunities
Authors:
Bahman Mirheidari,
André Bittar,
Nicholas Cummins,
Johnny Downs,
Helen L. Fisher,
Heidi Christensen
Abstract:
We present a novel feasibility study on the automatic recognition of Expressed Emotion (EE), a family environment concept based on caregivers speaking freely about their relative/family member. We describe an automated approach for determining the degree of warmth, a key component of EE, from acoustic and text features acquired from a sample of 37 recorded interviews. These recordings, collected over 20 years ago, are derived from a nationally representative birth cohort of 2,232 British twin children and were manually coded for EE. We outline the core steps of extracting usable information from recordings with highly variable audio quality and assess the efficacy of four machine learning approaches trained with different combinations of acoustic and text features. Despite the challenges of working with this legacy data, we demonstrated that the degree of warmth can be predicted with an F1-score of 61.5%. In this paper, we summarise our learning and provide recommendations for future work using real-world speech samples.
Submitted 30 March, 2022;
originally announced March 2022.
-
Associations between depression symptom severity and daily-life gait characteristics derived from long-term acceleration signals in real-world settings
Authors:
Yuezhou Zhang,
Amos A Folarin,
Shaoxiong Sun,
Nicholas Cummins,
Srinivasan Vairavan,
Linglong Qian,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Petroula Laiou,
Heet Sankesara,
Faith Matcham,
Katie M White,
Carolin Oetzmann,
Alina Ivan,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Aki Rintala,
David C Mohr,
Inez Myin-Germeys,
Til Wykes,
Josep Maria Haro,
Brenda WJH Penninx
, et al. (5 additional authors not shown)
Abstract:
Gait is an essential manifestation of depression. Laboratory gait characteristics have been found to be closely associated with depression. However, the gait characteristics of daily walking in real-world scenarios and their relationships with depression are yet to be fully explored. This study aimed to explore associations between depression symptom severity and daily-life gait characteristics derived from acceleration signals in real-world settings. In this study, we used two ambulatory datasets: a public dataset with 3-day acceleration signals from 71 older adults collected by a wearable device, and a subset of an EU longitudinal depression study with 215 participants and their phone-collected acceleration signals (average 463 hours per participant). We detected participants' gait cycles and force from acceleration signals and extracted 20 statistics-based daily-life gait features to describe the distribution and variance of gait cadence and force over a long-term period corresponding to the self-reported depression score. The gait cadence of faster steps (75th percentile) over a long-term period has a significant negative association with the depression symptom severity of this period in both datasets. Daily-life gait features could significantly improve the goodness of fit of evaluating depression severity relative to laboratory gait patterns and demographics, which was assessed by likelihood-ratio tests in both datasets. This study indicated that the significant links between daily-life walking characteristics and depression symptom severity could be captured by both wearable devices and mobile phones. The gait cadence of faster steps in daily-life walking has the potential to be a biomarker for evaluating depression severity, which may contribute to clinical tools to remotely monitor mental health in real-world settings.
Submitted 29 January, 2022;
originally announced January 2022.
-
The utility of wearable devices in assessing ambulatory impairments of people with multiple sclerosis in free-living conditions
Authors:
Shaoxiong Sun,
Amos A Folarin,
Yuezhou Zhang,
Nicholas Cummins,
Shuo Liu,
Callum Stewart,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Petroula Laiou,
Heet Sankesara,
Gloria Dalla Costa,
Letizia Leocani,
Per Soelberg Sørensen,
Melinda Magyari,
Ana Isabel Guerrero,
Ana Zabalza,
Srinivasan Vairavan,
Raquel Bailon,
Sara Simblett,
Inez Myin-Germeys,
Aki Rintala,
Til Wykes,
Vaibhav A Narayan,
Matthew Hotopf
, et al. (3 additional authors not shown)
Abstract:
Multiple sclerosis (MS) is a progressive inflammatory and neurodegenerative disease of the central nervous system affecting over 2.5 million people globally. The in-clinic six-minute walk test (6MWT) is a widely used objective measure to evaluate the progression of MS. Yet, it has limitations such as the need for a clinical visit and a proper walkway. The widespread use of wearable devices capable of depicting patients' activity profiles has the potential to assess the level of MS-induced disability in free-living conditions. In this work, we extracted 96 activity features in different temporal granularities (from minute-level to day-level) and explored their utility in estimating 6MWT scores in a European (Italy, Spain, and Denmark) MS cohort of 337 participants over an average duration of 10 months. We combined these features with participant demographics using three regression models: elastic net, gradient boosted trees and random forest. In addition, we quantified the individual feature contribution using feature importance in these regression models, linear mixed-effects models, generalized estimating equations, and correlation-based feature selection (CFS). The results showed promising estimation performance with an R² of 0.30, which was derived using random forest after CFS. This model was able to distinguish the participants with low disability from those with high disability. Furthermore, we observed that the minute-level (no longer than 8 minutes) step count features, particularly those capturing the upper end of the step count distribution, had a stronger association with 6MWT. The use of a walking aid was indicative of ambulatory function measured through 6MWT. This study provides a basis for future investigation into the clinical relevance and utility of wearables in assessing MS progression in free-living conditions.
Submitted 22 December, 2021;
originally announced December 2021.
-
Predicting Depressive Symptom Severity through Individuals' Nearby Bluetooth Devices Count Data Collected by Mobile Phones: A Preliminary Longitudinal Study
Authors:
Yuezhou Zhang,
Amos A Folarin,
Shaoxiong Sun,
Nicholas Cummins,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Petroula Laiou,
Faith Matcham,
Carolin Oetzmann,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Aki Rintala,
David C Mohr,
Inez Myin-Germeys,
Til Wykes,
Josep Maria Haro,
Brenda WJH Penninx,
Vaibhav A Narayan,
Peter Annas,
Matthew Hotopf,
Richard JB Dobson
Abstract:
The Bluetooth sensor embedded in mobile phones provides an unobtrusive, continuous, and cost-efficient means to capture individuals' proximity information, such as the nearby Bluetooth devices count (NBDC). The continuous NBDC data can partially reflect individuals' behaviors and status, such as social connections and interactions, working status, mobility, and social isolation and loneliness, which were found to be significantly associated with depression by previous survey-based studies. This paper aims to explore the NBDC data's value in predicting depressive symptom severity as measured via the 8-item Patient Health Questionnaire (PHQ-8). The data used in this paper included 2,886 bi-weekly PHQ-8 records collected from 316 participants recruited from three study sites in the Netherlands, Spain, and the UK as part of the EU RADAR-CNS study. From the NBDC data two weeks prior to each PHQ-8 score, we extracted 49 Bluetooth features, including statistical features and nonlinear features for measuring periodicity and regularity of individuals' life rhythms. Linear mixed-effect models were used to explore associations between Bluetooth features and the PHQ-8 score. We then applied hierarchical Bayesian linear regression models to predict the PHQ-8 score from the extracted Bluetooth features. A number of significant associations were found between Bluetooth features and depressive symptom severity. Compared with commonly used machine learning models, the proposed hierarchical Bayesian linear regression model achieved the best prediction metrics, with R² = 0.526 and a root mean squared error (RMSE) of 3.891. Bluetooth features can explain an extra 18.8% of the variance in the PHQ-8 score relative to the baseline model without Bluetooth features (R² = 0.338, RMSE = 4.547).
Submitted 26 April, 2021;
originally announced April 2021.
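As a quick check on the figures in the abstract above, the gain attributed to Bluetooth features is the difference between the two reported R² values. The sketch below redoes that arithmetic and gives textbook definitions of R² and RMSE. The function names are illustrative assumptions and are not taken from the paper.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Values reported in the abstract (not recomputed from data).
r2_full = 0.526      # model with Bluetooth features
r2_baseline = 0.338  # baseline model without Bluetooth features

# Extra share of PHQ-8 variance explained once Bluetooth features are added.
extra_variance = r2_full - r2_baseline
```

The difference comes to about 0.188, which matches the 18.8% stated in the abstract.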
-
Fitbeat: COVID-19 Estimation based on Wristband Heart Rate
Authors:
Shuo Liu,
Jing Han,
Estela Laporta Puyal,
Spyridon Kontaxis,
Shaoxiong Sun,
Patrick Locatelli,
Judith Dineley,
Florian B. Pokorny,
Gloria Dalla Costa,
Letizia Leocani,
Ana Isabel Guerrero,
Carlos Nos,
Ana Zabalza,
Per Soelberg Sørensen,
Mathias Buron,
Melinda Magyari,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Amos A Folarin,
Richard JB Dobson,
Raquel Bailón,
Srinivasan Vairavan,
Nicholas Cummins
, et al. (4 additional authors not shown)
Abstract:
This study investigates the potential of deep learning methods to identify individuals with suspected COVID-19 infection using remotely collected heart-rate data. The study utilises data from the ongoing EU IMI RADAR-CNS research project that is investigating the feasibility of wearable devices and smartphones to monitor individuals with multiple sclerosis (MS), depression or epilepsy. As part of the project protocol, heart-rate data was collected from participants using a Fitbit wristband. The presence of COVID-19 in the cohort in this work was either confirmed through a positive swab test, or inferred through the self-reporting of a combination of symptoms including fever, respiratory symptoms, loss of smell or taste, tiredness and gastrointestinal symptoms. Experimental results indicate that our proposed contrastive convolutional auto-encoder (contrastive CAE), i.e., a combined architecture of an auto-encoder and contrastive loss, outperforms a conventional convolutional neural network (CNN), as well as a convolutional auto-encoder (CAE) without using contrastive loss. Our final contrastive CAE achieves 95.3% unweighted average recall, 86.4% precision, an F1 measure of 88.2%, a sensitivity of 100% and a specificity of 90.6% on a test set of 19 participants with MS who reported symptoms of COVID-19. Each of these participants was paired with a participant with MS with no COVID-19 symptoms.
Submitted 19 April, 2021;
originally announced April 2021.
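For a two-class problem, unweighted average recall (UAR) is the plain mean of the per-class recalls, which are the sensitivity and the specificity. The minimal sketch below applies that definition to the figures reported in the abstract above. The function name is an illustrative assumption, and the other reported metrics (precision, F1) are not re-derived here because the abstract does not give the underlying counts.

```python
def unweighted_average_recall(recalls):
    """Mean of per-class recalls, each class weighted equally."""
    return sum(recalls) / len(recalls)

# Percentages reported in the abstract.
sensitivity = 100.0  # recall on the symptomatic class
specificity = 90.6   # recall on the asymptomatic class

uar = unweighted_average_recall([sensitivity, specificity])
```

The result is about 95.3, consistent with the UAR stated in the abstract.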
-
Remote smartphone-based speech collection: acceptance and barriers in individuals with major depressive disorder
Authors:
Judith Dineley,
Grace Lavelle,
Daniel Leightley,
Faith Matcham,
Sara Siddi,
Maria Teresa Peñarrubia-María,
Katie M. White,
Alina Ivan,
Carolin Oetzmann,
Sara Simblett,
Erin Dawe-Lane,
Stuart Bruce,
Daniel Stahl,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Amos A. Folarin,
Josep Maria Haro,
Til Wykes,
Richard J. B. Dobson,
Vaibhav A. Narayan,
Matthew Hotopf,
Björn W. Schuller,
Nicholas Cummins,
The RADAR-CNS Consortium
Abstract:
The ease of in-the-wild speech recording using smartphones has sparked considerable interest in the combined application of speech, remote measurement technology (RMT) and advanced analytics as a research and healthcare tool. For this to be realised, the acceptability of remote speech collection to the user must be established, in addition to feasibility from an analytical perspective. To understand the acceptance, facilitators, and barriers of smartphone-based speech recording, we invited 384 individuals with major depressive disorder (MDD) from the Remote Assessment of Disease and Relapse - Central Nervous System (RADAR-CNS) research programme in Spain and the UK to complete a survey on their experiences recording their speech. In this analysis, we demonstrate that study participants were more comfortable completing a scripted speech task than a free speech task. For both speech tasks, we found depression severity and country to be significant predictors of comfort. Not seeing smartphone notifications of the scheduled speech tasks, low mood and forgetfulness were the most commonly reported obstacles to providing speech recordings.
Submitted 30 August, 2021; v1 submitted 17 April, 2021;
originally announced April 2021.
-
The Relationship between Major Depression Symptom Severity and Sleep Collected Using a Wristband Wearable Device: Multi-centre Longitudinal Observational Study
Authors:
Yuezhou Zhang,
Amos A Folarin,
Shaoxiong Sun,
Nicholas Cummins,
Rebecca Bendayan,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Petroula Laiou,
Faith Matcham,
Katie White,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Inez Myin-Germeys,
Aki Rintala,
Til Wykes,
Josep Maria Haro,
Brenda WJH Penninx,
Vaibhav A Narayan,
Matthew Hotopf,
Richard JB Dobson
Abstract:
Research in mental health has implicated sleep pathologies in depression. However, the gold standard for sleep assessment, polysomnography, is not suitable for long-term, continuous monitoring of daily sleep, and methods such as sleep diaries rely on subjective recall, which is qualitative and inaccurate. Wearable devices, on the other hand, provide a low-cost and convenient means to monitor sleep in home settings. The main aim of this study was to devise and extract sleep features from data collected using a wearable device, and to analyse their correlation with depressive symptom severity and sleep quality, as measured by the self-assessed 8-item Patient Health Questionnaire (PHQ-8). Daily sleep data were collected passively by Fitbit wristband devices, and depressive symptom severity was self-reported every two weeks by the PHQ-8. The data used in this paper included 2,812 PHQ-8 records from 368 participants recruited from three study sites in the Netherlands, Spain, and the UK. We extracted 21 sleep features from Fitbit data which describe sleep in the following five aspects: sleep architecture, sleep stability, sleep quality, insomnia, and hypersomnia. Linear mixed regression models were used to explore associations between sleep features and depressive symptom severity. The z-test was used to evaluate the significance of the coefficient of each feature. We tested our models on the entire dataset and individually on the data of three different study sites. We identified 16 sleep features that were significantly correlated with the PHQ-8 score on the entire dataset. Associations between sleep features and the PHQ-8 score varied across different sites, possibly due to the difference in the populations.
Submitted 27 September, 2020;
originally announced September 2020.
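The abstract above mentions a z-test on each regression coefficient. A common form of that test (a Wald test, assumed here since the abstract does not spell out the details) divides the estimated coefficient by its standard error and reads a two-sided p-value from the standard normal distribution. The sketch below shows that calculation. The function name and the example numbers are hypothetical and are not results from the paper.

```python
import math

def wald_z_test(coefficient, standard_error):
    """Return the z statistic and two-sided p-value for one coefficient."""
    z = coefficient / standard_error
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical coefficient and standard error, chosen so that z = 1.96.
z, p = wald_z_test(coefficient=0.98, standard_error=0.5)
```

With z = 1.96 the p-value sits right at the conventional 0.05 threshold, which is why that z value is often quoted as the cut-off for significance.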
-
Using smartphones and wearable devices to monitor behavioural changes during COVID-19
Authors:
Shaoxiong Sun,
Amos Folarin,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Nicholas Cummins,
Faith Matcham,
Gloria Dalla Costa,
Sara Simblett,
Letizia Leocani,
Per Soelberg Sørensen,
Mathias Buron,
Ana Isabel Guerrero,
Ana Zabalza,
Brenda WJH Penninx,
Femke Lamers,
Sara Siddi,
Josep Maria Haro,
Inez Myin-Germeys,
Aki Rintala,
Til Wykes,
Vaibhav A. Narayan,
Giancarlo Comi,
Matthew Hotopf
, et al. (1 additional author not shown)
Abstract:
We aimed to explore the utility of the recently developed open-source mobile health platform RADAR-base as a toolbox to rapidly test the effect and response to non-pharmaceutical interventions (NPIs) aimed at limiting the spread of COVID-19. We analysed data extracted from smartphone and wearable devices and managed by RADAR-base from 1062 participants recruited in Italy, Spain, Denmark, the UK, and the Netherlands. We derived nine features on a daily basis including time spent at home, maximum distance travelled from home, maximum number of Bluetooth-enabled nearby devices (as a proxy for physical distancing), step count, average heart rate, sleep duration, bedtime, phone unlock duration, and social app use duration. We performed Kruskal-Wallis tests followed by post-hoc Dunn's tests to assess differences in these features among baseline, pre-, and during-lockdown periods. We also studied behavioural differences by age, gender, body mass index (BMI), and educational background. We were able to quantify expected changes in time spent at home, distance travelled, and the number of nearby Bluetooth-enabled devices between pre- and during-lockdown periods. We saw reduced sociality as measured through mobility features, and increased virtual sociality through phone usage. People were more active on their phones, spending more time using social media apps, particularly around major news events. Furthermore, participants had lower heart rate, went to bed later, and slept more. We also found that young people had longer homestay than older people during lockdown and fewer daily steps. Although there was no significant difference between the high and low BMI groups in time spent at home, the low BMI group walked more. RADAR-base can be used to rapidly quantify and provide a holistic view of behavioural changes in response to public health interventions as a result of infectious outbreaks such as COVID-19.
Submitted 22 July, 2020; v1 submitted 29 April, 2020;
originally announced April 2020.
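The Kruskal-Wallis test named in the abstract above compares several groups by pooling all observations, ranking them, and checking whether the rank sums differ more than chance would allow. The sketch below computes the H statistic from its textbook formula, without the tie correction and without the post-hoc Dunn's test. The function names and the three tiny groups are synthetic illustrations, not study data.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_g^2 / n_g) - 3 (N + 1), no tie correction."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

# Synthetic, fully separated groups standing in for three study periods.
baseline = [1.0, 2.0, 3.0]
pre_lockdown = [4.0, 5.0, 6.0]
during_lockdown = [7.0, 8.0, 9.0]
h_stat = kruskal_wallis_h([baseline, pre_lockdown, during_lockdown])
```

For these three groups the rank sums are 6, 15 and 24, giving H = 7.2. In practice H is compared against a chi-squared distribution with (number of groups - 1) degrees of freedom.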
-
The Ambiguous World of Emotion Representation
Authors:
Vidhyasaharan Sethu,
Emily Mower Provost,
Julien Epps,
Carlos Busso,
Nicholas Cummins,
Shrikanth Narayanan
Abstract:
Artificial intelligence and machine learning systems have demonstrated huge improvements and human-level parity in a range of activities, including speech recognition, face recognition and speaker verification. However, these diverse tasks share a key commonality that is not true in affective computing: the ground truth information that is inferred can be unambiguously represented. This observation provides some hints as to why affective computing, despite having attracted the attention of researchers for years, may still not be considered a mature field of research. A key reason for this is the lack of a common mathematical framework to describe all the relevant elements of emotion representations. This paper proposes the AMBiguous Emotion Representation (AMBER) framework to address this deficiency. AMBER is a unified framework that explicitly describes categorical, numerical and ordinal representations of emotions, including time-varying representations. In addition to explaining the core elements of AMBER, the paper also discusses how some of the commonly employed emotion representation schemes can be viewed through the AMBER framework, and concludes with a discussion of how the proposed framework can be used to reason about current and future affective computing systems.
Submitted 1 September, 2019;
originally announced September 2019.
-
AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition
Authors:
Fabien Ringeval,
Björn Schuller,
Michel Valstar,
Nicholas Cummins,
Roddy Cowie,
Leili Tavabi,
Maximilian Schmitt,
Sina Alisamir,
Shahin Amiriparian,
Eva-Maria Messner,
Siyang Song,
Shuo Liu,
Ziping Zhao,
Adria Mallol-Ragolta,
Zhao Ren,
Mohammad Soleymani,
Maja Pantic
Abstract:
The Audio/Visual Emotion Challenge and Workshop (AVEC 2019) "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition" is the ninth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of various approaches to health and emotion recognition from real-life data. This paper presents the major novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline systems on the three proposed tasks: state-of-mind recognition, depression assessment with AI, and cross-cultural affect sensing, respectively.
Submitted 10 July, 2019;
originally announced July 2019.
-
Voice command generation using Progressive Wavegans
Authors:
Thomas Wiest,
Nicholas Cummins,
Alice Baird,
Simone Hantke,
Judith Dineley,
Björn Schuller
Abstract:
Generative Adversarial Networks (GANs) have become exceedingly popular in a wide range of data-driven research fields, due in part to their success in image generation. Their ability to generate new samples, often from only a small amount of input data, makes them an exciting research tool in areas with limited data resources. One less-explored application of GANs is the synthesis of speech and audio samples. Herein, we propose a set of extensions to the WaveGAN paradigm, a recently proposed approach for sound generation using GANs. The aim of these extensions - preprocessing, Audio-to-Audio generation, skip connections and progressive structures - is to improve the human likeness of synthetic speech samples. Scores from listening tests with 30 volunteers demonstrated a moderate improvement (Cohen's d coefficient of 0.65) in human likeness using the proposed extensions compared to the original WaveGAN approach.
Submitted 13 March, 2019;
originally announced March 2019.
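The listening-test result in the abstract above is reported as Cohen's d, the difference between two group means divided by their pooled standard deviation. The sketch below shows one standard way to compute it. The function name and the two toy score lists are illustrative assumptions, not the paper's listening-test data.

```python
import math
import statistics

def cohens_d(sample_a, sample_b):
    """Standardised mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd

# Toy ratings: means 4 and 3, both with sample variance 4.
extended_scores = [2, 4, 6]
original_scores = [1, 3, 5]
d = cohens_d(extended_scores, original_scores)
```

Here the means differ by 1 and the pooled standard deviation is 2, so d = 0.5. Under Cohen's commonly quoted rule of thumb (about 0.2 small, 0.5 medium, 0.8 large), the reported 0.65 falls in the moderate range, as the abstract describes it.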
-
Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives
Authors:
Jing Han,
Zixing Zhang,
Nicholas Cummins,
Björn Schuller
Abstract:
Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains. As a potentially crucial technique for the development of the next generation of emotional AI systems, we herein provide a comprehensive overview of the application of adversarial training to affective computing and sentiment analysis. Various representative adversarial training algorithms are explained and discussed accordingly, aimed at tackling diverse challenges associated with emotional AI systems. Further, we highlight a range of potential future research directions. We expect that this overview will help facilitate the development of adversarial training for affective computing and sentiment analysis in both the academic and industrial communities.
Submitted 21 September, 2018;
originally announced September 2018.
-
Calibrated Prediction Intervals for Neural Network Regressors
Authors:
Gil Keren,
Nicholas Cummins,
Björn Schuller
Abstract:
Ongoing developments in neural network models are continually advancing the state of the art in terms of system accuracy. However, the predicted labels should not be regarded as the only core output; also important is a well-calibrated estimate of the prediction uncertainty. Such estimates and their calibration are critical in many practical applications. Despite their obvious aforementioned advantage in relation to accuracy, contemporary neural networks can, generally, be regarded as poorly calibrated and as such do not produce reliable output probability estimates. Further, while post-processing calibration solutions can be found in the relevant literature, these tend to be for systems performing classification. In this regard, we herein present two novel methods for acquiring calibrated prediction intervals for neural network regressors: empirical calibration and temperature scaling. In experiments using different regression tasks from the audio and computer vision domains, we find that both our proposed methods are indeed capable of producing calibrated prediction intervals for neural network regressors with any desired confidence level, a finding that is consistent across all datasets and neural network architectures we experimented with. In addition, we derive an additional practical recommendation for producing more accurate calibrated prediction intervals. The source code implementing our proposed methods for computing calibrated prediction intervals is publicly available.
Submitted 7 January, 2019; v1 submitted 26 March, 2018;
originally announced March 2018.
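To make the idea of a calibrated prediction interval concrete, the sketch below uses a generic held-out-residual approach: pick the interval half-width as the empirical quantile of absolute errors on a calibration set, so that the requested fraction of those errors falls inside the interval. This is an assumption-laden illustration and may differ from the empirical calibration and temperature scaling methods the paper proposes. The function names and residual values are hypothetical.

```python
import math

def calibrated_half_width(residuals, confidence):
    """Smallest half-width covering at least `confidence` of the residuals."""
    abs_errors = sorted(abs(r) for r in residuals)
    # Small epsilon guards against floating-point overshoot in confidence * n.
    k = math.ceil(confidence * len(abs_errors) - 1e-9)
    k = min(max(k, 1), len(abs_errors))
    return abs_errors[k - 1]

def coverage(residuals, half_width):
    """Fraction of residuals whose magnitude lies within the half-width."""
    return sum(abs(r) <= half_width for r in residuals) / len(residuals)

# Hypothetical held-out residuals (target minus prediction).
held_out_residuals = [-0.5, 0.1, 0.3, -1.2, 0.8, -0.2, 0.4, -0.9, 0.6, 2.0]
half_width = calibrated_half_width(held_out_residuals, confidence=0.8)
empirical_coverage = coverage(held_out_residuals, half_width)
```

A prediction y would then be reported as the interval [y - half_width, y + half_width]. With these ten residuals, the 80% half-width is 0.9 and exactly eight of the ten errors fall inside it.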
-
auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks
Authors:
Michael Freitag,
Shahin Amiriparian,
Sergey Pugachevskiy,
Nicholas Cummins,
Björn Schuller
Abstract:
auDeep is a Python toolkit for deep unsupervised representation learning from acoustic data. It is based on a recurrent sequence-to-sequence autoencoder approach which can learn representations of time series data by taking into account their temporal dynamics. We provide an extensive command line interface in addition to a Python API for users and developers, both of which are comprehensively documented and publicly available at https://github.com/auDeep/auDeep. Experimental results indicate that auDeep features are competitive with state-of-the-art audio classification.
Submitted 22 December, 2017; v1 submitted 12 December, 2017;
originally announced December 2017.