-
The Design, Implementation, and Performance of the LZ Calibration Systems
Authors:
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
S. Balashov,
J. Bang,
E. E. Barillier,
J. W. Bargemann,
K. Beattie,
T. Benson,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
E. Bishop,
G. M. Blockinger,
B. Boxer,
et al. (179 additional authors not shown)
Abstract:
LUX-ZEPLIN (LZ) is a tonne-scale experiment searching for direct dark matter interactions and other rare events. It is located at the Sanford Underground Research Facility (SURF) in Lead, South Dakota, USA. The core of the LZ detector is a dual-phase xenon time projection chamber (TPC), designed with the primary goal of detecting Weakly Interacting Massive Particles (WIMPs) via their induced low energy nuclear recoils. Surrounding the TPC, two veto detectors immersed in an ultra-pure water tank enable reducing background events to enhance the discovery potential. Intricate calibration systems are purposely designed to precisely understand the responses of these three detector volumes to various types of particle interactions and to demonstrate LZ's ability to discriminate between signals and backgrounds. In this paper, we present a comprehensive discussion of the key features, requirements, and performance of the LZ calibration systems, which play a crucial role in enabling LZ's WIMP-search and its broad science program. The thorough description of these calibration systems, with an emphasis on their novel aspects, is valuable for future calibration efforts in direct dark matter and other rare-event search experiments.
Submitted 20 June, 2024; v1 submitted 2 May, 2024;
originally announced June 2024.
-
A pilot protocol and cohort for the investigation of non-pathological variability in speech
Authors:
Nicholas Cummins,
Lauren L. White,
Zahia Rahman,
Catriona Lucas,
Tian Pan,
Ewan Carr,
Faith Matcham,
Johnny Downs,
Richard J. Dobson,
Judith Dineley
Abstract:
Background: Speech-based biomarkers have potential as a means for regular, objective assessment of symptom severity, remotely and in-clinic in combination with advanced analytical models. However, the complex nature of speech and the often subtle changes associated with health mean that findings are highly dependent on methodological and cohort choices. These are often not reported adequately in studies investigating speech-based health assessment. Objective: To develop and apply an exemplar protocol to generate a pilot dataset of healthy speech with detailed metadata for the assessment of factors in the speech recording-analysis pipeline, including device choice, speech elicitation task and non-pathological variability. Methods: We developed our collection protocol and choice of exemplar speech features based on a thematic literature review. Our protocol includes the elicitation of three different speech types. With a focus towards remote applications, we also chose to collect speech with three different microphone types. We developed a pipeline to extract a set of 14 exemplar speech features. Results: We collected speech from 28 individuals three times in one day, repeated at the same times 8-11 weeks later, and from 25 healthy individuals three times in one week. Participant characteristics collected included sex, age, native language status and voice use habits of the participant. A preliminary set of 14 speech features covering timing, prosody, voice quality, articulation and spectral moment characteristics was extracted, providing a resource of normative values. Conclusions: There are multiple methodological factors involved in the collection, processing and analysis of speech recordings. Consistent reporting and greater harmonisation of study protocols are urgently required to aid the translation of speech processing into clinical research and practice.
Submitted 11 June, 2024;
originally announced June 2024.
-
Probing the Scalar WIMP-Pion Coupling with the first LUX-ZEPLIN data
Authors:
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
S. Balashov,
J. Bang,
E. E. Barillier,
J. W. Bargemann,
K. Beattie,
T. Benson,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
E. J. Bishop,
G. M. Blockinger,
B. Boxer,
et al. (178 additional authors not shown)
Abstract:
Weakly interacting massive particles (WIMPs) may interact with a virtual pion that is exchanged between nucleons. This interaction channel is important to consider in models where the spin-independent isoscalar channel is suppressed. Using data from the first science run of the LUX-ZEPLIN dark matter experiment, containing 60 live days of data in a 5.5~tonne fiducial mass of liquid xenon, we report the results on a search for WIMP-pion interactions. We observe no significant excess and set an upper limit of $1.5\times10^{-46}$~cm$^2$ at a 90\% confidence level for a WIMP mass of 33~GeV/c$^2$ for this interaction.
Submitted 4 June, 2024;
originally announced June 2024.
-
Unveiling the Secrets: How Masking Strategies Shape Time Series Imputation
Authors:
Linglong Qian,
Zina Ibrahim,
Wenjie Du,
Yiyuan Yang,
Richard JB Dobson
Abstract:
In this study, we explore the impact of different masking strategies on time series imputation models. We evaluate the effects of pre-masking versus in-mini-batch masking, normalization timing, and the choice between augmenting and overlaying artificial missingness. Using three diverse datasets, we benchmark eleven imputation models with different missing rates. Our results demonstrate that masking strategies significantly influence imputation accuracy, revealing that more sophisticated and data-driven masking designs are essential for robust model evaluation. We advocate for refined experimental designs and comprehensive disclosure to better simulate real-world patterns, enhancing the practical applicability of imputation models.
Submitted 26 May, 2024;
originally announced May 2024.
-
The Data Acquisition System of the LZ Dark Matter Detector: FADR
Authors:
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
S. Balashov,
J. Bang,
E. E. Barillier,
J. W. Bargemann,
K. Beattie,
T. Benson,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
E. Bishop,
G. M. Blockinger,
B. Boxer,
et al. (190 additional authors not shown)
Abstract:
The Data Acquisition System (DAQ) for the LUX-ZEPLIN (LZ) dark matter detector is described. The signals from 745 PMTs, distributed across three subsystems, are sampled with 100-MHz 32-channel digitizers (DDC-32s). A basic waveform analysis is carried out on the on-board Field Programmable Gate Arrays (FPGAs) to extract information about the observed scintillation and electroluminescence signals. This information is used to determine if the digitized waveforms should be preserved for offline analysis.
The system is designed around the Kintex-7 FPGA. In addition to digitizing the PMT signals and providing basic event selection in real time, the flexibility provided by the use of FPGAs allows us to monitor the performance of the detector and the DAQ in parallel to normal data acquisition.
The hardware and software/firmware of this FPGA-based Architecture for Data acquisition and Realtime monitoring (FADR) are discussed and performance measurements are described.
Submitted 23 May, 2024;
originally announced May 2024.
-
Constraints On Covariant WIMP-Nucleon Effective Field Theory Interactions from the First Science Run of the LUX-ZEPLIN Experiment
Authors:
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
S. Balashov,
J. Bang,
E. E. Barillier,
J. W. Bargemann,
K. Beattie,
T. Benson,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
E. J. Bishop,
G. M. Blockinger,
B. Boxer,
et al. (179 additional authors not shown)
Abstract:
The first science run of the LUX-ZEPLIN (LZ) experiment, a dual-phase xenon time projection chamber operating in the Sanford Underground Research Facility in South Dakota, USA, has reported leading limits on spin-independent WIMP-nucleon interactions and interactions described by a non-relativistic effective field theory (NREFT). Using the same 5.5~t fiducial mass and 60 live days of exposure, we report on the results of a relativistic extension to the NREFT. We present constraints on couplings from covariant interactions arising from the coupling of vector, axial currents, and electric dipole moments of the nucleon to the magnetic and electric dipole moments of the WIMP, which cannot be described by recasting previous results described by an NREFT. Using a profile-likelihood ratio analysis, in an energy region from 0~keV$_\text{nr}$ to 270~keV$_\text{nr}$, we report 90% confidence level exclusion limits on the coupling strength of five interactions in both the isoscalar and isovector bases.
Submitted 26 April, 2024;
originally announced April 2024.
-
Deciphering seasonal depression variations and interplays between weather changes, physical activity, and depression severity in real-world settings: Learnings from RADAR-MDD longitudinal mobile health study
Authors:
Yuezhou Zhang,
Amos A. Folarin,
Yatharth Ranjan,
Nicholas Cummins,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Shaoxiong Sun,
Srinivasan Vairavan,
Faith Matcham,
Carolin Oetzmann,
Sara Siddi,
Femke Lamers,
Sara Simblett,
Til Wykes,
David C. Mohr,
Josep Maria Haro,
Brenda W. J. H. Penninx,
Vaibhav A. Narayan,
Matthew Hotopf,
Richard J. B. Dobson,
Abhishek Pratap,
RADAR-CNS consortium
Abstract:
Prior research has shown that changes in seasons and weather can have a significant impact on depression severity. However, findings are inconsistent across populations, and the interplay between weather, behavior, and depression has not been fully quantified. This study analyzed real-world data from 428 participants (a subset; 68.7% of the cohort) in the RADAR-MDD longitudinal mobile health study to investigate seasonal variations in depression (measured through a remote validated assessment - PHQ-8) and examine the potential interplay between dynamic weather changes, physical activity (monitored via wearables), and depression severity. The clustering of PHQ-8 scores identified four distinct seasonal variations in depression severity: one stable trend and three varying patterns where depression peaks in different seasons. Among these patterns, participants within the stable trend had the oldest average age (p=0.002) and the lowest baseline PHQ-8 score (p=0.003). Mediation analysis assessing the indirect effect of weather on physical activity and depression showed significant differences among participants with different affective responses to weather. These findings illustrate the heterogeneity in individuals' seasonal depression variations and responses to weather, underscoring the necessity for personalized approaches to help understand the impact of environmental factors on the real-world effectiveness of behavioral treatments.
Submitted 17 April, 2024;
originally announced April 2024.
-
New constraints on ultraheavy dark matter from the LZ experiment
Authors:
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
S. Balashov,
J. Bang,
J. W. Bargemann,
A. Baxter,
K. Beattie,
T. Benson,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
E. Bishop,
G. M. Blockinger,
B. Boxer,
C. A. J. Brew,
et al. (174 additional authors not shown)
Abstract:
Searches for dark matter with liquid xenon time projection chamber experiments have traditionally focused on the region of the parameter space that is characteristic of weakly interacting massive particles, ranging from a few GeV/$c^2$ to a few TeV/$c^2$. Models of dark matter with a mass much heavier than this are well motivated by early production mechanisms different from the standard thermal freeze-out, but they have generally been less explored experimentally. In this work, we present a re-analysis of the first science run (SR1) of the LZ experiment, with an exposure of $0.9$ tonne$\times$year, to search for ultraheavy particle dark matter. The signal topology consists of multiple energy deposits in the active region of the detector forming a straight line, from which the velocity of the incoming particle can be reconstructed on an event-by-event basis. Zero events with this topology were observed after applying the data selection calibrated on a simulated sample of signal-like events. New experimental constraints are derived, which rule out previously unexplored regions of the dark matter parameter space of spin-independent interactions beyond a mass of 10$^{17}$ GeV/$c^2$.
Submitted 13 February, 2024;
originally announced February 2024.
-
Longitudinal Assessment of Seasonal Impacts and Depression Associations on Circadian Rhythm Using Multimodal Wearable Sensing
Authors:
Yuezhou Zhang,
Amos A Folarin,
Shaoxiong Sun,
Nicholas Cummins,
Yatharth Ranjan,
Zulqarnain Rashid,
Callum Stewart,
Pauline Conde,
Heet Sankesara,
Petroula Laiou,
Faith Matcham,
Katie M White,
Carolin Oetzmann,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Srinivasan Vairavan,
Inez Myin-Germeys,
David C. Mohr,
Til Wykes,
Josep Maria Haro,
Peter Annas,
Brenda WJH Penninx,
Vaibhav A Narayan,
Matthew Hotopf,
et al. (2 additional authors not shown)
Abstract:
Objective: This study aimed to explore the associations between depression severity and wearable-measured circadian rhythms, accounting for seasonal impacts and quantifying seasonal changes in circadian rhythms. Materials and Methods: Data used in this study came from a large longitudinal mobile health study. Depression severity (measured biweekly using the 8-item Patient Health Questionnaire [PHQ-8]) and behaviors (monitored by Fitbit) were tracked for up to two years. Twelve features were extracted from Fitbit recordings to approximate circadian rhythms. Three nested linear mixed-effects models were employed for each feature: (1) incorporating the PHQ-8 score as an independent variable; (2) adding the season variable; and (3) adding an interaction term between season and the PHQ-8 score. Results: This study analyzed 10,018 PHQ-8 records with Fitbit data from 543 participants. Upon adjusting for seasonal effects, higher PHQ-8 scores were associated with reduced activity, irregular behaviors, and delayed rhythms. Notably, the negative association with daily step counts was stronger in summer and spring than in winter, and the positive association with the onset of the most active continuous 10-hour period was significant only during summer. Furthermore, participants had shorter and later sleep, more activity, and delayed circadian rhythms in summer compared to winter. Discussion and Conclusions: Our findings underscore the significant seasonal impacts on human circadian rhythms and their associations with depression and indicate that wearable-measured circadian rhythms have the potential to be digital biomarkers of depression.
Submitted 5 December, 2023;
originally announced December 2023.
-
First Constraints on WIMP-Nucleon Effective Field Theory Couplings in an Extended Energy Region From LUX-ZEPLIN
Authors:
LZ Collaboration,
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
S. Balashov,
J. Bang,
J. W. Bargemann,
A. Baxter,
K. Beattie,
T. Benson,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
E. Bishop,
G. M. Blockinger,
et al. (175 additional authors not shown)
Abstract:
Following the first science results of the LUX-ZEPLIN (LZ) experiment, a dual-phase xenon time projection chamber operating from the Sanford Underground Research Facility in Lead, South Dakota, USA, we report the initial limits on a model-independent non-relativistic effective field theory describing the complete set of possible interactions of a weakly interacting massive particle (WIMP) with a nucleon. These results utilize the same 5.5 t fiducial mass and 60 live days of exposure collected for the LZ spin-independent and spin-dependent analyses while extending the upper limit of the energy region of interest by a factor of 7.5 to 270 keVnr. No significant excess in this high energy region is observed. Using a profile-likelihood ratio analysis, we report 90% confidence level exclusion limits on the coupling of each individual non-relativistic WIMP-nucleon operator for both elastic and inelastic interactions in the isoscalar and isovector bases.
Submitted 26 February, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
MBD+C: how to incorporate metallic character into atom-based dispersion energy schemes
Authors:
John F. Dobson,
Alberto Ambroselli
Abstract:
The dispersion component of the van der Waals (vdW) interaction in low-dimensional metals is known to exhibit anomalous "Type-C non-additivity" [Int. J. Quantum Chem. 114, 1157 (2014)]. This causes dispersion energy behavior, at asymptotically large separations, that is missed by popular atom-based schemes for dispersion energy calculations. For example, the dispersion interaction energy between parallel metallic nanotubes at separation $D$ falls off asymptotically as approximately $D^{-2}$, whereas current atom-based schemes predict $D^{-5}$ asymptotically. To date it has not been clear whether current atom-based theories also give the dispersion interaction inaccurately at smaller separations for low-dimensional metals.
Here we introduce a new theory that we term "MBD+C". It permits inclusion of Type C effects efficiently within atom-based dispersion energy schemes such as Many Body Dispersion (MBD) and Universal MBD (uMBD). This allows us to investigate asymptotic, intermediate and near-contact regimes with equal accuracy. (The large contact energy of intimate metallic bonding is not primarily governed by dispersion energy and is described well by semi-local density functional theory.) Here we apply a simplified version, "nn-MBD+C", of our new theory to calculate the dispersion interaction for three low-dimensional metallic systems: parallel metallic chains of gold atoms, parallel Li-doped graphene sheets, and parallel (4,4) armchair carbon nanotubes. In addition to giving the correct asymptotic behavior, the new theory seamlessly gives the dispersion energy down to near-contact geometry, where it is similar to MBD but can give up to 15% more dispersion energy than current MBD schemes, in the systems studied so far. This percentage increases with separation until nn-MBD+C dominates MBD at asymptotic separations.
Submitted 22 August, 2023;
originally announced August 2023.
-
Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model
Authors:
Yuezhou Zhang,
Amos A Folarin,
Judith Dineley,
Pauline Conde,
Valeria de Angel,
Shaoxiong Sun,
Yatharth Ranjan,
Zulqarnain Rashid,
Callum Stewart,
Petroula Laiou,
Heet Sankesara,
Linglong Qian,
Faith Matcham,
Katie M White,
Carolin Oetzmann,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Björn W. Schuller,
Srinivasan Vairavan,
Til Wykes,
Josep Maria Haro,
Brenda WJH Penninx,
Vaibhav A Narayan,
Matthew Hotopf,
et al. (3 additional authors not shown)
Abstract:
Language use has been shown to correlate with depression, but large-scale validation is needed. Traditional methods like clinic studies are expensive. So, natural language processing has been employed on social media to predict depression, but limitations remain: a lack of validated labels, biased user samples, and no context. Our study identified 29 topics in 3919 smartphone-collected speech recordings from 265 participants using the Whisper tool and BERTopic model. Six topics with a median PHQ-8 greater than or equal to 10 were regarded as risk topics for depression: No Expectations, Sleep, Mental Therapy, Haircut, Studying, and Coursework. To elucidate the topic emergence and associations with depression, we compared behavioral (from wearables) and linguistic characteristics across identified topics. The correlation between topic shifts and changes in depression severity over time was also investigated, indicating the importance of longitudinally monitoring language use. We also tested the BERTopic model on a similar smaller dataset (356 speech recordings from 57 participants), obtaining some consistent results. In summary, our findings demonstrate that specific speech topics may indicate depression severity. The presented data-driven workflow provides a practical approach to collecting and analyzing large-scale speech data from real-world settings for digital health research.
Submitted 5 September, 2023; v1 submitted 22 August, 2023;
originally announced August 2023.
-
Disease Insight through Digital Biomarkers Developed by Remotely Collected Wearables and Smartphone Data
Authors:
Zulqarnain Rashid,
Amos A Folarin,
Yatharth Ranjan,
Pauline Conde,
Heet Sankesara,
Yuezhou Zhang,
Shaoxiong Sun,
Callum Stewart,
Petroula Laiou,
Richard JB Dobson
Abstract:
Digital Biomarkers and remote patient monitoring can provide valuable and timely insights into how a patient is coping with their condition (disease progression, treatment response, etc.), complementing treatment in traditional healthcare settings. Smartphones with embedded and connected sensors have immense potential for improving healthcare through various apps and mHealth (mobile health) platforms. This capability could enable the development of reliable digital biomarkers from long-term longitudinal data collected remotely from patients. We built an open-source platform, RADAR-base, to support large-scale data collection in remote monitoring studies. RADAR-base is a modern remote data collection platform built around Confluent's Apache Kafka, to support scalability, extensibility, security, privacy and quality of data. It provides support for study design and set-up, active (e.g. PROMs) and passive (e.g. phone sensors, wearable devices and IoT) remote data collection capabilities with feature generation (e.g. behavioural, environmental and physiological markers). The backend enables secure data transmission, and scalable solutions for data storage, management and data access. The platform has successfully collected longitudinal data for various cohorts in a number of disease areas including Multiple Sclerosis, Depression, Epilepsy, ADHD, Alzheimer's, Autism and Lung diseases. Digital biomarkers developed through collected data are providing useful insights into different diseases. RADAR-base provides a modern open-source, community-driven solution for remote monitoring, data collection, and digital phenotyping of physical and mental health diseases. Clinicians can use digital biomarkers to augment their decision making for the prevention, personalisation and early intervention of disease.
Submitted 3 August, 2023;
originally announced August 2023.
-
A search for new physics in low-energy electron recoils from the first LZ exposure
Authors:
The LZ Collaboration,
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
S. Balashov,
J. Bang,
J. W. Bargemann,
A. Baxter,
K. Beattie,
P. Beltrame,
T. Benson,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
G. M. Blockinger,
et al. (178 additional authors not shown)
Abstract:
The LUX-ZEPLIN (LZ) experiment is a dark matter detector centered on a dual-phase xenon time projection chamber. We report searches for new physics appearing through few-keV-scale electron recoils, using the experiment's first exposure of 60 live days and a fiducial mass of 5.5 t. The data are found to be consistent with a background-only hypothesis, and limits are set on models for new physics including solar axion electron coupling, solar neutrino magnetic moment and millicharge, and electron couplings to galactic axion-like particles and hidden photons. Similar limits are set on weakly interacting massive particle (WIMP) dark matter producing signals through ionized atomic states from the Migdal effect.
Submitted 9 September, 2023; v1 submitted 28 July, 2023;
originally announced July 2023.
-
Nuclear recoil response of liquid xenon and its impact on solar 8B neutrino and dark matter searches
Authors:
X. Xiang,
R. J. Gaitskell,
R. Liu,
J. Bang,
J. Xu,
W. H. Lippincott,
J. Aalbers,
J. E. Y. Dobson,
M. Szydagis,
G. R. C. Rischbieter,
N. Parveen,
D. Q. Huang,
I. Olcina,
R. J. James,
J. A. Nikoleyczik
Abstract:
Knowledge of the ionization and scintillation responses of liquid xenon (LXe) to nuclear recoils is crucial for LXe-based dark matter experiments. Current calibrations carry large uncertainties in the low-energy region below $\sim3$ keV$_{nr}$ where signals from dark matter particles of $<$10 GeV/c$^2$ masses are expected. The coherent elastic neutrino-nucleus scattering (CE$ν$NS) by solar $^8$B neutrinos also results in a continuum of nuclear recoil events below 3.0 keV$_{nr}$ (99\% of events), which further complicates low-mass dark matter searches in LXe experiments. In this paper, we describe a method to quantify the uncertainties of low-energy LXe responses using published calibration data, followed by case studies to evaluate the impact of yield uncertainties on ${^8}$B searches and low-mass dark matter sensitivity in a typical ton-scale LXe experiment. We conclude that naively omitting yield uncertainties leads to overly optimistic limits by a factor of $\sim2$ for a 6 GeV/c$^2$ WIMP mass. Future nuclear recoil light yield calibrations could allow experiments to recover this sensitivity and also improve the accuracy of solar ${^8}$B flux measurements.
Submitted 12 April, 2023;
originally announced April 2023.
-
On the special harmonic numbers $H_{\lfloor p/9 \rfloor}$ and $H_{\lfloor p/18 \rfloor}$ modulo $p$
Authors:
John Blythe Dobson
Abstract:
Building on work of Zhi-Hong Sun, we establish congruences for the special harmonic numbers $H_{\lfloor p/9 \rfloor}$ and $H_{\lfloor p/18 \rfloor}$ modulo $p$, which contain respectively three and four distinct arithmetic components. We also obtain a complete determination modulo $p$ of the corresponding families of sums of reciprocals of the type studied by Dilcher and Skula. Applications to the first case of Fermat's Last Theorem are considered.
Submitted 3 February, 2023;
originally announced February 2023.
-
Challenges in Using mHealth Data From Smartphones and Wearable Devices to Predict Depression Symptom Severity: Retrospective Analysis
Authors:
Shaoxiong Sun,
Amos A. Folarin,
Yuezhou Zhang,
Nicholas Cummins,
Rafael Garcia-Dias,
Callum Stewart,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Petroula Laiou,
Heet Sankesara,
Faith Matcham,
Daniel Leightley,
Katie M. White,
Carolin Oetzmann,
Alina Ivan,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Raluca Nica,
Aki Rintala,
David C. Mohr,
Inez Myin-Germeys,
Til Wykes,
Josep Maria Haro
, et al. (6 additional authors not shown)
Abstract:
A number of challenges exist for the analysis of mHealth data: maintaining participant engagement over extended time periods and therefore understanding what constitutes an acceptable threshold of missing data; distinguishing between the cross-sectional and longitudinal relationships for different features to determine their utility in tracking within-individual longitudinal variation or screening individuals at high risk; and understanding the heterogeneity with which depression manifests itself in behavioral patterns quantified by the passive features. From 479 participants with MDD, we extracted 21 features capturing mobility, sleep, and smartphone use. We investigated the impact of the number of days of available data on feature quality using the intraclass correlation coefficient and Bland-Altman analysis. We then examined the nature of the correlation between the 8-item Patient Health Questionnaire (PHQ-8) depression scale (measured every 14 days) and the features using the individual-mean correlation, repeated measures correlation, and linear mixed effects model. Furthermore, we stratified the participants based on their behavioral difference, quantified by the features, between periods of high (depression) and low (no depression) PHQ-8 scores using the Gaussian mixture model. We demonstrated that at least 8 (range 2-12) days were needed for reliable calculation of most of the features in the 14-day time window. We observed that features such as sleep onset time correlated better with PHQ-8 scores cross-sectionally than longitudinally, whereas features such as wakefulness after sleep onset correlated well with PHQ-8 longitudinally but worse cross-sectionally. Finally, we found that participants could be separated into 3 distinct clusters according to their behavioral difference between periods of depression and periods of no depression.
Submitted 14 August, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
Foresight -- Generative Pretrained Transformer (GPT) for Modelling of Patient Timelines using EHRs
Authors:
Zeljko Kraljevic,
Dan Bean,
Anthony Shek,
Rebecca Bendayan,
Harry Hemingway,
Joshua Au Yeung,
Alexander Deng,
Alfie Baston,
Jack Ross,
Esther Idowu,
James T Teo,
Richard J Dobson
Abstract:
Background: Electronic Health Records hold detailed longitudinal information about each patient's health status and general clinical history, a large portion of which is stored within the unstructured text. Existing approaches focus mostly on structured data and a subset of single-domain outcomes. We explore how temporal modelling of patients from free text and structured data, using deep generative transformers, can be used to forecast a wide range of future disorders, substances, procedures or findings. Methods: We present Foresight, a novel transformer-based pipeline that uses named entity recognition and linking tools to convert document text into structured, coded concepts, followed by providing probabilistic forecasts for future medical events such as disorders, substances, procedures and findings. We processed the entire free-text portion from three different hospital datasets totalling 811336 patients covering both physical and mental health. Findings: On tests in two UK hospitals (King's College Hospital, South London and Maudsley) and the US MIMIC-III dataset, precision@10 of 0.68, 0.76 and 0.88 was achieved for forecasting the next disorder in a patient timeline, while precision@10 of 0.80, 0.81 and 0.91 was achieved for forecasting the next biomedical concept. Foresight was also validated on 34 synthetic patient timelines by five clinicians and achieved relevancy of 97% for the top forecasted candidate disorder. As a generative model, it can forecast follow-on biomedical concepts for as many steps as required. Interpretation: Foresight is a general-purpose model for biomedical concept modelling that can be used for real-world risk forecasting, virtual trials and clinical research to study the progression of disorders, simulate interventions and counterfactuals, and educational purposes.
Submitted 24 January, 2023; v1 submitted 13 December, 2022;
originally announced December 2022.
-
Background Determination for the LUX-ZEPLIN (LZ) Dark Matter Experiment
Authors:
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
S. K. Alsum,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
J. Bang,
J. W. Bargemann,
A. Baxter,
K. Beattie,
P. Beltrame,
E. P. Bernard,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
G. M. Blockinger,
B. Boxer
, et al. (178 additional authors not shown)
Abstract:
The LUX-ZEPLIN experiment recently reported limits on WIMP-nucleus interactions from its initial science run, down to $9.2\times10^{-48}$ cm$^2$ for the spin-independent interaction of a 36 GeV/c$^2$ WIMP at 90% confidence level. In this paper, we present a comprehensive analysis of the backgrounds important for this result and for other upcoming physics analyses, including neutrinoless double-beta decay searches and effective field theory interpretations of LUX-ZEPLIN data. We confirm that the in-situ determinations of bulk and fixed radioactive backgrounds are consistent with expectations from the ex-situ assays. The observed background rate after WIMP search criteria were applied was $(6.3\pm0.5)\times10^{-5}$ events/keV$_{ee}$/kg/day in the low-energy region, approximately 60 times lower than the equivalent rate reported by the LUX experiment.
Submitted 17 July, 2023; v1 submitted 30 November, 2022;
originally announced November 2022.
-
First Dark Matter Search Results from the LUX-ZEPLIN (LZ) Experiment
Authors:
J. Aalbers,
D. S. Akerib,
C. W. Akerlof,
A. K. Al Musalhi,
F. Alder,
A. Alqahtani,
S. K. Alsum,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
S. Azadi,
A. J. Bailey,
A. Baker,
J. Balajthy,
S. Balashov,
J. Bang,
J. W. Bargemann,
M. J. Barry,
J. Barthel,
D. Bauer,
A. Baxter
, et al. (322 additional authors not shown)
Abstract:
The LUX-ZEPLIN experiment is a dark matter detector centered on a dual-phase xenon time projection chamber operating at the Sanford Underground Research Facility in Lead, South Dakota, USA. This Letter reports results from LUX-ZEPLIN's first search for weakly interacting massive particles (WIMPs) with an exposure of 60 live days using a fiducial mass of 5.5 t. A profile-likelihood ratio analysis shows the data to be consistent with a background-only hypothesis, setting new limits on spin-independent WIMP-nucleon, spin-dependent WIMP-neutron, and spin-dependent WIMP-proton cross sections for WIMP masses above 9 GeV/c$^2$. The most stringent limit is set for spin-independent scattering at 36 GeV/c$^2$, rejecting cross sections above 9.2$\times 10^{-48}$ cm$^2$ at the 90% confidence level.
Submitted 2 August, 2023; v1 submitted 8 July, 2022;
originally announced July 2022.
-
Predicting Clinical Intent from Free Text Electronic Health Records
Authors:
Kawsar Noor,
Katherine Smith,
Julia Bennett,
Jade OConnell,
Jessica Fisk,
Monika Hunt,
Gary Philippo,
Teresa Xu,
Simon Knight,
Luis Romao,
Richard JB Dobson,
Wai Keong Wong
Abstract:
After a patient consultation, a clinician determines the steps in the management of the patient. A clinician may for example request to see the patient again or refer them to a specialist. Whilst most clinicians will record their intent as "next steps" in the patient's clinical notes, in some cases the clinician may forget to indicate their intent as an order or request, e.g. failure to place the follow-up order. This consequently results in patients becoming lost to follow-up and may in some cases lead to adverse consequences. In this paper we train a machine learning model to detect a clinician's intent to follow up with a patient from the patient's clinical notes. Annotators systematically identified 22 possible types of clinical intent and annotated 3000 bariatric clinical notes. The annotation process revealed a class imbalance in the labeled data and we found that there was only sufficient labeled data to train 11 out of the 22 intents. We used the data to train a BERT-based multilabel classification model and reported the following average accuracy metrics for all intents: macro-precision: 0.91, macro-recall: 0.90, macro-f1: 0.90.
Submitted 25 March, 2022;
originally announced April 2022.
-
Snowmass2021 Cosmic Frontier White Paper: Calibrations and backgrounds for dark matter direct detection
Authors:
Daniel Baxter,
Raymond Bunker,
Sally Shaw,
Shawn Westerdale,
Isaac Arnquist,
Daniel S. Akerib,
Rob Calkins,
Susana Cebrián,
James B. Dent,
Maria Laura di Vacri,
Jim Dobson,
Daniel Egana-Ugrinovic,
Andrew Erlandson,
Chamkaur Ghag,
Carter Hall,
Jeter Hall,
Scott Haselschwardt,
Eric Hoppe,
Chris M. Jackson,
Yonatan Kahn,
Alvine Kamaha,
Mike Kelsey,
Alexander Kish,
Noah Kurinsky,
Matthias Laubenstein
, et al. (26 additional authors not shown)
Abstract:
Future dark matter direct detection experiments will reach unprecedented levels of sensitivity. Achieving this sensitivity will require more precise models of signal and background rates in future detectors. Improving the precision of signal and background modeling goes hand-in-hand with novel calibration techniques that can probe rare processes and lower threshold detector response. The goal of this white paper is to outline community needs to meet the background and calibration requirements of next-generation dark matter direct detection experiments.
Submitted 1 May, 2022; v1 submitted 14 March, 2022;
originally announced March 2022.
-
A Next-Generation Liquid Xenon Observatory for Dark Matter and Neutrino Physics
Authors:
J. Aalbers,
K. Abe,
V. Aerne,
F. Agostini,
S. Ahmed Maouloud,
D. S. Akerib,
D. Yu. Akimov,
J. Akshat,
A. K. Al Musalhi,
F. Alder,
S. K. Alsum,
L. Althueser,
C. S. Amarasinghe,
F. D. Amaro,
A. Ames,
T. J. Anderson,
B. Andrieu,
N. Angelides,
E. Angelino,
J. Angevaare,
V. C. Antochi,
D. Antón Martin,
B. Antunovic,
E. Aprile,
H. M. Araújo
, et al. (572 additional authors not shown)
Abstract:
The nature of dark matter and properties of neutrinos are among the most pressing issues in contemporary particle physics. The dual-phase xenon time-projection chamber is the leading technology to cover the available parameter space for Weakly Interacting Massive Particles (WIMPs), while featuring extensive sensitivity to many alternative dark matter candidates. These detectors can also study neutrinos through neutrinoless double-beta decay and through a variety of astrophysical sources. A next-generation xenon-based detector will therefore be a true multi-purpose observatory to significantly advance particle physics, nuclear physics, astrophysics, solar physics, and cosmology. This review article presents the science cases for such a detector.
Submitted 4 March, 2022;
originally announced March 2022.
-
Associations between depression symptom severity and daily-life gait characteristics derived from long-term acceleration signals in real-world settings
Authors:
Yuezhou Zhang,
Amos A Folarin,
Shaoxiong Sun,
Nicholas Cummins,
Srinivasan Vairavan,
Linglong Qian,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Petroula Laiou,
Heet Sankesara,
Faith Matcham,
Katie M White,
Carolin Oetzmann,
Alina Ivan,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Aki Rintala,
David C Mohr,
Inez Myin-Germeys,
Til Wykes,
Josep Maria Haro,
Brenda WJH Penninx
, et al. (5 additional authors not shown)
Abstract:
Gait is an essential manifestation of depression. Laboratory gait characteristics have been found to be closely associated with depression. However, the gait characteristics of daily walking in real-world scenarios and their relationships with depression are yet to be fully explored. This study aimed to explore associations between depression symptom severity and daily-life gait characteristics derived from acceleration signals in real-world settings. In this study, we used two ambulatory datasets: a public dataset with 71 elder adults' 3-day acceleration signals collected by a wearable device, and a subset of an EU longitudinal depression study with 215 participants and their phone-collected acceleration signals (average 463 hours per participant). We detected participants' gait cycles and force from acceleration signals and extracted 20 statistics-based daily-life gait features to describe the distribution and variance of gait cadence and force over a long-term period corresponding to the self-reported depression score. The gait cadence of faster steps (75th percentile) over a long-term period has a significant negative association with the depression symptom severity of this period in both datasets. Daily-life gait features could significantly improve the goodness of fit of evaluating depression severity relative to laboratory gait patterns and demographics, which was assessed by likelihood-ratio tests in both datasets. This study indicated that the significant links between daily-life walking characteristics and depression symptom severity could be captured by both wearable devices and mobile phones. The gait cadence of faster steps in daily-life walking has the potential to be a biomarker for evaluating depression severity, which may contribute to clinical tools to remotely monitor mental health in real-world settings.
Submitted 29 January, 2022;
originally announced January 2022.
-
Cosmogenic production of $^{37}$Ar in the context of the LUX-ZEPLIN experiment
Authors:
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
S. K. Alsum,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
X. Bai,
A. Baker,
J. Balajthy,
S. Balashov,
J. Bang,
J. W. Bargemann,
D. Bauer,
A. Baxter,
K. Beattie,
E. P. Bernard,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski
, et al. (183 additional authors not shown)
Abstract:
We estimate the amount of $^{37}$Ar produced in natural xenon via cosmic ray-induced spallation, an inevitable consequence of the transportation and storage of xenon on the Earth's surface. We then calculate the resulting $^{37}$Ar concentration in a 10-tonne payload (similar to that of the LUX-ZEPLIN experiment) assuming a representative schedule of xenon purification, storage and delivery to the underground facility. Using the spallation model by Silberberg and Tsao, the sea level production rate of $^{37}$Ar in natural xenon is estimated to be 0.024 atoms/kg/day. Assuming the xenon is successively purified to remove radioactive contaminants in 1-tonne batches at a rate of 1 tonne/month, the average $^{37}$Ar activity after 10 tonnes are purified and transported underground is 0.058-0.090 $μ$Bq/kg, depending on the degree of argon removal during above-ground purification. Such cosmogenic $^{37}$Ar will appear as a noticeable background in the early science data, while decaying with a 35-day half-life. This newly-noticed production mechanism of $^{37}$Ar should be considered when planning for future liquid xenon-based experiments.
Submitted 22 March, 2022; v1 submitted 8 January, 2022;
originally announced January 2022.
-
The utility of wearable devices in assessing ambulatory impairments of people with multiple sclerosis in free-living conditions
Authors:
Shaoxiong Sun,
Amos A Folarin,
Yuezhou Zhang,
Nicholas Cummins,
Shuo Liu,
Callum Stewart,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Petroula Laiou,
Heet Sankesara,
Gloria Dalla Costa,
Letizia Leocani,
Per Soelberg Sørensen,
Melinda Magyari,
Ana Isabel Guerrero,
Ana Zabalza,
Srinivasan Vairavan,
Raquel Bailon,
Sara Simblett,
Inez Myin-Germeys,
Aki Rintala,
Til Wykes,
Vaibhav A Narayan,
Matthew Hotopf
, et al. (3 additional authors not shown)
Abstract:
Multiple sclerosis (MS) is a progressive inflammatory and neurodegenerative disease of the central nervous system affecting over 2.5 million people globally. The in-clinic six-minute walk test (6MWT) is a widely used objective measure to evaluate the progression of MS. Yet, it has limitations such as the need for a clinical visit and a proper walkway. The widespread use of wearable devices capable of depicting patients' activity profiles has the potential to assess the level of MS-induced disability in free-living conditions. In this work, we extracted 96 activity features in different temporal granularities (from minute-level to day-level) and explored their utility in estimating 6MWT scores in a European (Italy, Spain, and Denmark) MS cohort of 337 participants over an average duration of 10 months. We combined these features with participant demographics using three regression models including elastic net, gradient boosted trees and random forest. In addition, we quantified the individual feature contribution using feature importance in these regression models, linear mixed-effects models, generalized estimating equations, and correlation-based feature selection (CFS). The results showed promising estimation performance with $R^2$ of 0.30, which was derived using random forest after CFS. This model was able to distinguish the participants with low disability from those with high disability. Furthermore, we observed that the minute-level (no longer than 8 minutes) step count, particularly those capturing the upper end of the step count distribution, had a stronger association with 6MWT. The use of a walking aid was indicative of ambulatory function measured through 6MWT. This study provides a basis for future investigation into the clinical relevance and utility of wearables in assessing MS progression in free-living conditions.
Submitted 22 December, 2021;
originally announced December 2021.
-
Deployment of a Free-Text Analytics Platform at a UK National Health Service Research Hospital: CogStack at University College London Hospitals
Authors:
Kawsar Noor,
Lukasz Roguski,
Alex Handy,
Roman Klapaukh,
Amos Folarin,
Luis Romao,
Joshua Matteson,
Nathan Lea,
Leilei Zhu,
Wai Keong Wong,
Anoop Shah,
Richard J Dobson
Abstract:
As more healthcare organisations transition to using electronic health record (EHR) systems, it is important for these organisations to maximise the secondary use of their data to support service improvement and clinical research. These organisations will find it challenging to have systems which can mine information from the unstructured data fields in the record (clinical notes, letters, etc.) and, more practically, have such systems interact with all of the hospital's data systems (legacy and current). To tackle this problem at University College London Hospitals, we have deployed an enhanced version of the CogStack platform: an information retrieval platform with natural language processing capabilities which we have configured to process the hospital's existing and legacy records. The platform has improved data ingestion capabilities as well as better tools for natural language processing. To date we have processed over 18 million records and the insights produced from CogStack have informed a number of clinical research use cases at the hospitals.
Submitted 15 August, 2021;
originally announced August 2021.
-
LEGEND-1000 Preconceptual Design Report
Authors:
LEGEND Collaboration,
N. Abgrall,
I. Abt,
M. Agostini,
A. Alexander,
C. Andreoiu,
G. R. Araujo,
F. T. Avignone III,
W. Bae,
A. Bakalyarov,
M. Balata,
M. Bantel,
I. Barabanov,
A. S. Barabash,
P. S. Barbeau,
C. J. Barton,
P. J. Barton,
L. Baudis,
C. Bauer,
E. Bernieri,
L. Bezrukov,
K. H. Bhimani,
V. Biancacci,
E. Blalock,
A. Bolozdynya
, et al. (239 additional authors not shown)
Abstract:
We propose the construction of LEGEND-1000, the ton-scale Large Enriched Germanium Experiment for Neutrinoless $ββ$ Decay. This international experiment is designed to answer one of the highest priority questions in fundamental physics. It consists of 1000 kg of Ge detectors enriched to more than 90% in the $^{76}$Ge isotope operated in a liquid argon active shield at a deep underground laboratory. By combining the lowest background levels with the best energy resolution in the field, LEGEND-1000 will perform a quasi-background-free search and can make an unambiguous discovery of neutrinoless double-beta decay with just a handful of counts at the decay $Q$ value. The experiment is designed to probe this decay with a 99.7%-CL discovery sensitivity in the $^{76}$Ge half-life of $1.3\times10^{28}$ years, corresponding to an effective Majorana mass upper limit in the range of 9-21 meV, to cover the inverted-ordering neutrino mass scale with 10 yr of live time.
Submitted 23 July, 2021;
originally announced July 2021.
-
Estimating Redundancy in Clinical Text
Authors:
Thomas Searle,
Zina Ibrahim,
James Teo,
Richard JB Dobson
Abstract:
The current mode of use of Electronic Health Record (EHR) elicits text redundancy. Clinicians often populate new documents by duplicating existing notes, then updating accordingly. Data duplication can lead to a propagation of errors, inconsistencies and misreporting of care. Therefore, quantifying information redundancy can play an essential role in evaluating innovations that operate on clinical narratives.
This work is a quantitative examination of information redundancy in EHR notes. We present and evaluate two strategies to measure redundancy: an information-theoretic approach and a lexicosyntactic and semantic model. We evaluate the measures by training large Transformer-based language models using clinical text from a large openly available US-based ICU dataset and a large multi-site UK-based Trust. By comparing the information-theoretic content of the trained models with open-domain language models, the language models trained using clinical text are shown to be ~1.5x to ~3x less efficient than open-domain corpora. Manual evaluation shows a high correlation with lexicosyntactic and semantic redundancy, with averages of ~43% to ~65%.
Submitted 26 October, 2021; v1 submitted 25 May, 2021;
originally announced May 2021.
-
Recommended conventions for reporting results from direct dark matter searches
Authors:
D. Baxter,
I. M. Bloch,
E. Bodnia,
X. Chen,
J. Conrad,
P. Di Gangi,
J. E. Y. Dobson,
D. Durnford,
S. J. Haselschwardt,
A. Kaboth,
R. F. Lang,
Q. Lin,
W. H. Lippincott,
J. Liu,
A. Manalaysay,
C. McCabe,
K. D. Mora,
D. Naim,
R. Neilson,
I. Olcina,
M. -C. Piro,
M. Selvi,
B. von Krosigk,
S. Westerdale,
Y. Yang
, et al. (1 additional authors not shown)
Abstract:
The field of dark matter detection is a highly visible and highly competitive one. In this paper, we propose recommendations for presenting dark matter direct detection results particularly suited for weak-scale dark matter searches, although we believe the spirit of the recommendations can apply more broadly to searches for other dark matter candidates, such as very light dark matter or axions. To translate experimental data into a final published result, direct detection collaborations must make a series of choices in their analysis, ranging from how to model astrophysical parameters to how to make statistical inferences based on observed data. While many collaborations follow a standard set of recommendations in some areas, for example the expected flux of dark matter particles (to a large degree based on a paper from Lewin and Smith in 1995), in other areas, particularly in statistical inference, they have taken different approaches, often from result to result by the same collaboration. We set out a number of recommendations on how to apply the now commonly used Profile Likelihood Ratio method to direct detection data. In addition, updated recommendations for the Standard Halo Model astrophysical parameters and relevant neutrino fluxes are provided. The authors of this note include members of the DAMIC, DarkSide, DARWIN, DEAP, LZ, NEWS-G, PandaX, PICO, SBC, SENSEI, SuperCDMS, and XENON collaborations, and these collaborations provided input to the recommendations laid out here. Widespread adoption of these recommendations will make it easier to compare and combine future dark matter results.
Submitted 6 January, 2022; v1 submitted 2 May, 2021;
originally announced May 2021.
-
Projected sensitivity of the LUX-ZEPLIN (LZ) experiment to the two-neutrino and neutrinoless double beta decays of $^{134}$Xe
Authors:
The LUX-ZEPLIN Collaboration,
D. S. Akerib,
A. K. Al Musalhi,
S. K. Alsum,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araujo,
J. E. Armstrong,
M. Arthurs,
X. Bai,
J. Balajthy,
S. Balashov,
J. Bang,
J. W. Bargemann,
D. Bauer,
A. Baxter,
P. Beltrame,
E. P. Bernard,
A. Bernstein,
A. Bhatti,
A. Biekert
, et al. (172 additional authors not shown)
Abstract:
The projected sensitivity of the LUX-ZEPLIN (LZ) experiment to two-neutrino and neutrinoless double beta decay of $^{134}$Xe is presented. LZ is a 10-tonne xenon time projection chamber optimized for the detection of dark matter particles that is expected to start operating in 2021 at the Sanford Underground Research Facility, USA. Its large mass of natural xenon provides an exceptional opportunity to search for the double beta decay of $^{134}$Xe, for which xenon detectors enriched in $^{136}$Xe are less effective. For the two-neutrino decay mode, LZ is predicted to exclude values of the half-life up to 1.7$\times$10$^{24}$ years at 90% confidence level (CL), and has a three-sigma observation potential of 8.7$\times$10$^{23}$ years, approaching the predictions of nuclear models. For the neutrinoless decay mode, LZ is projected to exclude values of the half-life up to 7.3$\times$10$^{24}$ years at 90% CL.
Submitted 22 November, 2021; v1 submitted 26 April, 2021;
originally announced April 2021.
-
Predicting Depressive Symptom Severity through Individuals' Nearby Bluetooth Devices Count Data Collected by Mobile Phones: A Preliminary Longitudinal Study
Authors:
Yuezhou Zhang,
Amos A Folarin,
Shaoxiong Sun,
Nicholas Cummins,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Petroula Laiou,
Faith Matcham,
Carolin Oetzmann,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Aki Rintala,
David C Mohr,
Inez Myin-Germeys,
Til Wykes,
Josep Maria Haro,
Brenda WJH Penninx,
Vaibhav A Narayan,
Peter Annas,
Matthew Hotopf,
Richard JB Dobson
Abstract:
The Bluetooth sensor embedded in mobile phones provides an unobtrusive, continuous, and cost-efficient means to capture individuals' proximity information, such as the nearby Bluetooth devices count (NBDC). The continuous NBDC data can partially reflect individuals' behaviors and status, such as social connections and interactions, working status, mobility, and social isolation and loneliness, which were found to be significantly associated with depression by previous survey-based studies. This paper aims to explore the NBDC data's value in predicting depressive symptom severity as measured via the 8-item Patient Health Questionnaire (PHQ-8). The data used in this paper included 2,886 bi-weekly PHQ-8 records collected from 316 participants recruited from three study sites in the Netherlands, Spain, and the UK as part of the EU RADAR-CNS study. From the NBDC data two weeks prior to each PHQ-8 score, we extracted 49 Bluetooth features, including statistical features and nonlinear features for measuring periodicity and regularity of individuals' life rhythms. Linear mixed-effect models were used to explore associations between Bluetooth features and the PHQ-8 score. We then applied hierarchical Bayesian linear regression models to predict the PHQ-8 score from the extracted Bluetooth features. A number of significant associations were found between Bluetooth features and depressive symptom severity. Compared with commonly used machine learning models, the proposed hierarchical Bayesian linear regression model achieved the best prediction metrics, R2= 0.526, and root mean squared error (RMSE) of 3.891. Bluetooth features can explain an extra 18.8% of the variance in the PHQ-8 score relative to the baseline model without Bluetooth features (R2=0.338, RMSE = 4.547).
Submitted 26 April, 2021;
originally announced April 2021.
-
Fitbeat: COVID-19 Estimation based on Wristband Heart Rate
Authors:
Shuo Liu,
Jing Han,
Estela Laporta Puyal,
Spyridon Kontaxis,
Shaoxiong Sun,
Patrick Locatelli,
Judith Dineley,
Florian B. Pokorny,
Gloria Dalla Costa,
Letizia Leocani,
Ana Isabel Guerrero,
Carlos Nos,
Ana Zabalza,
Per Soelberg Sørensen,
Mathias Buron,
Melinda Magyari,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Amos A Folarin,
Richard JB Dobson,
Raquel Bailón,
Srinivasan Vairavan,
Nicholas Cummins
, et al. (4 additional authors not shown)
Abstract:
This study investigates the potential of deep learning methods to identify individuals with suspected COVID-19 infection using remotely collected heart-rate data. The study utilises data from the ongoing EU IMI RADAR-CNS research project that is investigating the feasibility of wearable devices and smart phones to monitor individuals with multiple sclerosis (MS), depression or epilepsy. As part of the project protocol, heart-rate data was collected from participants using a Fitbit wristband. The presence of COVID-19 in the cohort in this work was either confirmed through a positive swab test, or inferred through the self-reporting of a combination of symptoms including fever, respiratory symptoms, loss of smell or taste, tiredness and gastrointestinal symptoms. Experimental results indicate that our proposed contrastive convolutional auto-encoder (contrastive CAE), i.e., a combined architecture of an auto-encoder and contrastive loss, outperforms a conventional convolutional neural network (CNN), as well as a convolutional auto-encoder (CAE) without using contrastive loss. Our final contrastive CAE achieves 95.3% unweighted average recall, 86.4% precision, an F1 measure of 88.2%, a sensitivity of 100% and a specificity of 90.6% on a test set of 19 participants with MS who reported symptoms of COVID-19. Each of these participants was paired with a participant with MS with no COVID-19 symptoms.
Submitted 19 April, 2021;
originally announced April 2021.
-
Remote smartphone-based speech collection: acceptance and barriers in individuals with major depressive disorder
Authors:
Judith Dineley,
Grace Lavelle,
Daniel Leightley,
Faith Matcham,
Sara Siddi,
Maria Teresa Peñarrubia-María,
Katie M. White,
Alina Ivan,
Carolin Oetzmann,
Sara Simblett,
Erin Dawe-Lane,
Stuart Bruce,
Daniel Stahl,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Amos A. Folarin,
Josep Maria Haro,
Til Wykes,
Richard J. B. Dobson,
Vaibhav A. Narayan,
Matthew Hotopf,
Björn W. Schuller,
Nicholas Cummins,
The RADAR-CNS Consortium
Abstract:
The ease of in-the-wild speech recording using smartphones has sparked considerable interest in the combined application of speech, remote measurement technology (RMT) and advanced analytics as a research and healthcare tool. For this to be realised, the acceptability of remote speech collection to the user must be established, in addition to feasibility from an analytical perspective. To understand the acceptance, facilitators, and barriers of smartphone-based speech recording, we invited 384 individuals with major depressive disorder (MDD) from the Remote Assessment of Disease and Relapse - Central Nervous System (RADAR-CNS) research programme in Spain and the UK to complete a survey on their experiences recording their speech. In this analysis, we demonstrate that study participants were more comfortable completing a scripted speech task than a free speech task. For both speech tasks, we found depression severity and country to be significant predictors of comfort. Not seeing smartphone notifications of the scheduled speech tasks, low mood and forgetfulness were the most commonly reported obstacles to providing speech recordings.
Submitted 30 August, 2021; v1 submitted 17 April, 2021;
originally announced April 2021.
-
Damage accumulation during high temperature fatigue of Ti/SiC$_f$ metal matrix composites under different stress amplitudes
Authors:
Ying Wang,
Xu Xu,
Wenxia Zhao,
Nan Li,
Samuel A. McDonald,
Yuan Chai,
Michael Atkinson,
Katherine J. Dobson,
Stefan Michalik,
Yingwei Fan,
Philip J. Withers,
Xiaorong Zhou,
Timothy L. Burnett
Abstract:
The damage mechanisms and load redistribution of high strength TC17 titanium alloy/unidirectional SiC fibre composite (fibre diameter = 100 $μ$m) under high temperature (350 °C) fatigue cycling have been investigated in situ using synchrotron X-ray computed tomography (CT) and X-ray diffraction (XRD) for high cycle fatigue (HCF) under different stress amplitudes. The three-dimensional morphology of the crack and fibre fractures has been mapped by CT. During stable growth, matrix cracking dominates with the crack deflecting (by 50-100 $μ$m in height) when bypassing bridging fibres. A small number of bridging fibres have fractured close to the matrix crack plane especially under relatively high stress amplitude cycling. Loading to the peak stress led to rapid crack growth accompanied by a burst of fibre fractures. Many of the fibre fractures occurred 50-300 $μ$m from the matrix crack plane during rapid growth, in contrast to that in the stable growth stage, leading to extensive fibre pull-out on the fracture surface. The changes in fibre loading, interfacial stress, and the extent of fibre-matrix debonding in the vicinity of the crack have been mapped for the fatigue cycle and after the rapid growth by high spatial resolution XRD. The fibre/matrix interfacial sliding extends up to 600 $μ$m (in the stable growth zone) or 700 $μ$m (in the rapid growth zone) either side of the crack plane. The direction of interfacial shear stress reverses with the loading cycle, with the maximum frictional sliding stress reaching ~55 MPa in both the stable growth and rapid growth regimes.
Submitted 26 February, 2021;
originally announced February 2021.
-
Projected sensitivities of the LUX-ZEPLIN (LZ) experiment to new physics via low-energy electron recoils
Authors:
The LZ Collaboration,
D. S. Akerib,
A. K. Al Musalhi,
S. K. Alsum,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
X. Bai,
J. Balajthy,
S. Balashov,
J. Bang,
J. W. Bargemann,
D. Bauer,
A. Baxter,
P. Beltrame,
E. P. Bernard,
A. Bernstein,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch
, et al. (172 additional authors not shown)
Abstract:
LUX-ZEPLIN (LZ) is a dark matter detector expected to obtain world-leading sensitivity to weakly interacting massive particles (WIMPs) interacting via nuclear recoils with a ~7-tonne xenon target mass. This manuscript presents sensitivity projections to several low-energy signals of the complementary electron recoil signal type: 1) an effective neutrino magnetic moment and 2) an effective neutrino millicharge, both for pp-chain solar neutrinos, 3) an axion flux generated by the Sun, 4) axion-like particles forming the galactic dark matter, 5) hidden photons, 6) mirror dark matter, and 7) leptophilic dark matter. World-leading sensitivities are expected in each case, a result of the large 5.6t 1000d exposure and low expected rate of electron recoil backgrounds in the $<$100keV energy regime. A consistent signal generation, background model and profile-likelihood analysis framework is used throughout.
Submitted 18 May, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
-
Enhancing the sensitivity of the LUX-ZEPLIN (LZ) dark matter experiment to low energy signals
Authors:
D. S. Akerib,
A. K. Al Musalhi,
S. K. Alsum,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
X. Bai,
J. Balajthy,
S. Balashov,
J. Bang,
J. W. Bargemann,
D. Bauer,
A. Baxter,
P. Beltrame,
E. P. Bernard,
A. Bernstein,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
G. M. Blockinger
, et al. (162 additional authors not shown)
Abstract:
Two-phase xenon detectors, such as that at the core of the forthcoming LZ dark matter experiment, use photomultiplier tubes to sense the primary (S1) and secondary (S2) scintillation signals resulting from particle interactions in their liquid xenon target. This paper describes a simulation study exploring two techniques to lower the energy threshold of LZ to gain sensitivity to low-mass dark matter and astrophysical neutrinos, which will be applicable to other liquid xenon detectors. The energy threshold is determined by the number of detected S1 photons; typically, these must be recorded in three or more photomultiplier channels to avoid dark count coincidences that mimic real signals. To lower this threshold: a) we take advantage of the double photoelectron emission effect, whereby a single vacuum ultraviolet photon has a $\sim20\%$ probability of ejecting two photoelectrons from a photomultiplier tube photocathode; and b) we drop the requirement of an S1 signal altogether, and use only the ionization signal, which can be detected more efficiently. For both techniques we develop signal and background models for the nominal exposure, and explore accompanying systematic effects, including the dependence on the free electron lifetime in the liquid xenon. When incorporating double photoelectron signals, we predict a factor of $\sim 4$ sensitivity improvement to the dark matter-nucleon scattering cross-section at $2.5$ GeV/c$^2$, and a factor of $\sim1.6$ increase in the solar $^8$B neutrino detection rate. Dropping the S1 requirement may allow sensitivity gains of two orders of magnitude in both cases. Finally, we apply these techniques to even lower masses by taking into account the atomic Migdal effect; this could lower the dark matter particle mass threshold to $80$ MeV/c$^2$.
Submitted 21 January, 2021;
originally announced January 2021.
-
A Knowledge Distillation Ensemble Framework for Predicting Short and Long-term Hospitalisation Outcomes from Electronic Health Records Data
Authors:
Zina M Ibrahim,
Daniel Bean,
Thomas Searle,
Honghan Wu,
Anthony Shek,
Zeljko Kraljevic,
James Galloway,
Sam Norton,
James T Teo,
Richard JB Dobson
Abstract:
The ability to perform accurate prognosis of patients is crucial for proactive clinical decision making, informed resource management and personalised care. Existing outcome prediction models suffer from a low recall of infrequent positive outcomes. We present a highly-scalable and robust machine learning framework to automatically predict adversity represented by mortality and ICU admission from time-series vital signs and laboratory results obtained within the first 24 hours of hospital admission. The stacked platform comprises two components: a) an unsupervised LSTM Autoencoder that learns an optimal representation of the time-series, using it to differentiate the less frequent patterns which conclude with an adverse event from the majority patterns that do not, and b) a gradient boosting model, which relies on the constructed representation to refine prediction, incorporating static features of demographics, admission details and clinical summaries. The model is used to assess a patient's risk of adversity over time and provides visual justifications of its prediction based on the patient's static features and dynamic signals. Results of three case studies for predicting mortality and ICU admission show that the model outperforms all existing outcome prediction models, achieving PR-AUC of 0.891 (95% CI: 0.878-0.969) in predicting mortality in ICU and general ward settings and 0.908 (95% CI: 0.870-0.935) in predicting ICU admission.
Submitted 11 June, 2021; v1 submitted 18 November, 2020;
originally announced November 2020.
-
Multi-domain Clinical Natural Language Processing with MedCAT: the Medical Concept Annotation Toolkit
Authors:
Zeljko Kraljevic,
Thomas Searle,
Anthony Shek,
Lukasz Roguski,
Kawsar Noor,
Daniel Bean,
Aurelie Mascio,
Leilei Zhu,
Amos A Folarin,
Angus Roberts,
Rebecca Bendayan,
Mark P Richardson,
Robert Stewart,
Anoop D Shah,
Wai Keong Wong,
Zina Ibrahim,
James T Teo,
Richard JB Dobson
Abstract:
Electronic health records (EHR) contain large volumes of unstructured text, requiring the application of Information Extraction (IE) technologies to enable clinical analysis. We present the open-source Medical Concept Annotation Toolkit (MedCAT) that provides: a) a novel self-supervised machine learning algorithm for extracting concepts using any concept vocabulary including UMLS/SNOMED-CT; b) a feature-rich annotation interface for customising and training IE models; and c) integrations to the broader CogStack ecosystem for vendor-agnostic health system deployment. We show improved performance in extracting UMLS concepts from open datasets (F1:0.448-0.738 vs 0.429-0.650). Further real-world validation demonstrates SNOMED-CT extraction at 3 large London hospitals with self-supervised training over ~8.8B words from ~17M clinical records and further fine-tuning with ~6K clinician annotated examples. We show strong transferability (F1 > 0.94) between hospitals, datasets, and concept types indicating cross-domain EHR-agnostic utility for accelerated clinical and research use cases.
Submitted 25 March, 2021; v1 submitted 2 October, 2020;
originally announced October 2020.
-
The Relationship between Major Depression Symptom Severity and Sleep Collected Using a Wristband Wearable Device: Multi-centre Longitudinal Observational Study
Authors:
Yuezhou Zhang,
Amos A Folarin,
Shaoxiong Sun,
Nicholas Cummins,
Rebecca Bendayan,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Petroula Laiou,
Faith Matcham,
Katie White,
Femke Lamers,
Sara Siddi,
Sara Simblett,
Inez Myin-Germeys,
Aki Rintala,
Til Wykes,
Josep Maria Haro,
Brenda WJH Penninx,
Vaibhav A Narayan,
Matthew Hotopf,
Richard JB Dobson
Abstract:
Research in mental health has implicated sleep pathologies in depression. However, the gold standard for sleep assessment, polysomnography, is not suitable for long-term, continuous monitoring of daily sleep, and methods such as sleep diaries rely on subjective recall, which is qualitative and inaccurate. Wearable devices, on the other hand, provide a low-cost and convenient means to monitor sleep in home settings. The main aim of this study was to devise and extract sleep features from data collected using a wearable device, and analyse their correlation with depressive symptom severity and sleep quality, as measured by the self-assessed Patient Health Questionnaire 8-item (PHQ-8). Daily sleep data were collected passively by Fitbit wristband devices, and depressive symptom severity was self-reported every two weeks by the PHQ-8. The data used in this paper included 2,812 PHQ-8 records from 368 participants recruited from three study sites in the Netherlands, Spain, and the UK. We extracted 21 sleep features from Fitbit data which describe sleep in the following five aspects: sleep architecture, sleep stability, sleep quality, insomnia, and hypersomnia. Linear mixed regression models were used to explore associations between sleep features and depressive symptom severity. The z-test was used to evaluate the significance of the coefficient of each feature. We tested our models on the entire dataset and individually on the data of three different study sites. We identified 16 sleep features that were significantly correlated with the PHQ-8 score on the entire dataset. Associations between sleep features and the PHQ-8 score varied across different sites, possibly due to the difference in the populations.
Submitted 27 September, 2020;
originally announced September 2020.
-
Measuring the effect of Non-Pharmaceutical Interventions (NPIs) on mobility during the COVID-19 pandemic using global mobility data
Authors:
Berber T Snoeijer,
Mariska Burger,
Shaoxiong Sun,
Richard JB Dobson,
Amos A Folarin
Abstract:
The implementation of governmental Non-Pharmaceutical Interventions (NPIs) has been the primary means of controlling the spread of the COVID-19 disease. The intended effect of these NPIs has been to reduce mobility. A strong reduction in mobility is believed to have a positive effect on the reduction of COVID-19 transmission by limiting the opportunity for the virus to spread in the population. Due to the huge costs of implementing these NPIs, it is essential to have a good understanding of their efficacy. Using global mobility data, released by Apple and Google, and ACAPS NPI data, we investigate the proportional contribution of NPIs on i) size of the change (magnitude) of transition between pre- and post-lockdown mobility levels and ii) rate (gradient) of this transition. Using generalized linear models to find the best fit model we found similar results using Apple or Google data. NPIs found to impact the magnitude of the change in mobility were: Lockdown measures (Apple, Google Retail and Recreation (RAR) and Google Transit and Stations (TS)), declaring a state of emergency (Apple, Google RAR and Google TS), closure of businesses and public services (Google RAR) and school closures (Apple). Using cluster analysis and chi square tests we found that closure of businesses and public services, school closures and limiting public gatherings as well as border closures and international flight suspensions were closely related. The implementation of lockdown measures and limiting public gatherings had the greatest effect on the rate of mobility change. In conclusion, we were able to quantitatively assess the efficacy of NPIs in reducing mobility, which enables us to understand their fine grained effects in a timely manner and therefore facilitate well-informed and cost-effective interventions.
Submitted 21 September, 2020;
originally announced September 2020.
-
Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset
Authors:
Thomas Searle,
Zina Ibrahim,
Richard JB Dobson
Abstract:
Clinical coding is currently a labour-intensive, error-prone, but critical administrative process whereby hospital patient episodes are manually assigned codes by qualified staff from large, standardised taxonomic hierarchies of codes. Automating clinical coding has a long history in NLP research and has recently seen novel developments setting new state-of-the-art results. A popular dataset used in this task is MIMIC-III, a large intensive care database that includes clinical free text notes and associated codes. We argue for the reconsideration of the validity of MIMIC-III's assigned codes, which are often treated as gold-standard, especially when MIMIC-III has not undergone secondary validation. This work presents an open-source, reproducible experimental methodology for assessing the validity of codes derived from EHR discharge summaries. We exemplify the methodology with MIMIC-III discharge summaries and show the most frequently assigned codes in MIMIC-III are under-coded by up to 35%.
Submitted 12 June, 2020;
originally announced June 2020.
-
The LUX-ZEPLIN (LZ) radioactivity and cleanliness control programs
Authors:
D. S. Akerib,
C. W. Akerlof,
D. Yu. Akimov,
A. Alqahtani,
S. K. Alsum,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
A. Arbuckle,
J. E. Armstrong,
M. Arthurs,
H. Auyeung,
S. Aviles,
X. Bai,
A. J. Bailey,
J. Balajthy,
S. Balashov,
J. Bang,
M. J. Barry,
D. Bauer,
P. Bauer,
A. Baxter,
J. Belle,
P. Beltrame,
J. Bensinger
, et al. (365 additional authors not shown)
Abstract:
LUX-ZEPLIN (LZ) is a second-generation direct dark matter experiment with spin-independent WIMP-nucleon scattering sensitivity above $1.4 \times 10^{-48}$ cm$^{2}$ for a WIMP mass of 40 GeV/c$^{2}$ and a 1000 d exposure. LZ achieves this sensitivity through a combination of a large 5.6 t fiducial volume, active inner and outer veto systems, and radio-pure construction using materials with inherently low radioactivity content. The LZ collaboration performed an extensive radioassay campaign over a period of six years to inform material selection for construction and provide an input to the experimental background model against which any possible signal excess may be evaluated. The campaign and its results are described in this paper. We present assays of dust and radon daughters depositing on the surface of components as well as cleanliness controls necessary to maintain background expectations through detector construction and assembly. Finally, examples from the campaign to highlight fixed contaminant radioassays for the LZ photomultiplier tubes, quality control and quality assurance procedures through fabrication, radon emanation measurements of major sub-systems, and bespoke detector systems to assay scintillator are presented.
Submitted 28 February, 2022; v1 submitted 3 June, 2020;
originally announced June 2020.
-
Using smartphones and wearable devices to monitor behavioural changes during COVID-19
Authors:
Shaoxiong Sun,
Amos Folarin,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Nicholas Cummins,
Faith Matcham,
Gloria Dalla Costa,
Sara Simblett,
Letizia Leocani,
Per Soelberg Sørensen,
Mathias Buron,
Ana Isabel Guerrero,
Ana Zabalza,
Brenda WJH Penninx,
Femke Lamers,
Sara Siddi,
Josep Maria Haro,
Inez Myin-Germeys,
Aki Rintala,
Til Wykes,
Vaibhav A. Narayan,
Giancarlo Comi,
Matthew Hotopf
, et al. (1 additional authors not shown)
Abstract:
We aimed to explore the utility of the recently developed open-source mobile health platform RADAR-base as a toolbox to rapidly test the effect and response to NPIs aimed at limiting the spread of COVID-19. We analysed data extracted from smartphone and wearable devices and managed by the RADAR-base from 1062 participants recruited in Italy, Spain, Denmark, the UK, and the Netherlands. We derived nine features on a daily basis including time spent at home, maximum distance travelled from home, maximum number of Bluetooth-enabled nearby devices (as a proxy for physical distancing), step count, average heart rate, sleep duration, bedtime, phone unlock duration, and social app use duration. We performed Kruskal-Wallis tests followed by post-hoc Dunns tests to assess differences in these features among baseline, pre-, and during-lockdown periods. We also studied behavioural differences by age, gender, body mass index (BMI), and educational background. We were able to quantify expected changes in time spent at home, distance travelled, and the number of nearby Bluetooth-enabled devices between pre- and during-lockdown periods. We saw reduced sociality as measured through mobility features, and increased virtual sociality through phone usage. People were more active on their phones, spending more time using social media apps, particularly around major news events. Furthermore, participants had lower heart rate, went to bed later, and slept more. We also found that young people had longer homestay than older people during lockdown and fewer daily steps. Although there was no significant difference between the high and low BMI groups in time spent at home, the low BMI group walked more. RADAR-base can be used to rapidly quantify and provide a holistic view of behavioural changes in response to public health interventions as a result of infectious outbreaks such as COVID-19.
Submitted 22 July, 2020; v1 submitted 29 April, 2020;
originally announced April 2020.
-
Simulations of Events for the LUX-ZEPLIN (LZ) Dark Matter Experiment
Authors:
The LUX-ZEPLIN Collaboration,
D. S. Akerib,
C. W. Akerlof,
A. Alqahtani,
S. K. Alsum,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
X. Bai,
J. Balajthy,
S. Balashov,
J. Bang,
D. Bauer,
A. Baxter,
J. Bensinger,
E. P. Bernard,
A. Bernstein,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
K. E. Boast
, et al. (173 additional authors not shown)
Abstract:
The LUX-ZEPLIN dark matter search aims to achieve a sensitivity to the WIMP-nucleon spin-independent cross-section down to (1--2)$\times10^{-12}$\,pb at a WIMP mass of 40 GeV/$c^2$. This paper describes the simulations framework that, along with radioactivity measurements, was used to support this projection, and also to provide mock data for validating reconstruction and analysis software. Of particular note are the event generators, which allow us to model the background radiation, and the detector response physics used in the production of raw signals, which can be converted into digitized waveforms similar to data from the operational detector. Inclusion of the detector response allows us to process simulated data using the same analysis routines as developed to process the experimental data.
Submitted 23 June, 2020; v1 submitted 25 January, 2020;
originally announced January 2020.
-
Projected sensitivity of the LUX-ZEPLIN experiment to the $0νββ$ decay of $^{136}$Xe
Authors:
D. S. Akerib,
C. W. Akerlof,
A. Alqahtani,
S. K. Alsum,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
X. Bai,
J. Balajthy,
S. Balashov,
J. Bang,
A. Baxter,
J. Bensinger,
E. P. Bernard,
A. Bernstein,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
K. E. Boast,
B. Boxer,
P. Brás,
J. H. Buckley
, et al. (167 additional authors not shown)
Abstract:
The LUX-ZEPLIN (LZ) experiment will enable a neutrinoless double beta decay search in parallel to the main science goal of discovering dark matter particle interactions. We report the expected LZ sensitivity to $^{136}$Xe neutrinoless double beta decay, taking advantage of the significant ($>$600 kg) $^{136}$Xe mass contained within the active volume of LZ without isotopic enrichment. After 1000 live-days, the median exclusion sensitivity to the half-life of $^{136}$Xe is projected to be 1.06$\times$10$^{26}$ years (90% confidence level), similar to existing constraints. We also report the expected sensitivity of a possible subsequent dedicated exposure using 90% enrichment with $^{136}$Xe at 1.06$\times$10$^{27}$ years.
Submitted 24 April, 2020; v1 submitted 9 December, 2019;
originally announced December 2019.
-
The LUX-ZEPLIN (LZ) Experiment
Authors:
The LZ Collaboration,
D. S. Akerib,
C. W. Akerlof,
D. Yu. Akimov,
A. Alqahtani,
S. K. Alsum,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
A. Arbuckle,
J. E. Armstrong,
M. Arthurs,
H. Auyeung,
X. Bai,
A. J. Bailey,
J. Balajthy,
S. Balashov,
J. Bang,
M. J. Barry,
J. Barthel,
D. Bauer,
P. Bauer,
A. Baxter,
J. Belle,
P. Beltrame
, et al. (357 additional authors not shown)
Abstract:
We describe the design and assembly of the LUX-ZEPLIN experiment, a direct detection search for cosmic WIMP dark matter particles. The centerpiece of the experiment is a large liquid xenon time projection chamber sensitive to low energy nuclear recoils. Rejection of backgrounds is enhanced by a Xe skin veto detector and by a liquid scintillator Outer Detector loaded with gadolinium for efficient neutron capture and tagging. LZ is located in the Davis Cavern at the 4850' level of the Sanford Underground Research Facility in Lead, South Dakota, USA. We describe the major subsystems of the experiment and its key design features and requirements.
Submitted 3 November, 2019; v1 submitted 20 October, 2019;
originally announced October 2019.
-
Extending light WIMP searches to single scintillation photons in LUX
Authors:
D. S. Akerib,
S. Alsum,
H. M. Araújo,
X. Bai,
A. J. Bailey,
J. Balajthy,
A. Baxter,
P. Beltrame,
E. P. Bernard,
A. Bernstein,
T. P. Biesiadzinski,
E. M. Boulton,
B. Boxer,
P. Brás,
S. Burdin,
D. Byram,
S. B. Cahn,
M. C. Carmona-Benitez,
C. Chan,
A. A. Chiller,
C. Chiller,
A. Currie,
J. E. Cutter,
L. de Viveiros,
A. Dobi
, et al. (100 additional authors not shown)
Abstract:
We present a novel analysis technique for liquid xenon time projection chambers that allows for a lower threshold by relying on events with a prompt scintillation signal consisting of single detected photons. The energy threshold of the LUX dark matter experiment is primarily determined by the smallest scintillation response detectable, which previously required a 2-fold coincidence signal in its photomultiplier arrays, enforced in data analysis. The technique presented here exploits the double photoelectron emission effect observed in some photomultiplier models at vacuum ultraviolet wavelengths. We demonstrate this analysis using an electron recoil calibration dataset and place new constraints on the spin-independent scattering cross section of weakly interacting massive particles (WIMPs) down to 2.5 GeV/c$^2$ WIMP mass using the 2013 LUX dataset. This new technique is promising to enhance light WIMP and astrophysical neutrino searches in next-generation liquid xenon experiments.
Submitted 27 December, 2019; v1 submitted 14 July, 2019;
originally announced July 2019.
-
Measurement of the Gamma Ray Background in the Davis Cavern at the Sanford Underground Research Facility
Authors:
D. S. Akerib,
C. W. Akerlof,
S. K. Alsum,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
X. Bai,
J. Balajthy,
S. Balashov,
A. Baxter,
E. P. Bernard,
A. Biekert,
T. P. Biesiadzinski,
K. E. Boast,
B. Boxer,
P. Brás,
J. H. Buckley,
V. V. Bugaev,
S. Burdin,
J. K. Busenitz,
C. Carels,
D. L. Carlsmith,
M. C. Carmona-Benitez,
M. Cascella
, et al. (142 additional authors not shown)
Abstract:
Deep underground environments are ideal for low background searches due to the attenuation of cosmic rays by passage through the earth. However, they are affected by backgrounds from $γ$-rays emitted by $^{40}$K and the $^{238}$U and $^{232}$Th decay chains in the surrounding rock. The LUX-ZEPLIN (LZ) experiment will search for dark matter particle interactions with a liquid xenon TPC located within the Davis campus at the Sanford Underground Research Facility, Lead, South Dakota, at the 4,850-foot level. In order to characterise the cavern background, in-situ $γ$-ray measurements were taken with a sodium iodide detector in various locations and with lead shielding. The integral count rates (0--3300~keV) varied from 596~Hz to 1355~Hz for unshielded measurements, corresponding to a total flux in the cavern of $1.9\pm0.4$~$γ~$cm$^{-2}$s$^{-1}$. The resulting activity in the walls of the cavern can be characterised as $220\pm60$~Bq/kg of $^{40}$K, $29\pm15$~Bq/kg of $^{238}$U, and $13\pm3$~Bq/kg of $^{232}$Th.
Submitted 14 November, 2019; v1 submitted 3 April, 2019;
originally announced April 2019.
-
Efficiently Reusing Natural Language Processing Models for Phenotype-Mention Identification in Free-text Electronic Medical Records: Methodology Study
Authors:
Honghan Wu,
Karen Hodgson,
Sue Dyson,
Katherine I. Morley,
Zina M. Ibrahim,
Ehtesham Iqbal,
Robert Stewart,
Richard JB Dobson,
Cathie Sudlow
Abstract:
Background: Many efforts have been put into the use of automated approaches, such as natural language processing (NLP), to mine or extract data from free-text medical records to construct comprehensive patient profiles for delivering better health-care. Reusing NLP models in new settings, however, remains cumbersome - requiring validation and/or retraining on new data iteratively to achieve convergent results.
Objective: The aim of this work is to minimize the effort involved in reusing NLP models on free-text medical records.
Methods: We formally define and analyse the model adaptation problem in phenotype-mention identification tasks. We identify "duplicate waste" and "imbalance waste", which collectively impede efficient model reuse. We propose a phenotype embedding based approach to minimize these sources of waste without the need for labelled data from new settings.
Results: We conduct experiments on data from a large mental health registry to reuse NLP models in four phenotype-mention identification tasks. The proposed approach can choose the best model for a new task, identifying up to 76% (duplicate waste), i.e. phenotype mentions without the need for validation and model retraining, and with very good performance (93-97% accuracy). It can also provide guidance for validating and retraining the selected model for novel language patterns in new tasks, saving around 80% (imbalance waste), i.e. the effort required in "blind" model-adaptation approaches.
Conclusions: Adapting pre-trained NLP models for new tasks can be more efficient and effective if the language pattern landscapes of old settings and new settings can be made explicit and comparable. Our experiments show that the phenotype-mention embedding approach is an effective way to model language patterns for phenotype-mention identification tasks and that its use can guide efficient NLP model reuse.
Submitted 23 October, 2019; v1 submitted 10 March, 2019;
originally announced March 2019.