Search | arXiv e-print repository

Machine learning for classifying and interpreting coherent X-ray speckle patterns

Authors: Mingren Shen, Dina Sheyfer, Troy David Loeffler, Subramanian K. R. S. Sankaranarayanan, G. Brian Stephenson, Maria K. Y. Chan, Dane Morgan

Abstract: Speckle patterns produced by coherent X-ray have a close relationship with the internal structure of materials but quantitative inversion of the relationship to determine structure from speckle patterns is challenging. Here, we investigate the link between coherent X-ray speckle patterns and sample structures using a model 2D disk system and explore the ability of machine learning to learn aspects of the relationship. Specifically, we train a deep neural network to classify the coherent X-ray speckle patterns according to the disk number density in the corresponding structure. It is demonstrated that the classification system is accurate for both non-disperse and disperse size distributions.

Submitted 1 September, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

arXiv:2207.01718

BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model

Authors: Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber

Abstract: Several recent studies have tested the use of transformer language model representations to infer prosodic features for text-to-speech synthesis (TTS). While these studies have explored prosody in general, in this work, we look specifically at the prediction of contrastive focus on personal pronouns. This is a particularly challenging task as it often requires semantic, discursive and/or pragmatic knowledge to predict correctly. We collect a corpus of utterances containing contrastive focus and we evaluate the accuracy of a BERT model, finetuned to predict quantized acoustic prominence features, on these samples. We also investigate how past utterances can provide relevant information for this prediction. Furthermore, we evaluate the controllability of pronoun prominence in a TTS model conditioned on acoustic prominence features.

Submitted 4 July, 2022; originally announced July 2022.

Comments: 5 pages

arXiv:2102.09914

Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input

Authors: Brooke Stephenson, Thomas Hueber, Laurent Girin, Laurent Besacier

Abstract: The prosody of a spoken word is determined by its surrounding context. In incremental text-to-speech synthesis, where the synthesizer produces an output before it has access to the complete input, the full context is often unknown which can result in a loss of naturalness in the synthesized speech. In this paper, we investigate whether the use of predicted future text can attenuate this loss. We compare several test conditions of next future word: (a) unknown (zero-word), (b) language model predicted, (c) randomly predicted and (d) ground-truth. We measure the prosodic features (pitch, energy and duration) and find that predicted text provides significant improvements over a zero-word lookahead, but only slight gains over random-word lookahead. We confirm these results with a perceptive test.

Submitted 15 June, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

Comments: 4 pages

arXiv:2009.02035

What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS

Authors: Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber

Abstract: In incremental text to speech synthesis (iTTS), the synthesizer produces an audio output before it has access to the entire input sentence. In this paper, we study the behavior of a neural sequence-to-sequence TTS system when used in an incremental mode, i.e. when generating speech output for token n, the system has access to n + k tokens from the text sequence. We first analyze the impact of this incremental policy on the evolution of the encoder representations of token n for different values of k (the lookahead parameter). The results show that, on average, tokens travel 88% of the way to their full context representation with a one-word lookahead and 94% after 2 words. We then investigate which text features are the most influential on the evolution towards the final representation using a random forest analysis. The results show that the most salient factors are related to token length. We finally evaluate the effects of lookahead k at the decoder level, using a MUSHRA listening test. This test shows results that contrast with the above high figures: speech synthesis quality obtained with 2 word-lookahead is significantly lower than the one obtained with the full sentence.

Submitted 4 September, 2020; originally announced September 2020.

Comments: 5 pages, 4 figures

arXiv:2007.13454

How Robust are the Estimated Effects of Nonpharmaceutical Interventions against COVID-19?

Authors: Mrinank Sharma, Sören Mindermann, Jan Markus Brauner, Gavin Leech, Anna B. Stephenson, Tomáš Gavenčiak, Jan Kulveit, Yee Whye Teh, Leonid Chindelevitch, Yarin Gal

Abstract: To what extent are effectiveness estimates of nonpharmaceutical interventions (NPIs) against COVID-19 influenced by the assumptions our models make? To answer this question, we investigate 2 state-of-the-art NPI effectiveness models and propose 6 variants that make different structural assumptions. In particular, we investigate how well NPI effectiveness estimates generalise to unseen countries, and their sensitivity to unobserved factors. Models that account for noise in disease transmission compare favourably. We further evaluate how robust estimates are to different choices of epidemiological parameters and data. Focusing on models that assume transmission noise, we find that previously published results are remarkably robust across these variables. Finally, we mathematically ground the interpretation of NPI effectiveness estimates when certain common assumptions do not hold.

Submitted 20 December, 2020; v1 submitted 27 July, 2020; originally announced July 2020.

Journal ref: NeurIPS 2020, Advances in Neural Information Processing Systems 33

arXiv:1807.09632

doi 10.1145/3219104.3229289

PaPaS: A Portable, Lightweight, and Generic Framework for Parallel Parameter Studies

Authors: Eduardo Ponce, Brittany Stephenson, Suzanne Lenhart, Judy Day, Gregory D. Peterson

Abstract: The current landscape of scientific research is widely based on modeling and simulation, typically with complexity in the simulation's flow of execution and parameterization properties. Execution flows are not necessarily straightforward since they may need multiple processing tasks and iterations. Furthermore, parameter and performance studies are common approaches used to characterize a simulation, often requiring traversal of a large parameter space. High-performance computers offer practical resources at the expense of users handling the setup, submission, and management of jobs. This work presents the design of PaPaS, a portable, lightweight, and generic workflow framework for conducting parallel parameter and performance studies. Workflows are defined using parameter files based on keyword-value pairs syntax, thus removing from the user the overhead of creating complex scripts to manage the workflow. A parameter set consists of any combination of environment variables, files, partial file contents, and command line arguments. PaPaS is being developed in Python 3 with support for distributed parallelization using SSH, batch systems, and C++ MPI. The PaPaS framework will run as user processes, and can be used in single/multi-node and multi-tenant computing systems. An example simulation using the BehaviorSpace tool from NetLogo and a matrix multiply using OpenMP are presented as parameter and performance studies, respectively. The results demonstrate that the PaPaS framework offers a simple method for defining and managing parameter studies, while increasing resource utilization.

Submitted 25 July, 2018; originally announced July 2018.

Comments: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22–26, 2018, Pittsburgh, PA, USA

Showing 1–6 of 6 results for author: Stephenson, B