Using autoencoders and deep transfer learning to determine the stellar parameters of 286 CARMENES M dwarfs

Authors: P. Mas-Buitrago, A. González-Marcos, E. Solano, V. M. Passegger, M. Cortés-Contreras, J. Ordieres-Meré, A. Bello-García, J. A. Caballero, A. Schweitzer, H. M. Tabernero, D. Montes, C. Cifuentes

Abstract: Deep learning (DL) techniques are a promising approach among the set of methods used in the ever-challenging determination of stellar parameters in M dwarfs. In this context, transfer learning could play an important role in mitigating uncertainties in the results due to the synthetic gap (i.e. difference in feature distributions between observed and synthetic data). We propose a feature-based deep transfer learning (DTL) approach based on autoencoders to determine stellar parameters from high-resolution spectra. Using this methodology, we provide new estimations for the effective temperature, surface gravity, metallicity, and projected rotational velocity for 286 M dwarfs observed by the CARMENES survey. Using autoencoder architectures, we projected synthetic PHOENIX-ACES spectra and observed CARMENES spectra onto a new feature space of lower dimensionality in which the differences between the two domains are reduced. We used this low-dimensional new feature space as input for a convolutional neural network to obtain the stellar parameter determinations. We performed an extensive analysis of our estimated stellar parameters, ranging from 3050 to 4300 K, 4.7 to 5.1 dex, and -0.53 to 0.25 dex for Teff, logg, and [Fe/H], respectively. Our results are broadly consistent with those of recent studies using CARMENES data, with a systematic deviation in our Teff scale towards hotter values for estimations above 3750 K. Furthermore, our methodology mitigates the deviations in metallicity found in previous DL techniques due to the synthetic gap.
We consolidated a DTL-based methodology to determine stellar parameters in M dwarfs from synthetic spectra, with no need for high-quality measurements involved in the knowledge transfer. These results suggest the great potential of DTL to mitigate the differences in feature distributions between the observations and the PHOENIX-ACES spectra.

Submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted in A&A

arXiv:2405.00011

doi 10.1016/j.mechrescom.2024.104275

A Multiscale Fracture Model using Peridynamic Enrichment of Finite Elements within an Adaptive Partition of Unity: Experimental Validation

Authors: Matthias Birner, Patrick Diehl, Robert Lipton, Marc Alexander Schweitzer

Abstract: Partition of unity methods (PUM) are of domain decomposition type and provide the opportunity for multiscale and multiphysics numerical modeling. Within the PUM global-local enrichment scheme [1, 2] different physical models can exist to capture multiscale behavior. For instance, we consider classical linear elasticity globally and local zones where fractures occur. The elastic fields of the undamaged media provide appropriate boundary data for local PD simulations on a subdomain containing the crack tip to grow the crack path. Once the updated crack path is found, the elastic field in the body and surrounding the crack is updated using PUM basis with appropriate enrichment near the crack. The subdomain for the PD simulation is chosen to include the current crack tip as well as nearby features that will influence crack growth. This paper is part II of this series and validates the combined PD/PUM simulator against the experimental results presented in [3]. The presented results show that we can attain good agreement between experimental and simulation data with a local PD subdomain that is moving with the crack tip and adaptively chosen size.

Submitted 1 February, 2024; originally announced May 2024.

arXiv:2310.17499

The IMS Toucan System for the Blizzard Challenge 2023

Authors: Florian Lux, Julia Koch, Sarina Meyer, Thomas Bott, Nadja Schauffler, Pavel Denisov, Antje Schweitzer, Ngoc Thang Vu

Abstract: For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021. Our approach entails a rule-based text-to-phoneme processing system that includes rule-based disambiguation of homographs in the French language. It then transforms the phonemes to spectrograms as intermediate representations using a fast and efficient non-autoregressive synthesis architecture based on Conformer and Glow. A GAN based neural vocoder that combines recent state-of-the-art approaches converts the spectrogram to the final wave. We carefully designed the data processing, training, and inference procedures for the challenge data. Our system identifier is G. Open source code and demo are available.

Submitted 26 October, 2023; originally announced October 2023.

Comments: Published at the Blizzard Challenge Workshop 2023, colocated with the Speech Synthesis Workshop 2023, a satellite event of Interspeech 2023

arXiv:2304.04472

Modeling Speaker-Listener Interaction for Backchannel Prediction

Authors: Daniel Ortega, Sarina Meyer, Antje Schweitzer, Ngoc Thang Vu

Abstract: We present our latest findings on backchannel modeling novelly motivated by the canonical use of the minimal responses Yeah and Uh-huh in English and their correspondent tokens in German, and the effect of encoding the speaker-listener interaction. Backchanneling theories emphasize the active and continuous role of the listener in the course of the conversation, their effects on the speaker's subsequent talk, and the consequent dynamic speaker-listener interaction. Therefore, we propose a neural-based acoustic backchannel classifier on minimal responses by processing acoustic features from the speaker speech, capturing and imitating listeners' backchanneling behavior, and encoding speaker-listener interaction. Our experimental results on the Switchboard and GECO datasets reveal that in almost all tested scenarios the speaker or listener behavior embeddings help the model make more accurate backchannel predictions. More importantly, a proper interaction encoding strategy, i.e., combining the speaker and listener embeddings, leads to the best performance on both datasets in terms of F1-score.

Submitted 10 April, 2023; originally announced April 2023.

Comments: Published in IWSDS 2023

arXiv:2206.06302

doi 10.1007/978-3-319-46079-6_2

Closing the Performance Gap with Modern C++

Authors: Thomas Heller, Hartmut Kaiser, Patrick Diehl, Dietmar Fey, Marc Alexander Schweitzer

Abstract: On the way to Exascale, programmers face the increasing challenge of having to support multiple hardware architectures from the same code base. At the same time, portability of code and performance are increasingly difficult to achieve as hardware architectures are becoming more and more diverse. Today's heterogeneous systems often include two or more completely distinct and incompatible hardware execution models, such as GPGPU's, SIMD vector units, and general purpose cores which conventionally have to be programmed using separate tool chains representing non-overlapping programming models. The recent revival of interest in the industry and the wider community for the C++ language has spurred a remarkable amount of standardization proposals and technical specifications in the arena of concurrency and parallelism. This recently includes an increasing amount of discussion around the need for a uniform, higher-level abstraction and programming model for parallelism in the C++ standard targeting heterogeneous and distributed computing. Such an abstraction should perfectly blend with existing, already standardized language and library features, but should also be generic enough to support future hardware developments. In this paper, we present the results from developing such a higher-level programming abstraction for parallelism in C++ which aims at enabling code and performance portability over a wide range of architectures and for various types of parallelism.
We present and compare performance data obtained from running the well-known STREAM benchmark ported to our higher level C++ abstraction with the corresponding results from running it natively. We show that our abstractions enable performance at least as good as the comparable base-line benchmarks while providing a uniform programming API on all compared target architectures.

Submitted 30 May, 2022; originally announced June 2022.

arXiv:2202.12132

"splink" is happy and "phrouth" is scary: Emotion Intensity Analysis for Nonsense Words

Authors: Valentino Sabbatino, Enrica Troiano, Antje Schweitzer, Roman Klinger

Abstract: People associate affective meanings to words - "death" is scary and sad while "party" is connotated with surprise and joy. This raises the question if the association is purely a product of the learned affective imports inherent to semantic meanings, or is also an effect of other features of words, e.g., morphological and phonological patterns. We approach this question with an annotation-based analysis leveraging nonsense words. Specifically, we conduct a best-worst scaling crowdsourcing study in which participants assign intensity scores for joy, sadness, anger, disgust, fear, and surprise to 272 non-sense words and, for comparison of the results to previous work, to 68 real words. Based on this resource, we develop character-level and phonology-based intensity regressors. We evaluate them on both nonsense words and real words (making use of the NRC emotion intensity lexicon of 7493 words), across six emotion categories. The analysis of our data reveals that some phonetic patterns show clear differences between emotion intensities. For instance, s as a first phoneme contributes to joy, sh to surprise, p as last phoneme more to disgust than to anger and fear. In the modelling experiments, a regressor trained on real words from the NRC emotion intensity lexicon shows a higher performance (r = 0.17) than regressors that aim at learning the emotion connotation purely from nonsense words.
We conclude that humans do associate affective meaning to words based on surface patterns, but also based on similarities to existing words ("juy" to "joy", or "flike" to "like").

Submitted 16 May, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

Comments: WASSA 2022 at ACL 2022, published at https://aclanthology.org/2022.wassa-1.4/ Please cite by using https://aclanthology.org/2022.wassa-1.4.bib

arXiv:2111.04893

Mitigating domain shift in AI-based tuberculosis screening with unsupervised domain adaptation

Authors: Nishanjan Ravin, Sourajit Saha, Alan Schweitzer, Ameena Elahi, Farouk Dako, Daniel Mollura, David Chapman

Abstract: We demonstrate that Domain Invariant Feature Learning (DIFL) can improve the out-of-domain generalizability of a deep learning Tuberculosis screening algorithm. It is well known that state of the art deep learning algorithms often have difficulty generalizing to unseen data distributions due to "domain shift". In the context of medical imaging, this could lead to unintended biases such as the inability to generalize from one patient population to another. We analyze the performance of a ResNet-50 classifier for the purposes of Tuberculosis screening using the four most popular public datasets with geographically diverse sources of imagery. We show that without domain adaptation, ResNet-50 has difficulty in generalizing between imaging distributions from a number of public Tuberculosis screening datasets with imagery from geographically distributed regions. However, with the incorporation of DIFL, the out-of-domain performance is greatly enhanced. Analysis criteria includes a comparison of accuracy, sensitivity, specificity and AUC over both the baseline, as well as the DIFL enhanced algorithms. We conclude that DIFL improves generalizability of Tuberculosis screening while maintaining acceptable accuracy over the source domain imagery when applied across a variety of public datasets.

Submitted 8 November, 2021; originally announced November 2021.

arXiv:2108.02336

doi 10.1016/j.advengsoft.2022.103360

A Fracture Multiscale Model for Peridynamic enrichment within the Partition of Unity Method

Authors: Matthias Birner, Patrick Diehl, Robert Lipton, Marc Alexander Schweitzer

Abstract: Partition of unity methods (PUM) are of domain decomposition type and provide the opportunity for multiscale and multiphysics numerical modeling. Different physical models can exist within a PUM scheme for handling problems with zones of linear elasticity and zones where fractures occur. Here, the peridynamic (PD) model is used in regions of fracture and smooth PUM is used in the surrounding linear elastic media. The method is a so-called global-local enrichment strategy. The elastic fields of the undamaged media provide appropriate boundary data for the localized PD simulations. The first steps for a combined PD/PUM simulator are presented. In part I of this series, we show that the local PD approximation can be utilized to enrich the global PUM approximation to capture the true material response with high accuracy efficiently. Test problems are provided demonstrating the validity and potential of this numerical approach.

Submitted 2 February, 2023; v1 submitted 4 August, 2021; originally announced August 2021.

arXiv:1805.05237

Effects of Word Embeddings on Neural Network-based Pitch Accent Detection

Authors: Sabrina Stehwien, Ngoc Thang Vu, Antje Schweitzer

Abstract: Pitch accent detection often makes use of both acoustic and lexical features based on the fact that pitch accents tend to correlate with certain words. In this paper, we extend a pitch accent detector that involves a convolutional neural network to include word embeddings, which are state-of-the-art vector representations of words. We examine the effect these features have on within-corpus and cross-corpus experiments on three English datasets. The results show that while word embeddings can improve the performance in corpus-dependent experiments, they also have the potential to make generalization to unseen data more challenging.

Submitted 7 June, 2018; v1 submitted 14 May, 2018; originally announced May 2018.

Comments: This is an updated version of the paper that has been accepted at Speech Prosody 2018 and published on the ISCA archive. The updates consist of minor corrections that do not change the main conclusions in this work

arXiv:1212.2991

Accelerating Inference: towards a full Language, Compiler and Hardware stack

Authors: Shawn Hershey, Jeff Bernstein, Bill Bradley, Andrew Schweitzer, Noah Stein, Theo Weber, Ben Vigoda

Abstract: We introduce Dimple, a fully open-source API for probabilistic modeling. Dimple allows the user to specify probabilistic models in the form of graphical models, Bayesian networks, or factor graphs, and performs inference (by automatically deriving an inference engine from a variety of algorithms) on the model. Dimple also serves as a compiler for GP5, a hardware accelerator for inference.

Submitted 12 December, 2012; originally announced December 2012.

Showing 1–10 of 10 results for author: Schweitzer, A