LLM-Mediated Domain-Specific Voice Agents: The Case of TextileBot

Authors: Shu Zhong, Elia Gatti, James Hardwick, Miriam Ribul, Youngjun Cho, Marianna Obrist

Abstract: Developing domain-specific conversational agents (CAs) has been challenged by the need for extensive domain-focused data. Recent advancements in Large Language Models (LLMs) make them a viable option as a knowledge backbone. LLMs' behaviour can be enhanced through prompting, which instructs them to perform downstream tasks in a zero-shot fashion (i.e. without training). To this end, we incorporated structural knowledge into prompts and used prompted LLMs to build domain-specific voice-based CAs. We demonstrate this approach for the specific domain of textile circularity in the form of the design, development, and evaluation of TextileBot. We present the design and development of the voice agent TextileBot and the insights from an in-person user study (N=30) evaluating three variations of TextileBot. We analyse the human-agent interactions, combining quantitative and qualitative methods. Our results suggest that most participants engaged in multi-turn conversations, and that their perceptions of the three agent variations and the respective interactions varied, demonstrating the effectiveness of our prompt-based LLM approach and its potential for building domain-specific CAs. We discuss the dynamics of these interactions and their implications for designing future voice-based CAs.

Submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.06587

Exploring Human-AI Perception Alignment in Sensory Experiences: Do LLMs Understand Textile Hand?

Authors: Shu Zhong, Elia Gatti, Youngjun Cho, Marianna Obrist

Abstract: Aligning the behaviour of large language models (LLMs) with human intent is critical for future AI. An important yet often overlooked aspect of this alignment is perceptual alignment. Perceptual modalities like touch are more multifaceted and nuanced than other sensory modalities such as vision. This work investigates how well LLMs align with human touch experiences using the "textile hand" task. We created a "Guess What Textile" interaction in which participants were given two textile samples -- a target and a reference -- to handle. Without seeing them, participants described the differences between them to the LLM. Using these descriptions, the LLM attempted to identify the target textile by assessing similarity within its high-dimensional embedding space. Our results suggest that a degree of perceptual alignment exists, but that it varies significantly among different textile samples. For example, LLM predictions are well aligned for silk satin, but not for cotton denim. Moreover, participants did not perceive their textile experiences as closely matched by the LLM predictions. This is only the first exploration into perceptual alignment around touch, exemplified through textile hand. We discuss possible sources of this alignment variance, and how better human-AI perceptual alignment can benefit future everyday tasks.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2206.14165

Expressive, Variable, and Controllable Duration Modelling in TTS

Authors: Ammar Abbas, Thomas Merritt, Alexis Moinet, Sri Karlapati, Ewa Muszynska, Simon Slangen, Elia Gatti, Thomas Drugman

Abstract: Duration modelling has become an important research problem once more with the rise of non-attention neural text-to-speech systems. The current approaches largely fall back to relying on previous statistical parametric speech synthesis technology for duration prediction, which poorly models the expressiveness and variability in speech. In this paper, we propose two alternate approaches to improve duration modelling. First, we propose a duration model conditioned on phrasing that improves the predicted durations and provides better modelling of pauses. We show that the duration model conditioned on phrasing improves the naturalness of speech over our baseline duration model. Second, we also propose a multi-speaker duration model called Cauliflow, that uses normalising flows to predict durations that better match the complex target duration distribution. Cauliflow performs on par with our other proposed duration model in terms of naturalness, whilst providing variable durations for the same prompt and variable levels of expressiveness. Lastly, we propose to condition Cauliflow on parameters that provide an intuitive control of the pacing and pausing in the synthesised speech in a novel way.

Submitted 28 June, 2022; originally announced June 2022.

Comments: Accepted to be published in the Proceedings of InterSpeech 2022

arXiv:1603.08700

doi 10.1063/1.4955020

Disentangling nonradiative recombination processes in Ge micro-crystals on Si substrates

Authors: F. Pezzoli, A. Giorgioni, K. Gallacher, F. Isa, P. Biagioni, R. W. Millar, E. Gatti, E. Grilli, E. Bonera, G. Isella, D. J. Paul, Leo Miglio

Abstract: We address nonradiative recombination pathways by leveraging surface passivation and dislocation management in micron-scale arrays of Ge crystals grown on deeply patterned Si substrates. The time decay photoluminescence (PL) at cryogenic temperatures discloses carrier lifetimes approaching 45 ns in band-gap engineered Ge micro-crystals. This investigation provides compelling information about the competitive interplay between the radiative band-edge transitions and the trapping of carriers by dislocations and free surfaces. Furthermore, an in-depth analysis of the temperature dependence of the PL, combined with capacitance data and finite difference time domain modeling, demonstrates the effectiveness of GeO2 in passivating the surface of Ge and thus in enhancing the room temperature PL emission.

Submitted 29 March, 2016; originally announced March 2016.

Journal ref: Appl. Phys. Lett. 108, 262103 (2016)

arXiv:1510.08614

doi 10.1103/PhysRevB.92.201203

Spin-dependent direct gap emission in tensile-strained Ge films on Si substrates

Authors: E. Vitiello, M. Virgilio, A. Giorgioni, J. Frigerio, E. Gatti, S. De Cesari, E. Bonera, E. Grilli, G. Isella, F. Pezzoli

Abstract: The circular polarization of direct gap emission of Ge is studied in optically-excited tensile-strained Ge-on-Si heterostructures as a function of doping and temperature. Owing to the spin-dependent optical selection rules, the radiative recombinations involving strain-split light (cG-LH) and heavy hole (cG-HH) bands are unambiguously resolved. The fundamental cG-LH transition is found to have a low temperature circular polarization degree of about 85% despite an off-resonance excitation of more than 300 meV. By photoluminescence (PL) measurements and tight binding calculations we show that this exceptionally high value is due to the peculiar energy dependence of the optically-induced electron spin population. Finally, our observation of the direct gap doublet clarifies that the light hole contribution, previously considered to be negligible, can dominate the room temperature PL even at low tensile strain values of about 0.2%.

Submitted 29 October, 2015; originally announced October 2015.

Journal ref: Phys. Rev. B 92, 201203(R) (2015)

arXiv:1509.03977

doi 10.1016/j.artmed.2014.07.004

Optimization of anemia treatment in hemodialysis patients via reinforcement learning

Authors: Pablo Escandell-Montero, Milena Chermisi, José M. Martínez-Martínez, Juan Gómez-Sanchis, Carlo Barbieri, Emilio Soria-Olivas, Flavio Mari, Joan Vila-Francés, Andrea Stopper, Emanuele Gatti, José D. Martín-Guerrero

Abstract: Objective: Anemia is a frequent comorbidity in hemodialysis patients that can be successfully treated by administering erythropoiesis-stimulating agents (ESAs). ESA dosing is currently based on clinical protocols that often do not account for the high inter- and intra-individual variability in the patient's response. As a result, the hemoglobin level of some patients oscillates around the target range, which is associated with multiple risks and side-effects. This work proposes a methodology based on reinforcement learning (RL) to optimize ESA therapy. Methods: RL is a data-driven approach for solving sequential decision-making problems that are formulated as Markov decision processes (MDPs). Computing optimal drug administration strategies for chronic diseases is a sequential decision-making problem in which the goal is to find the best sequence of drug doses. MDPs are particularly suitable for modeling these problems due to their ability to capture the uncertainty associated with the outcome of the treatment and the stochastic nature of the underlying process. The RL algorithm employed in the proposed methodology is fitted Q iteration (FQI), which stands out for its ability to make efficient use of data. Results: The experiments reported here are based on a computational model that describes the effect of ESAs on the hemoglobin level. The performance of the proposed method is evaluated and compared with the well-known Q-learning algorithm and with a standard protocol. Simulation results show that the performance of Q-learning is substantially lower than that of FQI and the protocol. Conclusion: Although prospective validation is required, promising results demonstrate the potential of RL to become an alternative to current protocols.

Submitted 14 September, 2015; originally announced September 2015.

Comments: 17 pages, 10 figures

Journal ref: Artificial Intelligence in Medicine, Volume 62, Issue 1, September 2014, Pages 47-60

arXiv:nucl-ex/0506002

doi 10.1140/epjc/s2005-02272-3

$e^+e^-$-pair production in Pb-Au collisions at 158 GeV per nucleon

Authors: G. Agakichiev, H. Appelshauser, J. Bielcikova, R. Baur, P. Braun-Munzinger, A. Cherlin, S. Damjanovic, A. Drees, S. Esumi, U. Faschingbauer, Z. Fraenkel, Ch. Fuchs, E. Gatti, P. Glassel, G. Hering, C. P. de los Heros, P. Holl, Ch. Jung, B. Lenkeit, A. Marin, F. Messer, M. Messer, D. Miskowiec, O. Nix, Yu. Panebrattsev, et al. (26 additional authors not shown)

Abstract: We present the combined results on electron-pair production in 158 GeV/n Pb-Au ($\sqrt{s}$ = 17.2 GeV) collisions taken at the CERN SPS in 1995 and 1996, and give a detailed account of the data analysis. The enhancement over the reference of neutral meson decays amounts to a factor of $2.31 \pm 0.19$ (stat.) $\pm 0.55$ (syst.) $\pm 0.69$ (decays) for semi-central collisions (28% $\sigma/\sigma_{geo}$) when yields are integrated over $m >$ 200 MeV/$c^2$ in invariant mass. The measured yield, its stronger-than-linear scaling with $N_{ch}$, and the dominance of low pair $p_t$ strongly suggest an interpretation as thermal radiation from pion annihilation in the hadronic fireball. The shape of the excess centring at $m \approx$ 500 MeV/$c^2$, however, cannot be described without strong medium modifications of the $\rho$ meson. The results are put into perspective by comparison to predictions from Brown-Rho scaling governed by chiral symmetry restoration, and from the spectral-function many-body treatment in which the approach to the phase boundary is less explicit.

Submitted 1 June, 2005; originally announced June 2005.

Comments: 39 pages, 40 figures, to appear in Eur.Phys.J.C. (2005)

Journal ref: Eur.Phys.J.C41:475-513,2005

Showing 1–7 of 7 results for author: Gatti, E