
Showing 1–50 of 50 results for author: Georgiou, P

  1. arXiv:2406.15346  [pdf, other]

    cs.LG cs.AI

    Privacy Preserved Blood Glucose Level Cross-Prediction: An Asynchronous Decentralized Federated Learning Approach

    Authors: Chengzhe Piao, Taiyu Zhu, Yu Wang, Stephanie E Baldeweg, Paul Taylor, Pantelis Georgiou, Jiahao Sun, Jun Wang, Kezhi Li

    Abstract: Newly diagnosed Type 1 Diabetes (T1D) patients often struggle to obtain effective Blood Glucose (BG) prediction models due to the lack of sufficient BG data from Continuous Glucose Monitoring (CGM), presenting a significant "cold start" problem in patient care. Utilizing population models to address this challenge is a potential solution, but collecting patient data for training population models… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  2. A Multimodal Approach to Device-Directed Speech Detection with Large Language Models

    Authors: Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

    Abstract: Interactions with virtual assistants typically start with a predefined trigger phrase followed by the user command. To make interactions with the assistant more intuitive, we explore whether it is feasible to drop the requirement that users must begin each command with a trigger phrase. We explore this task in three ways: First, we train classifiers using only acoustic information obtained from th… ▽ More

    Submitted 26 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.03632

  3. arXiv:2403.03712  [pdf, ps, other]

    cs.LO cs.SC

    Saturating Sorting without Sorts

    Authors: Pamina Georgiou, Márton Hajdu, Laura Kovács

    Abstract: We present a first-order theorem proving framework for establishing the correctness of functional programs implementing sorting algorithms with recursive data structures. We formalize the semantics of recursive programs in many-sorted first-order logic and integrate sortedness/permutation properties within our first-order formalization. Rather than focusing on sorting lists of elements of specif… ▽ More

    Submitted 6 March, 2024; originally announced March 2024.

  4. arXiv:2402.16230  [pdf, other]

    cs.LG cs.AI

    GARNN: An Interpretable Graph Attentive Recurrent Neural Network for Predicting Blood Glucose Levels via Multivariate Time Series

    Authors: Chengzhe Piao, Taiyu Zhu, Stephanie E Baldeweg, Paul Taylor, Pantelis Georgiou, Jiahao Sun, Jun Wang, Kezhi Li

    Abstract: Accurate prediction of future blood glucose (BG) levels can effectively improve BG management for people living with diabetes, thereby reducing complications and improving quality of life. The state of the art of BG prediction has been achieved by leveraging advanced deep learning methods to model multi-modal data, i.e., sensor data and self-reported event data, organised as multi-variate time ser… ▽ More

    Submitted 25 February, 2024; originally announced February 2024.

  5. arXiv:2402.00176  [pdf, other]

    quant-ph cs.ET cs.LG

    Adversarial Quantum Machine Learning: An Information-Theoretic Generalization Analysis

    Authors: Petros Georgiou, Sharu Theresa Jose, Osvaldo Simeone

    Abstract: In a manner analogous to their classical counterparts, quantum classifiers are vulnerable to adversarial attacks that perturb their inputs. A promising countermeasure is to train the quantum classifier by adopting an attack-aware, or adversarial, loss function. This paper studies the generalization properties of quantum classifiers that are adversarially trained against bounded-norm white-box atta… ▽ More

    Submitted 15 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: 10 pages, 2 figures. Fixed a typo (wrong inequality sign) in lemma 2 and extended to cover the whole range of values of p. Added reference on inequalities in trace norms

  6. arXiv:2312.03632  [pdf, other]

    cs.SD cs.LG eess.AS

    Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models

    Authors: Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

    Abstract: Interactions with virtual assistants typically start with a trigger phrase followed by a command. In this work, we explore the possibility of making these interactions more natural by eliminating the need for a trigger phrase. Our goal is to determine whether a user addressed the virtual assistant based on signals obtained from the streaming audio recorded by the device microphone. We address this… ▽ More

    Submitted 6 December, 2023; originally announced December 2023.

  7. arXiv:2312.01146  [pdf]

    stat.AP stat.ME

    Bayesian models are better than frequentist models in identifying differences in small datasets comprising phonetic data

    Authors: Georgios P. Georgiou

    Abstract: While many studies have previously conducted direct comparisons between results obtained from frequentist and Bayesian models, our research introduces a novel perspective by examining these models in the context of a small dataset comprising phonetic data. Specifically, we employed mixed-effects models and Bayesian regression models to explore differences between monolingual and bilingual populati… ▽ More

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 15 pages, 3 figures

  8. arXiv:2311.15054  [pdf]

    cs.CL cs.LG

    Detection of developmental language disorder in Cypriot Greek children using a neural network algorithm

    Authors: Georgios P. Georgiou, Elena Theodorou

    Abstract: Children with developmental language disorder (DLD) encounter difficulties in acquiring various language structures. Early identification and intervention are crucial to prevent negative long-term outcomes impacting the academic, social, and emotional development of children. The study aims to develop an automated method for the identification of DLD using artificial intelligence, specifically a n… ▽ More

    Submitted 10 February, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: 15 pages, 3 figures, journal article

  9. arXiv:2302.09044  [pdf, other]

    cs.HC

    From User Perceptions to Technical Improvement: Enabling People Who Stutter to Better Use Speech Recognition

    Authors: Colin Lea, Zifang Huang, Lauren Tooley, Jaya Narain, Dianna Yee, Panayiotis Georgiou, Tien Dung Tran, Jeffrey P. Bigham, Leah Findlater

    Abstract: Consumer speech recognition systems do not work as well for many people with speech differences, such as stuttering, relative to the rest of the general population. However, what is not clear is the degree to which these systems do not work, how they can be improved, or how much people want to use them. In this paper, we first address these questions using results from a 61-person survey from people… ▽ More

    Submitted 27 February, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: CHI 2023

  10. arXiv:2212.09351  [pdf, other]

    physics.optics physics.app-ph

    Polarization Modulation in Quantum-Dot Spin-VCSELs for Ultrafast Data Transmission

    Authors: Christos Tselios, Panagiotis Georgiou, Christina Politi, Antonio Hurtado, Dimitris Alexandropoulos

    Abstract: Spin-Vertical Cavity Surface Emitting Lasers (spin-VCSELs) are undergoing increasing research effort for new paradigms in high-speed optical communications and photon-enabled computing. To date, research in spin-VCSELs has mostly focused on Quantum-Well (QW) devices. However, novel Quantum-Dot (QD) spin-VCSELs offer enhanced parameter controls permitting the effective, dynamical and ultrafast mani… ▽ More

    Submitted 19 December, 2022; originally announced December 2022.

  11. Developing moral AI to support antimicrobial decision making

    Authors: William J Bolton, Cosmin Badea, Pantelis Georgiou, Alison Holmes, Timothy M Rawson

    Abstract: Artificial intelligence (AI) assisting with antimicrobial prescribing raises significant moral questions. Utilising ethical frameworks alongside AI-driven systems, while considering infection specific complexities, can support moral decision making to tackle antimicrobial resistance.

    Submitted 12 August, 2022; originally announced August 2022.

    ACM Class: I.2.1

  12. arXiv:2202.03587  [pdf, other]

    eess.AS cs.SD eess.SP

    CALM: Contrastive Aligned Audio-Language Multirate and Multimodal Representations

    Authors: Vin Sachidananda, Shao-Yen Tseng, Erik Marchi, Sachin Kajarekar, Panayiotis Georgiou

    Abstract: Deriving multimodal representations of audio and lexical inputs is a central problem in Natural Language Understanding (NLU). In this paper, we present Contrastive Aligned Audio-Language Multirate and Multimodal Representations (CALM), an approach for learning multimodal representations using contrastive and multirate information inherent in audio and lexical inputs. The proposed model aligns acou… ▽ More

    Submitted 7 February, 2022; originally announced February 2022.

  13. arXiv:2106.11759  [pdf, other]

    eess.AS cs.AI cs.CL cs.CV cs.LG cs.SD

    Analysis and Tuning of a Voice Assistant System for Dysfluent Speech

    Authors: Vikramjit Mitra, Zifang Huang, Colin Lea, Lauren Tooley, Sarah Wu, Darren Botten, Ashwini Palekar, Shrinath Thelapurath, Panayiotis Georgiou, Sachin Kajarekar, Jefferey Bigham

    Abstract: Dysfluencies and variations in speech pronunciation can severely degrade speech recognition performance, and for many individuals with moderate-to-severe speech disorders, voice operated systems do not work. Current speech recognition systems are trained primarily with data from fluent speakers and as a consequence do not generalize well to speech with dysfluencies such as sound or word repetition… ▽ More

    Submitted 18 June, 2021; originally announced June 2021.

    Comments: 5 pages, 1 page reference, 2 figures

  14. arXiv:2104.03899  [pdf, other]

    eess.AS cs.AI cs.SD

    Unsupervised Speech Representation Learning for Behavior Modeling using Triplet Enhanced Contextualized Networks

    Authors: Haoqi Li, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

    Abstract: Speech encodes a wealth of information related to human behavior and has been used in a variety of automated behavior recognition tasks. However, extracting behavioral information from speech remains challenging, in part due to inadequate training data resources stemming from the often low occurrence frequencies of specific behavioral patterns. Moreover, supervised behavioral modeling typically r… ▽ More

    Submitted 1 April, 2021; originally announced April 2021.

  15. arXiv:2102.11265  [pdf, other]

    eess.AS cs.CL cs.SD

    Automated Evaluation Of Psychotherapy Skills Using Speech And Language Technologies

    Authors: Nikolaos Flemotomos, Victor R. Martinez, Zhuohao Chen, Karan Singla, Victor Ardulov, Raghuveer Peri, Derek D. Caperton, James Gibson, Michael J. Tanana, Panayiotis Georgiou, Jake Van Epps, Sarah P. Lord, Tad Hirsch, Zac E. Imel, David C. Atkins, Shrikanth Narayanan

    Abstract: With the growing prevalence of psychological interventions, it is vital to have measures which rate the effectiveness of psychological care to assist in training, supervision, and quality assurance of services. Traditionally, quality assessment is addressed by human raters who evaluate recorded sessions along specific dimensions, often codified through constructs relevant to the approach and domai… ▽ More

    Submitted 27 March, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: new version has an updated title

  16. Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords

    Authors: Prashanth Gurunath Shivakumar, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: Word vector representations enable machines to encode human language for spoken language understanding and processing. Confusion2vec, motivated from human speech production and perception, is a word vector representation which encodes ambiguities present in human spoken language in addition to semantics and syntactic information. Confusion2vec provides a robust spoken language representation by co… ▽ More

    Submitted 19 February, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

  17. arXiv:2008.03550  [pdf, other]

    cs.CY cs.HC

    A novel hand-held interface supporting the self-management of Type 1 diabetes

    Authors: Robert Spence, Chukwuma Uduku, Kezhi Li, Nick Oliver, Pantelis Georgiou

    Abstract: The paper describes the interaction design of a hand-held interface supporting the self-management of Type 1 diabetes. It addresses well-established clinical and human-computer interaction requirements. The design exploits three opportunities. One is associated with visible context, whether conspicuous or inconspicuous. A second arises from the design freedom made possible by the user's anticipa… ▽ More

    Submitted 8 August, 2020; originally announced August 2020.

  18. arXiv:2008.01387  [pdf, ps, other]

    cs.LO cs.SC

    Trace Logic for Inductive Loop Reasoning

    Authors: Pamina Georgiou, Bernhard Gleiss, Laura Kovács

    Abstract: We propose trace logic, an instance of many-sorted first-order logic, to automate the partial correctness verification of programs containing loops. Trace logic generalizes semantics of program locations and captures loop semantics by encoding properties at arbitrary timepoints and loop iterations. We guide and automate inductive loop reasoning in trace logic by using generic trace lemmas capturin… ▽ More

    Submitted 6 August, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: Related Version: A compact, peer-reviewed version of this paper will be available in the conference proceedings of Formal Methods of Computer-Aided Design (FMCAD) 2020

  19. arXiv:2005.09059  [pdf, other]

    eess.SP cs.LG q-bio.QM

    Basal Glucose Control in Type 1 Diabetes using Deep Reinforcement Learning: An In Silico Validation

    Authors: Taiyu Zhu, Kezhi Li, Pau Herrero, Pantelis Georgiou

    Abstract: People with Type 1 diabetes (T1D) require regular exogenous infusion of insulin to maintain their blood glucose concentration in a therapeutically adequate target range. Although the artificial pancreas and continuous glucose monitoring have been proven to be effective in achieving closed-loop control, significant challenges still remain due to the high complexity of glucose dynamics and limitatio… ▽ More

    Submitted 18 May, 2020; originally announced May 2020.

    Journal ref: IEEE journal of biomedical and health informatics 2020

  20. Speaker Diarization with Lexical Information

    Authors: Tae Jin Park, Kyu J. Han, Jing Huang, Xiaodong He, Bowen Zhou, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: This work presents a novel approach for speaker diarization to leverage lexical information provided by automatic speech recognition. We propose a speaker diarization system that can incorporate word-level speaker turn probabilities with speaker embeddings into a speaker clustering process to improve the overall diarization accuracy. To integrate lexical and acoustic information in a comprehensive… ▽ More

    Submitted 13 April, 2020; originally announced April 2020.

    Journal ref: Interspeech 2019, 391-395

  21. arXiv:1911.11927  [pdf, ps, other]

    eess.AS

    Automatic prediction of suicidal risk in military couples using multimodal interaction cues from couples conversations

    Authors: Sandeep Nallan Chakravarthula, Md Nasir, Shao-Yen Tseng, Haoqi Li, Tae Jin Park, Brian Baucom, Craig J. Bryan, Shrikanth Narayanan, Panayiotis Georgiou

    Abstract: Suicide is a major societal challenge globally, with a wide range of risk factors, from individual health, psychological and behavioral elements to socio-economic aspects. Military personnel, in particular, are at especially high risk. Crisis resources, while helpful, are often constrained by access to clinical visits or therapist availability, especially when needed in a timely manner. There have… ▽ More

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: submitted to ICASSP 2020

  22. arXiv:1911.09515  [pdf, other]

    cs.CL

    An analysis of observation length requirements for machine understanding of human behaviors from spoken language

    Authors: Sandeep Nallan Chakravarthula, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

    Abstract: The task of quantifying human behavior by observing interaction cues is an important and useful one across a range of domains in psychological research and practice. Machine learning-based approaches typically perform this task by first estimating behavior based on cues within an observation window, such as a fixed number of words, and then aggregating the behavior over all the windows in that int… ▽ More

    Submitted 26 August, 2020; v1 submitted 21 November, 2019; originally announced November 2019.

    Comments: converted to CSL format, restructured presentation of analysis and methodology, moved finer details to Appendix, enlarged figures and text, fixed typos and notational inconsistency

  23. Linguistically Aided Speaker Diarization Using Speaker Role Information

    Authors: Nikolaos Flemotomos, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: Speaker diarization relies on the assumption that speech segments corresponding to a particular speaker are concentrated in a specific region of the speaker space; a region which represents that speaker's identity. These identities are not known a priori, so a clustering algorithm is typically employed, which is traditionally based solely on audio. Under noisy conditions, however, such an approach… ▽ More

    Submitted 5 February, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

    Comments: from v1: restructured Introduction and Background, added experimental results with ASR text and language-only baseline

  24. arXiv:1911.01533  [pdf, other]

    eess.AS cs.LG cs.SD

    Speaker-invariant Affective Representation Learning via Adversarial Training

    Authors: Haoqi Li, Ming Tu, Jing Huang, Shrikanth Narayanan, Panayiotis Georgiou

    Abstract: Representation learning for speech emotion recognition is challenging due to labeled data sparsity issue and lack of gold standard references. In addition, there is much variability from input speech signals, human subjective perception of the signals and emotion label ambiguity. In this paper, we propose a machine learning framework to obtain speech emotion representations by limiting the effect… ▽ More

    Submitted 12 August, 2021; v1 submitted 4 November, 2019; originally announced November 2019.

    Comments: Accepted by ICASSP 2020; 5 pages

  25. arXiv:1910.10287  [pdf, other]

    cs.CL cs.LG eess.AS

    RNN based Incremental Online Spoken Language Understanding

    Authors: Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: Spoken Language Understanding (SLU) typically comprises an automatic speech recognition (ASR) followed by a natural language understanding (NLU) module. The two modules process signals in a blocking sequential fashion, i.e., the NLU often has to wait for the ASR to finish processing on an utterance basis, potentially leading to high latencies that render the spoken interaction less natural. In… ▽ More

    Submitted 30 November, 2020; v1 submitted 22 October, 2019; originally announced October 2019.

    Comments: Accepted for publication at IEEE Spoken Language Technology Workshop 2021

  26. arXiv:1910.04059  [pdf, other]

    q-bio.QM cs.LG

    A Dual-Hormone Closed-Loop Delivery System for Type 1 Diabetes Using Deep Reinforcement Learning

    Authors: Taiyu Zhu, Kezhi Li, Pantelis Georgiou

    Abstract: We propose a dual-hormone delivery strategy by exploiting deep reinforcement learning (RL) for people with Type 1 Diabetes (T1D). Specifically, double dilated recurrent neural networks (RNN) are used to learn the hormone delivery strategy, trained by a variant of Q-learning, whose inputs are raw data of glucose & meal carbohydrate and outputs are dual-hormone (insulin and glucagon) delivery. With… ▽ More

    Submitted 9 October, 2019; originally announced October 2019.

  27. arXiv:1910.03641  [pdf, other]

    cs.LG cs.CL cs.HC eess.AS

    Linking emotions to behaviors through deep transfer learning

    Authors: Haoqi Li, Brian Baucom, Panayiotis Georgiou

    Abstract: Human behavior refers to the way humans act and interact. Understanding human behavior is a cornerstone of observational practice, especially in psychotherapy. An important cue of behavior analysis is the dynamical changes of emotions during the conversation. Domain experts integrate emotional information in a highly nonlinear manner, thus, it is challenging to explicitly quantify the relationship… ▽ More

    Submitted 8 October, 2019; originally announced October 2019.

    Comments: 23 pages, 8 figures

  28. arXiv:1909.04302  [pdf, other]

    cs.CL cs.LG

    Multimodal Embeddings from Language Models

    Authors: Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: Word embeddings such as ELMo have recently been shown to model word semantics with greater efficacy through contextualized learning on large-scale language corpora, resulting in significant improvement in state of the art across many natural language tasks. In this work we integrate acoustic information into contextualized lexical embeddings through the addition of multimodal inputs to a pretraine… ▽ More

    Submitted 10 September, 2019; originally announced September 2019.

  29. arXiv:1909.00107  [pdf, other]

    cs.CL

    Behavior Gated Language Models

    Authors: Prashanth Gurunath Shivakumar, Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: Most current language modeling techniques only exploit co-occurrence, semantic and syntactic information from the sequence of words. However, a range of information such as the state of the speaker and dynamics of the interaction might be useful. In this work we derive motivation from psycholinguistics and propose the addition of behavioral information into the context of language modeling. We pro… ▽ More

    Submitted 30 August, 2019; originally announced September 2019.

  30. arXiv:1908.00908  [pdf, other]

    cs.CL

    Predicting Behavior in Cancer-Afflicted Patient and Spouse Interactions using Speech and Language

    Authors: Sandeep Nallan Chakravarthula, Haoqi Li, Shao-Yen Tseng, Maija Reblin, Panayiotis Georgiou

    Abstract: Cancer impacts the quality of life of those diagnosed as well as their spouse caregivers, in addition to potentially influencing their day-to-day behaviors. There is evidence that effective communication between spouses can improve well-being related to cancer but it is difficult to efficiently evaluate the quality of daily life interactions using manual annotation frameworks. Automated recognitio… ▽ More

    Submitted 2 August, 2019; originally announced August 2019.

  31. arXiv:1906.09899  [pdf, ps, other]

    cs.LO

    Verifying Relational Properties using Trace Logic

    Authors: Gilles Barthe, Renate Eilers, Pamina Georgiou, Bernhard Gleiss, Laura Kovacs, Matteo Maffei

    Abstract: We present a logical framework for the verification of relational properties in imperative programs. Our work is motivated by relational properties which come from security applications and often require reasoning about formulas with quantifier-alternations. Our framework reduces verification of relational properties of imperative programs to a validity problem into trace logic, an expressive inst… ▽ More

    Submitted 12 August, 2019; v1 submitted 24 June, 2019; originally announced June 2019.

  32. arXiv:1904.06002  [pdf, other]

    cs.CL

    Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover's Distance

    Authors: Md Nasir, Sandeep Nallan Chakravarthula, Brian Baucom, David C. Atkins, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: Linguistic coordination is a well-established phenomenon in spoken conversations and often associated with positive social behaviors and outcomes. While there have been many attempts to measure lexical coordination or entrainment in literature, only a few have explored coordination in syntactic or semantic space. In this work, we attempt to combine these different aspects of coordination into a si… ▽ More

    Submitted 11 April, 2019; originally announced April 2019.

  33. Spoken Language Intent Detection using Confusion2Vec

    Authors: Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou

    Abstract: Decoding a speaker's intent is a crucial part of spoken language understanding (SLU). The presence of noise or errors in the text transcriptions in real-life scenarios makes the task more challenging. In this paper, we address the spoken language intent detection under noisy conditions imposed by automatic speech recognition (ASR) systems. We propose to employ confusion2vec word feature representati… ▽ More

    Submitted 1 July, 2019; v1 submitted 6 April, 2019; originally announced April 2019.

    Report number: 2226

    Journal ref: Proceedings of Interspeech 2019

  34. arXiv:1901.07467  [pdf, ps, other]

    q-bio.TO eess.SY

    Enhancing Blood Glucose Prediction with Meal Absorption and Physical Exercise Information

    Authors: Chengyuan Liu, Josep Vehi, Nick Oliver, Pantelis Georgiou, Pau Herrero

    Abstract: Objective: Numerous glucose prediction algorithms have been proposed to empower type 1 diabetes (T1D) management. Most of these algorithms only account for input such as glucose, insulin and carbohydrate, which limits their performance. Here, we present a novel glucose prediction algorithm which, in addition to standard inputs, accounts for meal absorption and physical exercise information to enhan… ▽ More

    Submitted 13 December, 2018; originally announced January 2019.

    Comments: 10 pages, 5 figures, 8 tables and one appendix

  35. arXiv:1811.10761   

    cs.CL

    Speaker Diarization With Lexical Information

    Authors: Tae Jin Park, Kyu Han, Ian Lane, Panayiotis Georgiou

    Abstract: This work presents a novel approach to leverage lexical information for speaker diarization. We introduce a speaker diarization system that can directly integrate lexical as well as acoustic information into a speaker clustering process. Thus, we propose an adjacency matrix integration technique to integrate word level speaker turn probabilities with speaker embeddings in a comprehensive way. Our… ▽ More

    Submitted 28 November, 2018; v1 submitted 26 November, 2018; originally announced November 2018.

    Comments: This version removed by arXiv administrators because the author did not have the right to agree to our license at the time of submission

  36. Confusion2Vec: Towards Enriching Vector Space Word Representations with Representational Ambiguities

    Authors: Prashanth Gurunath Shivakumar, Panayiotis Georgiou

    Abstract: Word vector representations are a crucial part of Natural Language Processing (NLP) and Human Computer Interaction. In this paper, we propose a novel word vector representation, Confusion2Vec, motivated from the human speech production and perception that encodes representational ambiguity. Humans employ both acoustic similarity cues and contextual cues to decode information and we focus on a mode… ▽ More

    Submitted 28 March, 2019; v1 submitted 7 November, 2018; originally announced November 2018.

    Journal ref: PeerJ Computer Science 5:e195, 2019

  37. Multi-label Multi-task Deep Learning for Behavioral Coding

    Authors: James Gibson, David C. Atkins, Torrey Creed, Zac Imel, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: We propose a methodology for estimating human behaviors in psychotherapy sessions using multi-label and multi-task learning paradigms. We discuss the problem of behavioral coding in which data of human interactions is annotated with labels to describe relevant human behaviors of interest. We describe two related, yet distinct, corpora consisting of therapist client interactions in psychotherap… ▽ More

    Submitted 5 November, 2018; v1 submitted 29 October, 2018; originally announced October 2018.

  38. arXiv:1807.06792  [pdf, other]

    cs.CL

    Unsupervised Online Multitask Learning of Behavioral Sentence Embeddings

    Authors: Shao-Yen Tseng, Brian Baucom, Panayiotis Georgiou

    Abstract: Unsupervised learning has been an attractive method for easily deriving meaningful data representations from vast amounts of unlabeled data. These representations, or embeddings, often yield superior results in many tasks, whether used directly or as features in subsequent training stages. However, the quality of the embeddings is highly dependent on the assumed knowledge in the unlabeled data and… ▽ More

    Submitted 1 November, 2018; v1 submitted 18 July, 2018; originally announced July 2018.

  39. arXiv:1807.03043  [pdf, other]

    cs.CV

    Convolutional Recurrent Neural Networks for Glucose Prediction

    Authors: Kezhi Li, John Daniels, Chengyuan Liu, Pau Herrero, Pantelis Georgiou

    Abstract: Control of blood glucose is essential for diabetes management. Current digital therapeutic approaches for subjects with Type 1 diabetes mellitus (T1DM) such as the artificial pancreas and insulin bolus calculators leverage machine learning techniques for predicting subcutaneous glucose for improved control. Deep learning has recently been applied in healthcare and medical research to achieve state… ▽ More

    Submitted 25 February, 2019; v1 submitted 9 July, 2018; originally announced July 2018.

    Comments: 10 pages, 7 figures

    Journal ref: IEEE journal of biomedical and health informatics 2019

  40. arXiv:1805.10731  [pdf, other]

    eess.AS cs.SD

    Multimodal Speaker Segmentation and Diarization using Lexical and Acoustic Cues via Sequence to Sequence Neural Networks

    Authors: Tae Jin Park, Panayiotis Georgiou

    Abstract: While there has been a substantial amount of work in speaker diarization recently, there are few efforts in jointly employing lexical and acoustic information for speaker segmentation. Towards that, we investigate a speaker diarization system using a sequence-to-sequence neural network trained on both lexical and acoustic features. We also propose a loss function that allows for selecting not only t… ▽ More

    Submitted 27 May, 2018; originally announced May 2018.

  41. arXiv:1805.09436  [pdf, other]

    cs.CL

    Modeling Interpersonal Influence of Verbal Behavior in Couples Therapy Dyadic Interactions

    Authors: Sandeep Nallan Chakravarthula, Brian Baucom, Panayiotis Georgiou

    Abstract: Dyadic interactions among humans are marked by speakers continuously influencing and reacting to each other in terms of responses and behaviors, among others. Understanding how interpersonal dynamics affect behavior is important for successful treatment in psychotherapy domains. Traditional schemes that automatically identify behavior for this purpose have often looked at only the target speaker.… ▽ More

    Submitted 23 May, 2018; originally announced May 2018.

  42. arXiv:1805.05840  [pdf]

    physics.app-ph eess.SP physics.bio-ph physics.med-ph

    Body Dust: Miniaturized Highly-integrated Low Power Sensing for Remotely Powered Drinkable CMOS Bioelectronics

    Authors: Sandro Carrara, Pantelis Georgiou

    Abstract: The aim of this paper is to introduce current advances in technology that could enable the development of fully drinkable and autonomous bio-electronic CMOS sensors in the form of dust particles, capable of identifying the source of a disease by targeting a specific region in organs and tissue such as a tumor mass and automatically sending diagnostic information wirelessly outside the body. We cal…

    Submitted 30 April, 2018; originally announced May 2018.

    Comments: 9 pages, 14 figures

  43. arXiv:1805.03322  [pdf, other]

    eess.AS cs.CL cs.SD

    Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations

    Authors: Prashanth Gurunath Shivakumar, Panayiotis Georgiou

    Abstract: Children speech recognition is challenging mainly due to the inherent high variability in children's physical and articulatory characteristics and expressions. This variability manifests in both acoustic constructs and linguistic usage due to the rapidly changing developmental stage in children's life. Part of the challenge is due to the lack of large amounts of available children speech data for…

    Submitted 8 May, 2018; originally announced May 2018.

  44. Towards an Unsupervised Entrainment Distance in Conversational Speech using Deep Neural Networks

    Authors: Md Nasir, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

    Abstract: Entrainment is a known adaptation mechanism that causes interaction participants to adapt or synchronize their acoustic characteristics. Understanding how interlocutors tend to adapt to each other's speaking style through entrainment involves measuring a range of acoustic features and comparing those via multiple signal comparison methods. In this work, we present a turn-level distance measure obt…

    Submitted 23 April, 2018; originally announced April 2018.

    Comments: submitted to Interspeech 2018

  45. arXiv:1802.07860  [pdf, other]

    cs.SD cs.CL eess.AS

    Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics

    Authors: Arindam Jati, Panayiotis Georgiou

    Abstract: Learning speaker-specific features is vital in many applications like speaker recognition, diarization and speech recognition. This paper provides a novel approach, we term Neural Predictive Coding (NPC), to learn speaker-specific characteristics in a completely unsupervised manner from large amounts of unlabeled training data that even contain many non-speech events and multi-speaker audio stream…

    Submitted 25 April, 2019; v1 submitted 21 February, 2018; originally announced February 2018.

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 10, pp. 1577-1589, Oct. 2019

  46. arXiv:1802.02607  [pdf, other]

    cs.CL cs.SD eess.AS

    Learning from Past Mistakes: Improving Automatic Speech Recognition Output via Noisy-Clean Phrase Context Modeling

    Authors: Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou

    Abstract: Automatic speech recognition (ASR) systems often make unrecoverable errors due to subsystem pruning (acoustic, language and pronunciation models); for example pruning words due to acoustics using short-term context, prior to rescoring with long-term context based on linguistics. In this work we model ASR as a phrase-based noisy transformation channel and propose an error correction system that can…

    Submitted 28 March, 2019; v1 submitted 7 February, 2018; originally announced February 2018.

    Journal ref: APSIPA Transactions on Signal and Information Processing 8. Cambridge University Press: e8, 2019

  47. arXiv:1701.03198  [pdf, other]

    cs.LG cs.SD

    Unsupervised Latent Behavior Manifold Learning from Acoustic Features: audio2behavior

    Authors: Haoqi Li, Brian Baucom, Panayiotis Georgiou

    Abstract: Behavioral annotation using signal processing and machine learning is highly dependent on training data and manual annotations of behavioral labels. Previous studies have shown that speech information encodes significant behavioral information and be used in a variety of automated behavior recognition tasks. However, extracting behavior information from speech is still a difficult task due to the…

    Submitted 11 January, 2017; originally announced January 2017.

    Comments: Accepted by ICASSP 2017

  48. arXiv:1606.04518  [pdf, other]

    cs.LG cs.NE

    Sparsely Connected and Disjointly Trained Deep Neural Networks for Low Resource Behavioral Annotation: Acoustic Classification in Couples' Therapy

    Authors: Haoqi Li, Brian Baucom, Panayiotis Georgiou

    Abstract: Observational studies are based on accurate assessment of human state. A behavior recognition system that models interlocutors' state in real-time can significantly aid the mental health domain. However, behavior recognition from speech remains a challenging task since it is difficult to find generalizable and representative features because of noisy and high-dimensional data, especially when data…

    Submitted 14 June, 2016; originally announced June 2016.

  49. arXiv:1605.02021  [pdf, other]

    cond-mat.mes-hall

    Window functions and sigmoidal behaviour of memristive systems

    Authors: Panayiotis S. Georgiou, Sophia N. Yaliraki, Emmanuel M. Drakakis, Mauricio Barahona

    Abstract: A common approach to model memristive systems is to include empirical window functions to describe edge effects and non-linearities in the change of the memristance. We demonstrate that under quite general conditions, each window function can be associated with a sigmoidal curve relating the normalised time-dependent memristance to the time integral of the input. Conversely, this explicit relation…

    Submitted 14 January, 2016; originally announced May 2016.

    Comments: 12 pages, 5 figures, 1 table. To appear in International Journal of Circuit Theory and Applications

  50. arXiv:1011.0060  [pdf, ps, other]

    cond-mat.mes-hall math.DS

    Quantitative Measure of Hysteresis for Memristors Through Explicit Dynamics

    Authors: Panayiotis S. Georgiou, Sophia N. Yaliraki, Emmanuel M. Drakakis, Mauricio Barahona

    Abstract: We introduce a mathematical framework for the analysis of the input-output dynamics of externally driven memristors. We show that, under general assumptions, their dynamics comply with a Bernoulli differential equation and hence can be nonlinearly transformed into a formally solvable linear equation. The Bernoulli formalism, which applies to both charge- and flux-controlled memristors when either…

    Submitted 17 July, 2011; v1 submitted 30 October, 2010; originally announced November 2010.

    Comments: 11 pages, 12 figures