Showing 1–11 of 11 results for author: Spithourakis, G

Searching in archive cs.
  1. arXiv:2204.13496

    cs.CL cs.LG

    EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification

    Authors: Georgios P. Spithourakis, Ivan Vulić, Michał Lis, Iñigo Casanueva, Paweł Budzianowski

    Abstract: Knowledge-based authentication is crucial for task-oriented spoken dialogue systems that offer personalised and privacy-focused services. Such systems should be able to enrol (E), verify (V), and identify (I) new and recurring users based on their personal information, e.g. postcode, name, and date of birth. In this work, we formalise the three authentication tasks and their evaluation protocols,…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: 13 pages, 7 figures, 7 tables. Accepted in NAACL 2022 (Findings)

  2. arXiv:2204.13021

    cs.CL cs.LG

    NLU++: A Multi-Label, Slot-Rich, Generalisable Dataset for Natural Language Understanding in Task-Oriented Dialogue

    Authors: Iñigo Casanueva, Ivan Vulić, Georgios P. Spithourakis, Paweł Budzianowski

    Abstract: We present NLU++, a novel dataset for natural language understanding (NLU) in task-oriented dialogue (ToD) systems, with the aim to provide a much more challenging evaluation environment for dialogue NLU models, up to date with the current application and industry requirements. NLU++ is divided into two domains (BANKING and HOTELS) and brings several crucial improvements over current commonly used…

    Submitted 5 May, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: 16 pages, 1 figure, 10 tables. Accepted in NAACL 2022 (Findings)

  3. arXiv:1909.01296

    cs.CL

    PolyResponse: A Rank-based Approach to Task-Oriented Dialogue with Application in Restaurant Search and Booking

    Authors: Matthew Henderson, Ivan Vulić, Iñigo Casanueva, Paweł Budzianowski, Daniela Gerz, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su

    Abstract: We present PolyResponse, a conversational search engine that supports task-oriented dialogue. It is a retrieval-based approach that bypasses the complex multi-component design of traditional task-oriented dialogue systems and the use of explicit semantics in the form of task-specific ontologies. The PolyResponse engine is trained on hundreds of millions of examples extracted from real conversation…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019 (Demo paper)

  4. arXiv:1906.01543

    cs.CL

    Training Neural Response Selection for Task-Oriented Dialogue Systems

    Authors: Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su

    Abstract: Despite their popularity in the chatbot literature, retrieval-based models have had modest impact on task-oriented dialogue systems, with the main obstacle to their application being the low-data regime of most task-oriented dialogue tasks. Inspired by the recent success of pretraining in language modelling, we propose an effective method for deploying response selection in task-oriented dialogue.…

    Submitted 7 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: ACL 2019 long paper

  5. arXiv:1904.06472

    cs.CL

    A Repository of Conversational Datasets

    Authors: Matthew Henderson, Paweł Budzianowski, Iñigo Casanueva, Sam Coope, Daniela Gerz, Girish Kumar, Nikola Mrkšić, Georgios Spithourakis, Pei-Hao Su, Ivan Vulić, Tsung-Hsien Wen

    Abstract: Progress in Machine Learning is often driven by the availability of large datasets, and consistent evaluation metrics for comparing modeling approaches. To this end, we present a repository of conversational datasets consisting of hundreds of millions of examples, and a standardised evaluation procedure for conversational response selection models using '1-of-100 accuracy'. The repository contains…

    Submitted 28 May, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Journal ref: Proceedings of the Workshop on NLP for Conversational AI (2019)

  6. arXiv:1805.08154

    cs.CL cs.NE stat.ML

    Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers

    Authors: Georgios P. Spithourakis, Sebastian Riedel

    Abstract: Numeracy is the ability to understand and work with numbers. It is a necessary skill for composing and understanding documents in clinical, scientific, and other technical domains. In this paper, we explore different strategies for modelling numerals with language models, such as memorisation and digit-by-digit composition, and propose a novel neural architecture that uses a continuous probability…

    Submitted 21 May, 2018; originally announced May 2018.

    Comments: Accepted at ACL 2018

  7. arXiv:1707.03264

    cs.CL

    A simple but tough-to-beat baseline for the Fake News Challenge stance detection task

    Authors: Benjamin Riedel, Isabelle Augenstein, Georgios P. Spithourakis, Sebastian Riedel

    Abstract: Identifying public misinformation is a complicated and challenging task. An important part of checking the veracity of a specific claim is to evaluate the stance different news sources take towards the assertion. Automatic stance evaluation, i.e. stance detection, would arguably facilitate the process of fact checking. In this paper, we present our stance detection system which claimed third place…

    Submitted 21 May, 2018; v1 submitted 11 July, 2017; originally announced July 2017.

    Comments: 6 pages, 1 figure, 3 tables; additional reference and details added, typos and wording corrected

  8. arXiv:1701.08251

    cs.CL cs.AI cs.CV

    Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation

    Authors: Nasrin Mostafazadeh, Chris Brockett, Bill Dolan, Michel Galley, Jianfeng Gao, Georgios P. Spithourakis, Lucy Vanderwende

    Abstract: The popularity of image sharing on social media and the engagement it creates between users reflects the important role that visual context plays in everyday conversations. We present a novel task, Image-Grounded Conversations (IGC), in which natural-sounding conversations are generated about a shared image. To benchmark progress, we introduce a new multiple-reference dataset of crowd-sourced, eve…

    Submitted 19 April, 2017; v1 submitted 28 January, 2017; originally announced January 2017.

  9. arXiv:1610.06370

    cs.CL cs.HC cs.NE

    Clinical Text Prediction with Numerically Grounded Conditional Language Models

    Authors: Georgios P. Spithourakis, Steffen E. Petersen, Sebastian Riedel

    Abstract: Assisted text input techniques can save time and effort and improve text quality. In this paper, we investigate how grounded and conditional extensions to standard neural language models can bring improvements in the tasks of word prediction and completion. These extensions incorporate a structured knowledge base and numerical values from the text into the context used to predict the next word. Ou…

    Submitted 20 October, 2016; originally announced October 2016.

    Comments: Accepted at the 7th International Workshop on Health Text Mining and Information Analysis (LOUHI) EMNLP 2016

  10. arXiv:1608.04147

    cs.CL cs.NE

    Numerically Grounded Language Models for Semantic Error Correction

    Authors: Georgios P. Spithourakis, Isabelle Augenstein, Sebastian Riedel

    Abstract: Semantic error detection and correction is an important task for applications such as fact checking, speech-to-text or grammatical error correction. Current approaches generally focus on relatively shallow semantics and do not account for numeric quantities. Our approach uses language models grounded in numbers within the text. Such groundings are easily achieved for recurrent neural language mode…

    Submitted 14 August, 2016; originally announced August 2016.

    Comments: Accepted to EMNLP 2016

  11. arXiv:1603.06155

    cs.CL

    A Persona-Based Neural Conversation Model

    Authors: Jiwei Li, Michel Galley, Chris Brockett, Georgios P. Spithourakis, Jianfeng Gao, Bill Dolan

    Abstract: We present persona-based models for handling the issue of speaker consistency in neural response generation. A speaker model encodes personas in distributed embeddings that capture individual characteristics such as background information and speaking style. A dyadic speaker-addressee model captures properties of interactions between two interlocutors. Our models yield qualitative performance impr…

    Submitted 8 June, 2016; v1 submitted 19 March, 2016; originally announced March 2016.

    Comments: Accepted for publication at ACL 2016