Showing 1–30 of 30 results for author: Mrkšić, N

Searching in archive cs.
  1. arXiv:2109.10126 [pdf, other]

    cs.CL

    ConvFiT: Conversational Fine-Tuning of Pretrained Language Models

    Authors: Ivan Vulić, Pei-Hao Su, Sam Coope, Daniela Gerz, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Tsung-Hsien Wen

    Abstract: Transformer-based language models (LMs) pretrained on large text collections are proven to store a wealth of semantic knowledge. However, 1) they are not effective as sentence encoders when used off-the-shelf, and 2) thus typically lag behind conversationally pretrained (e.g., via response selection) encoders on conversational tasks such as intent detection (ID). In this work, we propose ConvFiT,…

    Submitted 21 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 (long paper)

  2. arXiv:2104.08524 [pdf, other]

    cs.CL

    Multilingual and Cross-Lingual Intent Detection from Spoken Data

    Authors: Daniela Gerz, Pei-Hao Su, Razvan Kusztos, Avishek Mondal, Michał Lis, Eshan Singhal, Nikola Mrkšić, Tsung-Hsien Wen, Ivan Vulić

    Abstract: We present a systematic study on multilingual and cross-lingual intent detection from spoken data. The study leverages a new resource put forth in this work, termed MInDS-14, a first training and evaluation resource for the intent detection task with spoken data. It covers 14 intents extracted from a commercial system in the e-banking domain, associated with spoken examples in 14 diverse language…

    Submitted 17 April, 2021; originally announced April 2021.

  3. arXiv:1911.03688 [pdf, other]

    cs.CL

    ConveRT: Efficient and Accurate Conversational Representations from Transformers

    Authors: Matthew Henderson, Iñigo Casanueva, Nikola Mrkšić, Pei-Hao Su, Tsung-Hsien Wen, Ivan Vulić

    Abstract: General-purpose pretrained sentence encoders such as BERT are not ideal for real-world conversational AI applications; they are computationally heavy, slow, and expensive to train. We propose ConveRT (Conversational Representations from Transformers), a pretraining framework for conversational tasks satisfying all the following requirements: it is effective, affordable, and quick to train. We pret…

    Submitted 29 April, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

  4. arXiv:1909.01296 [pdf, other]

    cs.CL

    PolyResponse: A Rank-based Approach to Task-Oriented Dialogue with Application in Restaurant Search and Booking

    Authors: Matthew Henderson, Ivan Vulić, Iñigo Casanueva, Paweł Budzianowski, Daniela Gerz, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su

    Abstract: We present PolyResponse, a conversational search engine that supports task-oriented dialogue. It is a retrieval-based approach that bypasses the complex multi-component design of traditional task-oriented dialogue systems and the use of explicit semantics in the form of task-specific ontologies. The PolyResponse engine is trained on hundreds of millions of examples extracted from real conversation…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019 (Demo paper)

  5. arXiv:1906.01543 [pdf, other]

    cs.CL

    Training Neural Response Selection for Task-Oriented Dialogue Systems

    Authors: Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su

    Abstract: Despite their popularity in the chatbot literature, retrieval-based models have had modest impact on task-oriented dialogue systems, with the main obstacle to their application being the low-data regime of most task-oriented dialogue tasks. Inspired by the recent success of pretraining in language modelling, we propose an effective method for deploying response selection in task-oriented dialogue.…

    Submitted 7 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: ACL 2019 long paper

  6. arXiv:1904.06472 [pdf, other]

    cs.CL

    A Repository of Conversational Datasets

    Authors: Matthew Henderson, Paweł Budzianowski, Iñigo Casanueva, Sam Coope, Daniela Gerz, Girish Kumar, Nikola Mrkšić, Georgios Spithourakis, Pei-Hao Su, Ivan Vulić, Tsung-Hsien Wen

    Abstract: Progress in Machine Learning is often driven by the availability of large datasets, and consistent evaluation metrics for comparing modeling approaches. To this end, we present a repository of conversational datasets consisting of hundreds of millions of examples, and a standardised evaluation procedure for conversational response selection models using '1-of-100 accuracy'. The repository contains…

    Submitted 28 May, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Journal ref: Proceedings of the Workshop on NLP for Conversational AI (2019)

  7. Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization

    Authors: Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen

    Abstract: Semantic specialization is the process of fine-tuning pre-trained distributional word vectors using external lexical knowledge (e.g., WordNet) to accentuate a particular semantic relation in the specialized vector space. While post-processing specialization methods are applicable to arbitrary distributional vectors, they are limited to updating only the vectors of words occurring in external lexic…

    Submitted 11 September, 2018; originally announced September 2018.

    Comments: Accepted at EMNLP 2018

  8. arXiv:1805.11350 [pdf, other]

    cs.CL cs.AI cs.LG

    Fully Statistical Neural Belief Tracking

    Authors: Nikola Mrkšić, Ivan Vulić

    Abstract: This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST). The existing NBT model uses a hand-crafted belief state update mechanism which involves an expensive manual retuning step whenever the model is deployed to a new dialogue domain. We show that this update mechanism can be learned jointly with the semantic decoding…

    Submitted 29 May, 2018; originally announced May 2018.

    Comments: Accepted as a short paper for the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018)

  9. arXiv:1805.03228 [pdf, other]

    cs.CL

    Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources

    Authors: Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen

    Abstract: Word vector specialisation (also known as retrofitting) is a portable, light-weight approach to fine-tuning arbitrary distributional word vector spaces by injecting external knowledge from rich lexical resources such as WordNet. By design, these post-processing methods only update the vectors of words occurring in external lexicons, leaving the representations of all unseen words intact. In this p…

    Submitted 8 May, 2018; originally announced May 2018.

    Comments: NAACL 2018 (long paper)

  10. arXiv:1711.11023 [pdf, other]

    stat.ML cs.CL cs.NE

    A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management

    Authors: Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić

    Abstract: Dialogue assistants are rapidly becoming an indispensable daily aid. To avoid the significant effort needed to hand-craft the required dialogue flow, the Dialogue Management (DM) module can be cast as a continuous Markov Decision Process (MDP) and trained through Reinforcement Learning (RL). Several RL models have been investigated over recent years. However, the lack of a common benchmarking fram…

    Submitted 6 April, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

    Comments: Accepted at the Deep Reinforcement Learning Symposium, 31st Conference on Neural Information Processing Systems (NIPS 2017). Paper updated with minor changes.

  11. arXiv:1710.06371 [pdf, other]

    cs.CL

    Specialising Word Vectors for Lexical Entailment

    Authors: Ivan Vulić, Nikola Mrkšić

    Abstract: We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy-hy…

    Submitted 19 April, 2018; v1 submitted 17 October, 2017; originally announced October 2017.

    Comments: NAACL-HLT 2018 (long paper)

  12. arXiv:1707.06945 [pdf, other]

    cs.CL

    Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation

    Authors: Ivan Vulić, Nikola Mrkšić, Anna Korhonen

    Abstract: Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines. In this work, we propose a novel cross-lingual transfer method for inducing VerbNets for multiple languages. To the best of our knowledge, this is the first study which demonstrates how the architectures for learning word embe…

    Submitted 21 July, 2017; originally announced July 2017.

    Comments: EMNLP 2017 (long paper)

  13. arXiv:1707.06299 [pdf, other]

    cs.CL stat.ML

    Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning

    Authors: Stefan Ultes, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Lina Rojas-Barahona, Pei-Hao Su, Tsung-Hsien Wen, Milica Gašić, Steve Young

    Abstract: Reinforcement learning is widely used for dialogue policy optimization where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length. In this work, we propose a structured method for finding a good balance between these components by searching for the optimal reward component weighting. To render this search feasible, we use multi-objective…

    Submitted 19 July, 2017; originally announced July 2017.

    Comments: Accepted at SIGDial 2017

  14. arXiv:1706.06210 [pdf, other]

    cs.CL cs.AI

    Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning

    Authors: Paweł Budzianowski, Stefan Ultes, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Iñigo Casanueva, Lina Rojas-Barahona, Milica Gašić

    Abstract: Human conversation is inherently complex, often spanning many different topics/domains. This makes policy learning for dialogue systems very challenging. Standard flat reinforcement learning methods do not provide an efficient framework for modelling such dialogues. In this paper, we focus on the under-explored problem of multi-domain dialogue management. First, we propose a new method for hierarc…

    Submitted 17 July, 2017; v1 submitted 19 June, 2017; originally announced June 2017.

    Comments: Update of the section 4 and the bibliography

  15. arXiv:1706.00377 [pdf, other]

    cs.CL

    Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules

    Authors: Ivan Vulić, Nikola Mrkšić, Roi Reichart, Diarmuid Ó Séaghdha, Steve Young, Anna Korhonen

    Abstract: Morphologically rich languages accentuate two properties of distributional vector space models: 1) the difficulty of inducing accurate representations for low-frequency word forms; and 2) insensitivity to distinct lexical relations that have similar distributional signatures. These effects are detrimental for language understanding systems, which may infer that 'inexpensive' is a rephrasing for 'e…

    Submitted 1 June, 2017; originally announced June 2017.

    Comments: ACL 2017 (Long paper)

  16. arXiv:1706.00374 [pdf, other]

    cs.CL cs.AI cs.LG

    Semantic Specialisation of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints

    Authors: Nikola Mrkšić, Ivan Vulić, Diarmuid Ó Séaghdha, Ira Leviant, Roi Reichart, Milica Gašić, Anna Korhonen, Steve Young

    Abstract: We present Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. Attract-Repel facilitates the use of constraints from mono- and cross-lingual resources, yielding semantically specialised cross-lingual vector spaces. Our evaluation shows that the method can make use of existing cross-lingual lexicons to construct h…

    Submitted 1 June, 2017; originally announced June 2017.

    Comments: Accepted for publication at TACL (to be presented at EMNLP 2017)

  17. arXiv:1610.04120 [pdf, other]

    cs.AI cs.CL cs.NE

    Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding

    Authors: Lina M. Rojas Barahona, Milica Gasic, Nikola Mrkšić, Pei-Hao Su, Stefan Ultes, Tsung-Hsien Wen, Steve Young

    Abstract: This paper presents a deep learning architecture for the semantic decoder component of a Statistical Spoken Dialogue System. In a slot-filling dialogue, the semantic decoder predicts the dialogue act and a set of slot-value pairs from a set of n-best hypotheses returned by the Automatic Speech Recognition. Most current models for spoken language understanding assume (i) word-aligned semantic annot…

    Submitted 13 October, 2016; originally announced October 2016.

  18. arXiv:1609.02846 [pdf, other]

    cs.CL

    Dialogue manager domain adaptation using Gaussian process reinforcement learning

    Authors: Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, Steve Young

    Abstract: Spoken dialogue systems allow humans to interact with machines using natural speech. As such, they have many benefits. By using speech as the primary communication medium, a computer interface can facilitate swift, human-like acquisition of information. In recent years, speech interfaces have become ever more popular, as is evident from the rise of personal assistants such as Siri, Google Now, Cor…

    Submitted 9 September, 2016; originally announced September 2016.

    Comments: accepted for publication in Computer Speech and Language

  19. arXiv:1606.03777 [pdf, other]

    cs.CL cs.AI cs.LG

    Neural Belief Tracker: Data-Driven Dialogue State Tracking

    Authors: Nikola Mrkšić, Diarmuid Ó Séaghdha, Tsung-Hsien Wen, Blaise Thomson, Steve Young

    Abstract: One of the core components of modern spoken dialogue systems is the belief tracker, which estimates the user's goal at every step of the dialogue. However, most current approaches have difficulty scaling to larger, more complex dialogue domains. This is due to their dependency on either: a) Spoken Language Understanding models that require large amounts of annotated training data; or b) hand-craft…

    Submitted 21 April, 2017; v1 submitted 12 June, 2016; originally announced June 2016.

    Comments: Accepted as a long paper for the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017)

  20. arXiv:1606.03352 [pdf, other]

    cs.CL cs.NE stat.ML

    Conditional Generation and Snapshot Learning in Neural Dialogue Systems

    Authors: Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, David Vandyke, Steve Young

    Abstract: Recently a variety of LSTM-based conditional language models (LM) have been applied across a range of language generation tasks. In this work we study various model architectures and different ways to represent and aggregate the source information in an end-to-end neural dialogue system framework. A method called snapshot learning is also proposed to facilitate learning from supervised sequential…

    Submitted 10 June, 2016; originally announced June 2016.

  21. arXiv:1606.02689 [pdf, other]

    cs.CL cs.LG

    Continuously Learning Neural Dialogue Management

    Authors: Pei-Hao Su, Milica Gasic, Nikola Mrksic, Lina Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, Steve Young

    Abstract: We describe a two-step approach for dialogue management in task-oriented spoken dialogue systems. A unified neural network framework is proposed to enable the system to first learn by supervision from a set of dialogue data and then continuously improve its behaviour via reinforcement learning, all using gradient-based algorithms on one single model. The experiments demonstrate the supervised mode…

    Submitted 8 June, 2016; originally announced June 2016.

  22. arXiv:1605.07669 [pdf, other]

    cs.CL cs.LG

    On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems

    Authors: Pei-Hao Su, Milica Gasic, Nikola Mrksic, Lina Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, Steve Young

    Abstract: The ability to compute an accurate reward function is essential for optimising a dialogue policy via reinforcement learning. In real-world applications, using explicit user feedback as the reward signal is often unreliable and costly to collect. This problem can be mitigated if the user's intent is known in advance or data is available to pre-train a task success predictor off-line. In practice ne…

    Submitted 2 June, 2016; v1 submitted 24 May, 2016; originally announced May 2016.

    Comments: Accepted as a long paper in ACL 2016

  23. arXiv:1604.04562 [pdf, other]

    cs.CL cs.AI cs.NE stat.ML

    A Network-based End-to-End Trainable Task-oriented Dialogue System

    Authors: Tsung-Hsien Wen, David Vandyke, Nikola Mrksic, Milica Gasic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, Steve Young

    Abstract: Teaching machines to accomplish tasks by conversing naturally with humans is challenging. Currently, developing task-oriented dialogue systems requires creating multiple components and typically this involves either a large amount of handcrafting, or acquiring costly labelled datasets to solve a statistical learning problem for each component. In this work we introduce a neural network-based text-…

    Submitted 24 April, 2017; v1 submitted 15 April, 2016; originally announced April 2016.

    Comments: published at EACL 2017

  24. arXiv:1603.01232 [pdf, other]

    cs.CL

    Multi-domain Neural Network Language Generation for Spoken Dialogue Systems

    Authors: Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, David Vandyke, Steve Young

    Abstract: Moving from limited-domain natural language generation (NLG) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains. Therefore, it is important to leverage existing resources and exploit similarities between domains to facilitate domain adaptation. In this paper, we propose a procedure to train multi-domain, Recurrent Neural Net…

    Submitted 3 March, 2016; originally announced March 2016.

    Comments: Accepted as a long paper in NAACL-HLT 2016

  25. arXiv:1603.00892 [pdf, other]

    cs.CL cs.LG

    Counter-fitting Word Vectors to Linguistic Constraints

    Authors: Nikola Mrkšić, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašić, Lina Rojas-Barahona, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, Steve Young

    Abstract: In this work, we present a novel counter-fitting method which injects antonymy and synonymy constraints into vector space representations in order to improve the vectors' capability for judging semantic similarity. Applying this method to publicly available pre-trained word vectors leads to a new state of the art performance on the SimLex-999 dataset. We also show how the method can be used to tai…

    Submitted 2 March, 2016; originally announced March 2016.

    Comments: Paper accepted for the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2016)

  26. arXiv:1508.03391 [pdf, other]

    cs.LG cs.CL

    Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems

    Authors: Pei-Hao Su, David Vandyke, Milica Gasic, Nikola Mrksic, Tsung-Hsien Wen, Steve Young

    Abstract: Statistical spoken dialogue systems have the attractive property of being able to be optimised from data via interactions with real users. However in the reinforcement learning paradigm the dialogue manager (agent) often requires significant time to explore the state-action space to learn to behave in a desirable manner. This is a critical issue when the system is trained on-line with real users w…

    Submitted 18 August, 2015; v1 submitted 13 August, 2015; originally announced August 2015.

    Comments: Accepted for publication in SigDial 2015

  27. arXiv:1508.03386 [pdf, other]

    cs.LG cs.CL

    Learning from Real Users: Rating Dialogue Success with Neural Networks for Reinforcement Learning in Spoken Dialogue Systems

    Authors: Pei-Hao Su, David Vandyke, Milica Gasic, Dongho Kim, Nikola Mrksic, Tsung-Hsien Wen, Steve Young

    Abstract: To train a statistical spoken dialogue system (SDS) it is essential that an accurate method for measuring task success is available. To date training has relied on presenting a task to either simulated or paid users and inferring the dialogue's success by observing whether this presented task was achieved or not. Our aim however is to be able to learn from real users acting under their own volitio…

    Submitted 13 August, 2015; originally announced August 2015.

    Comments: Accepted for publication in INTERSPEECH 2015

  28. arXiv:1508.01755 [pdf, other]

    cs.CL

    Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking

    Authors: Tsung-Hsien Wen, Milica Gasic, Dongho Kim, Nikola Mrksic, Pei-Hao Su, David Vandyke, Steve Young

    Abstract: The natural language generation (NLG) component of a spoken dialogue system (SDS) usually needs a substantial amount of handcrafting or a well-labeled dataset to be trained on. These limitations add significantly to development costs and make cross-domain, multi-lingual dialogue systems intractable. Moreover, human languages are context-aware. The most natural response should be directly learned f…

    Submitted 7 August, 2015; originally announced August 2015.

    Comments: To appear in SigDial 2015

  29. arXiv:1508.01745 [pdf, other]

    cs.CL

    Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems

    Authors: Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Pei-Hao Su, David Vandyke, Steve Young

    Abstract: Natural language generation (NLG) is a critical component of spoken dialogue and it has a significant impact both on usability and perceived quality. Most NLG systems in common use employ rules and heuristics and tend to generate rigid and stylised responses without the natural variation of human language. They are also not easily scaled to systems covering multiple domains and languages. This pap…

    Submitted 26 August, 2015; v1 submitted 7 August, 2015; originally announced August 2015.

    Comments: To appear in EMNLP 2015

  30. arXiv:1506.07190 [pdf, other]

    cs.CL cs.LG

    Multi-domain Dialog State Tracking using Recurrent Neural Networks

    Authors: Nikola Mrkšić, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašić, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, Steve Young

    Abstract: Dialog state tracking is a key component of many modern dialog systems, most of which are designed with a single, well-defined domain in mind. This paper shows that dialog data drawn from different dialog domains can be used to train a general belief tracking model which can operate across all of these domains, exhibiting superior performance to each of the domain-specific models. We propose a tra…

    Submitted 23 June, 2015; originally announced June 2015.

    Comments: Accepted as a short paper in the 53rd Annual Meeting of the Association for Computational Linguistics (ACL 2015)