Search | arXiv e-print repository

Processing Self Corrections in a speech to speech system

Authors: Joerg Spilker, Martin Klarner, Guenther Goerz

Abstract: Speech repairs occur often in spontaneous spoken dialogues. The ability to detect and correct those repairs is necessary for any spoken language system. We present a framework to detect and correct speech repairs where all relevant levels of information, i.e., acoustics, lexis, syntax and semantics, can be integrated. The basic idea is to reduce the search space for repairs as soon as possible by cascading filters that involve more and more features. First, an acoustic module generates hypotheses about the existence of a repair. Second, a stochastic model suggests a correction for every hypothesis. Well-scored corrections are inserted as new paths in the word lattice. Finally, a lattice parser decides whether to accept the repair.

Submitted 21 August, 2000; originally announced August 2000.

Comments: 5 pages, 2 figures

ACM Class: I.2.7

Journal ref: Proceedings of COLING 2000, Saarbruecken, Germany; 31.7-4.8; pp 1116-1120

arXiv:cs/9907021

Architectural Considerations for Conversational Systems -- The Verbmobil/INTARC Experience

Authors: Guenther Goerz, Joerg Spilker, Volker Strom, Hans Weber

Abstract: The paper describes the speech to speech translation system INTARC, developed during the first phase of the Verbmobil project. The general design goals of the INTARC system architecture were time synchronous processing as well as incrementality and interactivity as a means to achieve a higher degree of robustness and scalability. Interactivity means that, in addition to the bottom-up (in terms of processing levels) data flow, the system is able to process top-down restrictions considering the same signal segment for all processing levels. The construction of INTARC 2.0, which has been operational since fall 1996, followed an engineering approach focusing on the integration of symbolic (linguistic) and stochastic (recognition) techniques, which led to a generalization of the concept of a "one pass" beam search.

Submitted 14 July, 1999; originally announced July 1999.

Comments: 10 pages, to appear in proceedings of First International Workshop on Human Computer Conversation, Bellagio, Italy

ACM Class: I.2.7

arXiv:cs/9809022

Modelling Users, Intentions, and Structure in Spoken Dialog

Authors: Bernd Ludwig, Guenther Goerz, Heinrich Niemann

Abstract: We outline how utterances in dialogs can be interpreted using a partial first order logic. We exploit the capability of this logic to talk about the truth status of formulae to define a notion of coherence between utterances and explain how this coherence relation can serve for the construction of AND/OR trees that represent the segmentation of the dialog. In a BDI model we formalize basic assumptions about dialog and cooperative behaviour of participants. These assumptions provide a basis for inferring speech acts from coherence relations between utterances and attitudes of dialog participants. Speech acts prove to be useful for determining dialog segments defined on the notion of completing expectations of dialog participants. Finally, we sketch how explicit segmentation signalled by cue phrases and performatives is covered by our dialog model.

Submitted 17 September, 1998; originally announced September 1998.

Comments: 17 pages

ACM Class: H.5.2

arXiv:cmp-lg/9808005

Combining Expression and Content in Domains for Dialog Managers

Authors: Bernd Ludwig, Guenther Goerz, Heinrich Niemann

Abstract: We present work in progress on abstracting dialog managers from their domain in order to implement a dialog manager development tool which takes (among other data) a domain description as input and delivers a new dialog manager for the described domain as output. We focus on two topics: first, the construction of domain descriptions with description logics, and second, the interpretation of utterances in a given domain.

Submitted 13 August, 1998; originally announced August 1998.

Comments: 5 pages, uses conference.sty

Journal ref: Proceedings of DL '98, pp. 126-130, Trento, Italy

arXiv:cmp-lg/9606031

Research on Architectures for Integrated Speech/Language Systems in Verbmobil

Authors: Günther Görz, Marcus Kesseler, Jörg Spilker, Hans Weber

Abstract: The German joint research project Verbmobil (VM) aims at the development of a speech to speech translation system. This paper reports on research done in our group, which belongs to Verbmobil's subproject on system architectures (TP15). Our specific research areas are the construction of parsers for spontaneous speech, investigations into the parallelization of parsing, and contributions to the development of a flexible communication architecture with distributed control.

Submitted 25 June, 1996; originally announced June 1996.

Comments: 6 pages, 2 Postscript figures

Journal ref: accepted for COLING 96

arXiv:cmp-lg/9605028

Towards Understanding Spontaneous Speech: Word Accuracy vs. Concept Accuracy

Authors: M. Boros, W. Eckert, F. Gallwitz, G. Goerz, G. Hanrieder, H. Niemann

Abstract: In this paper we describe an approach to automatic evaluation of both the speech recognition and understanding capabilities of a spoken dialogue system for train timetable information. We use word accuracy to judge recognition performance and concept accuracy to judge understanding performance. Both measures are calculated by comparing these modules' output with a correct reference answer. We report evaluation results for a spontaneous speech corpus with about 10000 utterances. We observed a nearly linear relationship between word accuracy and concept accuracy.

Submitted 15 May, 1996; originally announced May 1996.

Comments: 4 pages PS, LaTeX2e source importing 2 eps figures, uses icslp.cls, caption.sty, psfig.sty; to appear in the Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP 96)

arXiv:cmp-lg/9505017

Robust Parsing of Spoken Dialogue Using Contextual Knowledge and Recognition Probabilities

Authors: Gerhard Hanrieder, Guenther Goerz

Abstract: In this paper we describe the linguistic processor of a spoken dialogue system. The parser receives a word graph from the recognition module as its input. Its task is to find the best path through the graph. If no complete solution can be found, a robust mechanism for selecting multiple partial results is applied. We show how the information content rate of the results can be improved if the selection is based on an integrated quality score combining word recognition scores and context-dependent semantic predictions. Results of parsing word graphs with and without predictions are reported.

Submitted 8 May, 1995; originally announced May 1995.

Comments: 4 pages, LaTeX source, 3 PostScript figures, uses epsf.sty and ETRW.sty, to appear in Proceedings of ESCA Workshop on Spoken Dialogue Systems, Denmark, May 30-June 2

arXiv:cmp-lg/9406032

Anytime Algorithms for Speech Parsing?

Authors: Guenther Goerz, Marcus Kesseler

Abstract: This paper discusses to what extent the concept of "anytime algorithms" can be applied to parsing algorithms with feature unification. We first try to give a more precise definition of what an anytime algorithm is. We argue that parsing algorithms have to be classified as contract algorithms as opposed to (truly) interruptible algorithms. With the restriction that the transaction being active at the time an interrupt is issued has to be completed before the interrupt can be executed, it is possible to provide a parser with limited anytime behavior, which is in fact being realized in our research prototype.

Submitted 21 June, 1994; originally announced June 1994.

Comments: 5 pages, 2 figures

Journal ref: COLING-94

Showing 1–8 of 8 results for author: Görz, G