-
Shared Model of Sense-making for Human-Machine Collaboration
Authors:
Gheorghe Tecuci,
Dorin Marcu,
Louis Kaiser,
Mihai Boicu
Abstract:
We present a model of sense-making that greatly facilitates the collaboration between an intelligent analyst and a knowledge-based agent. It is a general model grounded in the science of evidence and the scientific method of hypothesis generation and testing, where sense-making hypotheses that explain an observation are generated, relevant evidence is then discovered, and the hypotheses are tested based on the discovered evidence. We illustrate how the model enables an analyst to directly instruct the agent to understand situations involving the possible production of weapons (e.g., chemical warfare agents) and how the agent becomes increasingly more competent in understanding other situations from that domain (e.g., possible production of centrifuge-enriched uranium or of stealth fighter aircraft).
Submitted 5 November, 2021;
originally announced November 2021.
-
Toward a Computational Theory of Evidence-Based Reasoning for Instructable Cognitive Agents
Authors:
Gheorghe Tecuci,
Dorin Marcu,
Mihai Boicu,
Steven Meckl,
Chirag Uttamsingh
Abstract:
Evidence-based reasoning is at the core of many problem-solving and decision-making tasks in a wide variety of domains. Generalizing from the research and development of cognitive agents in several such domains, this paper presents progress toward a computational theory for the development of instructable cognitive agents for evidence-based reasoning tasks. The paper also illustrates the application of this theory to the development of four prototype cognitive agents in domains that are critical to the government and the public sector. Two agents function as cognitive assistants, one in intelligence analysis, and the other in science education. The other two agents operate autonomously, one in cybersecurity and the other in intelligence, surveillance, and reconnaissance. The paper concludes with the directions of future research on the proposed computational theory.
Submitted 9 October, 2019;
originally announced October 2019.
-
Co-Arg: Cogent Argumentation with Crowd Elicitation
Authors:
Mihai Boicu,
Dorin Marcu,
Gheorghe Tecuci,
Lou Kaiser,
Chirag Uttamsingh,
Navya Kalale
Abstract:
This paper presents Co-Arg, a new type of cognitive assistant to an intelligence analyst that enables the synergistic integration of analyst imagination and expertise, computer knowledge and critical reasoning, and crowd wisdom, to draw defensible and persuasive conclusions from masses of evidence of all types, in a world that is changing all the time. Co-Arg's goal is to improve the quality of the analytic results and enhance their understandability for both experts and novices. The performed analysis is based on a sound and transparent argumentation that links evidence to conclusions in a way that shows very clearly how the conclusions have been reached, what evidence was used and how, what is not known, and what assumptions have been made. The analytic results are presented in a report that describes the analytic conclusion and its probability, the main favoring and disfavoring arguments, the justification of the key judgments and assumptions, and the missing information that might increase the accuracy of the solution.
Submitted 2 October, 2018;
originally announced October 2018.
-
Learning Interpretable Spatial Operations in a Rich 3D Blocks World
Authors:
Yonatan Bisk,
Kevin J. Shih,
Yejin Choi,
Daniel Marcu
Abstract:
In this paper, we study the problem of mapping natural language instructions to complex spatial actions in a 3D blocks world. We first introduce a new dataset that pairs complex 3D spatial operations to rich natural language descriptions that require complex spatial and pragmatic interpretations such as "mirroring", "twisting", and "balancing". This dataset, built on the simulation environment of Bisk, Yuret, and Marcu (2016), attains language that is significantly richer and more complex, while also doubling the size of the original dataset in the 2D environment with 100 new world configurations and 250,000 tokens. In addition, we propose a new neural architecture that achieves competitive results while automatically discovering an inventory of interpretable spatial operations (Figure 5).
Submitted 24 December, 2017; v1 submitted 9 December, 2017;
originally announced December 2017.
-
Unsupervised Neural Hidden Markov Models
Authors:
Ke Tran,
Yonatan Bisk,
Ashish Vaswani,
Daniel Marcu,
Kevin Knight
Abstract:
In this work, we present the first results for neuralizing an Unsupervised Hidden Markov Model. We evaluate our approach on tag induction. Our approach outperforms existing generative models and is competitive with the state-of-the-art though with a simpler model easily extended to include additional context.
Submitted 28 September, 2016;
originally announced September 2016.
-
Extracting Biomolecular Interactions Using Semantic Parsing of Biomedical Text
Authors:
Sahil Garg,
Aram Galstyan,
Ulf Hermjakob,
Daniel Marcu
Abstract:
We advance the state of the art in biomolecular interaction extraction with three contributions: (i) We show that deep, Abstract Meaning Representations (AMR) significantly improve the accuracy of a biomolecular interaction extraction system when compared to a baseline that relies solely on surface- and syntax-based features; (ii) In contrast with previous approaches that infer relations on a sentence-by-sentence basis, we expand our framework to enable consistent predictions over sets of sentences (documents); (iii) We further modify and expand a graph kernel learning framework to enable concurrent exploitation of automatically induced AMR (semantic) and dependency structure (syntactic) representations. Our experiments show that our approach yields interaction extraction systems that are more robust in environments where there is a significant mismatch between training and test conditions.
Submitted 4 December, 2015;
originally announced December 2015.
-
Using Syntax-Based Machine Translation to Parse English into Abstract Meaning Representation
Authors:
Michael Pust,
Ulf Hermjakob,
Kevin Knight,
Daniel Marcu,
Jonathan May
Abstract:
We present a parser for Abstract Meaning Representation (AMR). We treat English-to-AMR conversion within the framework of string-to-tree, syntax-based machine translation (SBMT). To make this work, we transform the AMR structure into a form suitable for the mechanics of SBMT and useful for modeling. We introduce an AMR-specific language model and add data and features drawn from semantic resources. Our resulting AMR parser improves upon state-of-the-art results by 7 Smatch points.
Submitted 28 April, 2015; v1 submitted 24 April, 2015;
originally announced April 2015.
-
Domain Adaptation for Statistical Classifiers
Authors:
H. Daume III,
D. Marcu
Abstract:
The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the "in-domain" test data is drawn from a distribution that is related, but not identical, to the "out-of-domain" distribution of the training data. We consider the common case in which labeled out-of-domain data is plentiful, but labeled in-domain data is scarce. We introduce a statistical formulation of this problem in terms of a simple mixture model and present an instantiation of this framework to maximum entropy classifiers and their linear chain counterparts. We present efficient inference algorithms for this special case based on the technique of conditional expectation maximization. Our experimental results show that our approach leads to improved performance on three real world tasks on four different data sets from the natural language processing domain.
Submitted 28 September, 2011;
originally announced September 2011.
-
Learning as Search Optimization: Approximate Large Margin Methods for Structured Prediction
Authors:
Hal Daumé III,
Daniel Marcu
Abstract:
Mappings to structured output spaces (strings, trees, partitions, etc.) are typically learned using extensions of classification algorithms to simple graphical structures (e.g., linear chains) in which search and parameter estimation can be performed exactly. Unfortunately, in many complex problems, it is rare that exact search or parameter estimation is tractable. Instead of learning exact models and searching via heuristic means, we embrace this difficulty and treat the structured output problem in terms of approximate search. We present a framework for learning as search optimization, and two parameter updates with convergence theorems and bounds. Empirical evidence shows that our integrated approach to learning and decoding can outperform exact models at smaller computational cost.
Submitted 4 July, 2009;
originally announced July 2009.
-
A Bayesian Model for Supervised Clustering with the Dirichlet Process Prior
Authors:
Hal Daumé III,
Daniel Marcu
Abstract:
We develop a Bayesian framework for tackling the supervised clustering problem, the generic problem encountered in tasks such as reference matching, coreference resolution, identity uncertainty and record linkage. Our clustering model is based on the Dirichlet process prior, which enables us to define distributions over the countably infinite sets that naturally arise in this problem. We add supervision to our model by positing the existence of a set of unobserved random variables (we call these "reference types") that are generic across all clusters. Inference in our framework, which requires integrating over infinitely many parameters, is solved using Markov chain Monte Carlo techniques. We present algorithms for both conjugate and non-conjugate priors. We present a simple--but general--parameterization of our model based on a Gaussian assumption. We evaluate this model on one artificial task and three real-world tasks, comparing it against both unsupervised and state-of-the-art supervised algorithms. Our results show that our model is able to outperform other models across a variety of tasks and performance metrics.
Submitted 4 July, 2009;
originally announced July 2009.
-
A Large-Scale Exploration of Effective Global Features for a Joint Entity Detection and Tracking Model
Authors:
Hal Daumé III,
Daniel Marcu
Abstract:
Entity detection and tracking (EDT) is the task of identifying textual mentions of real-world entities in documents, extending the named entity detection and coreference resolution task by considering mentions other than names (pronouns, definite descriptions, etc.). Like NE tagging and coreference resolution, most solutions to the EDT task separate out the mention detection aspect from the coreference aspect. By doing so, these solutions are limited to using only local features for learning. In contrast, by modeling both aspects of the EDT task simultaneously, we are able to learn using highly complex, non-local features. We develop a new joint EDT model and explore the utility of many features, demonstrating their effectiveness on this task.
Submitted 4 July, 2009;
originally announced July 2009.
-
A Noisy-Channel Model for Document Compression
Authors:
Hal Daumé III,
Daniel Marcu
Abstract:
We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hierarchical model of text production in order to drop non-important syntactic and discourse constituents so as to generate coherent, grammatical document compressions of arbitrary length. The system outperforms both a baseline and a sentence-based compression system that operates by simplifying sequentially all sentences in a text. Our results support the claim that discourse knowledge plays an important role in document summarization.
Submitted 4 July, 2009;
originally announced July 2009.
-
Induction of Word and Phrase Alignments for Automatic Document Summarization
Authors:
Hal Daumé III,
Daniel Marcu
Abstract:
Current research in automatic single document summarization is dominated by two effective, yet naive approaches: summarization by sentence extraction, and headline generation via bag-of-words models. While successful in some tasks, neither of these models is able to adequately capture the large set of linguistic devices utilized by humans when they produce summaries. One possible explanation for the widespread use of these models is that good techniques have been developed to extract appropriate training data for them from existing document/abstract and document/headline corpora. We believe that future progress in automatic summarization will be driven both by the development of more sophisticated, linguistically informed models, as well as a more effective leveraging of document/abstract corpora. In order to open the doors to simultaneously achieving both of these goals, we have developed techniques for automatically producing word-to-word and phrase-to-phrase alignments between documents and their human-written abstracts. These alignments make explicit the correspondences that exist in such document/abstract pairs, and create a potentially rich data source from which complex summarization algorithms may learn. This paper describes experiments we have carried out to analyze the ability of humans to perform such alignments, and based on these analyses, we describe experiments for creating them automatically. Our model for the alignment task is based on an extension of the standard hidden Markov model, and learns to create alignments in a completely unsupervised fashion. We describe our model in detail and present experimental results that show that our model is able to learn to reliably identify word- and phrase-level alignments in a corpus of <document,abstract> pairs.
Submitted 4 July, 2009;
originally announced July 2009.
-
Search-based Structured Prediction
Authors:
Hal Daumé III,
John Langford,
Daniel Marcu
Abstract:
We present Searn, an algorithm for integrating search and learning to solve complex structured prediction problems such as those that occur in natural language, speech, computational biology, and vision. Searn is a meta-algorithm that transforms these complex problems into simple classification problems to which any binary classifier may be applied. Unlike current algorithms for structured learning that require decomposition of both the loss function and the feature functions over the predicted structure, Searn is able to learn prediction functions for any loss function and any class of features. Moreover, Searn comes with a strong, natural theoretical guarantee: good performance on the derived classification problems implies good performance on the structured prediction problem.
Submitted 4 July, 2009;
originally announced July 2009.
-
A Formalism and an Algorithm for Computing Pragmatic Inferences and Detecting Infelicities
Authors:
Daniel Marcu
Abstract:
Since Austin introduced the term ``infelicity'', the linguistic literature has been flooded with its use, but no formal or computational explanation has been given for it. This thesis provides one for those infelicities that occur when a pragmatic inference is cancelled.
Our contribution assumes the existence of a finer grained taxonomy with respect to pragmatic inferences. It is shown that if one wants to account for the natural language expressiveness, one should distinguish between pragmatic inferences that are felicitous to defeat and pragmatic inferences that are infelicitously defeasible. Thus, it is shown that one should consider at least three types of information: indefeasible, felicitously defeasible, and infelicitously defeasible. The cancellation of the last of these determines the pragmatic infelicities.
A new formalism has been devised to accommodate the three levels of information, called ``stratified logic''. Within it, we are able to express formally notions such as ``utterance U presupposes P'' or ``utterance U is infelicitous''. Special attention is paid to the implications that our work has in solving some well-known existential philosophical puzzles. The formalism yields an algorithm for computing interpretations for utterances, for determining their associated presuppositions, and for signalling infelicitous utterances that has been implemented in Common Lisp. The algorithm applies equally to simple and complex utterances and sequences of utterances.
Submitted 26 April, 1995; v1 submitted 25 April, 1995;
originally announced April 1995.
-
An Implemented Formalism for Computing Linguistic Presuppositions and Existential Commitments
Authors:
Daniel Marcu,
Graeme Hirst
Abstract:
We rely on the strength of linguistic and philosophical perspectives in constructing a framework that offers a unified explanation for presuppositions and existential commitment. We use a rich ontology and a set of methodological principles that embed the essence of Meinong's philosophy and Grice's conversational principles into a stratified logic, under an unrestricted interpretation of the quantifiers. The result is a logical formalism that yields a tractable computational method that uniformly calculates all the presuppositions of a given utterance, including the existential ones.
Submitted 25 April, 1995;
originally announced April 1995.
-
A Uniform Treatment of Pragmatic Inferences in Simple and Complex Utterances and Sequences of Utterances
Authors:
Daniel Marcu,
Graeme Hirst
Abstract:
Drawing appropriate defeasible inferences has been proven to be one of the most pervasive puzzles of natural language processing and a recurrent problem in pragmatics. This paper provides a theoretical framework, called ``stratified logic'', that can accommodate defeasible pragmatic inferences. The framework yields an algorithm that computes the conversational, conventional, scalar, clausal, and normal state implicatures; and the presuppositions that are associated with utterances. The algorithm applies equally to simple and complex utterances and sequences of utterances.
Submitted 25 April, 1995;
originally announced April 1995.