-
PIZZA: A new benchmark for complex end-to-end task-oriented parsing
Authors:
Konstantine Arkoudas,
Nicolas Guenon des Mesnards,
Melanie Rubino,
Sandesh Swamy,
Saarthak Khanna,
Weiqi Sun,
Khan Haidar
Abstract:
Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate. This paper continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents. We perform an extensive evaluation of deep-learning techniques for task-oriented parsing on this dataset, including different flavors of seq2seq systems and RNNGs. The dataset comes in two main versions, one in a recently introduced utterance-level hierarchical notation that we call TOP, and one whose targets are executable representations (EXR). We demonstrate empirically that training the parser to directly generate EXR notation not only solves the problem of entity resolution in one fell swoop and overcomes a number of expressive limitations of TOP notation, but also results in significantly greater parsing accuracy.
Submitted 30 November, 2022;
originally announced December 2022.
-
Cross-TOP: Zero-Shot Cross-Schema Task-Oriented Parsing
Authors:
Melanie Rubino,
Nicolas Guenon des Mesnards,
Uday Shah,
Nanjiang Jiang,
Weiqi Sun,
Konstantine Arkoudas
Abstract:
Deep learning methods have enabled task-oriented semantic parsing of increasingly complex utterances. However, a single model is still typically trained and deployed for each task separately, requiring labeled training data for each, which makes it challenging to support new tasks, even within a single business vertical (e.g., food-ordering or travel booking). In this paper we describe Cross-TOP (Cross-Schema Task-Oriented Parsing), a zero-shot method for complex semantic parsing in a given vertical. By leveraging the fact that user requests from the same vertical share lexical and semantic similarities, a single cross-schema parser is trained to service an arbitrary number of tasks, seen or unseen, within a vertical. We show that Cross-TOP can achieve high accuracy on a previously unseen task without requiring any additional training data, thereby providing a scalable way to bootstrap semantic parsers for new tasks. As part of this work we release the FoodOrdering dataset, a task-oriented parsing dataset in the food-ordering vertical, with utterances and annotations derived from five schemas, each from a different restaurant menu.
Submitted 10 June, 2022;
originally announced June 2022.
-
Unfreeze with Care: Space-Efficient Fine-Tuning of Semantic Parsing Models
Authors:
Weiqi Sun,
Haidar Khan,
Nicolas Guenon des Mesnards,
Melanie Rubino,
Konstantine Arkoudas
Abstract:
Semantic parsing is a key NLP task that maps natural language to structured meaning representations. As in many other NLP tasks, SOTA performance in semantic parsing is now attained by fine-tuning a large pretrained language model (PLM). While effective, this approach is inefficient in the presence of multiple downstream tasks, as a new set of values for all parameters of the PLM needs to be stored for each task separately. Recent work has explored methods for adapting PLMs to downstream tasks while keeping most (or all) of their parameters frozen. We examine two such promising techniques, prefix tuning and bias-term tuning, specifically on semantic parsing. We compare them against each other on two different semantic parsing datasets, and we also compare them against full and partial fine-tuning, both in few-shot and conventional data settings. While prefix tuning is shown to do poorly for semantic parsing tasks off the shelf, we modify it by adding special token embeddings, which results in very strong performance without compromising parameter savings.
Submitted 4 March, 2022;
originally announced March 2022.
-
Detecting Bots and Assessing Their Impact in Social Networks
Authors:
Nicolas Guenon des Mesnards,
David Scott Hunter,
Zakaria el Hjouji,
Tauhid Zaman
Abstract:
Online social networks are often subject to influence campaigns by malicious actors through the use of automated accounts known as bots. We consider the problem of detecting bots in online social networks and assessing their impact on the opinions of individuals. We begin by analyzing the behavior of bots in social networks and identify that they exhibit heterophily, meaning they interact with humans more than with other bots. We use this property to develop a detection algorithm based on the Ising model from statistical physics. The bots are identified by solving a minimum cut problem. We show that this Ising model algorithm can identify bots with higher accuracy while utilizing much less data than other state-of-the-art methods.
We then develop a function we call generalized harmonic influence centrality to estimate the impact bots have on the opinions of users in social networks. This function is based on a generalized opinion dynamics model and captures how the activity level and network connectivity of the bots shift equilibrium opinions. To apply generalized harmonic influence centrality to real social networks, we develop a deep neural network to measure the opinions of users based on their social network posts. Using this neural network, we then calculate the generalized harmonic influence centrality of bots in multiple real social networks. For some networks we find that a limited number of bots can cause non-trivial shifts in the population opinions. In other networks, we find that the bots have little impact. Overall we find that generalized harmonic influence centrality is a useful operational tool to measure the impact of bots in social networks.
Submitted 15 December, 2020; v1 submitted 29 October, 2018;
originally announced October 2018.
-
Detecting Influence Campaigns in Social Networks Using the Ising Model
Authors:
Nicolas Guenon des Mesnards,
Tauhid Zaman
Abstract:
We consider the problem of identifying coordinated influence campaigns conducted by automated agents or bots in a social network. We study several different Twitter datasets which contain such campaigns and find that the bots exhibit heterophily: they interact more with humans than with each other. We use this observation to develop a probability model for the network structure and bot labels based on the Ising model from statistical physics. We present a method to find the maximum likelihood assignment of bot labels by solving a minimum cut problem. Our algorithm allows for the simultaneous detection of multiple bots that are potentially engaging in a coordinated influence campaign, in contrast to other methods that identify bots one at a time. We find that our algorithm finds bots more accurately than existing methods when compared to a human-labeled ground truth. We also look at the content posted by the bots we identify and find that they seem to have a coordinated agenda.
Submitted 25 May, 2018;
originally announced May 2018.