-
Open Assistant Toolkit -- version 2
Authors:
Sophie Fischer,
Federico Rossetto,
Carlos Gemmell,
Andrew Ramsay,
Iain Mackie,
Philip Zubel,
Niklas Tecklenburg,
Jeffrey Dalton
Abstract:
We present the second version of the Open Assistant Toolkit (OAT-v2), an open-source task-oriented conversational system for composing generative neural models. OAT-v2 is a scalable and flexible assistant platform supporting multiple domains and modalities of user interaction. It splits processing a user utterance into modular system components, including submodules such as action code generation, multimodal content retrieval, and knowledge-augmented response generation. Developed over multiple years of the Alexa TaskBot challenge, OAT-v2 is a proven system that enables scalable and robust experimentation in experimental and real-world deployment. OAT-v2 provides open models and software for research and commercial applications to enable the future of multimodal virtual assistants across diverse applications and types of rich interaction.
Submitted 1 March, 2024;
originally announced March 2024.
-
GRILLBot In Practice: Lessons and Tradeoffs Deploying Large Language Models for Adaptable Conversational Task Assistants
Authors:
Sophie Fischer,
Carlos Gemmell,
Niklas Tecklenburg,
Iain Mackie,
Federico Rossetto,
Jeffrey Dalton
Abstract:
We tackle the challenge of building real-world multimodal assistants for complex real-world tasks. We describe the practicalities and challenges of developing and deploying GRILLBot, a leading (first and second prize winning in 2022 and 2023) system deployed in the Alexa Prize TaskBot Challenge. Building on our Open Assistant Toolkit (OAT) framework, we propose a hybrid architecture that leverages Large Language Models (LLMs) and specialised models tuned for specific subtasks requiring very low latency. OAT allows us to define when, how and which LLMs should be used in a structured and deployable manner. For knowledge-grounded question answering and live task adaptations, we show that LLM reasoning abilities over task context and world knowledge outweigh latency concerns. For dialogue state management, we implement a code generation approach and show that specialised smaller models have 84% effectiveness with 100x lower latency. Overall, we provide insights and discuss tradeoffs for deploying both traditional models and LLMs to users in complex real-world multimodal environments in the Alexa TaskBot challenge. These experiences will continue to evolve as LLMs become more capable and efficient -- fundamentally reshaping OAT and future assistant architectures.
Submitted 28 June, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Dissipative Dark Substructure: The Consequences of Atomic Dark Matter on Milky Way Analog Subhalos
Authors:
Caleb Gemmell,
Sandip Roy,
Xuejian Shen,
David Curtin,
Mariangela Lisanti,
Norman Murray,
Philip F. Hopkins
Abstract:
Using cosmological hydrodynamical zoom-in simulations, we explore the properties of subhalos in Milky Way analogs that contain a sub-component of Atomic Dark Matter (ADM). ADM differs from Cold Dark Matter (CDM) due to the presence of self interactions that lead to energy dissipation and bound-state formation, analogous to Standard Model baryons. This model can arise in complex dark sectors that are natural and theoretically-motivated extensions to the Standard Model. The simulations used in this work were carried out using GIZMO and utilize the FIRE-2 galaxy formation physics in the Standard Model baryonic sector. For the parameter points we consider, the ADM gas cools efficiently, allowing it to collapse to the center of subhalos. This increases a subhalo's central density and affects its orbit, with more subhalos surviving small pericentric passages. The subset of subhalos that host visible satellite galaxies have cuspier density profiles and smaller stellar-half-mass radii relative to CDM. The entire population of dwarf galaxies produced in the ADM simulations is much more compact than those seen in CDM simulations, unable to reproduce the entire diversity of observed dwarf galaxy structures. Additionally, we also identify a population of highly compact subhalos that consist nearly entirely of ADM and form in the central region of the host, where they can leave distinctive imprints in the baryonic disk. This work presents the first detailed exploration of subhalo properties in a strongly dissipative dark matter scenario, providing intuition for how other regions of ADM parameter space, as well as other dark sector models, would impact galactic-scale observables.
Submitted 3 November, 2023;
originally announced November 2023.
-
Dark Sector Glueballs at the LHC
Authors:
Austin Batz,
Timothy Cohen,
David Curtin,
Caleb Gemmell,
Graham D. Kribs
Abstract:
We study confining dark sectors where the lightest hadrons are glueballs. Such models can provide viable dark matter candidates and appear in some neutral naturalness scenarios. In this work, we introduce a new phenomenological model of dark glueball hadronization inspired by the Lund string model. This enables us to make realistic predictions for dark glueball phenomenology at the LHC for the first time. Our model reproduces the expected thermal distribution of hadron species as an emergent consequence of hadronization dynamics. The ability to predict the production of glueball states heavier than the lightest species significantly expands the reach of long-lived glueball searches in MATHUSLA compared to previous simplified estimates. We also characterize regions of parameter space where emerging and/or semivisible jets could arise from pure-glue dark sectors, thereby providing new benchmark models that motivate searches for these signatures.
Submitted 15 April, 2024; v1 submitted 20 October, 2023;
originally announced October 2023.
-
Generate, Transform, Answer: Question Specific Tool Synthesis for Tabular Data
Authors:
Carlos Gemmell,
Jeffrey Dalton
Abstract:
Tabular question answering (TQA) presents a challenging setting for neural systems by requiring joint reasoning of natural language with large amounts of semi-structured data. Unlike humans who use programmatic tools like filters to transform data before processing, language models in TQA process tables directly, resulting in information loss as table size increases. In this paper we propose ToolWriter to generate query specific programs and detect when to apply them to transform tables and align them with the TQA model's capabilities. Focusing ToolWriter to generate row-filtering tools improves the state-of-the-art for WikiTableQuestions and WikiSQL with the most performance gained on long tables. By investigating headroom, our work highlights the broader potential for programmatic tools combined with neural components to manipulate large amounts of structured data.
Submitted 17 March, 2023;
originally announced March 2023.
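To make the row-filtering idea in the abstract above concrete, here is a minimal sketch. It is not the ToolWriter implementation: the `filter_rows` helper, the example table, and the hand-written `uk_only` predicate (standing in for a generated, question-specific program) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def filter_rows(table: List[Row], keep: Callable[[Row], bool]) -> List[Row]:
    """Return only the rows for which the question-specific predicate holds."""
    return [row for row in table if keep(row)]


# Hypothetical table and question: "Which UK cities are listed?"
table = [
    {"city": "Glasgow", "country": "UK"},
    {"city": "Toronto", "country": "Canada"},
    {"city": "Edinburgh", "country": "UK"},
]


def uk_only(row: Row) -> bool:
    # Stand-in for a generated row-filtering tool (written by hand here).
    return row["country"] == "UK"


# The smaller table is what a downstream TQA model would then read.
filtered = filter_rows(table, uk_only)
```

The point of the sketch is only that filtering happens before the model sees the table, so long tables shrink to the rows relevant to the question.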
-
Indirect Detection of Dark Matter Annihilating into Dark Glueballs
Authors:
David Curtin,
Caleb Gemmell
Abstract:
We examine indirect detection of dark matter that annihilates into dark glueballs, which in turn decay into the Standard Model via a range of portals. This arises if the dark matter candidate couples to a confining gauge force without light flavours, representative of many possible complex dark sectors. Such Hidden Valley scenarios are being increasingly considered due to non-detection of minimal models as well as theoretical motivations such as the Twin Higgs solution to the little hierarchy problem. Study of dark glueballs in indirect detection has previously been hampered by the difficulty of modeling their production in dark showers. We use the recent GlueShower code to produce the first constraints on dark matter annihilating via dark glueballs into the Standard Model across photon, antiproton, and positron channels. We also fit the Galactic Centre Excess and use this observation, combined with other astrophysical constraints, to show how multi-channel observations can constrain UV and IR details of the theory, namely the exact decay portal and hadronization behaviour respectively. This provides unique complementary discovery and diagnostic potential to Hidden Valley searches at colliders. It is interesting to note that thermal WIMPs annihilating to $\mathcal{O}(10~\mathrm{GeV})$ dark glueballs and then the Standard Model via the Twin-Higgs-like decay portal can account for the Galactic Centre Excess while respecting other constraints.
Submitted 19 October, 2023; v1 submitted 10 November, 2022;
originally announced November 2022.
-
GRILLBot: An Assistant for Real-World Tasks with Neural Semantic Parsing and Graph-Based Representations
Authors:
Carlos Gemmell,
Iain Mackie,
Paul Owoicho,
Federico Rossetto,
Sophie Fischer,
Jeffrey Dalton
Abstract:
GRILLBot is the winning system in the 2022 Alexa Prize TaskBot Challenge, moving towards the next generation of multimodal task assistants. It is a voice assistant to guide users through complex real-world tasks in the domains of cooking and home improvement. These are long-running and complex tasks that require flexible adjustment and adaptation. The demo highlights the core aspects, including a novel Neural Decision Parser for contextualized semantic parsing, a new "TaskGraph" state representation that supports conditional execution, knowledge-grounded chit-chat, and automatic enrichment of tasks with images and videos.
Submitted 31 August, 2022;
originally announced August 2022.
-
Induced Natural Language Rationales and Interleaved Markup Tokens Enable Extrapolation in Large Language Models
Authors:
Mirelle Bueno,
Carlos Gemmell,
Jeffrey Dalton,
Roberto Lotufo,
Rodrigo Nogueira
Abstract:
The ability to extrapolate, i.e., to make predictions on sequences that are longer than those presented as training examples, is a challenging problem for current deep learning models. Recent work shows that this limitation persists in state-of-the-art Transformer-based models. Most solutions to this problem use specific architectures or training methods that do not generalize to other tasks. We demonstrate that large language models can succeed in extrapolation without modifying their architecture or training procedure. Our experimental results show that generating step-by-step rationales and introducing marker tokens are both required for effective extrapolation. First, we induce a language model to produce step-by-step rationales before outputting the answer to effectively communicate the task to the model. However, as sequences become longer, we find that current models struggle to keep track of token positions. To address this issue, we interleave output tokens with markup tokens that act as explicit positional and counting symbols. Our findings show how these two complementary approaches enable remarkable sequence extrapolation and highlight a limitation of current architectures to effectively generalize without explicit surface form guidance. Code available at https://github.com/MirelleB/induced-rationales-markup-tokens
Submitted 28 November, 2022; v1 submitted 24 August, 2022;
originally announced August 2022.
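The interleaving of output tokens with markup tokens described in the abstract above can be pictured with a small sketch. The `<i>` marker format and the `interleave_markup` name are assumptions chosen for illustration; the format actually used by the authors is in their linked repository and may differ.

```python
from typing import List


def interleave_markup(tokens: List[str]) -> List[str]:
    """Prefix each token with an explicit 1-based position marker."""
    out: List[str] = []
    for position, token in enumerate(tokens, start=1):
        out.append(f"<{position}>")  # hypothetical marker format
        out.append(token)
    return out


marked = interleave_markup(["7", "3", "9"])
```

Here each marker acts as an explicit counting symbol, so the position of every token is written in the sequence itself rather than left for the model to track implicitly.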
-
VILT: Video Instructions Linking for Complex Tasks
Authors:
Sophie Fischer,
Carlos Gemmell,
Iain Mackie,
Jeffrey Dalton
Abstract:
This work addresses challenges in developing conversational assistants that support rich multimodal video interactions to accomplish real-world tasks interactively. We introduce the task of automatically linking instructional videos to task steps as "Video Instructions Linking for Complex Tasks" (VILT). Specifically, we focus on the domain of cooking and empowering users to cook meals interactively with a video-enabled Alexa skill. We create a reusable benchmark with 61 queries from recipe tasks and curate a collection of 2,133 instructional "How-To" cooking videos. Studying VILT with state-of-the-art retrieval methods, we find that dense retrieval with ANCE is the most effective, achieving an NDCG@3 of 0.566 and P@1 of 0.644. We also conduct a user study that measures the effect of incorporating videos in a real-world task setting, where 10 participants perform several cooking tasks with varying multimodal experimental conditions using a state-of-the-art Alexa TaskBot system. The users interacting with manually linked videos said they learned something new 64% of the time, which is a 9% increase compared to the automatically linked videos (55%), indicating that linked video relevance is important for task learning.
Submitted 23 August, 2022;
originally announced August 2022.
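For readers unfamiliar with the two metrics reported in the VILT abstract above, the following sketch shows one common way to compute NDCG@k and P@1 for a single query. It uses the linear-gain form of DCG and, as a simplifying assumption, derives the ideal ranking from the retrieved list alone; the paper's exact evaluation setup is not stated in the abstract and may differ.

```python
import math
from typing import List


def dcg_at_k(rels: List[float], k: int) -> float:
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(rels[:k], start=1))


def ndcg_at_k(rels: List[float], k: int) -> float:
    """DCG normalised by the DCG of the best possible ordering of `rels`."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0


def precision_at_1(rels: List[float]) -> float:
    """1.0 if the top-ranked item is relevant (label > 0), else 0.0."""
    return 1.0 if rels and rels[0] > 0 else 0.0


# Hypothetical relevance labels for one query, in ranked order.
example = [0, 2, 1]
score = ndcg_at_k(example, k=3)
```

Reported figures such as those in the abstract are typically these per-query scores averaged over all queries in the benchmark.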
-
CODEC: Complex Document and Entity Collection
Authors:
Iain Mackie,
Paul Owoicho,
Carlos Gemmell,
Sophie Fischer,
Sean MacAvaney,
Jeffrey Dalton
Abstract:
CODEC is a document and entity ranking benchmark that focuses on complex research topics. We target essay-style information needs of social science researchers, i.e. "How has the UK's Open Banking Regulation benefited Challenger Banks?". CODEC includes 42 topics developed by researchers and a new focused web corpus with semantic annotations including entity links. This resource includes expert judgments on 17,509 documents and entities (416.9 per topic) from diverse automatic and interactive manual runs. The manual runs include 387 query reformulations, providing data for query performance prediction and automatic rewriting evaluation.
CODEC includes analysis of state-of-the-art systems, including dense retrieval and neural re-ranking. The results show the topics are challenging with headroom for document and entity ranking improvement. Query expansion with entity information shows significant gains in document ranking, demonstrating the resource's value for evaluating and improving entity-oriented search. We also show that the manual query reformulations significantly improve document ranking and entity ranking performance. Overall, CODEC provides challenging research topics to support the development and evaluation of entity-centric search methods.
Submitted 17 May, 2022; v1 submitted 9 May, 2022;
originally announced May 2022.
-
Theory, phenomenology, and experimental avenues for dark showers: a Snowmass 2021 report
Authors:
Guillaume Albouy,
Jared Barron,
Hugues Beauchesne,
Elias Bernreuther,
Marcella Bona,
Cesare Cazzaniga,
Cari Cesarotti,
Timothy Cohen,
Annapaola de Cosa,
David Curtin,
Zeynep Demiragli,
Caterina Doglioni,
Alison Elliot,
Karri Folan DiPetrillo,
Florian Eble,
Carlos Erice,
Chad Freer,
Aran Garcia-Bellido,
Caleb Gemmell,
Marie-Hélène Genest,
Giovanni Grilli di Cortona,
Giuliano Gustavino,
Nicoline Hemme,
Tova Holmes,
Deepak Kar, et al. (29 additional authors not shown)
Abstract:
In this work, we consider the case of a strongly coupled dark/hidden sector, which extends the Standard Model (SM) by adding an additional non-Abelian gauge group. These extensions generally contain matter fields, much like the SM quarks, and gauge fields similar to the SM gluons. We focus on the exploration of such sectors where the dark particles are produced at the LHC through a portal and undergo rapid hadronization within the dark sector before decaying back, at least in part and potentially with sizeable lifetimes, to SM particles, giving a range of possibly spectacular signatures such as emerging or semi-visible jets. Other, non-QCD-like scenarios leading to soft unclustered energy patterns or glueballs are also discussed. After a review of the theory, existing benchmarks and constraints, this work addresses how to build consistent benchmarks from the underlying physical parameters and present new developments for the PYTHIA Hidden Valley module, along with jet substructure studies. Finally, a series of improved search strategies is presented in order to pave the way for a better exploration of the dark showers at the LHC.
Submitted 27 June, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
Simulating Glueball Production in $N_f = 0$ QCD
Authors:
David Curtin,
Caleb Gemmell,
Christopher B. Verhaaren
Abstract:
In an $SU(N_c)$ gauge theory with zero light quark flavours $N_f = 0$, the only hadronic states that form below the confinement scale are composite gluon states called glueballs. These minimal confining sectors arise in many Hidden Valley extensions of the Standard Model, including scenarios that could hold the solution to the dark matter question and the hierarchy problem. Quantitative study of dark glueball phenomenology requires an understanding of pure glue hadronization, which to date is severely lacking. In this work we show that significant progress can be made by combining a perturbative pure glue parton shower with a self-consistent and physically motivated parameterization of the unknown non-perturbative physics, thanks to the modest hierarchy between the glueball mass and the confinement scale. We make our simulation code available as the public GlueShower package, the first glueball generator for Hidden Valley theories, and perform preliminary studies of several glueball production observables, with theoretical uncertainties that take the full range of possible hadronization scenarios into account. We hope this will enable new studies of dark sector phenomenology that were previously inaccessible.
Submitted 20 January, 2023; v1 submitted 25 February, 2022;
originally announced February 2022.
-
Relevance Transformer: Generating Concise Code Snippets with Relevance Feedback
Authors:
Carlos Gemmell,
Federico Rossetto,
Jeffrey Dalton
Abstract:
Tools capable of automatic code generation have the potential to augment programmers' capabilities. While straightforward code retrieval is incorporated into many IDEs, an emerging area is explicit code generation. Code generation is currently approached as a Machine Translation task, with Recurrent Neural Network (RNN) based encoder-decoder architectures trained on code-description pairs. In this work we introduce and study modern Transformer architectures for this task. We further propose a new model called the Relevance Transformer that incorporates external knowledge using pseudo-relevance feedback. The Relevance Transformer biases the decoding process to be similar to existing retrieved code while enforcing diversity. We perform experiments on multiple standard benchmark datasets for code generation including Django, Hearthstone, and CoNaLa. The results show improvements over state-of-the-art methods based on BLEU evaluation. The Relevance Transformer model shows the potential of Transformer-based architectures for code generation and introduces a method of incorporating pseudo-relevance feedback during inference.
Submitted 8 December, 2020; v1 submitted 6 July, 2020;
originally announced July 2020.