Showing 1–6 of 6 results for author: Hartill, T

  1. arXiv:2311.12337  [pdf, other]

    cs.CL cs.AI

    Do Smaller Language Models Answer Contextualised Questions Through Memorisation Or Generalisation?

    Authors: Tim Hartill, Joshua Bensemann, Michael Witbrock, Patricia J. Riddle

    Abstract: A distinction is often drawn between a model's ability to predict a label for an evaluation sample that is directly memorised from highly similar training samples versus an ability to predict the label via some method of generalisation. In the context of using Language Models for question-answering, discussion continues to occur as to the extent to which questions are answered through memorisation…

    Submitted 20 November, 2023; originally announced November 2023.

  2. arXiv:2308.04711  [pdf, other]

    cs.CL

    Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval

    Authors: Tim Hartill, Diana Benavides-Prado, Michael Witbrock, Patricia J. Riddle

    Abstract: When provided with sufficient explanatory context, smaller Language Models have been shown to exhibit strong reasoning ability on challenging short-answer question-answering tasks where the questions are unseen in training. We evaluate two methods for further improvement in this setting. Both methods focus on combining rationales generated by a larger Language Model with longer contexts created fr…

    Submitted 12 October, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

  3. arXiv:2308.00946  [pdf, other]

    cs.CL cs.AI

    Teaching Smaller Language Models To Generalise To Unseen Compositional Questions

    Authors: Tim Hartill, Neset Tan, Michael Witbrock, Patricia J. Riddle

    Abstract: We equip a smaller Language Model to generalise to answering challenging compositional questions that have not been seen in training. To do so we propose a combination of multitask supervised pretraining on up to 93 tasks designed to instill diverse reasoning abilities, and a dense retrieval system that aims to retrieve a set of evidential paragraph fragments. Recent progress in question-answering…

    Submitted 20 August, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

  4. arXiv:2303.07585  [pdf, other]

    cs.CL

    Input-length-shortening and text generation via attention values

    Authors: Neşet Özkan Tan, Alex Yuxuan Peng, Joshua Bensemann, Qiming Bao, Tim Hartill, Mark Gahegan, Michael Witbrock

    Abstract: Identifying words that impact a task's performance more than others is a challenge in natural language processing. Transformers models have recently addressed this issue by incorporating an attention mechanism that assigns greater attention (i.e., relevance) scores to some words than others. Because of the attention mechanism's high computational cost, transformer models usually have an input-leng…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: 7 pages, 4 figures. AAAI23-EMC2

  5. arXiv:2207.14000  [pdf, other]

    cs.CL cs.AI cs.LG cs.LO

    Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation

    Authors: Qiming Bao, Alex Yuxuan Peng, Tim Hartill, Neset Tan, Zhenyun Deng, Michael Witbrock, Jiamou Liu

    Abstract: Combining deep learning with symbolic logic reasoning aims to capitalize on the success of both fields and is drawing increasing attention. Inspired by DeepLogic, an end-to-end model trained to perform inference on logic programs, we introduce IMA-GloVe-GA, an iterative neural inference network for multi-step reasoning expressed in natural language. In our model, reasoning is performed using an it…

    Submitted 30 March, 2024; v1 submitted 28 July, 2022; originally announced July 2022.

    Comments: 10 pages, 3 figures, The 2nd International Joint Conference on Learning & Reasoning and 16th International Workshop on Neural-Symbolic Learning and Reasoning (IJCLR-NeSy 2022)

  6. Relating Blindsight and AI: A Review

    Authors: Joshua Bensemann, Qiming Bao, Gaël Gendron, Tim Hartill, Michael Witbrock

    Abstract: Processes occurring in brains, a.k.a. biological neural networks, can and have been modeled within artificial neural network architectures. Due to this, we have conducted a review of research on the phenomenon of blindsight in an attempt to generate ideas for artificial intelligence models. Blindsight can be considered as a diminished form of visual experience. If we assume that artificial network…

    Submitted 8 December, 2021; originally announced January 2022.

    Comments: Preprint of an article published in Journal of Artificial Intelligence and Consciousness, 2021, doi.org/10.1142/S2705078521500156 © World Scientific Publishing Company, www.worldscientific.com/worldscinet/jaic

    Journal ref: Journal of Artificial Intelligence and Consciousness, 1-15 (2021)