Showing 1–31 of 31 results for author: Cohn, A

Searching in archive cs.
  1. arXiv:2406.16528  [pdf, other]

    cs.CL

    Evaluating the Ability of Large Language Models to Reason about Cardinal Directions

    Authors: Anthony G Cohn, Robert E Blackwell

    Abstract: We investigate the abilities of a representative set of Large Language Models (LLMs) to reason about cardinal directions (CDs). To do so, we create two datasets: the first, co-created with ChatGPT, focuses largely on recall of world knowledge about CDs; the second is generated from a set of templates, comprehensively testing an LLM's ability to determine the correct CD given a particular scenario.…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 9 pages, 3 figures, 1 table. Short paper accepted by COSIT 24, The 16th Conference on Spatial Information Theory

  2. arXiv:2406.14336  [pdf, other]

    cs.CL cs.AI

    Exploring Spatial Representations in the Historical Lake District Texts with LLM-based Relation Extraction

    Authors: Erum Haris, Anthony G. Cohn, John G. Stell

    Abstract: Navigating historical narratives poses a challenge in unveiling the spatial intricacies of past landscapes. The proposed work addresses this challenge within the context of the English Lake District, employing the Corpus of the Lake District Writing. The method utilizes a generative pre-trained transformer model to extract spatial relations from the textual descriptions in the corpus. The study ap…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.11911  [pdf, other]

    cs.AI cs.CL cs.LG

    A Notion of Complexity for Theory of Mind via Discrete World Models

    Authors: X. Angelo Huang, Emanuele La Malfa, Samuele Marro, Andrea Asperti, Anthony Cohn, Michael Wooldridge

    Abstract: Theory of Mind (ToM) can be used to assess the capabilities of Large Language Models (LLMs) in complex scenarios where social reasoning is required. While the research community has proposed many ToM benchmarks, their hardness varies greatly, and their complexity is not well defined. This work proposes a framework to measure the complexity of ToM tasks. We quantify a problem's complexity as the nu…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: https://flecart.github.com/complexity-tom-dwm

  4. arXiv:2406.01931  [pdf, other]

    cs.CL

    Dishonesty in Helpful and Harmless Alignment

    Authors: Youcheng Huang, Jingkun Tang, Duanyu Feng, Zheng Zhang, Wenqiang Lei, Jiancheng Lv, Anthony G. Cohn

    Abstract: People tell lies when seeking rewards. Large language models (LLMs) are aligned to human values with reinforcement learning where they get rewards if they satisfy human preference. We find that this also induces dishonesty in helpful and harmless alignment where LLMs tell lies in generating harmless responses. Using the latest interpreting tools, we detect dishonesty, show how LLMs can be harmful…

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  5. arXiv:2405.15064  [pdf, other]

    cs.CL cs.AI cs.DB

    Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning

    Authors: Fangjun Li, David C. Hogg, Anthony G. Cohn

    Abstract: Spatial reasoning plays a vital role in both human cognition and machine intelligence, prompting new research into language models' (LMs) capabilities in this regard. However, existing benchmarks reveal shortcomings in evaluating qualitative spatial reasoning (QSR). These benchmarks typically present oversimplified scenarios or unclear natural language descriptions, hindering effective evaluation.…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Camera-Ready version for IJCAI 2024

  6. arXiv:2402.02805  [pdf, other]

    cs.AI cs.CL cs.LG

    Graph-enhanced Large Language Models in Asynchronous Plan Reasoning

    Authors: Fangru Lin, Emanuele La Malfa, Valentin Hofmann, Elle Michelle Yang, Anthony Cohn, Janet B. Pierrehumbert

    Abstract: Planning is a fundamental property of human intelligence. Reasoning about asynchronous plans is challenging since it requires sequential and parallel planning to optimize time costs. Can large language models (LLMs) succeed at this task? Here, we present the first large-scale study investigating this question. We find that a representative set of closed and open-source LLMs, including GPT-4 and LL…

    Submitted 3 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML-2024

  7. arXiv:2401.09074  [pdf, other]

    cs.LG cs.AI cs.CL cs.PL

    Code Simulation Challenges for Large Language Models

    Authors: Emanuele La Malfa, Christoph Weinhuber, Orazio Torre, Fangru Lin, Samuele Marro, Anthony Cohn, Nigel Shadbolt, Michael Wooldridge

    Abstract: Many reasoning, planning, and problem-solving tasks share an intrinsic algorithmic nature: correctly simulating each step is a sufficient condition to solve them correctly. This work studies to what extent Large Language Models (LLMs) can simulate coding and algorithmic tasks to provide insights into general capabilities in such algorithmic reasoning tasks. We introduce benchmarks for straight-lin…

    Submitted 12 June, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: Code: https://github.com/EmanueleLM/CodeSimulation

  8. arXiv:2401.03991  [pdf, other]

    cs.AI cs.CL cs.DB cs.LO

    Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark

    Authors: Fangjun Li, David C. Hogg, Anthony G. Cohn

    Abstract: Artificial intelligence (AI) has made remarkable progress across various domains, with large language models like ChatGPT gaining substantial attention for their human-like text-generation capabilities. Despite these achievements, spatial reasoning remains a significant challenge for these models. Benchmarks like StepGame evaluate AI spatial reasoning, where ChatGPT has shown unsatisfactory perfor…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: Camera-Ready version for AAAI 2024

  9. arXiv:2309.16573  [pdf, other]

    cs.AI cs.CL cs.CY

    Language Models as a Service: Overview of a New Paradigm and its Challenges

    Authors: Emanuele La Malfa, Aleksandar Petrov, Simon Frieder, Christoph Weinhuber, Ryan Burnell, Raza Nazar, Anthony G. Cohn, Nigel Shadbolt, Michael Wooldridge

    Abstract: Some of the most powerful language models currently are proprietary systems, accessible only via (typically restrictive) web or software programming interfaces. This is the Language-Models-as-a-Service (LMaaS) paradigm. In contrast with scenarios where full model access is available, as in the case of open-source models, such closed-off language models present specific challenges for evaluating, b…

    Submitted 30 November, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

  10. arXiv:2309.15577  [pdf, other]

    cs.AI

    An Evaluation of ChatGPT-4's Qualitative Spatial Reasoning Capabilities in RCC-8

    Authors: Anthony G Cohn

    Abstract: Qualitative Spatial Reasoning (QSR) is a well-explored area of Commonsense Reasoning and has multiple applications ranging from Geographical Information Systems to Robotics and Computer Vision. Recently many claims have been made for the capabilities of Large Language Models (LLMs). In this paper we investigate the extent to which one particular LLM can perform classical qualitative spatial reasonin…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: 10 figures. 8 pages. Accepted for presentation at 36th International Workshop on Qualitative Reasoning (QR-23), in conjunction with ECAI2023 in Krakow, Poland

  11. arXiv:2304.11164  [pdf, other]

    cs.CL cs.AI

    Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs

    Authors: Anthony G Cohn, Jose Hernandez-Orallo

    Abstract: Language models have become very popular recently and many claims have been made about their abilities, including for commonsense reasoning. Given the increasingly better results of current language models on previous static benchmarks for commonsense reasoning, we explore an alternative dialectical evaluation. The goal of this kind of evaluation is not to obtain an aggregate performance value but…

    Submitted 22 April, 2023; originally announced April 2023.

    Comments: 11 pages in main paper + 71 pages in appendix

  12. arXiv:2304.05989  [pdf, other]

    cs.AI cs.CV cs.LG

    Object-agnostic Affordance Categorization via Unsupervised Learning of Graph Embeddings

    Authors: Alexia Toumpa, Anthony G. Cohn

    Abstract: Acquiring knowledge about object interactions and affordances can facilitate scene understanding and human-robot collaboration tasks. As humans tend to use objects in many different ways depending on the scene and the objects' availability, learning object affordances in everyday-life scenarios is a challenging task, particularly in the presence of an open set of interactions and objects. We addre…

    Submitted 30 March, 2023; originally announced April 2023.

    Comments: Accepted at Journal of Artificial Intelligence Research (JAIR)

  13. arXiv:2212.08659  [pdf]

    cs.AI cs.HC cs.MA cs.RO

    A Hierarchical Framework for Collaborative Artificial Intelligence

    Authors: James L. Crowley, Joëlle L Coutaz, Jasmin Grosinger, Javier Vázquez-Salceda, Cecilio Angulo, Alberto Sanfeliu, Luca Iocchi, Anthony G. Cohn

    Abstract: We propose a hierarchical framework for collaborative intelligent systems. This framework organizes research challenges based on the nature of the collaborative activity and the information that must be shared, with each level building on capabilities provided by lower levels. We review research paradigms at each level, with a description of classical engineering-based approaches and modern altern…

    Submitted 14 December, 2022; originally announced December 2022.

    Journal ref: IEEE Pervasive Computing, 2022

  14. arXiv:2208.01136  [pdf, other]

    cs.CV cs.AI

    Exploring the GLIDE model for Human Action-effect Prediction

    Authors: Fangjun Li, David C. Hogg, Anthony G. Cohn

    Abstract: We address the following action-effect prediction task. Given an image depicting an initial state of the world and an action expressed in text, predict an image depicting the state of the world following the action. The prediction should have the same scene context as the input image. We explore the use of the recently proposed GLIDE model for performing this task. GLIDE is a generative neural net…

    Submitted 1 August, 2022; originally announced August 2022.

  15. arXiv:2208.00783  [pdf, other]

    cs.CV cs.AI

    Location retrieval using visible landmarks based qualitative place signatures

    Authors: Lijun Wei, Valerie Gouet-Brunet, Anthony Cohn

    Abstract: Location retrieval based on visual information is to retrieve the location of an agent (e.g. human, robot) or the area they see by comparing the observations with a certain form of representation of the environment. Existing methods generally require precise measurement and storage of the observed environment features, which may not always be robust due to the change of season, viewpoint, occlusio…

    Submitted 26 July, 2022; originally announced August 2022.

  16. arXiv:2109.11969  [pdf, other]

    cs.CL cs.AI cs.HC

    Rethinking Crowd Sourcing for Semantic Similarity

    Authors: Shaul Solomon, Adam Cohn, Hernan Rosenblum, Chezi Hershkovitz, Ivan P. Yamshchikov

    Abstract: Estimation of semantic similarity is crucial for a variety of natural language processing (NLP) tasks. In the absence of a general theory of semantic information, many papers rely on human annotators as the source of ground truth for semantic similarity estimation. This paper investigates the ambiguities inherent in crowd-sourced semantic labeling. It shows that annotators that treat semantic simi…

    Submitted 24 September, 2021; originally announced September 2021.

    ACM Class: I.2.7; H.5.2; K.6.1

  17. arXiv:2108.04621  [pdf, other]

    cs.PL

    Refactoring the Whitby Intelligent Tutoring System for Clean Architecture

    Authors: Paul S. Brown, Vania Dimitrova, Glen Hart, Anthony G. Cohn, Paulo Moura

    Abstract: Whitby is the server-side of an Intelligent Tutoring System application for learning System-Theoretic Process Analysis (STPA), a methodology used to ensure the safety of anything that can be represented with a systems model. The underlying logic driving the reasoning behind Whitby is Situation Calculus, which is a many-sorted logic with situation, action, and object sorts. The Situation Calculus i…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: Under consideration for acceptance in TPLP. Paper presented at the 37th International Conference on Logic Programming (ICLP 2021), 16 pages

  18. arXiv:2102.09896  [pdf, other]

    cs.CV

    Scribble-Supervised Semantic Segmentation by Uncertainty Reduction on Neural Representation and Self-Supervision on Neural Eigenspace

    Authors: Zhiyi Pan, Peng Jiang, Yunhai Wang, Changhe Tu, Anthony G. Cohn

    Abstract: Scribble-supervised semantic segmentation has gained much attention recently for its promising performance without high-quality annotations. Due to the lack of supervision, confident and consistent predictions are usually hard to obtain. Typically, these problems are handled either by adopting an auxiliary task with a well-labeled dataset or by incorporating a graphical model with additional require…

    Submitted 19 February, 2021; originally announced February 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2011.05621

  19. arXiv:2003.13120  [pdf, other]

    cs.CV cs.LG eess.IV physics.geo-ph

    Defect segmentation: Mapping tunnel lining internal defects with ground penetrating radar data using a convolutional neural network

    Authors: Senlin Yang, Zhengfang Wang, Jing Wang, Anthony G. Cohn, Jiaqi Zhang, Peng Jiang, Peng Jiang, Qingmei Sui

    Abstract: This research proposes a Ground Penetrating Radar (GPR) data processing method for non-destructive detection of tunnel lining internal defects, called defect segmentation. To perform this critical step of automatic tunnel lining detection, the method uses a CNN called Segnet combined with the Lovász softmax loss function to map the internal defect structure with GPR synthetic data, which improves…

    Submitted 29 March, 2020; originally announced March 2020.

    Comments: 24 pages, 11 figures

  20. arXiv:2002.12738  [pdf, other]

    cs.RO cs.LG

    Human-like Planning for Reaching in Cluttered Environments

    Authors: Mohamed Hasan, Matthew Warburton, Wisdom C. Agboh, Mehmet R. Dogar, Matteo Leonetti, He Wang, Faisal Mushtaq, Mark Mon-Williams, Anthony G. Cohn

    Abstract: Humans, in comparison to robots, are remarkably adept at reaching for objects in cluttered environments. The best existing robot planners are based on random sampling of configuration space -- which becomes excessively high-dimensional with a large number of objects. Consequently, most planners often fail to efficiently find object manipulation plans in such environments. We addressed this problem b…

    Submitted 3 March, 2020; v1 submitted 28 February, 2020; originally announced February 2020.

    Comments: To be published in ICRA 2020

  21. arXiv:1912.05759  [pdf]

    cs.CV cs.LG eess.IV physics.geo-ph

    GPRInvNet: Deep Learning-Based Ground Penetrating Radar Data Inversion for Tunnel Lining

    Authors: Bin Liu, Yuxiao Ren, Hanchi Liu, Hui Xu, Zhengfang Wang, Anthony G. Cohn, Peng Jiang

    Abstract: A DNN architecture referred to as GPRInvNet was proposed to tackle the challenges of mapping the ground-penetrating radar (GPR) B-Scan data to complex permittivity maps of subsurface structures. The GPRInvNet consisted of a trace-to-trace encoder and a decoder. It was specially designed to take into account the characteristics of GPR inversion when faced with complex GPR B-Scan data, as well as ad…

    Submitted 26 September, 2021; v1 submitted 11 December, 2019; originally announced December 2019.

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 10, pp. 8305-8325, Oct. 2021

  22. arXiv:1810.08615  [pdf]

    cs.RO cs.LG cs.NE

    Autonomous Functional Locomotion in a Tendon-Driven Limb via Limited Experience

    Authors: Ali Marjaninejad, Darío Urbina-Meléndez, Brian A. Cohn, Francisco J. Valero-Cuevas

    Abstract: Robots will become ubiquitously useful only when they can use few attempts to teach themselves to perform different tasks, even with complex bodies and in dynamical environments. Vertebrates, in fact, successfully use trial-and-error to learn multiple tasks in spite of their intricate tendon-driven anatomies. Roboticists find such tendon-driven systems particularly hard to control because they are…

    Submitted 19 October, 2018; originally announced October 2018.

    Comments: 39 pages, 6 figures

  23. arXiv:1809.05970  [pdf]

    q-bio.QM cs.HC

    Quantifying and attenuating pathologic tremor in virtual reality

    Authors: Brian A. Cohn, Dilan D. Shah, Ali Marjaninejad, Martin Shapiro, Serhan Ulkumen, Christopher M. Laine, Francisco J. Valero-Cuevas, Kenneth H. Hayashida, Sarah Ingersoll

    Abstract: We present a virtual reality (VR) experience that creates a research-grade benchmark in assessing patients with active upper-limb tremor, while simultaneously offering the opportunity for patients to engage with VR experiences without their pathologic tremor. Accurate and precise use of handheld motion controllers in VR gaming applications may be limited for patients with upper limb tremor. In par…

    Submitted 16 September, 2018; originally announced September 2018.

    Comments: 3 pages; 3 figures

  24. arXiv:1802.07490  [pdf, other]

    cs.RO cs.CV cs.GR

    ViTac: Feature Sharing between Vision and Tactile Sensing for Cloth Texture Recognition

    Authors: Shan Luo, Wenzhen Yuan, Edward Adelson, Anthony G. Cohn, Raul Fuentes

    Abstract: Vision and touch are two of the important sensing modalities for humans and they offer complementary information for sensing the environment. Robots could also benefit from such multi-modal sensing ability. In this paper, addressing for the first time (to the best of our knowledge) texture recognition from tactile images and vision, we propose a new fusion method named Deep Maximum Covariance Anal…

    Submitted 13 March, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: 6 pages, 5 figures, Accepted for 2018 IEEE International Conference on Robotics and Automation

  25. arXiv:1709.03456  [pdf, other]

    cs.CV cs.AI

    CLAD: A Complex and Long Activities Dataset with Rich Crowdsourced Annotations

    Authors: Jawad Tayyub, Majd Hawasly, David C. Hogg, Anthony G. Cohn

    Abstract: This paper introduces a novel activity dataset which exhibits real-life and diverse scenarios of complex, temporally-extended human activities and actions. The dataset presents a set of videos of actors performing everyday activities in a natural and unscripted manner. The dataset was recorded using a static Kinect 2 sensor which is commonly used on many robotic platforms. The dataset comprises of…

    Submitted 21 September, 2017; v1 submitted 11 September, 2017; originally announced September 2017.

  26. The STRANDS Project: Long-Term Autonomy in Everyday Environments

    Authors: Nick Hawes, Chris Burbridge, Ferdian Jovan, Lars Kunze, Bruno Lacerda, Lenka Mudrová, Jay Young, Jeremy Wyatt, Denise Hebesberger, Tobias Körtner, Rares Ambrus, Nils Bore, John Folkesson, Patric Jensfelt, Lucas Beyer, Alexander Hermans, Bastian Leibe, Aitor Aldoma, Thomas Fäulhammer, Michael Zillich, Markus Vincze, Eris Chinellato, Muhannad Al-Omari, Paul Duckworth, Yiannis Gatsoulis, et al. (8 additional authors not shown)

    Abstract: Thanks to the efforts of the robotics and autonomous systems community, robots are becoming ever more capable. There is also an increasing demand from end-users for autonomous service robots that can operate in real environments for extended periods. In the STRANDS project we are tackling this demand head-on by integrating state-of-the-art artificial intelligence and robotics research into mobile…

    Submitted 14 October, 2016; v1 submitted 15 April, 2016; originally announced April 2016.

  27. Sentence Compression as Tree Transduction

    Authors: Trevor Anthony Cohn, Mirella Lapata

    Abstract: This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture structural mismatches. We describe an algorithm for decoding in this framework and show how the model can be trained discriminatively within a large margin framework…

    Submitted 15 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 34, pages 637-674, 2009

  28. arXiv:1201.1530  [pdf, ps, other]

    math.OC cs.DS

    N-k-e Survivable Power System Design

    Authors: Richard Li-Yang Chen, Amy Cohn, Neng Fan, Ali Pinar

    Abstract: We consider the problem of designing (or augmenting) an electric power system such that it satisfies the N-k-e survivability criterion while minimizing total cost. The survivability criterion requires that at least (1-e) fraction of the total demand can still be met even if any k (or fewer) of the system components fail. We formulate this problem, taking into account both transmission and generati…

    Submitted 6 January, 2012; originally announced January 2012.

  29. arXiv:1109.1801  [pdf, other]

    math.OC cs.DS

    An Implicit Optimization Approach for Survivable Network Design

    Authors: Richard Chen, Amy Cohn, Ali Pinar

    Abstract: We consider the problem of designing a network of minimum cost while satisfying a prescribed survivability criterion. The survivability criterion requires that a feasible flow must still exist (i.e. all demands can be satisfied without violating arc capacities) even after the disruption of a subset of the network's arcs. Specifically, we consider the case in which a disruption (random or maliciou…

    Submitted 8 September, 2011; originally announced September 2011.

  30. arXiv:0909.0122  [pdf, ps, other]

    cs.AI

    Reasoning with Topological and Directional Spatial Information

    Authors: Sanjiang Li, Anthony G. Cohn

    Abstract: Current research on qualitative spatial representation and reasoning mainly focuses on one single aspect of space. In real world applications, however, multiple spatial aspects are often involved simultaneously. This paper investigates problems arising in reasoning with combined topological and directional information. We use the RCC8 algebra and the Rectangle Algebra (RA) for expressing topol…

    Submitted 1 September, 2009; originally announced September 2009.

    Journal ref: Computational Intelligence, 2012, 28(4):579-616

  31. arXiv:cs/9603104  [pdf, ps]

    cs.AI

    Active Learning with Statistical Models

    Authors: D. A. Cohn, Z. Ghahramani, M. I. Jordan

    Abstract: For many types of machine learning algorithms, one can compute the statistically 'optimal' way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally wei…

    Submitted 29 February, 1996; originally announced March 1996.

    Comments: See http://www.jair.org/ for any accompanying files

    Journal ref: Journal of Artificial Intelligence Research, Vol 4, (1996), 129-145