Search | arXiv e-print repository

doi 10.4204/EPTCS.385.29

Assessing Drivers' Situation Awareness in Semi-Autonomous Vehicles: ASP based Characterisations of Driving Dynamics for Modelling Scene Interpretation and Projection

Authors: Jakob Suchan, Jan-Patrick Osterloh

Abstract: Semi-autonomous driving, as it is already available today and will eventually become even more accessible, implies the need for driver and automation system to reliably work together in order to ensure safe driving. A particular challenge in this endeavour are situations in which the vehicle's automation is no longer able to drive and is thus requesting the human to take over. In these situations… ▽ More Semi-autonomous driving, as it is already available today and will eventually become even more accessible, implies the need for driver and automation system to reliably work together in order to ensure safe driving. A particular challenge in this endeavour are situations in which the vehicle's automation is no longer able to drive and is thus requesting the human to take over. In these situations the driver has to quickly build awareness for the traffic situation to be able to take over control and safely drive the car. Within this context we present a software and hardware framework to asses how aware the driver is about the situation and to provide human-centred assistance to help in building situation awareness. The framework is developed as a modular system within the Robot Operating System (ROS) with modules for sensing the environment and the driver state, modelling the driver's situation awareness, and for guiding the driver's attention using specialized Human Machine Interfaces (HMIs). A particular focus of this paper is on an Answer Set Programming (ASP) based approach for modelling and reasoning about the driver's interpretation and projection of the scene. This is based on scene data, as well as eye-tracking data reflecting the scene elements observed by the driver. We present the overall application and discuss the role of semantic reasoning and modelling cognitive functions based on logic programming in such applications. Furthermore we present the ASP approach for interpretation and projection of the driver's situation awareness and its integration within the overall system in the context of a real-world use-case in simulated as well as in real driving. △ Less

Submitted 30 August, 2023; originally announced August 2023.

Comments: In Proceedings ICLP 2023, arXiv:2308.14898

Journal ref: EPTCS 385, 2023, pp. 300-313

arXiv:2012.14359 [pdf, other]

Commonsense Visual Sensemaking for Autonomous Driving: On Generalised Neurosymbolic Online Abduction Integrating Vision and Semantics

Authors: Jakob Suchan, Mehul Bhatt, Srikrishna Varadarajan

Abstract: We demonstrate the need and potential of systematically integrated vision and semantics solutions for visual sensemaking in the backdrop of autonomous driving. A general neurosymbolic method for online visual sensemaking using answer set programming (ASP) is systematically formalised and fully implemented. The method integrates state of the art in visual computing, and is developed as a modular fr… ▽ More We demonstrate the need and potential of systematically integrated vision and semantics solutions for visual sensemaking in the backdrop of autonomous driving. A general neurosymbolic method for online visual sensemaking using answer set programming (ASP) is systematically formalised and fully implemented. The method integrates state of the art in visual computing, and is developed as a modular framework that is generally usable within hybrid architectures for realtime perception and control. We evaluate and demonstrate with community established benchmarks KITTIMOD, MOT-2017, and MOT-2020. As use-case, we focus on the significance of human-centred visual sensemaking -- e.g., involving semantic representation and explainability, question-answering, commonsense interpolation -- in safety-critical autonomous driving situations. The developed neurosymbolic framework is domain-independent, with the case of autonomous driving designed to serve as an exemplar for online visual sensemaking in diverse cognitive interaction settings in the backdrop of select human-centred AI technology design considerations. Keywords: Cognitive Vision, Deep Semantics, Declarative Spatial Reasoning, Knowledge Representation and Reasoning, Commonsense Reasoning, Visual Abduction, Answer Set Programming, Autonomous Driving, Human-Centred Computing and Design, Standardisation in Driving Technology, Spatial Cognition and AI. △ Less

Submitted 28 December, 2020; originally announced December 2020.

Comments: This is a preprint / review version of an accepted contribution to be published as part of the Artificial Intelligence Journal (AIJ).? The article is an extended version of an IJCAI 2019 publication [74, arXiv:1906.00107]

arXiv:2006.00059 [pdf, other]

Towards a Human-Centred Cognitive Model of Visuospatial Complexity in Everyday Driving

Authors: Vasiliki Kondyli, Mehul Bhatt, Jakob Suchan

Abstract: We develop a human-centred, cognitive model of visuospatial complexity in everyday, naturalistic driving conditions. With a focus on visual perception, the model incorporates quantitative, structural, and dynamic attributes identifiable in the chosen context; the human-centred basis of the model lies in its behavioural evaluation with human subjects with respect to psychophysical measures pertaini… ▽ More We develop a human-centred, cognitive model of visuospatial complexity in everyday, naturalistic driving conditions. With a focus on visual perception, the model incorporates quantitative, structural, and dynamic attributes identifiable in the chosen context; the human-centred basis of the model lies in its behavioural evaluation with human subjects with respect to psychophysical measures pertaining to embodied visuoauditory attention. We report preliminary steps to apply the developed cognitive model of visuospatial complexity for human-factors guided dataset creation and benchmarking, and for its use as a semantic template for the (explainable) computational analysis of visuospatial complexity. △ Less

Submitted 2 June, 2020; v1 submitted 29 May, 2020; originally announced June 2020.

Comments: 9th European Starting AI Researchers Symposium (STAIRS), at ECAI 2020, the 24th European Conference on Artificial Intelligence (ECAI)., Santiago de Compostela, Spain

arXiv:1906.00107 [pdf, other]

Out of Sight But Not Out of Mind: An Answer Set Programming Based Online Abduction Framework for Visual Sensemaking in Autonomous Driving

Authors: Jakob Suchan, Mehul Bhatt, Srikrishna Varadarajan

Abstract: We demonstrate the need and potential of systematically integrated vision and semantics} solutions for visual sensemaking (in the backdrop of autonomous driving). A general method for online visual sensemaking using answer set programming is systematically formalised and fully implemented. The method integrates state of the art in (deep learning based) visual computing, and is developed as a modul… ▽ More We demonstrate the need and potential of systematically integrated vision and semantics} solutions for visual sensemaking (in the backdrop of autonomous driving). A general method for online visual sensemaking using answer set programming is systematically formalised and fully implemented. The method integrates state of the art in (deep learning based) visual computing, and is developed as a modular framework usable within hybrid architectures for perception & control. We evaluate and demo with community established benchmarks KITTIMOD and MOT. As use-case, we focus on the significance of human-centred visual sensemaking ---e.g., semantic representation and explainability, question-answering, commonsense interpolation--- in safety-critical autonomous driving situations. △ Less

Submitted 31 May, 2019; originally announced June 2019.

Comments: IJCAI 2019: the 28th International Joint Conference on Artificial Intelligence (IJCAI) 2019, August 10 - 16, Macao. (Preprint / to appear)

arXiv:1806.07376 [pdf, other]

Semantic Analysis of (Reflectional) Visual Symmetry: A Human-Centred Computational Model for Declarative Explainability

Authors: Jakob Suchan, Mehul Bhatt, Srikrishna Vardarajan, Seyed Ali Amirshahi, Stella Yu

Abstract: We present a computational model for the semantic interpretation of symmetry in naturalistic scenes. Key features include a human-centred representation, and a declarative, explainable interpretation model supporting deep semantic question-answering founded on an integration of methods in knowledge representation and deep learning based computer vision. In the backdrop of the visual arts, we showc… ▽ More We present a computational model for the semantic interpretation of symmetry in naturalistic scenes. Key features include a human-centred representation, and a declarative, explainable interpretation model supporting deep semantic question-answering founded on an integration of methods in knowledge representation and deep learning based computer vision. In the backdrop of the visual arts, we showcase the framework's capability to generate human-centred, queryable, relational structures, also evaluating the framework with an empirical study on the human perception of visual symmetry. Our framework represents and is driven by the application of foundational, integrated Vision and Knowledge Representation and Reasoning methods for applications in the arts, and the psychological and social sciences. △ Less

Submitted 14 September, 2018; v1 submitted 31 May, 2018; originally announced June 2018.

Comments: Preprint of accepted article / Journal: Advances in Cognitive Systems. ( http://www.cogsys.org/journal )

Journal ref: Advances in Cognitive Systems. (http://www.cogsys.org/journal), 2018

arXiv:1805.06861 [pdf, other]

Answer Set Programming Modulo `Space-Time'

Authors: Carl Schultz, Mehul Bhatt, Jakob Suchan, Przemysław Wałęga

Abstract: We present ASP Modulo `Space-Time', a declarative representational and computational framework to perform commonsense reasoning about regions with both spatial and temporal components. Supported are capabilities for mixed qualitative-quantitative reasoning, consistency checking, and inferring compositions of space-time relations; these capabilities combine and synergise for applications in a range… ▽ More We present ASP Modulo `Space-Time', a declarative representational and computational framework to perform commonsense reasoning about regions with both spatial and temporal components. Supported are capabilities for mixed qualitative-quantitative reasoning, consistency checking, and inferring compositions of space-time relations; these capabilities combine and synergise for applications in a range of AI application areas where the processing and interpretation of spatio-temporal data is crucial. The framework and resulting system is the only general KR-based method for declaratively reasoning about the dynamics of `space-time' regions as first-class objects. We present an empirical evaluation (with scalability and robustness results), and include diverse application examples involving interpretation and control tasks. △ Less

Submitted 17 May, 2018; originally announced May 2018.

arXiv:1712.00840 [pdf, other]

Visual Explanation by High-Level Abduction: On Answer-Set Programming Driven Reasoning about Moving Objects

Authors: Jakob Suchan, Mehul Bhatt, Przemysław Wałęga, Carl Schultz

Abstract: We propose a hybrid architecture for systematically computing robust visual explanation(s) encompassing hypothesis formation, belief revision, and default reasoning with video data. The architecture consists of two tightly integrated synergistic components: (1) (functional) answer set programming based abductive reasoning with space-time tracklets as native entities; and (2) a visual processing pi… ▽ More We propose a hybrid architecture for systematically computing robust visual explanation(s) encompassing hypothesis formation, belief revision, and default reasoning with video data. The architecture consists of two tightly integrated synergistic components: (1) (functional) answer set programming based abductive reasoning with space-time tracklets as native entities; and (2) a visual processing pipeline for detection based object tracking and motion analysis. We present the formal framework, its general implementation as a (declarative) method in answer set programming, and an example application and evaluation based on two diverse video datasets: the MOTChallenge benchmark developed by the vision community, and a recently developed Movie Dataset. △ Less

Submitted 3 December, 2017; originally announced December 2017.

Comments: Preprint of final publication published as part of AAAI 2018: J. Suchan., M. Bhatt, Wałęga, P., Schultz, C. (2018). Visual Explanation by High-Level Abduction: On Answer-Set Programming Driven Reasoning about Moving Objects. In AAAI 2018: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, February 2-7, 2018, New Orleans, USA

arXiv:1710.04076 [pdf, other]

Deep Semantic Abstractions of Everyday Human Activities: On Commonsense Representations of Human Interactions

Authors: Jakob Suchan, Mehul Bhatt

Abstract: We propose a deep semantic characterization of space and motion categorically from the viewpoint of grounding embodied human-object interactions. Our key focus is on an ontological model that would be adept to formalisation from the viewpoint of commonsense knowledge representation, relational learning, and qualitative reasoning about space and motion in cognitive robotics settings. We demonstrate… ▽ More We propose a deep semantic characterization of space and motion categorically from the viewpoint of grounding embodied human-object interactions. Our key focus is on an ontological model that would be adept to formalisation from the viewpoint of commonsense knowledge representation, relational learning, and qualitative reasoning about space and motion in cognitive robotics settings. We demonstrate key aspects of the space & motion ontology and its formalization as a representational framework in the backdrop of select examples from a dataset of everyday activities. Furthermore, focussing on human-object interaction data obtained from RGBD sensors, we also illustrate how declarative (spatio-temporal) reasoning in the (constraint) logic programming family may be performed with the developed deep semantic abstractions. △ Less

Submitted 10 October, 2017; originally announced October 2017.

Comments: In ROBOT 2017: Third Iberian Robotics Conference. Escuela Técnica Superior de Ingeniería, Sevilla (Spain) (November 22-24, 2017). https://grvc.us.es/robot2017/ (to appear). arXiv admin note: substantial text overlap with arXiv:1709.05293

arXiv:1709.05293 [pdf, other]

Commonsense Scene Semantics for Cognitive Robotics: Towards Grounding Embodied Visuo-Locomotive Interactions

Authors: Jakob Suchan, Mehul Bhatt

Abstract: We present a commonsense, qualitative model for the semantic grounding of embodied visuo-spatial and locomotive interactions. The key contribution is an integrative methodology combining low-level visual processing with high-level, human-centred representations of space and motion rooted in artificial intelligence. We demonstrate practical applicability with examples involving object interactions,… ▽ More We present a commonsense, qualitative model for the semantic grounding of embodied visuo-spatial and locomotive interactions. The key contribution is an integrative methodology combining low-level visual processing with high-level, human-centred representations of space and motion rooted in artificial intelligence. We demonstrate practical applicability with examples involving object interactions, and indoor movement. △ Less

Submitted 15 September, 2017; originally announced September 2017.

Comments: to appear in: ICCV 2017 Workshop - Vision in Practice on Autonomous Robots (ViPAR), International Conference on Computer Vision (ICCV), Venice, Italy

arXiv:1608.02693 [pdf, other]

Deeply Semantic Inductive Spatio-Temporal Learning

Authors: Jakob Suchan, Mehul Bhatt, Carl Schultz

Abstract: We present an inductive spatio-temporal learning framework rooted in inductive logic programming. With an emphasis on visuo-spatial language, logic, and cognition, the framework supports learning with relational spatio-temporal features identifiable in a range of domains involving the processing and interpretation of dynamic visuo-spatial imagery. We present a prototypical system, and an example a… ▽ More We present an inductive spatio-temporal learning framework rooted in inductive logic programming. With an emphasis on visuo-spatial language, logic, and cognition, the framework supports learning with relational spatio-temporal features identifiable in a range of domains involving the processing and interpretation of dynamic visuo-spatial imagery. We present a prototypical system, and an example application in the domain of computing for visual arts and computational cognitive science. △ Less

Submitted 9 August, 2016; originally announced August 2016.

Comments: Accepted for publication at ILP 2016: 26th International Conference on Inductive Logic Programming 4th - 6th September 2016, London. Keywords: Spatio-Temporal Learning; Dynamic Visuo-Spatial Imagery; Declarative Spatial Reasoning; Inductive Logic Programming; AI and Art

arXiv:1607.07565 [pdf, other]

Grounding Dynamic Spatial Relations for Embodied (Robot) Interaction

Authors: Michael Spranger, Jakob Suchan, Mehul Bhatt, Manfred Eppe

Abstract: This paper presents a computational model of the processing of dynamic spatial relations occurring in an embodied robotic interaction setup. A complete system is introduced that allows autonomous robots to produce and interpret dynamic spatial phrases (in English) given an environment of moving objects. The model unites two separate research strands: computational cognitive semantics and on common… ▽ More This paper presents a computational model of the processing of dynamic spatial relations occurring in an embodied robotic interaction setup. A complete system is introduced that allows autonomous robots to produce and interpret dynamic spatial phrases (in English) given an environment of moving objects. The model unites two separate research strands: computational cognitive semantics and on commonsense spatial representation and reasoning. The model for the first time demonstrates an integration of these different strands. △ Less

Submitted 26 July, 2016; originally announced July 2016.

Comments: in: Pham, D.-N. and Park, S.-B., editors, PRICAI 2014: Trends in Artificial Intelligence, volume 8862 of Lecture Notes in Computer Science, pages 958-971. Springer

arXiv:1607.05968 [pdf, other]

Robust Natural Language Processing - Combining Reasoning, Cognitive Semantics and Construction Grammar for Spatial Language

Authors: Michael Spranger, Jakob Suchan, Mehul Bhatt

Abstract: We present a system for generating and understanding of dynamic and static spatial relations in robotic interaction setups. Robots describe an environment of moving blocks using English phrases that include spatial relations such as "across" and "in front of". We evaluate the system in robot-robot interactions and show that the system can robustly deal with visual perception errors, language omiss… ▽ More We present a system for generating and understanding of dynamic and static spatial relations in robotic interaction setups. Robots describe an environment of moving blocks using English phrases that include spatial relations such as "across" and "in front of". We evaluate the system in robot-robot interactions and show that the system can robustly deal with visual perception errors, language omissions and ungrammatical utterances. △ Less

Submitted 20 July, 2016; originally announced July 2016.

Comments: in IJCAI'16: Proceedings of the 25th international joint conference on Artificial intelligence, Palo Alto, 2016. AAAI Press

arXiv:1508.03276 [pdf, other]

Talking about the Moving Image: A Declarative Model for Image Schema Based Embodied Perception Grounding and Language Generation

Authors: Jakob Suchan, Mehul Bhatt, Harshita Jhavar

Abstract: We present a general theory and corresponding declarative model for the embodied grounding and natural language based analytical summarisation of dynamic visuo-spatial imagery. The declarative model ---ecompassing spatio-linguistic abstractions, image schemas, and a spatio-temporal feature based language generator--- is modularly implemented within Constraint Logic Programming (CLP). The implement… ▽ More We present a general theory and corresponding declarative model for the embodied grounding and natural language based analytical summarisation of dynamic visuo-spatial imagery. The declarative model ---ecompassing spatio-linguistic abstractions, image schemas, and a spatio-temporal feature based language generator--- is modularly implemented within Constraint Logic Programming (CLP). The implemented model is such that primitives of the theory, e.g., pertaining to space and motion, image schemata, are available as first-class objects with `deep semantics' suited for inference and query. We demonstrate the model with select examples broadly motivated by areas such as film, design, geography, smart environments where analytical natural language based externalisations of the moving image are central from the viewpoint of human interaction, evidence-based qualitative analysis, and sensemaking. Keywords: moving image, visual semantics and embodiment, visuo-spatial cognition and computation, cognitive vision, computational models of narrative, declarative spatial reasoning △ Less

Submitted 13 August, 2015; originally announced August 2015.

Comments: 19 pages. Unpublished report

arXiv:1306.5308 [pdf, other]

Cognitive Interpretation of Everyday Activities: Toward Perceptual Narrative Based Visuo-Spatial Scene Interpretation

Authors: Mehul Bhatt, Jakob Suchan, Carl Schultz

Abstract: We position a narrative-centred computational model for high-level knowledge representation and reasoning in the context of a range of assistive technologies concerned with "visuo-spatial perception and cognition" tasks. Our proposed narrative model encompasses aspects such as \emph{space, events, actions, change, and interaction} from the viewpoint of commonsense reasoning and learning in large-s… ▽ More We position a narrative-centred computational model for high-level knowledge representation and reasoning in the context of a range of assistive technologies concerned with "visuo-spatial perception and cognition" tasks. Our proposed narrative model encompasses aspects such as \emph{space, events, actions, change, and interaction} from the viewpoint of commonsense reasoning and learning in large-scale cognitive systems. The broad focus of this paper is on the domain of "human-activity interpretation" in smart environments, ambient intelligence etc. In the backdrop of a "smart meeting cinematography" domain, we position the proposed narrative model, preliminary work on perceptual narrativisation, and the immediate outlook on constructing general-purpose open-source tools for perceptual narrativisation. ACM Classification: I.2 Artificial Intelligence: I.2.0 General -- Cognitive Simulation, I.2.4 Knowledge Representation Formalisms and Methods, I.2.10 Vision and Scene Understanding: Architecture and control structures, Motion, Perceptual reasoning, Shape, Video analysis General keywords: cognitive systems; human-computer interaction; spatial cognition and computation; commonsense reasoning; spatial and temporal reasoning; assistive technologies △ Less

Submitted 22 June, 2013; originally announced June 2013.

Comments: To appear at: Computational Models of Narrative (CMN) 2013., a satellite event of CogSci 2013: The 35th meeting of the Cognitive Science Society

ACM Class: I.2; I.2.0; I.2.4; I.2.10

arXiv:1306.1034 [pdf, other]

ROTUNDE - A Smart Meeting Cinematography Initiative: Tools, Datasets, and Benchmarks for Cognitive Interpretation and Control

Authors: Mehul Bhatt, Jakob Suchan, Christian Freksa

Abstract: We construe smart meeting cinematography with a focus on professional situations such as meetings and seminars, possibly conducted in a distributed manner across socio-spatially separated groups. The basic objective in smart meeting cinematography is to interpret professional interactions involving people, and automatically produce dynamic recordings of discussions, debates, presentations etc in t… ▽ More We construe smart meeting cinematography with a focus on professional situations such as meetings and seminars, possibly conducted in a distributed manner across socio-spatially separated groups. The basic objective in smart meeting cinematography is to interpret professional interactions involving people, and automatically produce dynamic recordings of discussions, debates, presentations etc in the presence of multiple communication modalities. Typical modalities include gestures (e.g., raising one's hand for a question, applause), voice and interruption, electronic apparatus (e.g., pressing a button), movement (e.g., standing-up, moving around) etc. ROTUNDE, an instance of smart meeting cinematography concept, aims to: (a) develop functionality-driven benchmarks with respect to the interpretation and control capabilities of human-cinematographers, real-time video editors, surveillance personnel, and typical human performance in everyday situations; (b) Develop general tools for the commonsense cognitive interpretation of dynamic scenes from the viewpoint of visuo-spatial cognition centred perceptual narrativisation. Particular emphasis is placed on declarative representations and interfacing mechanisms that seamlessly integrate within large-scale cognitive (interaction) systems and companion technologies consisting of diverse AI sub-components. For instance, the envisaged tools would provide general capabilities for high-level commonsense reasoning about space, events, actions, change, and interaction. △ Less

Submitted 5 June, 2013; originally announced June 2013.

Comments: Appears in AAAI-2013 Workshop on: Space, Time, and Ambient Intelligence (STAMI 2013)

Showing 1–15 of 15 results for author: Suchan, J