Search | arXiv e-print repository

Neurosymbolic Grounding for Compositional World Models

Authors: Atharva Sehgal, Arya Grayeli, Jennifer J. Sun, Swarat Chaudhuri

Abstract: We introduce Cosmos, a framework for object-centric world modeling that is designed for compositional generalization (CompGen), i.e., high performance on unseen input scenes obtained through the composition of known visual "atoms." The central insight behind Cosmos is the use of a novel form of neurosymbolic grounding. Specifically, the framework introduces two new tools: (i) neurosymbolic scene e… ▽ More We introduce Cosmos, a framework for object-centric world modeling that is designed for compositional generalization (CompGen), i.e., high performance on unseen input scenes obtained through the composition of known visual "atoms." The central insight behind Cosmos is the use of a novel form of neurosymbolic grounding. Specifically, the framework introduces two new tools: (i) neurosymbolic scene encodings, which represent each entity in a scene using a real vector computed using a neural encoder, as well as a vector of composable symbols describing attributes of the entity, and (ii) a neurosymbolic attention mechanism that binds these entities to learned rules of interaction. Cosmos is end-to-end differentiable; also, unlike traditional neurosymbolic methods that require representations to be manually mapped to symbols, it computes an entity's symbolic attributes using vision-language foundation models. Through an evaluation that considers two different forms of CompGen on an established blocks-pushing domain, we show that the framework establishes a new state-of-the-art for CompGen in world modeling. Artifacts are available at: https://trishullab.github.io/cosmos-web/ △ Less

Submitted 10 May, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

Comments: Uploading ICLR,2024 Camera Ready Version

arXiv:2010.16198 [pdf, other]

Automatic Myocardial Infarction Evaluation from Delayed-Enhancement Cardiac MRI using Deep Convolutional Networks

Authors: Kibrom Berihu Girum, Youssef Skandarani, Raabid Hussain, Alexis Bozorg Grayeli, Gilles Créhange, Alain Lalande

Abstract: In this paper, we propose a new deep learning framework for an automatic myocardial infarction evaluation from clinical information and delayed enhancement-MRI (DE-MRI). The proposed framework addresses two tasks. The first task is automatic detection of myocardial contours, the infarcted area, the no-reflow area, and the left ventricular cavity from a short-axis DE-MRI series. It employs two segm… ▽ More In this paper, we propose a new deep learning framework for an automatic myocardial infarction evaluation from clinical information and delayed enhancement-MRI (DE-MRI). The proposed framework addresses two tasks. The first task is automatic detection of myocardial contours, the infarcted area, the no-reflow area, and the left ventricular cavity from a short-axis DE-MRI series. It employs two segmentation neural networks. The first network is used to segment the anatomical structures such as the myocardium and left ventricular cavity. The second network is used to segment the pathological areas such as myocardial infarction, myocardial no-reflow, and normal myocardial region. The segmented myocardium region from the first network is further used to refine the second network's pathological segmentation results. The second task is to automatically classify a given case into normal or pathological from clinical information with or without DE-MRI. A cascaded support vector machine (SVM) is employed to classify a given case from its associated clinical information. The segmented pathological areas from DE-MRI are also used for the classification task. We evaluated our method on the 2020 EMIDEC MICCAI challenge dataset. It yielded an average Dice index of 0.93 and 0.84, respectively, for the left ventricular cavity and the myocardium. The classification from using only clinical information yielded 80% accuracy over five-fold cross-validation. Using the DE-MRI, our method can classify the cases with 93.3% accuracy. These experimental results reveal that the proposed method can automatically evaluate the myocardial infarction. △ Less

Submitted 30 October, 2020; originally announced October 2020.

arXiv:1909.01647 [pdf]

3D landmark detection for augmented reality based otologic procedures

Authors: Raabid Hussain, Alain Lalande, Kibrom Berihu Girum, Caroline Guigou, Alexis Bozorg Grayeli

Abstract: Ear consists of the smallest bones in the human body and does not contain significant amount of distinct landmark points that may be used to register a preoperative CT-scan with the surgical video in an augmented reality framework. Learning based algorithms may be used to help the surgeons to identify landmark points. This paper presents a convolutional neural network approach to landmark detectio… ▽ More Ear consists of the smallest bones in the human body and does not contain significant amount of distinct landmark points that may be used to register a preoperative CT-scan with the surgical video in an augmented reality framework. Learning based algorithms may be used to help the surgeons to identify landmark points. This paper presents a convolutional neural network approach to landmark detection in preoperative ear CT images and then discusses an augmented reality system that can be used to visualize the cochlear axis on an otologic surgical video. △ Less

Submitted 4 September, 2019; originally announced September 2019.

Journal ref: Surgetica, Jun 2019, Rennes, France

Showing 1–3 of 3 results for author: Grayeli, A