Showing 1–9 of 9 results for author: Ehinger, K A

  1. arXiv:2405.05791  [pdf, other]

    cs.CV

    Sequential Amodal Segmentation via Cumulative Occlusion Learning

    Authors: Jiayang Ao, Qiuhong Ke, Krista A. Ehinger

    Abstract: To fully understand the 3D context of a single image, a visual system must be able to segment both the visible and occluded regions of objects, while discerning their occlusion order. Ideally, the system should be able to handle any object and not be restricted to segmenting a limited set of object classes, especially in robotic applications. Addressing this need, we introduce a diffusion model wi…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2402.12138  [pdf, other]

    cs.CV

    Perceiving Longer Sequences With Bi-Directional Cross-Attention Transformers

    Authors: Markus Hiller, Krista A. Ehinger, Tom Drummond

    Abstract: We present a novel bi-directional Transformer architecture (BiXT) which scales linearly with input size in terms of computational cost and memory consumption, but does not suffer the drop in performance or limitation to only one input modality seen with other efficient Transformer-based approaches. BiXT is inspired by the Perceiver architectures but replaces iterative attention with an efficient b…

    Submitted 26 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Preprint; Added LRA section & updated scaling trends

  3. arXiv:2401.07426  [pdf, other]

    cs.AI

    Generalized Planning for the Abstraction and Reasoning Corpus

    Authors: Chao Lei, Nir Lipovetzky, Krista A. Ehinger

    Abstract: The Abstraction and Reasoning Corpus (ARC) is a general artificial intelligence benchmark that poses difficulties for pure machine learning methods due to its requirement for fluid intelligence with a focus on reasoning and abstraction. In this work, we introduce an ARC solver, Generalized Planning for Abstract Reasoning (GPAR). It casts an ARC problem as a generalized planning (GP) problem, where…

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: Accepted at AAAI 2024 (extended version)

  4. arXiv:2310.17167  [pdf, other]

    cs.LG cs.AI cs.CV

    Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise

    Authors: Zhenkai Zhang, Krista A. Ehinger, Tom Drummond

    Abstract: This paper introduces two key contributions aimed at improving the speed and quality of images generated through inverse diffusion processes. The first contribution involves reparameterizing the diffusion process in terms of the angle on a quarter-circular arc between the image and noise, specifically setting the conventional $\sqrt{\bar{\alpha}}=\cos(\eta)$. This reparameterization eliminates…

    Submitted 26 October, 2023; originally announced October 2023.

  5. arXiv:2307.00735  [pdf, other]

    cs.AI

    Novelty and Lifted Helpful Actions in Generalized Planning

    Authors: Chao Lei, Nir Lipovetzky, Krista A. Ehinger

    Abstract: It has been shown recently that successful techniques in classical planning, such as goal-oriented heuristics and landmarks, can improve the ability to compute planning programs for generalized planning (GP) problems. In this work, we introduce the notion of action novelty rank, which computes novelty with respect to a planning program, and propose novelty-based generalized planning solvers, which…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: Accepted at SoCS 2023 (extended version)

  6. arXiv:2303.06596  [pdf, other]

    cs.CV cs.LG

    Amodal Intra-class Instance Segmentation: Synthetic Datasets and Benchmark

    Authors: Jiayang Ao, Qiuhong Ke, Krista A. Ehinger

    Abstract: Images of realistic scenes often contain intra-class objects that are heavily occluded from each other, making the amodal perception task that requires parsing the occluded parts of the objects challenging. Although important for downstream tasks such as robotic grasping systems, the lack of large-scale amodal datasets with detailed annotations makes it difficult to model intra-class occlusions ex…

    Submitted 7 November, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: Accepted at WACV 2024. Datasets are available at https://github.com/saraao/amodal-dataset

  7. Image Amodal Completion: A Survey

    Authors: Jiayang Ao, Qiuhong Ke, Krista A. Ehinger

    Abstract: Existing computer vision systems can compete with humans in understanding the visible parts of objects, but still fall far short of humans when it comes to depicting the invisible parts of partially occluded objects. Image amodal completion aims to equip computers with human-like amodal completion functions to understand an intact object despite it being partially occluded. The main purpose of thi…

    Submitted 7 November, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: Accepted at Computer Vision and Image Understanding. See https://doi.org/10.1016/j.cviu.2023.103661 for the final version

  8. arXiv:2006.15417  [pdf, other]

    cs.CV cs.AI cs.LG

    Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors

    Authors: Ruihan Zhang, Prashan Madumal, Tim Miller, Krista A. Ehinger, Benjamin I. P. Rubinstein

    Abstract: Convolutional neural network (CNN) models for computer vision are powerful but lack explainability in their most basic form. This deficiency remains a key challenge when applying CNNs in important domains. Recent work on explanations through feature importance of approximate linear models has moved from input-level features (pixels or segments) to features from mid-layer feature maps in the form o…

    Submitted 17 June, 2021; v1 submitted 27 June, 2020; originally announced June 2020.

  9. arXiv:1504.06755  [pdf, other]

    cs.CV

    TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking

    Authors: Pingmei Xu, Krista A Ehinger, Yinda Zhang, Adam Finkelstein, Sanjeev R. Kulkarni, Jianxiong Xiao

    Abstract: Traditional eye tracking requires specialized hardware, which means collecting gaze data from many observers is expensive, tedious and slow. Therefore, existing saliency prediction datasets are order-of-magnitudes smaller than typical datasets for other vision recognition tasks. The small size of these datasets limits the potential for training data intensive algorithms, and causes overfitting in…

    Submitted 20 May, 2015; v1 submitted 25 April, 2015; originally announced April 2015.

    Comments: 9 pages, 14 figures