Showing 1–12 of 12 results for author: Fan, J E

  1. arXiv:2312.12463  [pdf, other]

    cs.CV

    Open Vocabulary Semantic Scene Sketch Understanding

    Authors: Ahmed Bourouis, Judith Ellen Fan, Yulia Gryaditskaya

    Abstract: We study the underexplored but fundamental vision problem of machine understanding of abstract freehand scene sketches. We introduce a sketch encoder that results in semantically-aware feature space, which we evaluate by testing its performance on a semantic sketch segmentation task. To train our model we rely only on the availability of bitmap sketches with their brief captions and do not require…

    Submitted 30 March, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  2. arXiv:2312.06721  [pdf, other]

    cs.CV

    Counterfactual World Modeling for Physical Dynamics Understanding

    Authors: Rahul Venkatesh, Honglin Chen, Kevin Feigelis, Daniel M. Bear, Khaled Jedoui, Klemen Kotar, Felix Binder, Wanhee Lee, Sherry Liu, Kevin A. Smith, Judith E. Fan, Daniel L. K. Yamins

    Abstract: The ability to understand physical dynamics is essential to learning agents acting in the world. This paper presents Counterfactual World Modeling (CWM), a candidate pure vision foundational model for physical dynamics understanding. CWM consists of three basic concepts. First, we propose a simple and powerful temporally-factored masking policy for masked prediction of video data, which encourages…

    Submitted 25 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

  3. arXiv:2312.03035  [pdf, other]

    cs.CV

    SEVA: Leveraging sketches to evaluate alignment between human and machine visual abstraction

    Authors: Kushin Mukherjee, Holly Huey, Xuanchen Lu, Yael Vinker, Rio Aguina-Kang, Ariel Shamir, Judith E. Fan

    Abstract: Sketching is a powerful tool for creating abstract images that are sparse but meaningful. Sketch understanding poses fundamental challenges for general-purpose vision algorithms because it requires robustness to the sparsity of sketches relative to natural visual inputs and because it demands tolerance for semantic ambiguity, as sketches can reliably evoke multiple meanings. While current vision a…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: Accepted to the Advances in Neural Information Processing Systems (Datasets and Benchmarks Track) 2023

  4. arXiv:2307.12967  [pdf, other]

    cs.CV cs.LG

    Learning Dense Correspondences between Photos and Sketches

    Authors: Xuanchen Lu, Xiaolong Wang, Judith E Fan

    Abstract: Humans effortlessly grasp the connection between sketches and real-world objects, even when these sketches are far from realistic. Moreover, human sketch understanding goes beyond categorization -- critically, it also entails understanding how individual elements within a sketch correspond to parts of the physical world it represents. What are the computational ingredients needed to support this a…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: Accepted to ICML 2023. Project page: https://photo-sketch-correspondence.github.io

  5. arXiv:2306.15668  [pdf, other]

    cs.CV cs.AI cs.GR cs.RO

    Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties

    Authors: Hsiao-Yu Tung, Mingyu Ding, Zhenfang Chen, Daniel Bear, Chuang Gan, Joshua B. Tenenbaum, Daniel LK Yamins, Judith E Fan, Kevin A. Smith

    Abstract: General physical scene understanding requires more than simply localizing and recognizing objects -- it requires knowledge that objects can have different latent properties (e.g., mass or elasticity), and that those properties affect the outcome of physical events. While there has been great progress in physical and video prediction models in recent years, benchmarks to test their performance typi…

    Submitted 1 November, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted by NeurIPS 2023 Datasets and Benchmarks Track

  6. arXiv:2205.11613  [pdf, other]

    cs.HC

    How do people incorporate advice from artificial agents when making physical judgments?

    Authors: Erik Brockbank, Haoliang Wang, Justin Yang, Suvir Mirchandani, Erdem Bıyık, Dorsa Sadigh, Judith E. Fan

    Abstract: How do people build up trust with artificial agents? Here, we study a key component of interpersonal trust: people's ability to evaluate the competence of another agent across repeated interactions. Prior work has largely focused on appraisal of simple, static skills; in contrast, we probe competence evaluations in a rich setting with agents that learn over time. Participants played a video game i…

    Submitted 16 May, 2022; originally announced May 2022.

  7. arXiv:2205.05666  [pdf, other]

    cs.CL cs.AI

    Identifying concept libraries from language about object structure

    Authors: Catherine Wong, William P. McCarthy, Gabriel Grand, Yoni Friedman, Joshua B. Tenenbaum, Jacob Andreas, Robert D. Hawkins, Judith E. Fan

    Abstract: Our understanding of the visual world goes beyond naming objects, encompassing our ability to parse objects into meaningful parts, attributes, and relations. In this work, we leverage natural language descriptions for a diverse set of 2K procedurally generated objects to identify the parts people use and the principles leading these parts to be favored over others. We formalize our problem as sear…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Appears in the conference proceedings of CogSci 2022

  8. arXiv:2109.13861  [pdf, other]

    cs.CV

    Visual resemblance and communicative context constrain the emergence of graphical conventions

    Authors: Robert D. Hawkins, Megumi Sano, Noah D. Goodman, Judith E. Fan

    Abstract: From photorealistic sketches to schematic diagrams, drawing provides a versatile medium for communicating about the visual world. How do images spanning such a broad range of appearances reliably convey meaning? Do viewers understand drawings based solely on their ability to resemble the entities they refer to (i.e., as images), or do they understand drawings based on shared but arbitrary associat…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: 26 pages; 8 figures; submitted version of manuscript

  9. arXiv:2107.00077  [pdf, other]

    cs.CL

    Learning to communicate about shared procedural abstractions

    Authors: William P. McCarthy, Robert D. Hawkins, Haoliang Wang, Cameron Holdaway, Judith E. Fan

    Abstract: Many real-world tasks require agents to coordinate their behavior to achieve shared goals. Successful collaboration requires not only adopting the same communicative conventions, but also grounding these conventions in the same task-appropriate conceptual abstractions. We investigate how humans use natural language to collaboratively solve physical assembly problems more effectively over time. Hum…

    Submitted 30 June, 2021; originally announced July 2021.

  10. arXiv:2106.08261  [pdf, other]

    cs.AI cs.CV

    Physion: Evaluating Physical Prediction from Vision in Humans and Machines

    Authors: Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Yu Fish Tung, R. T. Pramod, Cameron Holdaway, Sirui Tao, Kevin Smith, Fan-Yun Sun, Li Fei-Fei, Nancy Kanwisher, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith E. Fan

    Abstract: While current vision algorithms excel at many challenging tasks, it is unclear how well they understand the physical dynamics of real-world environments. Here we introduce Physion, a dataset and benchmark for rigorously evaluating the ability to predict how physical scenarios will evolve over time. Our dataset features realistic simulations of a wide range of physical phenomena, including rigid an…

    Submitted 20 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: 28 pages

    ACM Class: I.2.10; I.4.8; I.5

  11. arXiv:2106.05654  [pdf, other]

    cs.AI cs.RO

    Visual scoping operations for physical assembly

    Authors: Felix J Binder, Marcelo M Mattar, David Kirsh, Judith E Fan

    Abstract: Planning is hard. The use of subgoals can make planning more tractable, but selecting these subgoals is computationally costly. What algorithms might enable us to reap the benefits of planning using subgoals while minimizing the computational overhead of selecting them? We propose visual scoping, a strategy that interleaves planning and acting by alternately defining a spatial region as the next s…

    Submitted 4 August, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

    MSC Class: I.2

    Journal ref: Proceedings for the 43rd Annual Meeting of the Cognitive Science Society 2021

  12. arXiv:2106.02775  [pdf, other]

    cs.CV

    Visual communication of object concepts at different levels of abstraction

    Authors: Justin Yang, Judith E. Fan

    Abstract: People can produce drawings of specific entities (e.g., Garfield), as well as general categories (e.g., "cat"). What explains this ability to produce such varied drawings of even highly familiar object concepts? We hypothesized that drawing objects at different levels of abstraction depends on both sensory information and representational goals, such that drawings intended to portray a recently se…

    Submitted 4 June, 2021; originally announced June 2021.

    Comments: To appear in Proceedings of the 43rd Annual Meeting of the Cognitive Science Society. 7 pages, 5 figures