Showing 1–26 of 26 results for author: Mahdavi-Amiri, A

  1. arXiv:2407.06305  [pdf, other]

    cs.CV cs.GR

    SweepNet: Unsupervised Learning Shape Abstraction via Neural Sweepers

    Authors: Mingrui Zhao, Yizhi Wang, Fenggen Yu, Changqing Zou, Ali Mahdavi-Amiri

    Abstract: Shape abstraction is an important task for simplifying complex geometric structures while retaining essential features. Sweep surfaces, commonly found in human-made objects, aid in this process by effectively capturing and representing object geometry, thereby facilitating abstraction. In this paper, we introduce SweepNet, a novel approach to shape abstraction through sweep surfaces. We propose…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 14 pages, 20 figures, ECCV 2024

  2. arXiv:2406.01300  [pdf, other]

    cs.CV

    pOps: Photo-Inspired Diffusion Operators

    Authors: Elad Richardson, Yuval Alaluf, Ali Mahdavi-Amiri, Daniel Cohen-Or

    Abstract: Text-guided image generation enables the creation of visual content from textual descriptions. However, certain visual concepts cannot be effectively conveyed through language alone. This has sparked a renewed interest in utilizing the CLIP image embedding space for more visually-oriented tasks through methods such as IP-Adapter. Interestingly, the CLIP image embedding space has been shown to be s…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Project Page: https://popspaper.github.io/pOps/

  3. arXiv:2405.17393  [pdf, other]

    cs.CV

    EASI-Tex: Edge-Aware Mesh Texturing from Single Image

    Authors: Sai Raj Kishore Perla, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We present a novel approach for single-image mesh texturing, which employs a diffusion model with judicious conditioning to seamlessly transfer an object's texture from a single RGB image to a given 3D mesh object. We do not assume that the two objects belong to the same category, and even if they do, there can be significant discrepancies in their geometry and part proportions. Our method aims to…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: ACM Transactions on Graphics (Proceedings of SIGGRAPH), 2024. Project Page: https://sairajk.github.io/easi-tex/

  4. arXiv:2403.14937  [pdf, other]

    cs.CV

    Survey on Modeling of Articulated Objects

    Authors: Jiayi Liu, Manolis Savva, Ali Mahdavi-Amiri

    Abstract: 3D modeling of articulated objects is a research problem within computer vision, graphics, and robotics. Its objective is to understand the shape and motion of the articulated components, represent the geometry and mobility of object parts, and create realistic models that reflect articulated objects in the real world. This survey provides a comprehensive overview of the current state-of-the-art i…

    Submitted 21 March, 2024; originally announced March 2024.

  5. arXiv:2403.04943  [pdf, other]

    cs.CV

    AFreeCA: Annotation-Free Counting for All

    Authors: Adriano D'Alessandro, Ali Mahdavi-Amiri, Ghassan Hamarneh

    Abstract: Object counting methods typically rely on manually annotated datasets. The cost of creating such datasets has restricted the versatility of these networks to count objects from specific classes (such as humans or penguins), and counting objects from diverse categories remains a challenge. The availability of robust text-to-image latent diffusion models (LDMs) raises the question of whether these m…

    Submitted 7 March, 2024; originally announced March 2024.

  6. arXiv:2402.03549  [pdf]

    cs.CV

    AnaMoDiff: 2D Analogical Motion Diffusion via Disentangled Denoising

    Authors: Maham Tanveer, Yizhi Wang, Ruiqi Wang, Nanxuan Zhao, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We present AnaMoDiff, a novel diffusion-based method for 2D motion analogies that is applied to raw, unannotated videos of articulated characters. Our goal is to accurately transfer motions from a 2D driving video onto a source character, with its identity, in terms of appearance and natural movement, well preserved, even when there may be significant discrepancies between the source and driving c…

    Submitted 5 February, 2024; originally announced February 2024.

  7. arXiv:2312.09570  [pdf, other]

    cs.CV

    CAGE: Controllable Articulation GEneration

    Authors: Jiayi Liu, Hou In Ivan Tam, Ali Mahdavi-Amiri, Manolis Savva

    Abstract: We address the challenge of generating 3D articulated objects in a controllable fashion. Currently, modeling articulated 3D objects is either achieved through laborious manual authoring, or using methods from prior work that are hard to scale and control directly. We leverage the interplay between part shape, connectivity, and motion using a denoising diffusion-based method with attention modules…

    Submitted 20 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: CVPR 2024. Project page: https://3dlg-hcvc.github.io/cage/

  8. arXiv:2312.02221  [pdf, other]

    cs.CV cs.GR

    Slice3D: Multi-Slice, Occlusion-Revealing, Single View 3D Reconstruction

    Authors: Yizhi Wang, Wallace Lira, Wenqi Wang, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We introduce multi-slice reasoning, a new notion for single-view 3D reconstruction which challenges the current and prevailing belief that multi-view synthesis is the most natural conduit between single-view and 3D. Our key observation is that object slicing is more advantageous than altering views to reveal occluded structures. Specifically, slicing is more occlusion-revealing since it can peel t…

    Submitted 10 December, 2023; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: Project website: https://yizhiwang96.github.io/Slice3D/

  9. arXiv:2311.17083  [pdf, other]

    cs.CV

    CLiC: Concept Learning in Context

    Authors: Mehdi Safaee, Aryan Mikaeili, Or Patashnik, Daniel Cohen-Or, Ali Mahdavi-Amiri

    Abstract: This paper addresses the challenge of learning a local visual pattern of an object from one image, and generating images depicting objects with that pattern. Learning a localized concept and placing it on an object in a target image is a nontrivial task, as the objects may have different orientations and shapes. Our approach builds upon recent advancements in visual concept learning. It involves a…

    Submitted 27 November, 2023; originally announced November 2023.

  10. arXiv:2310.01662  [pdf, other]

    cs.CV

    SYRAC: Synthesize, Rank, and Count

    Authors: Adriano D'Alessandro, Ali Mahdavi-Amiri, Ghassan Hamarneh

    Abstract: Crowd counting is a critical task in computer vision, with several important applications. However, existing counting methods rely on labor-intensive density map annotations, necessitating the manual localization of each individual pedestrian. While recent efforts have attempted to alleviate the annotation burden through weakly or semi-supervised learning, these approaches fall short of significan…

    Submitted 11 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  11. arXiv:2309.00733  [pdf, other]

    cs.CV cs.LG

    TExplain: Explaining Learned Visual Features via Pre-trained (Frozen) Language Models

    Authors: Saeid Asgari Taghanaki, Aliasghar Khani, Ali Saheb Pasand, Amir Khasahmadi, Aditya Sanghi, Karl D. D. Willis, Ali Mahdavi-Amiri

    Abstract: Interpreting the learned features of vision models has posed a longstanding challenge in the field of machine learning. To address this issue, we propose a novel method that leverages the capabilities of language models to interpret the learned features of pre-trained image classifiers. Our method, called TExplain, tackles this task by training a neural network to establish a connection between th…

    Submitted 1 May, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted to ICLR 2024, Reliable and Responsible Foundation Models workshop

  12. arXiv:2308.07391  [pdf, other]

    cs.CV cs.AI cs.GR

    PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects

    Authors: Jiayi Liu, Ali Mahdavi-Amiri, Manolis Savva

    Abstract: We address the task of simultaneous part-level reconstruction and motion parameter estimation for articulated objects. Given two sets of multi-view images of an object in two static articulation states, we decouple the movable part from the static part and reconstruct shape and appearance while predicting the motion parameters. To tackle this problem, we present PARIS: a self-supervised, end-to-en…

    Submitted 14 August, 2023; originally announced August 2023.

    Comments: Presented at ICCV 2023. Project website: https://3dlg-hcvc.github.io/paris/

  13. arXiv:2305.18601  [pdf, other]

    cs.CV

    BRICS: Bi-level feature Representation of Image CollectionS

    Authors: Dingdong Yang, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We present BRICS, a bi-level feature representation for image collections, which consists of a key code space on top of a feature grid space. Specifically, our representation is learned by an autoencoder to encode images into continuous key codes, which are used to retrieve features from groups of multi-resolution feature grids. Our key codes and feature grids are jointly trained continuously with…

    Submitted 30 December, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

  14. arXiv:2303.10735  [pdf, other]

    cs.CV cs.GR

    SKED: Sketch-guided Text-based 3D Editing

    Authors: Aryan Mikaeili, Or Perel, Mehdi Safaee, Daniel Cohen-Or, Ali Mahdavi-Amiri

    Abstract: Text-to-image diffusion models are gradually introduced into computer graphics, recently enabling the development of Text-to-3D pipelines in an open domain. However, for interactive editing purposes, local manipulations of content through a simplistic textual interface can be arduous. Incorporating user guided sketches with Text-to-image pipelines offers users more intuitive control. Still, as sta…

    Submitted 18 August, 2023; v1 submitted 19 March, 2023; originally announced March 2023.

  15. arXiv:2303.09604  [pdf, other]

    cs.CV cs.GR

    DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion

    Authors: Maham Tanveer, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We introduce a novel method to automatically generate an artistic typography by stylizing one or more letter fonts to visually convey the semantics of an input word, while ensuring that the output remains readable. To address an assortment of challenges with our task at hand including conflicting goals (artistic stylization vs. legibility), lack of ground truth, and immense search space, our appro…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Project website: https://ds-fusion.github.io/

  16. arXiv:2302.10167  [pdf, other]

    cs.CV cs.GR cs.LG

    Cross-domain Compositing with Pretrained Diffusion Models

    Authors: Roy Hachnochi, Mingrui Zhao, Nadav Orzech, Rinon Gal, Ali Mahdavi-Amiri, Daniel Cohen-Or, Amit Haim Bermano

    Abstract: Diffusion models have enabled high-quality, conditional image editing capabilities. We propose to expand their arsenal, and demonstrate that off-the-shelf diffusion models can be used for a wide range of cross-domain compositing tasks. Among numerous others, these include image blending, object immersion, texture-replacement and even CG2Real translation or stylization. We employ a localized, itera…

    Submitted 25 May, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Code: https://github.com/cross-domain-compositing/cross-domain-compositing

  17. arXiv:2210.00055  [pdf, other]

    cs.LG cs.CV

    MaskTune: Mitigating Spurious Correlations by Forcing to Explore

    Authors: Saeid Asgari Taghanaki, Aliasghar Khani, Fereshte Khani, Ali Gholami, Linh Tran, Ali Mahdavi-Amiri, Ghassan Hamarneh

    Abstract: A fundamental challenge of over-parameterized deep learning models is learning meaningful data representations that yield good performance on a downstream task without over-fitting spurious input features. This work proposes MaskTune, a masking strategy that prevents over-reliance on spurious (or a limited number of) features. MaskTune forces the trained model to explore new features during a sing…

    Submitted 8 October, 2022; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  18. arXiv:2208.02205  [pdf, other]

    cs.CV

    Large-scale Building Damage Assessment using a Novel Hierarchical Transformer Architecture on Satellite Images

    Authors: Navjot Kaur, Cheng-Chun Lee, Ali Mostafavi, Ali Mahdavi-Amiri

    Abstract: This paper presents DAHiTrA, a novel deep-learning model with hierarchical transformers to classify building damages based on satellite images in the aftermath of natural disasters. Satellite imagery provides real-time and high-coverage information and offers opportunities to inform large-scale post-disaster building damage assessment, which is critical for rapid emergency response. In this work,…

    Submitted 4 February, 2023; v1 submitted 3 August, 2022; originally announced August 2022.

  19. arXiv:2112.06596  [pdf, other]

    cs.CV

    SAC-GAN: Structure-Aware Image Composition

    Authors: Hang Zhou, Rui Ma, Ling-Xiao Zhang, Lin Gao, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We introduce an end-to-end learning framework for image-to-image composition, aiming to plausibly compose an object represented as a cropped patch from an object image into a background scene image. As our approach emphasizes more on semantic and structural coherence of the composed images, rather than their pixel-level RGB accuracies, we tailor the input and output of our network with structure-a…

    Submitted 2 December, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: Accepted to TVCG. Code: https://github.com/RyanHangZhou/SAC-GAN

  20. arXiv:2112.05381  [pdf, other]

    cs.CV cs.GR

    UNIST: Unpaired Neural Implicit Shape Translation Network

    Authors: Qimin Chen, Johannes Merz, Aditya Sanghi, Hooman Shayani, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We introduce UNIST, the first deep neural implicit model for general-purpose, unpaired shape-to-shape translation, in both 2D and 3D domains. Our model is built on autoencoding implicit fields, rather than point clouds which represents the state of the art. Furthermore, our translation network is trained to perform the task over a latent grid representation which combines the merits of both latent…

    Submitted 30 March, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: CVPR 2022. project page: https://qiminchen.github.io/unist/

  21. arXiv:2106.16237  [pdf, other]

    cs.CV

    Multimodal Shape Completion via IMLE

    Authors: Himanshu Arora, Saurabh Mishra, Shichong Peng, Ke Li, Ali Mahdavi-Amiri

    Abstract: Shape completion is the problem of completing partial input shapes such as partial scans. This problem finds important applications in computer vision and robotics due to issues such as occlusion or sparsity in real-world data. However, most of the existing research related to shape completion has been focused on completing shapes by learning a one-to-one mapping which limits the diversity and cre…

    Submitted 7 July, 2021; v1 submitted 30 June, 2021; originally announced June 2021.

    Comments: Project Website: https://sites.google.com/site/alimahdaviamiri/projects/shape-completion

  22. arXiv:2104.05652  [pdf, other]

    cs.CV cs.GR cs.LG

    CAPRI-Net: Learning Compact CAD Shapes with Adaptive Primitive Assembly

    Authors: Fenggen Yu, Zhiqin Chen, Manyi Li, Aditya Sanghi, Hooman Shayani, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We introduce CAPRI-Net, a neural network for learning compact and interpretable implicit representations of 3D computer-aided design (CAD) models, in the form of adaptive primitive assemblies. Our network takes an input 3D shape that can be provided as a point cloud or voxel grids, and reconstructs it by a compact assembly of quadric surface primitives via constructive solid geometry (CSG) operati…

    Submitted 12 April, 2021; originally announced April 2021.

  23. arXiv:2104.04606  [pdf, other]

    cs.CV

    RaidaR: A Rich Annotated Image Dataset of Rainy Street Scenes

    Authors: Jiongchao Jin, Arezou Fatemi, Wallace Lira, Fenggen Yu, Biao Leng, Rui Ma, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We introduce RaidaR, a rich annotated image dataset of rainy street scenes, to support autonomous driving research. The new dataset contains the largest number of rainy images (58,542) to date, 5,000 of which provide semantic segmentations and 3,658 provide object instance segmentations. The RaidaR images cover a wide range of realistic rain-induced artifacts, including fog, droplets, and road ref…

    Submitted 26 October, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: Presented in Second ICCV Workshop on Autonomous Vehicle Vision (AVVision), 2021. Website: https://raidar-dataset.com/

  24. Data to Physicalization: A Survey of the Physical Rendering Process

    Authors: Hessam Djavaherpour, Faramarz Samavati, Ali Mahdavi-Amiri, Fatemeh Yazdanbakhsh, Samuel Huron, Richard Levy, Yvonne Jansen, Lora Oehlberg

    Abstract: Physical representations of data offer physical and spatial ways of looking at, navigating, and interacting with data. While digital fabrication has facilitated the creation of objects with data-driven geometry, rendering data as a physically fabricated object is still a daunting leap for many physicalization designers. Rendering in the scope of this research refers to the back-and-forth process f…

    Submitted 22 February, 2021; originally announced February 2021.

    ACM Class: I.3.5; I.3.8

    Journal ref: Computer Graphics Forum, Wiley, 2021, 40 (3), pp.569-598

  25. arXiv:2007.04883  [pdf, other]

    cs.CV

    PIE-NET: Parametric Inference of Point Cloud Edges

    Authors: Xiaogang Wang, Yuelang Xu, Kai Xu, Andrea Tagliasacchi, Bin Zhou, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We introduce an end-to-end learnable technique to robustly identify feature edges in 3D point cloud data. We represent these edges as a collection of parametric curves (i.e., lines, circles, and B-splines). Accordingly, our deep neural network, coined PIE-NET, is trained for parametric inference of edges. The network relies on a "region proposal" architecture, where a first module proposes an over-…

    Submitted 25 October, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

  26. arXiv:1804.06579  [pdf, other]

    cs.GR cs.CV

    Semi-Supervised Co-Analysis of 3D Shape Styles from Projected Lines

    Authors: Fenggen Yu, Yan Zhang, Kai Xu, Ali Mahdavi-Amiri, Hao Zhang

    Abstract: We present a semi-supervised co-analysis method for learning 3D shape styles from projected feature lines, achieving style patch localization with only weak supervision. Given a collection of 3D shapes spanning multiple object categories and styles, we perform style co-analysis over projected feature lines of each 3D shape and then backproject the learned style features onto the 3D shapes. Our cor…

    Submitted 15 January, 2022; v1 submitted 18 April, 2018; originally announced April 2018.

    Comments: 17 pages, 25 figures

    MSC Class: 68U05 ACM Class: H.4.2

    Journal ref: ACM Transactions on Graphics 37(2). February 2018