
Showing 1–50 of 203 results for author: Ferrari, V

  1. arXiv:2406.15448  [pdf, ps, other]

    hep-th math-ph math.AG math.DG

    On Spectral Data for $(2,2)$ Berry Connections, Difference Equations & Equivariant Quantum Cohomology

    Authors: Andrea E. V. Ferrari, Daniel Zhang

    Abstract: We study supersymmetric Berry connections of 2d $\mathcal{N}=(2,2)$ gauged linear sigma models (GLSMs) quantized on a circle, which are periodic monopoles, with the aim to provide a fruitful physical arena for recent mathematical constructions related to the latter. These are difference modules encoding monopole solutions via a Hitchin-Kobayashi correspondence established by Mochizuki. We demonstr…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Contribution to the proceedings of GLSM@30

  2. arXiv:2404.05465  [pdf, other]

    cs.CV cs.LG

    HAMMR: HierArchical MultiModal React agents for generic VQA

    Authors: Lluis Castrejon, Thomas Mensink, Howard Zhou, Vittorio Ferrari, Andre Araujo, Jasper Uijlings

    Abstract: Combining Large Language Models (LLMs) with external specialized tools (LLMs+tools) is a recent paradigm to solve multimodal tasks such as Visual Question Answering (VQA). While this approach was demonstrated to work well when optimized and evaluated for each individual benchmark, in practice it is crucial for the next generation of real-world AI systems to handle a broad range of multimodal probl…

    Submitted 8 April, 2024; originally announced April 2024.

  3. arXiv:2312.00878  [pdf, other]

    cs.CV cs.AI

    Grounding Everything: Emerging Localization Properties in Vision-Language Transformers

    Authors: Walid Bousselham, Felix Petersen, Vittorio Ferrari, Hilde Kuehne

    Abstract: Vision-language foundation models have shown remarkable performance in various zero-shot settings such as image retrieval, classification, or captioning. But so far, those models seem to fall behind when it comes to zero-shot localization of referential expressions and objects in images. As a result, they need to be fine-tuned for this task. In this paper, we show that pretrained vision-language (…

    Submitted 14 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: Code available at https://github.com/WalBouss/GEM

  4. arXiv:2312.00357  [pdf]

    eess.IV cs.CV cs.LG

    A Generalizable Deep Learning System for Cardiac MRI

    Authors: Rohan Shad, Cyril Zakka, Dhamanpreet Kaur, Robyn Fong, Ross Warren Filice, John Mongan, Kimberly Kalianos, Nishith Khandwala, David Eng, Matthew Leipzig, Walter Witschey, Alejandro de Feria, Victor Ferrari, Euan Ashley, Michael A. Acker, Curtis Langlotz, William Hiesinger

    Abstract: Cardiac MRI allows for a comprehensive assessment of myocardial structure, function, and tissue characteristics. Here we describe a foundational vision system for cardiac MRI, capable of representing the breadth of human cardiovascular disease and health. Our deep learning model is trained via self-supervised contrastive learning, by which visual concepts in cine-sequence cardiac MRI scans are lea…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 21 page main manuscript, 4 figures. Supplementary Appendix and code will be made available on publication

    ACM Class: I.2.10

  5. arXiv:2311.08454  [pdf, other]

    hep-th math-ph math.AG math.DG

    Berry Connections for 2d $(2,2)$ Theories, Monopole Spectral Data & (Generalised) Cohomology Theories

    Authors: Andrea E. V. Ferrari, Daniel Zhang

    Abstract: We study Berry connections for supersymmetric ground states of 2d $\mathcal{N}=(2,2)$ GLSMs quantised on a circle, which are generalised periodic monopoles. Periodic monopole solutions may be encoded into difference modules, as shown by Mochizuki, or into an alternative algebraic construction given in terms of vector bundles endowed with filtrations. By studying the ground states in terms of a one…

    Submitted 3 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 54 pages + appendix. Some clarifications added, typos corrected, abstract streamlined

  6. arXiv:2311.05087  [pdf, other]

    hep-th math.QA

    Boundary vertex algebras for 3d $\mathcal{N}=4$ rank-0 SCFTs

    Authors: Andrea E. V. Ferrari, Niklas Garner, Heeyeon Kim

    Abstract: We initiate the study of boundary Vertex Operator Algebras (VOAs) of topologically twisted 3d $\mathcal{N}=4$ rank-0 SCFTs. This is a recently introduced class of $\mathcal{N}=4$ SCFTs that by definition have zero-dimensional Higgs and Coulomb branches. We briefly explain why it is reasonable to obtain rational VOAs at the boundary of their topological twists. When a rank-0 SCFT is realized as the…

    Submitted 27 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: minor revision

  7. arXiv:2311.04587  [pdf, other]

    cs.SE

    Log Statements Generation via Deep Learning: Widening the Support Provided to Developers

    Authors: Antonio Mastropaolo, Valentina Ferrari, Luca Pascarella, Gabriele Bavota

    Abstract: Logging assists in monitoring events that transpire during the execution of software. Previous research has highlighted the challenges confronted by developers when it comes to logging, including dilemmas such as where to log, what data to record, and which log level to employ (e.g., info, fatal). In this context, we introduced LANCE, an approach rooted in deep learning (DL) that has demonstrated…

    Submitted 8 November, 2023; originally announced November 2023.

  8. arXiv:2308.16139  [pdf, other]

    cs.CV cs.DB cs.LG

    MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision

    Authors: Jianning Li, Zongwei Zhou, Jiancheng Yang, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Chongyu Qu, Tiezheng Zhang, Xiaoxi Chen, Wenxuan Li, Marek Wodzinski, Paul Friedrich, Kangxian Xie, Yuan Jin, Narmada Ambigapathy, Enrico Nasca, Naida Solak, Gian Marco Melito, Viet Duc Vu, Afaque R. Memon, Christopher Schlachta, Sandrine De Ribaupierre, Rajnikant Patel, Roy Eagleson, Xiaojun Chen, et al. (132 additional authors not shown)

    Abstract: Prior to the deep learning era, shape was commonly used to describe the objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of Shape…

    Submitted 12 December, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: 16 pages

    MSC Class: 68T01

  9. arXiv:2308.11606  [pdf, other]

    cs.CV cs.CL

    StoryBench: A Multifaceted Benchmark for Continuous Story Visualization

    Authors: Emanuele Bugliarello, Hernan Moraldo, Ruben Villegas, Mohammad Babaeizadeh, Mohammad Taghi Saffar, Han Zhang, Dumitru Erhan, Vittorio Ferrari, Pieter-Jan Kindermans, Paul Voigtlaender

    Abstract: Generating video stories from text prompts is a complex task. In addition to having high visual quality, videos need to realistically adhere to a sequence of text prompts whilst being consistent throughout the frames. Creating a benchmark for video generation requires data annotated over time, which contrasts with the single caption used often in video datasets. To fill this gap, we collect compre…

    Submitted 12 October, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: NeurIPS D&B 2023

  10. arXiv:2306.09224  [pdf, other]

    cs.CV

    Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories

    Authors: Thomas Mensink, Jasper Uijlings, Lluis Castrejon, Arushi Goel, Felipe Cadar, Howard Zhou, Fei Sha, André Araujo, Vittorio Ferrari

    Abstract: We propose Encyclopedic-VQA, a large scale visual question answering (VQA) dataset featuring visual questions about detailed properties of fine-grained categories and instances. It contains 221k unique question+answer pairs each matched with (up to) 5 images, resulting in a total of 1M VQA samples. Moreover, our dataset comes with a controlled knowledge base derived from Wikipedia, marking the evi…

    Submitted 24 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: ICCV'23

  11. arXiv:2306.09109  [pdf, other]

    cs.CV

    NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations

    Authors: Varun Jampani, Kevis-Kokitsi Maninis, Andreas Engelhardt, Arjun Karpur, Karen Truong, Kyle Sargent, Stefan Popov, André Araujo, Ricardo Martin-Brualla, Kaushal Patel, Daniel Vlasic, Vittorio Ferrari, Ameesh Makadia, Ce Liu, Yuanzhen Li, Howard Zhou

    Abstract: Recent advances in neural reconstruction enable high-quality 3D object reconstruction from casually captured image collections. Current techniques mostly analyze their progress on relatively simple image collections where Structure-from-Motion (SfM) techniques can provide ground-truth (GT) camera poses. We note that SfM techniques tend to fail on in-the-wild image collections such as image search…

    Submitted 13 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 camera ready. Project page: https://navidataset.github.io

  12. arXiv:2306.09077  [pdf, other]

    cs.CV cs.GR

    Estimating Generic 3D Room Structures from 2D Annotations

    Authors: Denys Rozumnyi, Stefan Popov, Kevis-Kokitsi Maninis, Matthias Nießner, Vittorio Ferrari

    Abstract: Indoor rooms are among the most common use cases in 3D scene understanding. Current state-of-the-art methods for this task are driven by large annotated datasets. Room layouts are especially important, consisting of structural elements in 3D, such as wall, floor, and ceiling. However, they are difficult to annotate, especially on pure RGB video. We propose a novel method to produce generic 3D room…

    Submitted 21 December, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: https://github.com/google-research/cad-estate Accepted at 37th Conference on Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks

  13. arXiv:2306.09011  [pdf, other]

    cs.CV

    CAD-Estate: Large-scale CAD Model Annotation in RGB Videos

    Authors: Kevis-Kokitsi Maninis, Stefan Popov, Matthias Nießner, Vittorio Ferrari

    Abstract: We propose a method for annotating videos of complex multi-object scenes with a globally-consistent 3D representation of the objects. We annotate each object with a CAD model from a database, and place it in the 3D coordinate frame of the scene with a 9-DoF pose transformation. Our method is semi-automatic and works on commonly-available RGB videos, without requiring a depth sensor. Many steps are…

    Submitted 14 August, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Project page: https://github.com/google-research/cad-estate

  14. arXiv:2304.11055  [pdf, other]

    hep-th math.QA

    Free field realisation of boundary vertex algebras for Abelian gauge theories in three dimensions

    Authors: Christopher Beem, Andrea E. V. Ferrari

    Abstract: We study the boundary vertex algebras of $A$-twisted $\mathcal{N}=4$ Abelian gauge theories in three dimensions. These are identified with the BRST quotient (semi-infinite cohomology) of collections of symplectic bosons and free fermions that reflect the matter content of the corresponding gauge theory. We develop various free field realisations for these vertex algebras which we propose to interp…

    Submitted 21 April, 2023; originally announced April 2023.

    Comments: 54 pages + appendices

  15. arXiv:2304.06419  [pdf, other]

    cs.CV cs.GR

    Tracking by 3D Model Estimation of Unknown Objects in Videos

    Authors: Denys Rozumnyi, Jiri Matas, Marc Pollefeys, Vittorio Ferrari, Martin R. Oswald

    Abstract: Most model-free visual object tracking methods formulate the tracking task as object location estimation given by a 2D segmentation or a bounding box in each video frame. We argue that this representation is limited and instead propose to guide and improve 2D tracking with an explicit object representation, namely the textured 3D shape and 6DoF pose in each video frame. Our representation tackles…

    Submitted 13 April, 2023; originally announced April 2023.

  16. arXiv:2303.04739  [pdf, other]

    cs.CV cs.AR cs.LG cs.PF

    Advancing Direct Convolution using Convolution Slicing Optimization and ISA Extensions

    Authors: Victor Ferrari, Rafael Sousa, Marcio Pereira, João P. L. de Carvalho, José Nelson Amaral, José Moreira, Guido Araujo

    Abstract: Convolution is one of the most computationally intensive operations that must be performed for machine-learning model inference. A traditional approach to compute convolutions is known as the Im2Col + BLAS method. This paper proposes SConv: a direct-convolution algorithm based on a MLIR/LLVM code-generation toolchain that can be integrated into machine-learning compilers. This algorithm introduce…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: 15 pages, 11 figures

  17. arXiv:2302.12948  [pdf, other]

    cs.LG cs.AI cs.CV

    Agile Modeling: From Concept to Classifier in Minutes

    Authors: Otilia Stretcu, Edward Vendrow, Kenji Hata, Krishnamurthy Viswanathan, Vittorio Ferrari, Sasan Tavakkol, Wenlei Zhou, Aditya Avinash, Enming Luo, Neil Gordon Alldrin, MohammadHossein Bateni, Gabriel Berger, Andrew Bunner, Chun-Ta Lu, Javier A Rey, Giulia DeSalvo, Ranjay Krishna, Ariel Fuxman

    Abstract: The application of computer vision to nuanced subjective use cases is growing. While crowdsourcing has served the vision community well for most objective tasks (such as labeling a "zebra"), it now falters on tasks where there is substantial subjectivity in the concept (such as identifying "gourmet tuna"). However, empowering any user to develop a classifier for their concept is technically diffic…

    Submitted 12 May, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

  18. arXiv:2302.11217  [pdf, other]

    cs.CV

    Connecting Vision and Language with Video Localized Narratives

    Authors: Paul Voigtlaender, Soravit Changpinyo, Jordi Pont-Tuset, Radu Soricut, Vittorio Ferrari

    Abstract: We propose Video Localized Narratives, a new form of multimodal video annotations connecting vision and language. In the original Localized Narratives, annotators speak and move their mouse simultaneously on an image, thus grounding each word with a mouse trace segment. However, this is challenging on a video. Our new protocol empowers annotators to tell the story of a video with Localized Narrati…

    Submitted 15 March, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: Accepted at CVPR 2023

  19. arXiv:2302.03207  [pdf]

    physics.optics physics.app-ph

    Tunable Structural Transmissive Color in Fano-Resonant Optical Coatings Employing Phase-Change Materials

    Authors: Yi-Siou Huang, Chih-Yu Lee, Medha Rath, Victoria Ferrari, Heshan Yu, Taylor J. Woehl, Jimmy Ni, Ichiro Takeuchi, Carlos Ríos

    Abstract: Reversible, nonvolatile, and pronounced refractive index modulation is an unprecedented combination of properties enabled by chalcogenide phase-change materials (PCMs). This combination of properties makes PCMs a fast-growing platform for active, low-energy nanophotonics, including tunability to otherwise passive thin-film optical coatings. Here, we integrate the PCM Sb2Se3 into a novel four-layer…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: 16 pages, 12 figures

  20. Evaluation of the potential of Near Infrared Hyperspectral Imaging for monitoring the invasive brown marmorated stink bug

    Authors: Veronica Ferrari, Rosalba Calvini, Bas Boom, Camilla Menozzi, Aravind Krishnaswamy Rangarajan, Lara Maistrello, Peter Offermans, Alessandro Ulrici

    Abstract: The brown marmorated stink bug (BMSB), Halyomorpha halys, is an invasive insect pest of global importance that damages several crops, compromising agri-food production. Field monitoring procedures are fundamental to perform risk assessment operations, in order to promptly face crop infestations and avoid economical losses. To improve pest management, spectral cameras mounted on Unmanned Aerial Veh…

    Submitted 19 January, 2023; originally announced January 2023.

    Comments: Accepted manuscript

    Journal ref: Chemometrics and Intelligent Laboratory Systems, 2023, 234, 104751

  21. Generalized Symmetries and Anomalies of 3d N=4 SCFTs

    Authors: Lakshya Bhardwaj, Mathew Bullimore, Andrea E. V. Ferrari, Sakura Schafer-Nameki

    Abstract: We study generalized global symmetries and their 't Hooft anomalies in 3d N=4 superconformal field theories (SCFTs). Following some general considerations, we focus on good quiver gauge theories, comprised of balanced unitary nodes and unbalanced unitary and special unitary nodes. While the global form of the Higgs branch symmetry group may be determined from the UV Lagrangian, the global form of…

    Submitted 24 January, 2024; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: 79 pages, v2: Corrected an important typo reported by M. Sperling

    Journal ref: SciPost Phys. 16, 080 (2024)

  22. arXiv:2212.11920  [pdf, other]

    cs.CV

    Beyond SOT: Tracking Multiple Generic Objects at Once

    Authors: Christoph Mayer, Martin Danelljan, Ming-Hsuan Yang, Vittorio Ferrari, Luc Van Gool, Alina Kuznetsova

    Abstract: Generic Object Tracking (GOT) is the problem of tracking target objects, specified by bounding boxes in the first frame of a video. While the task has received much attention in the last decades, researchers have almost exclusively focused on the single object setting. Multi-object GOT benefits from a wider applicability, rendering it more attractive in real-world applications. We attribute the la…

    Submitted 25 February, 2024; v1 submitted 22 December, 2022; originally announced December 2022.

    Comments: accepted by WACV'24

  23. arXiv:2212.07393  [pdf, other]

    hep-th cond-mat.str-el math-ph

    Non-invertible Symmetries and Higher Representation Theory II

    Authors: Thomas Bartsch, Mathew Bullimore, Andrea E. V. Ferrari, Jamie Pearson

    Abstract: In this paper we continue our investigation of the global categorical symmetries that arise when gauging finite higher groups and their higher subgroups with discrete torsion. The motivation is to provide a common perspective on the construction of non-invertible global symmetries in higher dimensions and a precise description of the associated symmetry categories. We propose that the symmetry cat…

    Submitted 14 July, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: 56 pages + appendix, v2: clarifications and citations added

  24. arXiv:2210.14142  [pdf, other]

    cs.CV

    From colouring-in to pointillism: revisiting semantic segmentation supervision

    Authors: Rodrigo Benenson, Vittorio Ferrari

    Abstract: The prevailing paradigm for producing semantic segmentation training data relies on densely labelling each pixel of each image in the training set, akin to colouring-in books. This approach becomes a bottleneck when scaling up in the number of images, classes, and annotators. Here we propose instead a pointillist approach for semantic segmentation annotation, where only point-wise yes/no questions…

    Submitted 17 November, 2022; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: Open Images V7 available at https://g.co/dataset/open-images

  25. arXiv:2210.07670  [pdf, other]

    cs.CV

    Multi-View Photometric Stereo Revisited

    Authors: Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool

    Abstract: Multi-view photometric stereo (MVPS) is a preferred method for detailed and precise 3D acquisition of an object from images. Although popular methods for MVPS can provide outstanding results, they are often complex to execute and limited to isotropic material objects. To address such limitations, we present a simple, practical approach to MVPS, which works well for isotropic as well as other objec…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Accepted for publication at IEEE/CVF WACV 2023. Draft info: 10 pages, 5 figures, and 3 tables

  26. arXiv:2208.05993  [pdf, other]

    hep-th math-ph

    Non-invertible Symmetries and Higher Representation Theory I

    Authors: Thomas Bartsch, Mathew Bullimore, Andrea E. V. Ferrari, Jamie Pearson

    Abstract: The purpose of this paper is to investigate the global categorical symmetries that arise when gauging finite higher groups in three or more dimensions. The motivation is to provide a common perspective on constructions of non-invertible global symmetries in higher dimensions and a precise description of the associated symmetry categories. This paper focusses on gauging finite groups and split 2-gr…

    Submitted 5 May, 2023; v1 submitted 11 August, 2022; originally announced August 2022.

    Comments: 55 pages + Appendices. v2: references updated

  27. arXiv:2206.04453  [pdf, other]

    cs.CV

    The Missing Link: Finding label relations across datasets

    Authors: Jasper Uijlings, Thomas Mensink, Vittorio Ferrari

    Abstract: Computer vision is driven by the many datasets available for training or evaluating novel methods. However, each dataset has a different set of class labels, visual definition of classes, images following a specific distribution, annotation protocols, etc. In this paper we explore the automatic discovery of visual-semantic relations between labels across datasets. We aim to understand how instance…

    Submitted 9 August, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: ECCV 2022

  28. Anomalies of Generalized Symmetries from Solitonic Defects

    Authors: Lakshya Bhardwaj, Mathew Bullimore, Andrea E. V. Ferrari, Sakura Schafer-Nameki

    Abstract: We propose the general idea that 't Hooft anomalies of generalized global symmetries can be understood in terms of the properties of solitonic defects, which generically are non-topological defects. The defining property of such defects is that they act as sources for background fields of generalized symmetries. 't Hooft anomalies arise when solitonic defects are charged under these generalized sy…

    Submitted 26 January, 2024; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: 85 pages

    Journal ref: SciPost Phys. 16, 087 (2024)

  29. Supersymmetric ground states of 3d $\mathcal{N}=4$ SUSY gauge theories and Heisenberg Algebras

    Authors: Andrea E. V. Ferrari

    Abstract: We consider 3d $\mathcal{N} = 4$ theories on the geometry $Σ\times\mathbb{R}$, where $Σ$ is a closed and connected Riemann surface, from the point of view of a quantum mechanics on $\mathbb{R}$. Focussing on the elementary mirror pair in the presence of real deformation parameters, namely SQED with one hypermultiplet (SQED[1]) and the free hypermultiplet, we study the algebras of local operators i…

    Submitted 21 April, 2023; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: Scipost version. Minor notational change in the appendix, typos corrected

    Journal ref: SciPost Phys. 14, 063 (2023)

  30. arXiv:2204.01403  [pdf, other]

    cs.CV

    How stable are Transferability Metrics evaluations?

    Authors: Andrea Agostinelli, Michal Pándy, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari

    Abstract: Transferability metrics is a maturing field with increasing interest, which aims at providing heuristics for selecting the most suitable source models to transfer to a given target dataset, without fine-tuning them all. However, existing works rely on custom experimental setups which differ across papers, leading to inconsistent conclusions about which transferability metrics work best. In this pa…

    Submitted 20 October, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

    Comments: ECCV 2022

  31. arXiv:2203.13296  [pdf, other]

    cs.CV

    RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers

    Authors: Michał J. Tyszkiewicz, Kevis-Kokitsi Maninis, Stefan Popov, Vittorio Ferrari

    Abstract: We propose a transformer-based neural network architecture for multi-object 3D reconstruction from RGB videos. It relies on two alternative ways to represent its knowledge: as a global 3D grid of features and an array of view-specific 2D grids. We progressively exchange information between the two with a dedicated bidirectional attention mechanism. We exploit knowledge about the image formation pr…

    Submitted 26 August, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: ECCV 2022 camera ready

  32. arXiv:2202.13071  [pdf, other]

    cs.CV

    Uncertainty-Aware Deep Multi-View Photometric Stereo

    Authors: Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool

    Abstract: This paper presents a simple and effective solution to the longstanding classical multi-view photometric stereo (MVPS) problem. It is well-known that photometric stereo (PS) is excellent at recovering high-frequency surface details, whereas multi-view stereo (MVS) can help remove the low-frequency distortion due to PS and retain the global geometry of the shape. This paper proposes an approach tha…

    Submitted 28 March, 2022; v1 submitted 26 February, 2022; originally announced February 2022.

    Comments: Accepted for publication in IEEE/CVF CVPR 2022. (11 Pages, 6 Figures, 3 Tables)

  33. arXiv:2111.14643  [pdf, other]

    cs.CV cs.GR

    Urban Radiance Fields

    Authors: Konstantinos Rematas, Andrew Liu, Pratul P. Srinivasan, Jonathan T. Barron, Andrea Tagliasacchi, Thomas Funkhouser, Vittorio Ferrari

    Abstract: The goal of this work is to perform 3D reconstruction and novel view synthesis from data captured by scanning platforms commonly deployed for world mapping in urban outdoor environments (e.g., Street View). Given a sequence of posed RGB images and lidar sweeps acquired by cameras and scanners moving through an outdoor scene, we produce a model from which 3D surfaces can be extracted and novel RGB…

    Submitted 29 November, 2021; originally announced November 2021.

    Comments: Project: https://urban-radiance-fields.github.io/

  34. arXiv:2111.14465  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred Objects in Videos

    Authors: Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Marc Pollefeys

    Abstract: We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video. To this end, we model the blurred appearance of a fast moving object in a generative fashion by parametrizing its 3D position, rotation, velocity, acceleration, bounces, shape, and texture over the duration of a predefined time window spanning multiple frames. Using dif…

    Submitted 7 April, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: CVPR 2022 camera-ready

    Journal ref: 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

  35. arXiv:2111.13011  [pdf, other]

    cs.CV

    Transferability Metrics for Selecting Source Model Ensembles

    Authors: Andrea Agostinelli, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari

    Abstract: We address the problem of ensemble selection in transfer learning: Given a large pool of source models we want to select an ensemble of models which, after fine-tuning on the target training set, yields the best performance on the target test set. Since fine-tuning all possible ensembles is computationally prohibitive, we aim at predicting performance on the target dataset using a computationally…

    Submitted 31 March, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

  36. arXiv:2111.12780  [pdf, other]

    cs.CV

    Transferability Estimation using Bhattacharyya Class Separability

    Authors: Michal Pándy, Andrea Agostinelli, Jasper Uijlings, Vittorio Ferrari, Thomas Mensink

    Abstract: Transfer learning has become a popular method for leveraging pre-trained models in computer vision. However, without performing computationally expensive fine-tuning, it is difficult to quantify which pre-trained source models are suitable for a specific target task, or, conversely, to which tasks a pre-trained source model can be easily adapted to. In this work, we propose Gaussian Bhattacharyya…

    Submitted 11 April, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: Accepted for CVPR 2022

  37. arXiv:2110.05621  [pdf, other]

    cs.CV

    Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo

    Authors: Francesco Sarno, Suryansh Kumar, Berk Kaya, Zhiwu Huang, Vittorio Ferrari, Luc Van Gool

    Abstract: We present an automated machine learning approach for uncalibrated photometric stereo (PS). Our work aims at discovering lightweight and computationally efficient PS neural networks with excellent surface normal accuracy. Unlike previous uncalibrated deep PS networks, which are handcrafted and carefully tuned, we leverage differentiable neural architecture search (NAS) strategy to find uncalibrate…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: Accepted for publication at IEEE/CVF, WACV 2022. (11 pages)

  38. arXiv:2110.05594  [pdf, other]

    cs.CV

    Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo

    Authors: Berk Kaya, Suryansh Kumar, Francesco Sarno, Vittorio Ferrari, Luc Van Gool

    Abstract: We present a modern solution to the multi-view photometric stereo problem (MVPS). Our work suitably exploits the image formation model in a MVPS experimental setup to recover the dense 3D reconstruction of an object from images. We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: Accepted for publication at IEEE/CVF WACV 2022. 18 pages

  39. arXiv:2106.08762  [pdf, other]

    cs.CV

    Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects

    Authors: Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Marc Pollefeys

    Abstract: We address the novel task of jointly reconstructing the 3D shape, texture, and motion of an object from a single motion-blurred image. While previous approaches address the deblurring problem only in the 2D image domain, our proposed rigorous modeling of all object properties in the 3D domain enables the correct description of arbitrary object motion. This leads to significantly better image decom…

    Submitted 26 October, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

    Comments: Accepted to 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  40. Supersymmetric Ground States of 3d $\mathcal{N}=4$ Gauge Theories on a Riemann Surface

    Authors: Mathew Bullimore, Andrea E. V. Ferrari, Heeyeon Kim

    Abstract: This paper studies supersymmetric ground states of 3d $\mathcal{N}=4$ supersymmetric gauge theories on a Riemann surface of genus $g$. There are two distinct spaces of supersymmetric ground states arising from the $A$ and $B$ type twists on the Riemann surface, which lead to effective supersymmetric quantum mechanics with four supercharges and supermultiplets of type $\mathcal{N}=(2,2)$ and…

    Submitted 23 November, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: 50 pages, SciPost version

    Journal ref: SciPost Phys. 12, 072 (2022)

  41. A Step Toward More Inclusive People Annotations for Fairness

    Authors: Candice Schumann, Susanna Ricco, Utsav Prabhu, Vittorio Ferrari, Caroline Pantofaru

    Abstract: The Open Images Dataset contains approximately 9 million images and is a widely accepted dataset for computer vision research. As is common practice for large datasets, the annotations are not exhaustive, with bounding boxes and attribute labels for only a subset of the classes in each image. In this paper, we present a new set of annotations on a subset of the Open Images dataset called the MIAP…

    Submitted 5 May, 2021; originally announced May 2021.

    Journal ref: AIES (2021)

  42. arXiv:2103.13318  [pdf, other]

    cs.CV

    Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types

    Authors: Thomas Mensink, Jasper Uijlings, Alina Kuznetsova, Michael Gygli, Vittorio Ferrari

    Abstract: Transfer learning enables to re-use knowledge learned on a source task to help learning a target task. A simple form of transfer learning is common in current state-of-the-art computer vision models, i.e. pre-training a model for image classification on the ILSVRC dataset, and then fine-tune on any target task. However, previous systematic studies of transfer learning have been limited and the cir…

    Submitted 20 November, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

    Comments: Accepted for future publication in TPAMI

  43. arXiv:2102.08860  [pdf, other]

    cs.CV cs.GR

    ShaRF: Shape-conditioned Radiance Fields from a Single View

    Authors: Konstantinos Rematas, Ricardo Martin-Brualla, Vittorio Ferrari

    Abstract: We present a method for estimating neural scenes representations of objects given only a single image. The core of our method is the estimation of a geometric scaffold for the object and its use as a guide for the reconstruction of the underlying radiance field. Our formulation is based on a generative process that first maps a latent code to a voxelized shape, and then renders it to an image, wit…

    Submitted 23 June, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: Project page: http://www.krematas.com/sharf/index.html

  44. arXiv:2102.04980  [pdf, other]

    cs.CV cs.CL

    Telling the What while Pointing to the Where: Multimodal Queries for Image Retrieval

    Authors: Soravit Changpinyo, Jordi Pont-Tuset, Vittorio Ferrari, Radu Soricut

    Abstract: Most existing image retrieval systems use text queries as a way for the user to express what they are looking for. However, fine-grained image retrieval often requires the ability to also express where in the image the content they are looking for is. The text modality can only cumbersomely express such localization preferences, whereas pointing is a more natural fit. In this paper, we propose an…

    Submitted 24 August, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: IEEE/CVF International Conference on Computer Vision (ICCV 2021)

  45. arXiv:2012.12554  [pdf, other]

    cs.CV cs.HC

    Efficient video annotation with visual interpolation and frame selection guidance

    Authors: A. Kuznetsova, A. Talati, Y. Luo, K. Simmons, V. Ferrari

    Abstract: We introduce a unified framework for generic video annotation with bounding boxes. Video annotation is a longstanding problem, as it is a tedious and time-consuming process. We tackle two important challenges of video annotation: (1) automatic temporal interpolation and extrapolation of bounding boxes provided by a human annotator on a subset of all frames, and (2) automatic selection of frames to…

    Submitted 23 December, 2020; originally announced December 2020.

    Comments: accepted to WACV 2021

  46. arXiv:2012.11575  [pdf, other]

    cs.CV

    From Points to Multi-Object 3D Reconstruction

    Authors: Francis Engelmann, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari

    Abstract: We propose a method to detect and reconstruct multiple 3D objects from a single RGB image. The key idea is to optimize for detection, alignment and shape jointly over all objects in the RGB image, while focusing on realistic and physically plausible reconstructions. To this end, we propose a keypoint detector that localizes objects as center points and directly predicts all object properties, incl…

    Submitted 21 June, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: CVPR2021 - Project Page: https://francisengelmann.github.io/points2objects/

  47. arXiv:2012.06777  [pdf, other]

    cs.CV

    Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces

    Authors: Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool

    Abstract: This paper presents an uncalibrated deep neural network framework for the photometric stereo problem. For training models to solve the problem, existing neural network-based methods either require exact light directions or ground-truth surface normals of the object or both. However, in practice, it is challenging to procure both of this information precisely, which restricts the broader adoption o…

    Submitted 17 April, 2021; v1 submitted 12 December, 2020; originally announced December 2020.

    Comments: Accepted for publication at CVPR 2021. Document info: 18 pages, 21 Figures, 5 tables. (Minor typo corrected)

  48. arXiv:2012.04641  [pdf, other]

    cs.CV

    Vid2CAD: CAD Model Alignment using Multi-View Constraints from Videos

    Authors: Kevis-Kokitsi Maninis, Stefan Popov, Matthias Nießner, Vittorio Ferrari

    Abstract: We address the task of aligning CAD models to a video sequence of a complex scene containing multiple objects. Our method can process arbitrary videos and fully automatically recover the 9 DoF pose for each object appearing in it, thus aligning them in a common 3D coordinate frame. The core idea of our method is to integrate neural network predictions from individual frames with a temporally globa…

    Submitted 25 January, 2022; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: T-PAMI 2022 | Video: https://www.youtube.com/watch?v=R1cXg0vpwe4 | Project page: https://www.kmaninis.com/vid2cad/

  49. DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

    Authors: Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, Marc Pollefeys

    Abstract: Objects moving at high speed appear significantly blurred when captured with cameras. The blurry appearance is especially ambiguous when the object has complex shape or texture. In such cases, classical methods, or even humans, are unable to recover the object's appearance and motion. We propose a method that, given a single image with its estimated background, outputs the object's appearance and…

    Submitted 30 March, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: CVPR 2021 camera-ready

    Journal ref: 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

  50. arXiv:2007.11603  [pdf, ps, other]

    hep-th math-ph

    The Twisted Index and Topological Saddles

    Authors: Mathew Bullimore, Andrea E. V. Ferrari, Heeyeon Kim, Guangyu Xu

    Abstract: The twisted index of 3d $\mathcal{N}=2$ gauge theories on $S^1 \times \Sigma$ has an algebro-geometric interpretation as the Witten index of an effective supersymmetric quantum mechanics. In this paper, we consider the contributions to the supersymmetric quantum mechanics from topological saddle points in supersymmetric localisation of abelian gauge theories. Topological saddles are configurations wher…

    Submitted 22 June, 2023; v1 submitted 22 July, 2020; originally announced July 2020.

    Comments: 28 pages, typos corrected to align with published 2022 version