Search | arXiv e-print repository

OpenMask3D: Open-Vocabulary 3D Instance Segmentation

Authors: Ayça Takmaz, Elisabetta Fedele, Robert W. Sumner, Marc Pollefeys, Federico Tombari, Francis Engelmann

Abstract: We introduce the task of open-vocabulary 3D instance segmentation. Current approaches for 3D instance segmentation can typically only recognize object categories from a pre-defined closed set of classes that are annotated in the training datasets. This results in important limitations for real-world applications where one might need to perform tasks guided by novel, open-vocabulary queries related to a wide variety of objects. Recently, open-vocabulary 3D scene understanding methods have emerged to address this problem by learning queryable features for each point in the scene. While such a representation can be directly employed to perform semantic segmentation, existing methods cannot separate multiple object instances. In this work, we address this limitation, and propose OpenMask3D, which is a zero-shot approach for open-vocabulary 3D instance segmentation. Guided by predicted class-agnostic 3D instance masks, our model aggregates per-mask features via multi-view fusion of CLIP-based image embeddings. Experiments and ablation studies on ScanNet200 and Replica show that OpenMask3D outperforms other open-vocabulary methods, especially on the long-tail distribution. Qualitative experiments further showcase OpenMask3D's ability to segment object properties based on free-form queries describing geometry, affordances, and materials.

Submitted 29 October, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

Comments: NeurIPS 2023. Project page: https://openmask3d.github.io/

Journal ref: NeurIPS 2023

arXiv:2303.16618 [pdf, other]

Reference-less Analysis of Context Specificity in Translation with Personalised Language Models

Authors: Sebastian Vincent, Alice Dowek, Rowanne Sumner, Charlotte Blundell, Emily Preston, Chris Bayliss, Chris Oakley, Carolina Scarton

Abstract: Sensitising language models (LMs) to external context helps them to more effectively capture the speaking patterns of individuals with specific characteristics or in particular environments. This work investigates to what extent rich character and film annotations can be leveraged to personalise LMs in a scalable manner. We then explore the use of such models in evaluating context specificity in machine translation. We build LMs which leverage rich contextual information to reduce perplexity by up to 6.5% compared to a non-contextual model, and generalise well to a scenario with no speaker-specific data, relying on combinations of demographic characteristics expressed via metadata. Our findings are consistent across two corpora, one of which (Cornell-rich) is also a contribution of this paper. We then use our personalised LMs to measure the co-occurrence of extra-textual context and translation hypotheses in a machine translation setting. Our results suggest that the degree to which professional translations in our domain are context-specific can be preserved to a better extent by a contextual machine translation model than a non-contextual model, which is also reflected in the contextual model's superior reference-based scores.

Submitted 5 March, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

Comments: Accepted to LREC-COLING 2024

arXiv:2302.08683 [pdf, ps, other]

doi 10.1111/1467-8659.00299

Animating Sand, Mud, and Snow

Authors: Robert W. Sumner, James F. O'Brien, Jessica K. Hodgins

Abstract: Computer animations often lack the subtle environmental changes that should occur due to the actions of the characters. Squealing car tires usually leave no skid marks, airplanes rarely leave jet trails in the sky, and most runners leave no footprints. In this paper, we describe a simulation model of ground surfaces that can be deformed by the impact of rigid body models of animated characters. To demonstrate the algorithms, we show footprints made by a runner in sand, mud, and snow as well as bicycle tire tracks, a bicycle crash, and a falling runner. The shapes of the footprints in the three surfaces are quite different, but the effects were controlled through only five essentially independent parameters. To assess the realism of the resulting motion, we compare the simulated footprints to human footprints in sand.

Submitted 21 February, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

Comments: 11 pages, 11 figures, 12 ancillary videos, previous version published in Graphics Interface 1998. Michael A. J. Sweeney award for best student paper. Alternative location: http://graphics.berkeley.edu/papers/Sumner-ASM-1999-03

ACM Class: I.3.5

Journal ref: Computer Graphics Forum, 18(1):17-26, 1999

arXiv:2212.00786 [pdf, other]

3D Segmentation of Humans in Point Clouds with Synthetic Data

Authors: Ayça Takmaz, Jonas Schult, Irem Kaftan, Mertcan Akçay, Bastian Leibe, Robert Sumner, Francis Engelmann, Siyu Tang

Abstract: Segmenting humans in 3D indoor scenes has become increasingly important with the rise of human-centered robotics and AR/VR applications. To this end, we propose the task of joint 3D human semantic segmentation, instance segmentation and multi-human body-part segmentation. Few works have attempted to directly segment humans in cluttered 3D scenes, which is largely due to the lack of annotated training data of humans interacting with 3D scenes. We address this challenge and propose a framework for generating training data of synthetic humans interacting with real 3D scenes. Furthermore, we propose a novel transformer-based model, Human3D, which is the first end-to-end model for segmenting multiple human instances and their body-parts in a unified manner. The key advantage of our synthetic data generation framework is its ability to generate diverse and realistic human-scene interactions, with highly accurate ground truth. Our experiments show that pre-training on synthetic data improves performance on a wide variety of 3D human segmentation tasks. Finally, we demonstrate that Human3D outperforms even task-specific state-of-the-art 3D segmentation methods.

Submitted 18 August, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

Comments: project page: https://human-3d.github.io/

arXiv:2202.12050 [pdf, other]

Experiments as Code: A Concept for Reproducible, Auditable, Debuggable, Reusable, & Scalable Experiments

Authors: Leonel Aguilar, Michal Gath-Morad, Jascha Grübel, Jasper Ermatinger, Hantao Zhao, Stefan Wehrli, Robert W. Sumner, Ce Zhang, Dirk Helbing, Christoph Hölscher

Abstract: A common concern in experimental research is the auditability and reproducibility of experiments. Experiments are usually designed, provisioned, managed, and analyzed by diverse teams of specialists (e.g., researchers, technicians and engineers) and may require many resources (e.g., cloud infrastructure, specialized equipment). Even though researchers strive to document experiments accurately, this process is often lacking, making it hard to reproduce them. Moreover, when it is necessary to create a similar experiment, very often we end up "reinventing the wheel" as it is easier to start from scratch than trying to reuse existing work, thus losing valuable embedded best practices and previous experiences. In behavioral studies this has contributed to the reproducibility crisis. To tackle this challenge, we propose the "Experiments as Code" paradigm, where the whole experiment is not only documented but additionally the automation code to provision, deploy, manage, and analyze it is provided. To this end we define the Experiments as Code concept, provide a taxonomy for the components of a practical implementation, and provide a proof of concept with a simple desktop VR experiment that showcases the benefits of its "as code" representation, i.e., reproducibility, auditability, debuggability, reusability, and scalability.

Submitted 24 February, 2022; originally announced February 2022.

ACM Class: K.4.3; J.4

arXiv:2202.07104 [pdf, other]

The Hitchhiker's Guide to Fused Twins: A Review of Access to Digital Twins in situ in Smart Cities

Authors: Jascha Grübel, Tyler Thrash, Leonel Aguilar, Michal Gath-Morad, Julia Chatain, Robert W. Sumner, Christoph Hölscher, Victor R. Schinazi

Abstract: Smart Cities already surround us, and yet they are still incomprehensibly far from directly impacting everyday life. While current Smart Cities are often inaccessible, the experience of everyday citizens may be enhanced with a combination of the emerging technologies Digital Twins (DTs) and Situated Analytics. DTs represent their Physical Twin (PT) in the real world via models, simulations, (remotely) sensed data, context awareness, and interactions. However, interaction requires appropriate interfaces to address the complexity of the city. Ultimately, leveraging the potential of Smart Cities requires going beyond assembling the DT to be comprehensive and accessible. Situated Analytics allows for the anchoring of city information in its spatial context. We advance the concept of embedding the DT into the PT through Situated Analytics to form Fused Twins (FTs). This fusion allows access to data in the location that it is generated in an embodied context that can make the data more understandable. Prototypes of FTs are rapidly emerging from different domains, but Smart Cities represent the context with the most potential for FTs in the future. This paper reviews DTs, Situated Analytics, and Smart Cities as the foundations of FTs. Regarding DTs, we define five components (Physical, Data, Analytical, Virtual, and Connection environments) that we relate to several cognates (i.e., similar but different terms) from existing literature. Regarding Situated Analytics, we review the effects of user embodiment on cognition and cognitive load. Finally, we classify existing partial examples of FTs from the literature and address their construction from Augmented Reality, Geographic Information Systems, Building/City Information Models, and DTs and provide an overview of future directions.

Submitted 8 June, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

Comments: Additional authors for new content required by reviewers. Added content (on situated analytics and smart cities). Reorganized sections. Expanded literature review. Expanded discussion

arXiv:1804.10312 [pdf]

doi 10.1371/journal.pone.0216335

Emergence of integrated institutions in a large population of self-governing communities

Authors: Seth Frey, Robert W Sumner

Abstract: Most aspects of our lives are governed by large, highly developed institutions that integrate several governance tasks under one authority structure. But theorists differ as to the mechanisms that drive the development of such concentrated governance systems from rudimentary beginnings. Is the emergence of integrated governance schemes a symptom of consolidation of authority by small status groups? Or does integration occur because a complex institution has more potential responses to a complex environment? Here we examine the emergence of complex governance regimes in 5,000 sovereign, resource-constrained, self-governing online communities, ranging in scale from one to thousands of users. Each community begins with no community members and no governance infrastructure. As communities grow, they are subject to selection pressures that keep better managed servers better populated. We identify predictors of community success and test the hypothesis that governance complexity can enhance community fitness. We find that what predicts success depends on size: changes in complexity predict increased success with larger population servers. Specifically, governance rules in a large successful community are more numerous and broader in scope. They also tend to rely more on rules that concentrate power in administrators, and on rules that manage bad behavior and limited server resources. Overall, this work is consistent with theories that formal integrated governance systems emerge to organize collective responses to interdependent resource management problems, especially as factors such as population size exacerbate those problems.

Submitted 11 July, 2019; v1 submitted 26 April, 2018; originally announced April 2018.

Comments: contains supplement

ACM Class: H.5.3; J.4; K.6.4

arXiv:1304.1903 [pdf, other]

doi 10.1140/epjst/e2012-01689-8

Towards a living earth simulator

Authors: M. Paolucci, D. Kossman, R. Conte, P. Lukowicz, P. Argyrakis, A. Blandford, G. Bonelli, S. Anderson, S. de Freitas, B. Edmonds, N. Gilbert, M. Gross, J. Kohlhammer, P. Koumoutsakos, A. Krause, B. -O. Linnér, P. Slusallek, O. Sorkine, R. W. Sumner, D. Helbing

Abstract: The Living Earth Simulator (LES) is one of the core components of the FuturICT architecture. It will work as a federation of methods, tools, techniques and facilities supporting all of the FuturICT simulation-related activities to allow and encourage interactive exploration and understanding of societal issues. Society-relevant problems will be targeted by leaning on approaches based on complex systems theories and data science in tight interaction with the other components of FuturICT. The LES will evaluate and provide answers to real-world questions by taking into account multiple scenarios. It will build on present approaches such as agent-based simulation and modeling, multiscale modelling, statistical inference, and data mining, moving beyond disciplinary borders to achieve a new perspective on complex social systems.

Submitted 6 April, 2013; originally announced April 2013.

Journal ref: Eur. Phys. J. Special Topics vol. 214, pp. 77-108 (2012)

arXiv:cs/0603090 [pdf, ps, other]

doi 10.1016/j.aml.2006.04.022

Topological Grammars for Data Approximation

Authors: A. N. Gorban, N. R. Sumner, A. Y. Zinovyev

Abstract: A method of topological grammars is proposed for multidimensional data approximation. For data with complex topology we define a principal cubic complex of low dimension and given complexity that gives the best approximation for the dataset. This complex is a generalization of linear and non-linear principal manifolds and includes them as particular cases. The problem of optimal principal complex construction is transformed into a series of minimization problems for quadratic functionals. These quadratic functionals have a physically transparent interpretation in terms of elastic energy. For the energy computation, the whole complex is represented as a system of nodes and springs. Topologically, the principal complex is a product of one-dimensional continuums (represented by graphs), and the grammars describe how these continuums transform during the process of optimal complex construction. This factorization of the whole process onto one-dimensional transformations using minimization of quadratic energy functionals allows us to construct efficient algorithms.

Submitted 28 July, 2006; v1 submitted 22 March, 2006; originally announced March 2006.

Comments: Corrected Journal version, Appl. Math. Lett., in press. 7 pgs., 2 figs

Journal ref: Applied Mathematics Letters 20 (2007) 382--386

Showing 1–9 of 9 results for author: Sumner, R