$C^2M^3$: Cycle-Consistent Multi-Model Merging

Authors: Donato Crisostomi, Marco Fumero, Daniele Baieri, Florian Bernard, Emanuele Rodolà

Abstract: In this paper, we present a novel data-free method for merging neural networks in weight space. Differently from most existing works, our method optimizes for the permutations of network neurons globally across all layers. This allows us to enforce cycle consistency of the permutations when merging $N \geq 3$ models, allowing circular compositions of permutations to be computed without accumulating error along the path. We qualitatively and quantitatively motivate the need for such a constraint, showing its benefits when merging sets of models in scenarios spanning varying architectures and datasets. We finally show that, when coupled with activation renormalization, our approach yields the best results in the task.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 22 pages, 16 figures

arXiv:2311.06547 [pdf, other]

From Charts to Atlas: Merging Latent Spaces into One

Authors: Donato Crisostomi, Irene Cannistraci, Luca Moschella, Pietro Barbiero, Marco Ciccone, Pietro Liò, Emanuele Rodolà

Abstract: Models trained on semantically related datasets and tasks exhibit comparable inter-sample relations within their latent spaces. We investigate in this study the aggregation of such latent spaces to create a unified space encompassing the combined information. To this end, we introduce Relative Latent Space Aggregation, a two-step approach that first renders the spaces comparable using relative representations, and then aggregates them via a simple mean. We carefully divide a classification problem into a series of learning tasks under three different settings: sharing samples, classes, or neither. We then train a model on each task and aggregate the resulting latent spaces. We compare the aggregated space with that derived from an end-to-end model trained over all tasks and show that the two spaces are similar. We then observe that the aggregated space is better suited for classification, and empirically demonstrate that it is due to the unique imprints left by task-specific embedders within the representations. We finally test our framework in scenarios where no shared region exists and show that it can still be used to merge the spaces, albeit with diminished benefits over naive merging.

Submitted 11 November, 2023; originally announced November 2023.

Comments: To appear in the NeurReps workshop @ NeurIPS 2023

arXiv:2311.06284 [pdf, other]

Efficient Generation of Multimodal Fluid Simulation Data

Authors: Daniele Baieri, Donato Crisostomi, Stefano Esposito, Filippo Maggioli, Emanuele Rodolà

Abstract: In this work, we introduce an efficient generation procedure to produce synthetic multi-modal datasets of fluid simulations. The procedure can reproduce the dynamics of fluid flows and allows for exploring and learning various properties of their complex behavior, from distinct perspectives and modalities. We employ our framework to generate a set of thoughtfully designed training datasets, which attempt to span specific fluid simulation scenarios in a meaningful way. The properties of our contributions are demonstrated by evaluating recently published algorithms for the neural fluid simulation and fluid inverse rendering tasks using our benchmark datasets. Our contribution aims to fulfill the community's need for standardized training data, fostering more reproducible and robust research.

Submitted 8 March, 2024; v1 submitted 30 October, 2023; originally announced November 2023.

Comments: 10 pages, 7 figures

MSC Class: 68U20 ACM Class: I.2.6; I.3; I.6.3

arXiv:2206.03695 [pdf, other]

Metric Based Few-Shot Graph Classification

Authors: Donato Crisostomi, Simone Antonelli, Valentino Maiorca, Luca Moschella, Riccardo Marin, Emanuele Rodolà

Abstract: Many modern deep-learning techniques do not work without enormous datasets. At the same time, several fields demand methods working in scarcity of data. This problem is even more complex when the samples have varying structures, as in the case of graphs. Graph representation learning techniques have recently proven successful in a variety of domains. Nevertheless, the employed architectures perform miserably when faced with data scarcity. On the other hand, few-shot learning allows employing modern deep learning models in scarce data regimes without waiving their effectiveness. In this work, we tackle the problem of few-shot graph classification, showing that equipping a simple distance metric learning baseline with a state-of-the-art graph embedder yields competitive results on the task. While the simplicity of the architecture is enough to outperform more complex ones, it also allows straightforward additions. To this end, we show that additional improvements may be obtained by encouraging a task-conditioned embedding space. Finally, we propose a MixUp-based online data augmentation technique acting in the latent space and show its effectiveness on the task.

Submitted 4 January, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

Comments: To appear in Learning on Graphs (LoG) 2022

Showing 1–4 of 4 results for author: Crisostomi, D