-
Breeding Programs Optimization with Reinforcement Learning
Authors:
Omar G. Younis,
Luca Corinzia,
Ioannis N. Athanasiadis,
Andreas Krause,
Joachim M. Buhmann,
Matteo Turchetta
Abstract:
Crop breeding is crucial in improving agricultural productivity while potentially decreasing land usage, greenhouse gas emissions, and water consumption. However, breeding programs are challenging due to long turnover times, high-dimensional decision spaces, long-term objectives, and the need to adapt to rapid climate change. This paper introduces the use of Reinforcement Learning (RL) to optimize…
▽ More
Crop breeding is crucial in improving agricultural productivity while potentially decreasing land usage, greenhouse gas emissions, and water consumption. However, breeding programs are challenging due to long turnover times, high-dimensional decision spaces, long-term objectives, and the need to adapt to rapid climate change. This paper introduces the use of Reinforcement Learning (RL) to optimize simulated crop breeding programs. RL agents are trained to make optimal crop selection and cross-breeding decisions based on genetic information. To benchmark RL-based breeding algorithms, we introduce a suite of Gym environments. The study demonstrates the superiority of RL techniques over standard practices in terms of genetic gain when simulated in silico using real-world genomic maize data.
△ Less
Submitted 6 June, 2024;
originally announced June 2024.
-
Gaussian Representation for Deformable Image Registration
Authors:
Jihe Li,
Fabian Zhang,
Xia Li,
Tianhao Zhang,
Ye Zhang,
Joachim Buhmann
Abstract:
Deformable image registration (DIR) is a fundamental task in radiotherapy, with existing methods often struggling to balance computational efficiency, registration accuracy, and speed effectively. We introduce a novel DIR approach employing parametric 3D Gaussian control points achieving a better tradeoff. It provides an explicit and flexible representation for spatial deformation fields between 3…
▽ More
Deformable image registration (DIR) is a fundamental task in radiotherapy, with existing methods often struggling to balance computational efficiency, registration accuracy, and speed effectively. We introduce a novel DIR approach employing parametric 3D Gaussian control points achieving a better tradeoff. It provides an explicit and flexible representation for spatial deformation fields between 3D volumetric medical images, producing a displacement vector field (DVF) across all volumetric positions. The movement of individual voxels is derived using linear blend skinning (LBS) through localized interpolation of transformations associated with neighboring Gaussians. This interpolation strategy not only simplifies the determination of voxel motions but also acts as an effective regularization technique. Our approach incorporates a unified optimization process through backpropagation, enabling iterative learning of both the parameters of the 3D Gaussians and their transformations. Additionally, the density of Gaussians is adjusted adaptively during the learning phase to accommodate varying degrees of motion complexity. We validated our approach on the 4D-CT lung DIR-Lab and cardiac ACDC datasets, achieving an average target registration error (TRE) of 1.06 mm within a much-improved processing time of 2.43 seconds for the DIR-Lab dataset over existing methods, demonstrating significant advancements in both accuracy and efficiency.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
TAMBRIDGE: Bridging Frame-Centered Tracking and 3D Gaussian Splatting for Enhanced SLAM
Authors:
Peifeng Jiang,
Hong Liu,
Xia Li,
Ti Wang,
Fabian Zhang,
Joachim M. Buhmann
Abstract:
The limited robustness of 3D Gaussian Splatting (3DGS) to motion blur and camera noise, along with its poor real-time performance, restricts its application in robotic SLAM tasks. Upon analysis, the primary causes of these issues are the density of views with motion blur and the cumulative errors in dense pose estimation from calculating losses based on noisy original images and rendering results,…
▽ More
The limited robustness of 3D Gaussian Splatting (3DGS) to motion blur and camera noise, along with its poor real-time performance, restricts its application in robotic SLAM tasks. Upon analysis, the primary causes of these issues are the density of views with motion blur and the cumulative errors in dense pose estimation from calculating losses based on noisy original images and rendering results, which increase the difficulty of 3DGS rendering convergence. Thus, a cutting-edge 3DGS-based SLAM system is introduced, leveraging the efficiency and flexibility of 3DGS to achieve real-time performance while remaining robust against sensor noise, motion blur, and the challenges posed by long-session SLAM. Central to this approach is the Fusion Bridge module, which seamlessly integrates tracking-centered ORB Visual Odometry with map**-centered online 3DGS. Precise pose initialization is enabled by this module through joint optimization of re-projection and rendering loss, as well as strategic view selection, enhancing rendering convergence in large-scale scenes. Extensive experiments demonstrate state-of-the-art rendering quality and localization accuracy, positioning this system as a promising solution for real-world robotics applications that require stable, near-real-time performance. Our project is available at https://ZeldaFromHeaven.github.io/TAMBRIDGE/
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
CPT-Interp: Continuous sPatial and Temporal Motion Modeling for 4D Medical Image Interpolation
Authors:
Xia Li,
Runzhao Yang,
Xiangtai Li,
Antony Lomax,
Ye Zhang,
Joachim Buhmann
Abstract:
Motion information from 4D medical imaging offers critical insights into dynamic changes in patient anatomy for clinical assessments and radiotherapy planning and, thereby, enhances the capabilities of 3D image analysis. However, inherent physical and technical constraints of imaging hardware often necessitate a compromise between temporal resolution and image quality. Frame interpolation emerges…
▽ More
Motion information from 4D medical imaging offers critical insights into dynamic changes in patient anatomy for clinical assessments and radiotherapy planning and, thereby, enhances the capabilities of 3D image analysis. However, inherent physical and technical constraints of imaging hardware often necessitate a compromise between temporal resolution and image quality. Frame interpolation emerges as a pivotal solution to this challenge. Previous methods often suffer from discretion when they estimate the intermediate motion and execute the forward war**. In this study, we draw inspiration from fluid mechanics to propose a novel approach for continuously modeling patient anatomic motion using implicit neural representation. It ensures both spatial and temporal continuity, effectively bridging Eulerian and Lagrangian specifications together to naturally facilitate continuous frame interpolation. Our experiments across multiple datasets underscore the method's superior accuracy and speed. Furthermore, as a case-specific optimization (training-free) approach, it circumvents the need for extensive datasets and addresses model generalization issues.
△ Less
Submitted 24 May, 2024;
originally announced May 2024.
-
LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
Authors:
Seyedmorteza Sadat,
Jakob Buhmann,
Derek Bradley,
Otmar Hilliges,
Romann M. Weber
Abstract:
Advances in latent diffusion models (LDMs) have revolutionized high-resolution image generation, but the design space of the autoencoder that is central to these systems remains underexplored. In this paper, we introduce LiteVAE, a family of autoencoders for LDMs that leverage the 2D discrete wavelet transform to enhance scalability and computational efficiency over standard variational autoencode…
▽ More
Advances in latent diffusion models (LDMs) have revolutionized high-resolution image generation, but the design space of the autoencoder that is central to these systems remains underexplored. In this paper, we introduce LiteVAE, a family of autoencoders for LDMs that leverage the 2D discrete wavelet transform to enhance scalability and computational efficiency over standard variational autoencoders (VAEs) with no sacrifice in output quality. We also investigate the training methodologies and the decoder architecture of LiteVAE and propose several enhancements that improve the training dynamics and reconstruction quality. Our base LiteVAE model matches the quality of the established VAEs in current LDMs with a six-fold reduction in encoder parameters, leading to faster training and lower GPU memory requirements, while our larger model outperforms VAEs of comparable complexity across all evaluated metrics (rFID, LPIPS, PSNR, and SSIM).
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
Continuous sPatial-Temporal Deformable Image Registration (CPT-DIR) for motion modelling in radiotherapy: beyond classic voxel-based methods
Authors:
Xia Li,
Muheng Li,
Antony Lomax,
Joachim Buhmann,
Ye Zhang
Abstract:
Background and purpose: Deformable image registration (DIR) is a crucial tool in radiotherapy for extracting and modelling organ motion. However, when significant changes and sliding boundaries are present, it faces compromised accuracy and uncertainty, determining the subsequential contour propagation and dose accumulation procedures. Materials and methods: We propose an implicit neural represent…
▽ More
Background and purpose: Deformable image registration (DIR) is a crucial tool in radiotherapy for extracting and modelling organ motion. However, when significant changes and sliding boundaries are present, it faces compromised accuracy and uncertainty, determining the subsequential contour propagation and dose accumulation procedures. Materials and methods: We propose an implicit neural representation (INR)-based approach modelling motion continuously in both space and time, named Continues-sPatial-Temporal DIR (CPT-DIR). This method uses a multilayer perception (MLP) network to map 3D coordinate (x,y,z) to its corresponding velocity vector (vx,vy,vz). The displacement vectors (dx,dy,dz) are then calculated by integrating velocity vectors over time. The MLP's parameters can rapidly adapt to new cases without pre-training, enhancing optimisation. The DIR's performance was tested on the DIR-Lab dataset of 10 lung 4DCT cases, using metrics of landmark accuracy (TRE), contour conformity (Dice) and image similarity (MAE). Results: The proposed CPT-DIR can reduce landmark TRE from 2.79mm to 0.99mm, outperforming B-splines' results for all cases. The MAE of the whole-body region improves from 35.46HU to 28.99HU. Furthermore, CPT-DIR surpasses B-splines for accuracy in the sliding boundary region, lowering MAE and increasing Dice coefficients for the ribcage from 65.65HU and 90.41% to 42.04HU and 90.56%, versus 75.40HU and 89.30% without registration. Meanwhile, CPT-DIR offers significant speed advantages, completing in under 15 seconds compared to a few minutes with the conventional B-splines method. Conclusion: Leveraging the continuous representations, the CPT-DIR method significantly enhances registration accuracy, automation and speed, outperforming traditional B-splines in landmark and contour precision, particularly in the challenging areas.
△ Less
Submitted 1 May, 2024;
originally announced May 2024.
-
Point-In-Context: Understanding Point Cloud via In-Context Learning
Authors:
Mengyuan Liu,
Zhongbin Fang,
Xia Li,
Joachim M. Buhmann,
Xiangtai Li,
Chen Change Loy
Abstract:
With the emergence of large-scale models trained on diverse datasets, in-context learning has emerged as a promising paradigm for multitasking, notably in natural language processing and image processing. However, its application in 3D point cloud tasks remains largely unexplored. In this work, we introduce Point-In-Context (PIC), a novel framework for 3D point cloud understanding via in-context l…
▽ More
With the emergence of large-scale models trained on diverse datasets, in-context learning has emerged as a promising paradigm for multitasking, notably in natural language processing and image processing. However, its application in 3D point cloud tasks remains largely unexplored. In this work, we introduce Point-In-Context (PIC), a novel framework for 3D point cloud understanding via in-context learning. We address the technical challenge of effectively extending masked point modeling to 3D point clouds by introducing a Joint Sampling module and proposing a vanilla version of PIC called Point-In-Context-Generalist (PIC-G). PIC-G is designed as a generalist model for various 3D point cloud tasks, with inputs and outputs modeled as coordinates. In this paradigm, the challenging segmentation task is achieved by assigning label points with XYZ coordinates for each category; the final prediction is then chosen based on the label point closest to the predictions. To break the limitation by the fixed label-coordinate assignment, which has poor generalization upon novel classes, we propose two novel training strategies, In-Context Labeling and In-Context Enhancing, forming an extended version of PIC named Point-In-Context-Segmenter (PIC-S), targeting improving dynamic context labeling and model training. By utilizing dynamic in-context labels and extra in-context pairs, PIC-S achieves enhanced performance and generalization capability in and across part segmentation datasets. PIC is a general framework so that other tasks or datasets can be seamlessly introduced into our PIC through a unified data format. We conduct extensive experiments to validate the versatility and adaptability of our proposed methods in handling a wide range of tasks and segmenting multi-datasets. Our PIC-S is capable of generalizing unseen datasets and performing novel part segmentation by customizing prompts.
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
Hypernetwork-Driven Model Fusion for Federated Domain Generalization
Authors:
Marc Bartholet,
Taehyeon Kim,
Ami Beuret,
Se-Young Yun,
Joachim M. Buhmann
Abstract:
Federated Learning (FL) faces significant challenges with domain shifts in heterogeneous data, degrading performance. Traditional domain generalization aims to learn domain-invariant features, but the federated nature of model averaging often limits this due to its linear aggregation of local learning. To address this, we propose a robust framework, coined as hypernetwork-based Federated Fusion (h…
▽ More
Federated Learning (FL) faces significant challenges with domain shifts in heterogeneous data, degrading performance. Traditional domain generalization aims to learn domain-invariant features, but the federated nature of model averaging often limits this due to its linear aggregation of local learning. To address this, we propose a robust framework, coined as hypernetwork-based Federated Fusion (hFedF), using hypernetworks for non-linear aggregation, facilitating generalization to unseen domains. Our method employs client-specific embeddings and gradient alignment techniques to manage domain generalization effectively. Evaluated in both zero-shot and few-shot settings, hFedF demonstrates superior performance in handling domain shifts. Comprehensive comparisons on PACS, Office-Home, and VLCS datasets show that hFedF consistently achieves the highest in-domain and out-of-domain accuracy with reliable predictions. Our study contributes significantly to the under-explored field of Federated Domain Generalization (FDG), setting a new benchmark for performance in this area.
△ Less
Submitted 28 May, 2024; v1 submitted 10 February, 2024;
originally announced February 2024.
-
Neural Graphics Primitives-based Deformable Image Registration for On-the-fly Motion Extraction
Authors:
Xia Li,
Fabian Zhang,
Muheng Li,
Damien Weber,
Antony Lomax,
Joachim Buhmann,
Ye Zhang
Abstract:
Intra-fraction motion in radiotherapy is commonly modeled using deformable image registration (DIR). However, existing methods often struggle to balance speed and accuracy, limiting their applicability in clinical scenarios. This study introduces a novel approach that harnesses Neural Graphics Primitives (NGP) to optimize the displacement vector field (DVF). Our method leverages learned primitives…
▽ More
Intra-fraction motion in radiotherapy is commonly modeled using deformable image registration (DIR). However, existing methods often struggle to balance speed and accuracy, limiting their applicability in clinical scenarios. This study introduces a novel approach that harnesses Neural Graphics Primitives (NGP) to optimize the displacement vector field (DVF). Our method leverages learned primitives, processed as splats, and interpolates within space using a shallow neural network. Uniquely, it enables self-supervised optimization at an ultra-fast speed, negating the need for pre-training on extensive datasets and allowing seamless adaptation to new cases. We validated this approach on the 4D-CT lung dataset DIR-lab, achieving a target registration error (TRE) of 1.15\pm1.15 mm within a remarkable time of 1.77 seconds. Notably, our method also addresses the sliding boundary problem, a common challenge in conventional DIR methods.
△ Less
Submitted 8 February, 2024;
originally announced February 2024.
-
PersonalityChat: Conversation Distillation for Personalized Dialog Modeling with Facts and Traits
Authors:
Ehsan Lotfi,
Maxime De Bruyn,
Jeska Buhmann,
Walter Daelemans
Abstract:
The new wave of Large Language Models (LLM) has offered an efficient tool to curate sizeable conversational datasets. So far studies have mainly focused on task-oriented or generic open-domain dialogs, and have not fully explored the ability of LLMs in following complicated prompts. In this work, we focus on personalization, and employ LLMs to curate a dataset which is difficult and costly to crow…
▽ More
The new wave of Large Language Models (LLM) has offered an efficient tool to curate sizeable conversational datasets. So far studies have mainly focused on task-oriented or generic open-domain dialogs, and have not fully explored the ability of LLMs in following complicated prompts. In this work, we focus on personalization, and employ LLMs to curate a dataset which is difficult and costly to crowd-source: PersonalityChat is a synthetic conversational dataset based upon the popular PersonaChat dataset, but conditioned on both personas and (Big-5) personality traits. Evaluating models fine-tuned on this dataset, we show that the personality trait labels can be used for trait-based personalization of generative dialogue models. We also perform a head-to-head comparison between PersonalityChat and PersonaChat, and show that training on the distilled dataset results in more fluent and coherent dialog agents in the small-model regime.
△ Less
Submitted 14 January, 2024;
originally announced January 2024.
-
Invariant Anomaly Detection under Distribution Shifts: A Causal Perspective
Authors:
João B. S. Carvalho,
Mengtao Zhang,
Robin Geyer,
Carlos Cotrini,
Joachim M. Buhmann
Abstract:
Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples by solely relying on the consistency of the normal training samples. Under the constraints of a distribution shift, the assumption that training samples and test samples are drawn from the same distribution breaks down. In this work, by leveraging tools from causal inference we attempt to increase…
▽ More
Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples by solely relying on the consistency of the normal training samples. Under the constraints of a distribution shift, the assumption that training samples and test samples are drawn from the same distribution breaks down. In this work, by leveraging tools from causal inference we attempt to increase the resilience of anomaly detection models to different kinds of distribution shifts. We begin by elucidating a simple yet necessary statistical property that ensures invariant representations, which is critical for robust AD under both domain and covariate shifts. From this property, we derive a regularization term which, when minimized, leads to partial distribution invariance across environments. Through extensive experimental evaluation on both synthetic and real-world tasks, covering a range of six different AD methods, we demonstrated significant improvements in out-of-distribution performance. Under both covariate and domain shift, models regularized with our proposed term showed marked increased robustness. Code is available at: https://github.com/JoaoCarv/invariant-anomaly-detection.
△ Less
Submitted 21 December, 2023;
originally announced December 2023.
-
CADS: Unleashing the Diversity of Diffusion Models through Condition-Annealed Sampling
Authors:
Seyedmorteza Sadat,
Jakob Buhmann,
Derek Bradley,
Otmar Hilliges,
Romann M. Weber
Abstract:
While conditional diffusion models are known to have good coverage of the data distribution, they still face limitations in output diversity, particularly when sampled with a high classifier-free guidance scale for optimal image quality or when trained on small datasets. We attribute this problem to the role of the conditioning signal in inference and offer an improved sampling strategy for diffus…
▽ More
While conditional diffusion models are known to have good coverage of the data distribution, they still face limitations in output diversity, particularly when sampled with a high classifier-free guidance scale for optimal image quality or when trained on small datasets. We attribute this problem to the role of the conditioning signal in inference and offer an improved sampling strategy for diffusion models that can increase generation diversity, especially at high guidance scales, with minimal loss of sample quality. Our sampling strategy anneals the conditioning signal by adding scheduled, monotonically decreasing Gaussian noise to the conditioning vector during inference to balance diversity and condition alignment. Our Condition-Annealed Diffusion Sampler (CADS) can be used with any pretrained model and sampling algorithm, and we show that it boosts the diversity of diffusion models in various conditional generation tasks. Further, using an existing pretrained diffusion model, CADS achieves a new state-of-the-art FID of 1.70 and 2.31 for class-conditional ImageNet generation at 256$\times$256 and 512$\times$512 respectively.
△ Less
Submitted 13 May, 2024; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Regularizing Adversarial Imitation Learning Using Causal Invariance
Authors:
Ivan Ovinnikov,
Joachim M. Buhmann
Abstract:
Imitation learning methods are used to infer a policy in a Markov decision process from a dataset of expert demonstrations by minimizing a divergence measure between the empirical state occupancy measures of the expert and the policy. The guiding signal to the policy is provided by the discriminator used as part of an versarial optimization procedure. We observe that this model is prone to absorbi…
▽ More
Imitation learning methods are used to infer a policy in a Markov decision process from a dataset of expert demonstrations by minimizing a divergence measure between the empirical state occupancy measures of the expert and the policy. The guiding signal to the policy is provided by the discriminator used as part of an versarial optimization procedure. We observe that this model is prone to absorbing spurious correlations present in the expert data. To alleviate this issue, we propose to use causal invariance as a regularization principle for adversarial training of these models. The regularization objective is applicable in a straightforward manner to existing adversarial imitation frameworks. We demonstrate the efficacy of the regularized formulation in an illustrative two-dimensional setting as well as a number of high-dimensional robot locomotion benchmark tasks.
△ Less
Submitted 17 August, 2023;
originally announced August 2023.
-
Improving Explainability of Disentangled Representations using Multipath-Attribution Map**s
Authors:
Lukas Klein,
João B. S. Carvalho,
Mennatallah El-Assady,
Paolo Penna,
Joachim M. Buhmann,
Paul F. Jaeger
Abstract:
Explainable AI aims to render model behavior understandable by humans, which can be seen as an intermediate step in extracting causal relations from correlative patterns. Due to the high risk of possible fatal decisions in image-based clinical diagnostics, it is necessary to integrate explainable AI into these safety-critical systems. Current explanatory methods typically assign attribution scores…
▽ More
Explainable AI aims to render model behavior understandable by humans, which can be seen as an intermediate step in extracting causal relations from correlative patterns. Due to the high risk of possible fatal decisions in image-based clinical diagnostics, it is necessary to integrate explainable AI into these safety-critical systems. Current explanatory methods typically assign attribution scores to pixel regions in the input image, indicating their importance for a model's decision. However, they fall short when explaining why a visual feature is used. We propose a framework that utilizes interpretable disentangled representations for downstream-task prediction. Through visualizing the disentangled representations, we enable experts to investigate possible causation effects by leveraging their domain knowledge. Additionally, we deploy a multi-path attribution map** for enriching and validating explanations. We demonstrate the effectiveness of our approach on a synthetic benchmark suite and two medical datasets. We show that the framework not only acts as a catalyst for causal relation extraction but also enhances model robustness by enabling shortcut detection without the need for testing under distribution shifts.
△ Less
Submitted 15 June, 2023;
originally announced June 2023.
-
Explore In-Context Learning for 3D Point Cloud Understanding
Authors:
Zhongbin Fang,
Xiangtai Li,
Xia Li,
Joachim M. Buhmann,
Chen Change Loy,
Mengyuan Liu
Abstract:
With the rise of large-scale models trained on broad data, in-context learning has become a new learning paradigm that has demonstrated significant potential in natural language processing and computer vision tasks. Meanwhile, in-context learning is still largely unexplored in the 3D point cloud domain. Although masked modeling has been successfully applied for in-context learning in 2D vision, di…
▽ More
With the rise of large-scale models trained on broad data, in-context learning has become a new learning paradigm that has demonstrated significant potential in natural language processing and computer vision tasks. Meanwhile, in-context learning is still largely unexplored in the 3D point cloud domain. Although masked modeling has been successfully applied for in-context learning in 2D vision, directly extending it to 3D point clouds remains a formidable challenge. In the case of point clouds, the tokens themselves are the point cloud positions (coordinates) that are masked during inference. Moreover, position embedding in previous works may inadvertently introduce information leakage. To address these challenges, we introduce a novel framework, named Point-In-Context, designed especially for in-context learning in 3D point clouds, where both inputs and outputs are modeled as coordinates for each task. Additionally, we propose the Joint Sampling module, carefully designed to work in tandem with the general point sampling operator, effectively resolving the aforementioned technical issues. We conduct extensive experiments to validate the versatility and adaptability of our proposed methods in handling a wide range of tasks.
△ Less
Submitted 27 December, 2023; v1 submitted 14 June, 2023;
originally announced June 2023.
-
How to Enable Uncertainty Estimation in Proximal Policy Optimization
Authors:
Eugene Bykovets,
Yannick Metz,
Mennatallah El-Assady,
Daniel A. Keim,
Joachim M. Buhmann
Abstract:
While deep reinforcement learning (RL) agents have showcased strong results across many domains, a major concern is their inherent opaqueness and the safety of such systems in real-world use cases. To overcome these issues, we need agents that can quantify their uncertainty and detect out-of-distribution (OOD) states. Existing uncertainty estimation techniques, like Monte-Carlo Dropout or Deep Ens…
▽ More
While deep reinforcement learning (RL) agents have showcased strong results across many domains, a major concern is their inherent opaqueness and the safety of such systems in real-world use cases. To overcome these issues, we need agents that can quantify their uncertainty and detect out-of-distribution (OOD) states. Existing uncertainty estimation techniques, like Monte-Carlo Dropout or Deep Ensembles, have not seen widespread adoption in on-policy deep RL. We posit that this is due to two reasons: concepts like uncertainty and OOD states are not well defined compared to supervised learning, especially for on-policy RL methods. Secondly, available implementations and comparative studies for uncertainty estimation methods in RL have been limited. To overcome the first gap, we propose definitions of uncertainty and OOD for Actor-Critic RL algorithms, namely, proximal policy optimization (PPO), and present possible applicable measures. In particular, we discuss the concepts of value and policy uncertainty. The second point is addressed by implementing different uncertainty estimation methods and comparing them across a number of environments. The OOD detection performance is evaluated via a custom evaluation benchmark of in-distribution (ID) and OOD states for various RL environments. We identify a trade-off between reward and OOD detection performance. To overcome this, we formulate a Pareto optimization problem in which we simultaneously optimize for reward and OOD detection performance. We show experimentally that the recently proposed method of Masksembles strikes a favourable balance among the survey methods, enabling high-quality uncertainty estimation and OOD detection while matching the performance of original RL agents.
△ Less
Submitted 7 October, 2022;
originally announced October 2022.
-
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Authors:
Đorđe Miladinović,
Kumar Shridhar,
Kushal Jain,
Max B. Paulus,
Joachim M. Buhmann,
Mrinmaya Sachan,
Carl Allen
Abstract:
In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, known as posterior collapse. To mitigate this, state-of-the-art models weaken the pow…
▽ More
In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, known as posterior collapse. To mitigate this, state-of-the-art models weaken the powerful decoder by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.
△ Less
Submitted 16 December, 2022; v1 submitted 26 September, 2022;
originally announced September 2022.
-
Open-Domain Dialog Evaluation using Follow-Ups Likelihood
Authors:
Maxime De Bruyn,
Ehsan Lotfi,
Jeska Buhmann,
Walter Daelemans
Abstract:
Automatic evaluation of open-domain dialogs remains an unsolved problem. Moreover, existing methods do not correlate strongly with human annotations. This paper presents a new automated evaluation method using follow-ups: we measure the probability that a language model will continue the conversation with a fixed set of follow-ups (e.g., not really relevant here, what are you trying to say). When…
▽ More
Automatic evaluation of open-domain dialogs remains an unsolved problem. Moreover, existing methods do not correlate strongly with human annotations. This paper presents a new automated evaluation method using follow-ups: we measure the probability that a language model will continue the conversation with a fixed set of follow-ups (e.g., not really relevant here, what are you trying to say). When compared against twelve existing methods, our new evaluation achieves the highest correlation with human evaluations.
△ Less
Submitted 12 September, 2022;
originally announced September 2022.
-
BARReL: Bottleneck Attention for Adversarial Robustness in Vision-Based Reinforcement Learning
Authors:
Eugene Bykovets,
Yannick Metz,
Mennatallah El-Assady,
Daniel A. Keim,
Joachim M. Buhmann
Abstract:
Robustness to adversarial perturbations has been explored in many areas of computer vision. This robustness is particularly relevant in vision-based reinforcement learning, as the actions of autonomous agents might be safety-critic or impactful in the real world. We investigate the susceptibility of vision-based reinforcement learning agents to gradient-based adversarial attacks and evaluate a pot…
▽ More
Robustness to adversarial perturbations has been explored in many areas of computer vision. This robustness is particularly relevant in vision-based reinforcement learning, as the actions of autonomous agents might be safety-critic or impactful in the real world. We investigate the susceptibility of vision-based reinforcement learning agents to gradient-based adversarial attacks and evaluate a potential defense. We observe that Bottleneck Attention Modules (BAM) included in CNN architectures can act as potential tools to increase robustness against adversarial attacks. We show how learned attention maps can be used to recover activations of a convolutional layer by restricting the spatial activations to salient regions. Across a number of RL environments, BAM-enhanced architectures show increased robustness during inference. Finally, we discuss potential future research directions.
△ Less
Submitted 22 August, 2022;
originally announced August 2022.
-
Gated Domain Units for Multi-source Domain Generalization
Authors:
Simon Föll,
Alina Dubatovka,
Eugen Ernst,
Siu Lun Chau,
Martin Maritsch,
Patrik Okanovic,
Gudrun Thäter,
Joachim M. Buhmann,
Felix Wortmann,
Krikamol Muandet
Abstract:
The phenomenon of distribution shift (DS) occurs when a dataset at test time differs from the dataset at training time, which can significantly impair the performance of a machine learning model in practical settings due to a lack of knowledge about the data's distribution at test time. To address this problem, we postulate that real-world distributions are composed of latent Invariant Elementary…
▽ More
The phenomenon of distribution shift (DS) occurs when a dataset at test time differs from the dataset at training time, which can significantly impair the performance of a machine learning model in practical settings due to a lack of knowledge about the data's distribution at test time. To address this problem, we postulate that real-world distributions are composed of latent Invariant Elementary Distributions (I.E.D) across different domains. This assumption implies an invariant structure in the solution space that enables knowledge transfer to unseen domains. To exploit this property for domain generalization, we introduce a modular neural network layer consisting of Gated Domain Units (GDUs) that learn a representation for each latent elementary distribution. During inference, a weighted ensemble of learning machines can be created by comparing new observations with the representations of each elementary distribution. Our flexible framework also accommodates scenarios where explicit domain information is not present. Extensive experiments on image, text, and graph data show consistent performance improvement on out-of-training target domains. These findings support the practicality of the I.E.D assumption and the effectiveness of GDUs for domain generalisation.
△ Less
Submitted 16 May, 2023; v1 submitted 24 June, 2022;
originally announced June 2022.
-
Teach Me What to Say and I Will Learn What to Pick: Unsupervised Knowledge Selection Through Response Generation with Pretrained Generative Models
Authors:
Ehsan Lotfi,
Maxime De Bruyn,
Jeska Buhmann,
Walter Daelemans
Abstract:
Knowledge Grounded Conversation Models (KGCM) are usually based on a selection/retrieval module and a generation module, trained separately or simultaneously, with or without having access to a gold knowledge option. With the introduction of large pre-trained generative models, the selection and generation part have become more and more entangled, shifting the focus towards enhancing knowledge inc…
▽ More
Knowledge Grounded Conversation Models (KGCM) are usually based on a selection/retrieval module and a generation module, trained separately or simultaneously, with or without having access to a gold knowledge option. With the introduction of large pre-trained generative models, the selection and generation part have become more and more entangled, shifting the focus towards enhancing knowledge incorporation (from multiple sources) instead of trying to pick the best knowledge option. These approaches however depend on knowledge labels and/or a separate dense retriever for their best performance. In this work we study the unsupervised selection abilities of pre-trained generative models (e.g. BART) and show that by adding a score-and-aggregate module between encoder and decoder, they are capable of learning to pick the proper knowledge through minimising the language modelling loss (i.e. without having access to knowledge labels). Trained as such, our model - K-Mine - shows competitive selection and generation performance against models that benefit from knowledge labels and/or separate dense retriever.
△ Less
Submitted 5 October, 2021;
originally announced October 2021.
-
MFAQ: a Multilingual FAQ Dataset
Authors:
Maxime De Bruyn,
Ehsan Lotfi,
Jeska Buhmann,
Walter Daelemans
Abstract:
In this paper, we present the first multilingual FAQ dataset publicly available. We collected around 6M FAQ pairs from the web, in 21 different languages. Although this is significantly larger than existing FAQ retrieval datasets, it comes with its own challenges: duplication of content and uneven distribution of topics. We adopt a similar setup as Dense Passage Retrieval (DPR) and test various bi…
▽ More
In this paper, we present the first multilingual FAQ dataset publicly available. We collected around 6M FAQ pairs from the web, in 21 different languages. Although this is significantly larger than existing FAQ retrieval datasets, it comes with its own challenges: duplication of content and uneven distribution of topics. We adopt a similar setup as Dense Passage Retrieval (DPR) and test various bi-encoders on this dataset. Our experiments reveal that a multilingual model based on XLM-RoBERTa achieves the best results, except for English. Lower resources languages seem to learn from one another as a multilingual model achieves a higher MRR than language-specific ones. Our qualitative analysis reveals the brittleness of the model on simple word changes. We publicly release our dataset, model and training script.
△ Less
Submitted 5 October, 2021; v1 submitted 27 September, 2021;
originally announced September 2021.
-
ConveRT for FAQ Answering
Authors:
Maxime De Bruyn,
Ehsan Lotfi,
Jeska Buhmann,
Walter Daelemans
Abstract:
Knowledgeable FAQ chatbots are a valuable resource to any organization. While powerful and efficient retrieval-based models exist for English, it is rarely the case for other languages for which the same amount of training data is not available. In this paper, we propose a novel pre-training procedure to adapt ConveRT, an English conversational retriever model, to other languages with less trainin…
▽ More
Knowledgeable FAQ chatbots are a valuable resource to any organization. While powerful and efficient retrieval-based models exist for English, it is rarely the case for other languages for which the same amount of training data is not available. In this paper, we propose a novel pre-training procedure to adapt ConveRT, an English conversational retriever model, to other languages with less training data available. We apply it for the first time to the task of Dutch FAQ answering related to the COVID-19 vaccine. We show it performs better than an open-source alternative in both a low-data regime and a high-data regime.
△ Less
Submitted 14 October, 2021; v1 submitted 2 August, 2021;
originally announced August 2021.
-
Spatially Dependent U-Nets: Highly Accurate Architectures for Medical Imaging Segmentation
Authors:
João B. S. Carvalho,
João A. Santinha,
Đorđe Miladinović,
Joachim M. Buhmann
Abstract:
In clinical practice, regions of interest in medical imaging often need to be identified through a process of precise image segmentation. The quality of this image segmentation step critically affects the subsequent clinical assessment of the patient status. To enable high accuracy, automatic image segmentation, we introduce a novel deep neural network architecture that exploits the inherent spati…
▽ More
In clinical practice, regions of interest in medical imaging often need to be identified through a process of precise image segmentation. The quality of this image segmentation step critically affects the subsequent clinical assessment of the patient status. To enable high accuracy, automatic image segmentation, we introduce a novel deep neural network architecture that exploits the inherent spatial coherence of anatomical structures and is well equipped to capture long-range spatial dependencies in the segmented pixel/voxel space. In contrast to the state-of-the-art solutions based on convolutional layers, our approach leverages on recently introduced spatial dependency layers that have an unbounded receptive field and explicitly model the inductive bias of spatial coherence. Our method performs favourably to commonly used U-Net and U-Net++ architectures as demonstrated by improved Dice and Jaccardscore in three different medical segmentation tasks: nuclei segmentation in microscopy images, polyp segmentation in colonoscopy videos, and liver segmentation in abdominal CT scans.
△ Less
Submitted 22 March, 2021;
originally announced March 2021.
-
Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling
Authors:
Đorđe Miladinović,
Aleksandar Stanić,
Stefan Bauer,
Jürgen Schmidhuber,
Joachim M. Buhmann
Abstract:
How to improve generative modeling by better exploiting spatial regularities and coherence in images? We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs). In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way, using a sequential gating-based mechani…
▽ More
How to improve generative modeling by better exploiting spatial regularities and coherence in images? We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs). In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way, using a sequential gating-based mechanism that distributes contextual information across 2-D space. We show that augmenting the decoder of a hierarchical VAE by spatial dependency layers considerably improves density estimation over baseline convolutional architectures and the state-of-the-art among the models within the same class. Furthermore, we demonstrate that SDN can be applied to large images by synthesizing samples of high quality and coherence. In a vanilla VAE setting, we find that a powerful SDN decoder also improves learning disentangled representations, indicating that neural architectures play an important role in this task. Our results suggest favoring spatial dependency over convolutional layers in various VAE settings. The accompanying source code is given at https://github.com/djordjemila/sdn.
△ Less
Submitted 16 March, 2021;
originally announced March 2021.
-
On maximum-likelihood estimation in the all-or-nothing regime
Authors:
Luca Corinzia,
Paolo Penna,
Wojciech Szpankowski,
Joachim M. Buhmann
Abstract:
We study the problem of estimating a rank-1 additive deformation of a Gaussian tensor according to the \emph{maximum-likelihood estimator} (MLE). The analysis is carried out in the sparse setting, where the underlying signal has a support that scales sublinearly with the total number of dimensions. We show that for Bernoulli distributed signals, the MLE undergoes an \emph{all-or-nothing} (AoN) pha…
▽ More
We study the problem of estimating a rank-1 additive deformation of a Gaussian tensor according to the \emph{maximum-likelihood estimator} (MLE). The analysis is carried out in the sparse setting, where the underlying signal has a support that scales sublinearly with the total number of dimensions. We show that for Bernoulli distributed signals, the MLE undergoes an \emph{all-or-nothing} (AoN) phase transition, already established for the minimum mean-square-error estimator (MMSE) in the same problem. The result follows from two main technical points: (i) the connection established between the MLE and the MMSE, using the first and second-moment methods in the constrained signal space, (ii) a recovery regime for the MMSE stricter than the simple error vanishing characterization given in the standard AoN, that is here proved as a general result.
△ Less
Submitted 25 January, 2021;
originally announced January 2021.
-
Statistical and computational thresholds for the planted $k$-densest sub-hypergraph problem
Authors:
Luca Corinzia,
Paolo Penna,
Wojciech Szpankowski,
Joachim M. Buhmann
Abstract:
In this work, we consider the problem of recovery a planted $k$-densest sub-hypergraph on $d$-uniform hypergraphs. This fundamental problem appears in different contexts, e.g., community detection, average-case complexity, and neuroscience applications as a structural variant of tensor-PCA problem. We provide tight \emph{information-theoretic} upper and lower bounds for the exact recovery threshol…
▽ More
In this work, we consider the problem of recovery a planted $k$-densest sub-hypergraph on $d$-uniform hypergraphs. This fundamental problem appears in different contexts, e.g., community detection, average-case complexity, and neuroscience applications as a structural variant of tensor-PCA problem. We provide tight \emph{information-theoretic} upper and lower bounds for the exact recovery threshold by the maximum-likelihood estimator, as well as \emph{algorithmic} bounds based on approximate message passing algorithms. The problem exhibits a typical statistical-to-computational gap observed in analogous sparse settings that widen with increasing sparsity of the problem. The bounds show that the signal structure impacts the location of the statistical and computational phase transition that the known existing bounds for the tensor-PCA model do not capture. This effect is due to the generic planted signal prior that this latter model addresses.
△ Less
Submitted 28 January, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
Microtubule Tracking in Electron Microscopy Volumes
Authors:
Nils Eckstein,
Julia Buhmann,
Matthew Cook,
Jan Funke
Abstract:
We present a method for microtubule tracking in electron microscopy volumes. Our method first identifies a sparse set of voxels that likely belong to microtubules. Similar to prior work, we then enumerate potential edges between these voxels, which we represent in a candidate graph. Tracks of microtubules are found by selecting nodes and edges in the candidate graph by solving a constrained optimi…
▽ More
We present a method for microtubule tracking in electron microscopy volumes. Our method first identifies a sparse set of voxels that likely belong to microtubules. Similar to prior work, we then enumerate potential edges between these voxels, which we represent in a candidate graph. Tracks of microtubules are found by selecting nodes and edges in the candidate graph by solving a constrained optimization problem incorporating biological priors on microtubule structure. For this, we present a novel integer linear programming formulation, which results in speed-ups of three orders of magnitude and an increase of 53% in accuracy compared to prior art (evaluated on three 1.2 x 4 x 4$μ$m volumes of Drosophila neural tissue). We also propose a scheme to solve the optimization problem in a block-wise fashion, which allows distributed tracking and is necessary to process very large electron microscopy volumes. Finally, we release a benchmark dataset for microtubule tracking, here used for training, testing and validation, consisting of eight 30 x 1000 x 1000 voxel blocks (1.2 x 4 x 4$μ$m) of densely annotated microtubules in the CREMI data set (https://github.com/nilsec/micron).
△ Less
Submitted 17 September, 2020;
originally announced September 2020.
-
Neural collaborative filtering for unsupervised mitral valve segmentation in echocardiography
Authors:
Luca Corinzia,
Fabian Laumer,
Alessandro Candreva,
Maurizio Taramasso,
Francesco Maisano,
Joachim M. Buhmann
Abstract:
The segmentation of the mitral valve annulus and leaflets specifies a crucial first step to establish a machine learning pipeline that can support physicians in performing multiple tasks, e.g.\ diagnosis of mitral valve diseases, surgical planning, and intraoperative procedures. Current methods for mitral valve segmentation on 2D echocardiography videos require extensive interaction with annotator…
▽ More
The segmentation of the mitral valve annulus and leaflets specifies a crucial first step to establish a machine learning pipeline that can support physicians in performing multiple tasks, e.g.\ diagnosis of mitral valve diseases, surgical planning, and intraoperative procedures. Current methods for mitral valve segmentation on 2D echocardiography videos require extensive interaction with annotators and perform poorly on low-quality and noisy videos. We propose an automated and unsupervised method for the mitral valve segmentation based on a low dimensional embedding of the echocardiography videos using neural network collaborative filtering. The method is evaluated in a collection of echocardiography videos of patients with a variety of mitral valve diseases, and additionally on an independent test cohort. It outperforms state-of-the-art \emph{unsupervised} and \emph{supervised} methods on low-quality videos or in the case of sparse annotation.
△ Less
Submitted 13 August, 2020;
originally announced August 2020.
-
Continuous Submodular Function Maximization
Authors:
Yatao Bian,
Joachim M. Buhmann,
Andreas Krause
Abstract:
Continuous submodular functions are a category of generally non-convex/non-concave functions with a wide spectrum of applications. The celebrated property of this class of functions - continuous submodularity - enables both exact minimization and approximate maximization in poly. time. Continuous submodularity is obtained by generalizing the notion of submodularity from discrete domains to continu…
▽ More
Continuous submodular functions are a category of generally non-convex/non-concave functions with a wide spectrum of applications. The celebrated property of this class of functions - continuous submodularity - enables both exact minimization and approximate maximization in poly. time. Continuous submodularity is obtained by generalizing the notion of submodularity from discrete domains to continuous domains. It intuitively captures a repulsive effect amongst different dimensions of the defined multivariate function.
In this paper, we systematically study continuous submodularity and a class of non-convex optimization problems: continuous submodular function maximization. We start by a thorough characterization of the class of continuous submodular functions, and show that continuous submodularity is equivalent to a weak version of the diminishing returns (DR) property. Thus we also derive a subclass of continuous submodular functions, termed continuous DR-submodular functions, which enjoys the full DR property. Then we present operations that preserve continuous (DR-)submodularity, thus yielding general rules for composing new submodular functions. We establish intriguing properties for the problem of constrained DR-submodular maximization, such as the local-global relation. We identify several applications of continuous submodular optimization, ranging from influence maximization, MAP inference for DPPs to provable mean field inference. For these applications, continuous submodularity formalizes valuable domain knowledge relevant for optimizing this class of objectives. We present inapproximability results and provable algorithms for two problem settings: constrained monotone DR-submodular maximization and constrained non-monotone DR-submodular maximization. Finally, we extensively evaluate the effectiveness of the proposed algorithms.
△ Less
Submitted 24 June, 2020;
originally announced June 2020.
-
From Sets to Multisets: Provable Variational Inference for Probabilistic Integer Submodular Models
Authors:
Aytunc Sahin,
Yatao Bian,
Joachim M. Buhmann,
Andreas Krause
Abstract:
Submodular functions have been studied extensively in machine learning and data mining. In particular, the optimization of submodular functions over the integer lattice (integer submodular functions) has recently attracted much interest, because this domain relates naturally to many practical problem settings, such as multilabel graph cut, budget allocation and revenue maximization with discrete a…
▽ More
Submodular functions have been studied extensively in machine learning and data mining. In particular, the optimization of submodular functions over the integer lattice (integer submodular functions) has recently attracted much interest, because this domain relates naturally to many practical problem settings, such as multilabel graph cut, budget allocation and revenue maximization with discrete assignments. In contrast, the use of these functions for probabilistic modeling has received surprisingly little attention so far. In this work, we firstly propose the Generalized Multilinear Extension, a continuous DR-submodular extension for integer submodular functions. We study central properties of this extension and formulate a new probabilistic model which is defined through integer submodular functions. Then, we introduce a block-coordinate ascent algorithm to perform approximate inference for those class of models. Finally, we demonstrate its effectiveness and viability on several real-world social connection graph datasets with integer submodular objectives.
△ Less
Submitted 1 June, 2020;
originally announced June 2020.
-
Rig-space Neural Rendering
Authors:
Dominik Borer,
Lu Yuhang,
Laura Wuelfroth,
Jakob Buhmann,
Martin Guay
Abstract:
Movie productions use high resolution 3d characters with complex proprietary rigs to create the highest quality images possible for large displays. Unfortunately, these 3d assets are typically not compatible with real-time graphics engines used for games, mixed reality and real-time pre-visualization. Consequently, the 3d characters need to be re-modeled and re-rigged for these new applications, r…
▽ More
Movie productions use high resolution 3d characters with complex proprietary rigs to create the highest quality images possible for large displays. Unfortunately, these 3d assets are typically not compatible with real-time graphics engines used for games, mixed reality and real-time pre-visualization. Consequently, the 3d characters need to be re-modeled and re-rigged for these new applications, requiring weeks of work and artistic approval. Our solution to this problem is to learn a compact image-based rendering of the original 3d character, conditioned directly on the rig parameters. Our idea is to render the character in many different poses and views, and to train a deep neural network to render high resolution images, from the rig parameters directly. Many neural rendering techniques have been proposed to render from 2d skeletons, or geometry and UV maps. However these require manual work, and to do not remain compatible with the animator workflow of manipulating rig widgets, as well as the real-time game engine pipeline of interpolating rig parameters. We extend our architecture to support dynamic re-lighting and composition with other 3d objects in the scene. We designed a network that efficiently generates multiple scene feature maps such as normals, depth, albedo and mask, which are composed with other scene objects to form the final image.
△ Less
Submitted 22 March, 2020;
originally announced March 2020.
-
Variational Federated Multi-Task Learning
Authors:
Luca Corinzia,
Ami Beuret,
Joachim M. Buhmann
Abstract:
In federated learning, a central server coordinates the training of a single model on a massively distributed network of devices. This setting can be naturally extended to a multi-task learning framework, to handle real-world federated datasets that typically show strong statistical heterogeneity among devices. Despite federated multi-task learning being shown to be an effective paradigm for real-…
▽ More
In federated learning, a central server coordinates the training of a single model on a massively distributed network of devices. This setting can be naturally extended to a multi-task learning framework, to handle real-world federated datasets that typically show strong statistical heterogeneity among devices. Despite federated multi-task learning being shown to be an effective paradigm for real-world datasets, it has been applied only on convex models. In this work, we introduce VIRTUAL, an algorithm for federated multi-task learning for general non-convex models. In VIRTUAL the federated network of the server and the clients is treated as a star-shaped Bayesian network, and learning is performed on the network using approximated variational inference. We show that this method is effective on real-world federated datasets, outperforming the current state-of-the-art for federated learning, and concurrently allowing sparser gradient updates.
△ Less
Submitted 4 February, 2021; v1 submitted 14 June, 2019;
originally announced June 2019.
-
Disentangled State Space Representations
Authors:
Đorđe Miladinović,
Muhammad Waleed Gondal,
Bernhard Schölkopf,
Joachim M. Buhmann,
Stefan Bauer
Abstract:
Sequential data often originates from diverse domains across which statistical regularities and domain specifics exist. To specifically learn cross-domain sequence representations, we introduce disentangled state space models (DSSM) -- a class of SSM in which domain-invariant state dynamics is explicitly disentangled from domain-specific information governing that dynamics. We analyze how such sep…
▽ More
Sequential data often originates from diverse domains across which statistical regularities and domain specifics exist. To specifically learn cross-domain sequence representations, we introduce disentangled state space models (DSSM) -- a class of SSM in which domain-invariant state dynamics is explicitly disentangled from domain-specific information governing that dynamics. We analyze how such separation can improve knowledge transfer to new domains, and enable robust prediction, sequence manipulation and domain characterization. We furthermore propose an unsupervised VAE-based training procedure to implement DSSM in form of Bayesian filters. In our experiments, we applied VAE-DSSM framework to achieve competitive performance in online ODE system identification and regression across experimental settings, and controlled generation and prediction of bouncing ball video sequences across varying gravitational influences.
△ Less
Submitted 7 June, 2019;
originally announced June 2019.
-
Domain Authoring Assistant for Intelligent Virtual Agents
Authors:
Sepehr Janghorbani,
Ashutosh Modi,
Jakob Buhmann,
Mubbasir Kapadia
Abstract:
Develo** intelligent virtual characters has attracted a lot of attention in the recent years. The process of creating such characters often involves a team of creative authors who describe different aspects of the characters in natural language, and planning experts that translate this description into a planning domain. This can be quite challenging as the team of creative authors should dilige…
▽ More
Develo** intelligent virtual characters has attracted a lot of attention in the recent years. The process of creating such characters often involves a team of creative authors who describe different aspects of the characters in natural language, and planning experts that translate this description into a planning domain. This can be quite challenging as the team of creative authors should diligently define every aspect of the character especially if it contains complex human-like behavior. Also a team of engineers has to manually translate the natural language description of a character's personality into the planning domain knowledge. This can be extremely time and resource demanding and can be an obstacle to author's creativity. The goal of this paper is to introduce an authoring assistant tool to automate the process of domain generation from natural language description of virtual characters, thus bridging between the creative authoring team and the planning domain experts. Moreover, the proposed tool also identifies possible missing information in the domain description and iteratively makes suggestions to the author.
△ Less
Submitted 5 April, 2019;
originally announced April 2019.
-
Learning Counterfactual Representations for Estimating Individual Dose-Response Curves
Authors:
Patrick Schwab,
Lorenz Linhardt,
Stefan Bauer,
Joachim M. Buhmann,
Walter Karlen
Abstract:
Estimating what would be an individual's potential response to varying levels of exposure to a treatment is of high practical relevance for several important fields, such as healthcare, economics and public policy. However, existing methods for learning to estimate counterfactual outcomes from observational data are either focused on estimating average dose-response curves, or limited to settings…
▽ More
Estimating what would be an individual's potential response to varying levels of exposure to a treatment is of high practical relevance for several important fields, such as healthcare, economics and public policy. However, existing methods for learning to estimate counterfactual outcomes from observational data are either focused on estimating average dose-response curves, or limited to settings with only two treatments that do not have an associated dosage parameter. Here, we present a novel machine-learning approach towards learning counterfactual representations for estimating individual dose-response curves for any number of treatments with continuous dosage parameters with neural networks. Building on the established potential outcomes framework, we introduce performance metrics, model selection criteria, model architectures, and open benchmarks for estimating individual dose-response curves. Our experiments show that the methods developed in this work set a new state-of-the-art in estimating individual dose-response.
△ Less
Submitted 10 December, 2020; v1 submitted 3 February, 2019;
originally announced February 2019.
-
Exact Recovery for a Family of Community-Detection Generative Models
Authors:
Luca Corinzia,
Paolo Penna,
Luca Mondada,
Joachim M. Buhmann
Abstract:
Generative models for networks with communities have been studied extensively for being a fertile ground to establish information-theoretic and computational thresholds. In this paper we propose a new toy model for planted generative models called planted Random Energy Model (REM), inspired by Derrida's REM. For this model we provide the asymptotic behaviour of the probability of error for the max…
▽ More
Generative models for networks with communities have been studied extensively for being a fertile ground to establish information-theoretic and computational thresholds. In this paper we propose a new toy model for planted generative models called planted Random Energy Model (REM), inspired by Derrida's REM. For this model we provide the asymptotic behaviour of the probability of error for the maximum likelihood estimator and hence the exact recovery threshold. As an application, we further consider the 2 non-equally sized community Weighted Stochastic Block Model (2-WSBM) on $h$-uniform hypergraphs, that is equivalent to the P-REM on both sides of the spectrum, for high and low edge cardinality $h$. We provide upper and lower bounds for the exact recoverability for any $h$, map** these problems to the aforementioned P-REM. To the best of our knowledge these are the first consistency results for the 2-WSBM on graphs and on hypergraphs with non-equally sized community.
△ Less
Submitted 30 April, 2019; v1 submitted 21 January, 2019;
originally announced January 2019.
-
Synaptic partner prediction from point annotations in insect brains
Authors:
Julia Buhmann,
Renate Krause,
Rodrigo Ceballos Lentini,
Nils Eckstein,
Matthew Cook,
Srinivas Turaga,
Jan Funke
Abstract:
High-throughput electron microscopy allows recording of lar- ge stacks of neural tissue with sufficient resolution to extract the wiring diagram of the underlying neural network. Current efforts to automate this process focus mainly on the segmentation of neurons. However, in order to recover a wiring diagram, synaptic partners need to be identi- fied as well. This is especially challenging in ins…
▽ More
High-throughput electron microscopy allows recording of lar- ge stacks of neural tissue with sufficient resolution to extract the wiring diagram of the underlying neural network. Current efforts to automate this process focus mainly on the segmentation of neurons. However, in order to recover a wiring diagram, synaptic partners need to be identi- fied as well. This is especially challenging in insect brains like Drosophila melanogaster, where one presynaptic site is associated with multiple post- synaptic elements. Here we propose a 3D U-Net architecture to directly identify pairs of voxels that are pre- and postsynaptic to each other. To that end, we formulate the problem of synaptic partner identification as a classification problem on long-range edges between voxels to encode both the presence of a synaptic pair and its direction. This formulation allows us to directly learn from synaptic point annotations instead of more ex- pensive voxel-based synaptic cleft or vesicle annotations. We evaluate our method on the MICCAI 2016 CREMI challenge and improve over the current state of the art, producing 3% fewer errors than the next best method.
△ Less
Submitted 16 July, 2018; v1 submitted 21 June, 2018;
originally announced June 2018.
-
Optimal DR-Submodular Maximization and Applications to Provable Mean Field Inference
Authors:
An Bian,
Joachim M. Buhmann,
Andreas Krause
Abstract:
Mean field inference in probabilistic models is generally a highly nonconvex problem. Existing optimization methods, e.g., coordinate ascent algorithms, can only generate local optima.
In this work we propose provable mean filed methods for probabilistic log-submodular models and its posterior agreement (PA) with strong approximation guarantees. The main algorithmic technique is a new Double Gre…
▽ More
Mean field inference in probabilistic models is generally a highly nonconvex problem. Existing optimization methods, e.g., coordinate ascent algorithms, can only generate local optima.
In this work we propose provable mean filed methods for probabilistic log-submodular models and its posterior agreement (PA) with strong approximation guarantees. The main algorithmic technique is a new Double Greedy scheme, termed DR-DoubleGreedy, for continuous DR-submodular maximization with box-constraints. It is a one-pass algorithm with linear time complexity, reaching the optimal 1/2 approximation ratio, which may be of independent interest. We validate the superior performance of our algorithms against baseline algorithms on both synthetic and real-world datasets.
△ Less
Submitted 29 November, 2018; v1 submitted 18 May, 2018;
originally announced May 2018.
-
Fast Gaussian Process Based Gradient Matching for Parameter Identification in Systems of Nonlinear ODEs
Authors:
Philippe Wenk,
Alkis Gotovos,
Stefan Bauer,
Nico Gorbach,
Andreas Krause,
Joachim M. Buhmann
Abstract:
Parameter identification and comparison of dynamical systems is a challenging task in many fields. Bayesian approaches based on Gaussian process regression over time-series data have been successfully applied to infer the parameters of a dynamical system without explicitly solving it. While the benefits in computational cost are well established, a rigorous mathematical framework has been missing.…
▽ More
Parameter identification and comparison of dynamical systems is a challenging task in many fields. Bayesian approaches based on Gaussian process regression over time-series data have been successfully applied to infer the parameters of a dynamical system without explicitly solving it. While the benefits in computational cost are well established, a rigorous mathematical framework has been missing. We offer a novel interpretation which leads to a better understanding and improvements in state-of-the-art performance in terms of accuracy for nonlinear dynamical systems.
△ Less
Submitted 1 March, 2019; v1 submitted 12 April, 2018;
originally announced April 2018.
-
Continuous DR-submodular Maximization: Structure and Algorithms
Authors:
An Bian,
Kfir Y. Levy,
Andreas Krause,
Joachim M. Buhmann
Abstract:
DR-submodular continuous functions are important objectives with wide real-world applications spanning MAP inference in determinantal point processes (DPPs), and mean-field inference for probabilistic submodular models, amongst others. DR-submodularity captures a subclass of non-convex functions that enables both exact minimization and approximate maximization in polynomial time.
In this work we…
▽ More
DR-submodular continuous functions are important objectives with wide real-world applications spanning MAP inference in determinantal point processes (DPPs), and mean-field inference for probabilistic submodular models, amongst others. DR-submodularity captures a subclass of non-convex functions that enables both exact minimization and approximate maximization in polynomial time.
In this work we study the problem of maximizing non-monotone DR-submodular continuous functions under general down-closed convex constraints. We start by investigating geometric properties that underlie such objectives, e.g., a strong relation between (approximately) stationary points and global optimum is proved. These properties are then used to devise two optimization algorithms with provable guarantees. Concretely, we first devise a "two-phase" algorithm with $1/4$ approximation guarantee. This algorithm allows the use of existing methods for finding (approximately) stationary points as a subroutine, thus, harnessing recent progress in non-convex optimization. Then we present a non-monotone Frank-Wolfe variant with $1/e$ approximation guarantee and sublinear convergence rate. Finally, we extend our approach to a broader class of generalized DR-submodular continuous functions, which captures a wider spectrum of applications. Our theoretical findings are validated on synthetic and real-world problem instances.
△ Less
Submitted 24 May, 2019; v1 submitted 3 November, 2017;
originally announced November 2017.
-
Guarantees for Greedy Maximization of Non-submodular Functions with Applications
Authors:
Andrew An Bian,
Joachim M. Buhmann,
Andreas Krause,
Sebastian Tschiatschek
Abstract:
We investigate the performance of the standard Greedy algorithm for cardinality constrained maximization of non-submodular nondecreasing set functions. While there are strong theoretical guarantees on the performance of Greedy for maximizing submodular functions, there are few guarantees for non-submodular ones. However, Greedy enjoys strong empirical performance for many important non-submodular…
▽ More
We investigate the performance of the standard Greedy algorithm for cardinality constrained maximization of non-submodular nondecreasing set functions. While there are strong theoretical guarantees on the performance of Greedy for maximizing submodular functions, there are few guarantees for non-submodular ones. However, Greedy enjoys strong empirical performance for many important non-submodular functions, e.g., the Bayesian A-optimality objective in experimental design. We prove theoretical guarantees supporting the empirical performance. Our guarantees are characterized by a combination of the (generalized) curvature $α$ and the submodularity ratio $γ$. In particular, we prove that Greedy enjoys a tight approximation guarantee of $\frac{1}α(1- e^{-γα})$ for cardinality constrained maximization. In addition, we bound the submodularity ratio and curvature for several important real-world objectives, including the Bayesian A-optimality objective, the determinantal function of a square submatrix and certain linear programs with combinatorial constraints. We experimentally validate our theoretical findings for both synthetic and real-world applications.
△ Less
Submitted 14 May, 2019; v1 submitted 6 March, 2017;
originally announced March 2017.
-
Scalable Adaptive Stochastic Optimization Using Random Projections
Authors:
Gabriel Krummenacher,
Brian McWilliams,
Yannic Kilcher,
Joachim M. Buhmann,
Nicolai Meinshausen
Abstract:
Adaptive stochastic gradient methods such as AdaGrad have gained popularity in particular for training deep neural networks. The most commonly used and studied variant maintains a diagonal matrix approximation to second order information by accumulating past gradients which are used to tune the step size adaptively. In certain situations the full-matrix variant of AdaGrad is expected to attain bet…
▽ More
Adaptive stochastic gradient methods such as AdaGrad have gained popularity in particular for training deep neural networks. The most commonly used and studied variant maintains a diagonal matrix approximation to second order information by accumulating past gradients which are used to tune the step size adaptively. In certain situations the full-matrix variant of AdaGrad is expected to attain better performance, however in high dimensions it is computationally impractical. We present Ada-LR and RadaGrad two computationally efficient approximations to full-matrix AdaGrad based on randomized dimensionality reduction. They are able to capture dependencies between features and achieve similar performance to full-matrix AdaGrad but at a much smaller computational cost. We show that the regret of Ada-LR is close to the regret of full-matrix AdaGrad which can have an up-to exponentially smaller dependence on the dimension than the diagonal variant. Empirically, we show that Ada-LR and RadaGrad perform similarly to full-matrix AdaGrad. On the task of training convolutional neural networks as well as recurrent neural networks, RadaGrad achieves faster convergence than diagonal AdaGrad.
△ Less
Submitted 21 November, 2016;
originally announced November 2016.
-
Greedy MAXCUT Algorithms and their Information Content
Authors:
Yatao Bian,
Alexey Gronskiy,
Joachim M. Buhmann
Abstract:
MAXCUT defines a classical NP-hard problem for graph partitioning and it serves as a typical case of the symmetric non-monotone Unconstrained Submodular Maximization (USM) problem. Applications of MAXCUT are abundant in machine learning, computer vision and statistical physics. Greedy algorithms to approximately solve MAXCUT rely on greedy vertex labelling or on an edge contraction strategy. These…
▽ More
MAXCUT defines a classical NP-hard problem for graph partitioning and it serves as a typical case of the symmetric non-monotone Unconstrained Submodular Maximization (USM) problem. Applications of MAXCUT are abundant in machine learning, computer vision and statistical physics. Greedy algorithms to approximately solve MAXCUT rely on greedy vertex labelling or on an edge contraction strategy. These algorithms have been studied by measuring their approximation ratios in the worst case setting but very little is known to characterize their robustness to noise contaminations of the input data in the average case. Adapting the framework of Approximation Set Coding, we present a method to exactly measure the cardinality of the algorithmic approximation sets of five greedy MAXCUT algorithms. Their information contents are explored for graph instances generated by two different noise models: the edge reversal model and Gaussian edge weights model. The results provide insights into the robustness of different greedy heuristics and techniques for MAXCUT, which can be used for algorithm design of general USM problems.
△ Less
Submitted 3 September, 2016;
originally announced September 2016.
-
Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains
Authors:
Andrew An Bian,
Baharan Mirzasoleiman,
Joachim M. Buhmann,
Andreas Krause
Abstract:
Submodular continuous functions are a category of (generally) non-convex/non-concave functions with a wide spectrum of applications. We characterize these functions and demonstrate that they can be maximized efficiently with approximation guarantees. Specifically, i) We introduce the weak DR property that gives a unified characterization of submodularity for all set, integer-lattice and continuous…
▽ More
Submodular continuous functions are a category of (generally) non-convex/non-concave functions with a wide spectrum of applications. We characterize these functions and demonstrate that they can be maximized efficiently with approximation guarantees. Specifically, i) We introduce the weak DR property that gives a unified characterization of submodularity for all set, integer-lattice and continuous functions; ii) for maximizing monotone DR-submodular continuous functions under general down-closed convex constraints, we propose a Frank-Wolfe variant with $(1-1/e)$ approximation guarantee, and sub-linear convergence rate; iii) for maximizing general non-monotone submodular continuous functions subject to box constraints, we propose a DoubleGreedy algorithm with $1/3$ approximation guarantee. Submodular continuous functions naturally find applications in various real-world settings, including influence and revenue maximization with continuous assignments, sensor energy management, multi-resolution data summarization, facility location, etc. Experimental results show that the proposed algorithms efficiently generate superior solutions compared to baseline algorithms.
△ Less
Submitted 6 May, 2019; v1 submitted 17 June, 2016;
originally announced June 2016.
-
Multi-Organ Cancer Classification and Survival Analysis
Authors:
Stefan Bauer,
Nicolas Carion,
Peter Schüffler,
Thomas Fuchs,
Peter Wild,
Joachim M. Buhmann
Abstract:
Accurate and robust cell nuclei classification is the cornerstone for a wider range of tasks in digital and Computational Pathology. However, most machine learning systems require extensive labeling from expert pathologists for each individual problem at hand, with no or limited abilities for knowledge transfer between datasets and organ sites. In this paper we implement and evaluate a variety of…
▽ More
Accurate and robust cell nuclei classification is the cornerstone for a wider range of tasks in digital and Computational Pathology. However, most machine learning systems require extensive labeling from expert pathologists for each individual problem at hand, with no or limited abilities for knowledge transfer between datasets and organ sites. In this paper we implement and evaluate a variety of deep neural network models and model ensembles for nuclei classification in renal cell cancer (RCC) and prostate cancer (PCa). We propose a convolutional neural network system based on residual learning which significantly improves over the state-of-the-art in cell nuclei classification. Finally, we show that the combination of tissue types during training increases not only classification accuracy but also overall survival analysis.
△ Less
Submitted 2 December, 2016; v1 submitted 2 June, 2016;
originally announced June 2016.
-
TI-POOLING: transformation-invariant pooling for feature learning in Convolutional Neural Networks
Authors:
Dmitry Laptev,
Nikolay Savinov,
Joachim M. Buhmann,
Marc Pollefeys
Abstract:
In this paper we present a deep neural network topology that incorporates a simple to implement transformation invariant pooling operator (TI-POOLING). This operator is able to efficiently handle prior knowledge on nuisance variations in the data, such as rotation or scale changes. Most current methods usually make use of dataset augmentation to address this issue, but this requires larger number…
▽ More
In this paper we present a deep neural network topology that incorporates a simple to implement transformation invariant pooling operator (TI-POOLING). This operator is able to efficiently handle prior knowledge on nuisance variations in the data, such as rotation or scale changes. Most current methods usually make use of dataset augmentation to address this issue, but this requires larger number of model parameters and more training data, and results in significantly increased training time and larger chance of under- or overfitting. The main reason for these drawbacks is that the learned model needs to capture adequate features for all the possible transformations of the input. On the other hand, we formulate features in convolutional neural networks to be transformation-invariant. We achieve that using parallel siamese architectures for the considered transformation set and applying the TI-POOLING operator on their outputs before the fully-connected layers. We show that this topology internally finds the most optimal "canonical" instance of the input image for training and therefore limits the redundancy in learned features. This more efficient use of training data results in better performance on popular benchmark datasets with smaller number of parameters when comparing to standard convolutional neural networks with dataset augmentation and to other baselines.
△ Less
Submitted 22 September, 2016; v1 submitted 21 April, 2016;
originally announced April 2016.
-
Computational Pathology: Challenges and Promises for Tissue Analysis
Authors:
Thomas J. Fuchs,
Joachim M. Buhmann
Abstract:
The histological assessment of human tissue has emerged as the key challenge for detection and treatment of cancer. A plethora of different data sources ranging from tissue microarray data to gene expression, proteomics or metabolomics data provide a detailed overview of the health status of a patient. Medical doctors need to assess these information sources and they rely on data driven automatic…
▽ More
The histological assessment of human tissue has emerged as the key challenge for detection and treatment of cancer. A plethora of different data sources ranging from tissue microarray data to gene expression, proteomics or metabolomics data provide a detailed overview of the health status of a patient. Medical doctors need to assess these information sources and they rely on data driven automatic analysis tools. Methods for classification, grou** and segmentation of heterogeneous data sources as well as regression of noisy dependencies and estimation of survival probabilities enter the processing workflow of a pathology diagnosis system at various stages. This paper reports on state-of-the-art of the design and effectiveness of computational pathology workflows and it discusses future research directions in this emergent field of medical informatics and diagnostic machine learning.
△ Less
Submitted 31 December, 2015;
originally announced January 2016.
-
Kickback cuts Backprop's red-tape: Biologically plausible credit assignment in neural networks
Authors:
David Balduzzi,
Hastagiri Vanchinathan,
Joachim Buhmann
Abstract:
Error backpropagation is an extremely effective algorithm for assigning credit in artificial neural networks. However, weight updates under Backprop depend on lengthy recursive computations and require separate output and error messages -- features not shared by biological neurons, that are perhaps unnecessary. In this paper, we revisit Backprop and the credit assignment problem. We first decompos…
▽ More
Error backpropagation is an extremely effective algorithm for assigning credit in artificial neural networks. However, weight updates under Backprop depend on lengthy recursive computations and require separate output and error messages -- features not shared by biological neurons, that are perhaps unnecessary. In this paper, we revisit Backprop and the credit assignment problem. We first decompose Backprop into a collection of interacting learning algorithms; provide regret bounds on the performance of these sub-algorithms; and factorize Backprop's error signals. Using these results, we derive a new credit assignment algorithm for nonparametric regression, Kickback, that is significantly simpler than Backprop. Finally, we provide a sufficient condition for Kickback to follow error gradients, and show that Kickback matches Backprop's performance on real-world regression benchmarks.
△ Less
Submitted 22 November, 2014;
originally announced November 2014.
-
Correlated random features for fast semi-supervised learning
Authors:
Brian McWilliams,
David Balduzzi,
Joachim M. Buhmann
Abstract:
This paper presents Correlated Nystrom Views (XNV), a fast semi-supervised algorithm for regression and classification. The algorithm draws on two main ideas. First, it generates two views consisting of computationally inexpensive random features. Second, XNV applies multiview regression using Canonical Correlation Analysis (CCA) on unlabeled data to bias the regression towards useful features. It…
▽ More
This paper presents Correlated Nystrom Views (XNV), a fast semi-supervised algorithm for regression and classification. The algorithm draws on two main ideas. First, it generates two views consisting of computationally inexpensive random features. Second, XNV applies multiview regression using Canonical Correlation Analysis (CCA) on unlabeled data to bias the regression towards useful features. It has been shown that, if the views contains accurate estimators, CCA regression can substantially reduce variance with a minimal increase in bias. Random views are justified by recent theoretical and empirical work showing that regression with random features closely approximates kernel regression, implying that random views can be expected to contain accurate estimators. We show that XNV consistently outperforms a state-of-the-art algorithm for semi-supervised learning: substantially improving predictive performance and reducing the variability of performance on a wide variety of real-world datasets, whilst also reducing runtime by orders of magnitude.
△ Less
Submitted 5 November, 2013; v1 submitted 24 June, 2013;
originally announced June 2013.