Search | arXiv e-print repository

arXiv:2403.19492 [pdf, other]

Segmentation tool for images of cracks

Authors: Andrii Kompanets, Remco Duits, Davide Leonetti, Nicky van den Berg, H. H., Snijder

Abstract: Safety-critical infrastructures, such as bridges, are periodically inspected to check for existing damage, such as fatigue cracks and corrosion, and to guarantee the safe use of the infrastructure. Visual inspection is the most frequent type of general inspection, despite the fact that its detection capability is rather limited, especially for fatigue cracks. Machine learning algorithms can be use… ▽ More Safety-critical infrastructures, such as bridges, are periodically inspected to check for existing damage, such as fatigue cracks and corrosion, and to guarantee the safe use of the infrastructure. Visual inspection is the most frequent type of general inspection, despite the fact that its detection capability is rather limited, especially for fatigue cracks. Machine learning algorithms can be used for augmenting the capability of classical visual inspection of bridge structures, however, the implementation of such an algorithm requires a massive annotated training dataset, which is time-consuming to produce. This paper proposes a semi-automatic crack segmentation tool that eases the manual segmentation of cracks on images needed to create a training dataset for a machine learning algorithm. Also, it can be used to measure the geometry of the crack. This tool makes use of an image processing algorithm, which was initially developed for the analysis of vascular systems on retinal images. The algorithm relies on a multi-orientation wavelet transform, which is applied to the image to construct the so-called "orientation scores", i.e. a modified version of the image. Afterwards, the filtered orientation scores are used to formulate an optimal path problem that identifies the crack. The globally optimal path between manually selected crack endpoints is computed, using a state-of-the-art geometric tracking method. The pixel-wise segmentation is done afterwards using the obtained crack path. The proposed method outperforms fully automatic methods and shows potential to be an adequate alternative to the manual data annotation. △ Less

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2308.15288 [pdf, other]

Conservativity of Type Theory over Higher-order Arithmetic

Authors: Benno van den Berg, Daniël Otten

Abstract: We investigate how much type theory is able to prove about the natural numbers. A classical result in this area shows that dependent type theory without any universes is conservative over Heyting Arithmetic (HA). We build on this result by showing that type theories with one level of universes are conservative over Higher-order Heyting Arithmetic (HAH). Although this clearly depends on the specifi… ▽ More We investigate how much type theory is able to prove about the natural numbers. A classical result in this area shows that dependent type theory without any universes is conservative over Heyting Arithmetic (HA). We build on this result by showing that type theories with one level of universes are conservative over Higher-order Heyting Arithmetic (HAH). Although this clearly depends on the specific type theory, we show that the interpretation of logic also plays a major role. For proof-irrelevant interpretations, we will see that strong versions of type theory prove exactly the same higher-order arithmetical formulas as HAH. Conversely, proof-relevant interpretations prove strictly more second-order arithmetical formulas than HAH, however they still prove exactly the same first-order arithmetical formulas. Along the way, we investigate the different interpretations of logic in type theory, and to what extent dependent type theories can be seen as extensions of higher-order logic. We apply our results by proving De Jongh's theorem for type theory. △ Less

Submitted 29 August, 2023; originally announced August 2023.

Comments: 23 pages

MSC Class: 03B38; 03B16; 03F55; 03F50 ACM Class: F.4.1

arXiv:2305.12570 [pdf, ps, other]

doi 10.1002/mp.16884

Generalizable synthetic MRI with physics-informed convolutional networks

Authors: Luuk Jacobs, Stefano Mandija, Hongyan Liu, Cornelis A. T. van den Berg, Alessandro Sbrizzi, Matteo Maspero

Abstract: In this study, we develop a physics-informed deep learning-based method to synthesize multiple brain magnetic resonance imaging (MRI) contrasts from a single five-minute acquisition and investigate its ability to generalize to arbitrary contrasts to accelerate neuroimaging protocols. A dataset of fifty-five subjects acquired with a standard MRI protocol and a five-minute transient-state sequence w… ▽ More In this study, we develop a physics-informed deep learning-based method to synthesize multiple brain magnetic resonance imaging (MRI) contrasts from a single five-minute acquisition and investigate its ability to generalize to arbitrary contrasts to accelerate neuroimaging protocols. A dataset of fifty-five subjects acquired with a standard MRI protocol and a five-minute transient-state sequence was used to develop a physics-informed deep learning-based method. The model, based on a generative adversarial network, maps data acquired from the five-minute scan to "effective" quantitative parameter maps, here named q*-maps, by using its generated PD, T1, and T2 values in a signal model to synthesize four standard contrasts (proton density-weighted, T1-weighted, T2-weighted, and T2-weighted fluid-attenuated inversion recovery), from which losses are computed. The q*-maps are compared to literature values and the synthetic contrasts are compared to an end-to-end deep learning-based method proposed by literature. The generalizability of the proposed method is investigated for five volunteers by synthesizing three non-standard contrasts unseen during training and comparing these to respective ground truth acquisitions via contrast-to-noise ratio and quantitative assessment. The physics-informed method was able to match the high-quality synthMRI of the end-to-end method for the four standard contrasts, with mean \pm standard deviation structural similarity metrics above 0.75 \pm 0.08 and peak signal-to-noise ratios above 22.4 \pm 1.9 and 22.6 \pm 2.1. Additionally, the physics-informed method provided retrospective contrast adjustment, with visually similar signal contrast and comparable contrast-to-noise ratios to the ground truth acquisitions for three sequences unused for model training, demonstrating its generalizability and potential application to accelerate neuroimaging protocols. △ Less

Submitted 21 May, 2023; originally announced May 2023.

Comments: 23 pages, 7 figures, 1 table. Presented at ISMRM 2022. Will be submitted to NMR in biomedicine

Journal ref: Med Phys. (2023)

arXiv:2305.07878 [pdf, ps, other]

Automatic Differentiation in Prolog

Authors: Tom Schrijvers, Birthe van den Berg, Fabrizio Riguzzi

Abstract: Automatic differentiation (AD) is a range of algorithms to compute the numeric value of a function's (partial) derivative, where the function is typically given as a computer program or abstract syntax tree. AD has become immensely popular as part of many learning algorithms, notably for neural networks. This paper uses Prolog to systematically derive gradient-based forward- and reverse-mode AD va… ▽ More Automatic differentiation (AD) is a range of algorithms to compute the numeric value of a function's (partial) derivative, where the function is typically given as a computer program or abstract syntax tree. AD has become immensely popular as part of many learning algorithms, notably for neural networks. This paper uses Prolog to systematically derive gradient-based forward- and reverse-mode AD variants from a simple executable specification: evaluation of the symbolic derivative. Along the way we demonstrate that several Prolog features (DCGs, co-routines) contribute to the succinct formulation of the algorithm. We also discuss two applications in probabilistic programming that are enabled by our Prolog algorithms. The first is parameter learning for the Sum-Product Loop Language and the second consists of both parameter learning and variational inference for probabilistic logic programming. △ Less

Submitted 13 May, 2023; originally announced May 2023.

Comments: accepted for publication in the issues of Theory and Practice of Logic Programming dedicated to ICLP 2023

arXiv:2304.09697 [pdf, ps, other]

A Calculus for Scoped Effects & Handlers

Authors: Roger Bosman, Birthe van den Berg, Wenhao Tang, Tom Schrijvers

Abstract: Algebraic effects & handlers have become a standard approach for side-effects in functional programming. Their modular composition with other effects and clean separation of syntax and semantics make them attractive to a wide audience. However, not all effects can be classified as algebraic; some need a more sophisticated handling. In particular, effects that have or create a delimited scope need… ▽ More Algebraic effects & handlers have become a standard approach for side-effects in functional programming. Their modular composition with other effects and clean separation of syntax and semantics make them attractive to a wide audience. However, not all effects can be classified as algebraic; some need a more sophisticated handling. In particular, effects that have or create a delimited scope need special care, as their continuation consists of two parts-in and out of the scope-and their modular composition introduces additional complexity. These effects are called scoped and have gained attention by their growing applicability and adoption in popular libraries. While calculi have been designed with algebraic effects & handlers built in to facilitate their use, a calculus that supports scoped effects & handlers in a similar manner does not yet exist. This work fills this gap: we present $λ_{\mathit{sc}}$, a calculus with native support for both algebraic and scoped effects & handlers. It addresses the need for polymorphic handlers and explicit clauses for forwarding unknown scoped operations to other handlers. Our calculus is based on Eff, an existing calculus for algebraic effects, extended with Koka-style row polymorphism, and consists of a formal grammar, operational semantics, a (type-safe) type-and-effect system and type inference. We demonstrate $λ_{\mathit{sc}}$ on a range of examples. △ Less

Submitted 5 March, 2024; v1 submitted 19 April, 2023; originally announced April 2023.

arXiv:2303.16320 [pdf]

doi 10.1002/mp.16529

SynthRAD2023 Grand Challenge dataset: generating synthetic CT for radiotherapy

Authors: Adrian Thummerer, Erik van der Bijl, Arthur Jr Galapon, Joost JC Verhoeff, Johannes A Langendijk, Stefan Both, Cornelis, AT van den Berg, Matteo Maspero

Abstract: Purpose: Medical imaging has become increasingly important in diagnosing and treating oncological patients, particularly in radiotherapy. Recent advances in synthetic computed tomography (sCT) generation have increased interest in public challenges to provide data and evaluation metrics for comparing different approaches openly. This paper describes a dataset of brain and pelvis computed tomograph… ▽ More Purpose: Medical imaging has become increasingly important in diagnosing and treating oncological patients, particularly in radiotherapy. Recent advances in synthetic computed tomography (sCT) generation have increased interest in public challenges to provide data and evaluation metrics for comparing different approaches openly. This paper describes a dataset of brain and pelvis computed tomography (CT) images with rigidly registered CBCT and MRI images to facilitate the development and evaluation of sCT generation for radiotherapy planning. Acquisition and validation methods: The dataset consists of CT, CBCT, and MRI of 540 brains and 540 pelvic radiotherapy patients from three Dutch university medical centers. Subjects' ages ranged from 3 to 93 years, with a mean age of 60. Various scanner models and acquisition settings were used across patients from the three data-providing centers. Details are available in CSV files provided with the datasets. Data format and usage notes: The data is available on Zenodo (https://doi.org/10.5281/zenodo.7260705) under the SynthRAD2023 collection. The images for each subject are available in nifti format. Potential applications: This dataset will enable the evaluation and development of image synthesis algorithms for radiotherapy purposes on a realistic multi-center dataset with varying acquisition protocols. Synthetic CT generation has numerous applications in radiation therapy, including diagnosis, treatment planning, treatment monitoring, and surgical planning. △ Less

Submitted 28 March, 2023; originally announced March 2023.

Comments: 15 pages, 4 figures, 9 tables, pre-print submitted to Medical Physics - dataset. The training dataset is available on Zenodo at https://doi.org/10.5281/zenodo.7260705 from April, 1st 2023

arXiv:2303.10202 [pdf]

doi 10.1016/j.ejmp.2023.102642

Exploring contrast generalisation in deep learning-based brain MRI-to-CT synthesis

Authors: Lotte Nijskens, Cornelis, AT van den Berg, Joost JC Verhoeff, Matteo Maspero

Abstract: Background: Synthetic computed tomography (sCT) has been proposed and increasingly clinically adopted to enable magnetic resonance imaging (MRI)-based radiotherapy. Deep learning (DL) has recently demonstrated the ability to generate accurate sCT from fixed MRI acquisitions. However, MRI protocols may change over time or differ between centres resulting in low-quality sCT due to poor model general… ▽ More Background: Synthetic computed tomography (sCT) has been proposed and increasingly clinically adopted to enable magnetic resonance imaging (MRI)-based radiotherapy. Deep learning (DL) has recently demonstrated the ability to generate accurate sCT from fixed MRI acquisitions. However, MRI protocols may change over time or differ between centres resulting in low-quality sCT due to poor model generalisation. Purpose: investigating domain randomisation (DR) to increase the generalisation of a DL model for brain sCT generation. Methods: CT and corresponding T1-weighted MRI with/without contrast, T2-weighted, and FLAIR MRI from 95 patients undergoing RT were collected, considering FLAIR the unseen sequence where to investigate generalisation. A ``Baseline'' generative adversarial network was trained with/without the FLAIR sequence to test how a model performs without DR. Image similarity and accuracy of sCT-based dose plans were assessed against CT to select the best-performing DR approach against the Baseline. Results: The Baseline model had the poorest performance on FLAIR, with mean absolute error (MAE)=106$\pm$20.7 HU (mean$\pmσ$). Performance on FLAIR significantly improved for the DR model with MAE=99.0$\pm$14.9 HU, but still inferior to the performance of the Baseline+FLAIR model (MAE=72.6$\pm$10.1 HU). Similarly, an improvement in $γ$-pass rate was obtained for DR vs Baseline. Conclusions: DR improved image similarity and dose accuracy on the unseen sequence compared to training only on acquired MRI. DR makes the model more robust, reducing the need for re-training when applying a model on sequences unseen and unavailable for retraining. △ Less

Submitted 17 March, 2023; originally announced March 2023.

Comments: Preprint submitted to Physica Medica on 2023-02-16 for review. Also published in Zenodo at https://doi.org/10.5281/zenodo.7742642

arXiv:2302.01415 [pdf, ps, other]

A Framework for Higher-Order Effects & Handlers

Authors: Birthe van den Berg, Tom Schrijvers

Abstract: Algebraic effects & handlers are a modular approach for modeling side-effects in functional programming. Their syntax is defined in terms of a signature of effectful operations, encoded as a functor, that are plugged into the free monad; their denotational semantics is defined by fold-style handlers that only interpret their part of the syntax and forward the rest. However, not all effects are alg… ▽ More Algebraic effects & handlers are a modular approach for modeling side-effects in functional programming. Their syntax is defined in terms of a signature of effectful operations, encoded as a functor, that are plugged into the free monad; their denotational semantics is defined by fold-style handlers that only interpret their part of the syntax and forward the rest. However, not all effects are algebraic: some need to access an internal computation. For example, scoped effects distinguish between a computation in scope and out of scope; parallel effects parallellize over a computation, latent effects defer a computation. Separate definitions have been proposed for these higher-order effects and their corresponding handlers, often leading to expedient and complex monad definitions. In this work we propose a generic framework for higher-order effects, generalizing algebraic effects & handlers: a generic free monad with higher-order effect signatures and a corresponding interpreter. Specializing this higher-order syntax leads to various definitions of previously defined (scoped, parallel, latent) and novel (writer, bracketing) effects. Furthermore, we formally show our framework theoretically correct, also putting different effect instances on formal footing; a significant contribution for parallel, latent, writer and bracketing effects. △ Less

Submitted 2 February, 2023; originally announced February 2023.

arXiv:2302.00600 [pdf, other]

Two for One: Diffusion Models and Force Fields for Coarse-Grained Molecular Dynamics

Authors: Marloes Arts, Victor Garcia Satorras, Chin-Wei Huang, Daniel Zuegner, Marco Federici, Cecilia Clementi, Frank Noé, Robert Pinsler, Rianne van den Berg

Abstract: Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields and molecular dynamics to learn a CG force field without requiring any force… ▽ More Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can directly be used to simulate CG molecular dynamics. While having a vastly simplified training setup compared to previous work, we demonstrate that our approach leads to improved performance across several small- to medium-sized protein simulations, reproducing the CG equilibrium distribution, and preserving dynamics of all-atom simulations such as protein folding events. △ Less

Submitted 22 September, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

arXiv:2212.11088 [pdf, ps, other]

Forward- or Reverse-Mode Automatic Differentiation: What's the Difference?

Authors: Birthe van den Berg, Tom Schrijvers, James McKinna, Alexander Vandenbroucke

Abstract: Automatic differentiation (AD) has been a topic of interest for researchers in many disciplines, with increased popularity since its application to machine learning and neural networks. Although many researchers appreciate and know how to apply AD, it remains a challenge to truly understand the underlying processes. From an algebraic point of view, however, AD appears surprisingly natural: it orig… ▽ More Automatic differentiation (AD) has been a topic of interest for researchers in many disciplines, with increased popularity since its application to machine learning and neural networks. Although many researchers appreciate and know how to apply AD, it remains a challenge to truly understand the underlying processes. From an algebraic point of view, however, AD appears surprisingly natural: it originates from the differentiation laws. In this work we use Algebra of Programming techniques to reason about different AD variants, leveraging Haskell to illustrate our observations. Our findings stem from three fundamental algebraic abstractions: (1) the notion of module over a semiring, (2) Nagata's construction of the 'idealization of a module', and (3) Kronecker's delta function, that together allow us to write a single-line abstract definition of AD. From this single-line definition, and by instantiating our algebraic structures in various ways, we derive different AD variants, that have the same extensional behaviour, but different intensional properties, mainly in terms of (asymptotic) computational complexity. We show the different variants equivalent by means of Kronecker isomorphisms, a further elaboration of our Haskell infrastructure which guarantees correctness by construction. With this framework in place, this paper seeks to make AD variants more comprehensible, taking an algebraic perspective on the matter. △ Less

Submitted 9 August, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

arXiv:2209.15611 [pdf, other]

Protein structure generation via folding diffusion

Authors: Kevin E. Wu, Kevin K. Yang, Rianne van den Berg, James Y. Zou, Alex X. Lu, Ava P. Amini

Abstract: The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a new diffusion-based generative model th… ▽ More The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a new diffusion-based generative model that designs protein backbone structures via a procedure that mirrors the native folding process. We describe protein backbone structure as a series of consecutive angles capturing the relative orientation of the constituent amino acid residues, and generate new structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins biologically twist into energetically favorable conformations, the inherent shift and rotational invariance of this representation crucially alleviates the need for complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally-occurring proteins. As a useful resource, we release the first open-source codebase and trained models for protein structure diffusion. △ Less

Submitted 23 November, 2022; v1 submitted 30 September, 2022; originally announced September 2022.

ACM Class: I.2.0; J.3

arXiv:2209.04934 [pdf, other]

Clifford Neural Layers for PDE Modeling

Authors: Johannes Brandstetter, Rianne van den Berg, Max Welling, Jayesh K. Gupta

Abstract: Partial differential equations (PDEs) see widespread use in sciences and engineering to describe simulation of physical processes as scalar and vector fields interacting and coevolving over time. Due to the computationally expensive nature of their standard solution methods, neural PDE surrogates have become an active research topic to accelerate these simulations. However, current methods do not… ▽ More Partial differential equations (PDEs) see widespread use in sciences and engineering to describe simulation of physical processes as scalar and vector fields interacting and coevolving over time. Due to the computationally expensive nature of their standard solution methods, neural PDE surrogates have become an active research topic to accelerate these simulations. However, current methods do not explicitly take into account the relationship between different fields and their internal components, which are often correlated. Viewing the time evolution of such correlated fields through the lens of multivector fields allows us to overcome these limitations. Multivector fields consist of scalar, vector, as well as higher-order components, such as bivectors and trivectors. Their algebraic properties, such as multiplication, addition and other arithmetic operations can be described by Clifford algebras. To our knowledge, this paper presents the first usage of such multivector representations together with Clifford convolutions and Clifford Fourier transforms in the context of deep learning. The resulting Clifford neural layers are universally applicable and will find direct use in the areas of fluid dynamics, weather forecasting, and the modeling of physical systems in general. We empirically evaluate the benefit of Clifford neural layers by replacing convolution and Fourier operations in common neural PDE surrogates by their Clifford counterparts on 2D Navier-Stokes and weather modeling tasks, as well as 3D Maxwell equations. For similar parameter count, Clifford neural layers consistently improve generalization capabilities of the tested neural PDE surrogates. Source code for our PyTorch implementation is available at https://microsoft.github.io/cliffordlayers/. △ Less

Submitted 2 March, 2023; v1 submitted 8 September, 2022; originally announced September 2022.

Comments: Accepted at ICLR-2023

arXiv:2206.02883 [pdf, other]

Lane-Level Route Planning for Autonomous Vehicles

Authors: Mitchell Jones, Maximilian Haas-Heger, Jur van den Berg

Abstract: We present an algorithm that, given a representation of a road network in lane-level detail, computes a route that minimizes the expected cost to reach a given destination. In doing so, our algorithm allows us to solve for the complex trade-offs encountered when trying to decide not just which roads to follow, but also when to change between the lanes making up these roads, in order to -- for exam… ▽ More We present an algorithm that, given a representation of a road network in lane-level detail, computes a route that minimizes the expected cost to reach a given destination. In doing so, our algorithm allows us to solve for the complex trade-offs encountered when trying to decide not just which roads to follow, but also when to change between the lanes making up these roads, in order to -- for example -- reduce the likelihood of missing a left exit while not unnecessarily driving in the leftmost lane. This routing problem can naturally be formulated as a Markov Decision Process (MDP), in which lane change actions have stochastic outcomes. However, MDPs are known to be time-consuming to solve in general. In this paper, we show that -- under reasonable assumptions -- we can use a Dijkstra-like approach to solve this stochastic problem, and benefit from its efficient $O(n \log n)$ running time. This enables an autonomous vehicle to exhibit lane-selection behavior as it efficiently plans an optimal route to its destination. △ Less

Submitted 13 July, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: Appeared at the 15th International Workshop on the Algorithmic Foundations of Robotics (WAFR) 2022

arXiv:2201.10287 [pdf, ps, other]

Structured Handling of Scoped Effects: Extended Version

Authors: Zhixuan Yang, Marco Paviotti, Nicolas Wu, Birthe van den Berg, Tom Schrijvers

Abstract: Algebraic effects offer a versatile framework that covers a wide variety of effects. However, the family of operations that delimit scopes are not algebraic and are usually modelled as handlers, thus preventing them from being used freely in conjunction with algebraic operations. Although proposals for scoped operations exist, they are either ad-hoc and unprincipled, or too inconvenient for practi… ▽ More Algebraic effects offer a versatile framework that covers a wide variety of effects. However, the family of operations that delimit scopes are not algebraic and are usually modelled as handlers, thus preventing them from being used freely in conjunction with algebraic operations. Although proposals for scoped operations exist, they are either ad-hoc and unprincipled, or too inconvenient for practical programming. This paper provides the best of both worlds: a theoretically-founded model of scoped effects that is convenient for implementation and reasoning. Our new model is based on an adjunction between a locally finitely presentable category and a category of functorial algebras. Using comparison functors between adjunctions, we show that our new model, an existing indexed model, and a third approach that simulates scoped operations in terms of algebraic ones have equal expressivity for handling scoped operations. We consider our new model to be the sweet spot between ease of implementation and structuredness. Additionally, our approach automatically induces fusion laws of handlers of scoped effects, which are useful for reasoning and optimisation. △ Less

Submitted 25 January, 2022; originally announced January 2022.

Comments: Extended version of the paper Structured Handling of Scoped Effects in ESOP 2022

arXiv:2111.02244 [pdf]

Exploring Explainable AI in the Financial Sector: Perspectives of Banks and Supervisory Authorities

Authors: Ouren Kuiper, Martin van den Berg, Joost van der Burgt, Stefan Leijnen

Abstract: Explainable artificial intelligence (xAI) is seen as a solution to making AI systems less of a black box. It is essential to ensure transparency, fairness, and accountability, which are especially paramount in the financial sector. The aim of this study was a preliminary investigation of the perspectives of supervisory authorities and regulated entities regarding the application of xAI in the fi-n… ▽ More Explainable artificial intelligence (xAI) is seen as a solution to making AI systems less of a black box. It is essential to ensure transparency, fairness, and accountability, which are especially paramount in the financial sector. The aim of this study was a preliminary investigation of the perspectives of supervisory authorities and regulated entities regarding the application of xAI in the fi-nancial sector. Three use cases (consumer credit, credit risk, and anti-money laundering) were examined using semi-structured interviews at three banks and two supervisory authorities in the Netherlands. We found that for the investigated use cases a disparity exists between supervisory authorities and banks regarding the desired scope of explainability of AI systems. We argue that the financial sector could benefit from clear differentiation between technical AI (model) ex-plainability requirements and explainability requirements of the broader AI system in relation to applicable laws and regulations. △ Less

Submitted 3 November, 2021; originally announced November 2021.

Comments: BNAIC/BeneLearn 2021 conference paper

arXiv:2110.02037 [pdf, other]

Autoregressive Diffusion Models

Authors: Emiel Hoogeboom, Alexey A. Gritsenko, Jasmijn Bastings, Ben Poole, Rianne van den Berg, Tim Salimans

Abstract: We introduce Autoregressive Diffusion Models (ARDMs), a model class encompassing and generalizing order-agnostic autoregressive models (Uria et al., 2014) and absorbing discrete diffusion (Austin et al., 2021), which we show are special cases of ARDMs under mild assumptions. ARDMs are simple to implement and easy to train. Unlike standard ARMs, they do not require causal masking of model represent… ▽ More We introduce Autoregressive Diffusion Models (ARDMs), a model class encompassing and generalizing order-agnostic autoregressive models (Uria et al., 2014) and absorbing discrete diffusion (Austin et al., 2021), which we show are special cases of ARDMs under mild assumptions. ARDMs are simple to implement and easy to train. Unlike standard ARMs, they do not require causal masking of model representations, and can be trained using an efficient objective similar to modern probabilistic diffusion models that scales favourably to highly-dimensional data. At test time, ARDMs support parallel generation which can be adapted to fit any given generation budget. We find that ARDMs require significantly fewer steps than discrete diffusion models to attain the same performance. Finally, we apply ARDMs to lossless compression, and show that they are uniquely suited to this task. Contrary to existing approaches based on bits-back coding, ARDMs obtain compelling results not only on complete datasets, but also on compressing single data points. Moreover, this can be done using a modest number of network calls for (de)compression due to the model's adaptable parallel generation. △ Less

Submitted 1 February, 2022; v1 submitted 5 October, 2021; originally announced October 2021.

Comments: Published as a conference paper at International Conference on Learning Representations (ICLR) 2022

arXiv:2108.12285 [pdf, other]

doi 10.1016/j.ohx.2022.e00320

OpenFish: Biomimetic Design of a Soft Robotic Fish for High Speed Locomotion

Authors: Sander C. van den Berg, Rob B. N. Scharff, Zoltán Rusák, Jun Wu

Abstract: We present OpenFish: an open source soft robotic fish which is optimized for speed and efficiency. The soft robotic fish uses a combination of an active and passive tail segment to accurately mimic the thunniform swimming mode. Through the implementation of a novel propulsion system that is capable of achieving higher oscillation frequencies with a more sinusoidal waveform, the open source soft ro… ▽ More We present OpenFish: an open source soft robotic fish which is optimized for speed and efficiency. The soft robotic fish uses a combination of an active and passive tail segment to accurately mimic the thunniform swimming mode. Through the implementation of a novel propulsion system that is capable of achieving higher oscillation frequencies with a more sinusoidal waveform, the open source soft robotic fish achieves a top speed of $0.85~\mathrm{m/s}$. Hereby, it outperforms the previously reported fastest soft robotic fish by $27\%$. Besides the propulsion system, the optimization of the fish morphology played a crucial role in achieving this speed. In this work, a detailed description of the design, construction and customization of the soft robotic fish is presented. Hereby, we hope this open source design will accelerate future research and developments in soft robotic fish. △ Less

Submitted 3 June, 2022; v1 submitted 20 July, 2021; originally announced August 2021.

Journal ref: HardwareX, 2022, e00320, ISSN 2468-0672, (https://www.sciencedirect.com/science/article/pii/S2468067222000657)

arXiv:2108.11155 [pdf, other]

Latent Effects for Reusable Language Components: Extended Version

Authors: Birthe van den Berg, Tom Schrijvers, Casper Bach-Poulsen, Nicolas Wu

Abstract: The development of programming languages can be quite complicated and costly. Hence, much effort has been devoted to the modular definition of language features that can be reused in various combinations to define new languages and experiment with their semantics. A notable outcome of these efforts is the algebra-based "datatypes a la carte" (DTC) approach. When combined with algebraic effects, DT… ▽ More The development of programming languages can be quite complicated and costly. Hence, much effort has been devoted to the modular definition of language features that can be reused in various combinations to define new languages and experiment with their semantics. A notable outcome of these efforts is the algebra-based "datatypes a la carte" (DTC) approach. When combined with algebraic effects, DTC can model a wide range of common language features. Unfortunately, the current state of the art does not cover modular definitions of advanced control-flow mechanisms that defer execution to an appropriate point, such as call-by-name and call-by-need evaluation, as well as (multi-)staging. This paper defines latent effects, a generic class of such control-flow mechanisms. We demonstrate how function abstractions, lazy computations and a MetaML-like staging can all be expressed in a modular fashion using latent effects, and how they can be combined in various ways to obtain complex semantics. We provide a full Haskell implementation of our effects and handlers with a range of examples. △ Less

Submitted 25 August, 2021; originally announced August 2021.

Comments: extended version of APLAS 2021 paper

arXiv:2107.07675 [pdf, other]

Beyond In-Place Corruption: Insertion and Deletion In Denoising Probabilistic Models

Authors: Daniel D. Johnson, Jacob Austin, Rianne van den Berg, Daniel Tarlow

Abstract: Denoising diffusion probabilistic models (DDPMs) have shown impressive results on sequence generation by iteratively corrupting each example and then learning to map corrupted versions back to the original. However, previous work has largely focused on in-place corruption, adding noise to each pixel or token individually while kee** their locations the same. In this work, we consider a broader c… ▽ More Denoising diffusion probabilistic models (DDPMs) have shown impressive results on sequence generation by iteratively corrupting each example and then learning to map corrupted versions back to the original. However, previous work has largely focused on in-place corruption, adding noise to each pixel or token individually while kee** their locations the same. In this work, we consider a broader class of corruption processes and denoising models over sequence data that can insert and delete elements, while still being efficient to train and sample from. We demonstrate that these models outperform standard in-place models on an arithmetic sequence task, and that when trained on the text8 dataset they can be used to fix spelling errors without any fine-tuning. △ Less

Submitted 15 July, 2021; originally announced July 2021.

Comments: Accepted at the ICML 2021 Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (poster)

arXiv:2107.03006 [pdf, other]

Structured Denoising Diffusion Models in Discrete State-Spaces

Authors: Jacob Austin, Daniel D. Johnson, Jonathan Ho, Daniel Tarlow, Rianne van den Berg

Abstract: Denoising diffusion probabilistic models (DDPMs) (Ho et al. 2020) have shown impressive results on image and waveform generation in continuous state spaces. Here, we introduce Discrete Denoising Diffusion Probabilistic Models (D3PMs), diffusion-like generative models for discrete data that generalize the multinomial diffusion model of Hoogeboom et al. 2021, by going beyond corruption processes wit… ▽ More Denoising diffusion probabilistic models (DDPMs) (Ho et al. 2020) have shown impressive results on image and waveform generation in continuous state spaces. Here, we introduce Discrete Denoising Diffusion Probabilistic Models (D3PMs), diffusion-like generative models for discrete data that generalize the multinomial diffusion model of Hoogeboom et al. 2021, by going beyond corruption processes with uniform transition probabilities. This includes corruption with transition matrices that mimic Gaussian kernels in continuous space, matrices based on nearest neighbors in embedding space, and matrices that introduce absorbing states. The third allows us to draw a connection between diffusion models and autoregressive and mask-based generative models. We show that the choice of transition matrix is an important design decision that leads to improved results in image and text domains. We also introduce a new loss function that combines the variational lower bound with an auxiliary cross entropy loss. For text, this model class achieves strong results on character-level text generation while scaling to large vocabularies on LM1B. On the image dataset CIFAR-10, our models approach the sample quality and exceed the log-likelihood of the continuous-space DDPM model. △ Less

Submitted 22 February, 2023; v1 submitted 7 July, 2021; originally announced July 2021.

Comments: 10 pages plus references and appendices. First two authors contributed equally

arXiv:2107.00314 [pdf, other]

Backtracking (the) Algorithms on the Hamiltonian Cycle Problem

Authors: Joeri Sleegers, Daan van den Berg

Abstract: Even though the Hamiltonian cycle problem is NP-complete, many of its problem instances aren't. In fact, almost all the hard instances reside in one area: near the Komlós-Szemerédi bound, of $\frac{1}{2}\ v\cdot ln(v) + \frac{1}{2}\ v\cdot ln( ln(v))$ edges, where randomly generated graphs have an approximate 50\% chance of being Hamiltonian. If the number of edges is either much higher or much lo… ▽ More Even though the Hamiltonian cycle problem is NP-complete, many of its problem instances aren't. In fact, almost all the hard instances reside in one area: near the Komlós-Szemerédi bound, of $\frac{1}{2}\ v\cdot ln(v) + \frac{1}{2}\ v\cdot ln( ln(v))$ edges, where randomly generated graphs have an approximate 50\% chance of being Hamiltonian. If the number of edges is either much higher or much lower, the problem is not hard -- most backtracking algorithms decide such instances in (near) polynomial time. Recently however, targeted search efforts have identified very hard Hamiltonian cycle problem instances very far away from the Komlós-Szemerédi bound. In that study, the used backtracking algorithm was Vandegriend-Culberson's, which was supposedly the most efficient of all Hamiltonian backtracking algorithms. In this paper, we make a unified large scale quantitative comparison for the best known backtracking algorithms described between 1877 and 2016. We confirm the suspicion that the Komlós-Szemerédi bound is a hard area for all backtracking algorithms, but also that Vandegriend-Culberson is indeed the most efficient algorithm, when expressed in consumed computing time. When measured in recursive effectiveness however, the algorithm by Frank Rubin, almost half a century old, performs best. In a more general algorithmic assessment, we conjecture that edge pruning and non-Hamiltonicity checks might be largely responsible for these recursive savings. When expressed in system time however, denser problem instances require much more time per recursion. This is most likely due to the costliness of the extra search pruning procedures, which are relatively elaborate. We supply large amounts of experimental data, and a unified single-program implementation for all six algorithms. All data and algorithmic source code is made public for further use by our colleagues. △ Less

Submitted 29 June, 2023; v1 submitted 1 July, 2021; originally announced July 2021.

Comments: Peer-reviewed and published in IARIA journals

MSC Class: 68Q25

arXiv:2106.06080 [pdf, other]

Gradual Domain Adaptation in the Wild:When Intermediate Distributions are Absent

Authors: Samira Abnar, Rianne van den Berg, Golnaz Ghiasi, Mostafa Dehghani, Nal Kalchbrenner, Hanie Sedghi

Abstract: We focus on the problem of domain adaptation when the goal is shifting the model towards the target distribution, rather than learning domain invariant representations. It has been shown that under the following two assumptions: (a) access to samples from intermediate distributions, and (b) samples being annotated with the amount of change from the source distribution, self-training can be success… ▽ More We focus on the problem of domain adaptation when the goal is shifting the model towards the target distribution, rather than learning domain invariant representations. It has been shown that under the following two assumptions: (a) access to samples from intermediate distributions, and (b) samples being annotated with the amount of change from the source distribution, self-training can be successfully applied on gradually shifted samples to adapt the model toward the target distribution. We hypothesize having (a) is enough to enable iterative self-training to slowly adapt the model to the target distribution, by making use of an implicit curriculum. In the case where (a) does not hold, we observe that iterative self-training falls short. We propose GIFT, a method that creates virtual samples from intermediate distributions by interpolating representations of examples from source and target domains. We evaluate an iterative-self-training method on datasets with natural distribution shifts, and show that when applied on top of other domain adaptation methods, it improves the performance of the model on the target dataset. We run an analysis on a synthetic dataset to show that in the presence of (a) iterative-self-training naturally forms a curriculum of samples. Furthermore, we show that when (a) does not hold, GIFT performs better than iterative self-training. △ Less

Submitted 13 July, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

arXiv:2103.14482 [pdf, other]

doi 10.46298/lmcs-18(4:13)2022

Converse extensionality and apartness

Authors: Benno van den Berg, Robert Passmann

Abstract: In this paper we try to find a computational interpretation for a strong form of extensionality, which we call "converse extensionality". Converse extensionality principles, which arise as the Dialectica interpretation of the axiom of extensionality, were first studied by Howard. In order to give a computational interpretation to these principles, we reconsider Brouwer's apartness relation, a stro… ▽ More In this paper we try to find a computational interpretation for a strong form of extensionality, which we call "converse extensionality". Converse extensionality principles, which arise as the Dialectica interpretation of the axiom of extensionality, were first studied by Howard. In order to give a computational interpretation to these principles, we reconsider Brouwer's apartness relation, a strong constructive form of inequality. Formally, we provide a categorical construction to endow every typed combinatory algebra with an apartness relation. We then exploit that functions reflect apartness, in addition to preserving equality, to prove that the resulting categories of assemblies model a converse extensionality principle. △ Less

Submitted 19 December, 2022; v1 submitted 26 March, 2021; originally announced March 2021.

Journal ref: Logical Methods in Computer Science, Volume 18, Issue 4 (December 20, 2022) lmcs:7306

arXiv:2103.12841 [pdf, ps, other]

Ground Truths for the Humanities

Authors: Yvette Oortwijn, Hein van den Berg, Arianna Betti

Abstract: Ensuring a faithful interaction with data and its representation for humanities can and should depend on expert-constructed ground truths. Ensuring a faithful interaction with data and its representation for humanities can and should depend on expert-constructed ground truths. △ Less

Submitted 23 March, 2021; originally announced March 2021.

Comments: Provocation paper presented at the 5th Workshop on Visualization for the Digital Humanities (VIS4DH), part of the 31st IEEE Visualization Conference, IEEE VIS 2020, Virtual Event, Salt Lake City, USA, October 25-30, 2020. 1 page + 1 page of references

arXiv:2102.07846 [pdf, ps, other]

Corneal Pachymetry by AS-OCT after Descemet's Membrane Endothelial Keratoplasty

Authors: Friso G. Heslinga, Ruben T. Lucassen, Myrthe A. van den Berg, Luuk van der Hoek, Josien P. W. Pluim, Javier Cabrerizo, Mark Alberti, Mitko Veta

Abstract: Corneal thickness (pachymetry) maps can be used to monitor restoration of corneal endothelial function, for example after Descemet's membrane endothelial keratoplasty (DMEK). Automated delineation of the corneal interfaces in anterior segment optical coherence tomography (AS-OCT) can be challenging for corneas that are irregularly shaped due to pathology, or as a consequence of surgery, leading to… ▽ More Corneal thickness (pachymetry) maps can be used to monitor restoration of corneal endothelial function, for example after Descemet's membrane endothelial keratoplasty (DMEK). Automated delineation of the corneal interfaces in anterior segment optical coherence tomography (AS-OCT) can be challenging for corneas that are irregularly shaped due to pathology, or as a consequence of surgery, leading to incorrect thickness measurements. In this research, deep learning is used to automatically delineate the corneal interfaces and measure corneal thickness with high accuracy in post-DMEK AS-OCT B-scans. Three different deep learning strategies were developed based on 960 B-scans from 50 patients. On an independent test set of 320 B-scans, corneal thickness could be measured with an error of 13.98 to 15.50 micrometer for the central 9 mm range, which is less than 3% of the average corneal thickness. The accurate thickness measurements were used to construct detailed pachymetry maps. Moreover, follow-up scans could be registered based on anatomical landmarks to obtain differential pachymetry maps. These maps may enable a more comprehensive understanding of the restoration of the endothelial function after DMEK, where thickness often varies throughout different regions of the cornea, and subsequently contribute to a standardized postoperative regime. △ Less

Submitted 6 April, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

Comments: Fixed typo in abstract: The development set consists of 960 B-scans from 50 patients (instead of 68). The B-scans from the other 18 patients were used for testing only

arXiv:2102.00905 [pdf, other]

Quadratic type checking for objective type theory

Authors: Benno van den Berg, Martijn den Besten

Abstract: We introduce a modification of standard Martin-Lof type theory in which we eliminate definitional equality and replace all computation rules by propositional equalities. We show that type checking for such a system can be done in quadratic time and that it has a natural homotopy-theoretic semantics. We introduce a modification of standard Martin-Lof type theory in which we eliminate definitional equality and replace all computation rules by propositional equalities. We show that type checking for such a system can be done in quadratic time and that it has a natural homotopy-theoretic semantics. △ Less

Submitted 1 February, 2021; originally announced February 2021.

arXiv:2012.11936 [pdf, other]

Knowledge Graphs Evolution and Preservation -- A Technical Report from ISWS 2019

Authors: Nacira Abbas, Kholoud Alghamdi, Mortaza Alinam, Francesca Alloatti, Glenda Amaral, Claudia d'Amato, Luigi Asprino, Martin Beno, Felix Bensmann, Russa Biswas, Ling Cai, Riley Capshaw, Valentina Anita Carriero, Irene Celino, Amine Dadoun, Stefano De Giorgis, Harm Delva, John Domingue, Michel Dumontier, Vincent Emonet, Marieke van Erp, Paola Espinoza Arias, Omaima Fallatah, Sebastián Ferrada, Marc Gallofré Ocaña , et al. (49 additional authors not shown)

Abstract: One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a: "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this fur… ▽ More One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a: "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this further by asking if we can create a knowledge graph of "everything" ranging from common sense concepts to location based entities. This knowledge graph should be "open to the public" in a FAIR manner democratizing this mass amount of knowledge." Although linked open data (LOD) is one knowledge graph, it is the closest realisation (and probably the only one) to a public FAIR Knowledge Graph (KG) of everything. Surely, LOD provides a unique testbed for experimenting and evaluating research hypotheses on open and FAIR KG. One of the most neglected FAIR issues about KGs is their ongoing evolution and long term preservation. We want to investigate this problem, that is to understand what preserving and supporting the evolution of KGs means and how these problems can be addressed. Clearly, the problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports a collaborative effort performed by 9 teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective to the problem of knowledge graph evolution substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definition for KG preservation and evolution. △ Less

Submitted 22 December, 2020; originally announced December 2020.

arXiv:2012.02015 [pdf, other]

Context in Informational Bias Detection

Authors: Esther van den Berg, Katja Markert

Abstract: Informational bias is bias conveyed through sentences or clauses that provide tangential, speculative or background information that can sway readers' opinions towards entities. By nature, informational bias is context-dependent, but previous work on informational bias detection has not explored the role of context beyond the sentence. In this paper, we explore four kinds of context for informatio… ▽ More Informational bias is bias conveyed through sentences or clauses that provide tangential, speculative or background information that can sway readers' opinions towards entities. By nature, informational bias is context-dependent, but previous work on informational bias detection has not explored the role of context beyond the sentence. In this paper, we explore four kinds of context for informational bias in English news articles: neighboring sentences, the full article, articles on the same event from other news publishers, and articles from the same domain (but potentially different events). We find that integrating event context improves classification performance over a very strong baseline. In addition, we perform the first error analysis of models on this task. We find that the best-performing context-inclusive model outperforms the baseline on longer sentences, and sentences from politically centrist articles. △ Less

Submitted 3 December, 2020; originally announced December 2020.

Comments: Accepted to COLING'2020

arXiv:2009.12670 [pdf, other]

Effective Kan fibrations in simplicial sets

Authors: Benno van den Berg, Eric Faber

Abstract: We introduce the notion of an effective Kan fibration, a new mathematical structure that can be used to study simplicial homotopy theory. Our main motivation is to make simplicial homotopy theory suitable for homotopy type theory. Effective Kan fibrations are maps of simplicial sets equipped with a structured collection of chosen lifts that satisfy certain non-trivial properties. This contrasts wi… ▽ More We introduce the notion of an effective Kan fibration, a new mathematical structure that can be used to study simplicial homotopy theory. Our main motivation is to make simplicial homotopy theory suitable for homotopy type theory. Effective Kan fibrations are maps of simplicial sets equipped with a structured collection of chosen lifts that satisfy certain non-trivial properties. This contrasts with the ordinary, unstructured notion of a Kan fibration. We show that fundamental properties of Kan fibrations can be extended to explicit constructions on effective Kan fibrations. In particular, we give a constructive (explicit) proof showing that effective Kan fibrations are stable under push forward, or fibred exponentials. This is known to be impossible for ordinary Kan fibrations. We further show that effective Kan fibrations are local, or completely determined by their fibres above representables. We also give an (ineffective) proof saying that the maps which can be equipped with the structure of an effective Kan fibration are precisely the ordinary Kan fibrations. Hence implicitly, both notions still describe the same homotopy theory. By showing that the effective Kan fibrations combine all these properties, we solve an open problem in homotopy type theory. In this way our work provides a first step in giving a constructive account of Voevodsky's model of univalent type theory in simplicial sets. △ Less

Submitted 30 April, 2022; v1 submitted 26 September, 2020; originally announced September 2020.

Comments: 197 pages

MSC Class: 55U10; 55U35; 03B15; 18A32

arXiv:2008.01160 [pdf, other]

A Spectral Energy Distance for Parallel Speech Synthesis

Authors: Alexey A. Gritsenko, Tim Salimans, Rianne van den Berg, Jasper Snoek, Nal Kalchbrenner

Abstract: Speech synthesis is an important practical generative modeling problem that has seen great progress over the last few years, with likelihood-based autoregressive neural models now outperforming traditional concatenative systems. A downside of such autoregressive models is that they require executing tens of thousands of sequential operations per second of generated audio, making them ill-suited fo… ▽ More Speech synthesis is an important practical generative modeling problem that has seen great progress over the last few years, with likelihood-based autoregressive neural models now outperforming traditional concatenative systems. A downside of such autoregressive models is that they require executing tens of thousands of sequential operations per second of generated audio, making them ill-suited for deployment on specialized deep learning hardware. Here, we propose a new learning method that allows us to train highly parallel models of speech, without requiring access to an analytical likelihood function. Our approach is based on a generalized energy distance between the distributions of the generated and real audio. This spectral energy distance is a proper scoring rule with respect to the distribution over magnitude-spectrograms of the generated waveform audio and offers statistical consistency guarantees. The distance can be calculated from minibatches without bias, and does not involve adversarial learning, yielding a stable and consistent method for training implicit generative models. Empirically, we achieve state-of-the-art generation quality among implicit generative models, as judged by the recently-proposed cFDSD metric. When combining our method with adversarial techniques, we also improve upon the recently-proposed GAN-TTS model in terms of Mean Opinion Score as judged by trained human evaluators. △ Less

Submitted 23 October, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

arXiv:2007.03488 [pdf, other]

Benchmarking in Optimization: Best Practice and Open Issues

Authors: Thomas Bartz-Beielstein, Carola Doerr, Daan van den Berg, Jakob Bossek, Sowmya Chandrasekaran, Tome Eftimov, Andreas Fischbach, Pascal Kerschke, William La Cava, Manuel Lopez-Ibanez, Katherine M. Malan, Jason H. Moore, Boris Naujoks, Patryk Orzechowski, Vanessa Volz, Markus Wagner, Thomas Weise

Abstract: This survey compiles ideas and recommendations from more than a dozen researchers with different backgrounds and from different institutes around the world. Promoting best practice in benchmarking is its main goal. The article discusses eight essential topics in benchmarking: clearly stated goals, well-specified problems, suitable algorithms, adequate performance measures, thoughtful analysis, eff… ▽ More This survey compiles ideas and recommendations from more than a dozen researchers with different backgrounds and from different institutes around the world. Promoting best practice in benchmarking is its main goal. The article discusses eight essential topics in benchmarking: clearly stated goals, well-specified problems, suitable algorithms, adequate performance measures, thoughtful analysis, effective and efficient designs, comprehensible presentations, and guaranteed reproducibility. The final goal is to provide well-accepted guidelines (rules) that might be useful for authors and reviewers. As benchmarking in optimization is an active and evolving field of research this manuscript is meant to co-evolve over time by means of periodic updates. △ Less

Submitted 16 December, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

Comments: Version 2

MSC Class: 68W50 ACM Class: A.1; B.8.0; G.1.6; G.4; I.2.8

arXiv:2006.12459 [pdf, other]

IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression

Authors: Rianne van den Berg, Alexey A. Gritsenko, Mostafa Dehghani, Casper Kaae Sønderby, Tim Salimans

Abstract: In this paper we analyse and improve integer discrete flows for lossless compression. Integer discrete flows are a recently proposed class of models that learn invertible transformations for integer-valued random variables. Their discrete nature makes them particularly suitable for lossless compression with entropy coding schemes. We start by investigating a recent theoretical claim that states th… ▽ More In this paper we analyse and improve integer discrete flows for lossless compression. Integer discrete flows are a recently proposed class of models that learn invertible transformations for integer-valued random variables. Their discrete nature makes them particularly suitable for lossless compression with entropy coding schemes. We start by investigating a recent theoretical claim that states that invertible flows for discrete random variables are less flexible than their continuous counterparts. We demonstrate with a proof that this claim does not hold for integer discrete flows due to the embedding of data with finite support into the countably infinite integer lattice. Furthermore, we zoom in on the effect of gradient bias due to the straight-through estimator in integer discrete flows, and demonstrate that its influence is highly dependent on architecture choices and less prominent than previously thought. Finally, we show how different architecture modifications improve the performance of this model class for lossless compression, and that they also enable more efficient compression: a model with half the number of flow layers performs on par with or better than the original integer discrete flow model. △ Less

Submitted 23 March, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: Accepted as a conference paper at the Ninth International Conference on Learning Representations (ICLR) 2021

arXiv:2003.08370 [pdf, other]

Distant Supervision and Noisy Label Learning for Low Resource Named Entity Recognition: A Study on Hausa and Yorùbá

Authors: David Ifeoluwa Adelani, Michael A. Hedderich, Dawei Zhu, Esther van den Berg, Dietrich Klakow

Abstract: The lack of labeled training data has limited the development of natural language processing tools, such as named entity recognition, for many languages spoken in develo** countries. Techniques such as distant and weak supervision can be used to create labeled data in a (semi-) automatic way. Additionally, to alleviate some of the negative effects of the errors in automatic annotation, noise-han… ▽ More The lack of labeled training data has limited the development of natural language processing tools, such as named entity recognition, for many languages spoken in develo** countries. Techniques such as distant and weak supervision can be used to create labeled data in a (semi-) automatic way. Additionally, to alleviate some of the negative effects of the errors in automatic annotation, noise-handling methods can be integrated. Pretrained word embeddings are another key component of most neural named entity classifiers. With the advent of more complex contextual word embeddings, an interesting trade-off between model size and performance arises. While these techniques have been shown to work well in high-resource settings, we want to study how they perform in low-resource scenarios. In this work, we perform named entity recognition for Hausa and Yorùbá, two languages that are widely spoken in several develo** countries. We evaluate different embedding approaches and show that distant supervision can be successfully leveraged in a realistic low-resource scenario where it can more than double a classifier's performance. △ Less

Submitted 31 March, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

Comments: Accepted to ICLR 2020 Workshop

arXiv:1912.11136 [pdf, other]

doi 10.1016/j.phro.2020.04.002

CBCT-to-CT synthesis with a single neural network for head-and-neck, lung and breast cancer adaptive radiotherapy

Authors: Matteo Maspero, Mark HF Savenije, Tristan CF van Heijst, Joost JC Verhoeff, Alexis NTJ Kotte, Anette C Houweling, Cornelis AT van den Berg

Abstract: Purpose: CBCT-based adaptive radiotherapy requires daily images for accurate dose calculations. This study investigates the feasibility of applying a single convolutional network to facilitate CBCT-to-CT synthesis for head-and-neck, lung, and breast cancer patients. Methods: Ninety-nine patients diagnosed with head-and-neck, lung or breast cancer undergoing radiotherapy with CBCT-based position ve… ▽ More Purpose: CBCT-based adaptive radiotherapy requires daily images for accurate dose calculations. This study investigates the feasibility of applying a single convolutional network to facilitate CBCT-to-CT synthesis for head-and-neck, lung, and breast cancer patients. Methods: Ninety-nine patients diagnosed with head-and-neck, lung or breast cancer undergoing radiotherapy with CBCT-based position verification were included in this study. CBCTs were registered to planning CTs according to clinical procedures. Three cycle-consistent generative adversarial networks (cycle-GANs) were trained in an unpaired manner on 15 patients per anatomical site generating synthetic-CTs (sCTs). Another network was trained with all the anatomical sites together. Performances of all four networks were compared and evaluated for image similarity against rescan CT (rCT). Clinical plans were recalculated on CT and sCT and analysed through voxel-based dose differences and γ-analysis. Results: A sCT was generated in 10 seconds. Image similarity was comparable between models trained on different anatomical sites and a single model for all sites. Mean dose differences < 0.5% were obtained in high-dose regions. Mean gamma (2%,2mm) pass-rates > 95% were achieved for all sites. Conclusions: Cycle-GAN reduced CBCT artefacts and increased HU similarity to CT, enabling sCT-based dose calculations. The speed of the network can facilitate on-line adaptive radiotherapy using a single network for head-and-neck, lung and breast cancer patients. △ Less

Submitted 23 December, 2019; originally announced December 2019.

Comments: Submitted to Medical Physics; 2019-12-23

Journal ref: Physics and Imaging in Radiation Oncology Volume 14, April 2020, Pages 24-31

arXiv:1906.07582 [pdf, other]

Differentiable probabilistic models of scientific imaging with the Fourier slice theorem

Authors: Karen Ullrich, Rianne van den Berg, Marcus Brubaker, David Fleet, Max Welling

Abstract: Scientific imaging techniques such as optical and electron microscopy and computed tomography (CT) scanning are used to study the 3D structure of an object through 2D observations. These observations are related to the original 3D object through orthogonal integral projections. For common 3D reconstruction algorithms, computational efficiency requires the modeling of the 3D structures to take plac… ▽ More Scientific imaging techniques such as optical and electron microscopy and computed tomography (CT) scanning are used to study the 3D structure of an object through 2D observations. These observations are related to the original 3D object through orthogonal integral projections. For common 3D reconstruction algorithms, computational efficiency requires the modeling of the 3D structures to take place in Fourier space by applying the Fourier slice theorem. At present, it is unclear how to differentiate through the projection operator, and hence current learning algorithms can not rely on gradient based methods to optimize 3D structure models. In this paper we show how back-propagation through the projection operator in Fourier space can be achieved. We demonstrate the validity of the approach with experiments on 3D reconstruction of proteins. We further extend our approach to learning probabilistic models of 3D objects. This allows us to predict regions of low sampling rates or estimate noise. A higher sample efficiency can be reached by utilizing the learned uncertainties of the 3D structure as an unsupervised estimate of the model fit. Finally, we demonstrate how the reconstruction algorithm can be extended with an amortized inference scheme on unknown attributes such as object pose. Through empirical studies we show that joint inference of the 3D structure and the object pose becomes more difficult when the ground truth object contains more symmetries. Due to the presence of for instance (approximate) rotational symmetries, the pose estimation can easily get stuck in local optima, inhibiting a fine-grained high-quality estimate of the 3D structure. △ Less

Submitted 20 June, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

Comments: accepted to UAI 2019

arXiv:1905.07376 [pdf, other]

Integer Discrete Flows and Lossless Compression

Authors: Emiel Hoogeboom, Jorn W. T. Peters, Rianne van den Berg, Max Welling

Abstract: Lossless compression methods shorten the expected representation size of data without loss of information, using a statistical model. Flow-based models are attractive in this setting because they admit exact likelihood optimization, which is equivalent to minimizing the expected number of bits per message. However, conventional flows assume continuous data, which may lead to reconstruction errors… ▽ More Lossless compression methods shorten the expected representation size of data without loss of information, using a statistical model. Flow-based models are attractive in this setting because they admit exact likelihood optimization, which is equivalent to minimizing the expected number of bits per message. However, conventional flows assume continuous data, which may lead to reconstruction errors when quantized for compression. For that reason, we introduce a flow-based generative model for ordinal discrete data called Integer Discrete Flow (IDF): a bijective integer map that can learn rich transformations on high-dimensional data. As building blocks for IDFs, we introduce a flexible transformation layer called integer discrete coupling. Our experiments show that IDFs are competitive with other flow-based generative models. Furthermore, we demonstrate that IDF based compression achieves state-of-the-art lossless compression rates on CIFAR10, ImageNet32, and ImageNet64. To the best of our knowledge, this is the first lossless compression method that uses invertible neural networks. △ Less

Submitted 6 December, 2019; v1 submitted 17 May, 2019; originally announced May 2019.

Comments: Accepted as a conference paper at Neural Information Processing Systems (NeurIPS) 2019

arXiv:1901.11137 [pdf, other]

Emerging Convolutions for Generative Normalizing Flows

Authors: Emiel Hoogeboom, Rianne van den Berg, Max Welling

Abstract: Generative flows are attractive because they admit exact likelihood optimization and efficient image synthesis. Recently, Kingma & Dhariwal (2018) demonstrated with Glow that generative flows are capable of generating high quality images. We generalize the 1 x 1 convolutions proposed in Glow to invertible d x d convolutions, which are more flexible since they operate on both channel and spatial ax… ▽ More Generative flows are attractive because they admit exact likelihood optimization and efficient image synthesis. Recently, Kingma & Dhariwal (2018) demonstrated with Glow that generative flows are capable of generating high quality images. We generalize the 1 x 1 convolutions proposed in Glow to invertible d x d convolutions, which are more flexible since they operate on both channel and spatial axes. We propose two methods to produce invertible convolutions that have receptive fields identical to standard convolutions: Emerging convolutions are obtained by chaining specific autoregressive convolutions, and periodic convolutions are decoupled in the frequency domain. Our experiments show that the flexibility of d x d convolutions significantly improves the performance of generative flow models on galaxy images, CIFAR10 and ImageNet. △ Less

Submitted 20 May, 2019; v1 submitted 30 January, 2019; originally announced January 2019.

Comments: Accepted at International Conference on Machine Learning (ICML) 2019

arXiv:1812.00991 [pdf]

Analyzing Partitioned FAIR Health Data Responsibly

Authors: Chang Sun, Lianne Ippel, Birgit Wouters, Johan van Soest, Alexander Malic, Onaopepo Adekunle, Bob van den Berg, Marco Puts, Ole Mussmann, Annemarie Koster, Carla van der Kallen, David Townend, Andre Dekker, Michel Dumontier

Abstract: It is widely anticipated that the use of health-related big data will enable further understanding and improvements in human health and wellbeing. Our current project, funded through the Dutch National Research Agenda, aims to explore the relationship between the development of diabetes and socio-economic factors such as lifestyle and health care utilization. The analysis involves combining data f… ▽ More It is widely anticipated that the use of health-related big data will enable further understanding and improvements in human health and wellbeing. Our current project, funded through the Dutch National Research Agenda, aims to explore the relationship between the development of diabetes and socio-economic factors such as lifestyle and health care utilization. The analysis involves combining data from the Maastricht Study (DMS), a prospective clinical study, and data collected by Statistics Netherlands (CBS) as part of its routine operations. However, a wide array of social, legal, technical, and scientific issues hinder the analysis. In this paper, we describe these challenges and our progress towards addressing them. △ Less

Submitted 2 December, 2018; originally announced December 2018.

Comments: 6 pages, 1 figure, preliminary result, project report

ACM Class: E.1; E.3; H.2.4; H.2.8

arXiv:1810.08723 [pdf, other]

The Ocean Tensor Package

Authors: Ewout van den Berg

Abstract: Matrix and tensor operations form the basis of a wide range of fields and applications, and in many cases constitute a substantial part of the overall computational complexity. The ability of general-purpose GPUs to speed up many of these operations and enable others has resulted in a widespread adaptation of these devices. In order for tensor operations to take full advantage of the computational… ▽ More Matrix and tensor operations form the basis of a wide range of fields and applications, and in many cases constitute a substantial part of the overall computational complexity. The ability of general-purpose GPUs to speed up many of these operations and enable others has resulted in a widespread adaptation of these devices. In order for tensor operations to take full advantage of the computational power, specialized software is required, and currently there exist several packages (predominantly in the area of deep learning) that incorporate tensor operations on both CPU and GPU. Nevertheless, a stand-alone framework that supports general tensor operations is still missing. In this paper we fill this gap and propose the Ocean Tensor Library: a modular tensor-support package that is designed to serve as a foundational layer for applications that require dense tensor operations on a variety of device types. The API is carefully designed to be powerful, extensible, and at the same time easy to use. The package is available as open source. △ Less

Submitted 19 October, 2018; originally announced October 2018.

ACM Class: G.4

arXiv:1810.05728 [pdf, other]

Estimating Information Flow in Deep Neural Networks

Authors: Ziv Goldfeld, Ewout van den Berg, Kristjan Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy

Abstract: We study the flow of information and the evolution of internal representations during deep neural network (DNN) training, aiming to demystify the compression aspect of the information bottleneck theory. The theory suggests that DNN training comprises a rapid fitting phase followed by a slower compression phase, in which the mutual information $I(X;T)$ between the input $X$ and internal representat… ▽ More We study the flow of information and the evolution of internal representations during deep neural network (DNN) training, aiming to demystify the compression aspect of the information bottleneck theory. The theory suggests that DNN training comprises a rapid fitting phase followed by a slower compression phase, in which the mutual information $I(X;T)$ between the input $X$ and internal representations $T$ decreases. Several papers observe compression of estimated mutual information on different DNN models, but the true $I(X;T)$ over these networks is provably either constant (discrete $X$) or infinite (continuous $X$). This work explains the discrepancy between theory and experiments, and clarifies what was actually measured by these past works. To this end, we introduce an auxiliary (noisy) DNN framework for which $I(X;T)$ is a meaningful quantity that depends on the network's parameters. This noisy framework is shown to be a good proxy for the original (deterministic) DNN both in terms of performance and the learned representations. We then develop a rigorous estimator for $I(X;T)$ in noisy DNNs and observe compression in various models. By relating $I(X;T)$ in the noisy DNN to an information-theoretic communication problem, we show that compression is driven by the progressive clustering of hidden representations of inputs from the same class. Several methods to directly monitor clustering of hidden representations, both in noisy and deterministic DNNs, are used to show that meaningful clusters form in the $T$ space. Finally, we return to the estimator of $I(X;T)$ employed in past works, and demonstrate that while it fails to capture the true (vacuous) mutual information, it does serve as a measure for clustering. This clarifies the past observations of compression and isolates the geometric clustering of hidden representations as the true phenomenon of interest. △ Less

Submitted 30 May, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

Comments: Main text accepted to ICML 2019. This preprint contains the full version of that paper (including omitted appendices)

arXiv:1810.05500 [pdf, other]

Predictive Uncertainty through Quantization

Authors: Bastiaan S. Veeling, Rianne van den Berg, Max Welling

Abstract: High-risk domains require reliable confidence estimates from predictive models. Deep latent variable models provide these, but suffer from the rigid variational distributions used for tractable inference, which err on the side of overconfidence. We propose Stochastic Quantized Activation Distributions (SQUAD), which imposes a flexible yet tractable distribution over discretized latent variables. T… ▽ More High-risk domains require reliable confidence estimates from predictive models. Deep latent variable models provide these, but suffer from the rigid variational distributions used for tractable inference, which err on the side of overconfidence. We propose Stochastic Quantized Activation Distributions (SQUAD), which imposes a flexible yet tractable distribution over discretized latent variables. The proposed method is scalable, self-normalizing and sample efficient. We demonstrate that the model fully utilizes the flexible distribution, learns interesting non-linearities, and provides predictive uncertainty of competitive quality. △ Less

Submitted 12 October, 2018; originally announced October 2018.

arXiv:1810.01118 [pdf, other]

Sinkhorn AutoEncoders

Authors: Giorgio Patrini, Rianne van den Berg, Patrick Forré, Marcello Carioni, Samarth Bhargav, Max Welling, Tim Genewein, Frank Nielsen

Abstract: Optimal transport offers an alternative to maximum likelihood for learning generative autoencoding models. We show that minimizing the p-Wasserstein distance between the generator and the true data distribution is equivalent to the unconstrained min-min optimization of the p-Wasserstein distance between the encoder aggregated posterior and the prior in latent space, plus a reconstruction error. We… ▽ More Optimal transport offers an alternative to maximum likelihood for learning generative autoencoding models. We show that minimizing the p-Wasserstein distance between the generator and the true data distribution is equivalent to the unconstrained min-min optimization of the p-Wasserstein distance between the encoder aggregated posterior and the prior in latent space, plus a reconstruction error. We also identify the role of its trade-off hyperparameter as the capacity of the generator: its Lipschitz constant. Moreover, we prove that optimizing the encoder over any class of universal approximators, such as deterministic neural networks, is enough to come arbitrarily close to the optimum. We therefore advertise this framework, which holds for any metric space and prior, as a sweet-spot of current generative autoencoding objectives. We then introduce the Sinkhorn auto-encoder (SAE), which approximates and minimizes the p-Wasserstein distance in latent space via backprogation through the Sinkhorn algorithm. SAE directly works on samples, i.e. it models the aggregated posterior as an implicit distribution, with no need for a reparameterization trick for gradients estimations. SAE is thus able to work with different metric spaces and priors with minimal adaptations. We demonstrate the flexibility of SAE on latent spaces with different geometries and priors and compare with other methods on benchmark data sets. △ Less

Submitted 15 July, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

Comments: Accepted for oral presentation at UAI19

arXiv:1803.10113 [pdf, ps, other]

Univalent polymorphism

Authors: Benno van den Berg

Abstract: We show that Martin Hyland's effective topos can be exhibited as the homotopy category of a path category $\mathbb{EFF}$. Path categories are categories of fibrant objects in the sense of Brown satisfying two additional properties and as such provide a context in which one can interpret many notions from homotopy theory and Homotopy Type Theory. Within the path category $\mathbb{EFF}$ one can iden… ▽ More We show that Martin Hyland's effective topos can be exhibited as the homotopy category of a path category $\mathbb{EFF}$. Path categories are categories of fibrant objects in the sense of Brown satisfying two additional properties and as such provide a context in which one can interpret many notions from homotopy theory and Homotopy Type Theory. Within the path category $\mathbb{EFF}$ one can identify a class of discrete fibrations which is closed under push forward along arbitrary fibrations (in other words, this class is polymorphic or closed under impredicative quantification) and satisfies propositional resizing. This class does not have a univalent representation, but if one restricts to those discrete fibrations whose fibres are propositions in the sense of Homotopy Type Theory, then it does. This means that, modulo the usual coherence problems, it can be seen as a model of the Calculus of Constructions with a univalent type of propositions. We will also build a more complicated path category in which the class of discrete fibrations whose fibres are sets in the sense of Homotopy Type Theory has a univalent representation, which means that this will be a model of the Calculus of Constructions with a univalent type of sets. △ Less

Submitted 1 August, 2018; v1 submitted 27 March, 2018; originally announced March 2018.

Comments: Corrected typos, updated references and added a section on Church's Thesis

arXiv:1803.05649 [pdf, other]

Sylvester Normalizing Flows for Variational Inference

Authors: Rianne van den Berg, Leonard Hasenclever, Jakub M. Tomczak, Max Welling

Abstract: Variational inference relies on flexible approximate posterior distributions. Normalizing flows provide a general recipe to construct flexible variational posteriors. We introduce Sylvester normalizing flows, which can be seen as a generalization of planar flows. Sylvester normalizing flows remove the well-known single-unit bottleneck from planar flows, making a single transformation much more fle… ▽ More Variational inference relies on flexible approximate posterior distributions. Normalizing flows provide a general recipe to construct flexible variational posteriors. We introduce Sylvester normalizing flows, which can be seen as a generalization of planar flows. Sylvester normalizing flows remove the well-known single-unit bottleneck from planar flows, making a single transformation much more flexible. We compare the performance of Sylvester normalizing flows against planar flows and inverse autoregressive flows and demonstrate that they compare favorably on several datasets. △ Less

Submitted 20 February, 2019; v1 submitted 15 March, 2018; originally announced March 2018.

Comments: Published at UAI 2018, 12 pages, 3 figures, code at: https://github.com/riannevdberg/sylvester-flows

arXiv:1710.09627 [pdf]

SRE: Semantic Rules Engine For the Industrial Internet-Of-Things Gateways

Authors: Charbel El Kaed, Imran Khan, Andre Van Den Berg, Hicham Hossayni, Christophe Saint-Marcel

Abstract: The Advent of the Internet-of-Things (IoT) paradigm has brought opportunities to solve many real-world problems. Energy management, for example, has attracted huge interest from academia, industries, governments and regulatory bodies. It involves collecting energy usage data, analyzing it, and optimizing the energy consumption by applying control strategies. However, in industrial environments, pe… ▽ More The Advent of the Internet-of-Things (IoT) paradigm has brought opportunities to solve many real-world problems. Energy management, for example, has attracted huge interest from academia, industries, governments and regulatory bodies. It involves collecting energy usage data, analyzing it, and optimizing the energy consumption by applying control strategies. However, in industrial environments, performing such optimization is not trivial. The changes in business rules, process control, and customer requirements make it much more challenging. In this paper, a Semantic Rules Engine (SRE) for industrial gateways is presented that allows implementing dynamic and flexible rule-based control strategies. It is simple, expressive, and allows managing rules on-the-fly without causing any service interruption. Additionally, it can handle semantic queries and provide results by inferring additional knowledge from previously defined concepts in ontologies. SRE has been validated and tested on different hardware platforms and in commercial products. Performance evaluations are also presented to validate its conformance to the customer requirements. △ Less

Submitted 26 October, 2017; originally announced October 2017.

Comments: Accepted for publication in forthcoming issue of IEEE Transactions on Industrial Informatics. The content is final but has NOT been proof-read

Journal ref: IEEE Transactions on Industrial Informatics, 2017

arXiv:1708.01155 [pdf, other]

Deep MR to CT Synthesis using Unpaired Data

Authors: Jelmer M. Wolterink, Anna M. Dinkla, Mark H. F. Savenije, Peter R. Seevinck, Cornelis A. T. van den Berg, Ivana Isgum

Abstract: MR-only radiotherapy treatment planning requires accurate MR-to-CT synthesis. Current deep learning methods for MR-to-CT synthesis depend on pairwise aligned MR and CT training images of the same patient. However, misalignment between paired images could lead to errors in synthesized CT images. To overcome this, we propose to train a generative adversarial network (GAN) with unpaired MR and CT ima… ▽ More MR-only radiotherapy treatment planning requires accurate MR-to-CT synthesis. Current deep learning methods for MR-to-CT synthesis depend on pairwise aligned MR and CT training images of the same patient. However, misalignment between paired images could lead to errors in synthesized CT images. To overcome this, we propose to train a generative adversarial network (GAN) with unpaired MR and CT images. A GAN consisting of two synthesis convolutional neural networks (CNNs) and two discriminator CNNs was trained with cycle consistency to transform 2D brain MR image slices into 2D brain CT image slices and vice versa. Brain MR and CT images of 24 patients were analyzed. A quantitative evaluation showed that the model was able to synthesize CT images that closely approximate reference CT images, and was able to outperform a GAN model trained with paired MR and CT images. △ Less

Submitted 3 August, 2017; originally announced August 2017.

Comments: MICCAI 2017 Workshop on Simulation and Synthesis in Medical Imaging

arXiv:1706.02263 [pdf, other]

Graph Convolutional Matrix Completion

Authors: Rianne van den Berg, Thomas N. Kipf, Max Welling

Abstract: We consider matrix completion for recommender systems from the point of view of link prediction on graphs. Interaction data such as movie ratings can be represented by a bipartite user-item graph with labeled edges denoting observed ratings. Building on recent progress in deep learning on graph-structured data, we propose a graph auto-encoder framework based on differentiable message passing on th… ▽ More We consider matrix completion for recommender systems from the point of view of link prediction on graphs. Interaction data such as movie ratings can be represented by a bipartite user-item graph with labeled edges denoting observed ratings. Building on recent progress in deep learning on graph-structured data, we propose a graph auto-encoder framework based on differentiable message passing on the bipartite interaction graph. Our model shows competitive performance on standard collaborative filtering benchmarks. In settings where complimentary feature information or structured data such as a social network is available, our framework outperforms recent state-of-the-art methods. △ Less

Submitted 25 October, 2017; v1 submitted 7 June, 2017; originally announced June 2017.

Comments: 9 pages, 3 figures, updated with additional experimental evaluation

arXiv:1703.06103 [pdf, other]

Modeling Relational Data with Graph Convolutional Networks

Authors: Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling

Abstract: Knowledge graphs enable a wide variety of applications, including question answering and information retrieval. Despite the great effort invested in their creation and maintenance, even the largest (e.g., Yago, DBPedia or Wikidata) remain incomplete. We introduce Relational Graph Convolutional Networks (R-GCNs) and apply them to two standard knowledge base completion tasks: Link prediction (recove… ▽ More Knowledge graphs enable a wide variety of applications, including question answering and information retrieval. Despite the great effort invested in their creation and maintenance, even the largest (e.g., Yago, DBPedia or Wikidata) remain incomplete. We introduce Relational Graph Convolutional Networks (R-GCNs) and apply them to two standard knowledge base completion tasks: Link prediction (recovery of missing facts, i.e. subject-predicate-object triples) and entity classification (recovery of missing entity attributes). R-GCNs are related to a recent class of neural networks operating on graphs, and are developed specifically to deal with the highly multi-relational data characteristic of realistic knowledge bases. We demonstrate the effectiveness of R-GCNs as a stand-alone model for entity classification. We further show that factorization models for link prediction such as DistMult can be significantly improved by enriching them with an encoder model to accumulate evidence over multiple inference steps in the relational graph, demonstrating a large improvement of 29.8% on FB15k-237 over a decoder-only baseline. △ Less

Submitted 26 October, 2017; v1 submitted 17 March, 2017; originally announced March 2017.

arXiv:1701.08369 [pdf, ps, other]

A homotopy-theoretic model of function extensionality in the effective topos

Authors: Daniil Frumin, Benno van den Berg

Abstract: We present a way of constructing a Quillen model structure on a full subcategory of an elementary topos, starting with an interval object with connections and a certain dominance. The advantage of this method is that it does not require the underlying topos to be cocomplete. The resulting model category structure gives rise to a model of homotopy type theory with identity types, $Σ$- and $Π$-types… ▽ More We present a way of constructing a Quillen model structure on a full subcategory of an elementary topos, starting with an interval object with connections and a certain dominance. The advantage of this method is that it does not require the underlying topos to be cocomplete. The resulting model category structure gives rise to a model of homotopy type theory with identity types, $Σ$- and $Π$-types, and functional extensionality. We apply the method to the effective topos with the interval object $\nabla 2$. In the resulting model structure we identify uniform inhabited objects as contractible objects, and show that discrete objects are fibrant. Moreover, we show that the unit of the discrete reflection is a homotopy equivalence and the homotopy category of fibrant assemblies is equivalent to the category of modest sets. We compare our work with the path object category construction on the effective topos by Jaap van Oosten. △ Less

Submitted 12 March, 2018; v1 submitted 29 January, 2017; originally announced January 2017.

Comments: v2: The section "A non-contractible uniform object." was removed due to an error in Lemma 6.5, Proposition 7.3 was changed to account for the fact that only the "only if" direction holds

arXiv:1612.03005 [pdf, other]

doi 10.1145/3106328.3106338

No domain left behind: is Let's Encrypt democratizing encryption?

Authors: Maarten Aertsen, Maciej Korczyński, Giovane C. M. Moura, Samaneh Tajalizadehkhoob, Jan van den Berg

Abstract: The 2013 National Security Agency revelations of pervasive monitoring have lead to an "encryption rush" across the computer and Internet industry. To push back against massive surveillance and protect users privacy, vendors, hosting and cloud providers have widely deployed encryption on their hardware, communication links, and applications. As a consequence, the most of web traffic nowadays is enc… ▽ More The 2013 National Security Agency revelations of pervasive monitoring have lead to an "encryption rush" across the computer and Internet industry. To push back against massive surveillance and protect users privacy, vendors, hosting and cloud providers have widely deployed encryption on their hardware, communication links, and applications. As a consequence, the most of web traffic nowadays is encrypted. However, there is still a significant part of Internet traffic that is not encrypted. It has been argued that both costs and complexity associated with obtaining and deploying X.509 certificates are major barriers for widespread encryption, since these certificates are required to established encrypted connections. To address these issues, the Electronic Frontier Foundation, Mozilla Foundation, and the University of Michigan have set up Let's Encrypt (LE), a certificate authority that provides both free X.509 certificates and software that automates the deployment of these certificates. In this paper, we investigate if LE has been successful in democratizing encryption: we analyze certificate issuance in the first year of LE and show from various perspectives that LE adoption has an upward trend and it is in fact being successful in covering the lower-cost end of the hosting market. △ Less

Submitted 9 December, 2016; originally announced December 2016.

Showing 1–50 of 63 results for author: Berg, V