-
Fake or JPEG? Revealing Common Biases in Generated Image Detection Datasets
Authors:
Patrick Grommelt,
Louis Weiss,
Franz-Josef Pfreundt,
Janis Keuper
Abstract:
The widespread adoption of generative image models has highlighted the urgent need to detect artificial content, which is a crucial step in combating widespread manipulation and misinformation. Consequently, numerous detectors and associated datasets have emerged. However, many of these datasets inadvertently introduce undesirable biases, thereby impacting the effectiveness and evaluation of detec…
▽ More
The widespread adoption of generative image models has highlighted the urgent need to detect artificial content, which is a crucial step in combating widespread manipulation and misinformation. Consequently, numerous detectors and associated datasets have emerged. However, many of these datasets inadvertently introduce undesirable biases, thereby impacting the effectiveness and evaluation of detectors. In this paper, we emphasize that many datasets for AI-generated image detection contain biases related to JPEG compression and image size. Using the GenImage dataset, we demonstrate that detectors indeed learn from these undesired factors. Furthermore, we show that removing the named biases substantially increases robustness to JPEG compression and significantly alters the cross-generator performance of evaluated detectors. Specifically, it leads to more than 11 percentage points increase in cross-generator performance for ResNet50 and Swin-T detectors on the GenImage dataset, achieving state-of-the-art results.
We provide the dataset and source codes of this paper on the anonymous website: https://www.unbiased-genimage.org
△ Less
Submitted 28 March, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Estimating the Robustness of Classification Models by the Structure of the Learned Feature-Space
Authors:
Kalun Ho,
Franz-Josef Pfreundt,
Janis Keuper,
Margret Keuper
Abstract:
Over the last decade, the development of deep image classification networks has mostly been driven by the search for the best performance in terms of classification accuracy on standardized benchmarks like ImageNet. More recently, this focus has been expanded by the notion of model robustness, \ie the generalization abilities of models towards previously unseen changes in the data distribution. Wh…
▽ More
Over the last decade, the development of deep image classification networks has mostly been driven by the search for the best performance in terms of classification accuracy on standardized benchmarks like ImageNet. More recently, this focus has been expanded by the notion of model robustness, \ie the generalization abilities of models towards previously unseen changes in the data distribution. While new benchmarks, like ImageNet-C, have been introduced to measure robustness properties, we argue that fixed testsets are only able to capture a small portion of possible data variations and are thus limited and prone to generate new overfitted solutions. To overcome these drawbacks, we suggest to estimate the robustness of a model directly from the structure of its learned feature-space. We introduce robustness indicators which are obtained via unsupervised clustering of latent representations from a trained classifier and show very high correlations to the model performance on corrupted test data.
△ Less
Submitted 19 August, 2021; v1 submitted 23 June, 2021;
originally announced June 2021.
-
Combining Transformer Generators with Convolutional Discriminators
Authors:
Ricard Durall,
Stanislav Frolov,
Jörn Hees,
Federico Raue,
Franz-Josef Pfreundt,
Andreas Dengel,
Janis Keupe
Abstract:
Transformer models have recently attracted much interest from computer vision researchers and have since been successfully employed for several problems traditionally addressed with convolutional neural networks. At the same time, image synthesis using generative adversarial networks (GANs) has drastically improved over the last few years. The recently proposed TransGAN is the first GAN using only…
▽ More
Transformer models have recently attracted much interest from computer vision researchers and have since been successfully employed for several problems traditionally addressed with convolutional neural networks. At the same time, image synthesis using generative adversarial networks (GANs) has drastically improved over the last few years. The recently proposed TransGAN is the first GAN using only transformer-based architectures and achieves competitive results when compared to convolutional GANs. However, since transformers are data-hungry architectures, TransGAN requires data augmentation, an auxiliary super-resolution task during training, and a masking prior to guide the self-attention mechanism. In this paper, we study the combination of a transformer-based generator and convolutional discriminator and successfully remove the need of the aforementioned required design choices. We evaluate our approach by conducting a benchmark of well-known CNN discriminators, ablate the size of the transformer-based generator, and show that combining both architectural elements into a hybrid model leads to better results. Furthermore, we investigate the frequency spectrum properties of generated images and observe that our model retains the benefits of an attention based generator.
△ Less
Submitted 10 July, 2021; v1 submitted 21 May, 2021;
originally announced May 2021.
-
SpectralDefense: Detecting Adversarial Attacks on CNNs in the Fourier Domain
Authors:
Paula Harder,
Franz-Josef Pfreundt,
Margret Keuper,
Janis Keuper
Abstract:
Despite the success of convolutional neural networks (CNNs) in many computer vision and image analysis tasks, they remain vulnerable against so-called adversarial attacks: Small, crafted perturbations in the input images can lead to false predictions. A possible defense is to detect adversarial examples. In this work, we show how analysis in the Fourier domain of input images and feature maps can…
▽ More
Despite the success of convolutional neural networks (CNNs) in many computer vision and image analysis tasks, they remain vulnerable against so-called adversarial attacks: Small, crafted perturbations in the input images can lead to false predictions. A possible defense is to detect adversarial examples. In this work, we show how analysis in the Fourier domain of input images and feature maps can be used to distinguish benign test samples from adversarial images. We propose two novel detection methods: Our first method employs the magnitude spectrum of the input images to detect an adversarial attack. This simple and robust classifier can successfully detect adversarial perturbations of three commonly used attack methods. The second method builds upon the first and additionally extracts the phase of Fourier coefficients of feature-maps at different layers of the network. With this extension, we are able to improve adversarial detection rates compared to state-of-the-art detectors on five different attack methods.
△ Less
Submitted 2 June, 2021; v1 submitted 4 March, 2021;
originally announced March 2021.
-
Latent Space Conditioning on Generative Adversarial Networks
Authors:
Ricard Durall,
Kalun Ho,
Franz-Josef Pfreundt,
Janis Keuper
Abstract:
Generative adversarial networks are the state of the art approach towards learned synthetic image generation. Although early successes were mostly unsupervised, bit by bit, this trend has been superseded by approaches based on labelled data. These supervised methods allow a much finer-grained control of the output image, offering more flexibility and stability. Nevertheless, the main drawback of s…
▽ More
Generative adversarial networks are the state of the art approach towards learned synthetic image generation. Although early successes were mostly unsupervised, bit by bit, this trend has been superseded by approaches based on labelled data. These supervised methods allow a much finer-grained control of the output image, offering more flexibility and stability. Nevertheless, the main drawback of such models is the necessity of annotated data. In this work, we introduce an novel framework that benefits from two popular learning techniques, adversarial training and representation learning, and takes a step towards unsupervised conditional GANs. In particular, our approach exploits the structure of a latent space (learned by the representation learning) and employs it to condition the generative model. In this way, we break the traditional dependency between condition and label, substituting the latter by unsupervised features coming from the latent space. Finally, we show that this new technique is able to produce samples on demand kee** the quality of its supervised counterpart.
△ Less
Submitted 16 December, 2020;
originally announced December 2020.
-
Module Intersection for the Integration-by-Parts Reduction of Multi-Loop Feynman Integrals
Authors:
Dominik Bendle,
Janko Boehm,
Wolfram Decker,
Alessandro Georgoudis,
Franz-Josef Pfreundt,
Mirko Rahn,
Yang Zhang
Abstract:
In this manuscript, which is to appear in the proceedings of the conference "MathemAmplitude 2019" in Padova, Italy, we provide an overview of the module intersection method for the the integration-by-parts (IBP) reduction of multi-loop Feynman integrals. The module intersection method, based on computational algebraic geometry, is a highly efficient way of getting IBP relations without double pro…
▽ More
In this manuscript, which is to appear in the proceedings of the conference "MathemAmplitude 2019" in Padova, Italy, we provide an overview of the module intersection method for the the integration-by-parts (IBP) reduction of multi-loop Feynman integrals. The module intersection method, based on computational algebraic geometry, is a highly efficient way of getting IBP relations without double propagator or with a bound on the highest propagator degree. In this manner, trimmed IBP systems which are much shorter than the traditional ones can be obtained. We apply the modern, Petri net based, workflow management system GPI-Space in combination with the computer algebra system Singular to solve the trimmed IBP system via interpolation and efficient parallelization. We show, in particular, how to use the new plugin feature of GPI-Space to manage a global state of the computation and to efficiently handle mutable data. Moreover, a Mathematica interface to generate IBPs with restricted propagator degree, which is based on module intersection, is presented in this review.
△ Less
Submitted 14 October, 2020;
originally announced October 2020.
-
Learning Embeddings for Image Clustering: An Empirical Study of Triplet Loss Approaches
Authors:
Kalun Ho,
Janis Keuper,
Franz-Josef Pfreundt,
Margret Keuper
Abstract:
In this work, we evaluate two different image clustering objectives, k-means clustering and correlation clustering, in the context of Triplet Loss induced feature space embeddings. Specifically, we train a convolutional neural network to learn discriminative features by optimizing two popular versions of the Triplet Loss in order to study their clustering properties under the assumption of noisy l…
▽ More
In this work, we evaluate two different image clustering objectives, k-means clustering and correlation clustering, in the context of Triplet Loss induced feature space embeddings. Specifically, we train a convolutional neural network to learn discriminative features by optimizing two popular versions of the Triplet Loss in order to study their clustering properties under the assumption of noisy labels. Additionally, we propose a new, simple Triplet Loss formulation, which shows desirable properties with respect to formal clustering objectives and outperforms the existing methods. We evaluate all three Triplet loss formulations for K-means and correlation clustering on the CIFAR-10 image classification dataset.
△ Less
Submitted 6 July, 2020;
originally announced July 2020.
-
Local Facial Attribute Transfer through Inpainting
Authors:
Ricard Durall,
Franz-Josef Pfreundt,
Janis Keuper
Abstract:
The term attribute transfer refers to the tasks of altering images in such a way, that the semantic interpretation of a given input image is shifted towards an intended direction, which is quantified by semantic attributes. Prominent example applications are photo realistic changes of facial features and expressions, like changing the hair color, adding a smile, enlarging the nose or altering the…
▽ More
The term attribute transfer refers to the tasks of altering images in such a way, that the semantic interpretation of a given input image is shifted towards an intended direction, which is quantified by semantic attributes. Prominent example applications are photo realistic changes of facial features and expressions, like changing the hair color, adding a smile, enlarging the nose or altering the entire context of a scene, like transforming a summer landscape into a winter panorama. Recent advances in attribute transfer are mostly based on generative deep neural networks, using various techniques to manipulate images in the latent space of the generator.
In this paper, we present a novel method for the common sub-task of local attribute transfers, where only parts of a face have to be altered in order to achieve semantic changes (e.g. removing a mustache). In contrast to previous methods, where such local changes have been implemented by generating new (global) images, we propose to formulate local attribute transfers as an inpainting problem. Removing and regenerating only parts of images, our Attribute Transfer Inpainting Generative Adversarial Network (ATI-GAN) is able to utilize local context information to focus on the attributes while kee** the background unmodified resulting in visually sound results.
△ Less
Submitted 12 October, 2020; v1 submitted 7 February, 2020;
originally announced February 2020.
-
Scalable Hyperparameter Optimization with Lazy Gaussian Processes
Authors:
Raju Ram,
Sabine Müller,
Franz-Josef Pfreundt,
Nicolas R. Gauger,
Janis Keuper
Abstract:
Most machine learning methods require careful selection of hyper-parameters in order to train a high performing model with good generalization abilities. Hence, several automatic selection algorithms have been introduced to overcome tedious manual (try and error) tuning of these parameters. Due to its very high sample efficiency, Bayesian Optimization over a Gaussian Processes modeling of the para…
▽ More
Most machine learning methods require careful selection of hyper-parameters in order to train a high performing model with good generalization abilities. Hence, several automatic selection algorithms have been introduced to overcome tedious manual (try and error) tuning of these parameters. Due to its very high sample efficiency, Bayesian Optimization over a Gaussian Processes modeling of the parameter space has become the method of choice. Unfortunately, this approach suffers from a cubic compute complexity due to underlying Cholesky factorization, which makes it very hard to be scaled beyond a small number of sampling steps. In this paper, we present a novel, highly accurate approximation of the underlying Gaussian Process. Reducing its computational complexity from cubic to quadratic allows an efficient strong scaling of Bayesian Optimization while outperforming the previous approach regarding optimization accuracy. The first experiments show speedups of a factor of 162 in single node and further speed up by a factor of 5 in a parallel environment.
△ Less
Submitted 16 January, 2020;
originally announced January 2020.
-
Unmasking DeepFakes with simple Features
Authors:
Ricard Durall,
Margret Keuper,
Franz-Josef Pfreundt,
Janis Keuper
Abstract:
Deep generative models have recently achieved impressive results for many real-world applications, successfully generating high-resolution and diverse samples from complex datasets. Due to this improvement, fake digital contents have proliferated growing concern and spreading distrust in image content, leading to an urgent need for automated ways to detect these AI-generated fake images.
Despite…
▽ More
Deep generative models have recently achieved impressive results for many real-world applications, successfully generating high-resolution and diverse samples from complex datasets. Due to this improvement, fake digital contents have proliferated growing concern and spreading distrust in image content, leading to an urgent need for automated ways to detect these AI-generated fake images.
Despite the fact that many face editing algorithms seem to produce realistic human faces, upon closer examination, they do exhibit artifacts in certain domains which are often hidden to the naked eye. In this work, we present a simple way to detect such fake face images - so-called DeepFakes. Our method is based on a classical frequency domain analysis followed by basic classifier. Compared to previous systems, which need to be fed with large amounts of labeled data, our approach showed very good results using only a few annotated training samples and even achieved good accuracies in fully unsupervised scenarios. For the evaluation on high resolution face images, we combined several public datasets of real and fake faces into a new benchmark: Faces-HQ. Given such high-resolution images, our approach reaches a perfect classification accuracy of 100% when it is trained on as little as 20 annotated samples. In a second experiment, in the evaluation of the medium-resolution images of the CelebA dataset, our method achieves 100% accuracy supervised and 96% in an unsupervised setting. Finally, evaluating a low-resolution video sequences of the FaceForensics++ dataset, our method achieves 91% accuracy detecting manipulated videos.
Source Code: https://github.com/cc-hpc-itwm/DeepFakeDetection
△ Less
Submitted 4 March, 2020; v1 submitted 2 November, 2019;
originally announced November 2019.
-
Semi Few-Shot Attribute Translation
Authors:
Ricard Durall,
Franz-Josef Pfreundt,
Janis Keuper
Abstract:
Recent studies have shown remarkable success in image-to-image translation for attribute transfer applications. However, most of existing approaches are based on deep learning and require an abundant amount of labeled data to produce good results, therefore limiting their applicability. In the same vein, recent advances in meta-learning have led to successful implementations with limited available…
▽ More
Recent studies have shown remarkable success in image-to-image translation for attribute transfer applications. However, most of existing approaches are based on deep learning and require an abundant amount of labeled data to produce good results, therefore limiting their applicability. In the same vein, recent advances in meta-learning have led to successful implementations with limited available data, allowing so-called few-shot learning.
In this paper, we address this limitation of supervised methods, by proposing a novel approach based on GANs. These are trained in a meta-training manner, which allows them to perform image-to-image translations using just a few labeled samples from a new target class. This work empirically demonstrates the potential of training a GAN for few shot image-to-image translation on hair color attribute synthesis tasks, opening the door to further research on generative transfer learning.
△ Less
Submitted 16 October, 2019; v1 submitted 8 October, 2019;
originally announced October 2019.
-
GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks
Authors:
Avraam Chatzimichailidis,
Franz-Josef Pfreundt,
Nicolas R. Gauger,
Janis Keuper
Abstract:
Current training methods for deep neural networks boil down to very high dimensional and non-convex optimization problems which are usually solved by a wide range of stochastic gradient descent methods. While these approaches tend to work in practice, there are still many gaps in the theoretical understanding of key aspects like convergence and generalization guarantees, which are induced by the p…
▽ More
Current training methods for deep neural networks boil down to very high dimensional and non-convex optimization problems which are usually solved by a wide range of stochastic gradient descent methods. While these approaches tend to work in practice, there are still many gaps in the theoretical understanding of key aspects like convergence and generalization guarantees, which are induced by the properties of the optimization surface (loss landscape). In order to gain deeper insights, a number of recent publications proposed methods to visualize and analyze the optimization surfaces. However, the computational cost of these methods are very high, making it hardly possible to use them on larger networks.
In this paper, we present the GradVis Toolbox, an open source library for efficient and scalable visualization and analysis of deep neural network loss landscapes in Tensorflow and PyTorch. Introducing more efficient mathematical formulations and a novel parallelization scheme, GradVis allows to plot 2d and 3d projections of optimization surfaces and trajectories, as well as high resolution second order gradient information for large networks.
△ Less
Submitted 27 September, 2019; v1 submitted 26 September, 2019;
originally announced September 2019.
-
Object Segmentation using Pixel-wise Adversarial Loss
Authors:
Ricard Durall,
Franz-Josef Pfreundt,
Ullrich Köthe,
Janis Keuper
Abstract:
Recent deep learning based approaches have shown remarkable success on object segmentation tasks. However, there is still room for further improvement. Inspired by generative adversarial networks, we present a generic end-to-end adversarial approach, which can be combined with a wide range of existing semantic segmentation networks to improve their segmentation performance. The key element of our…
▽ More
Recent deep learning based approaches have shown remarkable success on object segmentation tasks. However, there is still room for further improvement. Inspired by generative adversarial networks, we present a generic end-to-end adversarial approach, which can be combined with a wide range of existing semantic segmentation networks to improve their segmentation performance. The key element of our method is to replace the commonly used binary adversarial loss with a high resolution pixel-wise loss. In addition, we train our generator employing stochastic weight averaging fashion, which further enhances the predicted output label maps leading to state-of-the-art results. We show, that this combination of pixel-wise adversarial training and weight averaging leads to significant and consistent gains in segmentation performance, compared to the baseline models.
△ Less
Submitted 23 September, 2019;
originally announced September 2019.
-
Integration-by-parts reductions of Feynman integrals using Singular and GPI-Space
Authors:
Dominik Bendle,
Janko Boehm,
Wolfram Decker,
Alessandro Georgoudis,
Franz-Josef Pfreundt,
Mirko Rahn,
Pascal Wasser,
Yang Zhang
Abstract:
We introduce an algebro-geometrically motived integration-by-parts (IBP) reduction method for multi-loop and multi-scale Feynman integrals, using a framework for massively parallel computations in computer algebra. This framework combines the computer algebra system Singular with the workflow management system GPI-Space, which is being developed at the Fraunhofer Institute for Industrial Mathemati…
▽ More
We introduce an algebro-geometrically motived integration-by-parts (IBP) reduction method for multi-loop and multi-scale Feynman integrals, using a framework for massively parallel computations in computer algebra. This framework combines the computer algebra system Singular with the workflow management system GPI-Space, which is being developed at the Fraunhofer Institute for Industrial Mathematics (ITWM). In our approach, the IBP relations are first trimmed by modern algebraic geometry tools and then solved by sparse linear algebra and our new interpolation methods. These steps are efficiently automatized and automatically parallelized by modeling the algorithm in GPI-Space using the language of Petri-nets. We demonstrate the potential of our method at the nontrivial example of reducing two-loop five-point nonplanar double-pentagon integrals. We also use GPI-Space to convert the basis of IBP reductions, and discuss the possible simplification of IBP coefficients in a uniformly transcendental basis.
△ Less
Submitted 25 October, 2019; v1 submitted 12 August, 2019;
originally announced August 2019.
-
Stabilizing GANs with Soft Octave Convolutions
Authors:
Ricard Durall,
Franz-Josef Pfreundt,
Janis Keuper
Abstract:
Motivated by recently published methods using frequency decompositions of convolutions (e.g. Octave Convolutions), we propose a novel convolution scheme to stabilize the training and reduce the likelihood of a mode collapse. The basic idea of our approach is to split convolutional filters into additive high and low frequency parts, while shifting weight updates from low to high during the training…
▽ More
Motivated by recently published methods using frequency decompositions of convolutions (e.g. Octave Convolutions), we propose a novel convolution scheme to stabilize the training and reduce the likelihood of a mode collapse. The basic idea of our approach is to split convolutional filters into additive high and low frequency parts, while shifting weight updates from low to high during the training. Intuitively, this method forces GANs to learn low frequency coarse image structures before descending into fine (high frequency) details. We also show, that the use of the proposed soft octave convolutions reduces common artifacts in the frequency domain of generated images. Our approach is orthogonal and complementary to existing stabilization methods and can simply be plugged into any CNN based GAN architecture. Experiments on the CelebA dataset show the effectiveness of the proposed method.
△ Less
Submitted 17 December, 2020; v1 submitted 29 May, 2019;
originally announced May 2019.
-
Towards Massively Parallel Computations in Algebraic Geometry
Authors:
Janko Boehm,
Wolfram Decker,
Anne Frühbis-Krüger,
Franz-Josef Pfreundt,
Mirko Rahn,
Lukas Ristau
Abstract:
Introducing parallelism and exploring its use is still a fundamental challenge for the computer algebra community. In high performance numerical simulation, on the other hand, transparent environments for distributed computing which follow the principle of separating coordination and computation have been a success story for many years. In this paper, we explore the potential of using this princip…
▽ More
Introducing parallelism and exploring its use is still a fundamental challenge for the computer algebra community. In high performance numerical simulation, on the other hand, transparent environments for distributed computing which follow the principle of separating coordination and computation have been a success story for many years. In this paper, we explore the potential of using this principle in the context of computer algebra. More precisely, we combine two well-established systems: The mathematics we are interested in is implemented in the computer algebra system Singular, whose focus is on polynomial computations, while the coordination is left to the workflow management system GPI-Space, which relies on Petri nets as its mathematical modeling language, and has been successfully used for coordinating the parallel execution (autoparallelization) of academic codes as well as for commercial software in application areas such as seismic data processing. The result of our efforts is a major step towards a framework for massively parallel computations in the application areas of Singular, specifically in commutative algebra and algebraic geometry. As a first test case for this framework, we have modeled and implemented a hybrid smoothness test for algebraic varieties which combines ideas from Hironaka's celebrated desingularization proof with the classical Jacobian criterion. Applying our implementation to two examples originating from current research in algebraic geometry, one of which cannot be handled by other means, we illustrate the behavior of the smoothness test within our framework, and investigate how the computations scale up to 256 cores.
△ Less
Submitted 29 August, 2018;
originally announced August 2018.
-
Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant
Authors:
Dominik Marek Loroch,
Franz-Josef Pfreundt,
Norbert Wehn,
Janis Keuper
Abstract:
Deep learning is finding its way into the embedded world with applications such as autonomous driving, smart sensors and aug- mented reality. However, the computation of deep neural networks is demanding in energy, compute power and memory. Various approaches have been investigated to reduce the necessary resources, one of which is to leverage the sparsity occurring in deep neural networks due to…
▽ More
Deep learning is finding its way into the embedded world with applications such as autonomous driving, smart sensors and aug- mented reality. However, the computation of deep neural networks is demanding in energy, compute power and memory. Various approaches have been investigated to reduce the necessary resources, one of which is to leverage the sparsity occurring in deep neural networks due to the high levels of redundancy in the network parameters. It has been shown that sparsity can be promoted specifically and the achieved sparsity can be very high. But in many cases the methods are evaluated on rather small topologies. It is not clear if the results transfer onto deeper topologies. In this paper, the TensorQuant toolbox has been extended to offer a platform to investigate sparsity, especially in deeper models. Several practical relevant topologies for varying classification problem sizes are investigated to show the differences in sparsity for activations, weights and gradients.
△ Less
Submitted 27 August, 2018;
originally announced August 2018.
-
TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization
Authors:
Dominik Marek Loroch,
Norbert Wehn,
Franz-Josef Pfreundt,
Janis Keuper
Abstract:
Recent research implies that training and inference of deep neural networks (DNN) can be computed with low precision numerical representations of the training/test data, weights and gradients without a general loss in accuracy. The benefit of such compact representations is twofold: they allow a significant reduction of the communication bottleneck in distributed DNN training and faster neural net…
▽ More
Recent research implies that training and inference of deep neural networks (DNN) can be computed with low precision numerical representations of the training/test data, weights and gradients without a general loss in accuracy. The benefit of such compact representations is twofold: they allow a significant reduction of the communication bottleneck in distributed DNN training and faster neural network implementations on hardware accelerators like FPGAs. Several quantization methods have been proposed to map the original 32-bit floating point problem to low-bit representations. While most related publications validate the proposed approach on a single DNN topology, it appears to be evident, that the optimal choice of the quantization method and number of coding bits is topology dependent. To this end, there is no general theory available, which would allow users to derive the optimal quantization during the design of a DNN topology. In this paper, we present a quantization tool box for the TensorFlow framework. TensorQuant allows a transparent quantization simulation of existing DNN topologies during training and inference. TensorQuant supports generic quantization methods and allows experimental evaluation of the impact of the quantization on single layers as well as on the full topology. In a first series of experiments with TensorQuant, we show an analysis of fix-point quantizations of popular CNN topologies.
△ Less
Submitted 13 October, 2017;
originally announced October 2017.
-
Using GPI-2 for Distributed Memory Paralleliziation of the Caffe Toolbox to Speed up Deep Neural Network Training
Authors:
Martin Kuehn,
Janis Keuper,
Franz-Josef Pfreundt
Abstract:
Deep Neural Network (DNN) are currently of great inter- est in research and application. The training of these net- works is a compute intensive and time consuming task. To reduce training times to a bearable amount at reasonable cost we extend the popular Caffe toolbox for DNN with an efficient distributed memory communication pattern. To achieve good scalability we emphasize the overlap of compu…
▽ More
Deep Neural Network (DNN) are currently of great inter- est in research and application. The training of these net- works is a compute intensive and time consuming task. To reduce training times to a bearable amount at reasonable cost we extend the popular Caffe toolbox for DNN with an efficient distributed memory communication pattern. To achieve good scalability we emphasize the overlap of computation and communication and prefer fine granu- lar synchronization patterns over global barriers. To im- plement these communication patterns we rely on the the Global address space Programming Interface version 2 (GPI-2) communication library. This interface provides a light-weight set of asynchronous one-sided communica- tion primitives supplemented by non-blocking fine gran- ular data synchronization mechanisms. Therefore, Caf- feGPI is the name of our parallel version of Caffe. First benchmarks demonstrate better scaling behavior com- pared with other extensions, e.g., the Intel TM Caffe. Even within a single symmetric multiprocessing machine with four graphics processing units, the CaffeGPI scales bet- ter than the standard Caffe toolbox. These first results demonstrate that the use of standard High Performance Computing (HPC) hardware is a valid cost saving ap- proach to train large DDNs. I/O is an other bottleneck to work with DDNs in a standard parallel HPC setting, which we will consider in more detail in a forthcoming paper.
△ Less
Submitted 18 August, 2017; v1 submitted 31 May, 2017;
originally announced June 2017.
-
Distributed Training of Deep Neural Networks: Theoretical and Practical Limits of Parallel Scalability
Authors:
Janis Keuper,
Franz-Josef Pfreundt
Abstract:
This paper presents a theoretical analysis and practical evaluation of the main bottlenecks towards a scalable distributed solution for the training of Deep Neuronal Networks (DNNs). The presented results show, that the current state of the art approach, using data-parallelized Stochastic Gradient Descent (SGD), is quickly turning into a vastly communication bound problem. In addition, we present…
▽ More
This paper presents a theoretical analysis and practical evaluation of the main bottlenecks towards a scalable distributed solution for the training of Deep Neuronal Networks (DNNs). The presented results show, that the current state of the art approach, using data-parallelized Stochastic Gradient Descent (SGD), is quickly turning into a vastly communication bound problem. In addition, we present simple but fixed theoretic constraints, preventing effective scaling of DNN training beyond only a few dozen nodes. This leads to poor scalability of DNN training in most practical scenarios.
△ Less
Submitted 5 December, 2016; v1 submitted 22 September, 2016;
originally announced September 2016.
-
Balancing the Communication Load of Asynchronously Parallelized Machine Learning Algorithms
Authors:
Janis Keuper,
Franz-Josef Pfreundt
Abstract:
Stochastic Gradient Descent (SGD) is the standard numerical method used to solve the core optimization problem for the vast majority of machine learning (ML) algorithms. In the context of large scale learning, as utilized by many Big Data applications, efficient parallelization of SGD is in the focus of active research. Recently, we were able to show that the asynchronous communication paradigm ca…
▽ More
Stochastic Gradient Descent (SGD) is the standard numerical method used to solve the core optimization problem for the vast majority of machine learning (ML) algorithms. In the context of large scale learning, as utilized by many Big Data applications, efficient parallelization of SGD is in the focus of active research. Recently, we were able to show that the asynchronous communication paradigm can be applied to achieve a fast and scalable parallelization of SGD. Asynchronous Stochastic Gradient Descent (ASGD) outperforms other, mostly MapReduce based, parallel algorithms solving large scale machine learning problems. In this paper, we investigate the impact of asynchronous communication frequency and message size on the performance of ASGD applied to large scale ML on HTC cluster and cloud environments. We introduce a novel algorithm for the automatic balancing of the asynchronous communication load, which allows to adapt ASGD to changing network bandwidths and latencies.
△ Less
Submitted 5 October, 2015;
originally announced October 2015.
-
Asynchronous Parallel Stochastic Gradient Descent - A Numeric Core for Scalable Distributed Machine Learning Algorithms
Authors:
Janis Keuper,
Franz-Josef Pfreundt
Abstract:
The implementation of a vast majority of machine learning (ML) algorithms boils down to solving a numerical optimization problem. In this context, Stochastic Gradient Descent (SGD) methods have long proven to provide good results, both in terms of convergence and accuracy. Recently, several parallelization approaches have been proposed in order to scale SGD to solve very large ML problems. At thei…
▽ More
The implementation of a vast majority of machine learning (ML) algorithms boils down to solving a numerical optimization problem. In this context, Stochastic Gradient Descent (SGD) methods have long proven to provide good results, both in terms of convergence and accuracy. Recently, several parallelization approaches have been proposed in order to scale SGD to solve very large ML problems. At their core, most of these approaches are following a map-reduce scheme. This paper presents a novel parallel updating algorithm for SGD, which utilizes the asynchronous single-sided communication paradigm. Compared to existing methods, Asynchronous Parallel Stochastic Gradient Descent (ASGD) provides faster (or at least equal) convergence, close to linear scaling and stable accuracy.
△ Less
Submitted 5 October, 2015; v1 submitted 19 May, 2015;
originally announced May 2015.