Showing 1–34 of 34 results for author: Baader, M

Searching in archive cs.

  1. arXiv:2406.05670  [pdf, other]

    cs.LG cs.CR cs.CV

    Certified Robustness to Data Poisoning in Gradient-Based Training

    Authors: Philip Sosnin, Mark N. Müller, Maximilian Baader, Calvin Tsay, Matthew Wicker

    Abstract: Modern machine learning pipelines leverage large amounts of public data, making it infeasible to guarantee data quality and leaving models open to poisoning and backdoor attacks. However, provably bounding model behavior under such attacks remains an open problem. In this work, we address this challenge and develop the first framework providing provable guarantees on the behavior of models trained…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: 15 pages, 5 figures

  2. arXiv:2405.15586  [pdf, other]

    cs.LG cs.DC

    DAGER: Exact Gradient Inversion for Large Language Models

    Authors: Ivo Petrov, Dimitar I. Dimitrov, Maximilian Baader, Mark Niklas Müller, Martin Vechev

    Abstract: Federated learning works by aggregating locally computed gradients from multiple clients, thus enabling collaborative training without sharing private client data. However, prior work has shown that the data can actually be recovered by the server using so-called gradient inversion attacks. While these attacks perform well when applied on images, they are limited in the text domain and only permit…

    Submitted 24 May, 2024; originally announced May 2024.

    ACM Class: I.2.7; I.2.11

  3. arXiv:2403.07095  [pdf, other]

    cs.LG

    Overcoming the Paradox of Certified Training with Gaussian Smoothing

    Authors: Stefan Balauca, Mark Niklas Müller, Yuhao Mao, Maximilian Baader, Marc Fischer, Martin Vechev

    Abstract: Training neural networks with high certified accuracy against adversarial examples remains an open problem despite significant efforts. While certification methods can effectively leverage tight convex relaxations for bound computation, in training, these methods perform worse than looser relaxations. Prior work hypothesized that this is caused by the discontinuity and perturbation sensitivity of…

    Submitted 25 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  4. arXiv:2403.03945  [pdf, other]

    cs.LG cs.CR cs.DC

    SPEAR:Exact Gradient Inversion of Batches in Federated Learning

    Authors: Dimitar I. Dimitrov, Maximilian Baader, Mark Niklas Müller, Martin Vechev

    Abstract: Federated learning is a framework for collaborative machine learning where clients only share gradient updates and not their private data with a server. However, it was recently shown that gradient inversion attacks can reconstruct this data from the shared gradients. In the important honest-but-curious setting, existing attacks enable exact reconstruction only for a batch size of $b=1$, with larg…

    Submitted 3 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    ACM Class: I.2.11

  5. arXiv:2402.02823  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    Evading Data Contamination Detection for Language Models is (too) Easy

    Authors: Jasper Dekoninck, Mark Niklas Müller, Maximilian Baader, Marc Fischer, Martin Vechev

    Abstract: Large language models are widespread, with their performance on benchmarks frequently guiding user preferences for one model over another. However, the vast amount of data these models are trained on can inadvertently lead to contamination with public benchmarks, thus compromising performance measurements. While recently developed contamination detection methods try to address this issue, they ove…

    Submitted 12 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  6. arXiv:2311.04015  [pdf, ps, other]

    cs.LG cs.AI

    Expressivity of ReLU-Networks under Convex Relaxations

    Authors: Maximilian Baader, Mark Niklas Müller, Yuhao Mao, Martin Vechev

    Abstract: Convex relaxations are a key component of training and certifying provably safe neural networks. However, despite substantial progress, a wide and poorly understood accuracy gap to standard networks remains, raising the question of whether this is due to fundamental limitations of convex relaxations. Initial work investigating this question focused on the simple and widely used IBP relaxation. It…

    Submitted 7 November, 2023; originally announced November 2023.

  7. arXiv:2309.03220  [pdf]

    cs.HC cs.NE

    Conversational Swarm Intelligence, a Pilot Study

    Authors: Louis Rosenberg, Gregg Willcox, Hans Schumann, Miles Bader, Ganesh Mani, Kokoro Sagae, Devang Acharya, Yuxin Zheng, Andrew Kim, Jialing Deng

    Abstract: Conversational Swarm Intelligence (CSI) is a new method for enabling large human groups to hold real-time networked conversations using a technique modeled on the dynamics of biological swarms. Through the novel use of conversational agents powered by Large Language Models (LLMs), the CSI structure simultaneously enables local dialog among small deliberative groups and global propagation of conver…

    Submitted 31 August, 2023; originally announced September 2023.

    Comments: Pending for conference, Collective Intelligence 2023 (ACM)

    ACM Class: H.5.2; H.1.2

  8. Abstraqt: Analysis of Quantum Circuits via Abstract Stabilizer Simulation

    Authors: Benjamin Bichsel, Anouk Paradis, Maximilian Baader, Martin Vechev

    Abstract: Stabilizer simulation can efficiently simulate an important class of quantum circuits consisting exclusively of Clifford gates. However, all existing extensions of this simulation to arbitrary quantum circuits including non-Clifford gates suffer from an exponential runtime. To address this challenge, we present a novel approach for efficient stabilizer simulation on arbitrary quantum circuits, a…

    Submitted 14 November, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: 22 pages

    Journal ref: Quantum 7, 1185 (2023)

  9. arXiv:2112.05235  [pdf, other]

    cs.LG cs.AI

    The Fundamental Limits of Interval Arithmetic for Neural Networks

    Authors: Matthew Mirman, Maximilian Baader, Martin Vechev

    Abstract: Interval analysis (or interval bound propagation, IBP) is a popular technique for verifying and training provably robust deep neural networks, a fundamental challenge in the area of reliable machine learning. However, despite substantial efforts, progress on addressing this key challenge has stagnated, calling into question whether interval arithmetic is a viable path forward. In this paper we p…

    Submitted 9 December, 2021; originally announced December 2021.

    MSC Class: 68T07

  10. arXiv:2111.13650  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    Latent Space Smoothing for Individually Fair Representations

    Authors: Momchil Peychev, Anian Ruoss, Mislav Balunović, Maximilian Baader, Martin Vechev

    Abstract: Fair representation learning transforms user data into a representation that ensures fairness and utility regardless of the downstream application. However, learning individually fair representations, i.e., guaranteeing that similar individuals are treated similarly, remains challenging in high-dimensional settings such as computer vision. In this work, we introduce LASSI, the first representation…

    Submitted 26 July, 2022; v1 submitted 26 November, 2021; originally announced November 2021.

    Comments: ECCV 2022

  11. arXiv:2110.15804  [pdf, other]

    cs.SE cs.AR cs.MS

    Doubt and Redundancy Kill Soft Errors -- Towards Detection and Correction of Silent Data Corruption in Task-based Numerical Software

    Authors: Philipp Samfass, Tobias Weinzierl, Anne Reinarz, Michael Bader

    Abstract: Resilient algorithms in high-performance computing are subject to rigorous non-functional constraints. Resiliency must not increase the runtime, memory footprint or I/O demands too significantly. We propose a task-based soft error detection scheme that relies on error criteria per task outcome. They formalise how ``dubious'' an outcome is, i.e. how likely it contains an error. Our whole simulation…

    Submitted 18 October, 2021; originally announced October 2021.

  12. arXiv:2108.10565  [pdf, other]

    cs.DC cs.MS physics.comp-ph physics.geo-ph

    An Efficient ADER-DG Local Time Stepping Scheme for 3D HPC Simulation of Seismic Waves in Poroelastic Media

    Authors: Sebastian Wolf, Martin Galis, Carsten Uphoff, Alice-Agnes Gabriel, Peter Moczo, David Gregor, Michael Bader

    Abstract: Many applications from geosciences require simulations of seismic waves in porous media. Biot's theory of poroelasticity describes the coupling between solid and fluid phases and introduces a stiff source term, thereby increasing computational cost and motivating efficient methods utilising High-Performance Computing. We present a novel realisation of the discontinuous Galerkin scheme with Arbitra…

    Submitted 1 March, 2022; v1 submitted 24 August, 2021; originally announced August 2021.

    Comments: 37 pages, 18 figures, published in the Journal of Computational Physics

    Journal ref: Journal of Computational Physics: Volume 455, 2022

  13. arXiv:2107.14552  [pdf, other]

    cs.MS math.NA

    High Performance Uncertainty Quantification with Parallelized Multilevel Markov Chain Monte Carlo

    Authors: Linus Seelinger, Anne Reinarz, Leonhard Rannabauer, Michael Bader, Peter Bastian, Robert Scheichl

    Abstract: Numerical models of complex real-world phenomena often necessitate High Performance Computing (HPC). Uncertainties increase problem dimensionality further and pose even greater challenges. We present a parallelization strategy for multilevel Markov chain Monte Carlo, a state-of-the-art, algorithmically scalable Uncertainty Quantification (UQ) algorithm for Bayesian inverse problems, and a new so…

    Submitted 30 July, 2021; originally announced July 2021.

  14. arXiv:2107.06640  [pdf, other]

    physics.comp-ph cs.DC cs.MS physics.geo-ph

    3D Acoustic-Elastic Coupling with Gravity: The Dynamics of the 2018 Palu, Sulawesi Earthquake and Tsunami

    Authors: Lukas Krenz, Carsten Uphoff, Thomas Ulrich, Alice-Agnes Gabriel, Lauren S. Abrahams, Eric M. Dunham, Michael Bader

    Abstract: We present a highly scalable 3D fully-coupled Earth & ocean model of earthquake rupture and tsunami generation. We model seismic, acoustic and surface gravity wave propagation in elastic (Earth) and acoustic (ocean) materials sourced by physics-based non-linear earthquake dynamic rupture. Complicated geometries, including high-resolution bathymetry, coastlines and segmented earthquake faults are d…

    Submitted 22 November, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

    Comments: 13 pages, 6 figures; European Commission Project: ChEESE - Centre of Excellence for Exascale in Solid Earth (EC-H2020-823844)

    Journal ref: SC '21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2021, Article No.: 63

  15. arXiv:2107.00228  [pdf, other]

    cs.LG cs.CV

    Scalable Certified Segmentation via Randomized Smoothing

    Authors: Marc Fischer, Maximilian Baader, Martin Vechev

    Abstract: We present a new certification method for image and point cloud segmentation based on randomized smoothing. The method leverages a novel scalable algorithm for prediction and certification that correctly accounts for multiple testing, necessary for ensuring statistical guarantees. The key to our approach is reliance on established multiple-testing correction mechanisms as well as the ability to ab…

    Submitted 27 July, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: ICML'21

  16. arXiv:2102.06700  [pdf, other]

    cs.LG cs.AI

    On the Paradox of Certified Training

    Authors: Nikola Jovanović, Mislav Balunović, Maximilian Baader, Martin Vechev

    Abstract: Certified defenses based on convex relaxations are an established technique for training provably robust models. The key component is the choice of relaxation, varying from simple intervals to tight polyhedra. Counterintuitively, loose interval-based training often leads to higher certified robustness than what can be achieved with tighter relaxations, which is a well-known but poorly understood p…

    Submitted 12 October, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: Published in Transactions on Machine Learning Research (TMLR) 10/2022

  17. arXiv:2010.08770  [pdf]

    cs.SD cs.CL eess.AS

    Studying the Similarity of COVID-19 Sounds based on Correlation Analysis of MFCC

    Authors: Mohamed Bader, Ismail Shahin, Abdelfatah Hassan

    Abstract: Recently there has been a formidable work which has been put up from the people who are working in the frontlines such as hospitals, clinics, and labs alongside researchers and scientists who are also putting tremendous efforts in the fight against COVID-19 pandemic. Due to the preposterous spread of the virus, the integration of the artificial intelligence has taken a considerable part in the hea…

    Submitted 17 October, 2020; originally announced October 2020.

    Comments: 5 pages, 4 figures, conference paper

  18. arXiv:2009.09318  [pdf, ps, other]

    cs.LG cs.AI cs.CV stat.ML

    Efficient Certification of Spatial Robustness

    Authors: Anian Ruoss, Maximilian Baader, Mislav Balunović, Martin Vechev

    Abstract: Recent work has exposed the vulnerability of computer vision models to vector field attacks. Due to the widespread usage of such models in safety-critical applications, it is crucial to quantify their robustness against such spatial transformations. However, existing work only provides empirical robustness quantification against vector field deformations via adversarial attacks, which lack provabl…

    Submitted 30 January, 2021; v1 submitted 19 September, 2020; originally announced September 2020.

    Comments: Conference Paper at AAAI 2021

  19. TeaMPI -- Replication-based Resilience without the (Performance) Pain

    Authors: Philipp Samfass, Tobias Weinzierl, Benjamin Hazelwood, Michael Bader

    Abstract: In an era where we can not afford to checkpoint frequently, replication is a generic way forward to construct numerical simulations that can continue to run even if hardware parts fail. Yet, replication often is not employed on larger scales, as naïvely mirroring a computation once effectively halves the machine size, and as keeping replicated simulations consistent with each other is not trivial.…

    Submitted 1 July, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

  20. An Environment for Sustainable Research Software in Germany and Beyond: Current State, Open Challenges, and Call for Action

    Authors: Hartwig Anzt, Felix Bach, Stephan Druskat, Frank Löffler, Axel Loewe, Bernhard Y. Renard, Gunnar Seemann, Alexander Struck, Elke Achhammer, Piush Aggarwal, Franziska Appel, Michael Bader, Lutz Brusch, Christian Busse, Gerasimos Chourdakis, Piotr W. Dabrowski, Peter Ebert, Bernd Flemisch, Sven Friedl, Bernadette Fritzsch, Maximilian D. Funk, Volker Gast, Florian Goth, Jean-Noël Grad, Sibylle Hermann, et al. (18 additional authors not shown)

    Abstract: Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively. In other words, software…

    Submitted 5 May, 2020; v1 submitted 27 April, 2020; originally announced May 2020.

    Comments: Official position paper 001 of de-RSE e.V. - Society for Research Software (https://de-rse.org) --- 16 pages, 1 figure + 1 page supplementary material, 4 figures --- Submitted to the F1000 Research Science Policy Research Gateway on 2020-04-03

    Journal ref: F1000Research 2020

  21. arXiv:2003.12787  [pdf, other]

    cs.MS

    Vectorization and Minimization of Memory Footprint for Linear High-Order Discontinuous Galerkin Schemes

    Authors: Jean-Matthieu Gallard, Leonhard Rannabauer, Anne Reinarz, Michael Bader

    Abstract: We present a sequence of optimizations to the performance-critical compute kernels of the high-order discontinuous Galerkin solver of the hyperbolic PDE engine ExaHyPE -- successively tackling bottlenecks due to SIMD operations, cache hierarchies and restrictions in the software design. Starting from a generic scalar implementation of the numerical scheme, our first optimized variant applies sta…

    Submitted 28 March, 2020; originally announced March 2020.

    Comments: PDSEC 2020

  22. arXiv:2002.12463  [pdf, other]

    cs.LG cs.CR stat.ML

    Certified Defense to Image Transformations via Randomized Smoothing

    Authors: Marc Fischer, Maximilian Baader, Martin Vechev

    Abstract: We extend randomized smoothing to cover parameterized transformations (e.g., rotations, translations) and certify robustness in the parameter space (e.g., rotation angle). This is particularly challenging as interpolation and rounding effects mean that image transformations do not compose, in turn preventing direct certification of the perturbed image (unlike certification with $\ell^p$ norms). We…

    Submitted 25 August, 2021; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: Conference Paper at NeurIPS 2020

  23. arXiv:1911.06817  [pdf, other]

    cs.MS cs.SE

    Role-Oriented Code Generation in an Engine for Solving Hyperbolic PDE Systems

    Authors: Jean-Matthieu Gallard, Lukas Krenz, Leonhard Rannabauer, Anne Reinarz, Michael Bader

    Abstract: The development of a high performance PDE solver requires the combined expertise of interdisciplinary teams with respect to application domain, numerical scheme and low-level optimization. In this paper, we present how the ExaHyPE engine facilitates the collaboration of such teams by isolating three roles: application, algorithms, and optimization expert. We thus support team members in letting th…

    Submitted 28 March, 2020; v1 submitted 15 November, 2019; originally announced November 2019.

    Comments: SC19 SE-HER

  24. arXiv:1910.06477  [pdf, other]

    math.NA cs.DC physics.comp-ph physics.geo-ph

    A stable discontinuous Galerkin method for the perfectly matched layer for elastodynamics in first order form

    Authors: Kenneth Duru, Leonhard Rannabauer, Alice-Agnes Gabriel, Gunilla Kreiss, Michael Bader

    Abstract: We present a stable discontinuous Galerkin (DG) method with a perfectly matched layer (PML) for three and two space dimensional linear elastodynamics, in velocity-stress formulation, subject to well-posed linear boundary conditions. First, we consider the elastodynamics equation, in a cuboidal domain, and derive an unsplit PML truncating the domain using complex coordinate stretching. Leveraging t…

    Submitted 6 January, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

  25. arXiv:1909.13846  [pdf, other]

    cs.LG stat.ML

    Universal Approximation with Certified Networks

    Authors: Maximilian Baader, Matthew Mirman, Martin Vechev

    Abstract: Training neural networks to be certifiably robust is critical to ensure their safety against adversarial attacks. However, it is currently very difficult to train a neural network that is both accurate and certifiably robust. In this work we take a step towards addressing this challenge. We prove that for every continuous function $f$, there exists a network $n$ such that: (i) $n$ approximates…

    Submitted 14 January, 2020; v1 submitted 30 September, 2019; originally announced September 2019.

    Comments: ICLR 2020

  26. Lightweight Task Offloading Exploiting MPI Wait Times for Parallel Adaptive Mesh Refinement

    Authors: Philipp Samfass, Tobias Weinzierl, Dominic E. Charrier, Michael Bader

    Abstract: Balancing the workload of sophisticated simulations is inherently difficult, since we have to balance both computational workload and memory footprint over meshes that can change any time or yield unpredictable cost per mesh entity, while modern supercomputers and their interconnects start to exhibit fluctuating performance. We propose a novel lightweight balancing technique for MPI+X to accompany…

    Submitted 14 April, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

  27. arXiv:1907.02658  [pdf, other]

    math.NA cs.DC physics.geo-ph

    A stable discontinuous Galerkin method for linear elastodynamics in 3D geometrically complex media using physics based numerical fluxes

    Authors: Kenneth Duru, Leonhard Rannabauer, Alice-Agnes Gabriel, On Ki Angel Ling, Heiner Igel, Michael Bader

    Abstract: High order accurate and explicit time-stable solvers are well suited for hyperbolic wave propagation problems. As a result of the complexities of real geometries, internal interfaces and nonlinear boundary and interface conditions, discontinuities and sharp wave fronts may become fundamental features of the solution. Thus, geometrically flexible and adaptive numerical algorithms are critical for h…

    Submitted 10 April, 2021; v1 submitted 4 July, 2019; originally announced July 2019.

  28. ExaHyPE: An Engine for Parallel Dynamically Adaptive Simulations of Wave Problems

    Authors: Anne Reinarz, Dominic E. Charrier, Michael Bader, Luke Bovard, Michael Dumbser, Kenneth Duru, Francesco Fambri, Alice-Agnes Gabriel, Jean-Matthieu Gallard, Sven Köppel, Lukas Krenz, Leonhard Rannabauer, Luciano Rezzolla, Philipp Samfass, Maurizio Tavelli, Tobias Weinzierl

    Abstract: ExaHyPE ("An Exascale Hyperbolic PDE Engine") is a software engine for solving systems of first-order hyperbolic partial differential equations (PDEs). Hyperbolic PDEs are typically derived from the conservation laws of physics and are useful in a wide range of application areas. Applications powered by ExaHyPE can be run on a student's laptop, but are also able to exploit thousands of processor c…

    Submitted 18 May, 2020; v1 submitted 20 May, 2019; originally announced May 2019.

  29. arXiv:1903.11521  [pdf, other]

    cs.MS cs.DC

    Yet Another Tensor Toolbox for discontinuous Galerkin methods and other applications

    Authors: Carsten Uphoff, Michael Bader

    Abstract: The numerical solution of partial differential equations is at the heart of many grand challenges in supercomputing. Solvers based on high-order discontinuous Galerkin (DG) discretisation have been shown to scale on large supercomputers with excellent performance and efficiency, if the implementation exploits all levels of parallelism and is tailored to the specific architecture. However, every ye…

    Submitted 27 March, 2019; originally announced March 2019.

    Comments: Submitted to ACM TOMS

  30. Exploiting the Space Filling Curve Ordering of Particles in the Neighbour Search of Gadget3

    Authors: Antonio Ragagnin, Nikola Tchipev, Michael Bader, Klaus Dolag, Nicolay J. Hammer

    Abstract: Gadget3 is nowadays one of the most frequently used high performing parallel codes for cosmological hydrodynamical simulations. Recent analyses have shown that the Neighbour Search process of Gadget3 is one of the most time-consuming parts. Thus, a considerable speedup can be expected from improvements of the underlying algorithms. In this work we propose a novel approach for speeding up the N…

    Submitted 23 October, 2018; originally announced October 2018.

    Comments: 17 pages, 6 figures, published at Parallel Computing (ParCo)

  31. arXiv:1810.07000  [pdf, other]

    physics.comp-ph cs.DC

    Influence of A-Posteriori Subcell Limiting on Fault Frequency in Higher-Order DG Schemes

    Authors: Anne Reinarz, Jean-Matthieu Gallard, Michael Bader

    Abstract: Soft error rates are increasing as modern architectures require increasingly small features at low voltages. Due to the large number of components used in HPC architectures, these are particularly vulnerable to soft errors. Hence, when designing applications that run for long time periods on large machines, algorithmic resilience must be taken into account. In this paper we analyse the inherent re…

    Submitted 21 May, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: 2018 IEEE/ACM 8th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS)

  32. Studies on the energy and deep memory behaviour of a cache-oblivious, task-based hyperbolic PDE solver

    Authors: Dominic E. Charrier, Benjamin Hazelwood, Ekaterina Tutlyaeva, Michael Bader, Michael Dumbser, Andrey Kudryavtsev, Alexander Moskovsky, Tobias Weinzierl

    Abstract: We study the performance behaviour of a seismic simulation using the ExaHyPE engine with a specific focus on memory characteristics and energy needs. ExaHyPE combines dynamically adaptive mesh refinement (AMR) with ADER-DG. It is parallelized using tasks, and it is cache efficient. AMR plus ADER-DG yields a task graph which is highly dynamic in nature and comprises both arithmetically expensive ta…

    Submitted 25 March, 2019; v1 submitted 9 October, 2018; originally announced October 2018.

  33. arXiv:1304.5878  [pdf, other]

    cs.RO

    Visual Room-Awareness for Humanoid Robot Self-Localization

    Authors: Markus Bader, Johann Prankl, Markus Vincze

    Abstract: Humanoid robots without internal sensors such as a compass tend to lose their orientation after a fall. Furthermore, re-initialisation is often ambiguous due to symmetric man-made environments. The room-awareness module proposed here is inspired by the results of psychological experiments and improves existing self-localization strategies by mapping and matching the visual background with colour h…

    Submitted 22 April, 2013; originally announced April 2013.

    Comments: Part of the OAGM/AAPR 2013 proceedings (1304.1876)

    Report number: OAGM-AAPR/2013/04

  34. arXiv:1011.3583  [pdf]

    cs.DC cs.GR cs.PF

    Fast GPGPU Data Rearrangement Kernels using CUDA

    Authors: Michael Bader, Hans-Joachim Bungartz, Dheevatsa Mudigere, Srihari Narasimhan, Babu Narayanan

    Abstract: Many high performance-computing algorithms are bandwidth limited, hence the need for optimal data rearrangement kernels as well as their easy integration into the rest of the application. In this work, we have built a CUDA library of fast kernels for a set of data rearrangement operations. In particular, we have built generic kernels for rearranging m dimensional data into n dimensions, including…

    Submitted 15 November, 2010; originally announced November 2010.