Search | arXiv e-print repository

A Scalable Approach to Performing Multiplication and Matrix Dot-Products in Unary

Abstract: Stochastic computing is a paradigm in which logical operations are performed on randomly generated bit streams. Complex arithmetic operations can be executed by simple logic circuits, resulting in a much smaller area footprint compared to conventional binary counterparts. However, the random or pseudorandom sources required for generating the bit streams are costly in terms of area and offset the… ▽ More Stochastic computing is a paradigm in which logical operations are performed on randomly generated bit streams. Complex arithmetic operations can be executed by simple logic circuits, resulting in a much smaller area footprint compared to conventional binary counterparts. However, the random or pseudorandom sources required for generating the bit streams are costly in terms of area and offset the advantages. Additionally, due to the inherent randomness, the computation lacks precision, limiting the applicability of this paradigm. Importantly, achieving reasonable accuracy in stochastic computing involves high latency. Recently, deterministic approaches to stochastic computing have been proposed, demonstrating that randomness is not a requirement. By structuring the computation deterministically, exact results can be obtained, and the latency greatly reduced. The bit stream generated adheres to a "unary" encoding, retaining the non-positional nature of the bits while discarding the random bit generation of traditional stochastic computing. This deterministic approach overcomes many drawbacks of stochastic computing, although the latency increases quadratically with each level of logic, becoming unmanageable beyond a few levels. In this paper, we present a method for approximating the results of the deterministic method while maintaining low latency at each level. This improvement comes at the cost of additional logic, but we demonstrate that the increase in area scales with the square root of n, where n represents the equivalent number of binary bits of precision. Our new approach is general, efficient, composable, and applicable to all arithmetic operations performed with stochastic logic. We show that this approach outperforms other stochastic designs for matrix multiplication (dot-product), which is an integral step in nearly all machine learning algorithms. △ Less

Submitted 5 July, 2023; originally announced July 2023.

Comments: 25 pages, 8 figures

arXiv:2307.00686 [pdf, other]

Neural network execution using nicked DNA and microfluidics

Authors: Arnav Solanki, Zak Griffin, Purab Ranjan Sutradhar, Amlan Ganguly, Marc D. Riedel

Abstract: DNA has been discussed as a potential medium for data storage. Potentially it could be denser, could consume less energy, and could be more durable than conventional storage media such as hard drives, solid-state storage, and optical media. However, computing on data stored in DNA is a largely unexplored challenge. This paper proposes an integrated circuit (IC) based on microfluidics that can perf… ▽ More DNA has been discussed as a potential medium for data storage. Potentially it could be denser, could consume less energy, and could be more durable than conventional storage media such as hard drives, solid-state storage, and optical media. However, computing on data stored in DNA is a largely unexplored challenge. This paper proposes an integrated circuit (IC) based on microfluidics that can perform complex operations such as artificial neural network (ANN) computation on data stored in DNA. It computes entirely in the molecular domain without converting data to electrical form, making it a form of in-memory computing on DNA. The computation is achieved by topologically modifying DNA strands through the use of enzymes called nickases. A novel scheme is proposed for representing data stochastically through the concentration of the DNA molecules that are nicked at specific sites. The paper provides details of the biochemical design, as well as the design, layout, and operation of the microfluidics device. Benchmarks are reported on the performance of neural network computation. △ Less

Submitted 2 July, 2023; originally announced July 2023.

Comments: 24 pages, 11 figures

arXiv:2303.11705 [pdf, other]

A Single-Step Multiclass SVM based on Quantum Annealing for Remote Sensing Data Classification

Authors: Amer Delilbasic, Bertrand Le Saux, Morris Riedel, Kristel Michielsen, Gabriele Cavallaro

Abstract: In recent years, the development of quantum annealers has enabled experimental demonstrations and has increased research interest in applications of quantum annealing, such as in quantum machine learning and in particular for the popular quantum SVM. Several versions of the quantum SVM have been proposed, and quantum annealing has been shown to be effective in them. Extensions to multiclass proble… ▽ More In recent years, the development of quantum annealers has enabled experimental demonstrations and has increased research interest in applications of quantum annealing, such as in quantum machine learning and in particular for the popular quantum SVM. Several versions of the quantum SVM have been proposed, and quantum annealing has been shown to be effective in them. Extensions to multiclass problems have also been made, which consist of an ensemble of multiple binary classifiers. This work proposes a novel quantum SVM formulation for direct multiclass classification based on quantum annealing, called Quantum Multiclass SVM (QMSVM). The multiclass classification problem is formulated as a single Quadratic Unconstrained Binary Optimization (QUBO) problem solved with quantum annealing. The main objective of this work is to evaluate the feasibility, accuracy, and time performance of this approach. Experiments have been performed on the D-Wave Advantage quantum annealer for a classification problem on remote sensing data. The results indicate that, despite the memory demands of the quantum annealer, QMSVM can achieve accuracy that is comparable to standard SVM methods and, more importantly, it scales much more efficiently with the number of training examples, resulting in nearly constant time. This work shows an approach for bringing together classical and quantum computation, solving practical problems in remote sensing with current hardware. △ Less

Submitted 21 March, 2023; originally announced March 2023.

Comments: 12 pages, 10 figures, 3 tables. Submitted to IEEE JSTARS

arXiv:2211.15494 [pdf, other]

Automated Routing of Droplets for DNA Storage on a Digital Microfluidics Platform

Authors: Ajay Manicka, Andrew Stephan, Sriram Chari, Gemma Mendonsa, Peyton Okubo, John Stolzberg-Schray, Anil Reddy, Marc Riedel

Abstract: Technologies for sequencing (reading) and synthesizing (writing) DNA have progressed on a Moore's law-like trajectory over the last three decades. This has motivated the idea of using DNA for data storage. Theoretically, DNA-based storage systems could out-compete all existing forms of archival storage. However, a large gap exists between what is theoretically possible in terms of read and write s… ▽ More Technologies for sequencing (reading) and synthesizing (writing) DNA have progressed on a Moore's law-like trajectory over the last three decades. This has motivated the idea of using DNA for data storage. Theoretically, DNA-based storage systems could out-compete all existing forms of archival storage. However, a large gap exists between what is theoretically possible in terms of read and write speeds and what has been practically demonstrated with DNA. This paper introduces a novel approach to DNA storage, with automated assembly on a digital microfluidic biochip. This technology offers unprecedented parallelism in DNA assembly using a dual library of "symbols" and "linkers". An algorithmic solution is discussed for the problem of managing droplet traffic on the device, with prioritized three-dimensional "A*" routing. An overview is given of the software that was developed for routing a large number of droplets in parallel on the device, minimizing congestion and maximizing throughput. △ Less

Submitted 5 July, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: 15 pages, 23 figures. Submitted to the journal "Lab on a Chip" for publication by the Royal Society of Chemistry

arXiv:2110.14946 [pdf, other]

Towards Large-Scale Rendering of Simulated Crops for Synthetic Ground Truth Generation on Modular Supercomputers

Authors: Dirk Norbert Helmrich, Jens Henrik Göbbert, Mona Giraud, Hanno Scharr, Andrea Schnepf, Morris Riedel

Abstract: Computer Vision problems deal with the semantic extraction of information from camera images. Especially for field crop images, the underlying problems are hard to label and even harder to learn, and the availability of high-quality training data is low. Deep neural networks do a good job of extracting the necessary models from training examples. However, they rely on an abundance of training data… ▽ More Computer Vision problems deal with the semantic extraction of information from camera images. Especially for field crop images, the underlying problems are hard to label and even harder to learn, and the availability of high-quality training data is low. Deep neural networks do a good job of extracting the necessary models from training examples. However, they rely on an abundance of training data that is not feasible to generate or label by expert annotation. To address this challenge, we make use of the Unreal Engine to render large and complex virtual scenes. We rely on the performance of individual nodes by distributing plant simulations across nodes and both generate scenes as well as train neural networks on GPUs, restricting node communication to parallel learning. △ Less

Submitted 28 October, 2021; originally announced October 2021.

Comments: Accepted Poster for the 11th IEEE Symposium on Large Data Analysis and Visualization

ACM Class: I.3; I.4; I.6

arXiv:2108.11976 [pdf, other]

JUWELS Booster -- A Supercomputer for Large-Scale AI Research

Authors: Stefan Kesselheim, Andreas Herten, Kai Krajsek, Jan Ebert, Jenia Jitsev, Mehdi Cherti, Michael Langguth, Bing Gong, Scarlet Stadtler, Amirpasha Mozaffari, Gabriele Cavallaro, Rocco Sedona, Alexander Schug, Alexandre Strube, Roshni Kamath, Martin G. Schultz, Morris Riedel, Thomas Lippert

Abstract: In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Center. With its system architecture, most importantly its large number of powerful Graphics Processing Units (GPUs) and its fast interconnect via InfiniBand, it is an ideal machine for large-scale Artificial Intelligence (AI) research and applications. We detail its s… ▽ More In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Center. With its system architecture, most importantly its large number of powerful Graphics Processing Units (GPUs) and its fast interconnect via InfiniBand, it is an ideal machine for large-scale Artificial Intelligence (AI) research and applications. We detail its system architecture, parallel, distributed model training, and benchmarks indicating its outstanding performance. We exemplify its potential for research application by presenting large-scale AI research highlights from various scientific fields that require such a facility. △ Less

Submitted 30 June, 2021; originally announced August 2021.

Comments: 12 pages, 5 figures. Accepted at ISC 2021, Workshop Deep Learning on Supercomputers. This is a duplicate submission as my previous submission is on hold for several weeks now and my attempts to contact the moderators failed

Report number: 1234567Dummy

arXiv:1310.5095 [pdf, other]

Regularization in Relevance Learning Vector Quantization Using l one Norms

Authors: Martin Riedel, Marika Kästner, Fabrice Rossi, Thomas Villmann

Abstract: We propose in this contribution a method for l one regularization in prototype based relevance learning vector quantization (LVQ) for sparse relevance profiles. Sparse relevance profiles in hyperspectral data analysis fade down those spectral bands which are not necessary for classification. In particular, we consider the sparsity in the relevance profile enforced by LASSO optimization. The latter… ▽ More We propose in this contribution a method for l one regularization in prototype based relevance learning vector quantization (LVQ) for sparse relevance profiles. Sparse relevance profiles in hyperspectral data analysis fade down those spectral bands which are not necessary for classification. In particular, we consider the sparsity in the relevance profile enforced by LASSO optimization. The latter one is obtained by a gradient learning scheme using a differentiable parametrized approximation of the $l_{1}$-norm, which has an upper error bound. We extend this regularization idea also to the matrix learning variant of LVQ as the natural generalization of relevance learning. △ Less

Submitted 18 October, 2013; originally announced October 2013.

Journal ref: 21-th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2013), Bruges : Belgium (2013)

arXiv:cs/0502090 [pdf, ps, other]

UNICORE - From Project Results to Production Grids

Authors: A. Streit, D. Erwin, Th. Lippert, D. Mallmann, R. Menday, M. Rambadt, M. Riedel, M. Romberg, B. Schuller, Ph. Wieder

Abstract: The UNICORE Grid-technology provides a seamless, secure and intuitive access to distributed Grid resources. In this paper we present the recent evolution from project results to production Grids. At the beginning UNICORE was developed as a prototype software in two projects funded by the German research ministry (BMBF). Over the following years, in various European-funded projects, UNICORE evolv… ▽ More The UNICORE Grid-technology provides a seamless, secure and intuitive access to distributed Grid resources. In this paper we present the recent evolution from project results to production Grids. At the beginning UNICORE was developed as a prototype software in two projects funded by the German research ministry (BMBF). Over the following years, in various European-funded projects, UNICORE evolved to a full-grown and well-tested Grid middleware system, which today is used in daily production at many supercomputing centers worldwide. Beyond this production usage, the UNICORE technology serves as a solid basis in many European and International research projects, which use existing UNICORE components to implement advanced features, high level services, and support for applications from a growing range of domains. In order to foster these ongoing developments, UNICORE is available as open source under BSD licence at SourceForge, where new releases are published on a regular basis. This paper is a review of the UNICORE achievements so far and gives a glimpse on the UNICORE roadmap. △ Less

Submitted 24 February, 2005; originally announced February 2005.

Comments: 21 pages, 3 figures

Showing 1–8 of 8 results for author: Riedel, M