Search | arXiv e-print repository

arXiv:2006.12439 [pdf, other]

Fully-parallel Convolutional Neural Network Hardware

Authors: Christiam F. Frasser, Pablo Linares-Serrano, V. Canals, Miquel Roca, T. Serrano-Gotarredona, Josep L. Rossello

Abstract: A new trans-disciplinary knowledge area, Edge Artificial Intelligence or Edge Intelligence, is beginning to receive a tremendous amount of interest from the machine learning community due to the ever-increasing popularization of the Internet of Things (IoT). Unfortunately, the incorporation of AI characteristics into edge computing devices presents the drawbacks of being power and area hungry for typical machine learning techniques such as Convolutional Neural Networks (CNN). In this work, we propose a new power-and-area-efficient architecture for implementing Artificial Neural Networks (ANNs) in hardware, based on the exploitation of the correlation phenomenon in Stochastic Computing (SC) systems. The proposed architecture can solve the difficult implementation challenges that SC presents for CNN applications, such as the high resources used in binary-to-stochastic conversion, the inaccuracy produced by undesired correlation between signals, and the stochastic maximum function implementation. Compared with traditional binary logic implementations, experimental results showed an improvement of 19.6x and 6.3x in terms of speed performance and energy efficiency, respectively, for the FPGA implementation. We have also realized a full VLSI implementation of the proposed SC-CNN architecture, demonstrating that our optimization achieves an 18x area reduction over a previous SC-DNN architecture VLSI implementation in a comparable technological node. For the first time, a fully-parallel CNN such as LENET-5 is embedded and tested in a single FPGA, showing the benefits of using stochastic computing for embedded applications, in contrast to traditional binary logic implementations.

Submitted 22 June, 2020; originally announced June 2020.

Comments: 8 pages, 6 figures, to be submitted to an IEEE journal

MSC Class: I.2.m; B.7.1

arXiv:2006.02505 [pdf, other]

Stochastic-based Neural Network hardware acceleration for an efficient ligand-based virtual screening

Authors: Christian F. Frasser, Carola de Benito, Vincent Canals, Miquel Roca, Pedro J. Ballester, Josep L. Rossello

Abstract: Artificial Neural Networks (ANN) have been popularized in many science and technological areas due to their capacity to solve many complex pattern matching problems. That is the case of Virtual Screening, a research area that studies how to identify those molecular compounds with the highest probability to present biological activity for a therapeutic target. Due to the vast number of small organic compounds and the thousands of targets for which such large-scale screening can potentially be carried out, there has been an increasing interest in the research community in increasing both processing speed and energy efficiency in the screening of molecular databases. In this work, we present a classification model describing each molecule with a single energy-based vector and propose a machine-learning system based on the use of ANNs. Different ANNs are studied with respect to their suitability to identify biochemical similarities. Also, a high-performance and energy-efficient hardware acceleration platform based on the use of stochastic computing is proposed for the ANN implementation. This platform is of utility when screening vast libraries of compounds. As a result, the proposed model showed appreciable improvements with respect to previously published works in terms of the main relevant characteristics (accuracy, speed and energy-efficiency).

Submitted 3 June, 2020; originally announced June 2020.

Comments: 14 pages, 9 Figures, 3 Tables. Paper submitted to an IEEE journal

arXiv:1806.04932 [pdf, other]

Reservoir Computing Hardware with Cellular Automata

Authors: Alejandro Morán, Christiam F. Frasser, Josep L. Rosselló

Abstract: Elementary cellular automata (ECA) is a widely studied one-dimensional processing methodology where the successive iteration of the automaton may lead to the recreation of a rich pattern dynamic. Recently, cellular automata have been proposed as a feasible way to implement Reservoir Computing (RC) systems in which the automata rule is fixed and the training is performed using a linear regression. In this work we perform an exhaustive study of the performance of the different ECA rules when applied to pattern recognition of time-independent input signals using an RC scheme. Once the different ECA rules have been tested, the most accurate one (rule 90) is selected to implement a digital circuit. Rule 90 is easily reproduced using a reduced set of XOR gates and shift-registers, thus representing a high-performance alternative for RC hardware implementation in terms of processing time, circuit area, power dissipation and system accuracy. The model (both in software and its hardware implementation) has been tested using a pattern recognition task of handwritten numbers (the MNIST database) for which we obtained competitive results in terms of accuracy, speed and power dissipation. The proposed model can be considered to be a low-cost method to implement fast pattern recognition digital circuits.

Submitted 21 June, 2018; v1 submitted 13 June, 2018; originally announced June 2018.

Comments: 20 pages, 11 figures, draft of an article currently submitted to IEEE journal

arXiv:1202.4495 [pdf]

doi 10.1016/j.patrec.2010.07.008

Stochastic-Based Pattern Recognition Analysis

Authors: V. Canals, A. Morro, J. L. Rosselló

Abstract: In this work we review the basic principles of stochastic logic and propose its application to probabilistic-based pattern-recognition analysis. The proposed technique is intrinsically a parallel comparison of input data to various pre-stored categories using Bayesian techniques. We design smart pulse-based stochastic-logic blocks to provide an efficient pattern recognition analysis. The proposed architecture is applied to a specific navigation problem. The resulting system is orders of magnitude faster than processor-based solutions.

Submitted 20 February, 2012; originally announced February 2012.

Journal ref: Published in Pattern Recognition Letters in 2010

arXiv:0710.4759 [pdf]

A Fast Concurrent Power-Thermal Model for Sub-100nm Digital ICs

Authors: J. L. Rossello, V. Canals, S. A. Bota, A. Keshavarzi, J. Segura

Abstract: As technology scales down, the static power is expected to become a significant fraction of the total power. The exponential dependence of static power on the operating temperature makes the thermal profile estimation of high-performance ICs a key issue to compute the total power dissipated in next generations. In this paper we present accurate and compact analytical models to estimate the static power dissipation and the temperature of operation of CMOS gates. The models are the fundamentals of a performance estimation tool in which numerical procedures are avoided for any computation to set a faster estimation and optimization. The models developed are compared to measurements and SPICE simulations for a 0.12$μ$m technology showing excellent results.

Submitted 25 October, 2007; originally announced October 2007.

Comments: Submitted on behalf of EDAA (http://www.edaa.com/)

Journal ref: In Design, Automation and Test in Europe - DATE'05, Munich, Germany (2005)

arXiv:0710.4733 [pdf]

Smart Temperature Sensor for Thermal Testing of Cell-Based ICs

Authors: S. A. Bota, M. Rosales, J. L. Rossello, J. Segura

Abstract: In this paper we present a simple and efficient built-in temperature sensor for thermal monitoring of standard-cell based VLSI circuits. The proposed smart temperature sensor uses a ring-oscillator composed of complex gates instead of inverters to optimize its linearity. Simulation results from a 0.18$μ$m CMOS technology show that the non-linearity error of the sensor can be reduced when an adequate set of standard logic gates is selected.

Submitted 25 October, 2007; originally announced October 2007.

Comments: Submitted on behalf of EDAA (http://www.edaa.com/)

Journal ref: In Design, Automation and Test in Europe - DATE'05, Munich, Germany (2005)

Showing 1–6 of 6 results for author: Rossello, J L