-
Q-S5: Towards Quantized State Space Models
Authors:
Steven Abreu,
Jens E. Pedersen,
Kade M. Heckel,
Alessandro Pierro
Abstract:
In the quest for next-generation sequence modeling architectures, State Space Models (SSMs) have emerged as a potent alternative to transformers, particularly for their computational efficiency and suitability for dynamical systems. This paper investigates the effect of quantization on the S5 model to understand its impact on model performance and to facilitate its deployment to edge and resource-constrained platforms. Using quantization-aware training (QAT) and post-training quantization (PTQ), we systematically evaluate the quantization sensitivity of SSMs across different tasks like dynamical systems modeling, Sequential MNIST (sMNIST) and most of the Long Range Arena (LRA). We present fully quantized S5 models whose test accuracy drops less than 1% on sMNIST and most of the LRA. We find that performance on most tasks degrades significantly for recurrent weights below 8-bit precision, but that other components can be compressed further without significant loss of performance. Our results further show that PTQ only performs well on language-based LRA tasks whereas all others require QAT. Our investigation provides necessary insights for the continued development of efficient and hardware-optimized SSMs.
Submitted 13 June, 2024;
originally announced June 2024.
-
Neuromorphic Intermediate Representation: A Unified Instruction Set for Interoperable Brain-Inspired Computing
Authors:
Jens E. Pedersen,
Steven Abreu,
Matthias Jobst,
Gregor Lenz,
Vittorio Fra,
Felix C. Bauer,
Dylan R. Muir,
Peng Zhou,
Bernhard Vogginger,
Kade Heckel,
Gianvito Urgese,
Sadasivan Shankar,
Terrence C. Stewart,
Jason K. Eshraghian,
Sadique Sheik
Abstract:
Spiking neural networks and neuromorphic hardware platforms that emulate neural dynamics are slowly gaining momentum and entering mainstream usage. Despite a well-established mathematical foundation for neural dynamics, the implementation details vary greatly across different platforms. Correspondingly, there is a plethora of software and hardware implementations with their own unique technology stacks. Consequently, neuromorphic systems typically diverge from the expected computational model, which challenges the reproducibility and reliability across platforms. Additionally, most neuromorphic hardware is limited by its access via a single software framework with a limited set of training procedures. Here, we establish a common reference frame for computations in neuromorphic systems, dubbed the Neuromorphic Intermediate Representation (NIR). NIR defines a set of computational primitives as idealized continuous-time hybrid systems that can be composed into graphs and mapped to and from various neuromorphic technology stacks. By abstracting away assumptions around discretization and hardware constraints, NIR faithfully captures the fundamental computation, while simultaneously exposing the exact differences between the evaluated implementation and the idealized mathematical formalism. We reproduce three NIR graphs across 7 neuromorphic simulators and 4 hardware platforms, demonstrating support for an unprecedented number of neuromorphic systems. With NIR, we decouple the evolution of neuromorphic hardware and software, ultimately increasing the interoperability between platforms and improving accessibility to neuromorphic technologies. We believe that NIR is an important step towards the continued study of brain-inspired hardware and bottom-up approaches aimed at an improved understanding of the computational underpinnings of nervous systems.
Submitted 24 November, 2023;
originally announced November 2023.
-
Concepts and Paradigms for Neuromorphic Programming
Authors:
Steven Abreu
Abstract:
The value of neuromorphic computers depends crucially on our ability to program them for relevant tasks. Currently, neuromorphic computers are mostly limited to machine learning methods adapted from deep learning. However, neuromorphic computers have potential far beyond deep learning if we can only make use of their computational properties to harness their full power. Neuromorphic programming will necessarily be different from conventional programming, requiring a paradigm shift in how we think about programming in general. The contributions of this paper are 1) a conceptual analysis of what "programming" means in the context of neuromorphic computers and 2) an exploration of existing programming paradigms that are promising yet overlooked in neuromorphic computing. The goal is to expand the horizon of neuromorphic programming methods, thereby allowing researchers to move beyond the shackles of current methods and explore novel directions.
Submitted 27 October, 2023;
originally announced October 2023.
-
Demonstrating (Hybrid) Active Logic Documents and the Ciao Prolog Playground, and an Application to Verification Tutorials
Authors:
Daniela Ferreiro,
José F. Morales,
Salvador Abreu,
Manuel V. Hermenegildo
Abstract:
Active Logic Documents (ALD) are web pages which incorporate embedded Prolog engines that run locally within the browser. ALD offers a very easy way to add click-to-run capabilities to any kind of teaching materials, independently of the tool used to generate them, as well as a tool-set for generating web-based materials with embedded examples and exercises. Both build on (components of) the Ciao Prolog Playground. We present a demonstration of the ALD approach and the Ciao Prolog Playground, as well as a recent extension to ALDs to facilitate the integration of other tools into the system for creating Hybrid Active Logic Documents (HALD). We also present a concrete application of these technologies to the creation of tutorials for a program verification tool.
Submitted 30 August, 2023;
originally announced August 2023.
-
Training a spiking neural network on an event-based label-free flow cytometry dataset
Authors:
Muhammed Gouda,
Steven Abreu,
Alessio Lugnan,
Peter Bienstman
Abstract:
Imaging flow cytometry systems aim to analyze a huge number of cells or micro-particles based on their physical characteristics. The vast majority of current systems acquire a large amount of images which are used to train deep artificial neural networks. However, this approach increases both the latency and power consumption of the final apparatus. In this work-in-progress, we combine an event-based camera with a free-space optical setup to obtain spikes for each particle passing in a microfluidic channel. A spiking neural network is trained on the collected dataset, resulting in 97.7% mean training accuracy and 93.5% mean testing accuracy for the fully event-based classification pipeline.
Submitted 19 March, 2023;
originally announced March 2023.
-
Fifty Years of Prolog and Beyond
Authors:
Philipp Körner,
Michael Leuschel,
João Barbosa,
Vítor Santos Costa,
Verónica Dahl,
Manuel V. Hermenegildo,
Jose F. Morales,
Jan Wielemaker,
Daniel Diaz,
Salvador Abreu,
Giovanni Ciatto
Abstract:
Both logic programming in general, and Prolog in particular, have a long and fascinating history, intermingled with that of many disciplines they inherited from or catalyzed. A large body of research has been gathered over the last 50 years, supported by many Prolog implementations. Many implementations are still actively developed, while new ones keep appearing. Often, the features added by different systems were motivated by the interdisciplinary needs of programmers and implementors, yielding systems that, while sharing the "classic" core language, and, in particular, the main aspects of the ISO-Prolog standard, also depart from each other in other aspects. This obviously poses challenges for code portability. The field has also inspired many related, but quite different languages that have created their own communities.
This article aims at integrating and applying the main lessons learned in the process of evolution of Prolog. It is structured into three major parts. Firstly, we overview the evolution of Prolog systems and the community approximately up to the ISO standard, considering both the main historic developments and the motivations behind several Prolog implementations, as well as other logic programming languages influenced by Prolog. Then, we discuss the Prolog implementations that are most active after the appearance of the standard: their visions, goals, commonalities, and incompatibilities. Finally, we perform a SWOT analysis in order to better identify the potential of Prolog, and propose future directions along which Prolog might continue to add useful features, interfaces, libraries, and tools, while at the same time improving compatibility between implementations.
Submitted 14 March, 2022; v1 submitted 26 January, 2022;
originally announced January 2022.
-
Generating Local Search Neighborhood with Synthesized Logic Programs
Authors:
Mateusz Ślażyński,
Salvador Abreu,
Grzegorz J. Nalepa
Abstract:
Local Search meta-heuristics have been proven a viable approach to solve difficult optimization problems. Their performance depends strongly on the search space landscape, as defined by a cost function and the selected neighborhood operators. In this paper we present a logic programming based framework, named Noodle, designed to generate bespoke Local Search neighborhoods tailored to specific discrete optimization problems. The proposed system consists of a domain specific language, which is inspired by logic programming, as well as a genetic programming solver, based on the grammar evolution algorithm. We complement the description with a preliminary experimental evaluation, where we synthesize efficient neighborhood operators for the traveling salesman problem, some of which reproduce well-known results.
Submitted 18 September, 2019;
originally announced September 2019.
-
Pre-proceedings of the DECLARE 2019 Conference
Authors:
Salvador Abreu,
Petra Hofstedt,
Ulrich John,
Herbert Kuchen,
Dietmar Seipel
Abstract:
This volume constitutes the pre-proceedings of the DECLARE 2019 conference, held on September 9 to 13, 2019 at the University of Technology Cottbus - Senftenberg (Germany).
Declarative programming is an advanced paradigm for the modeling and solving of complex problems. This method has attracted increased attention over the last decades, e.g., in the domains of data and knowledge engineering, databases, artificial intelligence, natural language processing, modeling and processing combinatorial problems, and for establishing systems for the web.
The conference DECLARE 2019 aims at cross-fertilizing exchange of ideas and experiences among researchers and students from the different communities interested in the foundations, implementation techniques, novel applications, and combinations of high-level, declarative programming and related areas. The technical program of the event included invited talks, presentations of refereed papers, and system demonstrations. DECLARE 2019 consisted of the sub-events INAP, WFLP, and WLP:
INAP - 22nd International Conference on Applications of Declarative Programming and Knowledge Management
WFLP - 27th International Workshop on Functional and (Constraint) Logic Programming
WLP - 33rd Workshop on (Constraint) Logic Programming
Submitted 21 November, 2019; v1 submitted 11 September, 2019;
originally announced September 2019.
-
Automated Architecture Design for Deep Neural Networks
Authors:
Steven Abreu
Abstract:
Machine learning has made tremendous progress in recent years and received large amounts of public attention. Though we are still far from designing a full artificially intelligent agent, machine learning has brought us many applications in which computers solve human learning tasks remarkably well. Much of this progress comes from a recent trend within machine learning, called deep learning. Deep learning models are responsible for many state-of-the-art applications of machine learning. Despite their success, deep learning models are hard to train, very difficult to understand, and often so complex that training is only possible on very large GPU clusters. Much work has been done on enabling neural networks to learn efficiently. However, the design and architecture of such neural networks are often determined manually through trial and error and expert knowledge. This thesis inspects different approaches, existing and novel, to automate the design of deep feedforward neural networks in an attempt to create less complex models with good performance that take away the burden of deciding on an architecture and make it more efficient to design and train such deep networks.
Submitted 21 August, 2019;
originally announced August 2019.
-
Experimenting with X10 for Parallel Constraint-Based Local Search
Authors:
Danny Munera,
Daniel Diaz,
Salvador Abreu
Abstract:
In this study, we have investigated the adequacy of the PGAS parallel language X10 to implement a Constraint-Based Local Search solver. We decided to code in this language to benefit from the ease of use and architectural independence from parallel resources which it offers. We present the implementation strategy, in search of different sources of parallelism in the context of an implementation of the Adaptive Search algorithm. We extensively discuss the algorithm and its implementation. The performance evaluation on a representative set of benchmarks shows close to linear speed-ups, in all the problems treated.
Submitted 17 July, 2013;
originally announced July 2013.
-
Parallel Local Search: Experiments with a PGAS-based programming model
Authors:
Rui Machado,
Salvador Abreu,
Daniel Diaz
Abstract:
Local search is a successful approach for solving combinatorial optimization and constraint satisfaction problems. With the progressing move toward multi- and many-core systems, GPUs, and the quest for Exascale systems, parallelism has become mainstream as the number of cores continues to increase. New programming models are required and need to be better understood, as do data structures and algorithms. Such is the case for local search algorithms when run on hundreds or thousands of processing units. In this paper, we discuss some experiments we have been doing with Adaptive Search and present a new parallel version of it based on GPI, a recent API and programming model for the development of scalable parallel applications. Our experiments on different problems show interesting speedups and, more importantly, a deeper interpretation of the parallelization of Local Search methods.
Submitted 10 May, 2013; v1 submitted 31 January, 2013;
originally announced January 2013.
-
Online Proceedings of the 11th International Colloquium on Implementation of Constraint LOgic Programming Systems (CICLOPS 2011), Lexington, KY, U.S.A., July 10, 2011
Authors:
Salvador Abreu,
Vitor Santos Costa
Abstract:
These are the revised versions of the papers presented at CICLOPS 2011, a workshop colocated with ICLP 2011.
Submitted 21 December, 2011;
originally announced December 2011.
-
On the Implementation of GNU Prolog
Authors:
Daniel Diaz,
Salvador Abreu,
Philippe Codognet
Abstract:
GNU Prolog is a general-purpose implementation of the Prolog language, which distinguishes itself from most other systems by being, above all else, a native-code compiler which produces standalone executables which don't rely on any byte-code emulator or meta-interpreter. Other aspects which stand out include the explicit organization of the Prolog system as a multipass compiler, where intermediate representations are materialized, in Unix compiler tradition. GNU Prolog also includes an extensible and high-performance finite domain constraint solver, integrated with the Prolog language but implemented using independent lower-level mechanisms. This article discusses the main issues involved in designing and implementing GNU Prolog: requirements, system organization, performance and portability issues as well as its position with respect to other Prolog system implementations and the ISO standardization initiative.
Submitted 15 December, 2010; v1 submitted 11 December, 2010;
originally announced December 2010.
-
Casting of the WAM as an EAM
Authors:
Paulo André,
Salvador Abreu
Abstract:
Logic programming provides a very high-level view of programming, which comes at the cost of some execution efficiency. Improving performance of logic programs is thus one of the holy grails of Prolog system implementations and a wide range of approaches have historically been taken towards this goal. Designing computational models that both exploit the available parallelism in a given application and that try hard to reduce the explored search space has been an ongoing line of research for many years. These goals in particular have motivated the design of several computational models, one of which is the Extended Andorra Model (EAM). In this paper, we present a preliminary specification and implementation of the EAM with Implicit Control, the WAM2EAM, which supplies regular WAM instructions with an EAM-centered interpretation.
Submitted 20 September, 2010;
originally announced September 2010.
-
Distributed Work Stealing for Constraint Solving
Authors:
Vasco Pedro,
Salvador Abreu
Abstract:
With the dissemination of affordable parallel and distributed hardware, parallel and distributed constraint solving has lately been the focus of some attention. To effectually apply the power of distributed computational systems, there must be an effective sharing of the work involved in the search for a solution to a Constraint Satisfaction Problem (CSP) between all the participating agents, and it must happen dynamically, since it is hard to predict the effort associated with the exploration of some part of the search space. We describe and provide an initial experimental assessment of an implementation of a work stealing-based approach to distributed CSP solving.
Submitted 20 September, 2010;
originally announced September 2010.
-
Parallel local search for solving Constraint Problems on the Cell Broadband Engine (Preliminary Results)
Authors:
Salvator Abreu,
Daniel Diaz,
Philippe Codognet
Abstract:
We explore the use of the Cell Broadband Engine (Cell/BE for short) for combinatorial optimization applications: we present a parallel version of a constraint-based local search algorithm that has been implemented on a multiprocessor BladeCenter machine with twin Cell/BE processors (total of 16 SPUs per blade). This algorithm was chosen because it fits the Cell/BE architecture very well and requires neither shared memory nor communication between processors, while retaining a compact memory footprint. We study the performance on several large optimization benchmarks and show that this achieves mostly linear speedups, and sometimes even super-linear ones. This is possible because the parallel implementation might simultaneously explore different parts of the search space and therefore converge faster towards the best sub-space and thus towards a solution. Besides yielding speedups, the resulting times exhibit a much smaller variance, which benefits applications where a timely reply is critical.
Submitted 7 October, 2009;
originally announced October 2009.