-
Parameter Identification for Electrochemical Models of Lithium-Ion Batteries Using Bayesian Optimization
Authors:
Jianzong Pi,
Samuel Filgueira da Silva,
Mehmet Fatih Ozkan,
Abhishek Gupta,
Marcello Canova
Abstract:
Efficient parameter identification of electrochemical models is crucial for accurate monitoring and control of lithium-ion cells. This process becomes challenging when applied to complex models that rely on a considerable number of interdependent parameters that affect the output response. Gradient-based and metaheuristic optimization techniques, although previously employed for this task, are limited by their lack of robustness, high computational costs, and susceptibility to local minima. In this study, Bayesian Optimization is used for tuning the dynamic parameters of an electrochemical equivalent circuit battery model (E-ECM) for a nickel-manganese-cobalt (NMC)-graphite cell. The performance of Bayesian Optimization is compared with baseline methods based on gradient-based and metaheuristic approaches. The robustness of the parameter optimization method is tested by performing verification using an experimental drive cycle. The results indicate that Bayesian Optimization outperforms Gradient Descent and Particle Swarm Optimization (PSO), reducing the average testing loss by 28.8% and 5.8%, respectively. Moreover, Bayesian Optimization reduces the variance in testing loss by 95.8% and 72.7%, respectively.
Submitted 17 May, 2024;
originally announced May 2024.
-
Efficient Hardware Acceleration of Sparsely Active Convolutional Spiking Neural Networks
Authors:
Jan Sommer,
M. Akif Özkan,
Oliver Keszocze,
Jürgen Teich
Abstract:
Spiking Neural Networks (SNNs) compute in an event-based manner to achieve a more efficient computation than standard Neural Networks. In SNNs, neuronal outputs (i.e., activations) are not encoded with real-valued activations but with sequences of binary spikes. The motivation for using SNNs over conventional neural networks is rooted in the special computational aspects of SNNs, especially the very high degree of sparsity of neural output activations. Well-established architectures for conventional Convolutional Neural Networks (CNNs) feature large spatial arrays of Processing Elements (PEs) that remain highly underutilized in the face of activation sparsity. We propose a novel architecture that is optimized for the processing of Convolutional SNNs (CSNNs) that feature a high degree of activation sparsity. In our architecture, the main strategy is to use fewer but highly utilized PEs. The PE array used to perform the convolution is only as large as the kernel size, allowing all PEs to be active as long as there are spikes to process. This constant flow of spikes is ensured by compressing the feature maps (i.e., the activations) into queues that can then be processed spike by spike. This compression is performed at run time using dedicated circuitry, leading to a self-timed scheduling. This allows the processing time to scale directly with the number of spikes. A novel memory organization scheme called memory interlacing is used to efficiently store and retrieve the membrane potentials of the individual neurons using multiple small parallel on-chip RAMs. Each RAM is hardwired to its PE, reducing switching circuitry and allowing RAMs to be located in close proximity to the respective PE. We implemented the proposed architecture on an FPGA and achieved a significant speedup compared to other implementations while needing fewer hardware resources and maintaining a lower energy consumption.
Submitted 23 March, 2022;
originally announced March 2022.
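The queue-based, spike-by-spike processing described in the abstract above can be illustrated with a small software sketch. It is not the authors' hardware design: the function names `compress_to_queue` and `event_driven_conv`, the zero ("same") padding at the borders, and the square odd-sized kernel are assumptions made for illustration. The sketch only shows why the work scales with the number of spikes: each spike triggers exactly one update per kernel weight, which a kernel-sized PE array could perform in parallel.

```python
from collections import deque


def compress_to_queue(spike_map):
    """Collect the (row, col) addresses of all active spikes in a binary map."""
    return deque(
        (r, c)
        for r, row in enumerate(spike_map)
        for c, spike in enumerate(row)
        if spike
    )


def event_driven_conv(spike_map, kernel):
    """Accumulate membrane potentials spike by spike (zero padding assumed)."""
    height, width = len(spike_map), len(spike_map[0])
    k = len(kernel)  # square, odd-sized kernel assumed
    half = k // 2
    potentials = [[0] * width for _ in range(height)]

    queue = compress_to_queue(spike_map)
    while queue:
        r, c = queue.popleft()
        # One update per kernel weight; in hardware, each of these k*k
        # updates would map to its own PE and run in parallel.
        for i in range(k):
            for j in range(k):
                rr, cc = r + half - i, c + half - j
                if 0 <= rr < height and 0 <= cc < width:
                    potentials[rr][cc] += kernel[i][j]
    return potentials
```

The result matches a dense sliding-window correlation over the same binary map, but the loop runs once per spike rather than once per pixel, so an empty feature map costs no updates at all.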
-
DSP-Packing: Squeezing Low-precision Arithmetic into FPGA DSP Blocks
Authors:
Jan Sommer,
M. Akif Özkan,
Oliver Keszocze,
Jürgen Teich
Abstract:
The number of Digital Signal Processor (DSP) resources available in Field Programmable Gate Arrays (FPGAs) is often quite limited. Therefore, full utilization of available DSP resources for the computationally intensive parts of an algorithm is paramount for optimizing the non-functional properties of an implementation (i.e., performance, power, and area). The DSPs available in Xilinx devices implement large bit width operators (i.e., a 48-bit accumulator or an $18 \times 27$ multiplier). However, using such a DSP for low-precision quantized data (as is common in image processing or machine learning applications) leaves the DSP resources underutilized. As a remedy, a method has been proposed to pack and compute four 4-bit multiplications on a single DSP in a single clock cycle. This paper presents a generalization of this scheme to arbitrary bit widths and numbers of multiplications. We also demonstrate that the previously proposed approach leads to errors (Mean Absolute Error (MAE) = 0.37). Furthermore, we explain where these errors come from and how they can be corrected. In addition, we introduce a novel approximate method called "Overpacking", which allows squeezing even more multiplications into a single DSP at the cost of small errors (MAE = 0.47). Overpacking allows squeezing six 4-bit multiplications into a single DSP compared to just four in the literature. Finally, we introduce an alternative method for packing multiple small-bit width additions into a single 48-bit accumulator for use in applications such as Spiking Neural Networks.
Submitted 21 March, 2022;
originally announced March 2022.
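The basic idea of packing four 4-bit multiplications into one wide multiplication, as mentioned in the abstract above, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's scheme: it handles unsigned operands only, uses a hypothetical function name `pack_multiply`, and picks an 8-bit result field so that each product (at most 15 × 15 = 225) fits without overlapping its neighbour. Signed operands, which are where the errors discussed in the abstract would come into play, are deliberately left out.

```python
def pack_multiply(a0, a1, w0, w1, field=8):
    """Compute four unsigned 4-bit products with one wide multiplication.

    Returns (a0*w0, a1*w0, a0*w1, a1*w1).
    """
    for value in (a0, a1, w0, w1):
        if not 0 <= value < 16:
            raise ValueError("operands must be unsigned 4-bit values")

    # Place the operands so that the four partial products land in
    # disjoint `field`-bit slots of the wide result:
    #   (a0 + a1*2^f) * (w0 + w1*2^(2f))
    #     = a0*w0 + a1*w0*2^f + a0*w1*2^(2f) + a1*w1*2^(3f)
    a_packed = a0 | (a1 << field)        # 12 bits wide for field=8
    w_packed = w0 | (w1 << (2 * field))  # 20 bits wide for field=8

    product = a_packed * w_packed  # the single wide multiplication

    # Slice the individual products back out of their slots.
    mask = (1 << field) - 1
    return tuple((product >> (slot * field)) & mask for slot in range(4))
```

With `field=8`, the packed operands are 12 and 20 bits wide, which is within the $18 \times 27$ multiplier size quoted in the abstract. Shrinking `field` below 8 would make adjacent products overlap, which is the kind of trade-off an approximate scheme has to manage.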
-
Inverse Reinforcement Learning Based Stochastic Driver Behavior Learning
Authors:
Mehmet Fatih Ozkan,
Abishek Joseph Rocque,
Yao Ma
Abstract:
Drivers have unique and rich driving behaviors when operating vehicles in traffic. This paper presents a novel driver behavior learning approach that captures the uniqueness and richness of human driver behavior in realistic driving scenarios. A stochastic inverse reinforcement learning (SIRL) approach is proposed to learn a distribution of cost functions, which represents the richness of the human driver behavior, from a given set of driver-specific demonstrations. Evaluations are conducted on realistic driving data collected from a 3D driver-in-the-loop driving simulation. The results show that the learned stochastic driver model is capable of expressing the richness of human driving strategies under different realistic driving scenarios. Compared to the deterministic baseline driver behavior model, the proposed stochastic driver behavior model better replicates the driver's unique and rich driving strategies in a variety of traffic conditions.
Submitted 5 August, 2021; v1 submitted 1 July, 2021;
originally announced July 2021.
-
HipaccVX: Wedding of OpenVX and DSL-based Code Generation
Authors:
M. Akif Özkan,
Burak Ok,
Bo Qiao,
Jürgen Teich,
Frank Hannig
Abstract:
Writing programs for heterogeneous platforms optimized for high performance is hard since this requires the code to be tuned at a low level with architecture-specific optimizations that are often based on fundamentally differing programming paradigms and languages. OpenVX promises to solve this issue for computer vision applications with a royalty-free industry standard that is based on a graph-execution model. Yet, the OpenVX algorithm space is constrained to a small set of vision functions. This hinders accelerating computations that are not included in the standard.
In this paper, we analyze OpenVX vision functions to find an orthogonal set of computational abstractions. Based on these abstractions, we couple an existing Domain-Specific Language (DSL) back end to the OpenVX environment and provide language constructs to the programmer for the definition of user-defined nodes. In this way, we enable optimizations that are not possible to detect with OpenVX graph implementations using the standard computer vision functions. These optimizations can double the throughput on an Nvidia GTX GPU and decrease the resource usage of a Xilinx Zynq FPGA by 50% for our benchmarks. Finally, we show that our proposed compiler framework, called HipaccVX, can achieve better results than the state-of-the-art approaches Nvidia VisionWorks and Halide-HLS.
Submitted 26 August, 2020;
originally announced August 2020.
-
AnyHLS: High-Level Synthesis with Partial Evaluation
Authors:
M. Akif Özkan,
Arsène Pérard-Gayot,
Richard Membarth,
Philipp Slusallek,
Roland Leissa,
Sebastian Hack,
Jürgen Teich,
Frank Hannig
Abstract:
FPGAs excel in low-power and high-throughput computations, but they are challenging to program. Traditionally, developers rely on hardware description languages like Verilog or VHDL to specify the hardware behavior at the register-transfer level. High-Level Synthesis (HLS) raises the level of abstraction, but still requires FPGA design knowledge. Programmers usually write pragma-annotated C/C++ programs to define the hardware architecture of an application. However, each hardware vendor extends its own C dialect using its own vendor-specific set of pragmas. This prevents portability across different vendors. Furthermore, pragmas are not first-class citizens in the language. This makes it hard to use them in a modular way or design proper abstractions. In this paper, we present AnyHLS, an approach to synthesize FPGA designs in a modular and abstract way. AnyHLS is able to raise the abstraction level of existing HLS tools by resorting to programming language features such as types and higher-order functions as follows: It relies on partial evaluation to specialize and to optimize the user application based on a library of abstractions. Then, vendor-specific HLS code is generated for Intel and Xilinx FPGAs. Portability is obtained by avoiding any vendor-specific pragmas in the source code. In order to validate achievable gains in productivity, a library for the domain of image processing is introduced as a case study, and its synthesis results are compared with several state-of-the-art Domain-Specific Language (DSL) approaches for this domain.
Submitted 21 July, 2020; v1 submitted 13 February, 2020;
originally announced February 2020.
-
Development of a Forecasting and Warning System on the Ecological Life-Cycle of Sunn Pest
Authors:
İsmail Balaban,
Fatih Acun,
Onur Yiğit Arpalı,
Furkan Murat,
Numan Ertuğrul Babaroğlu,
Emre Akci,
Mehmet Çulcu,
Mümtaz Özkan,
Selim Temizer
Abstract:
We provide a machine learning solution that replaces the traditional methods for deciding the pesticide application time for Sunn Pest. We correlate climate data with the phases of the Sunn Pest life-cycle and decide whether the fields should be sprayed. Our solution includes two groups of prediction models. The first group contains decision trees that predict the migration time of Sunn Pest from winter quarters to wheat fields. The second group contains random forest models that predict the nymphal stage percentages of Sunn Pest, which are a criterion for pesticide application. We trained our models on four years of climate data collected from Kirşehir and Aksaray. The experiments show that our proposed solution makes correct predictions with high accuracy.
Submitted 5 May, 2019;
originally announced May 2019.