-
Fast 2D Convolutions and Cross-Correlations Using Scalable Architectures
Authors:
Cesar Carranza,
Daniel Llamocca,
Marios Pattichis
Abstract:
The manuscript describes fast and scalable architectures and associated algorithms for computing convolutions and cross-correlations. The basic idea is to map 2D convolutions and cross-correlations to a collection of 1D convolutions and cross-correlations in the transform domain. This is accomplished through the use of the Discrete Periodic Radon Transform (DPRT) for general kernels and the use of…
▽ More
The manuscript describes fast and scalable architectures and associated algorithms for computing convolutions and cross-correlations. The basic idea is to map 2D convolutions and cross-correlations to a collection of 1D convolutions and cross-correlations in the transform domain. This is accomplished through the use of the Discrete Periodic Radon Transform (DPRT) for general kernels and the use of SVD-LU decompositions for low-rank kernels. The approach uses scalable architectures that can be fitted into modern FPGA and Zynq-SOC devices. Based on different types of available resources, for $P\times P$ blocks, 2D convolutions and cross-correlations can be computed in just $O(P)$ clock cycles up to $O(P^2)$ clock cycles. Thus, there is a trade-off between performance and required numbers and types of resources. We provide implementations of the proposed architectures using modern programmable devices (Virtex-7 and Zynq-SOC). Based on the amounts and types of required resources, we show that the proposed approaches significantly outperform current methods.
△ Less
Submitted 24 December, 2021;
originally announced December 2021.
-
Fast and Scalable Computation of the Forward and Inverse Discrete Periodic Radon Transform
Authors:
Cesar Carranza,
Daniel Llamocca,
Marios Pattichis
Abstract:
The Discrete Periodic Radon Transform (DPRT) has been extensively used in applications that involve image reconstructions from projections. This manuscript introduces a fast and scalable approach for computing the forward and inverse DPRT that is based on the use of: (i) a parallel array of fixed-point adder trees, (ii) circular shift registers to remove the need for accessing external memory comp…
▽ More
The Discrete Periodic Radon Transform (DPRT) has been extensively used in applications that involve image reconstructions from projections. This manuscript introduces a fast and scalable approach for computing the forward and inverse DPRT that is based on the use of: (i) a parallel array of fixed-point adder trees, (ii) circular shift registers to remove the need for accessing external memory components when selecting the input data for the adder trees, (iii) an image block-based approach to DPRT computation that can fit the proposed architecture to available resources, and (iv) fast transpositions that are computed in one or a few clock cycles that do not depend on the size of the input image. As a result, for an $N\times N$ image ($N$ prime), the proposed approach can compute up to $N^{2}$ additions per clock cycle. Compared to previous approaches, the scalable approach provides the fastest known implementations for different amounts of computational resources. For example, for a $251\times 251$ image, for approximately $25\%$ fewer flip-flops than required for a systolic implementation, we have that the scalable DPRT is computed 36 times faster. For the fastest case, we introduce optimized architectures that can compute the DPRT and its inverse in just $2N+\left\lceil \log_{2}N\right\rceil+1$ and $2N+3\left\lceil \log_{2}N\right\rceil+B+2$ cycles respectively, where $B$ is the number of bits used to represent each input pixel. On the other hand, the scalable DPRT approach requires more 1-bit additions than for the systolic implementation and provides a trade-off between speed and additional 1-bit additions. All of the proposed DPRT architectures were implemented in VHDL and validated using an FPGA implementation.
△ Less
Submitted 24 December, 2021;
originally announced December 2021.
-
Brokering Policies and Execution Monitors for IoT Middleware
Authors:
Juan Carlos Fuentes Carranza,
Philip W. L. Fong
Abstract:
Event-based systems lie at the heart of many cloud-based Internet-of-Things (IoT) platforms. This combination of the Broker architectural style and the Publisher-Subscriber design pattern provides a way for smart devices to communicate and coordinate with one another. The present design of these cloud-based IoT frameworks lacks measures to (i) protect devices against malicious cloud disconnections…
▽ More
Event-based systems lie at the heart of many cloud-based Internet-of-Things (IoT) platforms. This combination of the Broker architectural style and the Publisher-Subscriber design pattern provides a way for smart devices to communicate and coordinate with one another. The present design of these cloud-based IoT frameworks lacks measures to (i) protect devices against malicious cloud disconnections, (ii) impose information flow control among communicating parties, and (iii) enforce coordination protocols in the presence of compromised devices. In this work, we propose to extend the modular event-based system architecture of Fiege et al., to incorporate brokering policies and execution monitors, in order to address the three protection challenges mentioned above. We formalized the operational semantics of our protection scheme, explored how the scheme can be used to enforce BLP-style information flow control and RBAC-style protection domains, implemented the proposal in an open-source MQTT broker, and evaluated the performance impact of the protection mechanisms.
△ Less
Submitted 27 September, 2018; v1 submitted 26 September, 2018;
originally announced September 2018.