Showing 1–8 of 8 results for author: Fahmy, S A

Searching in archive cs.
  1. arXiv:2406.18117  [pdf, other]

    cs.AR

    Resilient and Secure Programmable System-on-Chip Accelerator Offload

    Authors: Inês Pinto Gouveia, Ahmad T. Sheikh, Ali Shoker, Suhaib A. Fahmy, Paulo Esteves-Verissimo

    Abstract: Computational offload to hardware accelerators is gaining traction due to increasing computational demands and efficiency challenges. Programmable hardware, like FPGAs, offers a promising platform in rapidly evolving application areas, with the benefits of hardware acceleration and software programmability. Unfortunately, such systems composed of multiple hardware components must consider integrit…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: To be published in The 43rd International Symposium on Reliable Distributed Systems (SRDS 2024)

  2. arXiv:2201.03950  [pdf, other]

    cs.DC cs.AR

    High Throughput Multidimensional Tridiagonal Systems Solvers on FPGAs

    Authors: Kamalavasan Kamalakkannan, Istvan Z. Reguly, Suhaib A. Fahmy, Gihan R. Mudalige

    Abstract: We present a design space exploration for synthesizing optimized, high-throughput implementations of multiple multi-dimensional tridiagonal system solvers on FPGAs. Re-evaluating the characteristics of algorithms for the direct solution of tridiagonal systems, we develop a new tridiagonal solver library aimed at implementing high-performance computing applications on Xilinx FPGA hardware. Key new…

    Submitted 11 January, 2022; originally announced January 2022.

    Comments: Under review

  3. Resource-Efficient Federated Learning

    Authors: Ahmed M. Abdelmoniem, Atal Narayan Sahu, Marco Canini, Suhaib A. Fahmy

    Abstract: Federated Learning (FL) enables distributed training by learners using local data, thereby enhancing privacy and reducing communication. However, it presents numerous challenges relating to the heterogeneity of the data distribution, device capabilities, and participant availability as deployments scale, which can impact both model convergence and bias. Existing FL schemes use random participant s…

    Submitted 4 November, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: Accepted to appear in ACM EuroSys 2023

  4. arXiv:2101.01177  [pdf, other]

    cs.AR cs.DC cs.PF

    High-Level FPGA Accelerator Design for Structured-Mesh-Based Explicit Numerical Solvers

    Authors: Kamalavasan Kamalakkannan, Gihan R. Mudalige, Istvan Z. Reguly, Suhaib A. Fahmy

    Abstract: This paper presents a workflow for synthesizing near-optimal FPGA implementations for structured-mesh based stencil applications for explicit solvers. It leverages key characteristics of the application class, its computation-communication pattern, and the architectural capabilities of the FPGA to accelerate solvers from the high-performance computing domain. Key new features of the workflow are (…

    Submitted 7 January, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

    Comments: Preprint - Accepted to the 35th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2021), May 2021, Portland, Oregon USA

  5. arXiv:1710.05154  [pdf, other]

    cs.AR

    High Throughput 2D Spatial Image Filters on FPGAs

    Authors: Abdullah Al-Dujaili, Suhaib A. Fahmy

    Abstract: FPGAs are well established in the signal processing domain, where their fine-grained programmable nature allows the inherent parallelism in these applications to be exploited for enhanced performance. As architectures have evolved, FPGA vendors have added more heterogeneous resources to allow often-used functions to be implemented with higher performance, at lower power and using less area. DSP bl…

    Submitted 17 October, 2017; v1 submitted 14 October, 2017; originally announced October 2017.

  6. arXiv:1705.02730  [pdf, other]

    cs.AR

    Resource-Aware Just-in-Time OpenCL Compiler for Coarse-Grained FPGA Overlays

    Authors: Abhishek Kumar Jain, Douglas L. Maskell, Suhaib A. Fahmy

    Abstract: FPGA vendors have recently started focusing on OpenCL for FPGAs because of its ability to leverage the parallelism inherent to heterogeneous computing platforms. OpenCL allows programs running on a host computer to launch accelerator kernels which can be compiled at run-time for a specific architecture, thus enabling portability. However, the prohibitive compilation times (specifically the FPGA pl…

    Submitted 7 May, 2017; originally announced May 2017.

    Comments: Presented at 3rd International Workshop on Overlay Architectures for FPGAs (OLAF 2017) arXiv:1704.08802

    Report number: OLAF/2017/02

  7. Security in Automotive Networks: Lightweight Authentication and Authorization

    Authors: Philipp Mundhenk, Andrew Paverd, Artur Mrowca, Sebastian Steinhorst, Martin Lukasiewycz, Suhaib A. Fahmy, Samarjit Chakraborty

    Abstract: With the increasing amount of interconnections between vehicles, the attack surface of internal vehicle networks is rising steeply. Although these networks are shielded against external attacks, they often do not have any internal security to protect against malicious components or adversaries who breach the network perimeter. To secure the in-vehicle network, all communicating components must be…

    Submitted 15 March, 2017; v1 submitted 10 March, 2017; originally announced March 2017.

    Comments: Authors' preprint of an article to appear in ACM Transactions on Design Automation of Electronic Systems (ACM TODAES) 2017

  8. arXiv:1606.06460  [pdf]

    cs.AR

    An Area-Efficient FPGA Overlay using DSP Block based Time-multiplexed Functional Units

    Authors: Xiangwei Li, Abhishek Jain, Douglas Maskell, Suhaib A. Fahmy

    Abstract: Coarse grained overlay architectures improve FPGA design productivity by providing fast compilation and software-like programmability. Throughput oriented spatially configurable overlays typically suffer from area overheads due to the requirement of one functional unit for each compute kernel operation. Hence, these overlays have often been of limited size, supporting only relatively small compute…

    Submitted 21 June, 2016; originally announced June 2016.

    Comments: Presented at 2nd International Workshop on Overlay Architectures for FPGAs (OLAF 2016) arXiv:1605.08149

    Report number: OLAF/2016/02