Showing 1–16 of 16 results for author: Gurkaynak, F

  1. arXiv:2406.15107

    cs.AR

    Basilisk: An End-to-End Open-Source Linux-Capable RISC-V SoC in 130nm CMOS

    Authors: Paul Scheffler, Philippe Sauter, Thomas Benz, Frank K. Gürkaynak, Luca Benini

    Abstract: Open-source hardware (OSHW) is rapidly gaining traction in academia and industry. The availability of open RTL descriptions, EDA tools, and even PDKs enables a fully auditable supply chain for end-to-end (RTL to layout) open-source silicon, significantly strengthening security and transparency. Despite promising developments, existing OSHW efforts have so far fallen short of producing end-to-end o…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 3 pages, 4 figures. Accepted at SSH-SoC 2024 workshop

  2. arXiv:2406.15068

    cs.AR

    Occamy: A 432-Core 28.1 DP-GFLOP/s/W 83% FPU Utilization Dual-Chiplet, Dual-HBM2E RISC-V-based Accelerator for Stencil and Sparse Linear Algebra Computations with 8-to-64-bit Floating-Point Support in 12nm FinFET

    Authors: Gianna Paulin, Paul Scheffler, Thomas Benz, Matheus Cavalcante, Tim Fischer, Manuel Eggimann, Yichao Zhang, Nils Wistoff, Luca Bertaccini, Luca Colagrande, Gianmarco Ottavi, Frank K. Gürkaynak, Davide Rossi, Luca Benini

    Abstract: We present Occamy, a 432-core RISC-V dual-chiplet 2.5D system for efficient sparse linear algebra and stencil computations on FP64 and narrow (32-, 16-, 8-bit) SIMD FP data. Occamy features 48 clusters of RISC-V cores with custom extensions, two 64-bit host cores, and a latency-tolerant multi-chiplet interconnect and memory system with 32 GiB of HBM2E. It achieves leading-edge utilization on stenc…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 2 pages, 7 figures. Accepted at the 2024 IEEE Symposium on VLSI Technology & Circuits

  3. arXiv:2405.04257

    cs.AR

    Insights from Basilisk: Are Open-Source EDA Tools Ready for a Multi-Million-Gate, Linux-Booting RV64 SoC Design?

    Authors: Philippe Sauter, Thomas Benz, Paul Scheffler, Frank K. Gürkaynak, Luca Benini

    Abstract: Designing complex, multi-million-gate application-specific integrated circuits requires robust and mature electronic design automation (EDA) tools. We describe our efforts in enhancing the open-source Yosys+Openroad EDA flow to implement Basilisk, a fully open-source, Linux-booting RV64GC system-on-chip (SoC) design. We analyze the quality-of-results impact of our enhancements to synthesis tools,…

    Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 8 pages, 6 figures, submitted at IWLS 2024

  4. arXiv:2405.03523

    cs.AR

    Basilisk: Achieving Competitive Performance with Open EDA Tools on an Open-Source Linux-Capable RISC-V SoC

    Authors: Philippe Sauter, Thomas Benz, Paul Scheffler, Zerun Jiang, Beat Muheim, Frank K. Gürkaynak, Luca Benini

    Abstract: We introduce Basilisk, an optimized application-specific integrated circuit (ASIC) implementation and design flow building on the end-to-end open-source Iguana system-on-chip (SoC). We present enhancements to synthesis tools and logic optimization scripts improving quality of results (QoR), as well as an optimized physical design with an improved power grid and cell placement integration enabling…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 2 pages, 1 figure, accepted as a poster at the RISC-V Summit Europe 2024

  5. FlooNoC: A Multi-Tbps Wide NoC for Heterogeneous AXI4 Traffic

    Authors: Tim Fischer, Michael Rogenmoser, Matheus Cavalcante, Frank K. Gürkaynak, Luca Benini

    Abstract: Meeting the staggering bandwidth requirements of today's applications challenges the traditional narrow and serialized NoCs, which hit hard bounds on the maximum operating frequency. This paper proposes FlooNoC, an open-source, low-latency, fully AXI4-compatible NoC with wide physical channels for latency-tolerant high-bandwidth non-blocking transactions and decoupled latency-critical short messag…

    Submitted 6 August, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

  6. arXiv:2302.05996

    cs.AR

    Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference

    Authors: MohammadHossein AskariHemmat, Theo Dupuis, Yoan Fournier, Nizar El Zarif, Matheus Cavalcante, Matteo Perotti, Frank Gurkaynak, Luca Benini, Francois Leduc-Primeau, Yvon Savaria, Jean-Pierre David

    Abstract: In this paper, we present Quark, an integer RISC-V vector processor specifically tailored for sub-byte DNN inference. Quark is implemented in GlobalFoundries' 22FDX FD-SOI technology. It is designed on top of Ara, an open-source 64-bit RISC-V vector processor. To accommodate sub-byte DNN inference, Quark extends Ara by adding specialized vector instructions to perform sub-byte quantized operations…

    Submitted 12 February, 2023; originally announced February 2023.

    Comments: 5 pages. Accepted for publication in the 56th International Symposium on Circuits and Systems (ISCAS 2023)

    ACM Class: C.1.3; C.3

  7. arXiv:2212.00873

    cs.AR

    CONVOLVE: Smart and seamless design of smart edge processors

    Authors: M. Gomony, F. Putter, A. Gebregiorgis, G. Paulin, L. Mei, V. Jain, S. Hamdioui, V. Sanchez, T. Grosser, M. Geilen, M. Verhelst, F. Zenke, F. Gurkaynak, B. Bruin, S. Stuijk, S. Davidson, S. De, M. Ghogho, A. Jimborean, S. Eissa, L. Benini, D. Soudris, R. Bishnoi, S. Ainsworth, F. Corradi, et al. (3 additional authors not shown)

    Abstract: With the rise of Deep Learning (DL), our world braces for AI in every edge device, creating an urgent need for edge-AI SoCs. This SoC hardware needs to support high throughput, reliable and secure AI processing at Ultra Low Power (ULP), with a very short time to market. With its strong legacy in edge solutions and open processing platforms, the EU is well-positioned to become a leader in this SoC…

    Submitted 2 May, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

  8. arXiv:2209.00889

    cs.AR

    Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters

    Authors: Gianna Paulin, Matheus Cavalcante, Paul Scheffler, Luca Bertaccini, Yichao Zhang, Frank Gürkaynak, Luca Benini

    Abstract: Modern high-performance computing architectures (Multicore, GPU, Manycore) are based on tightly-coupled clusters of processing elements, physically implemented as rectangular tiles. Their size and aspect ratio strongly impact the achievable operating frequency and energy efficiency, but they should be as flexible as possible to achieve a high utilization for the top-level die floorplan. In this pa…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: 6 pages. Accepted for publication in the IEEE Computer Society Annual Symposium on VLSI (ISVLSI) 2022

  9. On-Demand Redundancy Grouping: Selectable Soft-Error Tolerance for a Multicore Cluster

    Authors: Michael Rogenmoser, Nils Wistoff, Pirmin Vogel, Frank Gürkaynak, Luca Benini

    Abstract: With the shrinking of technology nodes and the use of parallel processor clusters in hostile and critical environments, such as space, run-time faults caused by radiation are a serious cross-cutting concern, also impacting architectural design. This paper introduces an architectural approach to run-time configurable soft-error tolerance at the core level, augmenting a six-core open-source RISC-V c…

    Submitted 3 October, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Journal ref: ISVLSI (2022) 398-401

  10. arXiv:2202.12029

    cs.CR cs.AR

    Systematic Prevention of On-Core Timing Channels by Full Temporal Partitioning

    Authors: Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Gernot Heiser, Luca Benini

    Abstract: Microarchitectural timing channels enable unwanted information flow across security boundaries, violating fundamental security assumptions. They leverage timing variations of several state-holding microarchitectural components and have been demonstrated across instruction set architectures and hardware implementations. Analogously to memory protection, Ge et al. have proposed time protection for p…

    Submitted 24 February, 2022; originally announced February 2022.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. arXiv admin note: text overlap with arXiv:2005.02193

  11. arXiv:2006.14256

    cs.AR

    Arnold: an eFPGA-Augmented RISC-V SoC for Flexible and Low-Power IoT End-Nodes

    Authors: Pasquale Davide Schiavone, Davide Rossi, Alfio Di Mauro, Frank Gurkaynak, Timothy Saxe, Mao Wang, Ket Chong Yap, Luca Benini

    Abstract: A wide range of Internet of Things (IoT) applications require powerful, energy-efficient and flexible end-nodes to acquire data from multiple sources, process and distill the sensed data through near-sensor data analytics algorithms, and transmit it wirelessly. This work presents Arnold: a 0.5 V to 0.8 V, 46.83 uW/MHz, 600 MOPS fully programmable RISC-V Microcontroller unit (MCU) fabricated in 22…

    Submitted 25 June, 2020; originally announced June 2020.

  12. arXiv:2005.02193

    cs.CR cs.AR

    Prevention of Microarchitectural Covert Channels on an Open-Source 64-bit RISC-V Core

    Authors: Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Luca Benini, Gernot Heiser

    Abstract: Covert channels enable information leakage across security boundaries of the operating system. Microarchitectural covert channels exploit changes in execution timing resulting from competing access to limited hardware resources. We use the recent experimental support for time protection, aimed at preventing covert channels, in the seL4 microkernel and evaluate the efficacy of the mechanisms agains…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: 6 pages, 7 figures, submitted to CARRV '20, additional appendix

  13. arXiv:1803.04783

    cs.DC cs.AR

    A Scalable Near-Memory Architecture for Training Deep Neural Networks on Large In-Memory Datasets

    Authors: Fabian Schuiki, Michael Schaffner, Frank K. Gürkaynak, Luca Benini

    Abstract: Most investigations into near-memory hardware accelerators for deep neural networks have primarily focused on inference, while the potential of accelerating training has received relatively little attention so far. Based on an in-depth analysis of the key computational patterns in state-of-the-art gradient-based training methods, we propose an efficient near-memory acceleration engine called NTX t…

    Submitted 17 October, 2018; v1 submitted 19 February, 2018; originally announced March 2018.

    Comments: 14 pages, submitted to IEEE Transactions on Computers journal

  14. arXiv:1712.01021

    cs.AR

    An 826 MOPS, 210 uW/MHz Unum ALU in 65 nm

    Authors: Florian Glaser, Stefan Mach, Abbas Rahimi, Frank K. Gürkaynak, Qiuting Huang, Luca Benini

    Abstract: To overcome the limitations of conventional floating-point number formats, an interval arithmetic and variable-width storage format called universal number (unum) has been recently introduced. This paper presents the first (to the best of our knowledge) silicon implementation measurements of an application-specific integrated circuit (ASIC) for unum floating-point arithmetic. The designed chip inc…

    Submitted 4 December, 2017; v1 submitted 4 December, 2017; originally announced December 2017.

  15. arXiv:1612.05974

    cs.AR cs.CR cs.LG cs.NE

    An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

    Authors: Francesco Conti, Robert Schilling, Pasquale Davide Schiavone, Antonio Pullini, Davide Rossi, Frank Kagan Gürkaynak, Michael Muehlberghuber, Michael Gautschi, Igor Loi, Germain Haugou, Stefan Mangard, Luca Benini

    Abstract: Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data securi…

    Submitted 23 April, 2017; v1 submitted 18 December, 2016; originally announced December 2016.

    Comments: 15 pages, 12 figures, accepted for publication to the IEEE Transactions on Circuits and Systems - I: Regular Papers

  16. arXiv:1608.08376

    cs.AR

    A near-threshold RISC-V core with DSP extensions for scalable IoT Endpoint Devices

    Authors: Michael Gautschi, Pasquale Davide Schiavone, Andreas Traber, Igor Loi, Antonio Pullini, Davide Rossi, Eric Flamand, Frank K. Gurkaynak, Luca Benini

    Abstract: Endpoint devices for Internet-of-Things not only need to work under an extremely tight power envelope of a few milliwatts, but also need to be flexible in their computing capabilities, from a few kOPS to GOPS. Near-threshold (NT) operation can achieve higher energy efficiency, and the performance scalability can be gained through parallelism. In this paper we describe the design of an open-source RISC…

    Submitted 30 August, 2016; originally announced August 2016.