
Showing 1–16 of 16 results for author: Koziris, N

Searching in archive cs.
  1. arXiv:2406.17802  [pdf, other]

    cs.AR

    Design, Implementation and Evaluation of the SVNAPOT Extension on a RISC-V Processor

    Authors: Nikolaos-Charalampos Papadopoulos, Stratos Psomadakis, Vasileios Karakostas, Nectarios Koziris, Dionisios N. Pnevmatikatos

    Abstract: The RISC-V SVNAPOT Extension aims to remedy the performance overhead of the Memory Management Unit (MMU), under heavy memory loads. The Privileged Specification defines additional Natural-Power-of-Two (NAPOT) multiples of the 4KB base page size, with 64KB as the default candidate. In this paper we extend the MMU of the Rocket Chip Generator, in order to manage the collocation of 64KB pages along w…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Extended abstract accepted to the RISC-V EU Summit 2024 - June 24-28 Munich, Germany

  2. arXiv:2406.06900  [pdf, other]

    cs.DC

    SmartPQ: An Adaptive Concurrent Priority Queue for NUMA Architectures

    Authors: Christina Giannoula, Foteini Strati, Dimitrios Siakavaras, Georgios Goumas, Nectarios Koziris

    Abstract: Concurrent priority queues are widely used in important workloads, such as graph applications and discrete event simulations. However, designing scalable concurrent priority queues for NUMA architectures is challenging. Even though several NUMA-oblivious implementations can scale up to a high number of threads, exploiting the potential parallelism of insert operation, NUMA-oblivious implementation…

    Submitted 10 June, 2024; originally announced June 2024.

  3. arXiv:2302.04225  [pdf]

    cs.DC

    Feature-based SpMV Performance Analysis on Contemporary Devices

    Authors: Panagiotis Mpakos, Dimitrios Galanopoulos, Petros Anastasiadis, Nikela Papadopoulou, Nectarios Koziris, Georgios Goumas

    Abstract: The SpMV kernel is characterized by high performance variation per input matrix and computing platform. While GPUs were considered State-of-the-Art for SpMV, with the emergence of advanced multicore CPUs and low-power FPGA accelerators, we need to revisit its performance and energy efficiency. This paper provides a high-level SpMV performance analysis based on structural features of matrices relat…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: to appear at IPDPS'23

  4. arXiv:2301.09674  [pdf, other]

    cs.AR cs.DC cs.PF

    Architectural Support for Efficient Data Movement in Disaggregated Systems

    Authors: Christina Giannoula, Kailong Huang, Jonathan Tang, Nectarios Koziris, Georgios Goumas, Zeshan Chishti, Nandita Vijaykumar

    Abstract: Resource disaggregation offers a cost effective solution to resource scaling, utilization, and failure-handling in data centers by physically separating hardware devices in a server. Servers are architected as pools of processor, memory, and storage devices, organized as independent failure-isolated components interconnected by a high-bandwidth network. A critical challenge, however, is the high p…

    Submitted 23 January, 2023; originally announced January 2023.

    Comments: To appear in the Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS) 2023 and the ACM SIGMETRICS 2023 conference. arXiv admin note: text overlap with arXiv:2301.00414

  5. arXiv:2301.00414  [pdf, other]

    cs.AR cs.DC cs.PF

    DaeMon: Architectural Support for Efficient Data Movement in Disaggregated Systems

    Authors: Christina Giannoula, Kailong Huang, Jonathan Tang, Nectarios Koziris, Georgios Goumas, Zeshan Chishti, Nandita Vijaykumar

    Abstract: Resource disaggregation offers a cost effective solution to resource scaling, utilization, and failure-handling in data centers by physically separating hardware devices in a server. Servers are architected as pools of processor, memory, and storage devices, organized as independent failure-isolated components interconnected by a high-bandwidth network. A critical challenge, however, is the high p…

    Submitted 18 January, 2023; v1 submitted 1 January, 2023; originally announced January 2023.

    Comments: To appear in the Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS) 2023 and the ACM SIGMETRICS 2023 conference

  6. arXiv:2204.00900  [pdf, ps, other]

    cs.AR cs.DC cs.PF

    Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems

    Authors: Christina Giannoula, Ivan Fernandez, Juan Gómez-Luna, Nectarios Koziris, Georgios Goumas, Onur Mutlu

    Abstract: Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures. Near-bank PIM architectures place simple cores close to DRAM banks and can yield significant performance and energy improvements in parallel applications by alleviating data access costs. Real PIM systems can provide high levels of parallelism, large aggregate memory bandwidth and low me…

    Submitted 2 April, 2022; originally announced April 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2201.05072

  7. arXiv:2202.01546  [pdf, other]

    cs.DB

    QueryER: A Framework for Fast Analysis-Aware Deduplication over Dirty Data

    Authors: Giorgos Alexiou, George Papastefanatos, Vassilis Stamatopoulos, Georgia Koutrika, Nectarios Koziris

    Abstract: In this work, we explore the problem of correctly and efficiently answering complex SPJ queries issued directly on top of dirty data. We introduce QueryER, a framework that seamlessly integrates Entity Resolution into Query Processing. QueryER executes analysis-aware deduplication by weaving ER operators into the query plan. The experimental evaluation of our approach exhibits that it adapts to th…

    Submitted 3 February, 2022; originally announced February 2022.

  8. arXiv:2201.05072  [pdf, other]

    cs.AR cs.DC cs.PF

    SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems

    Authors: Christina Giannoula, Ivan Fernandez, Juan Gómez-Luna, Nectarios Koziris, Georgios Goumas, Onur Mutlu

    Abstract: Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures. Near-bank PIM architectures place simple cores close to DRAM banks and can yield significant performance and energy improvements in parallel applications by alleviating data access costs. Real PIM systems can provide high levels of parallelism, large aggregate memory bandwidth and low me…

    Submitted 23 May, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

    Comments: To appear in the Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS) 2022 and the ACM SIGMETRICS 2022 conference

  9. arXiv:2101.07557  [pdf, other]

    cs.AR cs.DC

    SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures

    Authors: Christina Giannoula, Nandita Vijaykumar, Nikela Papadopoulou, Vasileios Karakostas, Ivan Fernandez, Juan Gómez-Luna, Lois Orosa, Nectarios Koziris, Georgios Goumas, Onur Mutlu

    Abstract: Near-Data-Processing (NDP) architectures present a promising way to alleviate data movement costs and can provide significant performance and energy benefits to parallel applications. Typically, NDP architectures support several NDP units, each including multiple simple cores placed close to memory. To fully leverage the benefits of NDP and achieve high performance for parallel workloads, efficien…

    Submitted 13 February, 2021; v1 submitted 19 January, 2021; originally announced January 2021.

    Comments: To appear in the 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA-27)

  10. arXiv:2009.07723  [pdf, other]

    cs.AR

    Enabling Virtual Memory Research on RISC-V with a Configurable TLB Hierarchy for the Rocket Chip Generator

    Authors: Nikolaos Charalampos Papadopoulos, Vasileios Karakostas, Konstantinos Nikas, Nectarios Koziris, Dionisios N. Pnevmatikatos

    Abstract: The Rocket Chip Generator uses a collection of parameterized processor components to produce RISC-V-based SoCs. It is a powerful tool that can produce a wide variety of processor designs ranging from tiny embedded processors to complex multi-core systems. In this paper we extend the features of the Memory Management Unit of the Rocket Chip Generator and specifically the TLB hierarchy. TLBs are ess…

    Submitted 16 September, 2020; originally announced September 2020.

    Comments: 7 pages, Fourth Workshop on Computer Architecture Research with RISC-V (CARRV2020)

    ACM Class: B.3.2; B.3.3; B.5.1

  11. arXiv:1802.05536  [pdf, other]

    cs.SI physics.soc-ph

    Graph Operator Modeling over Large Graph Datasets

    Authors: Tasos Bakogiannis, Ioannis Giannakopoulos, Dimitrios Tsoumakos, Nectarios Koziris

    Abstract: As graph representations of data emerge in multiple domains, data analysts need to be able to intelligently select among a magnitude of different data graphs based on the effects different graph operators have on them. Exhaustive execution of an operator over the bulk of available data sources is impractical due to the massive resources it requires. Additionally, the same process would have to be…

    Submitted 20 August, 2018; v1 submitted 15 February, 2018; originally announced February 2018.

  12. Performance Analysis and Optimization of Sparse Matrix-Vector Multiplication on Modern Multi- and Many-Core Processors

    Authors: Athena Elafrou, Georgios Goumas, Nektarios Koziris

    Abstract: This paper presents a low-overhead optimizer for the ubiquitous sparse matrix-vector multiplication (SpMV) kernel. Architectural diversity among different processors together with structural diversity among different sparse matrices lead to bottleneck diversity. This justifies an SpMV optimizer that is both matrix- and architecture-adaptive through runtime specialization. To this direction, we pre…

    Submitted 15 November, 2017; originally announced November 2017.

    Comments: 10 pages, 7 figures, ICPP 2017

  13. arXiv:1704.02855  [pdf, other]

    cs.DC cs.DB cs.PF

    A Decision Tree Based Approach Towards Adaptive Profiling of Distributed Applications

    Authors: Ioannis Giannakopoulos, Dimitrios Tsoumakos, Nectarios Koziris

    Abstract: The adoption of the distributed paradigm has allowed applications to increase their scalability, robustness and fault tolerance, but it has also complicated their structure, leading to an exponential growth of the applications' configuration space and increased difficulty in predicting their performance. In this work, we describe a novel, automated profiling methodology that makes no assumptions o…

    Submitted 21 May, 2017; v1 submitted 10 April, 2017; originally announced April 2017.

    Comments: 18 pages

  14. arXiv:1702.02978  [pdf, other]

    cs.DC

    Elastic Resource Management with Adaptive State Space Partitioning of Markov Decision Processes

    Authors: Konstantinos Lolos, Ioannis Konstantinou, Verena Kantere, Nectarios Koziris

    Abstract: Modern large-scale computing deployments consist of complex applications running over machine clusters. An important issue in these is the offering of elasticity, i.e., the dynamic allocation of resources to applications to meet fluctuating workload demands. Threshold based approaches are typically employed, yet they are difficult to configure and optimize. Approaches based on reinforcement learni…

    Submitted 9 February, 2017; originally announced February 2017.

  15. arXiv:1601.07400  [pdf, other]

    cs.DC

    Improving virtual host efficiency through resource and interference aware scheduling

    Authors: Evangelos Angelou, Konstantinos Kaffes, Athanasia Asiki, Georgios Goumas, Nectarios Koziris

    Abstract: Modern Infrastructure-as-a-Service Clouds operate in a competitive environment that caters to any user's requirements for computing resources. The sharing of the various types of resources by diverse applications poses a series of challenges in order to optimize resource utilization while avoiding performance degradation caused by application interference. In this paper, we present two scheduling…

    Submitted 27 January, 2016; originally announced January 2016.

    Comments: 2nd International Workshop on Dynamic Resource Allocation and Management in Embedded, High Performance and Cloud Computing DREAMCloud 2016 (arXiv:cs/1601.04675)

    Report number: DREAMCloud/2016/02

  16. arXiv:1511.02494  [pdf, other]

    cs.PF

    A lightweight optimization selection method for Sparse Matrix-Vector Multiplication

    Authors: Athena Elafrou, Georgios Goumas, Nectarios Koziris

    Abstract: In this paper, we propose an optimization selection methodology for the ubiquitous sparse matrix-vector multiplication (SpMV) kernel. We propose two models that attempt to identify the major performance bottleneck of the kernel for every instance of the problem and then select an appropriate optimization to tackle it. Our first model requires online profiling of the input matrix in order to detect…

    Submitted 10 January, 2016; v1 submitted 8 November, 2015; originally announced November 2015.

    Comments: 10 pages