Showing 1–5 of 5 results for author: Daglis, A

  1. arXiv:2305.05033

    cs.AR

    A Case for CXL-Centric Server Processors

    Authors: Albert Cho, Anish Saxena, Moinuddin Qureshi, Alexandros Daglis

    Abstract: The memory system is a major performance determinant for server processors. Ever-growing core counts and datasets demand higher bandwidth and capacity as well as lower latency from the memory system. To keep up with growing demands, DDR--the dominant processor interface to memory over the past two decades--has offered higher bandwidth with every generation. However, because each parallel DDR inter…

    Submitted 8 May, 2023; originally announced May 2023.

  2. arXiv:2211.16648

    cs.DC cs.AI cs.LG

    COMET: A Comprehensive Cluster Design Methodology for Distributed Deep Learning Training

    Authors: Divya Kiran Kadiyala, Saeed Rashidi, Taekyung Heo, Abhimanyu Rajeshkumar Bambhaniya, Tushar Krishna, Alexandros Daglis

    Abstract: Modern Deep Learning (DL) models have grown to sizes requiring massive clusters of specialized, high-end nodes to train. Designing such clusters to maximize both performance and utilization--to amortize their steep cost--is a challenging task requiring careful balance of compute, memory, and network resources. Moreover, a plethora of each model's tuning knobs drastically affect the performance, wi…

    Submitted 14 March, 2024; v1 submitted 29 November, 2022; originally announced November 2022.

  3. arXiv:2203.02585

    cs.NI

    NFSlicer: Data Movement Optimization for Shallow Network Functions

    Authors: Anirudh Sarma, Hamed Seyedroudbari, Harshit Gupta, Umakishore Ramachandran, Alexandros Daglis

    Abstract: Network Function (NF) deployments on commodity servers have become ubiquitous in datacenters and enterprise settings. Many commonly used NFs such as firewalls, load balancers and NATs are shallow - i.e., they only examine the packet's header, despite the entire packet being transferred on and off the server. As a result, the gap between moved and inspected data when handling large packets exceeds…

    Submitted 4 March, 2022; originally announced March 2022.

    Comments: 13 pages, 16 figures

  4. arXiv:1809.05859

    cs.AR cs.ET cs.PL

    Exploiting Errors for Efficiency: A Survey from Circuits to Algorithms

    Authors: Phillip Stanley-Marbell, Armin Alaghi, Michael Carbin, Eva Darulova, Lara Dolecek, Andreas Gerstlauer, Ghayoor Gillani, Djordje Jevdjic, Thierry Moreau, Mattia Cacciotti, Alexandros Daglis, Natalie Enright Jerger, Babak Falsafi, Sasa Misailovic, Adrian Sampson, Damien Zufferey

    Abstract: When a computational task tolerates a relaxation of its specification or when an algorithm tolerates the effects of noise in its execution, hardware, programming languages, and system software can trade deviations from correct behavior for lower resource usage. We present, for the first time, a synthesis of research results on computing systems that only make as many errors as their users can tole…

    Submitted 16 September, 2018; originally announced September 2018.

    Comments: 35 pages

  5. Design Guidelines for High-Performance SCM Hierarchies

    Authors: Dmitrii Ustiugov, Alexandros Daglis, Javier Picorel, Mark Sutherland, Edouard Bugnion, Babak Falsafi, Dionisios Pnevmatikatos

    Abstract: With emerging storage-class memory (SCM) nearing commercialization, there is evidence that it will deliver the much-anticipated high density and access latencies within only a few factors of DRAM. Nevertheless, the latency-sensitive nature of memory-resident services makes seamless integration of SCM in servers questionable. In this paper, we ask the question of how best to introduce SCM for such…

    Submitted 7 March, 2019; v1 submitted 20 January, 2018; originally announced January 2018.

    Comments: Published at MEMSYS'18