Showing 1–4 of 4 results for author: Chrapek, M

  1. arXiv:2406.05085  [pdf, other]

    cs.CL cs.AI cs.IR

    Multi-Head RAG: Solving Multi-Aspect Problems with LLMs

    Authors: Maciej Besta, Ales Kubicek, Roman Niggli, Robert Gerstenberger, Lucas Weitzendorf, Mingyuan Chi, Patrick Iff, Joanna Gajda, Piotr Nyczyk, Jürgen Müller, Hubert Niewiadomski, Marcin Chrapek, Michał Podstawski, Torsten Hoefler

    Abstract: Retrieval Augmented Generation (RAG) enhances the abilities of Large Language Models (LLMs) by enabling the retrieval of documents into the LLM context to provide more accurate and relevant responses. Existing RAG solutions do not focus on queries that may require fetching multiple documents with substantially different contents. Such queries occur frequently, but are challenging because the embed…

    Submitted 7 June, 2024; originally announced June 2024.

  2. arXiv:2404.14193  [pdf, other]

    cs.DC cs.NI cs.PF

    LLAMP: Assessing Network Latency Tolerance of HPC Applications with Linear Programming

    Authors: Siyuan Shen, Langwen Huang, Marcin Chrapek, Timo Schneider, Jai Dayal, Manisha Gajbe, Robert Wisniewski, Torsten Hoefler

    Abstract: The shift towards high-bandwidth networks driven by AI workloads in data centers and HPC clusters has unintentionally aggravated network latency, adversely affecting the performance of communication-intensive HPC applications. As large-scale MPI applications often exhibit significant differences in their network latency tolerance, it is crucial to accurately determine the extent of network latency…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 19 pages

    ACM Class: C.4

  3. arXiv:2401.10852  [pdf, other]

    cs.DC

    Software Resource Disaggregation for HPC with Serverless Computing

    Authors: Marcin Copik, Marcin Chrapek, Larissa Schmid, Alexandru Calotoiu, Torsten Hoefler

    Abstract: Aggregated HPC resources have rigid allocation systems and programming models which struggle to adapt to diverse and changing workloads. Consequently, HPC systems fail to efficiently use the large pools of unused memory and increase the utilization of idle computing resources. Prior work attempted to increase the throughput and efficiency of supercomputing systems through workload co-location and…

    Submitted 1 May, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted for publication in the 2024 International Parallel and Distributed Processing Symposium (IPDPS)

  4. arXiv:2309.03628  [pdf, other]

    cs.NI cs.DC cs.OS eess.SY

    OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs

    Authors: Mikhail Khalilov, Marcin Chrapek, Siyuan Shen, Alessandro Vezzu, Thomas Benz, Salvatore Di Girolamo, Timo Schneider, Daniele De Sensi, Luca Benini, Torsten Hoefler

    Abstract: Multi-tenancy is essential for unleashing SmartNIC's potential in datacenters. Our systematic analysis in this work shows that existing on-path SmartNICs have resource multiplexing limitations. For example, existing solutions lack multi-tenancy capabilities such as performance isolation and QoS provisioning for compute and IO resources. Compared to standard NIC data paths with a well-defined set o…

    Submitted 13 March, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: 12 pages, 14 figures, 103 references