Showing 1–14 of 14 results for author: Swany, M

  1. arXiv:2402.06086  [pdf, other]

    cs.DC cs.AI cs.DS

    Rhizomes and Diffusions for Processing Highly Skewed Graphs on Fine-Grain Message-Driven Systems

    Authors: Bibrak Qamar Chandio, Prateek Srivastava, Maciej Brodowicz, Martin Swany, Thomas Sterling

    Abstract: The paper provides a unified co-design of 1) a programming and execution model that allows spawning tasks from within the vertex data at runtime, 2) language constructs for \textit{actions} that send work to where the data resides, combining parallel expressiveness of local control objects (LCOs) to implement asynchronous graph processing primitives, 3) and an innovative vertex-centric data-struct…

    Submitted 7 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.02576

    ACM Class: C.1.4; C.3; C.4; D.1.3

  2. Flexible Communication for Optimal Distributed Learning over Unpredictable Networks

    Authors: Sahil Tyagi, Martin Swany

    Abstract: Gradient compression alleviates expensive communication in distributed deep learning by sending fewer values and their corresponding indices, typically via Allgather (AG). Training with high compression ratio (CR) achieves high accuracy like DenseSGD, but has lower parallel scaling due to high communication cost (i.e., parallel efficiency). Using lower CRs improves parallel efficiency by lowering sy…

    Submitted 29 January, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: 2023 IEEE International Conference on Big Data (BigData)

    Journal ref: 2023 IEEE International Conference on Big Data (BigData), 925-935

  3. Accelerating Distributed ML Training via Selective Synchronization

    Authors: Sahil Tyagi, Martin Swany

    Abstract: In distributed training, deep neural networks (DNNs) are launched over multiple workers concurrently and aggregate their local updates on each step in bulk-synchronous parallel (BSP) training. However, BSP does not linearly scale-out due to high communication cost of aggregation. To mitigate this overhead, alternatives like Federated Averaging (FedAvg) and Stale-Synchronous Parallel (SSP) either r…

    Submitted 29 January, 2024; v1 submitted 16 July, 2023; originally announced July 2023.

    Journal ref: Tyagi, S., & Swany, M. (2023). Accelerating Distributed ML Training via Selective Synchronization. 2023 IEEE International Conference on Cluster Computing (CLUSTER), 1-12

  4. GraVAC: Adaptive Compression for Communication-Efficient Distributed DL Training

    Authors: Sahil Tyagi, Martin Swany

    Abstract: Distributed data-parallel (DDP) training improves overall application throughput as multiple devices train on a subset of data and aggregate updates to produce a globally shared model. The periodic synchronization at each iteration incurs considerable overhead, exacerbated by the increasing size and complexity of state-of-the-art neural networks. Although many gradient compression techniques propo…

    Submitted 29 January, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

    Journal ref: Tyagi, S., & Swany, M. (2023). GraVAC: Adaptive Compression for Communication-Efficient Distributed DL Training. 2023 IEEE 16th International Conference on Cloud Computing (CLOUD), 319-329

  5. GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs

    Authors: Boyuan Zhang, Jiannan Tian, Sheng Di, Xiaodong Yu, Martin Swany, Dingwen Tao, Franck Cappello

    Abstract: Today's graphics processing unit (GPU) applications produce vast volumes of data, which are challenging to store and transfer efficiently. Thus, data compression is becoming a critical technique to mitigate the storage burden and communication cost. LZSS is the core algorithm in many widely used compressors, such as Deflate. However, existing GPU-based LZSS compressors suffer from low throughput d…

    Submitted 2 May, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: 12 pages, 9 figures, 3 tables, accepted by ACM ICS '23

  6. arXiv:2302.08090  [pdf, other]

    quant-ph cs.AI cs.CR

    QTrojan: A Circuit Backdoor Against Quantum Neural Networks

    Authors: Cheng Chu, Lei Jiang, Martin Swany, Fan Chen

    Abstract: We propose a circuit-level backdoor attack, \textit{QTrojan}, against Quantum Neural Networks (QNNs) in this paper. QTrojan is implemented by a few quantum gates inserted into the variational quantum circuit of the victim QNN. QTrojan is much stealthier than a prior Data-Poisoning-based Backdoor Attack (DPBA), since it does not embed any trigger in the inputs of the victim QNN or require the access…

    Submitted 16 February, 2023; originally announced February 2023.

    Journal ref: ICASSP2023

  7. arXiv:2302.07337  [pdf, other]

    cs.RO cs.AI cs.GT cs.MA

    Graph Attention Multi-Agent Fleet Autonomy for Advanced Air Mobility

    Authors: Malintha Fernando, Ransalu Senanayake, Heeyoul Choi, Martin Swany

    Abstract: Autonomous mobility is emerging as a new disruptive mode of urban transportation for moving cargo and passengers. However, designing scalable autonomous fleet coordination schemes to accommodate fast-growing mobility systems is challenging primarily due to the increasing heterogeneity of the fleets, time-varying demand patterns, service area expansions, and communication limitations. We introduce…

    Submitted 1 August, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: Accepted to Robotics: Science and Systems, 2023. 14 pages, 13 figures, 3 tables

    Journal ref: Robotics: Science and Systems, 2023

  8. ScaDLES: Scalable Deep Learning over Streaming data at the Edge

    Authors: Sahil Tyagi, Martin Swany

    Abstract: Distributed deep learning (DDL) training systems are designed for cloud and data-center environments that assume homogeneous compute resources, high network bandwidth, sufficient memory and storage, as well as independent and identically distributed (IID) data across all nodes. However, these assumptions don't necessarily apply on the edge, especially when training neural networks on streaming da…

    Submitted 29 January, 2024; v1 submitted 21 January, 2023; originally announced January 2023.

    Journal ref: Tyagi, S., & Swany, M. (2022). ScaDLES: Scalable Deep Learning over Streaming data at the Edge. 2022 IEEE International Conference on Big Data (Big Data), 2113-2122

  9. arXiv:2205.02203  [pdf, other]

    cs.RO

    Graphical Games for UAV Swarm Control Under Time-Varying Communication Networks

    Authors: Malintha Fernando, Ransalu Senanayake, Ariful Azad, Martin Swany

    Abstract: We propose a unified framework for coordinating Unmanned Aerial Vehicle (UAV) swarms operating under time-varying communication networks. Our framework builds on the concept of graphical games, which we argue provides a compelling paradigm to subsume the interaction structures found in networked UAV swarms thanks to the shared local neighborhood properties. We present a general-sum, factorizable p…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Presented in Workshop on Intelligent Aerial Robotics, International Conference on Robotics and Automation, 2022

  10. arXiv:2111.04576  [pdf, other]

    cs.RO cs.AI eess.SY

    CoCo Games: Graphical Game-Theoretic Swarm Control for Communication-Aware Coverage

    Authors: Malintha Fernando, Ransalu Senanayake, Martin Swany

    Abstract: We propose a novel framework for real-time communication-aware coverage control in networked robot swarms. Our framework unifies the robot dynamics with network-level message-routing to reach consensus on swarm formations in the presence of communication uncertainties by leveraging local information. Specifically, we formulate the communication-aware coverage as a cooperative graphical game, and u…

    Submitted 28 April, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

    Comments: 8 pages, 7 figures

    Journal ref: 2022 - IEEE Robotics and Automation Letters

  11. arXiv:2105.10680  [pdf, other]

    cs.NI cs.DC

    Cybercosm: New Foundations for a Converged Science Data Ecosystem

    Authors: Mark Asch, François Bodin, Micah Beck, Terry Moore, Michela Taufer, Martin Swany, Jean-Pierre Vilotte

    Abstract: Scientific communities naturally tend to organize around data ecosystems created by the combination of their observational devices, their data repositories, and the workflows essential to carry their research from observation to discovery. However, these legacy data ecosystems are now breaking down under the pressure of the exponential growth in the volume and velocity of these workflows, which ar…

    Submitted 29 June, 2021; v1 submitted 22 May, 2021; originally announced May 2021.

    Comments: Updated author list

  12. arXiv:2011.14795  [pdf]

    cs.NI

    Energy Aware Routing with Computational Offloading for Wireless Sensor Networks

    Authors: Adam Barker, Martin Swany

    Abstract: Wireless sensor networks (WSN) are characterized by a network of small, battery powered devices, operating remotely with no pre-existing infrastructure. The unique structure of WSN allows for novel approaches to data reduction and energy preservation. This paper presents a modification to the existing Q-routing protocol by providing an alternate action of performing sensor data reduction in place t…

    Submitted 30 November, 2020; originally announced November 2020.

    Comments: 17 pages, NeTIOT 2020

  13. An information services algorithm to heuristically summarize IP addresses for a distributed, hierarchical directory service

    Authors: Marcos Portnoi, Jason Zurawsky, Martin Swany

    Abstract: A distributed, hierarchical information service for computer networks might rely on several instances, located in different layers. A distributed directory service, for example, might be comprised of upper level listings, and local directories. The upper level listings contain a compact version of the local directories. Clients desiring to access the information contained in local directories migh…

    Submitted 7 January, 2015; v1 submitted 31 December, 2014; originally announced January 2015.

    Comments: Grid Computing (GRID), 2010 11th IEEE/ACM International Conference on, 25-28 Oct. 2010

  14. arXiv:1408.4939  [pdf]

    cs.DC

    Offloading MPI Parallel Prefix Scan (MPI_Scan) with the NetFPGA

    Authors: Omer Arap, Martin Swany

    Abstract: Parallel programs written using the standard Message Passing Interface (MPI) frequently depend upon the ability to efficiently execute collective operations. MPI_Scan is a collective operation defined in MPI that implements parallel prefix scan, which is a very useful primitive operation in several parallel applications. This operation can be very time consuming. In this paper, we explore the use of…

    Submitted 21 August, 2014; originally announced August 2014.

    Comments: Presented at First International Workshop on FPGAs for Software Programmers (FSP 2014) (arXiv:1408.4423)

    Report number: FSP/2014/06