Showing 1–27 of 27 results for author: Mostafa, H

  1. arXiv:2406.17611  [pdf, other]

    cs.LG eess.SP

    Distributed Training of Large Graph Neural Networks with Variable Communication Rates

    Authors: Juan Cervino, Md Asadullah Turja, Hesham Mostafa, Nageen Himayat, Alejandro Ribeiro

    Abstract: Training Graph Neural Networks (GNNs) on large graphs presents unique challenges due to the large memory and computing requirements. Distributed GNN training, where the graph is partitioned across multiple machines, is a common approach to training GNNs on large graphs. However, as the graph cannot generally be decomposed into small non-interacting components, data communication between the traini…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2405.20445  [pdf, other]

    cs.LG cs.SI

    GraphAny: A Foundation Model for Node Classification on Any Graph

    Authors: Jianan Zhao, Hesham Mostafa, Mikhail Galkin, Michael Bronstein, Zhaocheng Zhu, Jian Tang

    Abstract: Foundation models that can perform inference on any new task without requiring specific training have revolutionized machine learning in vision and language applications. However, applications involving graph-structured data remain a tough nut for foundation models, due to challenges in the unique feature- and label spaces associated with each graph. Traditional graph ML models such as graph neura…

    Submitted 2 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: Preprint. Work in progress

  3. arXiv:2405.05495  [pdf, other]

    cs.OH

    PARSAC: Fast, Human-quality Floorplanning for Modern SoCs with Complex Design Constraints

    Authors: Hesham Mostafa, Uday Mallappa, Mikhail Galkin, Mariano Phielipp, Somdeb Majumdar

    Abstract: The floorplanning of Systems-on-a-Chip (SoCs) and of chip sub-systems is a crucial step in the physical design flow as it determines the optimal shapes and locations of the blocks that make up the system. Simulated Annealing (SA) has been the method of choice for tackling classical floorplanning problems where the objective is to minimize wire-length and the total placement area. The goal in indus…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 9 pages, 7 figures

  4. arXiv:2405.05480  [pdf, other]

    cs.AR cs.AI cs.LG

    FloorSet -- a VLSI Floorplanning Dataset with Design Constraints of Real-World SoCs

    Authors: Uday Mallappa, Hesham Mostafa, Mikhail Galkin, Mariano Phielipp, Somdeb Majumdar

    Abstract: Floorplanning for systems-on-a-chip (SoCs) and its sub-systems is a crucial and non-trivial step of the physical design flow. It represents a difficult combinatorial optimization problem. A typical large scale SoC with 120 partitions generates a search-space of nearly 10E250. As novel machine learning (ML) approaches emerge to tackle such problems, there is a growing need for a modern benchmark th…

    Submitted 27 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: 10 pages, 11 figures

  5. arXiv:2311.17847  [pdf, other]

    cs.DC

    FastSample: Accelerating Distributed Graph Neural Network Training for Billion-Scale Graphs

    Authors: Hesham Mostafa, Adam Grabowski, Md Asadullah Turja, Juan Cervino, Alejandro Ribeiro, Nageen Himayat

    Abstract: Training Graph Neural Networks (GNNs) on a large monolithic graph presents unique challenges as the graph cannot fit within a single machine and it cannot be decomposed into smaller disconnected components. Distributed sampling-based training distributes the graph across multiple machines and trains the GNN on small parts of the graph that are randomly sampled every training iteration. We show that…

    Submitted 29 November, 2023; originally announced November 2023.

  6. arXiv:2310.04562  [pdf, other]

    cs.CL cs.AI

    Towards Foundation Models for Knowledge Graph Reasoning

    Authors: Mikhail Galkin, Xinyu Yuan, Hesham Mostafa, Jian Tang, Zhaocheng Zhu

    Abstract: Foundation models in language and vision have the ability to run inference on any textual and visual inputs thanks to the transferable representations such as a vocabulary of tokens in language. Knowledge graphs (KGs) have different entity and relation vocabularies that generally do not overlap. The key challenge of designing foundation models on KGs is to learn such transferable representations t…

    Submitted 9 April, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  7. arXiv:2302.13285  [pdf, other]

    eess.SP cs.IT

    Ultra-Reliable Device-Centric Uplink Communications in Airborne Networks: A Spatiotemporal Analysis

    Authors: Yasser Nabil, Hesham ElSawy, Suhail Al-Dharrab, Hussein Attia, Hassan Mostafa

    Abstract: This paper proposes an ultra-reliable device-centric uplink (URDC-UL) communication scheme for airborne networks. In particular, base stations (BSs) are mounted on unmanned aerial vehicles (UAVs) that travel to schedule UL transmissions and collect data from devices. To attain an ultra-reliable unified device-centric performance, the UL connection is established when the UAV-BS is hovering at the…

    Submitted 26 February, 2023; originally announced February 2023.

  8. arXiv:2112.09828  [pdf, other]

    cs.CV

    Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs

    Authors: Shengyu Feng, Subarna Tripathi, Hesham Mostafa, Marcel Nassar, Somdeb Majumdar

    Abstract: Dynamic scene graph generation from a video is challenging due to the temporal dynamics of the scene and the inherent temporal fluctuations of predictions. We hypothesize that capturing long-term temporal dependencies is the key to effective generation of dynamic scene graphs. We propose to learn the long-term dependencies in a video by capturing the object-level consistency and inter-object relat…

    Submitted 19 October, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: WACV 2023

  9. arXiv:2111.06483  [pdf, other]

    cs.LG cs.AI

    Sequential Aggregation and Rematerialization: Distributed Full-batch Training of Graph Neural Networks on Large Graphs

    Authors: Hesham Mostafa

    Abstract: We present the Sequential Aggregation and Rematerialization (SAR) scheme for distributed full-batch training of Graph Neural Networks (GNNs) on large graphs. Large-scale training of GNNs has recently been dominated by sampling-based methods and methods based on non-learnable message passing. SAR on the other hand is a distributed technique that can train any GNN type directly on an entire large gr…

    Submitted 15 April, 2022; v1 submitted 11 November, 2021; originally announced November 2021.

  10. arXiv:2111.06312  [pdf, other]

    cs.LG cs.AI cs.MS cs.SI

    Implicit SVD for Graph Representation Learning

    Authors: Sami Abu-El-Haija, Hesham Mostafa, Marcel Nassar, Valentino Crespi, Greg Ver Steeg, Aram Galstyan

    Abstract: Recent improvements in the performance of state-of-the-art (SOTA) methods for Graph Representational Learning (GRL) have come at the cost of significant computational resource requirements for training, e.g., for calculating gradients via backprop over many data epochs. Meanwhile, Singular Value Decomposition (SVD) can find closed-form solutions to convex problems, using merely a handful of epochs…

    Submitted 11 November, 2021; originally announced November 2021.

    Journal ref: Advances in Neural Information Processing Systems (NeurIPS) 2021

  11. arXiv:2109.03563  [pdf, other]

    cs.IT

    Data Aggregation in Synchronous Large-scale IoT Networks: Granularity, Reliability, and Delay Tradeoffs

    Authors: Yasser Nabil, Hesham ElSawy, Suhail Al-Dharrab, Hassan Mostafa, Hussein Attia

    Abstract: This paper studies data aggregation in large-scale regularly deployed Internet of Things (IoT) networks, where devices generate synchronized time-triggered traffic (e.g., measurements or updates). The data granularity, in terms of information content and temporal resolution, is parameterized by the sizes of the generated packets and the duty cycle of packet generation. The generated data packets a…

    Submitted 8 September, 2021; originally announced September 2021.

  12. arXiv:2106.03213  [pdf, other]

    cs.LG cs.AI

    On Local Aggregation in Heterophilic Graphs

    Authors: Hesham Mostafa, Marcel Nassar, Somdeb Majumdar

    Abstract: Many recent works have studied the performance of Graph Neural Networks (GNNs) in the context of graph homophily - a label-dependent measure of connectivity. Traditional GNNs generate node embeddings by aggregating information from a node's neighbors in the graph. Recent results in node classification tasks show that this local aggregation approach performs poorly in graphs with low homophily (het…

    Submitted 6 June, 2021; originally announced June 2021.

  13. arXiv:2012.09904  [pdf, other]

    cs.CV cs.LG

    Attention-based Image Upsampling

    Authors: Souvik Kundu, Hesham Mostafa, Sharath Nittur Sridhar, Sairam Sundaresan

    Abstract: Convolutional layers are an integral part of many deep neural network solutions in computer vision. Recent work shows that replacing the standard convolution operation with mechanisms based on self-attention leads to improved performance on image classification and object detection tasks. In this work, we show how attention mechanisms can be used to replace another canonical operation: strided tra…

    Submitted 17 December, 2020; originally announced December 2020.

  14. arXiv:2003.00635  [pdf, other]

    cs.AI

    Permutohedral-GCN: Graph Convolutional Networks with Global Attention

    Authors: Hesham Mostafa, Marcel Nassar

    Abstract: Graph convolutional networks (GCNs) update a node's feature vector by aggregating features from its neighbors in the graph. This ignores potentially useful contributions from distant nodes. Identifying such useful distant contributions is challenging due to scalability issues (too many nodes can potentially contribute) and oversmoothing (aggregating features from too many nodes risks swamping out…

    Submitted 1 March, 2020; originally announced March 2020.

  15. arXiv:1912.13075  [pdf, other]

    cs.LG stat.ML

    Robust Federated Learning Through Representation Matching and Adaptive Hyper-parameters

    Authors: Hesham Mostafa

    Abstract: Federated learning is a distributed, privacy-aware learning scenario which trains a single model on data belonging to several clients. Each client trains a local model on its data and the local models are then aggregated by a central party. Current federated learning methods struggle in cases with heterogeneous client-side data distributions which can quickly lead to divergent local models and a c…

    Submitted 30 December, 2019; originally announced December 2019.

  16. arXiv:1907.06916  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems

    Authors: Mark D. McDonnell, Hesham Mostafa, Runchun Wang, Andre van Schaik

    Abstract: Batch-normalization (BN) layers are thought to be an integrally important layer type in today's state-of-the-art deep convolutional neural networks for computer vision tasks such as classification and detection. However, BN layers introduce complexity and computational overheads that are highly undesirable for training and/or inference on low-power custom hardware implementations of real-time embe…

    Submitted 22 July, 2019; v1 submitted 16 July, 2019; originally announced July 2019.

    Comments: 8 pages, published IEEE conference paper

  17. arXiv:1902.05967  [pdf, ps, other]

    cs.LG stat.ML

    Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization

    Authors: Hesham Mostafa, Xin Wang

    Abstract: Modern deep neural networks are typically highly overparameterized. Pruning techniques are able to remove a significant fraction of network parameters with little loss in accuracy. Recently, techniques based on dynamic reallocation of non-zero parameters have emerged, allowing direct training of sparse networks without having to pre-train a large dense model. Here we present a novel dynamic sparse…

    Submitted 12 May, 2019; v1 submitted 15 February, 2019; originally announced February 2019.

    Comments: Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019

  18. arXiv:1901.09948  [pdf, other]

    cs.NE q-bio.NC

    Surrogate Gradient Learning in Spiking Neural Networks

    Authors: Emre O. Neftci, Hesham Mostafa, Friedemann Zenke

    Abstract: Spiking neural networks are nature's versatile solution to fault-tolerant and energy efficient signal processing. To translate these benefits into hardware, a growing number of neuromorphic spiking neural network processors attempt to emulate biological neural networks. These developments have created an imminent need for methods and tools to enable such systems to solve real-world signal processi…

    Submitted 3 May, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

  19. Synaptic Plasticity Dynamics for Deep Continuous Local Learning (DECOLLE)

    Authors: Jacques Kaiser, Hesham Mostafa, Emre Neftci

    Abstract: A growing body of work underlines striking similarities between biological neural networks and recurrent, binary neural networks. A relatively smaller body of work, however, discusses similarities between learning dynamics employed in deep artificial neural networks and synaptic plasticity in spiking neural networks. The challenge preventing this is largely caused by the discrepancy between the dy…

    Submitted 20 May, 2020; v1 submitted 26 November, 2018; originally announced November 2018.

    Comments: Published in Frontiers in Neuroscience - Neuromorphic Engineering

    Journal ref: Frontiers in Neuroscience, 2020

  20. arXiv:1711.06756  [pdf, other]

    cs.NE cs.LG stat.ML

    Deep supervised learning using local errors

    Authors: Hesham Mostafa, Vishwajith Ramesh, Gert Cauwenberghs

    Abstract: Error backpropagation is a highly effective mechanism for learning high-quality hierarchical features in deep networks. Updating the features or weights in one layer, however, requires waiting for the propagation of error signals from higher layers. Learning using delayed and non-local errors makes it hard to reconcile backpropagation with the learning mechanisms observed in biological neural netw…

    Submitted 17 November, 2017; originally announced November 2017.

  21. arXiv:1708.04251  [pdf, other]

    cs.NE

    A learning framework for winner-take-all networks with stochastic synapses

    Authors: Hesham Mostafa, Gert Cauwenberghs

    Abstract: Many recent generative models make use of neural networks to transform the probability distribution of a simple low-dimensional noise process into the complex distribution of the data. This raises the question of whether biological networks operate along similar principles to implement a probabilistic model of the environment through transformations of intrinsic noise processes. The intrinsic neur…

    Submitted 5 February, 2018; v1 submitted 14 August, 2017; originally announced August 2017.

  22. arXiv:1707.03049  [pdf, other]

    cs.NE

    Hardware-efficient on-line learning through pipelined truncated-error backpropagation in binary-state networks

    Authors: Hesham Mostafa, Bruno Pedroni, Sadique Sheik, Gert Cauwenberghs

    Abstract: Artificial neural networks (ANNs) trained using backpropagation are powerful learning architectures that have achieved state-of-the-art performance in various benchmarks. Significant effort has been devoted to developing custom silicon devices to accelerate inference in ANNs. Accelerating the training phase, however, has attracted relatively little attention. In this paper, we describe a hardware-…

    Submitted 16 August, 2017; v1 submitted 15 June, 2017; originally announced July 2017.

    Comments: Now also consider 0/1 binary activations. Memory access statistics reported

  23. NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps

    Authors: Alessandro Aimar, Hesham Mostafa, Enrico Calabrese, Antonio Rios-Navarro, Ricardo Tapiador-Morales, Iulia-Alexandra Lungu, Moritz B. Milde, Federico Corradi, Alejandro Linares-Barranco, Shih-Chii Liu, Tobi Delbruck

    Abstract: Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though Graphical Processing Units (GPUs) are most often used in training and deploying CNNs, their power efficiency is less than 10 GOp/s/W for single-frame runtime inference. We propose a flexible and efficient CNN accelerator architecture…

    Submitted 6 March, 2018; v1 submitted 5 June, 2017; originally announced June 2017.

  24. arXiv:1606.08165  [pdf, other]

    cs.NE cs.LG

    Supervised learning based on temporal coding in spiking neural networks

    Authors: Hesham Mostafa

    Abstract: Gradient descent training techniques are remarkably successful in training analog-valued artificial neural networks (ANNs). Such training techniques, however, do not transfer easily to spiking networks due to the spike generation hard non-linearity and the discrete nature of spike communication. We show that in a feedforward spiking network that uses a temporal coding scheme where information is e…

    Submitted 16 August, 2017; v1 submitted 27 June, 2016; originally announced June 2016.

    Comments: Extended the discussion and introduction. Clarified the training parameters

  25. arXiv:1512.02930  [pdf, other]

    cs.NE cs.ET q-bio.NC

    Stochastic Interpretation of Quasi-periodic Event-based Systems

    Authors: Hesham Mostafa, Giacomo Indiveri

    Abstract: Many networks used in machine learning and as models of biological neural networks make use of stochastic neurons or neuron-like units. We show that stochastic artificial neurons can be realized on silicon chips by exploiting the quasi-periodic behavior of mismatched analog oscillators to approximate the neuron's stochastic activation function. We represent neurons by finite state machines (FSMs)…

    Submitted 9 December, 2015; originally announced December 2015.

  26. An event-based architecture for solving constraint satisfaction problems

    Authors: Hesham Mostafa, Lorenz K. Müller, Giacomo Indiveri

    Abstract: Constraint satisfaction problems (CSPs) are typically solved using conventional von Neumann computing architectures. However, these architectures do not reflect the distributed nature of many of these problems and are thus ill-suited to solving them. In this paper we present a hybrid analog/digital hardware architecture specifically designed to solve such problems. We cast CSPs as networks of ster…

    Submitted 4 May, 2015; originally announced May 2015.

    Comments: First two authors contributed equally to this work

    Journal ref: Nature Communications 6, Article number: 8941 (2015), pg. 1-10

  27. arXiv:1202.3749  [pdf]

    cs.AI

    Compact Mathematical Programs For DEC-MDPs With Structured Agent Interactions

    Authors: Hala Mostafa, Victor Lesser

    Abstract: To deal with the prohibitive complexity of calculating policies in Decentralized MDPs, researchers have proposed models that exploit structured agent interactions. Settings where most agent actions are independent except for few actions that affect the transitions and/or rewards of other agents can be modeled using Event-Driven Interactions with Complex Rewards (EDI-CR). Finding the optimal joint…

    Submitted 14 February, 2012; originally announced February 2012.

    Report number: UAI-P-2011-PG-523-530