Search | arXiv e-print repository

The Curse of Beam-Squint in ISAC: Causes, Implications, and Mitigation Strategies

Authors: Ahmet M. Elbir, Kumar Vijay Mishra, Abdulkadir Celik, Ahmed M. Eltawil

Abstract: Integrated sensing and communications (ISAC) has emerged as a means to efficiently utilize spectrum and thereby save cost and power. At the higher end of the spectrum, ISAC systems operate at wideband using large antenna arrays to meet the stringent demands for high-resolution sensing and enhanced communications capacity. However, the wideband implementation entails beam-squint, that is, deviations in the generated beam directions because of the narrowband assumption in the analog components. This causes significant degradation in the communications capacity, target detection, and parameter estimation. This article presents the design challenges caused by beam-squint and its mitigation in ISAC systems. In this context, we also discuss several ISAC design perspectives including far-/near-field beamforming, channel/direction estimation, sparse array design, and index modulation. There are also several research opportunities in waveform design, beam training, and array processing to adequately address beam-squint in ISAC.

Submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted Paper in IEEE Communications Magazine

arXiv:2402.18587 [pdf, other]

doi 10.1109/OJCOMS.2024.3362271

At the Dawn of Generative AI Era: A Tutorial-cum-Survey on New Frontiers in 6G Wireless Intelligence

Authors: Abdulkadir Celik, Ahmed M. Eltawil

Abstract: The majority of data-driven wireless research leans heavily on discriminative AI (DAI) that requires vast real-world datasets. Unlike the DAI, Generative AI (GenAI) pertains to generative models (GMs) capable of discerning the underlying data distribution, patterns, and features of the input data. This makes GenAI a crucial asset in wireless domain wherein real-world data is often scarce, incomplete, costly to acquire, and hard to model or comprehend. With these appealing attributes, GenAI can replace or supplement DAI methods in various capacities. Accordingly, this combined tutorial-survey paper commences with preliminaries of 6G and wireless intelligence by outlining candidate 6G applications and services, presenting a taxonomy of state-of-the-art DAI models, exemplifying prominent DAI use cases, and elucidating the multifaceted ways through which GenAI enhances DAI. Subsequently, we present a tutorial on GMs by spotlighting seminal examples such as generative adversarial networks, variational autoencoders, flow-based GMs, diffusion-based GMs, generative transformers, large language models, to name a few. Contrary to the prevailing belief that GenAI is a nascent trend, our exhaustive review of approximately 120 technical papers demonstrates the scope of research across core wireless research areas, including physical layer design; network optimization, organization, and management; network traffic analytics; cross-layer network security; and localization & positioning. Furthermore, we outline the central role of GMs in pioneering areas of 6G network research, including semantic/THz/near-field communications, ISAC, extremely large antenna arrays, digital twins, AI-generated content services, mobile edge computing and edge AI, adversarial ML, and trustworthy AI. Lastly, we shed light on the multifarious challenges ahead, suggesting potential strategies and promising remedies.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.08186 [pdf, other]

Index Modulation for Integrated Sensing and Communications: A Signal Processing Perspective

Authors: Ahmet M. Elbir, Abdulkadir Celik, Ahmed M. Eltawil, Moeness G. Amin

Abstract: A joint design of both sensing and communication can lead to substantial enhancement for both subsystems in terms of size, cost as well as spectrum and hardware efficiency. In the last decade, integrated sensing and communications (ISAC) has emerged as a means to efficiently utilize the spectrum on a single and shared hardware platform. Recent studies focused on developing multi-function approaches to share the spectrum between radar sensing and communications. Index modulation (IM) is one particular approach to incorporate information-bearing communication symbols into the emitted radar waveforms. While IM has been well investigated in communications-only systems, the adoption of the IM concept in ISAC has recently attracted researchers seeking improved energy/spectral efficiency while maintaining satisfactory radar sensing performance. This article focuses on recent studies on IM-ISAC, and presents in detail the analytical background and relevance of the major IM-ISAC applications.

Submitted 16 January, 2024; originally announced January 2024.

Comments: 11 pages, 5 figures, submitted to IEEE

arXiv:2311.04322 [pdf, ps, other]

NEAT-MUSIC: Auto-calibration of DOA Estimation for Terahertz-Band Massive MIMO Systems

Authors: Ahmet M. Elbir, Abdulkadir Celik, Ahmed M. Eltawil

Abstract: Terahertz (THz) band is envisioned for the future sixth generation wireless systems thanks to its abundant bandwidth and very narrow beamwidth. These features are among the key enabling factors for high resolution sensing with milli-degree level direction-of-arrival (DOA) estimation. Therefore, this paper investigates the DOA estimation problem in THz systems in the presence of two major error sources: 1) gain-phase mismatches, which occur due to the deviations in the radio-frequency circuitry; 2) beam-squint, which is caused by the deviations in the generated beams at different subcarriers due to ultra-wide bandwidth. An auto-calibration approach, namely NoisE subspAce correcTion technique for MUltiple SIgnal Classification (NEAT-MUSIC), is proposed based on the correction of the noise subspace for accurate DOA estimation in the presence of gain-phase mismatches and beam-squint. To gauge the performance of the proposed approach, the Cramer-Rao bounds are also derived. Numerical results show the effectiveness of the proposed approach.

Submitted 7 November, 2023; originally announced November 2023.

Comments: Accepted paper in IEEE Wireless Communications Letters. arXiv admin note: text overlap with arXiv:2310.16724

arXiv:2309.14557 [pdf, other]

Disruption Detection for a Cognitive Digital Supply Chain Twin Using Hybrid Deep Learning

Authors: Mahmoud Ashraf, Amr Eltawil, Islam Ali

Abstract: Purpose: Recent disruptive events, such as COVID-19 and the Russia-Ukraine conflict, had a significant impact on global supply chains. Digital supply chain twins have been proposed in order to provide decision makers with an effective and efficient tool to mitigate disruption impact. Methods: This paper introduces a hybrid deep learning approach for disruption detection within a cognitive digital supply chain twin framework to enhance supply chain resilience. The proposed disruption detection module utilises a deep autoencoder neural network combined with a one-class support vector machine algorithm. In addition, long-short term memory neural network models are developed to identify the disrupted echelon and predict time-to-recovery from the disruption effect. Results: The obtained information from the proposed approach will help decision-makers and supply chain practitioners make appropriate decisions aiming at minimizing negative impact of disruptive events based on real-time disruption detection data. The results demonstrate the trade-off between disruption detection model sensitivity, encountered delay in disruption detection, and false alarms. This approach has seldom been used in recent literature addressing this issue.

Submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.13984 [pdf, other]

Near-field Hybrid Beamforming for Terahertz-band Integrated Sensing and Communications

Authors: Ahmet M. Elbir, Abdulkadir Celik, Ahmed M. Eltawil

Abstract: Terahertz (THz) band communications and integrated sensing and communications (ISAC) are two main facets of the sixth generation wireless networks. In order to compensate for the severe attenuation, the THz wireless systems employ large arrays, wherein the near-field beam-squint severely degrades the beamforming accuracy. Contrary to prior works that examine only either narrowband ISAC beamforming or far-field models, we introduce an alternating optimization technique for hybrid beamforming design in a near-field THz-ISAC scenario. We also propose an efficient approach to compensate for near-field beam-squint via baseband beamformers. Via numerical simulations, we show that the proposed approach achieves satisfactory spectral efficiency performance while accurately estimating the near-field beamformers and mitigating the beam-squint without additional hardware components.

Submitted 25 September, 2023; originally announced September 2023.

Comments: Accepted Paper in 2023 IEEE Global Communications Conference (GLOBECOM), Kuala Lumpur, Malaysia, 2023

arXiv:2307.07242 [pdf, other]

Antenna Selection With Beam Squint Compensation for Integrated Sensing and Communications

Authors: Ahmet M. Elbir, Asmaa Abdallah, Abdulkadir Celik, Ahmed M. Eltawil

Abstract: Next-generation wireless networks strive for higher communication rates, ultra-low latency, seamless connectivity, and high-resolution sensing capabilities. To meet these demands, terahertz (THz)-band signal processing is envisioned as a key technology offering wide bandwidth and sub-millimeter wavelength. Furthermore, the THz integrated sensing and communications (ISAC) paradigm has emerged to jointly access the spectrum and reduce hardware costs through a unified platform. To address the challenges in THz propagation, THz-ISAC systems employ extremely large antenna arrays to improve the beamforming gain for communications with high data rates and sensing with high resolution. However, the cost and power consumption of implementing fully digital beamformers are prohibitive. While hybrid analog/digital beamforming can be a potential solution, the use of subcarrier-independent analog beamformers leads to the beam-squint phenomenon where different subcarriers observe distinct directions because of adopting the same analog beamformer across all subcarriers. In this paper, we develop a sparse array architecture for THz-ISAC with hybrid beamforming to provide a cost-effective solution. We analyze the antenna selection problem under beam-squint influence and introduce a manifold optimization approach for hybrid beamforming design. To reduce computational and memory costs, we propose novel algorithms leveraging grouped subarrays, quantized performance metrics, and sequential optimization. These approaches yield a significant reduction in the number of possible subarray configurations, which enables us to devise a neural network with a classification model to accurately perform antenna selection.

Submitted 14 July, 2023; originally announced July 2023.

Comments: 14 pages, 10 figures, submitted to IEEE

arXiv:2305.19295 [pdf, other]

Low Precision Quantization-aware Training in Spiking Neural Networks with Differentiable Quantization Function

Authors: Ayan Shymyrbay, Mohammed E. Fouda, Ahmed Eltawil

Abstract: Deep neural networks have been proven to be highly effective tools in various domains, yet their computational and memory costs restrict them from being widely deployed on portable devices. The recent rapid increase of edge computing devices has led to an active search for techniques to address the above-mentioned limitations of machine learning frameworks. The quantization of artificial neural networks (ANNs), which converts the full-precision synaptic weights into low-bit versions, emerged as one of the solutions. At the same time, spiking neural networks (SNNs) have become an attractive alternative to conventional ANNs due to their temporal information processing capability, energy efficiency, and high biological plausibility. Despite being driven by the same motivation, the simultaneous utilization of both concepts has yet to be thoroughly studied. Therefore, this work aims to bridge the gap between recent progress in quantized neural networks and SNNs. It presents an extensive study on the performance of the quantization function, represented as a linear combination of sigmoid functions, exploited in low-bit weight quantization in SNNs. The presented quantization function demonstrates the state-of-the-art performance on four popular benchmarks, CIFAR10-DVS, DVS128 Gesture, N-Caltech101, and N-MNIST, for binary networks (64.05%, 95.45%, 68.71%, and 99.43% respectively) with small accuracy drops and up to 31× memory savings, which outperforms existing methods.

Submitted 30 May, 2023; originally announced May 2023.

Comments: 8 pages, 5 Figures, accepted at IJCNN'23

arXiv:2303.12328 [pdf, other]

Spatial Path Index Modulation in mmWave/THz-Band Integrated Sensing and Communications

Authors: Ahmet M. Elbir, Kumar Vijay Mishra, Asmaa Abdallah, Abdulkadir Celik, Ahmed M. Eltawil

Abstract: As the demand for wireless connectivity continues to soar, the fifth generation and beyond wireless networks are exploring new ways to efficiently utilize the wireless spectrum and reduce hardware costs. One such approach is the integration of sensing and communications (ISAC) paradigms to jointly access the spectrum. Recent ISAC studies have focused on upper millimeter-wave and low terahertz bands to exploit ultrawide bandwidths. At these frequencies, hybrid beamformers that employ fewer radio-frequency chains are employed to offset expensive hardware but at the cost of lower multiplexing gains. Wideband hybrid beamforming also suffers from the beam-split effect arising from the subcarrier-independent (SI) analog beamformers. To overcome these limitations, this paper introduces a spatial path index modulation (SPIM) ISAC architecture, which transmits additional information bits via modulating the spatial paths between the base station and communications users. We design the SPIM-ISAC beamformers by first estimating both radar and communications parameters by developing beam-split-aware algorithms. Then, we propose to employ a family of hybrid beamforming techniques such as hybrid, SI, and subcarrier-dependent analog-only, and beam-split-aware beamformers. Numerical experiments demonstrate that the proposed SPIM-ISAC approach exhibits significantly improved spectral efficiency performance in the presence of beam-split than that of even fully digital non-SPIM beamformers.

Submitted 22 March, 2023; originally announced March 2023.

Comments: 30 pages, submitted to the IEEE journals

arXiv:2302.03376 [pdf, other]

System-Level Metrics for Non-Terrestrial Networks Under Stochastic Geometry Framework

Authors: Qi Huang, Baha Eddine Youcef Belmekki, Ahmed M. Eltawil, Mohamed-Slim Alouini

Abstract: Non-terrestrial networks (NTNs) are considered one of the key enablers in sixth-generation (6G) wireless networks; and with their rapid growth, system-level metrics analysis adds crucial understanding into NTN system performance. Applying stochastic geometry (SG) as a system-level analysis tool in the context of NTN offers novel insights into the network tradeoffs. In this paper, we study and highlight NTN common system-level metrics from three perspectives: NTN platform types, typical communication issues, and application scenarios. In addition to summarizing existing research, we study the best-suited SG models for different platforms and system-level metrics which have not been well studied in the literature. In addition, we showcase NTN-dominated prospective application scenarios. Finally, we carry out a performance analysis of system-level metrics for these applications based on SG models.

Submitted 10 February, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

Comments: 7 pages

arXiv:2301.03973 [pdf, ps, other]

Performance of RIS-empowered NOMA-based D2D Communication under Nakagami-m Fading

Authors: Mohd Hamza Naim Shaikh, Sultangali Arzykulov, Abdulkadir Celik, Ahmed M. Eltawil, G. Nauryzbayev

Abstract: Reconfigurable intelligent surfaces (RISs) have sparked a renewed interest in the research community envisioning future wireless communication networks. In this study, we analyzed the performance of RIS-enabled non-orthogonal multiple access (NOMA) based device-to-device (D2D) wireless communication system, where the RIS is partitioned to serve a pair of D2D users. Specifically, closed-form expressions are derived for the upper and lower limits of spectral efficiency (SE) and energy efficiency (EE). In addition, the performance of the proposed NOMA-based system is also compared with its orthogonal counterpart. Extensive simulation is done to corroborate the analytical findings. The results demonstrate that RIS highly enhances the performance of a NOMA-based D2D network.

Submitted 10 January, 2023; originally announced January 2023.

Comments: Accepted for Publication in the Proceedings of IEEE VTC-Fall 2022, 5 Pages, 4 Figures

arXiv:2212.13707 [pdf, other]

Thermal Heating in ReRAM Crossbar Arrays: Challenges and Solutions

Authors: Kamilya Smagulova, Mohammed E. Fouda, Ahmed Eltawil

Abstract: The higher speed, scalability and parallelism offered by ReRAM crossbar arrays foster development of ReRAM-based next generation AI accelerators. At the same time, sensitivity of ReRAM to temperature variations decreases the R_on/R_off ratio and negatively affects the achieved accuracy and reliability of the hardware. Various works on temperature-aware optimization and remapping in ReRAM crossbar arrays reported up to 58% improvement in accuracy and 2.39× ReRAM lifetime enhancement. This paper classifies the challenges caused by thermal heat, starting from constraints in ReRAM cells' dimensions and characteristics to their placement in the architecture. In addition, it reviews available solutions designed to mitigate the impact of these challenges, including emerging temperature-resilient DNN training methods. Our work also provides a summary of the techniques and their advantages and limitations.

Submitted 31 January, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

Comments: 18 pages

arXiv:2211.04540 [pdf, other]

Millimeter-Wave Radar Beamforming with Spatial Path Index Modulation Communications

Authors: Ahmet M. Elbir, Kumar Vijay Mishra, Abdulkadir Çelik, Ahmed M. Eltawil

Abstract: To efficiently utilize the wireless spectrum and save hardware costs, the fifth generation and beyond (B5G) wireless networks envisage integrated sensing and communications (ISAC) paradigms to jointly access the spectrum. In B5G systems, the expensive hardware is usually avoided by employing hybrid beamformers that employ fewer radio-frequency chains but at the cost of the multiplexing gain. Recently, it has been proposed to overcome this shortcoming of millimeter wave (mmWave) hybrid beamformers through spatial path index modulation (SPIM), which modulates the spatial paths between the base station and users and improves spectral efficiency. In this paper, we propose an SPIM-ISAC approach for hybrid beamforming to simultaneously generate beams toward both radar targets and communications users. We introduce a low complexity approach for the design of hybrid beamformers, which include radar-only and communications-only beamformers. Numerical experiments demonstrate that our SPIM-ISAC approach exhibits a significant performance improvement over the conventional mmWave-ISAC design in terms of spectral efficiency and the generated beampattern.

Submitted 22 January, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: Accepted paper in 2023 IEEE Radar Conference

arXiv:2208.03582 [pdf, other]

Reconfigurable Intelligent Surface Enabled Over-the-Air Uplink Non-orthogonal Multiple Access

Authors: Emre Arslan, Fatih Kilinc, Sultangali Arzykulov, Ali Tugberk Dogukan, Abdulkadir Celik, Ertugrul Basar, Ahmad M. Eltawil

Abstract: Innovative reconfigurable intelligent surface (RIS) technologies are rising and recognized as promising candidates to enhance 6G and beyond wireless communication systems. RISs acquire the ability to manipulate electromagnetic signals, thus, offering a degree of control over the wireless channel and the potential for many more benefits. Furthermore, active RIS designs have recently been introduced to combat the critical double fading problem and other impairments passive RIS designs may possess. In this paper, the potential and flexibility of active RIS technology are exploited for uplink systems to achieve virtual non-orthogonal multiple access (NOMA) through power disparity over-the-air rather than controlling transmit powers at the user side. Specifically, users with identical transmit power, path loss, and distance can communicate with a base station sharing time and frequency resources in a NOMA fashion with the aid of the proposed hybrid RIS system. Here, the RIS is partitioned into active and passive parts and the distinctive partitions serve different users aligning their phases accordingly while introducing a power difference to the users' signals to enable NOMA. First, the end-to-end system model is presented considering two users. Furthermore, outage probability calculations and theoretical error probability analysis are discussed and reinforced with computer simulation results.

Submitted 6 August, 2022; originally announced August 2022.

arXiv:2207.11531 [pdf, other]

RIS-Assisted Grant-Free NOMA

Authors: Recep Akif Tasci, Fatih Kilinc, Abdulkadir Celik, Asmaa Abdallah, Ahmed M. Eltawil, Ertugrul Basar

Abstract: This paper introduces a reconfigurable intelligent surface (RIS)-assisted grant-free non-orthogonal multiple-access (GF-NOMA) scheme. To ensure the power reception disparity required by the power domain NOMA (PD-NOMA), we propose a joint user clustering and RIS assignment/alignment approach that maximizes the network sum rate by judiciously pairing user equipments (UEs) with distinct channel gains, assigning RISs to proper clusters, and aligning RIS phase shifts to the cluster members yielding the highest cluster sum rate. Once UEs are acknowledged with the cluster index, they are allowed to access their resource blocks (RBs) at any time requiring neither further grant acquisitions from the base station (BS) nor power control as all UEs are requested to transmit at the same power. In this way, the proposed approach performs an implicit over-the-air power control with minimal control signaling between BS and UEs, which has been shown to deliver up to 20% higher network sum rate than benchmark GF-NOMA and optimal grant-based PD-NOMA schemes depending on the network parameters. The given numerical results also investigate the impact of UE density, RIS deployment, and RIS hardware specifications on the overall performance of the proposed RIS-aided GF-NOMA scheme.

Submitted 15 June, 2023; v1 submitted 23 July, 2022; originally announced July 2022.

arXiv:2205.15505 [pdf, other]

DNA Pattern Matching Acceleration with Analog Resistive CAM

Authors: Jinane Bazzi, Jana Sweidan, Mohammed E. Fouda, Rouwaida Kanj, Ahmed M. Eltawil

Abstract: DNA pattern matching is essential for many widely used bioinformatics applications. Disease diagnosis is one of these applications, since analyzing changes in DNA sequences can increase our understanding of possible genetic diseases. The remarkable growth in the size of DNA datasets has resulted in challenges in discovering DNA patterns efficiently in terms of run time and power consumption. In this paper, we propose an efficient hardware and software codesign that determines the chance of the occurrence of repeat-expansion diseases using DNA pattern matching. The proposed design parallelizes the DNA pattern matching task using associative memory realized with analog content-addressable memory and implements an algorithm that returns the maximum number of consecutive occurrences of a specific pattern within a DNA sequence. We fully implement all the required hardware circuits with PTM 45-nm technology, and we evaluate the proposed architecture on a practical human DNA dataset. The results show that our design is energy-efficient and significantly accelerates the DNA pattern matching task compared to previous approaches described in the literature.

Submitted 30 May, 2022; originally announced May 2022.

arXiv:2205.07141 [pdf, other]

BackLink: Supervised Local Training with Backward Links

Authors: Wenzhe Guo, Mohammed E Fouda, Ahmed M. Eltawil, Khaled N. Salama

Abstract: Empowered by the backpropagation (BP) algorithm, deep neural networks have dominated the race in solving various cognitive tasks. The restricted training pattern in the standard BP requires end-to-end error propagation, causing large memory cost and prohibiting model parallelization. Existing local training methods aim to resolve the training obstacle by completely cutting off the backward path between modules and isolating their gradients to reduce memory cost and accelerate the training process. These methods prevent errors from flowing between modules and hence information exchange, resulting in inferior performance. This work proposes a novel local training algorithm, BackLink, which introduces inter-module backward dependency and allows errors to flow between modules. The algorithm facilitates information to flow backward along with the network. To preserve the computational advantage of local training, BackLink restricts the error propagation length within the module. Extensive experiments performed in various deep convolutional neural networks demonstrate that our method consistently improves the classification performance of local training algorithms over other methods. For example, in ResNet32 with 16 local modules, our method surpasses the conventional greedy local training method by 4.00% and a recent work by 1.83% in accuracy on CIFAR10, respectively. Analysis of computational costs reveals that small overheads are incurred in GPU memory costs and runtime on multiple GPUs. Our method can lead to up to a 79% reduction in memory cost and 52% in simulation runtime in ResNet110 compared to the standard BP. Therefore, our method could create new opportunities for improving training algorithms towards better efficiency and biological plausibility.

Submitted 14 May, 2022; originally announced May 2022.

arXiv:2203.02500 [pdf, other]

Efficient Analog CAM Design

Authors: Jinane Bazzi, Jana Sweidan, Mohammed E. Fouda, Rouwaida Kanj, Ahmed M. Eltawil

Abstract: Content Addressable Memories (CAMs) are considered a key enabler for in-memory computing (IMC). IMC shows an order-of-magnitude improvement in energy efficiency and throughput compared to traditional computing techniques. Recently, analog CAMs (aCAMs) were proposed as a means to improve storage density and energy efficiency. In this work, we propose two new aCAM cells to improve data encoding and robustness as compared to existing aCAM cells. We propose a methodology to choose the margin and interval width for data encoding. In addition, we perform a comprehensive comparison against prior work in terms of the number of intervals, noise sensitivity, dynamic range, energy, latency, area, and probability of failure.

Submitted 4 March, 2022; originally announced March 2022.

Comments: This is a revised manuscript that is under consideration for publication at IEEE TCAS-I

arXiv:2203.00662 [pdf, other]

In-memory Associative Processors: Tutorial, Potential, and Challenges

Authors: Mohammed E. Fouda, Hasan Erdem Yantir, Ahmed M. Eltawil, Fadi Kurdahi

Abstract: In-memory computing is an emerging computing paradigm that overcomes the limitations of existing von Neumann computing architectures, such as the memory-wall bottleneck. In such a paradigm, the computations are performed directly on the data stored in the memory, which greatly reduces the memory-processor communications during computation. Hence, significant speedup and energy savings could be achieved, especially with data-intensive applications. Associative processors (APs) were proposed in the seventies and were recently revived thanks to high-density memories. In this tutorial brief, we overview the functionalities and recent trends of APs in addition to the implementation of each content-addressable memory with different technologies. The AP operations and runtime complexity are also summarized. We also explain and explore the possible applications that can benefit from APs. Finally, the AP limitations, challenges, and future directions are discussed.

Submitted 12 April, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

Comments: 7 pages

arXiv:2201.07210 [pdf]

Efficient Training of Spiking Neural Networks with Temporally-Truncated Local Backpropagation through Time

Authors: Wenzhe Guo, Mohammed E. Fouda, Ahmed M. Eltawil, Khaled Nabil Salama

Abstract: Directly training spiking neural networks (SNNs) has remained challenging due to complex neural dynamics and intrinsic non-differentiability in firing functions. The well-known backpropagation through time (BPTT) algorithm proposed to train SNNs suffers from a large memory footprint and prohibits backward and update unlocking, making it impossible to exploit the potential of locally-supervised training methods. This work proposes an efficient and direct training algorithm for SNNs that integrates a locally-supervised training method with a temporally-truncated BPTT algorithm. The proposed algorithm explores both temporal and spatial locality in BPTT and contributes to a significant reduction in computational cost, including GPU memory utilization, main memory access, and arithmetic operations. We thoroughly explore the design space concerning temporal truncation length and local training block size and benchmark their impact on the classification accuracy of different networks running different types of tasks. The results reveal that temporal truncation has a negative effect on the accuracy of classifying frame-based datasets, but leads to improvement in accuracy on dynamic-vision-sensor (DVS) recorded datasets. In spite of the resulting information loss, local training is capable of alleviating overfitting. The combined effect of temporal truncation and local training can slow the accuracy drop and even improve accuracy. In addition, training deep SNN models such as AlexNet classifying the CIFAR10-DVS dataset leads to a 7.26% increase in accuracy, 89.94% reduction in GPU memory, 10.79% reduction in memory access, and 99.64% reduction in MAC operations compared to the standard end-to-end BPTT.

Submitted 13 December, 2021; originally announced January 2022.

Comments: 16

arXiv:2201.03206 [pdf, other]

Configurable Independent Component Analysis Preprocessing Accelerator

Authors: Hsi-Hung Lu, Chung-An Shen, Mohammed E. Fouda, Ahmed M. Eltawil

Abstract: Independent component analysis (ICA) has been used in many applications, including self-interference cancellation for in-band full-duplex wireless systems and anomaly detection in the industrial internet of things. This paper presents a high-throughput and highly efficient configurable preprocessing accelerator for the ICA algorithm. The proposed ICA accelerator has three major blocks that perform data centering, covariance matrix computation, and eigenvalue decomposition (EVD). Specifically, the proposed accelerator is based on a high-performance matrix multiplication array (MMA). The proposed MMA architecture uses time-multiplexed processing so that the efficiency of hardware utilization is greatly enhanced. Furthermore, the processing flow utilizes parallel processing such that the centering, the calculation of the covariance matrix, and EVD are conducted simultaneously and are individually pipelined to maximize throughput. This paper presents the architecture, circuit design, and performance estimates based on post-layout extraction of the proposed preprocessing ICA accelerator. The proposed design achieves a throughput of 40.7 kMatrices per second at a complexity of 73.3 kGE.

Submitted 30 April, 2022; v1 submitted 10 January, 2022; originally announced January 2022.

arXiv:2110.09643 [pdf, other]

In-memory Multi-valued Associative Processor

Authors: Mira Hout, Mohammed E. Fouda, Rouwaida Kanj, Ahmed M. Eltawil

Abstract: In-memory associative processor architectures are offered as a great candidate to overcome the memory-wall bottleneck and to enable vector/parallel arithmetic operations. In this paper, we extend the functionality of the associative processor to multi-valued arithmetic. To allow for in-memory compute implementation of arithmetic or logic functions, we propose a structured methodology enabling the automatic generation of the corresponding look-up tables (LUTs). We propose two approaches to build the LUTs: a first approach that formalizes the intuition behind LUT pass ordering and a more optimized approach that reduces the number of required write cycles. To demonstrate these methodologies, we present a novel ternary associative processor (TAP) architecture that is employed to implement efficient ternary vector in-place addition. A SPICE-MATLAB co-simulator is implemented to test the functionality of the TAP and to evaluate the performance of the proposed AP ternary in-place adder implementations in terms of energy, delay, and area. Results show that compared to the binary AP adder, the ternary AP adder results in a 12.25% and 6.2% reduction in energy and area, respectively. The ternary AP also demonstrates a 52.64% reduction in energy and a delay that is up to 9.5x smaller when compared to a state-of-the-art ternary carry-lookahead adder.

Submitted 18 October, 2021; originally announced October 2021.

arXiv:2109.03934 [pdf, other]

Resistive Neural Hardware Accelerators

Authors: Kamilya Smagulova, Mohammed E. Fouda, Fadi Kurdahi, Khaled Salama, Ahmed Eltawil

Abstract: Deep Neural Networks (DNNs), as a subset of Machine Learning (ML) techniques, entail that real-world data can be learned and that decisions can be made in real-time. However, their wide adoption is hindered by a number of software and hardware limitations. The existing general-purpose hardware platforms used to accelerate DNNs are facing new challenges associated with the growing amount of data and the exponentially increasing complexity of computations. Emerging non-volatile memory (NVM) devices and the processing-in-memory (PIM) paradigm are creating a new hardware architecture generation with increased computing and storage capabilities. In particular, the shift towards ReRAM-based in-memory computing has great potential in the implementation of area- and power-efficient inference and in training large-scale neural network architectures. These can accelerate the process of IoT-enabled AI technologies entering our daily life. In this survey, we review the state-of-the-art ReRAM-based DNN many-core accelerators and show their superiority compared to CMOS counterparts. The review covers different aspects of hardware and software realization of DNN accelerators, their present limitations, and future prospects. In particular, comparison of the accelerators shows the need for the introduction of new performance metrics and benchmarking standards. In addition, the major concerns regarding the efficient design of accelerators include a lack of accuracy in simulation tools for software and hardware co-design.

Submitted 8 September, 2021; originally announced September 2021.

arXiv:2102.10847 [pdf, other]

Deep Learning Based Frequency-Selective Channel Estimation for Hybrid mmWave MIMO Systems

Authors: Asmaa Abdallah, Abdulkadir Celik, Mohammad M. Mansour, Ahmed M. Eltawil

Abstract: Millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems typically employ hybrid mixed signal processing to avoid expensive hardware and high training overheads. However, the lack of fully digital beamforming at mmWave bands imposes additional challenges in channel estimation. Prior art on hybrid architectures has mainly focused on greedy optimization algorithms to estimate frequency-flat narrowband mmWave channels, despite the fact that in practice, the large bandwidth associated with mmWave channels results in frequency-selective channels. In this paper, we consider a frequency-selective wideband mmWave system and propose two deep learning (DL) compressive sensing (CS) based algorithms for channel estimation. The proposed algorithms learn critical a priori information from training data to provide highly accurate channel estimates with low training overhead. In the first approach, a DL-CS based algorithm simultaneously estimates the channel supports in the frequency domain, which are then used for channel reconstruction. The second approach exploits the estimated supports to apply a low-complexity multi-resolution fine-tuning method to further enhance the estimation performance. Simulation results demonstrate that the proposed DL-based schemes significantly outperform conventional orthogonal matching pursuit (OMP) techniques in terms of the normalized mean-squared error (NMSE), computational complexity, and spectral efficiency, particularly in the low signal-to-noise ratio regime. When compared to OMP approaches that achieve an NMSE gap of 4-10 dB with respect to the Cramér-Rao Lower Bound (CRLB), the proposed algorithms reduce the CRLB gap to only 1-1.5 dB, while significantly reducing complexity by two orders of magnitude.

Submitted 22 February, 2021; originally announced February 2021.

Comments: 16 pages, 8 figures, submitted to IEEE Transactions on Wireless Communications. arXiv admin note: text overlap with arXiv:1704.08572 by other authors

arXiv:2011.10852 [pdf, other]

On-Chip Error-triggered Learning of Multi-layer Memristive Spiking Neural Networks

Authors: Melika Payvand, Mohammed E. Fouda, Fadi Kurdahi, Ahmed M. Eltawil, Emre O. Neftci

Abstract: Recent breakthroughs in neuromorphic computing show that local forms of gradient descent learning are compatible with Spiking Neural Networks (SNNs) and synaptic plasticity. Although SNNs can be scalably implemented using neuromorphic VLSI, an architecture that can learn using gradient descent in situ is still missing. In this paper, we propose a local, gradient-based, error-triggered learning algorithm with online ternary weight updates. The proposed algorithm enables online training of multi-layer SNNs with memristive neuromorphic hardware, showing a small loss in performance compared with the state of the art. We also propose a hardware architecture based on memristive crossbar arrays to perform the required vector-matrix multiplications. The necessary peripheral circuitry, including pre-synaptic, post-synaptic, and write circuits required for online training, has been designed in the sub-threshold regime for power saving with a standard 180 nm CMOS process.

Submitted 21 November, 2020; originally announced November 2020.

Comments: 15 pages, 11 figures, Journal of Emerging Technology in Circuits and Systems (JETCAS)

arXiv:2008.11356 [pdf, other]

doi 10.1109/TCCN.2021.3105133

UAV-Assisted Cooperative & Cognitive NOMA: Deployment, Clustering, and Resource Allocation

Authors: Sultangali Arzykulov, Abdulkadir Celik, Galymzhan Nauryzbayev, Ahmed M. Eltawil

Abstract: Cooperative and cognitive non-orthogonal multiple access (CCR-NOMA) has been recognized as a promising technique to overcome issues of spectrum scarcity and support massive connectivity envisioned in next-generation wireless networks. In this paper, we investigate the deployment of an unmanned aerial vehicle (UAV) as a relay that fairly serves a large number of secondary users in a hot-spot region. The UAV deployment algorithm must jointly account for user clustering, channel assignment, and resource allocation sub-problems. We propose a solution methodology that obtains user clustering and channel assignment based on the optimal resource allocations for a given UAV location. To this end, we derive closed-form optimal power and time allocations and show that they deliver optimal max-min fair throughput while consuming less energy and time than geometric programming. Based on optimal resource allocation, the optimal coverage probability is also provided in closed form, which takes channel estimation errors, hardware impairments, and primary network interference into account. The optimal coverage probabilities are used by the proposed max-min fair user clustering and channel assignment approaches. The results show that the proposed method achieves 100% accuracy in more than five orders of magnitude less time than the optimal benchmark.

Submitted 25 August, 2020; originally announced August 2020.

arXiv:2004.10506 [pdf, ps, other]

doi 10.1109/ICTC52510.2021.9621035

A Non-Ideal NOMA-based mmWave D2D Networks with Hardware and CSI Imperfections

Authors: Leila Tlebaldiyeva, Galymzhan Nauryzbayev, Sultangali Arzykulov, Yerassyl Akhmetkaziyev, Mohammad S. Hashmi, Ahmed M. Eltawil

Abstract: This letter investigates a non-orthogonal multiple access (NOMA) assisted millimeter-wave device-to-device (D2D) network practically limited by multiple interference noises, transceiver hardware impairments, imperfect successive interference cancellation, and channel state information mismatch. Generalized outage probability expressions for NOMA-D2D users are derived, and the results, validated by Monte Carlo simulations, are compared with orthogonal multiple access to show the superior performance of the proposed network model.

Submitted 22 April, 2020; originally announced April 2020.

Comments: 4 pages, 3 figures

arXiv:2004.10499 [pdf, ps, other]

doi 10.1109/OJCOMS.2021.3091606

Hardware and Interference Limited Cooperative CR-NOMA Networks under Imperfect SIC and CSI

Authors: Sultangali Arzykulov, Galymzhan Nauryzbayev, Abdulkadir Celik, Ahmed M. Eltawil

Abstract: The conflation of cognitive radio (CR) and nonorthogonal multiple access (NOMA) concepts is a promising approach to fulfil the massive connectivity goals of future networks given the spectrum scarcity. Accordingly, this letter investigates the outage performance of imperfect cooperative CR-NOMA networks under hardware impairments and interference. Our analysis involves the derivation of the end-to-end outage probability (OP) for secondary NOMA users by accounting for imperfect channel state information (CSI), as well as the residual interference caused by successive interference cancellation (SIC) errors and coexisting primary/secondary users. The numerical results, validated by Monte Carlo simulations, show that the CR-NOMA network provides superior outage performance over orthogonal multiple access. As imperfections become more significant, CR-NOMA is observed to deliver relatively poor outage performance.

Submitted 22 April, 2020; originally announced April 2020.

Comments: 5 pages, 4 figures

Journal ref: IEEE Open Journal of the Communications Society, vol. 2, pp. 1473-1485, 2021

arXiv:2001.00962 [pdf, other]

Application of ICA on Self-Interference Cancellation of In-band Full Duplex Systems

Authors: Mohammed E. Fouda, Sergey Shaboyan, Ayman Elezabi, Ahmed Eltawil

Abstract: In this letter, we propose a modified version of the Fast Independent Component Analysis (FICA) algorithm to solve the self-interference cancellation (SIC) problem in In-band Full Duplex (IBFD) communication systems. The complex mixing problem is mathematically formulated to suit the real-valued blind source separation (BSS) algorithms. In addition, we propose a method to estimate the ambiguity factors associated with ICA lumped together with the channels and residual separation error. Experiments were performed on an FD platform where FICA-based BSS was applied for SIC in the frequency domain. Experimental results show superior performance compared to least squares SIC by up to 6 dB gain in the SNR.

Submitted 3 January, 2020; originally announced January 2020.

arXiv:1910.06152 [pdf, other]

Error-triggered Three-Factor Learning Dynamics for Crossbar Arrays

Authors: Melika Payvand, Mohammed Fouda, Fadi Kurdahi, Ahmed Eltawil, Emre O. Neftci

Abstract: Recent breakthroughs suggest that local, approximate gradient descent learning is compatible with Spiking Neural Networks (SNNs). Although SNNs can be scalably implemented using neuromorphic VLSI, an architecture that can learn in-situ as accurately as conventional processors is still missing. Here, we propose a subthreshold circuit architecture designed through insights obtained from machine learning and computational neuroscience that could achieve such accuracy. Using a surrogate gradient learning framework, we derive local, error-triggered learning dynamics compatible with crossbar arrays and the temporal dynamics of SNNs. The derivation reveals that circuits used for inference and training dynamics can be shared, which simplifies the circuit and suppresses the effects of fabrication mismatch. We present SPICE simulations on the XFAB 180 nm process, as well as large-scale simulations of the spiking neural networks on event-based benchmarks, including a gesture recognition task. Our results show that the number of updates can be reduced a hundred-fold compared to the standard rule while achieving performance on par with the state of the art.

Submitted 14 October, 2019; originally announced October 2019.

arXiv:1909.01771 [pdf, other]

Spiking Neural Networks for Inference and Learning: A Memristor-based Design Perspective

Authors: M. E. Fouda, F. Kurdahi, A. Eltawil, E. Neftci

Abstract: On metrics of density and power efficiency, neuromorphic technologies have the potential to surpass mainstream computing technologies in tasks where real-time functionality, adaptability, and autonomy are essential. While algorithmic advances in neuromorphic computing are proceeding successfully, the potential of memristors to improve neuromorphic computing has not yet borne fruit, primarily because they are often used as a drop-in replacement for conventional memory. However, interdisciplinary approaches anchored in machine learning theory suggest that multifactor plasticity rules matching neural and synaptic dynamics to the device capabilities can take better advantage of memristor dynamics and its stochasticity. Furthermore, such plasticity rules generally show much higher performance than that of classical Spike Time Dependent Plasticity (STDP) rules. This chapter reviews the recent development in learning with spiking neural network models and their possible implementation with memristor-based hardware.

Submitted 8 October, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

arXiv:1904.10426 [pdf, other]

doi 10.1109/ACCESS.2020.3001876

Power Consumption and Energy-Efficiency for In-Band Full-Duplex Wireless Systems

Authors: Murad Murad, Ahmed M. Eltawil

Abstract: This paper presents an analytical model of power consumption for In-Band Full-Duplex (IBFD) Wireless Local-Area Networks (WLANs). Energy efficiency is compared for both Half-Duplex (HD) and IBFD networks. The presented analytical model closely matches the results generated by simulation. For a given traffic scenario, IBFD systems exhibit higher power consumption, but at improved energy efficiency when compared to equivalent HD WLANs.

Submitted 23 April, 2019; originally announced April 2019.

arXiv:1904.08966 [pdf, other]

Non-Stationary Polar Codes for Resistive Memories

Authors: Marwen Zorgui, Mohammed E. Fouda, Zhiying Wang, Ahmed M. Eltawil, Fadi Kurdahi

Abstract: Resistive memories are considered a promising memory technology enabling high storage densities with in-memory computing capabilities. However, the readout reliability of resistive memories is impaired due to the inevitable existence of wire resistance, resulting in the sneak path problem. Motivated by this problem, we study polar coding over channels with different reliability levels, termed non-stationary polar codes, and we propose a technique improving its bit error rate (BER) performance. We then apply the framework of non-stationary polar codes to the crossbar array and evaluate its BER performance under two modeling approaches, namely binary symmetric channels (BSCs) and binary asymmetric channels (BACs). Finally, we propose a technique for biasing the proportion of high-resistance states in the crossbar array and show its advantage in further reducing the BER. Several simulations are carried out using a SPICE-like simulator, exhibiting significant reduction in BER.

Submitted 18 April, 2019; originally announced April 2019.

arXiv:1903.11720 [pdf, other]

doi 10.1109/ACCESS.2020.3001876

Performance Analysis and Enhancements for In-Band Full-Duplex Wireless Local Area Networks

Authors: Murad Murad, Ahmed M. Eltawil

Abstract: In-Band Full-Duplex (IBFD) is a technique that enables a wireless node to simultaneously transmit a signal and receive another on the same assigned frequency. Thus, IBFD wireless systems can provide up to twice the channel capacity compared to conventional Half-Duplex (HD) systems. In order to study the feasibility of IBFD networks, reliable models are needed to capture anticipated benefits of IBFD above the physical layer (PHY). In this paper, an accurate analytical model based on Discrete-Time Markov Chain (DTMC) analysis for IEEE 802.11 Distributed Coordination Function (DCF) with IBFD capabilities is proposed. The model captures all parameters necessary to calculate important performance metrics which quantify enhancements introduced as a result of IBFD solutions. Additionally, two frame aggregation schemes for Wireless Local Area Networks (WLANs) with IBFD features are proposed to increase the efficiency of data transmission. Matching analytical and simulation results with less than 1% average errors confirm that the proposed frame aggregation schemes further improve the overall throughput by up to 24% and reduce latency by up to 47% in practical IBFD-WLANs. More importantly, the results assert that IBFD transmission can only reduce latency to a suboptimal point in WLANs, but frame aggregation is necessary to minimize it.

Submitted 27 March, 2019; originally announced March 2019.

arXiv:1903.01512 [pdf, other]

On Resistive Memories: One Step Row Readout Technique and Sensing Circuitry

Authors: Mohammed E Fouda, Ahmed M. Eltawil, Fadi Kurdahi

Abstract: Transistor-based memories are rapidly approaching their maximum density per unit area. Resistive crossbar arrays enable denser memory due to the small size of switching devices. However, due to the resistive nature of these memories, they suffer from current sneak paths complicating the readout procedure. In this paper, we propose a row readout technique with circuitry that can be used to read selector-less resistive crossbar based memories. High throughput reading and writing techniques are needed to overcome the memory-wall bottleneck problem and to enable the near-memory computing paradigm. The proposed technique can read the entire row of dense crossbar arrays in one cycle, unlike previously published techniques. The requirements for the readout circuitry are discussed and satisfied in the proposed circuit. Additionally, an approximated expression for the power consumed while reading the array is derived. A figure of merit is defined and used to compare the proposed approach with existing reading techniques. Finally, a quantitative analysis of the effect of biasing mismatch on the array size is discussed.

Submitted 4 March, 2019; originally announced March 2019.

arXiv:1808.01875 [pdf]

Energy and exergy analysis of solar stills with micro/nano particles: A comprehensive study

Authors: S. W. Sharshir, Guilong Peng, A. H. Elsheikh, Talaat A. Talaat, Mohamed A. Eltawil, A. E. Kabeel, Nuo Yang

Abstract: In this paper, a comparative study between modified solar stills (with graphite or copper oxide micro/nano particles) and a classical solar still is carried out, based on the productivity and the thermal performance. Exergy destructions in various components of the solar stills have been calculated, analyzed and discussed. Evaporation is faster and the exergy of evaporation is higher in the modified solar stills than in the classical one. Furthermore, the energy and exergy efficiencies of the modified stills are enhanced compared with the classical one. A brief discussion regarding the effect of different parameters on solar still efficiency is also presented. The daytime energy efficiencies of the graphite/water and copper oxide/water mixtures are 41.18% and 38.61%, respectively, whereas that of the classical still is only 29.17%. Moreover, the daytime exergy efficiencies of the graphite and copper oxide nanofluid based stills and the classical still are 4.32%, 3.78% and 2.63%, respectively.

Submitted 17 June, 2018; originally announced August 2018.

arXiv:1807.06785 [pdf, other]

Effect of Sensor Error on the Assessment of Seismic Building Damage

Authors: Ahmed Ibrahim, Ahmed Eltawil, Yunsu Na, Sherif El-Tawil

Abstract: Natural disasters affect structural health of buildings, thus directly impacting public safety. Continuous structural monitoring can be achieved by deploying an internet of things (IoT) network of distributed sensors in buildings to capture floor movement. These sensors can be used to compute the displacements of each floor, which can then be employed to assess building damage after a seismic event. The peak relative floor displacement is computed, which is directly related to damage level according to government standards. With this information, the building inventory can be classified into immediate occupancy (IO), life safety (LS) or collapse prevention (CP) categories. In this work, we propose a zero velocity update (ZUPT) technique to minimize displacement estimation error. Theoretical derivation and experimental validation are presented. In addition, we investigate modeling sensor error and interstory drift ratio (IDR) distribution. Moreover, we discuss the impact of sensor error on the achieved building classification accuracy.

Submitted 18 July, 2018; originally announced July 2018.

arXiv:1806.04219 [pdf, other]

Physical Multi-Layer Phantoms for Intra-Body Communications

Authors: Ahmed E. Khorshid, Ibrahim N. Alquaydheb, Ahmed M. Eltawil, Fadi J. Kurdahi

Abstract: This paper presents approaches to creating tissue mimicking materials that can be used as phantoms for evaluating the performance of Body Area Networks (BAN). The main goal of the paper is to describe a methodology to create a repeatable experimental BAN platform that can be customized depending on the BAN scenario under test. Comparisons between different material compositions and percentages are shown, along with the resulting electrical properties of each mixture over the frequency range of interest for intra-body communications, 100 kHz to 100 MHz. Test results on a composite multi-layer sample are presented confirming the efficacy of the proposed methodology. To date, this is the first paper that provides guidance on how to decide on concentration levels of ingredients, depending on the exact frequency range of operation, and the desired matched electrical characteristics (conductivity vs. permittivity), to create multi-layer phantoms for intra-body communication applications.

Submitted 11 June, 2018; originally announced June 2018.

MSC Class: 94A40; 94C99; 92C42

arXiv:1406.5555 [pdf, other]

doi 10.1109/TWC.2015.2407876

All-Digital Self-interference Cancellation Technique for Full-duplex Systems

Authors: Elsayed Ahmed, Ahmed M. Eltawil

Abstract: Full-duplex systems are expected to double the spectral efficiency compared to conventional half-duplex systems if the self-interference signal can be significantly mitigated. Digital cancellation is one of the lowest complexity self-interference cancellation techniques in full-duplex systems. However, its mitigation capability is very limited, mainly due to transmitter and receiver circuit impairments. In this paper, we propose a novel digital self-interference cancellation technique for full-duplex systems. The proposed technique is shown to significantly mitigate the self-interference signal as well as the associated transmitter and receiver impairments. In the proposed technique, an auxiliary receiver chain is used to obtain a digital-domain copy of the transmitted Radio Frequency (RF) self-interference signal. The self-interference copy is then used in the digital-domain to cancel out both the self-interference signal and the associated impairments. Furthermore, to alleviate the receiver phase noise effect, a common oscillator is shared between the auxiliary and ordinary receiver chains. A thorough analytical and numerical analysis for the effect of the transmitter and receiver impairments on the cancellation capability of the proposed technique is presented. Finally, the overall performance is numerically investigated showing that using the proposed technique, the self-interference signal could be mitigated to ~3dB higher than the receiver noise floor, which results in up to 76% rate improvement compared to conventional half-duplex systems at 20dBm transmit power values.

Submitted 20 June, 2014; originally announced June 2014.

Comments: Submitted to IEEE Transactions on Wireless Communications

arXiv:1405.7720 [pdf, other]

Full-Duplex Systems Using Multi-Reconfigurable Antennas

Authors: Elsayed Ahmed, Ahmed M. Eltawil, Zhouyuan Li, Bedri A. Cetiner

Abstract: Full-duplex systems are expected to achieve 100% rate improvement over half-duplex systems if the self-interference signal can be significantly mitigated. In this paper, we propose the first full-duplex system utilizing Multi-Reconfigurable Antenna (MRA) with ~90% rate improvement compared to half-duplex systems. MRA is a dynamically reconfigurable antenna structure that is capable of changing its properties according to certain input configurations. A comprehensive experimental analysis is conducted to characterize the system performance in typical indoor environments. The experiments are performed using a fabricated MRA that has 4096 configurable radiation patterns. The achieved MRA-based passive self-interference suppression is investigated, with detailed analysis for the MRA training overhead. In addition, a heuristic-based approach is proposed to reduce the MRA training overhead. The results show that at 1% training overhead, a total of 95dB self-interference cancellation is achieved in typical indoor environments. The 95dB self-interference cancellation is experimentally shown to be sufficient for 90% full-duplex rate improvement compared to half-duplex systems.

Submitted 29 May, 2014; originally announced May 2014.

Comments: Submitted to IEEE Transactions on Wireless Communications

arXiv:1403.2785 [pdf]

State Dependent Statistical Timing Model for Voltage Scaled Circuits

Authors: Aras Pirbadian, Muhammad S. Khairy, Ahmed M. Eltawil, Fadi J. Kurdahi

Abstract: This paper presents a novel statistical state-dependent timing model for voltage over-scaled (VoS) logic circuits that accurately and rapidly finds the timing distribution of output bits. Using this model, erroneous VoS circuits can be represented as error-free circuits combined with an error-injector. A case study of a two-point DFT unit employing the proposed model is presented and compared to HSPICE circuit simulation. Results show an accurate match, with significant speedup gains.

Submitted 11 March, 2014; originally announced March 2014.

arXiv:1401.6437 [pdf, other]

doi 10.1109/TWC.2014.2365536

On Phase Noise Suppression in Full-Duplex Systems

Authors: Elsayed Ahmed, Ahmed M. Eltawil

Abstract: Oscillator phase noise has been shown to be one of the main performance limiting factors in full-duplex systems. In this paper, we consider the problem of self-interference cancellation with phase noise suppression in full-duplex systems. The feasibility of performing phase noise suppression in full-duplex systems in terms of both complexity and achieved gain is analytically and experimentally investigated. First, the effect of phase noise on full-duplex systems and the possibility of performing phase noise suppression are studied. Two different phase noise suppression techniques with a detailed complexity analysis are then proposed. For each suppression technique, both free-running and phase locked loop based oscillators are considered. Due to the fact that full-duplex system performance highly depends on hardware impairments, experimental analysis is essential for reliable results. In this paper, the performance of the proposed techniques is experimentally investigated in a typical indoor environment. The experimental results are shown to confirm the results obtained from numerical simulations on two different experimental research platforms. At the end, the tradeoff between the required complexity and the gain achieved using phase noise suppression is discussed.

Submitted 6 November, 2014; v1 submitted 24 January, 2014; originally announced January 2014.

Comments: Published in IEEE transactions on wireless communications on October-2014. Please refer to the IEEE version for the most updated document

arXiv:1307.4149 [pdf, other]

doi 10.1109/GLOCOM.2013.6831595

Self-Interference Cancellation with Phase Noise Induced ICI Suppression for Full-Duplex Systems

Authors: Elsayed Ahmed, Ahmed M. Eltawil, Ashutosh Sabharwal

Abstract: One of the main bottlenecks in practical full-duplex systems is the oscillator phase noise, which bounds the possible cancellable self-interference power. In this paper, a digital-domain self-interference cancellation scheme for full-duplex orthogonal frequency division multiplexing systems is proposed. The proposed scheme increases the amount of cancellable self-interference power by suppressing the effect of both transmitter and receiver oscillator phase noise. The proposed scheme consists of two main phases, an estimation phase and a cancellation phase. In the estimation phase, the minimum mean square error estimator is used to jointly estimate the transmitter and receiver phase noise associated with the incoming self-interference signal. In the cancellation phase, the estimated phase noise is used to suppress the intercarrier interference caused by the phase noise associated with the incoming self-interference signal. The performance of the proposed scheme is numerically investigated under different operating conditions. It is demonstrated that the proposed scheme could achieve up to 9dB more self-interference cancellation than the existing digital-domain cancellation schemes that ignore the intercarrier interference suppression.

Submitted 15 July, 2013; originally announced July 2013.

Comments: To be presented in Global Telecommunications Conference (GLOBECOM 2013). arXiv admin note: text overlap with arXiv:1307.3796

arXiv:1307.3796 [pdf, other]

doi 10.1109/ACSSC.2013.6810483

Self-Interference Cancellation with Nonlinear Distortion Suppression for Full-Duplex Systems

Authors: Elsayed Ahmed, Ahmed M. Eltawil, Ashutosh Sabharwal

Abstract: In full-duplex systems, due to the strong self-interference signal, system nonlinearities become a significant limiting factor that bounds the possible cancellable self-interference power. In this paper, a self-interference cancellation scheme for full-duplex orthogonal frequency division multiplexing systems is proposed. The proposed scheme increases the amount of cancellable self-interference power by suppressing the distortion caused by the transmitter and receiver nonlinearities. An iterative technique is used to jointly estimate the self-interference channel and the nonlinearity coefficients required to suppress the distortion signal. The performance is numerically investigated showing that the proposed scheme achieves a performance that is less than 0.5dB off the performance of a linear full-duplex system.

Submitted 23 September, 2013; v1 submitted 14 July, 2013; originally announced July 2013.

Comments: To be presented in Asilomar Conference on Signals, Systems & Computers (November 2013)

arXiv:1303.1795 [pdf, other]

doi 10.1109/TWC.2013.060413.121871

Rate Gain Region and Design Tradeoffs for Full-Duplex Wireless Communications

Authors: Elsayed Ahmed, Ahmed Eltawil, Ashutosh Sabharwal

Abstract: In this paper, we analytically study the regime in which practical full-duplex systems can achieve larger rates than equivalent half-duplex systems. The key challenge in practical full-duplex systems is the uncancelled self-interference signal, which is caused by a combination of hardware and implementation imperfections. Thus, we first present a signal model which captures the effect of significant impairments such as oscillator phase noise, low-noise amplifier noise figure, mixer noise, and analog-to-digital converter quantization noise. Using the detailed signal model, we study the rate gain region, which is defined as the region of received signal-of-interest strength where full-duplex systems outperform half-duplex systems in terms of achievable rate. The rate gain region is derived as a piece-wise linear approximation in log-domain, and numerical results show that the approximation closely matches the exact region. Our analysis shows that when phase noise dominates mixer and quantization noise, full-duplex systems can use either active analog cancellation or base-band digital cancellation to achieve near-identical rate gain regions. Finally, as a design example, we numerically investigate the full-duplex system performance and rate gain region in typical indoor environments for practical wireless applications.

Submitted 24 January, 2014; v1 submitted 7 March, 2013; originally announced March 2013.

Comments: Accepted on 09-May-2013 for publications at IEEE Transactions on Wireless Communications (check the IEEE website for the final published version)

Journal ref: Wireless Communications, IEEE Transactions on , vol.12, no.7, pp.3556,3565, July 2013

Showing 1–45 of 45 results for author: Eltawil, A