-
Unlabeled Compressed Sensing from Multiple Measurement Vectors
Authors:
Mohamed Akrout,
Amine Mezghani,
Faouzi Bellili
Abstract:
This paper introduces an algorithmic solution to a broader class of unlabeled sensing problems with multiple measurement vectors (MMV). The goal is to recover an unknown structured signal matrix, $\mathbf{X}$, from its noisy linear observation matrix, $\mathbf{Y}$, whose rows are further randomly shuffled by an unknown permutation matrix $\mathbf{U}$. A new Bayes-optimal unlabeled compressed sensing (UCS) recovery algorithm is developed from the bilinear approximate message passing (Bi-VAMP) framework using non-separable and coupled priors on the rows and columns of the permutation matrix $\mathbf{U}$. In particular, standard unlabeled sensing is a special case of the proposed framework, and UCS further generalizes it by neither assuming a partially shuffled signal matrix $\mathbf{X}$ nor a small-sized permutation matrix $\mathbf{U}$. For the sake of theoretical performance prediction, we also conduct a state evolution (SE) analysis of the proposed algorithm and show its consistency with the asymptotic empirical mean-squared error (MSE). Numerical results demonstrate the effectiveness of the proposed UCS algorithm and its advantage over state-of-the-art baseline approaches in various applications. We also numerically examine the phase transition diagrams of UCS, thereby characterizing the detectability region as a function of the signal-to-noise ratio (SNR).
Submitted 12 June, 2024;
originally announced June 2024.
-
Next-slot OFDM-CSI Prediction: Multi-head Self-attention or State Space Model?
Authors:
Mohamed Akrout,
Faouzi Bellili,
Amine Mezghani,
Robert W. Heath
Abstract:
The ongoing fifth-generation (5G) standardization is exploring the use of deep learning (DL) methods to enhance the new radio (NR) interface. Both in academia and industry, researchers are investigating the performance and complexity of multiple DL architecture candidates for specific one-sided and two-sided use cases such as channel state information (CSI) feedback, CSI prediction, beam management, and positioning. In this paper, we focus on the CSI prediction task and study the performance and generalization of the two main DL layers that are being extensively benchmarked within the DL community, namely, multi-head self-attention (MSA) and state-space model (SSM). We train and evaluate MSA and SSM layers to predict the next slot for uplink and downlink communication scenarios over urban microcell (UMi) and urban macrocell (UMa) OFDM 5G channel models. Our numerical results demonstrate that SSMs exhibit better prediction and generalization capabilities than MSAs only for SISO cases. For MIMO scenarios, however, the MSA layer outperforms the SSM one. While both layers represent potential DL architectures for future DL-enabled 5G use cases, the overall investigation of this paper favors MSAs over SSMs.
Submitted 17 May, 2024;
originally announced May 2024.
-
Vector Approximate Message Passing With Arbitrary I.I.D. Noise Priors
Authors:
Mohamed Akrout,
Tiancheng Gao,
Faouzi Bellili,
Amine Mezghani
Abstract:
Approximate message passing (AMP) algorithms are devised under the Gaussianity assumption of the measurement noise vector. In this work, we relax this assumption within the vector AMP (VAMP) framework to arbitrary independent and identically distributed (i.i.d.) noise priors. We do so by rederiving the linear minimum mean square error (LMMSE) to accommodate both the noise and signal estimations within the message passing steps of VAMP. Numerical results demonstrate how our proposed algorithm handles non-Gaussian noise models as compared to VAMP. This extension to general noise priors enables the use of AMP algorithms in a wider range of engineering applications where non-Gaussian noise models are more appropriate.
Submitted 6 February, 2024;
originally announced February 2024.
-
Representations Matter: Embedding Modes of Large Language Models using Dynamic Mode Decomposition
Authors:
Mohamed Akrout
Abstract:
Existing large language models (LLMs) are known for generating "hallucinated" content, namely, fabricated text of plausible-looking, yet unfounded, facts. To identify when these hallucination scenarios occur, we examine the properties of the generated text in the embedding space. Specifically, we draw inspiration from the dynamic mode decomposition (DMD) tool in analyzing the pattern evolution of text embeddings across sentences. We empirically demonstrate how the spectrum of sentence embeddings over paragraphs is constantly low-rank for the generated text, unlike that of the ground-truth text. Importantly, we find that evaluation cases having LLM hallucinations correspond to ground-truth embedding patterns with a higher number of modes being poorly approximated by the few modes associated with LLM embedding patterns. In analogy to near-field electromagnetic evanescent waves, the embedding DMD eigenmodes of the generated text with hallucinations vanish quickly across sentences as opposed to those of the ground-truth text. This suggests that the hallucinations result from both the generation techniques and the underlying representation.
Submitted 3 September, 2023;
originally announced September 2023.
-
From Multilayer Perceptron to GPT: A Reflection on Deep Learning Research for Wireless Physical Layer
Authors:
Mohamed Akrout,
Amine Mezghani,
Ekram Hossain,
Faouzi Bellili,
Robert W. Heath
Abstract:
Most research studies on deep learning (DL) applied to the physical layer of wireless communication do not put forward the critical role of the accuracy-generalization trade-off in developing and evaluating practical algorithms. To highlight the disadvantage of this common practice, we revisit a data decoding example from one of the first papers introducing DL-based end-to-end wireless communication systems to the research community and promoting the use of artificial intelligence (AI)/DL for the wireless physical layer. We then put forward two key trade-offs in designing DL models for communication, namely, accuracy versus generalization and compression versus latency. We discuss their relevance in the context of wireless communications use cases using emerging DL models including large language models (LLMs). Finally, we summarize our proposed evaluation guidelines to enhance the research impact of DL on wireless communications. These guidelines are an attempt to reconcile the empirical nature of DL research with the rigorous requirement metrics of wireless communications systems.
Submitted 14 July, 2023;
originally announced July 2023.
-
Domain Generalization in Machine Learning Models for Wireless Communications: Concepts, State-of-the-Art, and Open Issues
Authors:
Mohamed Akrout,
Amal Feriani,
Faouzi Bellili,
Amine Mezghani,
Ekram Hossain
Abstract:
Data-driven machine learning (ML) is promoted as one potential technology to be used in next-generation wireless systems. This led to a large body of research work that applies ML techniques to solve problems in different layers of the wireless transmission link. However, most of these applications rely on supervised learning which assumes that the source (training) and target (test) data are independent and identically distributed (i.i.d.). This assumption is often violated in the real world due to domain or distribution shifts between the source and the target data. Thus, it is important to ensure that these algorithms generalize to out-of-distribution (OOD) data. In this context, domain generalization (DG) tackles the OOD-related issues by learning models on different and distinct source domains/datasets with generalization capabilities to unseen new domains without additional finetuning. Motivated by the importance of DG requirements for wireless applications, we present a comprehensive overview of the recent developments in DG and the different sources of domain shift. We also summarize the existing DG methods and review their applications in selected wireless communication problems, and conclude with insights and open questions.
Submitted 13 March, 2023;
originally announced March 2023.
-
Physically Consistent Models for Intelligent Reflective Surface-assisted Communications under Mutual Coupling and Element Size Constraint
Authors:
Mohamed Akrout,
Faouzi Bellili,
Amine Mezghani,
Josef A. Nossek
Abstract:
We investigate the benefits of mutual coupling effects between the passive elements of intelligent reconfigurable surfaces (IRSs) on maximizing the achievable rate of downlink Internet-of-Things (IoT) networks. In this paper, we present an electromagnetic (EM) coupling model for IRSs whose elements are connected minimum scattering antennas (i.e., dipoles). Using Chu's theory, we incorporate the finite antenna size constraint on each element of the IRS to obtain the IRS mutual impedance matrix. By optimizing the IRS phase shifters using the gradient ascent procedure, our numerical results show that mutual coupling is indeed crucial to avoid the achievable rate degradation when the spacing between IRS elements is down to a fraction of the wavelength.
Submitted 21 February, 2023;
originally announced February 2023.
-
Diffusion-based Data Augmentation for Skin Disease Classification: Impact Across Original Medical Datasets to Fully Synthetic Images
Authors:
Mohamed Akrout,
Bálint Gyepesi,
Péter Holló,
Adrienn Poór,
Blága Kincső,
Stephen Solis,
Katrina Cirone,
Jeremy Kawahara,
Dekker Slade,
Latif Abid,
Máté Kovács,
István Fazekas
Abstract:
Despite continued advancement in recent years, deep neural networks still rely on large amounts of training data to avoid overfitting. However, labeled training data for real-world applications such as healthcare is limited and difficult to access given longstanding privacy concerns and strict data-sharing policies. By manipulating image datasets in the pixel or feature space, existing data augmentation techniques represent one of the effective ways to improve the quantity and diversity of training data. Here, we look to advance augmentation techniques by building upon the emerging success of text-to-image diffusion probabilistic models in augmenting the training samples of our macroscopic skin disease dataset. We do so by enabling fine-grained control of the image generation process via input text prompts. We demonstrate that this generative data augmentation approach successfully maintains a similar classification accuracy of the visual classifier even when trained on a fully synthetic skin disease dataset. Similar to recent applications of generative models, our study suggests that diffusion models are indeed effective in generating high-quality skin images that do not sacrifice the classifier performance, and can improve the augmentation of training datasets after curation.
Submitted 11 January, 2023;
originally announced January 2023.
-
Continual Learning-Based MIMO Channel Estimation: A Benchmarking Study
Authors:
Mohamed Akrout,
Amal Feriani,
Faouzi Bellili,
Amine Mezghani,
Ekram Hossain
Abstract:
With the proliferation of deep learning techniques for wireless communication, several works have adopted learning-based approaches to solve the channel estimation problem. While these methods are usually promoted for their computational efficiency at inference time, their use is restricted to specific stationary training settings in terms of communication system parameters, e.g., signal-to-noise ratio (SNR) and coherence time. Therefore, the performance of these learning-based solutions will degrade when the models are tested on different settings than the ones used for training. This motivates our work in which we investigate continual supervised learning (CL) to mitigate the shortcomings of the current approaches. In particular, we design a set of channel estimation tasks wherein we vary different parameters of the channel model. We focus on Gauss-Markov Rayleigh fading channel estimation to assess the impact of non-stationarity on performance in terms of the mean square error (MSE) criterion. We study a selection of state-of-the-art CL methods and we showcase empirically the importance of catastrophic forgetting in continuously evolving channel settings. Our results demonstrate that the CL algorithms can improve the interference performance in two channel estimation tasks governed by changes in the SNR level and coherence time.
Submitted 19 November, 2022;
originally announced November 2022.
-
A 35-Year Longitudinal Analysis of Dermatology Patient Behavior across Economic & Cultural Manifestations in Tunisia, and the Impact of Digital Tools
Authors:
Mohamed Akrout,
Hayet Amdouni,
Amal Feriani,
Monia Kourda,
Latif Abid
Abstract:
The evolution of behavior of dermatology patients has seen significantly accelerated change over the past decade, driven by surging availability and adoption of digital tools and platforms. Through our longitudinal analysis of this behavior within Tunisia over a 35-year time frame, we identify behavioral patterns across economic and cultural dimensions and how digital tools have impacted those patterns in preceding years. Throughout this work, we highlight the witnessed effects of available digital tools as experienced by patients, and conclude by presenting a vision for how future tools can help address the issues identified across economic and cultural manifestations. Our analysis is further framed around three types of digital tools: "Dr. Google", social media, and artificial intelligence (AI) tools, and across three stages of clinical care: pre-visit, in-visit, and post-visit.
Submitted 4 August, 2022;
originally announced August 2022.
-
Super-Wideband Massive MIMO
Authors:
Mohamed Akrout,
Volodymyr Shyianov,
Faouzi Bellili,
Amine Mezghani,
Robert W. Heath
Abstract:
We present a unified model for connected antenna arrays with a large number of tightly integrated (i.e., coupled) antennas in a compact space within the context of massive multiple-input multiple-output (MIMO) communication. We refer to this system as tightly-coupled massive MIMO. From an information-theoretic perspective, scaling the design of tightly-coupled massive MIMO systems in terms of the number of antennas, the operational bandwidth, and form factor was not addressed in prior art. We investigate this open research problem using a physically consistent modeling approach for far-field (FF) MIMO communication based on multi-port circuit theory. In doing so, we turn mutual coupling (MC) from a foe to a friend of MIMO systems design, thereby challenging a basic precept in antenna systems engineering that promotes MC mitigation/compensation. We show that tight MC widens the operational bandwidth of antenna arrays, thereby unleashing a missing MIMO gain that we coin "bandwidth gain". Furthermore, we derive analytically the asymptotically optimum spacing-to-antenna-size ratio by establishing a condition for tight coupling in the limit of large-size antenna arrays with quasi-continuous apertures. We also optimize the antenna array size while maximizing the achievable rate under fixed transmit power and inter-element spacing. Then, we study the impact of MC on the achievable rate of MIMO systems under line-of-sight (LoS) and Rayleigh fading channels. These results reveal new insights into the design of tightly-coupled massive antenna arrays as opposed to the widely-adopted "disconnected" designs that disregard MC by putting faith in the half-wavelength spacing rule.
Submitted 6 May, 2023; v1 submitted 2 August, 2022;
originally announced August 2022.
-
Dynamic Noises of Multi-Agent Environments Can Improve Generalization: Agent-based Models meets Reinforcement Learning
Authors:
Mohamed Akrout,
Amal Feriani,
Bob McLeod
Abstract:
We study the benefits of reinforcement learning (RL) environments based on agent-based models (ABM). While ABMs are known to offer microfoundational simulations at the cost of computational complexity, we empirically show in this work that their non-deterministic dynamics can improve the generalization of RL agents. To this end, we examine the control of epidemic SIR environments based on either differential equations or ABMs. Numerical simulations demonstrate that the intrinsic noise in the ABM-based dynamics of the SIR model not only improves the average reward but also allows the RL agent to generalize on a wider range of epidemic parameters.
Submitted 26 March, 2022;
originally announced April 2022.
-
Achievable Rate of Near-Field Communications Based on Physically Consistent Models
Authors:
Mohamed Akrout,
Volodymyr Shyianov,
Faouzi Bellili,
Amine Mezghani,
Robert W. Heath
Abstract:
This paper introduces a novel information-theoretic approach for studying the effects of mutual coupling (MC), between the transmit and receive antennas, on the overall performance of single-input-single-output (SISO) near-field communications. By incorporating the finite antenna size constraint using Chu's theory and under the assumption of canonical-minimum scattering, we derive the MC between two radiating volumes of fixed sizes. Expressions for the self and mutual impedances are obtained by the use of the reciprocity theorem. Based on a circuit-theoretic two-port model for SISO radio communication systems, we establish the achievable rate for a given pair of transmit and receive antenna sizes, thereby providing an upper bound on the system performance under physical size constraints. Through the lens of these findings, we shed new light on the influence of MC on the information-theoretic limits of near-field communications using compact antennas.
Submitted 9 December, 2021; v1 submitted 17 November, 2021;
originally announced November 2021.
-
On a Conjecture Regarding the Adam Optimizer
Authors:
Mohamed Akrout,
Douglas Tweed
Abstract:
Why does the Adam optimizer work so well in deep-learning applications? Adam's originators, Kingma and Ba, presented a mathematical argument that was meant to help explain its success, but Bock and colleagues have since reported that a key piece is missing from that argument $-$ an unproven lemma which we will call Bock's conjecture. Here we show that this conjecture is false, but we prove a modified version of it $-$ a generalization of a result of Reddi and colleagues $-$ which can take its place in analyses of Adam.
Submitted 8 September, 2022; v1 submitted 15 November, 2021;
originally announced November 2021.
-
Optimizing Binary Symptom Checkers via Approximate Message Passing
Authors:
Mohamed Akrout,
Faouzi Bellili,
Amine Mezghani,
Hayet Amdouni
Abstract:
Symptom checkers have been widely adopted as an intelligent e-healthcare application during the ongoing pandemic crisis. Their performance has been limited by the fine-grained quality of the collected medical knowledge between symptoms and diseases. While the binarization of the relationships between symptoms and diseases simplifies the data collection process, it also leads to non-convex optimization problems during the inference step. In this paper, we formulate the symptom checking problem as an underdetermined non-convex optimization problem, thereby justifying the use of the compressive sensing framework to solve it. We show that the generalized vector approximate message passing (G-VAMP) algorithm provides the best performance for binary symptom checkers.
Submitted 30 October, 2021;
originally announced November 2021.
-
Benchmarking the Accuracy and Robustness of Feedback Alignment Algorithms
Authors:
Albert Jiménez Sanfiz,
Mohamed Akrout
Abstract:
Backpropagation is the default algorithm for training deep neural networks due to its simplicity, efficiency and high convergence rate. However, its requirements make it impossible to implement in a human brain. In recent years, more biologically plausible learning methods have been proposed. Some of these methods can match backpropagation accuracy, and simultaneously provide other extra benefits such as faster training on specialized hardware (e.g., ASICs) or higher robustness against adversarial attacks. While the interest in the field is growing, there is a need for open-source libraries and toolkits to foster research and benchmark algorithms. In this paper, we present BioTorch, a software framework to create, train, and benchmark biologically motivated neural networks. In addition, we investigate the performance of several feedback alignment methods proposed in the literature, thereby unveiling the importance of the forward and backward weight initialization and optimizer choice. Finally, we provide a novel robustness study of these methods against state-of-the-art white and black-box adversarial attacks.
Submitted 30 August, 2021;
originally announced August 2021.
-
Achievable Rate with Antenna Size Constraint: Shannon meets Chu and Bode
Authors:
Volodymyr Shyianov,
Mohamed Akrout,
Faouzi Bellili,
Amine Mezghani,
Robert W. Heath
Abstract:
Using ideas from Chu and Bode/Fano theories, we characterize the maximum achievable rate over the single-input single-output wireless communication channels under a restriction on the antenna size at the receiver. By employing circuit-theoretic multiport models for radio communication systems, we derive the information-theoretic limits of compact antennas. We first describe an equivalent Chu's antenna circuit under the physical realizability conditions of its reflection coefficient. Such a design allows us to subsequently compute the achievable rate for a given receive antenna size, thereby providing a physical bound on the system performance that we compare to the standard size-unconstrained Shannon capacity. We also determine the effective signal-to-noise ratio (SNR) which strongly depends on the antenna size and experiences an apparent finite-size performance degradation where only a fraction of Shannon capacity can be achieved. We further determine the optimal signaling bandwidth which shows that impedance matching is essential in both narrowband and broadband scenarios. We also examine the achievable rate in the presence of interference, showing that the size constraint is immaterial in interference-limited scenarios. Finally, our numerical results of the derived achievable rate as a function of the antenna size and the SNR reveal new insights for the physically consistent design of radio systems.
Submitted 16 July, 2021; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Bilinear Generalized Vector Approximate Message Passing
Authors:
Mohamed Akrout,
Anis Housseini,
Faouzi Bellili,
Amine Mezghani
Abstract:
We introduce the bilinear generalized vector approximate message passing (BiG-VAMP) algorithm which jointly recovers two matrices U and V from their noisy product through a probabilistic observation model. BiG-VAMP provides computationally efficient approximate implementations of both max-sum and sum-product loopy belief propagation (BP). We show how the proposed BiG-VAMP algorithm recovers different types of structured matrices and overcomes the fundamental limitations of other state-of-the-art approaches to the bilinear recovery problem, such as BiG-AMP, BAd-VAMP and LowRAMP. In essence, BiG-VAMP applies to a broader class of practical applications which involve a general form of structured matrices. For the sake of theoretical performance prediction, we also conduct a state evolution (SE) analysis of the proposed algorithm and show its consistency with the asymptotic empirical mean-squared error (MSE). Numerical results on various applications such as matrix factorization, dictionary learning, and matrix completion demonstrate unambiguously the effectiveness of the proposed BiG-VAMP algorithm and its superiority over state-of-the-art algorithms. Using the developed SE framework, we also examine (as one example) the phase transition diagrams of the matrix completion problem, thereby unveiling a low detectability region corresponding to the low signal-to-noise ratio (SNR) regime.
Submitted 14 September, 2020;
originally announced September 2020.
-
Distributed Uplink Beamforming in Cell-Free Networks Using Deep Reinforcement Learning
Authors:
Firas Fredj,
Yasser Al-Eryani,
Setareh Maghsudi,
Mohamed Akrout,
Ekram Hossain
Abstract:
The emergence of new wireless technologies together with the requirement of massive connectivity results in several technical issues such as excessive interference, high computational demand for signal processing, and lengthy processing delays. In this work, we propose several beamforming techniques for an uplink cell-free network with centralized, semi-distributed, and fully distributed processing, all based on deep reinforcement learning (DRL). First, we propose a fully centralized beamforming method that uses the deep deterministic policy gradient algorithm (DDPG) with continuous space. We then enhance this method by enabling distributed experience at access points (AP). Indeed, we develop a beamforming scheme that uses the distributed distributional deterministic policy gradients algorithm (D4PG) with the APs representing the distributed agents. Finally, to decrease the computational complexity, we propose a fully distributed beamforming scheme that divides the beamforming computations among APs. The results show that the D4PG scheme with distributed experience achieves the best performance irrespective of the network size. Furthermore, the proposed distributed beamforming technique performs better than the DDPG algorithm with centralized learning only for small-scale networks. The performance superiority of the DDPG model becomes more evident as the number of APs and/or users increases. Moreover, during the operation stage, all DRL models demonstrate a significantly shorter processing time than that of the conventional gradient descent (GD) solution.
Submitted 21 October, 2021; v1 submitted 26 June, 2020;
originally announced June 2020.
-
Simultaneous Energy Harvesting and Information Transmission in a MIMO Full-Duplex System: A Machine Learning-Based Design
Authors:
Yasser Al-Eryani,
Mohamed Akrout,
Ekram Hossain
Abstract:
We propose a multiple-input multiple-output (MIMO)-based full-duplex (FD) scheme that enables wireless devices to simultaneously transmit information and harvest energy using the same time-frequency resources. In this scheme, for a MIMO point-to-point set up, the energy transmitting device simultaneously receives information from the energy harvesting device. Furthermore, the self-interference (SI) at the energy harvesting device caused by the FD mode of operation is utilized as a desired power signal to be harvested by the device. For implementation-friendly antenna selection and MIMO precoding at both the devices, we propose two methods: (i) a sub-optimal method based on relaxation, and (ii) a hybrid deep reinforcement learning (DRL)-based method, specifically, a deep deterministic policy gradient (DDPG)-deep double Q-network (DDQN) method. Finally, we study the performance of the proposed system under the two implementation methods and compare it with that of the conventional time switching-based simultaneous wireless information and power transfer (SWIPT) method.
Findings show that the proposed system gives a significant improvement in spectral efficiency compared to the time switching-based SWIPT. In particular, the DRL-based method provides the highest spectral efficiency. Furthermore, numerical results show that, for the considered system set up, the number of antennas in each device should exceed three to mitigate self-interference to an acceptable level.
Submitted 13 February, 2020;
originally announced February 2020.
-
Multiple Access in Dynamic Cell-Free Networks: Outage Performance and Deep Reinforcement Learning-Based Design
Authors:
Yasser Al-Eryani,
Mohamed Akrout,
Ekram Hossain
Abstract:
In future cell-free (or cell-less) wireless networks, a large number of devices in a geographical area will be served simultaneously in non-orthogonal multiple access scenarios by a large number of distributed access points (APs), which coordinate with a centralized processing pool. For such a centralized cell-free network with static predefined beamforming design, we first derive a closed-form expression of the uplink per-user probability of outage. To significantly reduce the complexity of joint processing of users' signals in the presence of a large number of devices and APs, we propose a novel dynamic cell-free network architecture. In this architecture, the distributed APs are partitioned (i.e. clustered) among a set of subgroups with each subgroup acting as a virtual AP equipped with a distributed antenna system (DAS). The conventional static cell-free network is a special case of this dynamic cell-free network when the cluster size is one. For this dynamic cell-free network, we propose a successive interference cancellation (SIC)-enabled signal detection method and an inter-user-interference (IUI)-aware DAS's receive diversity combining scheme. We then formulate the general problem of clustering APs and designing the beamforming vectors with the objective of maximizing either the sum rate or the minimum rate. To this end, we propose a hybrid deep reinforcement learning (DRL) model, namely, a deep deterministic policy gradient (DDPG)-deep double Q-network (DDQN) model, to solve the optimization problem for online implementation with low complexity. The DRL model for sum-rate optimization significantly outperforms that for maximizing the minimum rate in terms of average per-user rate performance. Also, in our system setting, the proposed DDPG-DDQN scheme is found to achieve around $78\%$ of the rate achievable through an exhaustive search-based design.
Submitted 23 February, 2020; v1 submitted 28 January, 2020;
originally announced February 2020.
-
Machine Ethics: The Creation of a Virtuous Machine
Authors:
Mohamed Akrout,
Robert Steinbauer
Abstract:
Artificial intelligence (AI) was initially developed as an implicit moral agent to solve simple and clearly defined tasks where all options are predictable. However, it is now part of our daily life powering cell phones, cameras, watches, thermostats, vacuums, cars, and much more. This has raised numerous concerns and some scholars and practitioners stress the dangers of AI and argue against its development as moral agents that can reason about ethics (e.g., Bryson 2008; Johnson and Miller 2008; Sharkey 2017; Tonkens 2009; van Wynsberghe and Robbins 2019). Even though we acknowledge the potential threat, in line with most other scholars (e.g., Anderson and Anderson 2010; Moor 2006; Scheutz 2016; Wallach 2010), we argue that AI advancements cannot be stopped and developers need to prepare AI to sustain explicit moral agents and face ethical dilemmas in complex and morally salient environments.
Submitted 7 February, 2020; v1 submitted 1 February, 2020;
originally announced February 2020.
-
On the Adversarial Robustness of Neural Networks without Weight Transport
Authors:
Mohamed Akrout
Abstract:
Neural networks trained with backpropagation, the standard algorithm of deep learning which uses weight transport, are easily fooled by existing gradient-based adversarial attacks. This class of attacks is based on certain small perturbations of the inputs to make networks misclassify them. We show that less biologically implausible deep neural networks trained with feedback alignment, which do not use weight transport, can be harder to fool, providing actual robustness. Tested on MNIST, deep neural networks trained without weight transport (1) have an adversarial accuracy of 98% compared to 0.03% for neural networks trained with backpropagation and (2) generate non-transferable adversarial examples. This gap decreases on CIFAR-10 but remains significant, particularly for small perturbation magnitudes below 1/2.
Submitted 2 October, 2019; v1 submitted 9 August, 2019;
originally announced August 2019.
-
Deep Learning without Weight Transport
Authors:
Mohamed Akrout,
Collin Wilson,
Peter C. Humphreys,
Timothy Lillicrap,
Douglas Tweed
Abstract:
Current algorithms for deep learning probably cannot run in the brain because they rely on weight transport, where forward-path neurons transmit their synaptic weights to a feedback path, in a way that is likely impossible biologically. An algorithm called feedback alignment achieves deep learning without weight transport by using random feedback weights, but it performs poorly on hard visual-recognition tasks. Here we describe two mechanisms - a neural circuit called a weight mirror and a modification of an algorithm proposed by Kolen and Pollack in 1994 - both of which let the feedback path learn appropriate synaptic weights quickly and accurately even in large networks, without weight transport or complex wiring. Tested on the ImageNet visual-recognition task, these mechanisms outperform both feedback alignment and the newer sign-symmetry method, and nearly match backprop, the standard algorithm of deep learning, which uses weight transport.
Submitted 9 January, 2020; v1 submitted 10 April, 2019;
originally announced April 2019.
-
Improving Skin Condition Classification with a Visual Symptom Checker Trained using Reinforcement Learning
Authors:
Mohamed Akrout,
Amir-massoud Farahmand,
Tory Jarmain,
Latif Abid
Abstract:
We present a visual symptom checker that combines a pre-trained Convolutional Neural Network (CNN) with a Reinforcement Learning (RL) agent as a Question Answering (QA) model. This method increases the classification confidence and accuracy of the visual symptom checker, and decreases the average number of questions asked to narrow down the differential diagnosis. A Deep Q-Network (DQN)-based RL agent learns how to ask the patient about the presence of symptoms in order to maximize the probability of correctly identifying the underlying condition. The RL agent uses the visual information provided by the CNN, in addition to the answers to the asked questions, to guide the QA system. We demonstrate that the RL-based approach increases the accuracy by more than 20% compared to the CNN-only approach, which only uses the visual information to predict the condition. Moreover, the accuracy gain is up to 10% compared to the approach that uses the visual information provided by the CNN along with a conventional decision tree-based QA system. We finally show that the RL-based approach not only outperforms the decision tree-based approach, but also narrows down the diagnosis faster in terms of the average number of asked questions.
Submitted 7 August, 2019; v1 submitted 8 March, 2019;
originally announced March 2019.
-
Hacking Google reCAPTCHA v3 using Reinforcement Learning
Authors:
Ismail Akrout,
Amal Feriani,
Mohamed Akrout
Abstract:
We present a Reinforcement Learning (RL) methodology to bypass Google reCAPTCHA v3. We formulate the problem as a grid world where the agent learns how to move the mouse and click on the reCAPTCHA button to receive a high score. We study the performance of the agent when we vary the cell size of the grid world and show that the performance drops when the agent takes big steps toward the goal. Finally, we use a divide-and-conquer strategy to defeat the reCAPTCHA system for any grid resolution. Our proposed method achieves a success rate of 97.4% on a 100x100 grid and 96.7% on a 1000x1000 screen resolution.
Submitted 18 April, 2019; v1 submitted 3 March, 2019;
originally announced March 2019.
-
Improving Skin Condition Classification with a Question Answering Model
Authors:
Mohamed Akrout,
Amir-massoud Farahmand,
Tory Jarmain
Abstract:
We present a skin condition classification methodology based on a sequential pipeline of a pre-trained Convolutional Neural Network (CNN) and a Question Answering (QA) model. This method not only increases the classification confidence and accuracy of the deployed CNN system, but also emulates the conventional approach of doctors asking the relevant questions to refine the ultimate diagnosis and differential. By combining the CNN output in the form of classification probabilities as a prior to the QA model and the image textual description, we greedily ask the best symptom that maximizes the information gain over symptoms. We demonstrate that combining the QA model with the CNN increases the accuracy by up to 10% as compared to the CNN alone, and by more than 30% as compared to the QA model alone.
Submitted 14 November, 2018;
originally announced November 2018.
-
TBD: Benchmarking and Analyzing Deep Neural Network Training
Authors:
Hongyu Zhu,
Mohamed Akrout,
Bojian Zheng,
Andrew Pelegris,
Amar Phanishayee,
Bianca Schroeder,
Gennady Pekhimenko
Abstract:
The recent popularity of deep neural networks (DNNs) has generated a lot of research interest in performing DNN-related computation efficiently. However, the primary focus is usually very narrow and limited to (i) inference -- i.e. how to efficiently execute already trained models and (ii) image classification networks as the primary benchmark for evaluation.
Our primary goal in this work is to break this myopic view by (i) proposing a new benchmark for DNN training, called TBD (TBD is short for Training Benchmark for DNNs), that uses a representative set of DNN models that cover a wide range of machine learning applications: image classification, machine translation, speech recognition, object detection, adversarial networks, reinforcement learning, and (ii) by performing an extensive performance analysis of training these different applications on three major deep learning frameworks (TensorFlow, MXNet, CNTK) across different hardware configurations (single-GPU, multi-GPU, and multi-machine). TBD currently covers six major application domains and eight different state-of-the-art models.
We present a new toolchain for performance analysis for these models that combines the targeted usage of existing performance analysis tools, careful selection of new and existing metrics and methodologies to analyze the results, and utilization of domain specific characteristics of DNN training. We also build a new set of tools for memory profiling in all three major frameworks; much needed tools that can finally shed some light on precisely how much memory is consumed by different data structures (weights, activations, gradients, workspace) in DNN training. By using our tools and methodologies, we make several important observations and recommendations on where the future research and optimization of DNN training should be focused.
Submitted 13 April, 2018; v1 submitted 16 March, 2018;
originally announced March 2018.