
SeNMo: A Self-Normalizing Deep Learning Model for Enhanced Multi-Omics Data Analysis in Oncology

Authors: Asim Waqas, Aakash Tripathi, Sabeen Ahmed, Ashwin Mukund, Hamza Farooq, Matthew B. Schabath, Paul Stewart, Mia Naeini, Ghulam Rasool

Abstract: Multi-omics research has enhanced our understanding of cancer heterogeneity and progression. Investigating molecular data through multi-omics approaches is crucial for unraveling the complex biological mechanisms underlying cancer, thereby enabling effective diagnosis, treatment, and prevention strategies. However, predicting patient outcomes through integration of all available multi-omics data is an understudied research direction. Here, we present SeNMo (Self-normalizing Network for Multi-omics), a deep neural network trained on multi-omics data across 33 cancer types. SeNMo is efficient in handling multi-omics data characterized by high-width (many features) and low-length (fewer samples) attributes. We trained SeNMo for the task of overall survival prediction using pan-cancer data involving 33 cancer sites from Genomics Data Commons (GDC). The training data includes gene expression, DNA methylation, miRNA expression, DNA mutations, protein expression modalities, and clinical data. We evaluated the model's performance in predicting overall survival using the concordance index (C-Index). SeNMo performed consistently well in the training regime, with a validation C-Index of 0.76 on GDC's public data. In the testing regime, SeNMo performed with a C-Index of 0.758 on a held-out test set. The model showed an average accuracy of 99.8% on the task of classifying the primary cancer type on the pan-cancer test cohort.
SeNMo proved to be a mini-foundation model for multi-omics oncology data because it demonstrated robust performance and adaptability not only across molecular data types but also on the classification task of predicting the primary cancer type of patients. SeNMo can be further scaled to any cancer site and molecular data type. We believe SeNMo and similar models are poised to transform the oncology landscape, offering hope for more effective, efficient, and patient-centric cancer care.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2404.15197 [pdf, other]

Multi-Task Learning as enabler for General-Purpose AI-native RAN

Authors: Hasan Farooq, Julien Forgeat, Shruti Bothe, Kristijonas Cyras, Md Moin

Abstract: The realization of the data-driven AI-native architecture envisioned for 6G and beyond networks can eventually lead to multiple machine learning (ML) workloads distributed at the network edges, driving downstream tasks like secondary carrier prediction, positioning, channel prediction, etc. The independent life-cycle management of these edge-distributed workloads sharing a resource-constrained compute node, e.g., a base station (BS), is a challenge that will scale with denser deployments. This study explores the effectiveness of multi-task learning (MTL) approaches in facilitating a general-purpose AI-native Radio Access Network (RAN). The investigation focuses on four RAN tasks: (i) secondary carrier prediction, (ii) user location prediction, (iii) indoor link classification, and (iv) line-of-sight (LoS) link classification. We validate the performance using realistic simulations considering multi-faceted design aspects of MTL, including model architecture, loss and gradient balancing strategies, distributed learning topology, data sparsity, and task groupings.
The quantification and insights from simulations reveal that, for the four RAN tasks considered, (i) adoption of a customized gate control-based expert architecture with uncertainty-based weighting makes MTL perform either best among all or on par with single-task learning (STL); (ii) the LoS classification task in the MTL setting helps other tasks, but its own performance is degraded; (iii) for sparse training data, training a single global MTL model is helpful, but MTL performance is on par with STL; (iv) an optimal set of group pairings exists for each task; and (v) partial federation is much better than full model federation in the MTL setting.

Submitted 5 April, 2024; originally announced April 2024.

Comments: Accepted for 2024 IEEE ICC Workshop on Edge Learning over 5G Mobile Networks and Beyond

arXiv:2308.02923 [pdf]

An AI-Enabled Framework to Defend Ingenious MDT-based Attacks on the Emerging Zero Touch Cellular Networks

Authors: Aneeqa Ijaz, Waseem Raza, Hasan Farooq, Marvin Manalastas, Ali Imran

Abstract: Deep automation provided by self-organizing network (SON) features and their emerging variants such as zero touch automation solutions is a key enabler for increasingly dense wireless networks and the pervasive Internet of Things (IoT). To realize their objectives, most automation functionalities rely on the Minimization of Drive Test (MDT) reports. The MDT reports are used to generate inferences about network state and performance, and thus to dynamically change network parameters accordingly. However, the collection of MDT reports from commodity user devices, particularly low-cost IoT devices, makes them a vulnerable entry point to launch an adversarial attack on emerging deeply automated wireless networks. This adds a new dimension to the security threats in IoT and cellular networks. Existing literature on IoT, SON, or zero touch automation does not address this important problem. In this paper, we investigate an impactful, first-of-its-kind adversarial attack that can be launched by exploiting malicious MDT reports from compromised user equipment (UE). We highlight the detrimental repercussions of this attack on the performance of common network automation functions. We also propose a novel Malicious MDT Reports Identification framework (MRIF) as a countermeasure to detect and eliminate the malicious MDT reports using Machine Learning, and verify it through a use-case. Thus, the defense mechanism can provide resilience and robustness for zero touch automation SON engines against adversarial MDT attacks.

Submitted 5 August, 2023; originally announced August 2023.

Comments: 15 pages, 5 figures, 1 table

arXiv:2304.12480 [pdf, other]

Towards Addressing Training Data Scarcity Challenge in Emerging Radio Access Networks: A Survey and Framework

Authors: Haneya Naeem Qureshi, Usama Masood, Marvin Manalastas, Syed Muhammad Asad Zaidi, Hasan Farooq, Julien Forgeat, Maxime Bouton, Shruti Bothe, Per Karlsson, Ali Rizwan, Ali Imran

Abstract: The future of cellular networks is contingent on artificial intelligence (AI) based automation, particularly for radio access network (RAN) operation, optimization, and troubleshooting. To achieve such zero-touch automation, a myriad of AI-based solutions are being proposed in the literature for modeling and optimizing network behavior. However, to work reliably, AI-based automation requires a deluge of training data. Consequently, the success of AI solutions is limited by a fundamental challenge faced by the cellular network research community: scarcity of training data. We present an extensive review of classic and emerging techniques to address this challenge. We first identify the common data types in RAN and their known use-cases. We then present a taxonomized survey of techniques to address training data scarcity for various data types. This is followed by a framework to address the training data scarcity. The framework builds on available information and a combination of techniques including interpolation, domain-knowledge based techniques, generative adversarial neural networks, transfer learning, autoencoders, few-shot learning, simulators, and testbeds. Potential new techniques to enrich scarce data in cellular networks are also proposed, such as matrix completion theory and domain knowledge-based techniques leveraging different network parameters and geometries. An overview of state-of-the-art simulators and testbeds is also presented to make readers aware of current and emerging platforms for real data access.
The extensive survey of techniques addressing training data scarcity, combined with the proposed framework to select a suitable technique for a given type of data, can assist researchers and network operators in choosing appropriate methods to overcome the data scarcity challenge in applying AI to radio access network automation.

Submitted 24 April, 2023; originally announced April 2023.

Comments: IEEE Surveys and Tutorials - accepted

arXiv:2201.12899 [pdf, other]

doi 10.1109/TMC.2022.3147191

Interpretable AI-based Large-scale 3D Pathloss Prediction Model for enabling Emerging Self-Driving Networks

Authors: Usama Masood, Hasan Farooq, Ali Imran, Adnan Abu-Dayya

Abstract: In modern wireless communication systems, radio propagation modeling to estimate pathloss has always been a fundamental task in system design and optimization. The state-of-the-art empirical propagation models are based on measurements in specific environments and are limited in their ability to capture idiosyncrasies of various propagation environments. To cope with this problem, ray-tracing based solutions are used in commercial planning tools, but they tend to be extremely time-consuming and expensive. We propose a Machine Learning (ML)-based model that leverages novel key predictors for estimating pathloss. By quantitatively evaluating the ability of various ML algorithms in terms of predictive, generalization and computational performance, our results show that the Light Gradient Boosting Machine (LightGBM) algorithm overall outperforms others, even with sparse training data, by providing a 65% increase in prediction accuracy as compared to empirical models and a 13x decrease in prediction time as compared to ray-tracing. To address the interpretability challenge that thwarts the adoption of most ML-based models, we perform extensive secondary analysis using the SHapley Additive exPlanations (SHAP) method, yielding many practically useful insights that can be leveraged for intelligently tuning the network configuration, selective enrichment of training data in real networks, and building lighter ML-based propagation models to enable low-latency use-cases.

Submitted 30 January, 2022; originally announced January 2022.

Comments: Accepted at IEEE Transactions on Mobile Computing

arXiv:2109.15175 [pdf, other]

Coordinated Reinforcement Learning for Optimizing Mobile Networks

Authors: Maxime Bouton, Hasan Farooq, Julien Forgeat, Shruti Bothe, Meral Shirazipour, Per Karlsson

Abstract: Mobile networks are composed of many base stations, and for each of them many parameters must be optimized to provide good services. Automatically and dynamically optimizing all these entities is challenging as they are sensitive to variations in the environment and can affect each other through interference. Reinforcement learning (RL) algorithms are good candidates to automatically learn base station configuration strategies from incoming data, but they are often hard to scale to many agents. In this work, we demonstrate how to use coordination graphs and reinforcement learning in a complex application involving hundreds of cooperating agents. We show how mobile networks can be modeled using coordination graphs and how network optimization problems can be solved efficiently using multi-agent reinforcement learning. The graph structure arises naturally from expert knowledge about the network and allows coordinating behaviors between the antennas to be learned explicitly through edge value functions represented by neural networks. We show empirically that coordinated reinforcement learning outperforms other methods. The use of local RL updates and parameter sharing can handle a large number of agents without sacrificing coordination, which makes it well suited to optimize the ever denser networks brought by 5G and beyond.

Submitted 30 September, 2021; originally announced September 2021.

Comments: 14 pages, 8 figures

arXiv:2106.15850 [pdf, other]

doi 10.1038/s44172-022-00043-2

Exploring Robust Architectures for Deep Artificial Neural Networks

Authors: Asim Waqas, Ghulam Rasool, Hamza Farooq, Nidhal C. Bouaynaya

Abstract: The architectures of deep artificial neural networks (DANNs) are routinely studied to improve their predictive performance. However, the relationship between the architecture of a DANN and its robustness to noise and adversarial attacks is less explored. We investigate how the robustness of DANNs relates to their underlying graph architectures or structures. This study: (1) starts by exploring the design space of architectures of DANNs using graph-theoretic robustness measures; (2) transforms the graphs to DANN architectures to train/validate/test on various image classification tasks; (3) explores the relationship between the robustness of trained DANNs against noise and adversarial attacks and the robustness of their underlying architectures estimated via graph-theoretic measures. We show that the topological entropy and Ollivier-Ricci curvature of the underlying graphs can quantify the robustness performance of DANNs. The said relationship is stronger for complex tasks and large DANNs. Our work will allow the AutoML and neural architecture search communities to explore design spaces of robust and accurate DANNs.

Submitted 5 April, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

Comments: 27 pages, 16 figures

Journal ref: Commun Eng 1, 46 (2022)

arXiv:2010.12835 [pdf]

Pressure Mode Decomposition Analysis of the Flow past a Cross-flow Oscillating Circular Cylinder

Authors: Muhammad Sufyan, Hamayun Farooq, Imran Akhtar, Zafar Bangash

Abstract: Proper orthogonal decomposition (POD) is often employed in developing reduced-order models (ROM) of fluid flows for design, control, and optimization. Contrary to the usual practice where the velocity field is the focus, we apply the POD analysis to the pressure field data obtained from numerical simulations of the flow past stationary and oscillating cylinders. Since pressure mainly contributes to the hydrodynamic forces acting on the structure, we compute the pressure POD modes on the surface of the cylinder oscillating in lock-in and lock-out regions. These modes are then dissected into sine and cosine magnitudes to estimate their contribution to the development of pressure lift and drag decomposition coefficients, respectively. The key finding of this study is that more POD modes are required to capture the flow physics in nonsynchronous regimes as compared to the synchronization case. An engineering application of this study is the development of reduced-order models for effective control techniques.

Submitted 24 October, 2020; originally announced October 2020.

Comments: 7 figures

arXiv:2009.13922 [pdf]

doi 10.1109/ACCESS.2020.3027258

Mobility Management in Emerging Ultra-Dense Cellular Networks: A Survey, Outlook, and Future Research Directions

Authors: Syed Muhammad Asad Zaidi, Marvin Manalastas, Hasan Farooq, Ali Imran

Abstract: The exponential rise in mobile traffic originating from mobile devices highlights the need for making mobility management in future networks even more efficient and seamless than ever before. The Ultra-Dense Cellular Network vision, consisting of cells of varying sizes with conventional and mmWave bands, is being perceived as the panacea for the imminent capacity crunch. However, mobility challenges in an ultra-dense heterogeneous network with a motley of high frequency and mmWave band cells will be unprecedented due to the plurality of handover instances and the resulting signaling overhead and data interruptions for a miscellany of devices. Similarly, issues like user tracking and cell discovery for mmWave with narrow beams need to be addressed before the ambitious gains of emerging mobile networks can be realized. Mobility challenges are further highlighted when considering the 5G deliverables of multi-Gbps wireless connectivity, <1 ms latency, and support for devices moving at a maximum speed of 500 km/h, to name a few. Despite its significance, few mobility surveys exist, with the majority focused on ad hoc networks. This paper is the first to provide a comprehensive survey on the panorama of mobility challenges in the emerging ultra-dense mobile networks. We not only present a detailed tutorial on 5G mobility approaches and highlight key mobility risks of legacy networks, but also review key findings from recent studies and highlight the technical challenges and potential opportunities related to mobility from the perspective of emerging ultra-dense cellular networks.

Submitted 29 September, 2020; originally announced September 2020.

Comments: in IEEE Access

arXiv:2005.01472 [pdf, other]

doi 10.1109/BlackSeaCom48709.2020.9235002

Neuromorphic AI Empowered Root Cause Analysis of Faults in Emerging Networks

Authors: Shruti Bothe, Usama Masood, Hasan Farooq, Ali Imran

Abstract: Mobile cellular network operators spend nearly a quarter of their revenue on network maintenance and management. A significant portion of that budget is spent on resolving faults diagnosed in the system that disrupt or degrade cellular services. Historically, the operations to detect, diagnose and resolve issues were carried out by human experts. However, with diversifying cell types, increased complexity and growing cell density, this methodology is becoming less viable, both technically and financially. To cope with this problem, in recent years, research on self-healing solutions has gained significant momentum. One of the most desirable features of the self-healing paradigm is automated fault diagnosis. While several fault detection and diagnosis machine learning models have been proposed recently, these schemes have one common tendency of relying on human expert contribution for fault diagnosis and prediction in one way or another. In this paper, we propose an AI-based fault diagnosis solution that offers a key step towards a completely automated self-healing system without requiring human expert input. The proposed solution leverages a Random Forests classifier, a Convolutional Neural Network, and a neuromorphic-based deep learning model that uses RSRP map images of the generated faults. We compare the performance of the proposed solution against state-of-the-art solutions in the literature, which mostly use Naive Bayes models, while considering seven different fault types.
Results show that the neuromorphic computing model achieves high classification accuracy compared to the other models, even with relatively small training data.

Submitted 4 May, 2020; originally announced May 2020.

Journal ref: 2020 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom)

arXiv:2004.01495 [pdf, other]

doi 10.1109/EHB50910.2020.9280115

Can Machine Learning Be Used to Recognize and Diagnose Coughs?

Authors: Charles Bales, Muhammad Nabeel, Charles N. John, Usama Masood, Haneya N. Qureshi, Hasan Farooq, Iryna Posokhova, Ali Imran

Abstract: Emerging wireless technologies, such as 5G and beyond, are bringing new use cases to the forefront, one of the most prominent being machine learning empowered health care. Respiratory infections are among the notable modern medical concerns that impose an immense worldwide health burden. Since cough is an essential symptom of many respiratory infections, an automated system to screen for respiratory diseases based on raw cough data would have a multitude of beneficial research and medical applications. In the literature, machine learning has already been successfully used to detect cough events in controlled environments. In this paper, we present a low-complexity, automated recognition and diagnostic tool for screening respiratory infections that utilizes Convolutional Neural Networks (CNNs) to detect cough within environment audio and diagnose three potential illnesses (i.e., bronchitis, bronchiolitis and pertussis) based on their unique cough audio features. Both the proposed detection and diagnosis models achieve an accuracy of over 89%, while also remaining computationally efficient. Results show that the proposed system is successfully able to detect and separate cough events from background noise. Moreover, the proposed single diagnosis model is capable of distinguishing between different illnesses without the need for separate models.

Submitted 4 October, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

Comments: Accepted in IEEE International Conference on E-Health and Bioengineering - EHB 2020

arXiv:1910.00095 [pdf, other]

Fitting IVIM with Variable Projection and Simplicial Optimization

Authors: Shreyas Fadnavis, Hamza Farooq, Maryam Afzali, Christoph Lenglet, Tryphon Georgiou, Hu Cheng, Sharlene Newman, Shahnawaz Ahmed, Rafael Neto Henriques, Eric Peterson, Serge Koudoro, Ariel Rokem, Eleftherios Garyfallidis

Abstract: Fitting multi-exponential models to Diffusion MRI (dMRI) data has always been challenging due to various underlying complexities. In this work, we introduce a novel and robust fitting framework for the standard two-compartment IVIM microstructural model. This framework provides a significant improvement over the existing methods and helps estimate the associated diffusion and perfusion parameters of IVIM in an automatic manner. As a part of this work, we provide capabilities to switch between more advanced global optimization methods such as simplicial homology (SH) and differential evolution (DE). Our experiments show that the results obtained from this simultaneous fitting procedure disentangle the model parameters in a reduced subspace. The proposed framework extends the seminal work originated in the MIX framework, with improved procedures for multi-stage fitting. This framework has been made available as an open-source Python implementation and disseminated to the community through the DIPY project.

Submitted 15 February, 2020; v1 submitted 27 September, 2019; originally announced October 2019.

arXiv:1502.00290 [pdf]

Survey on Awareness of Privacy Issues in Ubiquitous Environment

Authors: Huma Tabassum, Sameena Javaid, Humera Farooq

Abstract: This study aims to determine privacy awareness among people in ubiquitous environments through a questionnaire-based survey.

Submitted 1 February, 2015; originally announced February 2015.

Comments: 5 pages, 3 figures, 1 table. To be published in International Journal of Computer Science & Information Security (IJCSIS) Volume:13 Issue:1

arXiv:1204.1177 [pdf]

Principal Component Analysis-Linear Discriminant Analysis Feature Extractor for Pattern Recognition

Authors: Aamir Khan, Hasan Farooq

Abstract: Robustness of embedded biometric systems is of prime importance with the emergence of fourth generation communication devices and advancement in security systems. This paper presents the realization of such technologies, which demand reliable and error-free biometric identity verification systems. High dimensional patterns are not permitted due to eigen-decomposition in high dimensional image space and degeneration of scattering matrices for small sample sizes. Generalization, dimensionality reduction and maximizing the margins are controlled by minimizing weight vectors. Results show good pattern recognition by the multimodal biometric system proposed in this paper. This paper is aimed at investigating a biometric identity system using Principal Component Analysis and Linear Discriminant Analysis with K-Nearest Neighbor and implementing such a system in real time using SignalWAVE.

Submitted 5 April, 2012; originally announced April 2012.

Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 6, No 2, November 2011

Showing 1–14 of 14 results for author: Farooq, H