Search | arXiv e-print repository

On Sequential Loss Approximation for Continual Learning

Authors: Menghao Waiyan William Zhu, Ercan Engin Kuruoğlu

Abstract: We introduce two methods for continual learning: Autodiff Quadratic Consolidation (AQC), which approximates the previous loss function with a quadratic function, and Neural Consolidation (NC), which approximates the previous loss function with a neural network. Although they are not scalable to large neural networks, they can be used with a fixed pre-trained feature extractor. We empirically study these methods in class-incremental learning, for which regularization-based methods produce unsatisfactory results unless combined with replay. We find that for small datasets, quadratic approximation of the previous loss function leads to poor results, even with full Hessian computation, and NC could significantly improve the predictive performance, while for large datasets, when used with a fixed pre-trained feature extractor, AQC provides superior predictive performance. We also find that using tanh-output features can improve the predictive performance of AQC. In particular, in class-incremental Split MNIST, when a Convolutional Neural Network (CNN) with tanh-output features is pre-trained on EMNIST Letters and used as a fixed pre-trained feature extractor, AQC can achieve predictive performance comparable to joint training.

Submitted 26 May, 2024; originally announced May 2024.
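
For orientation, a quadratic approximation of a previous-task loss is commonly written as a second-order Taylor expansion around the previously learned parameters. The symbols below ($\theta^*$, $g$, $H$) are assumptions chosen for illustration and do not come from the listing; the abstract does not say how AQC computes or stores these quantities.

```latex
\tilde{L}_{\mathrm{prev}}(\theta)
  = L_{\mathrm{prev}}(\theta^*)
  + g^\top (\theta - \theta^*)
  + \tfrac{1}{2} (\theta - \theta^*)^\top H \, (\theta - \theta^*),
\qquad
g = \nabla L_{\mathrm{prev}}(\theta^*), \quad
H = \nabla^2 L_{\mathrm{prev}}(\theta^*)
```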

arXiv:2405.04111 [pdf, other]

Adaptive Least Mean pth Power Graph Neural Networks

Authors: Changran Peng, Yi Yan, Ercan E. Kuruoglu

Abstract: In the presence of impulsive noise and missing observations, accurate online prediction of time-varying graph signals poses a crucial challenge in numerous application domains. We propose the Adaptive Least Mean $p^{th}$ Power Graph Neural Networks (LMP-GNN), a universal framework combining adaptive filtering and graph neural networks for online graph signal estimation. LMP-GNN retains the advantage of adaptive filtering in handling noise and missing observations as well as the online update capability. The graph neural network incorporated within the LMP-GNN can train and update filter parameters online instead of relying on the predefined filter parameters of previous methods, outputting more accurate prediction results. The adaptive update scheme of the LMP-GNN follows the solution of an $l_p$-norm optimization, rooted in the minimum dispersion criterion, and yields robust estimation results for time-varying graph signals under impulsive noise. A special case of LMP-GNN named the Sign-GNN is also provided and analyzed. Experimental results on two real-world datasets, a temperature graph and a traffic graph, under four different noise distributions prove the effectiveness and robustness of our proposed LMP-GNN.

Submitted 7 May, 2024; originally announced May 2024.
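
As background for the $l_p$-norm update mentioned above, the following is a minimal sketch of the classical (non-graph) least mean p-th power adaptive filter step, not the LMP-GNN itself. The function name `lmp_update` and its parameters are hypothetical; the sketch assumes 1 <= p <= 2, where p = 2 gives an LMS-style step and p = 1 gives the sign algorithm.

```python
def lmp_update(weights, x, desired, mu=0.01, p=1.5):
    """One step of a classical least mean p-th power (LMP) adaptive filter.

    Hypothetical helper for illustration; assumes 1 <= p <= 2.
    Returns the updated weights and the a-priori error.
    """
    # Filter output and error for the current input vector.
    y = sum(w * xi for w, xi in zip(weights, x))
    e = desired - y
    if e == 0:
        # Zero error: no correction needed.
        return list(weights), e
    # The gradient of |e|**p with respect to the weights is
    # -p * |e|**(p - 1) * sign(e) * x, so we step in the opposite direction.
    sign = 1 if e > 0 else -1
    step = mu * p * abs(e) ** (p - 1) * sign
    return [w + step * xi for w, xi in zip(weights, x)], e
```

Because the step scales with |e|**(p - 1), choosing p below 2 damps the influence of large, impulsive errors, which is the usual motivation for this family of updates.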

arXiv:2405.04107 [pdf, other]

Adaptive Graph Normalized Sign Algorithm

Authors: Changran Peng, Yi Yan, Ercan E. Kuruoglu

Abstract: Efficient and robust prediction of graph signals is challenging when the signals are under impulsive noise and have missing data. Exploiting graph signal processing (GSP) and leveraging the simplicity of the classical adaptive sign algorithm, we propose an adaptive algorithm on graphs named the Graph Normalized Sign (GNS). GNS incorporates an approximated normalization term into the update, thereby achieving faster convergence and lower error compared to previous adaptive GSP algorithms. In the task of the online prediction of multivariate temperature data under impulsive noise, GNS outputs fast and robust predictions.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.04098 [pdf, other]

Binarized Simplicial Convolutional Neural Networks

Authors: Yi Yan, Ercan E. Kuruoglu

Abstract: Graph Neural Networks have a limitation of solely processing features on graph nodes, neglecting data on high-dimensional structures such as edges and triangles. Simplicial Convolutional Neural Networks (SCNN) represent higher-order structures using simplicial complexes to break this limitation, but still lack time efficiency. In this paper, we propose a novel neural network architecture on simplicial complexes named Binarized Simplicial Convolutional Neural Networks (Bi-SCNN) based on the combination of simplicial convolution with a binary-sign forward propagation strategy. The usage of the Hodge Laplacian on a binary-sign forward propagation enables Bi-SCNN to efficiently and effectively represent simplicial features that have higher-order structures than traditional graph node representations. Compared to the previous Simplicial Convolutional Neural Networks, the reduced model complexity of Bi-SCNN shortens the execution time without sacrificing the prediction performance and is less prone to the over-smoothing effect. Experiments with real-world citation and ocean-drifter data confirm that our proposed Bi-SCNN is efficient and accurate.

Submitted 7 May, 2024; originally announced May 2024.
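
To make the Hodge Laplacian mentioned above concrete, here is a small sketch that builds the standard Hodge 1-Laplacian, L1 = B1^T B1 + B2 B2^T, for a single filled triangle. The helper names and the toy complex are assumptions for illustration only; this is not the Bi-SCNN architecture.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def hodge_1_laplacian(b1, b2):
    """Return L1 = B1^T B1 + B2 B2^T for incidence matrices B1 and B2."""
    lower = matmul(transpose(b1), b1)   # edges coupled through shared nodes
    upper = matmul(b2, transpose(b2))   # edges coupled through shared triangles
    return [[lo + up for lo, up in zip(row_lo, row_up)]
            for row_lo, row_up in zip(lower, upper)]

# Toy complex: nodes 0, 1, 2; edges (0,1), (0,2), (1,2); one filled triangle.
# B1 is node-by-edge, each edge oriented from the lower to the higher node.
B1 = [[-1, -1,  0],
      [ 1,  0, -1],
      [ 0,  1,  1]]
# B2 is edge-by-triangle: boundary of (0,1,2) is (0,1) - (0,2) + (1,2).
B2 = [[1], [-1], [1]]

L1 = hodge_1_laplacian(B1, B2)
```

An edge signal can then be filtered with polynomials in `L1`, in the same way node signals are filtered with polynomials in the graph Laplacian.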

arXiv:2402.16911 [pdf, other]

Trustworthy Personalized Bayesian Federated Learning via Posterior Fine-Tune

Authors: Mengen Luo, Chi Xu, Ercan Engin Kuruoglu

Abstract: Performance degradation owing to data heterogeneity and low output interpretability are the most significant challenges faced by federated learning in practical applications. Personalized federated learning diverges from traditional approaches, as it no longer seeks to train a single model, but instead tailors a unique personalized model for each client. However, previous work focused only on personalization from the perspective of neural network parameters and lacked robustness and interpretability. In this work, we establish a novel framework for personalized federated learning, incorporating Bayesian methodology which enhances the algorithm's ability to quantify uncertainty. Furthermore, we introduce normalizing flow to achieve personalization from the parameter posterior perspective and theoretically analyze the impact of normalizing flow on out-of-distribution (OOD) detection for Bayesian neural networks. Finally, we evaluated our approach on heterogeneous datasets, and the experimental results indicate that the new algorithm not only improves accuracy but also outperforms the baseline significantly in OOD detection due to the reliable output of the Bayesian approach.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.16091 [pdf, other]

Bayesian Neural Network For Personalized Federated Learning Parameter Selection

Authors: Mengen Luo, Ercan Engin Kuruoglu

Abstract: Federated learning's poor performance in the presence of heterogeneous data remains one of the most pressing issues in the field. Personalized federated learning departs from the conventional paradigm in which all clients employ the same model, instead striving to discover an individualized model for each client to address the heterogeneity in the data. One such approach involves personalizing specific layers of neural networks. However, prior endeavors have not provided a dependable rationale, and some have selected personalized layers that are entirely distinct and conflicting. In this work, we take a step further by proposing personalization at the elemental level, rather than the traditional layer-level personalization. To select personalized parameters, we introduce Bayesian neural networks and rely on the uncertainty they offer to guide our selection of personalized parameters. Finally, we validate our algorithm's efficacy on several real-world datasets, demonstrating that our proposed approach outperforms existing baselines.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2401.15304 [pdf, other]

Adaptive Least Mean Squares Graph Neural Networks and Online Graph Signal Estimation

Authors: Yi Yan, Changran Peng, Ercan Engin Kuruoglu

Abstract: The online prediction of multivariate signals, existing simultaneously in space and time, from noisy partial observations is a fundamental task in numerous applications. We propose an efficient Neural Network architecture for the online estimation of time-varying graph signals named the Adaptive Least Mean Squares Graph Neural Networks (LMS-GNN). LMS-GNN aims to capture the time variation and bridge the cross-space-time interactions under the condition that signals are corrupted by noise and missing values. The LMS-GNN is a combination of adaptive graph filters and Graph Neural Networks (GNN). At each time step, the forward propagation of LMS-GNN is similar to adaptive graph filters where the output is based on the error between the observation and the prediction similar to GNN. The filter coefficients are updated via backpropagation as in GNN. Experiments on real-world temperature data reveal that our LMS-GNN achieves more accurate online predictions compared to graph-based methods like adaptive graph filters and graph convolutional neural networks.

Submitted 27 January, 2024; originally announced January 2024.

arXiv:2311.11126 [pdf, other]

Bayesian Neural Networks: A Min-Max Game Framework

Authors: Junping Hong, Ercan Engin Kuruoglu

Abstract: This paper is a preliminary study of the robustness and noise analysis of deep neural networks via a game-theoretic formulation of Bayesian Neural Networks (BNN) and the maximal coding rate distortion loss. BNN has been shown to provide some robustness to deep learning, and the minimax method has long been a natural conservative way to assist the Bayesian method. Inspired by the recent closed-loop transcription neural network, we formulate the BNN via game theory between the deterministic neural network $f$ and the sampling network $f + ξ$ or $f + r*ξ$. Compared with previous BNN, BNN via game theory learns a solution space within a certain gap between the center $f$ and the sampling point $f + r*ξ$, and is a conservative choice with a meaningful prior setting. Furthermore, the minimum points between $f$ and $f + r*ξ$ become stable when the subspace dimension is large enough with a well-trained model $f$. With these, the model $f$ can have a high chance of recognizing the out-of-distribution data or noise data in the subspace rather than the prediction level, even if $f$ is in online training after a few iterations of true data. So far, our experiments are limited to the MNIST and Fashion MNIST data sets; more experiments with realistic data sets and complicated neural network models should be implemented to validate the above arguments.

Submitted 29 May, 2024; v1 submitted 18 November, 2023; originally announced November 2023.

Comments: 6 pages, 8 figures

arXiv:2311.10803 [pdf, other]

Robustness Enhancement in Neural Networks with Alpha-Stable Training Noise

Authors: Xueqiong Yuan, Jipeng Li, Ercan Engin Kuruoğlu

Abstract: With the increasing use of deep learning on data collected by non-perfect sensors and in non-perfect environments, the robustness of deep learning systems has become an important issue. A common approach for obtaining robustness to noise has been to train deep learning systems with data augmented with Gaussian noise. In this work, we challenge the common choice of Gaussian noise and explore the possibility of stronger robustness for non-Gaussian impulsive noise, specifically alpha-stable noise. Justified by the Generalized Central Limit Theorem and evidenced by observations in various application areas, alpha-stable noise is widely present in nature. By comparing the testing accuracy of models trained with Gaussian noise and alpha-stable noise on data corrupted by different noise, we find that training with alpha-stable noise is more effective than Gaussian noise, especially when the dataset is corrupted by impulsive noise, thus improving the robustness of the model. The generality of this conclusion is validated through experiments conducted on various deep learning models with image and time series datasets, and other benchmark corrupted datasets. Consequently, we propose a novel data augmentation method that replaces Gaussian noise, which is typically added to the training data, with alpha-stable noise.

Submitted 17 November, 2023; originally announced November 2023.
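
One way the augmentation idea above could be sketched is with the Chambers-Mallows-Stuck transform, a standard method for drawing symmetric alpha-stable variates. The function names, the default `alpha` and `scale`, and the additive form of the augmentation are assumptions for illustration; the listing does not specify the authors' sampler or settings.

```python
import math
import random

def symmetric_alpha_stable(alpha, u, w):
    """Chambers-Mallows-Stuck transform for a standard symmetric
    alpha-stable variate, with 0 < alpha <= 2.

    u is uniform on (-pi/2, pi/2) and w is a unit-rate exponential draw.
    """
    if alpha == 1.0:
        return math.tan(u)  # alpha = 1 is the Cauchy case
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha))

def add_alpha_stable_noise(samples, alpha=1.5, scale=0.1, rng=None):
    """Return a copy of samples with additive symmetric alpha-stable noise.

    Hypothetical augmentation helper; alpha and scale are illustrative.
    """
    rng = rng or random.Random()
    noisy = []
    for x in samples:
        u = rng.uniform(-math.pi / 2, math.pi / 2)
        w = rng.expovariate(1.0)
        noisy.append(x + scale * symmetric_alpha_stable(alpha, u, w))
    return noisy
```

With `alpha=2` the transform reduces to a Gaussian draw with variance 2, so the same helper covers the Gaussian baseline; smaller `alpha` produces heavier tails and more frequent large impulses.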

arXiv:2311.06747 [pdf, other]

Graph Signal Processing For Cancer Gene Co-Expression Network Analysis

Authors: Radwa Adel, Ercan Engin Kuruoglu

Abstract: Cancer heterogeneity arises from complex molecular interactions. Elucidating systems-level properties of gene interaction networks distinguishing cancer from normal cells is critical for understanding disease mechanisms and developing targeted therapies. Previous works focused only on identifying differences in network structures. In this study, we used graph frequency analysis of cancer genetic signals defined on a co-expression network to describe the spectral properties of underlying cancer systems. We demonstrated that cancer cells exhibit distinctive signatures in the graph frequency content of their gene expression signals. Applying graph frequency filtering, the graph Fourier transform, and its inverse to gene expression from different cancer stages resulted in significant improvements in average F-statistics of the genes compared to using their unfiltered expression levels. We propose graph spectral properties of cancer genetic signals defined on gene co-expression networks as cancer hallmarks with potential application for differential co-expression analysis.

Submitted 12 November, 2023; originally announced November 2023.

arXiv:2311.00656 [pdf, other]

Online Signal Estimation on the Graph Edges via Line Graph Transformation

Authors: Yi Yan, Ercan Engin Kuruoglu

Abstract: The processing of signals on graph edges is challenging considering that Graph Signal Processing techniques are defined only on the graph nodes. Leveraging the Line Graph to transform a graph edge signal onto the nodes of its edge-to-vertex dual, we propose the Line Graph Least Mean Square (LGLMS) algorithm for online time-varying graph edge signal prediction. By setting up an $l_2$-norm optimization problem, LGLMS forms an adaptive algorithm as the graph edge analogue of the classical adaptive LMS algorithm. Additionally, the LGLMS inherits all the GSP concepts and techniques that can already be deployed on the graph nodes, without the need to redefine them on the graph edges. Experimenting with transportation graphs and meteorological graphs, with the signal observations having noisy and missing values, we confirmed that LGLMS is suitable for the online prediction of time-varying edge signals.

Submitted 28 February, 2024; v1 submitted 1 November, 2023; originally announced November 2023.
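
The line graph (edge-to-vertex dual) transformation described above can be sketched in a few lines: each original edge becomes a node, and two such nodes are connected when the original edges share an endpoint. The function name and the toy path graph are assumptions for illustration; the LGLMS update itself is not shown.

```python
from itertools import combinations

def line_graph(edges):
    """Return the line graph as adjacency sets, one node per original edge.

    Node i of the line graph corresponds to edges[i].
    """
    adjacency = {i: set() for i in range(len(edges))}
    for i, j in combinations(range(len(edges)), 2):
        # Two edges are adjacent in the line graph if they share an endpoint.
        if set(edges[i]) & set(edges[j]):
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency

# Toy path graph a - b - c - d with one signal value per edge.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
edge_signal = {("a", "b"): 0.5, ("b", "c"): 1.2, ("c", "d"): -0.3}

lg = line_graph(edges)
# The edge signal becomes an ordinary node signal on the line graph.
node_signal = [edge_signal[e] for e in edges]
```

After this step, `node_signal` lives on the nodes of `lg`, so node-based graph signal processing tools apply to it unchanged.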

arXiv:2310.00642 [pdf, other]

From Bandits Model to Deep Deterministic Policy Gradient, Reinforcement Learning with Contextual Information

Authors: Zhendong Shi, Xiaoli Wei, Ercan E. Kuruoglu

Abstract: The problem of how to take the right actions to make profits in a sequential process continues to be difficult due to the quick dynamics and a significant amount of uncertainty in many application scenarios. In such complicated environments, reinforcement learning (RL), a reward-oriented strategy for optimum control, has emerged as a potential technique to address this strategic decision-making issue. However, reinforcement learning also has shortcomings, such as excessive resource consumption and an inability to quickly obtain optimal solutions, that make it unsuitable for solving many financial problems, including those in quantitative trading markets. In this study, we use contextual information to overcome this issue through two methods, contextual Thompson sampling and reinforcement learning under supervision, which can accelerate the iterations in search of the best answer. In order to investigate strategic trading in quantitative markets, we merged the earlier financial trading strategy known as constant proportion portfolio insurance (CPPI) into deep deterministic policy gradient (DDPG). The experimental results show that both methods can accelerate the progress of reinforcement learning to obtain the optimal solution.

Submitted 1 October, 2023; originally announced October 2023.

MSC Class: 93A16 ACM Class: I.2.11; G.3
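
For readers unfamiliar with CPPI, the textbook allocation rule invests a multiple of the cushion (portfolio value minus a protected floor) in the risky asset. The sketch below shows only that rule, not its combination with DDPG; the function name, the no-leverage cap, and the example numbers are assumptions for illustration.

```python
def cppi_allocation(portfolio_value, floor, multiplier):
    """Split a portfolio into risky and safe parts using the basic CPPI rule.

    Hypothetical helper; assumes no leverage, so risky exposure is capped
    at the portfolio value.
    """
    # The cushion is the amount that can be lost before hitting the floor.
    cushion = max(portfolio_value - floor, 0.0)
    # Risky exposure is a fixed multiple of the cushion, capped at total value.
    risky = min(multiplier * cushion, portfolio_value)
    safe = portfolio_value - risky
    return risky, safe

# Example: value 100, floor 80, multiplier 3 gives 60 risky and 40 safe.
risky, safe = cppi_allocation(100.0, 80.0, 3.0)
```

As the portfolio value approaches the floor, the cushion shrinks and the rule moves capital into the safe asset, which is what makes CPPI a natural constraint to impose on a learned trading policy.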

arXiv:2310.00630 [pdf, other]

Sequential Monte Carlo Graph Convolutional Network for Dynamic Brain Connectivity

Authors: Fengfan Zhao, Ercan Engin Kuruoglu

Abstract: An increasingly important brain function analysis modality is functional connectivity analysis, which regards connections as statistical codependency between the signals of different brain regions. Graph-based analysis of brain connectivity provides a new way of exploring the association between brain functional deficits and the structural disruption related to brain disorders, but the current implementations have limited capability due to the assumptions of noise-free data and stationary graph topology. We propose a new methodology based on the particle filtering algorithm, with proven success in tracking problems, which estimates the hidden states of a dynamic graph with only partial and noisy observations, without the assumptions of stationarity on connectivity. We enrich the particle filtering state equation with a graph neural network called Sequential Monte Carlo Graph Convolutional Network (SMC-GCN), which, due to its nonlinear regression capability, can limit spurious connections in the graph. Experimental studies demonstrate that SMC-GCN achieves performance superior to several methods in brain disorder classification.

Submitted 4 January, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

arXiv:2306.04383 [pdf, other]

Complex Isotropic α-Stable-Rician Model for Heterogeneous SAR Images

Authors: Mutong Li, Ercan Engin Kuruoglu

Abstract: This article introduces a novel probability distribution model, namely Complex Isotropic α-Stable-Rician (CIαSR), for characterizing the data histogram of synthetic aperture radar (SAR) images. Having its foundation situated on the Lévy α-stable distribution suggested by a generalized Central Limit Theorem, the model promises great potential in accurately capturing SAR image features of extreme heterogeneity. A novel parameter estimation method based on the generalization of the method of moments to expectations of Bessel functions is devised to resolve the model in a relatively compact and computationally efficient manner. Experimental results based on both synthetic and empirical SAR data exhibit the CIαSR model's superior capacity in modelling scenes of a wide range of heterogeneity when compared to other state-of-the-art models as quantified by various performance metrics. Additional experiments are conducted utilizing large-swath SAR images which encompass mixtures of several scenes to help interpret the CIαSR model parameters, and to demonstrate the model's potential application in classification and target detection.

Submitted 7 June, 2023; originally announced June 2023.

arXiv:2303.11959 [pdf, other]

Optimizing Trading Strategies in Quantitative Markets using Multi-Agent Reinforcement Learning

Authors: Hengxi Zhang, Zhendong Shi, Yuanquan Hu, Wenbo Ding, Ercan E. Kuruoglu, Xiao-Ping Zhang

Abstract: Quantitative markets are characterized by swift dynamics and abundant uncertainties, making the pursuit of profit-driven stock trading actions inherently challenging. Within this context, reinforcement learning (RL), which operates on a reward-centric mechanism for optimal control, has surfaced as a potentially effective solution to the intricate financial decision-making conundrums presented. This paper delves into the fusion of two established financial trading strategies, namely the constant proportion portfolio insurance (CPPI) and the time-invariant portfolio protection (TIPP), with the multi-agent deep deterministic policy gradient (MADDPG) framework. As a result, we introduce two novel multi-agent RL (MARL) methods, CPPI-MADDPG and TIPP-MADDPG, tailored for probing strategic trading within quantitative markets. To validate these innovations, we implemented them on a diverse selection of 100 real-market shares. Our empirical findings reveal that the CPPI-MADDPG and TIPP-MADDPG strategies consistently outpace their traditional counterparts, affirming their efficacy in the realm of quantitative trading.

Submitted 21 December, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

arXiv:2303.04509 [pdf, other]

Fast Cauchy-Rician Modelling of SAR Images with Method of Algebraic Moments Estimator

Authors: Mutong Li, Ercan Engin Kuruoglu

Abstract: SAR technology has been intensively implemented for geo-sensing and mapping purposes due to its advantages of high azimuthal resolution and weather-independent operation compared to other remote sensing technologies. Modelling SAR image data consequently becomes a prominent topic of interest, especially for data populations with impulsive signal features, which are common in SAR images of urban areas. A recently proposed model named Cauchy-Rician has manifested great potential in modelling extremely heterogeneous SAR images, yet the work only provided an MCMC-based parameter estimator that demands considerable computational power. In this work, a novel analytical parameter estimation method based on algebraic moments is proposed to provide stable and accurate estimation of the parameters of the Cauchy-Rician model with significant improvement in computation speed.

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2211.06533 [pdf, other]

Adaptive Joint Estimation of Temporal Vertex and Edge Signals

Authors: Yi Yan, Tian Xie, Ercan E. Kuruoglu

Abstract: The adaptive estimation of coexisting temporal vertex (node) and edge signals on graphs is a critical task when a change in edge signals influences the temporal dynamics of the vertex signals. However, the current Graph Signal Processing algorithms mostly consider only the signals existing on the graph vertices and have neglected the fact that signals can reside on the edges. We propose an Adaptive Joint Vertex-Edge Estimation (AJVEE) algorithm for jointly estimating time-varying vertex and edge signals through a time-varying regression, incorporating both vertex signal filtering and edge signal filtering. Accompanying AJVEE is a newly proposed Adaptive Least Mean Square procedure based on the Hodge Laplacian (ALMS-Hodge), which is inspired by classical adaptive filters combining simplicial filtering and simplicial regression. AJVEE is able to operate jointly on the vertices and edges by merging two ALMS-Hodge algorithms specified on the vertices and edges into a unified formulation. A more generalized case extending AJVEE beyond the vertices and edges is also discussed. Experimenting on real-world traffic networks and population mobility networks, we have confirmed that our proposed AJVEE algorithm could accurately and jointly track time-varying vertex and edge signals on graphs.

Submitted 7 May, 2024; v1 submitted 11 November, 2022; originally announced November 2022.

arXiv:2203.10214 [pdf, other]

Thompson Sampling on Asymmetric $α$-Stable Bandits

Authors: Zhendong Shi, Ercan E. Kuruoglu, Xiaoli Wei

Abstract: In algorithm optimization in reinforcement learning, how to deal with the exploration-exploitation dilemma is particularly important. The multi-armed bandit problem can optimize the proposed solutions by changing the reward distribution to realize the dynamic balance between exploration and exploitation. Thompson Sampling is a common method for solving the multi-armed bandit problem and has been used to explore data that conform to various laws. In this paper, we consider the Thompson Sampling approach for the multi-armed bandit problem in which rewards conform to unknown asymmetric $α$-stable distributions, and explore their applications in modelling financial and wireless data.

Submitted 25 March, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

Comments: 8 pages, 4 figures
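
As a point of reference for the Thompson Sampling approach above, here is a minimal sketch of the standard Bernoulli-reward version with Beta posteriors. It is deliberately the simplest textbook case, not the asymmetric α-stable setting of the paper; the function name, uniform Beta(1, 1) priors, and simulated arms are assumptions for illustration.

```python
import random

def thompson_bernoulli(true_probs, rounds=1000, seed=0):
    """Run Thompson Sampling on simulated Bernoulli arms.

    Hypothetical helper: true_probs holds the success probability of each
    simulated arm. Returns per-arm success and failure counts.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(rounds):
        # Sample a plausible mean reward for each arm from its Beta posterior
        # (uniform Beta(1, 1) prior updated with the observed counts).
        draws = [rng.betavariate(s + 1, f + 1)
                 for s, f in zip(successes, failures)]
        # Play the arm whose sampled mean is largest.
        arm = max(range(len(draws)), key=draws.__getitem__)
        # Observe a simulated reward and update that arm's counts.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures
```

Arms with uncertain posteriors still get sampled occasionally, which is how the method balances exploration against exploitation; heavy-tailed reward models replace the Beta posterior with a different posterior update.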

arXiv:2203.00320 [pdf, ps, other]

doi 10.1007/s11265-022-01802-2

Graph Normalized-LMP Algorithm for Signal Estimation Under Impulsive Noise

Authors: Yi Yan, Radwa Adel, Ercan Engin Kuruoglu

Abstract: In this paper, we introduce an adaptive graph normalized least mean pth power (GNLMP) algorithm for graph signal processing (GSP) that utilizes GSP techniques, including bandlimited filtering and node sampling, to estimate sampled graph signals under impulsive noise. Different from least-squares-based algorithms, such as the adaptive GSP Least Mean Squares (GLMS) algorithm and the normalized GLMS (GNLMS) algorithm, the GNLMP algorithm has the ability to reconstruct a graph signal that is corrupted by non-Gaussian noise with heavy-tailed characteristics. Compared to the recently introduced adaptive GSP least mean pth power (GLMP) algorithm, the GNLMP algorithm reduces the number of iterations to converge to a steady graph signal. The convergence condition of the GNLMP algorithm is derived, and the ability of the GNLMP algorithm to process multidimensional time-varying graph signals with multiple features is demonstrated as well. Simulations show that, in estimating steady-state and time-varying graph signals, the GNLMP algorithm is faster than GLMP and more robust than GLMS and GNLMS.

Submitted 1 March, 2022; originally announced March 2022.

Journal ref: J Sign Process Syst (2022)

arXiv:2201.05821 [pdf, other]

Adaptive Sign Algorithm for Graph Signal Processing

Authors: Yi Yan, Ercan E. Kuruoglu, Mustafa A. Altinkaya

Abstract: Efficient and robust online processing of irregularly structured data is crucial in the current era of data abundance. In this paper, we propose a graph/network version of the classical adaptive Sign algorithm for online graph signal estimation under impulsive noise. The recently introduced graph adaptive least mean squares algorithm is unstable under non-Gaussian impulsive noise and has high computational complexity. The Graph-Sign algorithm proposed in this work is based on the minimum dispersion criterion and therefore impulsive noise does not hinder its estimation quality. Unlike the recently proposed graph adaptive least mean p-th power algorithm, our Graph-Sign algorithm can operate without prior knowledge of the noise distribution. The proposed Graph-Sign algorithm has a faster run time because of its low computational complexity compared to the existing adaptive graph signal processing algorithms. In experiments on steady-state and time-varying graph signal estimation that utilize the spectral properties of bandlimitedness and sampling, the Graph-Sign algorithm demonstrates fast, stable, and robust graph signal estimation performance under impulsive noise modeled by alpha stable, Cauchy, Student's t, or Laplace distributions.

Submitted 15 January, 2022; originally announced January 2022.

arXiv:2109.14509 [pdf, other]

PAC-Bayes Information Bottleneck

Authors: Zifeng Wang, Shao-Lun Huang, Ercan E. Kuruoglu, Jimeng Sun, Xi Chen, Yefeng Zheng

Abstract: Understanding the source of the superior generalization ability of NNs remains one of the most important problems in ML research. There have been a series of theoretical works trying to derive non-vacuous bounds for NNs. Recently, the compression of information stored in weights (IIW) is proved to play a key role in NNs generalization based on the PAC-Bayes theorem. However, no solution of IIW has ever been provided, which builds a barrier for further investigation of the IIW's property and its potential in practical deep learning. In this paper, we propose an algorithm for the efficient approximation of IIW. Then, we build an IIW-based information bottleneck on the trade-off between accuracy and information complexity of NNs, namely PIB. From PIB, we can empirically identify the fitting to compressing phase transition during NNs' training and the concrete connection between the IIW compression and the generalization. Besides, we verify that IIW is able to explain NNs in broad cases, e.g., varying batch sizes, over-parameterization, and noisy labels. Moreover, we propose an MCMC-based algorithm to sample from the optimal weight posterior characterized by PIB, which fulfills the potential of IIW in enhancing NNs in practice.

Submitted 4 March, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

Comments: ICLR'22 (Spotlight)

arXiv:2009.02623 [pdf, other]

Information Theoretic Counterfactual Learning from Missing-Not-At-Random Feedback

Authors: Zifeng Wang, Xi Chen, Rui Wen, Shao-Lun Huang, Ercan E. Kuruoglu, Yefeng Zheng

Abstract: Counterfactual learning for dealing with missing-not-at-random data (MNAR) is an intriguing topic in the recommendation literature since MNAR data are ubiquitous in modern recommender systems. Missing-at-random (MAR) data, namely randomized controlled trials (RCTs), are usually required by most previous counterfactual learning methods for debiasing learning. However, the execution of RCTs is extraordinarily expensive in practice. To circumvent the use of RCTs, we build an information-theoretic counterfactual variational information bottleneck (CVIB), as an alternative for debiasing learning without RCTs. By separating the task-aware mutual information term in the original information bottleneck Lagrangian into factual and counterfactual parts, we derive a contrastive information loss and an additional output confidence penalty, which facilitates balanced learning between the factual and counterfactual domains. Empirical evaluation on real-world datasets shows that our CVIB significantly enhances both shallow and deep models, which sheds light on counterfactual learning in recommendation that goes beyond RCTs.

Submitted 17 October, 2020; v1 submitted 5 September, 2020; originally announced September 2020.

arXiv:2006.08300 [pdf, other]

doi 10.1109/TGRS.2021.3069091

A Generalized Gaussian Extension to the Rician Distribution for SAR Image Modeling

Authors: Oktay Karakuş, Ercan E. Kuruoglu, Alin Achim

Abstract: In this paper, we present a novel statistical model, $\textit{the generalized-Gaussian-Rician}$ (GG-Rician) distribution, for the characterization of synthetic aperture radar (SAR) images. Since accurate statistical models lead to better results in applications such as target tracking, classification, or despeckling, characterizing SAR images of various scenes including urban, sea surface, or agricultural, is essential. The proposed statistical model is based on the Rician distribution to model the amplitude of a complex SAR signal, the in-phase and quadrature components of which are assumed to be generalized-Gaussian distributed. The proposed amplitude GG-Rician model is further extended to cover the intensity SAR signals. In the experimental analysis, the GG-Rician model is investigated for amplitude and intensity SAR images of various frequency bands and scenes in comparison to state-of-the-art statistical models that include $\mathcal{K}$, Weibull, Gamma, and Lognormal. In order to decide on the most suitable model, statistical significance analysis via Kullback-Leibler divergence and Kolmogorov-Smirnov statistics is performed. The results demonstrate the superior performance and flexibility of the proposed model for all frequency bands and scenes and its applicability on both amplitude and intensity SAR images. The Matlab package is available at https://github.com/oktaykarakus/GG-Rician-SAR-Image-Modelling.

Submitted 7 April, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

Comments: 20 Pages, 11 figures, 5 tables

arXiv:1904.05586 [pdf, other]

Black-Box Decision based Adversarial Attack with Symmetric $α$-stable Distribution

Authors: Vignesh Srinivasan, Ercan E. Kuruoglu, Klaus-Robert Müller, Wojciech Samek, Shinichi Nakajima

Abstract: Developing techniques for adversarial attack and defense is an important research field for establishing reliable machine learning and its applications. Many existing methods employ Gaussian random variables for exploring the data space to find the most adversarial (for attacking) or least adversarial (for defense) point. However, the Gaussian distribution is not necessarily the optimal choice when the exploration is required to follow the complicated structure that most real-world data distributions exhibit. In this paper, we investigate how statistics of random variables affect such random walk exploration. Specifically, we generalize the Boundary Attack, a state-of-the-art black-box decision based attacking strategy, and propose the Lévy-Attack, where the random walk is driven by symmetric $α$-stable random variables. Our experiments on MNIST and CIFAR10 datasets show that the Lévy-Attack explores the image data space more efficiently, and significantly improves the performance. Our results also give an insight into the recently found fact in the whitebox attacking scenario that the choice of the norm for measuring the amplitude of the adversarial patterns is essential.

Submitted 11 April, 2019; originally announced April 2019.

arXiv:1711.03633 [pdf, other]

doi 10.1016/j.sigpro.2018.07.028

Beyond trans-dimensional RJMCMC with a case study in impulsive data modeling

Authors: Oktay Karakuş, Ercan E. Kuruoğlu, Mustafa A. Altınkaya

Abstract: Reversible jump Markov chain Monte Carlo (RJMCMC) is a Bayesian model estimation method which has been used for trans-dimensional sampling. In this study, we propose utilization of RJMCMC beyond trans-dimensional sampling. This new interpretation, which we call trans-space RJMCMC, reveals the undiscovered potential of RJMCMC by exploiting the original formulation to explore spaces of different classes or structures. This provides flexibility in using different types of candidate classes in the combined model space such as spaces of linear and nonlinear models or of various distribution families. As an application for the proposed method, we have performed a special case of trans-space sampling, namely trans-distributional RJMCMC in impulsive data modeling. In many areas such as seismology, radar, image, using Gaussian models is a common practice due to analytical ease. However, many noise processes do not follow a Gaussian character and generally exhibit events too impulsive to be successfully described by the Gaussian model. We test the proposed method to choose between various impulsive distribution families to model both synthetically generated noise processes and real-life measurements on power line communications (PLC) impulsive noises and 2-D discrete wavelet transform (2-D DWT) coefficients.

Submitted 5 May, 2020; v1 submitted 9 November, 2017; originally announced November 2017.

Comments: 22 pages, 9 figures

Report number: SIGPRO-D-17-01574

Journal ref: Signal Processing 153 (2018) 396-410

arXiv:1404.4351 [pdf, ps, other]

Stable Graphical Models

Authors: Navodit Misra, Ercan E. Kuruoglu

Abstract: Stable random variables are motivated by the central limit theorem for densities with (potentially) unbounded variance and can be thought of as natural generalizations of the Gaussian distribution to skewed and heavy-tailed phenomena. In this paper, we introduce stable graphical (SG) models, a class of multivariate stable densities that can also be represented as Bayesian networks whose edges encode linear dependencies between random variables. One major hurdle to the extensive use of stable distributions is the lack of a closed-form analytical expression for their densities. This makes penalized maximum-likelihood based learning computationally demanding. We establish theoretically that the Bayesian information criterion (BIC) can asymptotically be reduced to the computationally more tractable minimum dispersion criterion (MDC) and develop StabLe, a structure learning algorithm based on MDC. We use simulated datasets for five benchmark network topologies to empirically demonstrate how StabLe improves upon ordinary least squares (OLS) regression. We also apply StabLe to microarray gene expression data for lymphoblastoid cells from 727 individuals belonging to eight global population groups. We establish that StabLe improves test set performance relative to OLS via ten-fold cross-validation. Finally, we develop SGEX, a method for quantifying differential expression of genes between different population groups.

Submitted 16 April, 2014; originally announced April 2014.

arXiv:1101.1456 [pdf, ps, other]

doi 10.1111/j.1365-2966.2011.18398.x

A Bayesian technique for the detection of point sources in CMB maps

Authors: F. Argueso, E. Salerno, D. Herranz, J. L. Sanz, E. E. Kuruoglu, K. Kayabol

Abstract: The detection and flux estimation of point sources in cosmic microwave background (CMB) maps is a very important task in order to clean the maps and also to obtain relevant astrophysical information. In this paper we propose a maximum a posteriori (MAP) approach detection method in a Bayesian scheme which incorporates prior information about the source flux distribution, the locations and the number of sources. We apply this method to CMB simulations with the characteristics of the Planck satellite channels at 30, 44, 70 and 100 GHz. With a similar level of spurious sources, our method yields more complete catalogues than the matched filter with a 5 sigma threshold. Besides, the new technique allows us to fix the number of detected sources in a non-arbitrary way.

Submitted 7 January, 2011; originally announced January 2011.

Comments: 9 pages, 9 figures. MNRAS accepted with major revisions

arXiv:1101.1397 [pdf, ps, other]

doi 10.1111/j.1365-2966.2011.18783.x

Joint Bayesian separation and restoration of CMB from convolutional mixtures

Authors: K. Kayabol, J. L. Sanz, D. Herranz, E. E. Kuruoglu, E. Salerno

Abstract: We propose a Bayesian approach to joint source separation and restoration for astrophysical diffuse sources. We constitute a prior statistical model for the source images by using their gradient maps. We assume a t-distribution for the gradient maps in different directions, because it is able to fit both smooth and sparse data. A Monte Carlo technique, called Langevin sampler, is used to estimate the source images and all the model parameters are estimated by using deterministic techniques.

Submitted 7 January, 2011; originally announced January 2011.

Comments: 11 pages, 6 figures. Submitted to MNRAS

arXiv:1101.1396 [pdf, ps, other]

doi 10.1109/TIP.2010.2048613

Adaptive Langevin Sampler for Separation of t-Distribution Modelled Astrophysical Maps

Authors: K. Kayabol, E. E. Kuruoglu, J. L. Sanz, B. Sankur, E. Salerno, D. Herranz

Abstract: We propose to model the image differentials of astrophysical source maps by Student's t-distribution and to use them in the Bayesian source separation method as priors. We introduce an efficient Markov Chain Monte Carlo (MCMC) sampling scheme to unmix the astrophysical sources and describe the derivation details. In this scheme, we use the Langevin stochastic equation for transitions, which enables parallel drawing of random samples from the posterior, and reduces the computation time significantly (by two orders of magnitude). In addition, Student's t-distribution parameters are updated throughout the iterations. The results on astrophysical source separation are assessed with two performance criteria defined in the pixel and the frequency domains.

Submitted 7 January, 2011; originally announced January 2011.

Comments: 12 pages, 6 figures

Journal ref: IEEE Transactions on Image Processing, vol. 19, issue 9, 2010, pp. 2357-2368

arXiv:astro-ph/0307114 [pdf, ps, other]

doi 10.1051/0004-6361:20035858

An alpha-stable approach to the study of the P(D) distribution of unresolved point sources in CMB sky maps

Authors: D. Herranz, E. E. Kuruoglu, L. Toffolatti

Abstract: We present a new approach to the statistical study and modelling of number counts of faint point sources in astronomical images, i.e. counts of sources whose flux falls below the detection limit of a survey. The approach is based on the theory of alpha-stable distributions. We show that the non-Gaussian distribution of the intensity fluctuations produced by a generic point source population -- whose number counts follow a simple power law -- belongs to the alpha-stable family of distributions. Even if source counts do not follow a simple power law, we show that the alpha-stable model is still useful in many astrophysical scenarios. With the alpha-stable model it is possible to totally describe the non-Gaussian distribution with a few parameters which are closely related to the parameters describing the source counts, instead of an infinite number of moments. Using statistical tools available in the signal processing literature, we show how to estimate these parameters in an easy and fast way. We demonstrate that the model proves valid when applied to realistic point source number counts at microwave frequencies. In the case of point extragalactic sources observed at CMB frequencies, our technique is able to successfully fit the P(D) distribution of deflections and to precisely determine the main parameters which describe the number counts. In the case of the Planck mission, the relative errors on these parameters are small at both low and high frequencies. We provide a way to deal with the presence of Gaussian noise in the data using the empirical characteristic function of the P(D). The formalism and methods presented here can also be very useful for experiments in other frequency ranges, e.g. X-ray or radio astronomy.

Submitted 9 June, 2004; v1 submitted 7 July, 2003; originally announced July 2003.

Comments: 16 pages, 6 figures, final version to appear in A&A (in press)

Journal ref: ECONF C030908:THNT004,2003

Showing 1–30 of 30 results for author: Kuruoglu, E E